A month or so ago, in anticipation of attending THATCamp, I thought I should finally start learning a new programming language and chose Python. Alas, the reality was that I only got as far as writing the standard “hello world” program and no further. Last weekend’s time in Canberra has been percolating in the back of my mind, and when a problem popped up this week, I thought “aha! I could write a script to do that”. I’ve long had a basic grasp of simple regular expressions for examining strings of text or numbers, and used that to edit simple authentication scripts for a previous employer.
For the problem at hand, I needed to search a spreadsheet for a particular barcode pattern and count the matches. As my regular expression experience was from Perl, I thought I’d see if I could learn enough Perl to do the task. And I did. Though I cheated a little and used “grep” initially to pull out all the potential lines; basically I wanted to do a pattern match on a barcode for all lines that also included the word “Success”. So I used grep to pull out all the “Success” lines:
grep Success filename.csv >success.csv
and then wrote a Perl script (csvbarcode.pl) to handle the pattern matching for the right barcode. I ran that script against the filtered file (success.csv) and, via the unix command line, piped its output through a line counter:
perl bin/csvbarcode.pl | wc -l
to give me the total number of times it appeared. “wc” is an old unix tool whose name is short for “word count”. By default it reports lines, words and bytes; adding “-l” makes it report only the number of lines. I’ve hardcoded the reference to “success.csv” into the Perl script, which is why no filename appears on the command line.
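For anyone who hasn't used these tools, here's a rough sketch of the filter-then-count idea on its own, leaving the barcode matching out. The three sample rows are made up for illustration and aren't from my spreadsheet.

```shell
# Build a tiny sample file (made-up rows, purely for illustration).
sample=$(mktemp)
printf '%s\n' \
  'A123456789,Success' \
  'B000000001,Failure' \
  'A987654321,Success' > "$sample"

# Keep only the "Success" lines, then count them.
# wc -l counts newlines, so each line of grep output adds one to the total.
# tr strips the padding some versions of wc put in front of the number.
success_count=$(grep Success "$sample" | wc -l | tr -d ' ')
echo "$success_count"

# grep -c reports the number of matching lines directly,
# so the separate wc stage is optional.
grep -c Success "$sample"

rm -f "$sample"
```

Both commands report 2 for the sample data, since two of the three rows contain “Success”.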
No doubt I could have done all of the steps in a single Perl script, but my skill level isn’t there yet. I relied heavily on this tutorial, followed it through, and then modified its code to handle the search for the barcode pattern.
Update: Had a couple of ideas overnight and improved the code this morning. It can now take a CSV file as an argument and output the total number of successful barcodes. I no longer need to grep the file beforehand, or run wc to count the lines afterward. The only thing it probably still needs is a check to make sure the file itself is in the correct format.
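The same “one argument in, one total out” behaviour can be roughed out as a shell function. To be clear, this is not my Perl script, just a sketch of the idea using grep. The function name count_success_barcodes and the barcode pattern (the letter A followed by nine digits) are both invented for the example, so substitute your own.

```shell
# Hypothetical barcode pattern: the letter A followed by nine digits.
# This is an assumption for the example; replace it with the real pattern.
BARCODE_PATTERN='A[0-9]{9}'

# Hypothetical helper: print how many "Success" lines in the given CSV
# file also contain the barcode pattern.
count_success_barcodes() {
  csv_file=$1

  # Minimal sanity check: the file must exist and be readable.
  # (This does not verify that the contents are well-formed CSV.)
  if [ ! -r "$csv_file" ]; then
    echo "cannot read file: $csv_file" >&2
    return 1
  fi

  # The first grep keeps the "Success" lines; the second counts how many
  # of those match the pattern. grep -c exits non-zero when the count is
  # 0, so "|| true" keeps the function's exit status clean in that case.
  grep Success "$csv_file" | grep -E -c "$BARCODE_PATTERN" || true
}
```

Once the function is defined in the current shell, running count_success_barcodes filename.csv prints a single number. Note that it matches the pattern anywhere on the line rather than in a specific column, which is fine for a quick count but looser than a proper CSV parse.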