Stephen J. Barr


csvfix is cool

I have been manipulating some data files in comma-separated value (CSV) format, and I have found the command-line tool csvfix to be extremely cool.

For example, I have recently been working with some COMPUSTAT data. The data files are a few hundred megabytes, with several hundred columns and hundreds of thousands of rows. I wanted all the rows but only 30 or 40 of the columns, so I had subsetted the large files into much smaller ones. However, I forgot to include the SIC codes. So, rather than open all the files again, I used csvfix.

csvfix order -fn cusip,SIC *.csv | csvfix unique -o quarterly_cusip_SIC.csv

The first command grabs the cusip and SIC columns from all the .csv files in the current directory. Its output is piped to the second command, which takes the incoming CSV data, keeps only the unique rows, and writes them to the file quarterly_cusip_SIC.csv.
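For comparison, here is a rough sketch of the same select-then-deduplicate idea using only Python's standard library, in case csvfix is not at hand. It is not a description of how csvfix works internally. The function names `unique_columns`, `open_each`, and `write_unique` are made up for illustration. The sketch assumes every input file starts with a header row containing the requested column names, and it chooses to write a header row and keep first-seen order, which may differ from csvfix's output.

```python
import csv
import glob


def unique_columns(sources, fields):
    """Yield each distinct combination of `fields` once, in first-seen order.

    `sources` is an iterable of line iterables (open files or lists of
    strings). Each source must begin with a header row naming its columns;
    a missing column raises KeyError.
    """
    seen = set()
    for lines in sources:
        for record in csv.DictReader(lines):
            row = tuple(record[name] for name in fields)
            if row not in seen:
                seen.add(row)
                yield row


def open_each(paths):
    """Open one file at a time so only a single input is held open."""
    for path in paths:
        with open(path, newline="") as handle:
            yield handle


def write_unique(pattern, out_path, fields):
    """Collect unique `fields` rows from files matching `pattern`."""
    # Skip the output file itself so a rerun does not read its own result.
    paths = sorted(p for p in glob.glob(pattern) if p != out_path)
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(fields)  # header row: an assumption of this sketch
        writer.writerows(unique_columns(open_each(paths), fields))
```

Calling `write_unique("*.csv", "quarterly_cusip_SIC.csv", ["cusip", "SIC"])` in the data directory would play the role of the pipeline above. Only the distinct (cusip, SIC) pairs are kept in memory, so the size of the input files matters less than the number of unique pairs.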

csvfix seems to be able to do much more than this. The manual looks pretty good – RTFM.


Categorised as: Uncategorized

