Each year The State newspaper submits FOIA requests to all state agencies and compiles a database containing the salaries of all employees making at least $50,000. This data is then made freely available through a usable but fairly crappy little web interface provided by Caspio.
If you know a particular employee’s name, simply want to see who makes the most money, or just want to skim the salary numbers for a specific agency (25 at a time), this is great. If you actually want to harness the data and do something interesting, though, it… well, sucks.
So this afternoon I wrote a script that scrapes the HTML output generated by the Caspio database and spits it out into a convenient CSV file. A couple of quick formatting fixes in Excel later, and the data is now available for download over on my BuzzData account.
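This isn’t the actual script, just a minimal sketch of the general idea in Python using only the standard library. It assumes the results come back as a plain HTML table with one `<tr>` per employee, which may not match Caspio’s real markup. The class and function names are hypothetical, and the sample row is made up rather than taken from the salary database.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read (None outside <tr>)
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up page fragment; the real markup and column names are assumptions.
SAMPLE_PAGE = """
<table>
  <tr><th>Name</th><th>Agency</th><th>Salary</th></tr>
  <tr><td>Doe, Jane</td><td>Example Agency</td><td>$52,000</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_PAGE), end="")
```

A real run would also have to fetch every page of results (25 rows at a time) and feed each one through the same parser before writing the file. The `csv` module quotes fields that contain commas, so names like "Doe, Jane" and salaries like "$52,000" survive the trip intact.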
I’m going to run a couple of basic queries over the data, but I don’t have any major plans. If you have any ideas, please contribute them to the BuzzData project and post a comment.
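For anyone wondering what a "basic query" might look like, here is a rough Python sketch that computes the median salary per agency straight from a CSV. The column names `Agency` and `Salary` are assumptions about the file layout, the function names are hypothetical, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def parse_salary(text):
    """Turn a string like '$52,000' into the integer 52000."""
    return int(float(text.replace("$", "").replace(",", "")))


def median_salary_by_agency(csv_file):
    """Map each agency to the median of its listed salaries."""
    by_agency = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # "Agency" and "Salary" are assumed column names.
        by_agency[row["Agency"]].append(parse_salary(row["Salary"]))
    return {agency: median(pay) for agency, pay in by_agency.items()}


# Invented rows, not real salary data.
SAMPLE_CSV = """Name,Agency,Salary
Person One,Agency A,"$50,000"
Person Two,Agency A,"$60,000"
Person Three,Agency A,"$70,000"
Person Four,Agency B,"$55,000"
"""

for agency, pay in sorted(median_salary_by_agency(io.StringIO(SAMPLE_CSV)).items()):
    print(agency, pay)
```

To run it against the real download, pass an open file instead of the `io.StringIO` wrapper, and adjust the column names to whatever the CSV header actually says.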