We are building a database of publicly available data from 6-8 sites. The data is in the public domain and will consist of 75k-100k records. You don't have to parse the data on the page or put it in a database; all we need is the raw HTML pages.
I'd prefer to have a script that will download the pages, but it's OK if you don't provide one.
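To give a rough idea of the kind of download script meant here, the following is a minimal sketch only, not a required design. The function name `fetch_raw`, the retry count, and the delay are all assumptions made for illustration; it uses only the Python standard library and returns the page bytes untouched, since no parsing is needed.

```python
import time
import urllib.error
import urllib.request


def fetch_raw(url, retries=3, delay=1.0, opener=urllib.request.urlopen):
    """Return the unmodified response body for `url` as bytes.

    Hypothetical helper: `retries` and `delay` are illustrative defaults.
    `opener` is injectable so the function can be tried without a network.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            # No decoding or parsing: the raw bytes are what gets delivered.
            with opener(url) as response:
                return response.read()
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            # Pause between attempts, but not after the final failure.
            if attempt < retries - 1:
                time.sleep(delay)
    raise last_error
```

A pause between requests is also worth adding in the calling loop so that 75k-100k downloads do not overload any one site; the right interval depends on the site and is not specified here.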
## Deliverables
Final deliverables will be raw HTML files organized by site URL.
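One possible way to organize the files by site URL is sketched below. This layout is an assumption, not a requirement: one directory per host name, with the rest of the URL percent-encoded into the file name. The names `path_for_url`, `save_raw`, and the `raw_html` root folder are hypothetical.

```python
from pathlib import Path
from urllib.parse import quote, urlsplit


def path_for_url(url, root="raw_html"):
    """Map a page URL to root/<host>/<encoded path and query>.html."""
    parts = urlsplit(url)
    # One directory per site; hostname is lowercased and excludes any port.
    site = parts.hostname or "unknown-site"
    # Fold the path and query into a single file name; "index" for the root.
    remainder = parts.path.strip("/") or "index"
    if parts.query:
        remainder += "?" + parts.query
    # Percent-encode everything unsafe, including "/" so no nesting occurs.
    name = quote(remainder, safe="") + ".html"
    return Path(root) / site / name


def save_raw(url, body, root="raw_html"):
    """Write the raw page bytes to the path derived from the URL."""
    target = path_for_url(url, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target
```

With this sketch, `https://Example.com/records/123?page=2` would be saved as `raw_html/example.com/records%2F123%3Fpage%3D2.html`. Very long URLs could exceed file-name length limits, in which case a hash of the URL plus a separate index file would be one alternative.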