Screen scraping

Cancelled. Posted on Jun 6, 2006. Paid on delivery.

Scraping Requirements:

The application should look for member profiles on certain websites, screen scrape the member profile information, and save it in a database for later transformation.

Available sites and locations for v1.0:

Websites:

- [url removed, login to view]
- [url removed, login to view]

Locations to scrape:

- New York City, NY (US)
- Miami, FL (US)
- Boston, MA (US)
- San Francisco, CA (US)
- Seattle, WA (US)
- San Diego, CA (US)
- Los Angeles, CA (US)
- Sacramento, CA (US)
- Fort Lauderdale, FL (US)
- Orlando, FL (US)
- Philadelphia, PA (US)
- Dallas, TX (US)
- New Orleans, LA (US)
- Detroit, MI (US)
- Chicago, IL (US)
- London, England (UK)
- Manchester, England (UK)
- Sydney, Australia

Indexing needs:

- The application should be able to auto-run scheduled searches on the appointed sites, browsing for new or updated profiles that meet the indicated criteria (see below). Profiles with no changes since the last scheduled scraping session should not be modified.
- The application should store a timestamp on every updated or newly added profile in the proprietary database, so that profiles showing no activity in more than 6 months can be cleaned up automatically.
- Whenever possible, the application should be developed so that new sites with similar profile structures can be added to the indexing schedule without site-specific hardcoding.
- The application should make all scraped profiles available in a database for later data transformation.
- New locations should be easy to add to the application for any selected site from the system's back office.
- Scraping should be developed in such a way as to prevent or minimize IP detection or blocking by any of the source websites.

Indexing restrictions:

- Only profiles created or updated in the past 6 months.
- Only profiles with photos.
- Only members between 18 and 59 years of age at the time of each indexing run.

Thanks!
Luciano
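To make the indexing restrictions and the six-month cleanup rule concrete, here is a minimal sketch of how they might be checked on each scheduled run. It is illustrative only and written in Python for brevity, although the posting asks for PHP. The `Profile` fields, the function names, and the choice to approximate "6 months" as 183 days are all assumptions; the source sites may expose only an age rather than a birth date, in which case the age check would use that value directly.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 183 days.
SIX_MONTHS = timedelta(days=183)


@dataclass
class Profile:
    """Hypothetical shape of a scraped profile; field names are illustrative."""
    member_id: str
    birth_date: date
    last_updated: date  # created-or-updated date shown on the source site
    has_photo: bool


def age_on(birth_date: date, today: date) -> int:
    """Age in whole years on the day of the indexing run."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_eligible(profile: Profile, today: date) -> bool:
    """Apply the three indexing restrictions to one profile."""
    recent = today - profile.last_updated <= SIX_MONTHS
    in_age_range = 18 <= age_on(profile.birth_date, today) <= 59
    return recent and profile.has_photo and in_age_range


def is_stale(stored_timestamp: date, today: date) -> bool:
    """True if a stored profile has shown no activity for more than 6 months."""
    return today - stored_timestamp > SIX_MONTHS
```

Because age is evaluated against the date of each run, a member who turns 60 between two runs would drop out of the index on the next pass, which matches the "at the time of each indexing run" wording.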

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

Please specify turnaround and budget. Explain only how you will develop the application so that it is fast and easily configured for more cities. Note that you can develop a single application for both websites, or two applications with separate screen scraping logic for each one. You will have to log in to both sites ([url removed, login to view] and [url removed, login to view]) in order to execute a profile search. We want to select a winning bidder within two days.

PHP

Project ID: #3556592

About the project

Remote project. Active Jun 7, 2006.