Classified Website Scraping and Posting (1828362)

Cancelled. Posted Feb 9, 2012. Paid on delivery.

## Deliverables

1. I need data scraped/extracted from classifieds sites and then imported into the database of my classifieds website.

2. My website URL will be provided by PM.

3. The URLs of the 4 websites that data needs to be copied from will be provided by PM.

4. The scraper should sit and run on my server, which has a standard LAMP configuration.

5. We should be able to enable/disable the script.

6. Scheduling and volume:

a. The scraper will scrape new ads n times a day (n should be a variable we can specify). Ideally the run will be set for 2:00 am India Standard Time.

b. We should also be able to specify the number of ads to extract per category per day, e.g. 100 ads, 150 ads, or zero to extract all ads for a particular date.

7. The scraper will first scrape the last month's data from the four sites mentioned above to populate our website initially.

8. The scraper will populate the relevant categories/subcategories in our site's database automatically. Since the categories may not be identical, source categories will need to be mapped to the relevant ones on our website where required.

9. If we wish to populate only certain categories/subcategories rather than all of them, a provision for this should exist. Ideally this would be a set of tick boxes with a "select all" option, so we can tick only the categories we need.

10. Bad words filter: We should be able to filter out ads containing *bad words* that we specify.

11. An option should exist to pull only ads that have photos.

12. The following information needs to be extracted:

Title, Description, Seller Name, Seller, Seller email, Category, Sub Category, Phone Number, City, Photos

13. If the script tries to import an ad and finds that the username is already in use, it should modify the username and then import the ad again. E.g. if user abc already exists in the database, the script should append a number to make it unique: abc123 etc.

14. Please also show examples of other extraction/scraping scripts you have done in the past.

15. Lastly, we are launching a second classifieds website based on [url removed], which needs to be filled with the same data. The script would need to populate that site as well.
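To make items 6b and 8 to 11 more concrete, here is a minimal sketch of the selection logic, written in Python purely for illustration. Every name, category label, and setting value in it is a hypothetical placeholder, not taken from our actual site, and the bad-word check is a plain substring match that a real filter would likely refine.

```python
from dataclasses import dataclass, field


@dataclass
class Ad:
    title: str
    description: str
    category: str                      # category name as used on the source site
    photos: list = field(default_factory=list)


# Hypothetical settings; real values would come from an admin screen or config file.
CATEGORY_MAP = {"Cars & Bikes": "Vehicles", "Flats for Rent": "Real Estate"}  # item 8
ENABLED_CATEGORIES = {"Vehicles", "Real Estate"}  # item 9: only the ticked categories
BAD_WORDS = {"scam", "spam"}                      # item 10
PHOTOS_ONLY = True                                # item 11
DAILY_LIMITS = {"Vehicles": 1, "Real Estate": 0}  # item 6b: 0 means "extract all"


def select_ads(ads):
    """Return (target_category, ad) pairs for the ads that pass every filter."""
    taken = {}      # number of ads accepted so far, per target category
    selected = []
    for ad in ads:
        # Map the source category to ours; skip unmapped or unticked categories.
        target = CATEGORY_MAP.get(ad.category)
        if target is None or target not in ENABLED_CATEGORIES:
            continue
        # Simple case-insensitive substring check against the bad-word list.
        text = f"{ad.title} {ad.description}".lower()
        if any(word in text for word in BAD_WORDS):
            continue
        if PHOTOS_ONLY and not ad.photos:
            continue
        # Enforce the per-category limit; a limit of 0 disables the cap.
        limit = DAILY_LIMITS.get(target, 0)
        if limit and taken.get(target, 0) >= limit:
            continue
        taken[target] = taken.get(target, 0) + 1
        selected.append((target, ad))
    return selected
```

Ads that survive `select_ads` would then be inserted into the database under their mapped category.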

This should be a fairly straightforward job for someone who knows what they are doing. Please feel free to message me with any clarifications or suggestions.
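As an illustration of the duplicate-username handling in item 13, a rough Python sketch follows. The function name is hypothetical, and the in-memory set is only an assumed stand-in for a lookup against the users table. The suffix here simply counts up from 1; the "abc123" in item 13 is just an example, and any unique number would do.

```python
def unique_username(base, existing):
    """Return base if it is free, otherwise base plus the first free numeric suffix."""
    if base not in existing:
        return base
    suffix = 1
    # Keep counting up until the candidate name is not already taken.
    while f"{base}{suffix}" in existing:
        suffix += 1
    return f"{base}{suffix}"


# Example: "abc" and "abc1" are taken, so the next free name is "abc2".
existing_users = {"abc", "abc1"}
new_name = unique_username("abc", existing_users)
existing_users.add(new_name)
```

In a real import the check and the insert would need to happen together (for example, relying on a unique index on the username column), so that two concurrent imports cannot pick the same name.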

Skills: Database Administration, MySQL, Script Install, Shell Script, SQL

Project ID: #2709863

About the project

Remote project. Active Feb 9, 2012.