Classified Website Scraping and Posting

Closed. Posted Feb 9, 2012. Paid on delivery.

1. I need data scraped/extracted from classifieds sites and then imported into my own classifieds website's database.

2. My website URL is [url removed, login to view]

3. The websites that data needs to be copied from are:

a. [url removed, login to view]

b. [url removed, login to view]

c. [url removed, login to view]

d. [url removed, login to view]

4. The scraper should sit and run on my server, which has a standard LAMP configuration.

5. We should be able to enable or disable the script.

6.

a. The scraper will scrape new ads n times a day, where n is a variable we can specify. Ideally the run will be scheduled for 2:00 am India Standard Time.

b. We should also be able to specify the number of ads to be extracted per category per day.

e.g. 100 ads, 150 ads, or zero to extract all ads for a particular date.
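
Requirements 5 and 6 could be captured in a small settings object. The sketch below is in Python purely for illustration (the posting itself asks for a LAMP deployment), and every name in it (`ScraperSettings`, `run_times_ist`, `daily_limits`, `apply_daily_limit`) is hypothetical. It also assumes that a category with no configured limit behaves like a limit of zero, i.e. all ads are extracted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class ScraperSettings:
    """Hypothetical settings covering requirements 5 and 6."""
    # Requirement 5: master on/off switch for the script.
    enabled: bool = True
    # Requirement 6a: one entry per daily run, as HH:MM in India Standard Time.
    run_times_ist: List[str] = field(default_factory=lambda: ["02:00"])
    # Requirement 6b: per-category daily cap; 0 means "extract all ads".
    daily_limits: Dict[str, int] = field(default_factory=dict)


def apply_daily_limit(ads: Sequence[dict], category: str,
                      settings: ScraperSettings) -> List[dict]:
    """Return the ads to import today for one category."""
    if not settings.enabled:
        return []
    # Assumption: an unlisted category is treated as limit 0 (no cap).
    limit = settings.daily_limits.get(category, 0)
    return list(ads) if limit == 0 else list(ads[:limit])
```

With `daily_limits={"cars": 100}`, at most 100 car ads would be kept per run, while every other category would be imported in full.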

7. On its first run, the scraper will scrape the last one month's data from the four sites mentioned above to populate our website.
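
The initial one-month backfill in requirement 7 comes down to a date-window check on each ad. A minimal sketch, assuming "one month" can be approximated as 30 days and that each ad exposes its posting date; `BACKFILL_DAYS` and `in_backfill_window` are hypothetical names.

```python
from datetime import date, timedelta

# Assumption: "the last one month" is approximated as 30 days.
BACKFILL_DAYS = 30


def in_backfill_window(posted_on: date, today: date,
                       days: int = BACKFILL_DAYS) -> bool:
    """Return True if the ad was posted within the last `days` days."""
    cutoff = today - timedelta(days=days)
    return cutoff <= posted_on <= today
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the check easy to test and lets the same code serve the later daily runs with a shorter window.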

8. The scraper will populate the relevant categories / sub categories in our site's database automatically. Since the categories may not be identical, they will need to be mapped to the relevant ones on our website where required.

9. In case we wish to populate only certain categories / sub categories rather than all, a provision for this should exist. Ideally this would be a set of tick boxes with a "select all" option, or the individual categories we need to select.
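
Requirements 8 and 9 can be read as a lookup table plus an optional allow-list. The sketch below is illustrative only: the category names in `CATEGORY_MAP` are invented placeholders, not taken from any of the sites, and `map_category` is a hypothetical helper.

```python
from typing import Dict, Optional, Set, Tuple

Category = Tuple[str, str]  # (category, sub category)

# Hypothetical mapping from one source site's categories to our own.
CATEGORY_MAP: Dict[Category, Category] = {
    ("Vehicles", "Cars"): ("Automobiles", "Used Cars"),
    ("Real Estate", "Flats for Rent"): ("Property", "Rentals"),
}


def map_category(source: Category,
                 selected: Optional[Set[Category]] = None
                 ) -> Optional[Category]:
    """Map a source category to ours, honouring the category selection.

    `selected=None` stands for "select all". Returns None when the
    source category is unmapped or the target is not ticked.
    """
    target = CATEGORY_MAP.get(source)
    if target is None:
        return None
    if selected is not None and target not in selected:
        return None
    return target
```

In practice there would be one such table per source site, most likely stored in the database so that it can be edited without touching the script.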

10. Bad words filter: We should be able to filter out ads containing bad words that we specify.

11. In case we want it to pull only ads with photos, an option for this should exist.
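
The filters in requirements 10 and 11 could be applied to each ad before import. A minimal sketch, assuming an ad is a dictionary with `title`, `description` and `photos` keys (these key names, and the helpers `build_bad_word_pattern` and `should_import`, are hypothetical) and that bad words are matched as whole words, case-insensitively.

```python
import re
from typing import Iterable, Optional


def build_bad_word_pattern(bad_words: Iterable[str]) -> Optional[re.Pattern]:
    """Compile the user-specified bad words into one whole-word regex."""
    words = [re.escape(w.strip()) for w in bad_words if w.strip()]
    if not words:
        return None  # no bad words configured, so nothing is filtered
    return re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)


def should_import(ad: dict, pattern: Optional[re.Pattern],
                  photos_only: bool = False) -> bool:
    """Apply the bad-words filter and the optional photos-only filter."""
    text = f"{ad.get('title', '')} {ad.get('description', '')}"
    if pattern is not None and pattern.search(text):
        return False  # requirement 10: ad contains a bad word
    if photos_only and not ad.get("photos"):
        return False  # requirement 11: ad has no photos
    return True
```

Whole-word matching avoids rejecting harmless words that merely contain a listed word; whether that or plain substring matching is wanted is a choice for the site owner.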

12. Here is the information that will need to be extracted:

Title, Description, Seller Name, Seller, Seller email, Category, Sub Category, Phone Number, City, Photos

13. If the script tries to import an ad and finds that the username is already in use, it should be able to modify the username and then import the ad again. E.g. if user abc already exists in the database, the script should be able to append a number, such as abc123.
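
One way to read requirement 13 is to append the smallest free number to a username that is already taken. A minimal sketch, assuming the existing usernames are available as a set; `unique_username` is a hypothetical helper, and the counter-based suffix is one choice among several (the posting's own example is abc123).

```python
from typing import Set


def unique_username(desired: str, taken: Set[str]) -> str:
    """Return `desired`, or `desired` plus the first free numeric suffix."""
    if desired not in taken:
        return desired
    suffix = 1
    # Try abc1, abc2, ... until an unused username is found.
    while f"{desired}{suffix}" in taken:
        suffix += 1
    return f"{desired}{suffix}"
```

Against a real MySQL table, a unique index on the username column would still be advisable, so that two concurrent imports cannot both claim the same name.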

14. Please also show examples of other extraction/scraping scripts you have done in the past.

15. Lastly, we are coming out with a second classifieds website based on [url removed, login to view], which will need to be filled with the same data. The script will need to populate that site as well.

This should be a fairly straightforward job for someone who knows what they are doing. Please feel free to message me with any clarifications or suggestions.

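Data Mining, Data Processing, MySQL, PHP, Software Architecture

Project ID: #1436263

About the project

2 proposals. Remote project. Active Mar 20, 2012

2 freelancers are bidding on this job, at an average price of $300/hour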

sonviet

I typically write data collection tools for websites and forums, which then fill in the posts you want quickly and cost-effectively. I have 4 years' experience in designing websites and web-based applic...

$300 USD in 6 days
(5 reviews)
3.2
nitincool4urchat

We deal only in Java right now, but we have enough experience in creating bots according to requirements. We assure you quality work within the time limit.

$700 USD in 15 days
(1 review)
0.0