Web scraping and data extraction (Scrapy + ScrapingHub)

Cancelled · Posted Aug 17, 2015 · Paid on delivery

We require a web scraping and Python expert. PLEASE read the project description and make sure you fully understand our needs before you apply. We do not need the results of this scrape to be delivered in a spreadsheet. The data will need to be collected in ScrapingHub so we have access to the data via API.

This project covers specific brands on two websites, which are to be scraped every 12 hours to retrieve SKU, Price and Page URL. Scrapy must be set up and then configured in ScrapingHub, with the extracted data stored in ScrapingHub.
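To make the expected output concrete, here is a minimal plain-Python sketch of the record each crawl would produce and of the brand filter. Everything named here (`PriceRecord`, `TARGET_BRANDS`, `parse_price`, `build_record`, the sample brands) is hypothetical and not part of the posting; a real Scrapy spider would yield equivalent items from its parse callback. The price parser assumes a comma thousands separator and a dot decimal point.

```python
from dataclasses import dataclass
from decimal import Decimal
import re

# Hypothetical brand list; the posting does not name the brands.
TARGET_BRANDS = {"acme", "globex"}


@dataclass(frozen=True)
class PriceRecord:
    """The three fields the posting asks for."""
    sku: str
    price: Decimal
    url: str


def parse_price(text: str) -> Decimal:
    """Pull the first number out of a price string such as '$1,299.00'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return Decimal(match.group().replace(",", ""))


def build_record(brand: str, sku: str, price_text: str, url: str):
    """Return a PriceRecord for a targeted brand, or None for any other brand."""
    if brand.strip().lower() not in TARGET_BRANDS:
        return None
    return PriceRecord(sku=sku.strip(), price=parse_price(price_text), url=url)
```

For instance, `build_record("Acme", "A-100", "$1,299.00", "https://example.com/p/a-100")` would give a record with price `Decimal("1299.00")`, while a product from a brand outside the list is skipped.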

The spider should store the last four scrapes of pricing data for each SKU. After that, the spider can overwrite the first entry. For example, the spider should crawl and extract the data mentioned above, then crawl every 12 hours for 3 more periods, so the data in ScrapingHub contains the SKU, page URL and pricing for 4 x 12-hour periods. After 48 hours (4 periods), the spider can overwrite the first entry and continue, overwriting the subsequent three entries in turn, so we only ever have 4 price entries in ScrapingHub for each SKU.
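The retention rule above amounts to a fixed-size rolling window per SKU. A minimal in-memory sketch, assuming a hypothetical `PriceHistory` class (it does not talk to ScrapingHub; the actual storage layer is left open here):

```python
from collections import defaultdict, deque

MAX_ENTRIES = 4  # four 12-hour periods = 48 hours of history


class PriceHistory:
    """Keeps only the most recent MAX_ENTRIES prices for each SKU."""

    def __init__(self, max_entries: int = MAX_ENTRIES):
        # A deque with maxlen silently drops its oldest item when full.
        self._prices = defaultdict(lambda: deque(maxlen=max_entries))
        self._urls = {}

    def record(self, sku: str, price, url: str) -> None:
        """Store one scrape result; the fifth entry displaces the first."""
        self._prices[sku].append(price)
        self._urls[sku] = url

    def entries(self, sku: str) -> list:
        """Return the stored prices for a SKU, oldest first."""
        if sku not in self._prices:
            return []
        return list(self._prices[sku])

    def url(self, sku: str):
        """Return the most recently seen page URL for a SKU, if any."""
        return self._urls.get(sku)
```

After five scrapes of the same SKU, `entries()` returns only the last four prices. The deque keeps them in chronological order rather than overwriting fixed slots in place, but the stored set is the same four entries the posting describes.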

You MUST have demonstrated experience with Scrapy. We aren't interested in other systems, as this is a long-term project and we need a known tool. We're also not interested in alternatives to ScrapingHub - we need this data stored where we can access it.

Looking forward to working with the best. This will be easy for someone with experience. You must have a good command of English, please.

Data Mining, Python, Web Scraping

Project ID: #8286123

About the project

1 proposal · Remote project · Active Aug 17, 2015

1 freelancer is bidding on average $188 for this job

amityadav4788

A proposal has not yet been provided

$188 CAD in 3 days
(2 reviews)
2.3