Create a Scrapy spider / Python
$30-250 USD
Paid on delivery
Need a programmer to write Scrapy spiders that fetch events from venue webpages. The webpages are usually written in French.
Scraping is needed for 5 different websites (i.e. 5 spiders). One of them requires HTTP authentication (credentials will be provided in due time).
Step 1:
- Parse the webpage to get the list of events (date + time (French locale to ISO format), end date when applicable, title, description)
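The French-to-ISO conversion in Step 1 could be sketched as below. This is only an illustration: the input shape ("mercredi 29 septembre 2010 à 20h30") is an assumption, since each venue site may format its dates differently, and `parse_french_datetime` is a hypothetical helper name. It uses a month lookup table rather than the system locale, so it does not depend on a French locale being installed.

```python
import re
from datetime import datetime

# French month names mapped to month numbers.
FRENCH_MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4,
    "mai": 5, "juin": 6, "juillet": 7, "août": 8,
    "septembre": 9, "octobre": 10, "novembre": 11, "décembre": 12,
}

# Assumed input shape: "<day>[er] <month name> <year> [à] <hour>h<minute>".
# The time part is optional.
DATE_RE = re.compile(
    r"(?P<day>\d{1,2})(?:er)?\s+(?P<month>[^\W\d_]+)\s+(?P<year>\d{4})"
    r"(?:\s*(?:à\s*)?(?P<hour>\d{1,2})\s*[h:]\s*(?P<minute>\d{2})?)?"
)


def parse_french_datetime(text):
    """Return a datetime parsed from a French date string (hypothetical helper)."""
    match = DATE_RE.search(text.lower())
    if match is None:
        raise ValueError(f"no French date found in {text!r}")
    month = FRENCH_MONTHS.get(match.group("month"))
    if month is None:
        raise ValueError(f"unknown month name in {text!r}")
    # A missing time defaults to midnight.
    hour = int(match.group("hour") or 0)
    minute = int(match.group("minute") or 0)
    return datetime(int(match.group("year")), month,
                    int(match.group("day")), hour, minute)


parsed = parse_french_datetime("mercredi 29 septembre 2010 à 20h30")
print(parsed.isoformat(sep=" ", timespec="minutes"))  # 2010-09-29 20:30
```

Abbreviated month names and date ranges are not handled here and would need extra patterns per site.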
Step 2:
- For each event, when applicable, parse the event detail page to get more information (e.g. video link, photo link, description, pricing information, venue, artist MySpace page, etc.)
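As a rough illustration of Step 2, the sketch below pulls a photo link and an artist MySpace link out of a detail page using only the standard library. In an actual Scrapy spider the same extraction would normally go through the response's selectors. The markup, the URLs, and the names `DetailLinkCollector` and `extract_detail_fields` are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DetailLinkCollector(HTMLParser):
    """Collect absolute image and anchor URLs from a detail page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []
        self.link_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Resolve relative URLs against the page URL.
        if tag == "img" and attrs.get("src"):
            self.image_urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.link_urls.append(urljoin(self.base_url, attrs["href"]))


def extract_detail_fields(page_html, base_url):
    """Return optional detail fields; missing ones are None (stored as NULL)."""
    collector = DetailLinkCollector(base_url)
    collector.feed(page_html)
    collector.close()
    myspace = [url for url in collector.link_urls if "myspace.com" in url]
    return {
        "url_image": collector.image_urls[0] if collector.image_urls else None,
        "artist_myspace": myspace[0] if myspace else None,
    }


# Hypothetical detail page fragment and URL.
sample = ('<div><img src="/media/mariage.jpg">'
          '<a href="http://www.myspace.com/someartist">MySpace</a></div>')
fields = extract_detail_fields(sample, "http://venue.example/agenda/le-mariage")
print(fields["url_image"])  # http://venue.example/media/mariage.jpg
```

Taking the first image on the page is a simplification; a real spider would target the specific element each venue site uses.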
Step 3:
The script should then populate a PostgreSQL database.
Required Skills:
Python, Scrapy, PostgreSQL
Example
[url removed, login to view]
For each event in the calendar, get the information from the event page.
Here is an example from the first event on the page, named "Le mariage": [url removed, login to view]
Title : Le mariage
Date start: 2010-09-29
Date end : 2010-10-01
Url image : NULL
Url event : [url removed, login to view]
Description: [...]Entre rêve et rire, le monde de Gogol est peuplé de créatures d’un égoïsme enfantin, en quête de bonheur. Qui leur en voudrait? [...]
Description format: strip HTML tags, but replace every <p> tag with "\n".
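One possible reading of the description rule is sketched below. It assumes that only opening <p> tags become newlines, that closing </p> tags are dropped like any other tag, and that HTML entities should be decoded. `clean_description` is a hypothetical helper name, and regex-based stripping is a simplification that may mishandle malformed markup.

```python
import html
import re

# Opening <p> tags, with or without attributes (does not match <pre>).
P_TAG_RE = re.compile(r"<\s*p(\s[^>]*)?>", re.IGNORECASE)
# Any other tag, including closing </p>.
ANY_TAG_RE = re.compile(r"<[^>]+>")


def clean_description(raw_html):
    """Turn an HTML description into plain text with one line per paragraph."""
    text = P_TAG_RE.sub("\n", raw_html)  # each opening <p> becomes a newline
    text = ANY_TAG_RE.sub("", text)      # drop every remaining tag
    text = html.unescape(text)           # decode entities such as &ecirc;
    return text.strip()                  # trim the leading newline


raw = "<p>Entre r&ecirc;ve et rire</p><p>Qui leur en <b>voudrait</b>?</p>"
print(clean_description(raw))
```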
Schedule:
Each event may have many dates
* 2010-09-29 20:30
* 2010-09-30 20:30
* 2010-10-01 20:30
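The example values above are consistent with "Date start" and "Date end" being the first and last days of the schedule. Assuming that is the intended rule (the posting does not state it explicitly), the event dates could be derived as in this sketch; `summarize_schedule` is a hypothetical helper name.

```python
from datetime import datetime


def summarize_schedule(schedule_lines):
    """Derive start/end dates from 'YYYY-MM-DD HH:MM' showtime strings."""
    showtimes = sorted(
        datetime.strptime(line, "%Y-%m-%d %H:%M") for line in schedule_lines
    )
    if not showtimes:
        raise ValueError("an event needs at least one showtime")
    return {
        "date_start": showtimes[0].date().isoformat(),
        "date_end": showtimes[-1].date().isoformat(),
        "showtimes": [s.isoformat(sep=" ", timespec="minutes") for s in showtimes],
    }


summary = summarize_schedule(
    ["2010-09-29 20:30", "2010-09-30 20:30", "2010-10-01 20:30"]
)
print(summary["date_start"], summary["date_end"])  # 2010-09-29 2010-10-01
```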
The PostgreSQL schema is ready and consists of two tables (Event / Schedule) that will be populated by the spider you will program.
This project is quite simple for someone skilled in Python & Scrapy. It should only take a few hours to complete the first spider.
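The one-to-many relation between Event and Schedule could be populated along the lines of the sketch below. The actual schema is not shown in this posting, so every table and column name here is an assumption. To keep the example runnable without a database server it uses the standard library's sqlite3 in place of PostgreSQL; with a PostgreSQL driver such as psycopg2 the placeholders would be `%s` rather than `?`, and the new event id would typically be fetched with `RETURNING id`.

```python
import sqlite3

# Hypothetical schema; the real Event / Schedule tables may differ.
SCHEMA = """
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    date_start TEXT NOT NULL,
    date_end TEXT,
    url_image TEXT,
    url_event TEXT,
    description TEXT
);
CREATE TABLE schedule (
    id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES event(id),
    starts_at TEXT NOT NULL
);
"""


def store_event(conn, event, showtimes):
    """Insert one event and its showtimes in a single transaction."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO event (title, date_start, date_end, url_image,"
            " url_event, description) VALUES (?, ?, ?, ?, ?, ?)",
            (event["title"], event["date_start"], event.get("date_end"),
             event.get("url_image"), event.get("url_event"),
             event.get("description")),
        )
        event_id = cursor.lastrowid
        conn.executemany(
            "INSERT INTO schedule (event_id, starts_at) VALUES (?, ?)",
            [(event_id, starts_at) for starts_at in showtimes],
        )
    return event_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
event_id = store_event(
    conn,
    {"title": "Le mariage", "date_start": "2010-09-29",
     "date_end": "2010-10-01", "url_image": None},  # None is stored as NULL
    ["2010-09-29 20:30", "2010-09-30 20:30", "2010-10-01 20:30"],
)
```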
Project ID: #818811
About the project
8 freelancers are bidding on this job, at an average of $192/hour
I have done more than 4 web spider projects with Python. I can provide a MySQL + Python solution for your project. The spider will be multi-threaded.
This is my specialty, and I already have a lot of the code for this job. I just need to tweak it to your particular needs. See PM for further details.