crawler

Cancelled, posted Jul 29, 2010, paid on delivery

*Expert rating removed as it seems it is not very popular here.

The crawler needs to be capable of completing the tasks below:

1. Collect external links from "list pages" specified by URL (e.g. specified categories of dmoz, yahoo, and 5-10 other major link directories). A PHP sketch of this step follows the list.

   1. Analyze the PageRank of the collected links and save only sites with a PageRank higher than 3.

   2. Find RSS feed(s) on the pages found above. Save the RSS feed(s) in the DB and make the crawling result downloadable as CSV.

2. Collect Twitter URLs from specified sections of [[url removed, login to view]][1]. Visit the Twitter accounts whose follower count is greater than X (X and the URL / wefollow section are defined at the start of the crawl) and save the RSS of those accounts. A sketch of the follower filter follows the list.

   1. There is a DFD image which clarifies this, but I can't attach it here...

3. Minimalistic UI where the results of crawls can be accessed in a list and downloaded as CSV. The list contains: date of crawl, URL where the crawl started. A sketch of the CSV export follows the list.
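
A minimal PHP sketch of the list-page crawl in task 1, assuming the directory pages are plain HTML and the DOM extension is available. `getPageRank()` is a hypothetical placeholder, since the brief does not say which PageRank source should be queried:

```php
<?php
// Sketch of task 1: collect external links from a directory "list page",
// keep only sites whose PageRank is above 3, and discover their RSS feeds.

function collectExternalLinks(string $listUrl): array
{
    $html = @file_get_contents($listUrl);
    if ($html === false) {
        return [];
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html);

    $listHost = parse_url($listUrl, PHP_URL_HOST);
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        $host = parse_url($href, PHP_URL_HOST);
        // Keep only absolute links that point away from the directory itself.
        if (!empty($host) && $host !== $listHost) {
            $links[$href] = true;
        }
    }
    return array_keys($links);
}

function findRssFeeds(string $siteUrl): array
{
    $html = @file_get_contents($siteUrl);
    if ($html === false) {
        return [];
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html);

    $feeds = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        $type = strtolower($link->getAttribute('type'));
        if (in_array($type, ['application/rss+xml', 'application/atom+xml'], true)) {
            $feeds[] = $link->getAttribute('href');
        }
    }
    return $feeds;
}

function getPageRank(string $url): int
{
    // Placeholder: query whatever PageRank source the project settles on.
    return 0;
}

// Example crawl of a single list page.
foreach (collectExternalLinks('http://www.dmoz.org/Computers/') as $url) {
    if (getPageRank($url) > 3) {
        foreach (findRssFeeds($url) as $feed) {
            echo "$url -> $feed\n";
        }
    }
}
```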
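
A sketch of the follower-count filter in task 2. The `fetchFollowerCount()` helper and the feed URL scheme are placeholders, since the brief does not specify how the follower number or the Twitter RSS should be obtained (API vs. profile-page scraping):

```php
<?php
// Sketch of task 2: given Twitter profile URLs collected from a wefollow
// section, keep only accounts with more than $minFollowers followers and
// record an RSS feed URL for each of them.

function fetchFollowerCount(string $twitterUrl): int
{
    // Placeholder: query the Twitter API or parse the profile page here.
    return 0;
}

function filterTwitterAccounts(array $twitterUrls, int $minFollowers): array
{
    $kept = [];
    foreach ($twitterUrls as $url) {
        if (fetchFollowerCount($url) > $minFollowers) {
            // Assumed feed location; the real URL scheme would be
            // confirmed during implementation.
            $kept[$url] = rtrim($url, '/') . '/rss';
        }
    }
    return $kept;
}

// $minFollowers ("X" in the brief) is defined when the crawl is started.
$accounts = filterTwitterAccounts(
    ['http://twitter.com/example_user'],   // URLs scraped from the wefollow section
    1000
);
print_r($accounts);
```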
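
A sketch of the CSV download in task 3, assuming a MySQL table named `crawls` with columns `crawl_date`, `start_url`, and `feed_url`; these names are illustrative, not specified in the brief, and the real schema would follow the DFD:

```php
<?php
// Sketch of task 3: stream the results of one crawl as a CSV download.

$pdo = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'password');

header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="crawl_results.csv"');

$out = fopen('php://output', 'w');
fputcsv($out, ['crawl_date', 'start_url', 'feed_url']);

$stmt = $pdo->prepare(
    'SELECT crawl_date, start_url, feed_url FROM crawls WHERE crawl_id = ?'
);
$stmt->execute([$_GET['id'] ?? 0]);

while ($row = $stmt->fetch(PDO::FETCH_NUM)) {
    fputcsv($out, $row);
}
fclose($out);
```

The listing page itself would run a plain `SELECT crawl_date, start_url FROM crawls` and render each row with a link to this download script.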

## Deliverables

DFD attached.

Engineering, Linux, MySQL, PHP, Project Management, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing

Project ID: #3608648

About the Project

Remote project, active Jul 29, 2010