Application to run automated deep search of targeted domains

Closed · Posted 7 years ago · Paid on delivery

OVERVIEW

Goal of application:

The goal is a native application for Windows 7 through 10 that lets the user run scheduled searches of targeted domains, including linked documents, for a series of keywords and phrases. The search should go deeper than Google Alerts, and any new results should be published to an RSS feed or sent to an email address. The application will have a simple user interface that can be used by anyone with average computer skills. The application will run in the background with an icon in the system tray for easy access.

I have detailed my anticipated approach, function and interface below. Other approaches are welcome, but submitters should justify their approach and demonstrate both its ease of use and its ability to return search results.

Anticipated solution:

Use Scrapy to scrape domains and pyPDF to search through linked PDFs.

USER INTERFACE

Enter Domains:

Ability to input at least 10,000 domains. This can be a simple text box with each domain on a separate line.
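As a rough illustration of how the contents of such a text box could be turned into a clean domain list, here is a minimal sketch. The function name parse_domains is hypothetical, and the handling of pasted URLs, blank lines and duplicates is an assumption rather than a stated requirement.

```python
def parse_domains(raw_text):
    """Return unique, lower-cased domains from text with one entry per line."""
    seen = set()
    domains = []
    for line in raw_text.splitlines():
        entry = line.strip().lower()
        if not entry:
            continue  # skip blank lines
        # Assumption: users may paste full URLs, so keep only the host part.
        if "://" in entry:
            entry = entry.split("://", 1)[1]
        entry = entry.split("/", 1)[0]
        if entry not in seen:
            seen.add(entry)
            domains.append(entry)
    return domains


print(parse_domains("Example.com\n\nhttps://example.org/path\nexample.com\n"))
```

Removing duplicates while keeping the input order means a list of 10,000 lines is crawled in the order the user entered it.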

Search Words & Phrases:

I want to be able to input as many as 30 individual keywords or phrases that will be searched for on these domains.

Activity Indicator:

I want a button to turn the scraper on/off, along with a dialogue that indicates that the scraper is turned on, turned off, or actively running.

Run-time Selector:

The ability to input a start time and a finish time between which the scraper is allowed to run. The scraper will pause at the finish time and resume where it left off at the prescribed start time the next day.
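The check for whether the scraper may run at a given moment could look something like the sketch below. The name in_run_window is hypothetical, and the code assumes that start and finish are local times of day and that a window such as 22:00 to 06:00 is meant to span midnight.

```python
from datetime import time


def in_run_window(now, start, end):
    """Return True if `now` lies in the half-open window [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 08:00 to 17:00.
        # Note: start == end yields an empty window under this sketch.
        return start <= now < end
    # Window that spans midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


print(in_run_window(time(23, 30), time(22, 0), time(6, 0)))
print(in_run_window(time(12, 0), time(22, 0), time(6, 0)))
```

To resume where it left off, the scraper would also need to save its queue of unvisited pages when the window closes. That part is not shown here.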

Output selector:

The ability to select between publishing content as an RSS feed or distributing through a daily email, with the ability to enter an email address for distribution.
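For the email option, a daily digest could be assembled with Python's standard email package, as in this sketch. The function name build_digest_email, the sender address and the result fields (title, url, excerpt) are illustrative assumptions. Sending the message, for example through smtplib, is left out.

```python
from email.message import EmailMessage


def build_digest_email(results, recipient, sender="alerts@example.com"):
    """Build a plain-text digest with one title/link/excerpt group per result."""
    message = EmailMessage()
    message["Subject"] = f"{len(results)} new search result(s)"
    message["From"] = sender  # placeholder address, an assumption
    message["To"] = recipient
    lines = []
    for result in results:
        lines.extend([result["title"], result["url"], result["excerpt"], ""])
    message.set_content("\n".join(lines))
    return message


digest = build_digest_email(
    [{"title": "Budget report", "url": "https://example.com/a", "excerpt": "water budget approved"}],
    "user@example.org",
)
print(digest["Subject"])
```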

FUNCTIONALITY

The program will work as a native Windows 7+ application that can be installed, uninstalled, and operated by someone with average computer skills. If additional resources are required to run the application, they should be included in the installer. Once activated, the program will run in the background and sit in the system tray for easy access. The program will start and run in the background whenever the computer is booted.

There will be a user interface that does not have to be fancy, but must work as described above. There will be no command line operations required.

The scraper will run during the times set in the user interface.

The scraper will scrape all web pages and documents published to all domains within the list, along with any linked PDFs. This data will then be searched using the keywords and phrases, and all new results since the previous run will be published to an RSS feed or emailed to an address that can be entered through the UI.
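The brief anticipates Scrapy for crawling. Purely to illustrate the link-following step, the sketch below uses only the Python standard library instead. The names LinkCollector and classify_links are hypothetical, and the rule that linked PDFs are kept even when hosted on another domain is one possible reading of the requirement, not a confirmed one.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def classify_links(html, base_url, domain):
    """Split a page's links into same-domain pages to crawl and PDFs to fetch."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    pages, pdfs = [], []
    for url in collector.links:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # ignore mailto:, javascript:, and similar links
        host = parsed.hostname or ""
        if parsed.path.lower().endswith(".pdf"):
            pdfs.append(url)  # assumption: linked PDFs are kept even off-domain
        elif host == domain or host.endswith("." + domain):
            pages.append(url)
    return pages, pdfs


sample = '<a href="/about">A</a><a href="docs/report.PDF">R</a><a href="https://other.net/x">X</a>'
print(classify_links(sample, "https://example.com/index.html", "example.com"))
```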

There will be a list of keywords and phrases that can be entered through the user interface. Keywords will work as single words, returning search results for pages and documents on which they are found. Key phrases will work as “and” functions: any page or document that includes all of the words entered will be returned as a result. That is to say, key phrases should not look for exact word ordering; any page that contains all of the words within the key phrase should be returned as a result.
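The matching rule described above can be sketched as a set comparison, where a keyword is simply a phrase of one word. The names tokenize and matching_terms are hypothetical. Case-insensitive, whole-word matching on ASCII letters and digits is an assumption, since the brief does not say how case, punctuation or partial words should be treated.

```python
import re


def tokenize(text):
    """Return the set of lower-cased words (ASCII letters and digits) in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matching_terms(text, terms):
    """Return the terms whose words all occur in text, in any order."""
    words = tokenize(text)
    hits = []
    for term in terms:
        needed = tokenize(term)
        if needed and needed <= words:  # every word of the term is present
            hits.append(term)
    return hits


page = "The city council approved the water budget."
print(matching_terms(page, ["budget", "water council", "council election"]))
```

Here "water council" matches although the two words appear in the opposite order and are not adjacent, while "council election" does not match because "election" is missing.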

OUTPUT

All pages and documents that meet the keyword / key phrase selection criteria and that are new or changed since the last run should be returned. Search results need only be compared to the previous run, not all previous results, to be considered new or changed.
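One way to detect new or changed results against the previous run only is to keep a content fingerprint per URL, as in this sketch. The names fingerprint and new_or_changed are hypothetical. Hashing the extracted text with SHA-256 is an assumption about what "changed" means, since the brief does not define it.

```python
import hashlib


def fingerprint(text):
    """Return a stable digest of a page's or document's extracted text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def new_or_changed(current, previous):
    """Compare this run with the previous run only.

    current:  {url: extracted_text} for results matching in this run
    previous: {url: fingerprint} saved at the end of the previous run
    Returns (urls_to_report, snapshot_to_save_for_next_run).
    """
    snapshot = {url: fingerprint(text) for url, text in current.items()}
    report = [url for url, digest in snapshot.items() if previous.get(url) != digest]
    return report, snapshot


previous_run = {"a": fingerprint("one"), "b": fingerprint("two")}
report, snapshot = new_or_changed({"a": "one", "b": "changed", "c": "three"}, previous_run)
print(report)
```

Because only the latest snapshot is stored, a result that disappears for one run and then returns is reported again, which is consistent with comparing against the previous run rather than all previous results.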

Whether outputting through RSS or email, the output should be similar to Google search results in that it shows the title of the page or document as a heading and link, and below should be an excerpt from the document showing the use of the keywords or phrases as a paragraph or span.
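A sketch of the RSS side of this output, using the standard library's ElementTree, is shown below. The names excerpt and build_rss, the excerpt width and the result fields (title, url, excerpt) are illustrative assumptions. The excerpt helper uses a simple substring search around the first matched word and is not tied to any particular matching rule.

```python
import xml.etree.ElementTree as ET


def excerpt(text, term, width=80):
    """Return text surrounding the earliest occurrence of any word in term."""
    lower = text.lower()
    positions = [lower.find(word) for word in term.lower().split()]
    positions = [p for p in positions if p >= 0]
    if not positions:
        return text[: 2 * width].strip()
    first = min(positions)
    return text[max(first - width, 0) : first + width].strip()


def build_rss(results, feed_title, feed_link):
    """Return an RSS 2.0 document with one item per result."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "New keyword matches"
    for result in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = result["title"]          # heading
        ET.SubElement(item, "link").text = result["url"]             # link
        ET.SubElement(item, "description").text = result["excerpt"]  # excerpt
    return ET.tostring(rss, encoding="unicode")


text = "The council approved the water budget on Monday."
feed = build_rss(
    [{"title": "Budget report", "url": "https://example.com/r.pdf",
      "excerpt": excerpt(text, "water budget", width=10)}],
    "Matches", "https://example.com/feed",
)
print(feed)
```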

Data Mining, Python, Web Scraping, Web Search, Windows Desktop

Project ID: #12885079

About the project

18 proposals · Remote project · Active 7 years ago

18 freelancers are bidding on this job, with an average bid of $1266/hour

mmadi

Dear Client, Greetings from Flowgica technologies, I have experience with these skills. We do have similar experience doing something similar to yours therefore I am looking forward to discuss more about more details a…

$1050 USD in 24 days
(9 reviews)
6.0

ianoc

What experience do you have that is relevant to this project? NOTE: I have answered your question in my project proposal. Please refer my work history/client feedback if you need more convincing. Thank you. Proposal: …

$1500 USD in 21 days
(41 reviews)
5.8

Alexod

hi. i can do this

$866 USD in 20 days
(703 reviews)
5.9

esalem

What experience do you have that is relevant to this project? I have long experience in developing desktop applications and I think it will help me in serve you better. Proposal: Hello, I am VB, VBA, C#, WPF, Java, …

$2657 USD in 35 days
(4 reviews)
5.6

Shopify

What experience do you have that is relevant to this project? Hello, I want to show you all relevant Demo and Designs which is similar to your project completed previously. To make sure about the requirement set and c…

$1546 USD in 25 days
(2 reviews)
5.1

softsolution2000

I am ready to get started right away.... Can we discuss the project details? My distinction, payment after your complete satisfaction with the resulted task.

$1027 USD in 20 days
(5 reviews)
5.2

shahiddar

What experience do you have that is relevant to this project? I am a highly skilled web researcher,data entry provider seeking an opportunity to leverage my expertise and demonstrate my high level of technical an admin…

$755 USD in 20 days
(15 reviews)
5.4

cracken

Hi, I am competitive to this kind of task, can take good care of this project. In fact, I already done related to this job before. Let me know the best of your time so we can discuss further based on your requirements…

$1499 USD in 20 days
(23 reviews)
4.8

eagleblackdesign

Hello! We are a professional team of web developers with huge experience in using python for custom webapps based on django and odoo. We are available and will be happy to help you with the project. Looking forward fo…

$1300 USD in 30 days
(2 reviews)
4.3

Harunorrashed86

What experience do you have that is relevant to this project? Hi sir, I am scraping expert, I have did too many similar projects, please provide me website list that i can give you exact time frame and sample. Check…

$1250 USD in 20 days
(25 reviews)
4.2

quickapp100

What experience do you have that is relevant to this project? Hi we are a software development company from india in past 8 years me & my Team developed approx 950 projects for global market , we try to satisfy cli…

$750 USD in 20 days
(3 reviews)
2.9