
Create a keyword/phrase counting program for webpages listed in a search result

$30-150 USD

Cancelled
Posted almost 13 years ago

Paid on delivery
This program should accept a query term and search for it using Google. All webpages listed in the Google results should then be analyzed to produce a final list of common phrases. It should include an option to specify the number of Google search results to include, up to 100. It should issue only a single Google query per term, since up to 100 results can be obtained from Google in a single query.

Once the list of URLs has been obtained from the Google results, the program should produce a list of the most common 2, 3, 4, and 5 word combinations across all webpages in the list. All special characters and code should be removed, such as -.,! or any HTML, JavaScript, etc. Note that this applies to webpages, not sites, so only the specific page URL found in the Google query needs to be analyzed. Finally, a list should be saved or output for each of the word combination lengths.

**For example**, if the query term is "lyme disease", this program will perform a Google search for "lyme disease". The 100 results returned for this query (the maximum Google allows for one query) will have their URLs added to a list. Every URL will then have its text content scraped, special characters and code removed, and the remaining text analyzed to produce the most common 2, 3, 4, and 5 word combinations. These lists will then be saved as a text file or output another way (console or whatever).

"Word combinations" means words that are found together and separated by a space. So the text "one word two word three word" could have a two word combination of "word two" or a three word combination of "two word three".
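
The cleaning and counting steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished deliverable: it uses Python's standard library, assumes the HTML of each result page has already been downloaded (the Google query and page fetching are left out), and the names `TextExtractor`, `page_words`, and `count_ngrams` are hypothetical. It also simplifies "special characters" to mean anything other than ASCII letters and digits, so a word like "don't" would be split in two.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def page_words(html):
    """Return the lowercase words of one page with markup and punctuation removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Assumption: a "word" is a run of ASCII letters or digits.
    return re.findall(r"[a-z0-9]+", text)


def count_ngrams(pages, sizes=(2, 3, 4, 5)):
    """Count word combinations of each size across all pages.

    `pages` is an iterable of HTML strings, one per result URL.
    Combinations never span two different pages.
    """
    counts = {n: Counter() for n in sizes}
    for html in pages:
        words = page_words(html)
        for n in sizes:
            for i in range(len(words) - n + 1):
                counts[n][" ".join(words[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    sample = "<html><script>var x = 1;</script><p>One word, two word, three word!</p></html>"
    counts = count_ngrams([sample, sample])
    for n in sorted(counts):
        print(f"--- most common {n} word combinations ---")
        for phrase, total in counts[n].most_common(5):
            print(total, phrase)
```

Each `Counter` could then be written to its own text file, one per combination length, to match the requested output.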
Project ID: 3347905

About this project

Remote project
Active 13 years ago

About the client

United States
0.0
0
Member since September 11, 2005

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)