Scrape a website for company information

€30-250 EUR

Completed
Posted about 6 years ago

Paid on delivery
For a research project, I need to find identifiers (a tax code, called "YTJ", made up of 8 digits) for a list of roughly 7700 Finnish companies. I also need to know whether each company's name has changed over time.

I have written some R code to do this, using rvest and RSelenium for the newspaper webpage and jsonlite for the API. I would greatly appreciate it if you could alter this code so that I can keep using it later if my list of companies changes. I would also really like to learn more about how to do this from your code.

The YTJ and name-change information is available from two sources: an API (link: [login to view URL]) and the webpage of a Finnish newspaper, which actually seems to be linked to the same API (link: [login to view URL]). My R code is not finished, and I would appreciate it if it could be continued; however, if it is absolutely useless, it is of course fine to write your own.

I look forward to your help and to hearing from you! Below is an example of what I have done so far. I'll be happy to clarify more if helpful.

**********

Using Nokia as an example:

1) From the newspaper webpage
- On [login to view URL] I search for "Nokia".
- The results page shows many "Nokia" companies and their Y-tunnus (also called "YTJ"); this is the number/identifier I need.
- Clicking on, for instance, "Nokia Oyj" shows a list of characteristics. In the two attached images, img1 and img2, you can see the information I would like to have, from the tables "Perustiedot" and "Edellinen nimitieto". The second table shows whether there has been a name change in the past.

Problems I can't solve:
- I could not figure out how to loop through my company list so that, if the search engine doesn't find a result, it just goes on to the next item in my list.
- I could not find out how to loop through the results page to get the info for all "Nokias" (for instance, "Nokia Technologies Oy").
- The second table, with the name changes ("Edellinen nimitieto"): some companies have it and some don't, and it appears in different places on the page.

2) From the API
- On the page [login to view URL] I can type "Nokia" under "name".
- When clicking "try it out" I receive a JSON response listing Nokia companies (see Img3 for what it looks like).
- Using jsonlite, I turn this response into a table of information.

Problems I can't solve:
- Looping through my company list using this page.
- The information in the JSON includes a URL to additional info, which should include the name change.

Other problems:
- Some firms can't be found because of typos, or because company forms like "Ltd" or "Oy" are useless and hinder the search results. I tried simply removing those forms, but was wondering whether one could also experiment with substring searches or something similar.

Attached:
- the list of firms as "[login to view URL]", with 7744 names in it
- images of the search outcomes
- my R code so far in
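The looping and name-matching problems listed above could be approached along the following lines. This is only a minimal sketch, written in Python rather than R to keep it short; the same logic should port to R with tryCatch() for the skip-on-failure step and adist() for approximate matching. The `search` callable, the record keys (`businessId`, `name`, `previousNames`), the list of legal forms, and all sample records are assumptions made for illustration, not the real API's schema or data.

```python
import re
from difflib import SequenceMatcher

# Legal-form tokens dropped before searching (an assumed, incomplete list).
LEGAL_FORMS = {"oy", "oyj", "ab", "abp", "ky", "tmi", "ltd"}


def clean_name(name):
    """Lowercase a company name and drop punctuation and legal-form tokens."""
    tokens = re.findall(r"\w+", name.lower())
    kept = [t for t in tokens if t not in LEGAL_FORMS]
    # Fall back to the full token list if nothing but legal forms remained.
    return " ".join(kept or tokens)


def similarity(a, b):
    """Score in [0, 1] for how alike two cleaned names are (typo-tolerant)."""
    return SequenceMatcher(None, clean_name(a), clean_name(b)).ratio()


def lookup_all(names, search):
    """Query every name; collect all hits and remember names with none.

    `search` is a hypothetical callable: cleaned name -> list of dicts with
    assumed keys 'businessId', 'name' and, optionally, 'previousNames'.
    """
    rows, missed = [], []
    for name in names:
        try:
            hits = search(clean_name(name))
        except Exception:  # broad on purpose in this sketch; narrow it in real use
            hits = []
        if not hits:
            missed.append(name)  # no result: note it and move on
            continue
        for hit in hits:  # keep every result, not only the first
            rows.append({
                "query": name,
                "business_id": hit["businessId"],
                "name": hit["name"],
                "previous_names": hit.get("previousNames", []),  # may be absent
                "score": round(similarity(name, hit["name"]), 2),
            })
    return rows, missed


# Stand-in for the real API or page lookup; IDs and records are made up.
FAKE_DATA = {
    "nokia": [
        {"businessId": "0000000-0", "name": "Nokia Oyj"},
        {"businessId": "0000000-1", "name": "Nokia Technologies Oy"},
    ],
    "example": [
        {"businessId": "0000000-2", "name": "Example Oy",
         "previousNames": ["Old Example Oy"]},
    ],
}


def fake_search(query):
    return FAKE_DATA[query]  # raises KeyError when nothing matches


rows, missed = lookup_all(["Nokia Oyj", "No Such Firm Ltd", "Example Oy"], fake_search)
for row in rows:
    print(row)
print("not found:", missed)
```

Passing the lookup in as a function keeps the loop independent of whether the data comes from the API or the newspaper page. Keeping every hit together with a similarity score, instead of only the first result, leaves the choice of a cut-off to a later filtering step, and the `missed` list can be retried with looser cleaning.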
Project ID: 16566340

About this project

7 proposals
Remote project
Active 6 years ago

Awarded to:
Hi, this is Santhosh from India. I am a Business Management graduate with a Computer Science engineering background who is passionate about data, mathematics and technology. I have worked on many data analytics projects involving SAS and R, as well as projects involving Excel (pivot tables, macros) and VBA. I believe I have enough expertise in R to complete this project as per your requirements. I am sure that you will be 100% satisfied with my work. Please get in touch. Looking forward to hearing from you. Thanks & Regards, Santhosh
€155 EUR in 5 days
5.0 (62 reviews)
6.0
7 freelancers are bidding on this job at an average of €150 EUR
A proposal has not yet been provided
€100 EUR in 2 days
5.0 (126 reviews)
6.9
Hi sir, I am Hasan Jack and I can help you scrape company information from the website, as I have built 300+ scrapers so far. Looking forward to hearing from you. Best Regards, Hasan Jack
€250 EUR in 3 days
5.0 (2 reviews)
1.4
Scrape a website for company information: we can easily do this. We have done a similar project in the past. Please send me a message so that we can discuss more.
€155 EUR in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
€155 EUR in 3 days
0.0 (0 reviews)
0.0

About the client

Fontainebleau, France
5.0
2
Payment method verified
Member since March 25, 2018

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)