• The script will run on Windows Server 2016. As you know, the RDP session will terminate, so I'm not sure whether a visible browser can be launched. I would prefer a headless browser running in the background.
• There are 7 root sites and in total 48 subsites.
• The script takes and reads a list of names, generates links for 90% of the sites, and crawls the remaining sites. (There are many input files; the format always stays the same, but the data/names differ.)
• All of the data is in a table on the site
• All output formats and documentation are written
• Basic features such as enabling/disabling sites, custom crawl delay, pause, play, skip, on-screen status display, and custom timeout limits/retry attempts are required.
• Proxy functionality is required.
• 1 site has a login.
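
To make the link-generation and retry requirements above more concrete, here is a minimal sketch in Python of how they might fit together. It is only an illustration under assumptions: the `SiteConfig` fields, the `{name}` URL template, and the `fetch` callable are all hypothetical and are not taken from the unfinished script. The real fetch step (headless browser or plain HTTP request) is passed in, so the sketch itself does no network access.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional
from urllib.parse import quote


@dataclass
class SiteConfig:
    """Per-site settings. All field names here are hypothetical."""
    name: str
    url_template: str            # assumed shape: "https://example.com/search?q={name}"
    enabled: bool = True         # lets a site be switched on or off
    crawl_delay: float = 1.0     # seconds to wait before retrying a request
    timeout: float = 30.0        # seconds before a request is abandoned
    max_retries: int = 3         # attempts per URL before it is skipped
    proxy: Optional[str] = None  # proxy address, if one is used


def build_links(site: SiteConfig, names: Iterable[str]) -> List[str]:
    """Generate one URL per non-empty name; a disabled site yields nothing."""
    if not site.enabled:
        return []
    return [
        site.url_template.format(name=quote(n.strip()))
        for n in names
        if n.strip()
    ]


def fetch_with_retry(
    url: str,
    site: SiteConfig,
    fetch: Callable[[str, float, Optional[str]], str],
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url, timeout, proxy) up to site.max_retries times.

    Returns the fetched page, or None if every attempt raised an exception.
    """
    for attempt in range(1, site.max_retries + 1):
        try:
            return fetch(url, site.timeout, site.proxy)
        except Exception:
            # Wait before the next attempt, but not after the final one.
            if attempt < site.max_retries:
                sleep(site.crawl_delay)
    return None
```

With one `SiteConfig` per site, the names read from an input file would go through `build_links`, and each resulting URL through `fetch_with_retry`. Pause, skip, and the on-screen status display would sit in the loop that drives these two functions.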
The script should be optimized for efficient use of memory and CPU. I am the project manager and a Windows System/Networking Administrator with strong IT expertise, high project feedback, and 5 years of experience here. I'll provide plenty of testing and system resources, such as a few Windows VPSs. Contact me if you are serious about the project. Python is preferred but not required.
Long-term work and more projects are available.
Unfinished script available.
21 freelancers are bidding an average of $175 for this job
Hi, I always provide scrapers/crawlers for the Windows platform. I can use headless browsers or HTTP requests. I bid 50 for each of the 7 sites. I will provide the code in C#. Thanks
Hi there! I have read the details and I am sure I can do this task for you. I have more than 7 years of experience in this field and can deliver good-quality work. Looking forward to hearing back from you. Thanks
Hello! I am a Python developer. I looked at your project and it seems interesting. I have all the skills this project requires. Ping me to discuss in detail.
Hello sir, I am a computer scientist with a lot of experience using Python and Selenium WebDriver. I previously discussed the project details with you, but we didn't start because I went on vacation.