Cancelled

Scrape Wayback for AU domain name and first archived date

Unless someone in the Freelancer community knows otherwise, there doesn't seem to be a way of querying the AU TLD whois records for the creation date of Australian domain names, even with dig or any other command-line tools. My imperfect solution is to harvest the earliest Wayback archive entry.

Image 1 shows the two data elements I need to extract for each entry in the zonefile CSV I have: (1) the domain name and (2) the date the domain name was first archived in Wayback.

Using a script I found in the Wayback APIs, I have built a [clunky] batch script [[login to view URL], attached] that captures data into a series of files, which I batch rename to CSV [[login to view URL] example attached].

Each capture from [login to view URL] produces a file with the oldest archive date on line 1. The domain name is obvious from each line; the date is a 14-digit timestamp in YYYYMMDDHHMMSS format.
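To illustrate the kind of capture described above, here is a minimal Python sketch of requesting only the oldest capture for one domain from the Wayback CDX server (see the README under "Some resources"). It is not my attached batch script. The function names are hypothetical, and it assumes the CDX server's default plain-text output and its `url` and `limit` query parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# CDX search endpoint, as documented in the wayback-cdx-server README.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain):
    """Build a query URL that asks for only the first (oldest) capture."""
    params = {"url": domain, "limit": 1}
    return CDX_ENDPOINT + "?" + urlencode(params)


def fetch_first_capture_line(domain, timeout=30):
    """Return line 1 of the CDX response, or None if Wayback has no capture.

    Performs a network request; errors (timeouts, HTTP failures) are left
    to the caller to handle.
    """
    with urlopen(build_cdx_url(domain), timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    lines = text.splitlines()
    return lines[0] if lines else None
```

Asking for a single row keeps each response small, which matters when the list runs to millions of domains.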

The script will need a chunk of regex to pick off the first line of each successful data capture, clean up the timestamp into a readable format (e.g. 19981202014938 becomes 02/12/1998), and append the domain name and date to an external CSV file.
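A rough sketch of the timestamp clean-up, in Python. It assumes the first line is whitespace-delimited and contains the 14-digit timestamp as its own field; the function name and the sample line are made up for illustration.

```python
import re
from datetime import datetime

# A 14-digit timestamp standing alone as a whitespace-delimited field.
TIMESTAMP_RE = re.compile(r"(?<!\S)(\d{14})(?!\S)")


def first_capture_date(first_line):
    """Return the capture date as DD/MM/YYYY, or None if none is found."""
    match = TIMESTAMP_RE.search(first_line)
    if match is None:
        return None
    try:
        parsed = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    except ValueError:
        # Fourteen digits that do not form a valid date and time.
        return None
    return parsed.strftime("%d/%m/%Y")


# Illustrative line only, not real capture data.
sample = "au,com,example)/ 19981202014938 http://www.example.com.au:80/ text/html 200 ABCDEF 1234"
print(first_capture_date(sample))  # 02/12/1998
```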

Some of the domains have no entry in Wayback, but the script will still need to write the URL to the CSV with a 'nul' value as the date element, so I can see which have no Wayback records.
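The 'nul' requirement could look something like this in Python. This is only a sketch: the function name and the two-column layout (domain, date) are assumptions, and the domains shown are placeholders.

```python
import csv


def append_result(csv_path, domain, date_text):
    """Append one row (domain, date) to the CSV, writing 'nul' if no date."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([domain, date_text if date_text else "nul"])


# Placeholder values for illustration.
# append_result("results.csv", "example.com.au", "02/12/1998")
# append_result("results.csv", "nowayback.com.au", None)
```

Opening the file in append mode for each row means results already written survive if a long run is interrupted.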

The script should preferably read the URLs from an external text file.
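For reading the list, a sketch along these lines would do, assuming one domain per line in a UTF-8 text file (the function name is hypothetical). It yields domains one at a time rather than loading the whole file, which suits a very large list.

```python
def read_domains(path):
    """Yield domains one at a time from a text file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            domain = raw_line.strip()
            if domain:
                yield domain
```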

The successful bidder can use the URLs listed in the '[login to view URL]' file attached as a test list for the script. I don't need the scrape, just the script (there are 2.1 million domain names to query).

Any other questions, DM me.

Some resources:
https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md#basic-usage

https://blog.archive.org/developers/
https://blog.archive.org/2013/07/04/metadata-api/
https://archive.org/help/json.php

Autobids will be deleted if the proposal is not read, and acknowledged within 12 hours...

Reissuing this project, as it should have been listed in AUD, not USD. Check for the new project just listed.

Skills: Data Entry, Excel, JavaScript, JSON, Shell Script


About the employer:
(2 reviews) Melbourne, Australia

Project ID: #14356249

7 freelancers are bidding an average of $155 for this job

Venkat2011sri

Hi, I have been working as a freelancer for 12 years and have completed 1500 projects. I assure you 100% accuracy in the delivered work. I look forward to working with you. Relevant Skills and Experience Data Extraction Proposed

$166 USD in 3 days
(165 reviews)
6.7
$155 USD in 3 days
(17 reviews)
5.7
dustafo

I can get this done for you in just a couple of days. Check my feedback; I do a lot of [login to view URL] work.

$210 USD in 2 days
(66 reviews)
5.6
ChinmoySarker

Hi, Being attracted with your declaration of the program, I feel tempted to have the chance to make your work complete carefully and sincerely. I would like at present to have your kind mind and as soon as possible.

$100 USD in 3 days
(22 reviews)
3.8
vietdevteam

I have read your project. I'm sure I can help you do it. I have completed many projects similar to this one. Relevant Skills and Experience I am an expert in web scraping. I have created many scraping tools. I h

$150 USD in 1 day
(8 reviews)
3.9
technoweb7

Please look at amazing discount on website development-- [login to view URL] Greetings, we are a team of 35 developers and 20 designers and each having more than 8 years experience. I would very much happy to wo

$155 USD in 3 days
(2 reviews)
3.5
huongth

Hi. I am an expert in VBA, VBScript, Visual Basic, C#, F#, C, C++, ASM, Delphi, Java, iMacros, Flash, ASP, ASP.NET, Access, MySQL, MSSQL, QuickBooks, Oracle. I can create auto scripts to scrape websites, auto click, fo

$150 USD in 3 days
(16 reviews)
3.6