data crawler to login & spider inventory data from distributor website to csv file

$30-100 USD

In progress
Posted over 18 years ago

Paid on delivery
We need an automated crawler that logs into a distributor's warehouse website and downloads inventory data from its tables into a delimited file. The site to crawl is the login/search catalog section of www.electrograph.com. I have saved copies of their pages locally to demonstrate what needs to be done. After the project closes, we will provide actual login details for the live site so the job can be completed.

Walk-through of what needs to be done:

Login Home Page [login to view URL]
Go to the main website and log in using the form in the upper left-hand corner of the page. The user name and password should be configurable.

Successfully Logged In [login to view URL]
After the login has been processed successfully, the page refreshes and now includes a "My Account" section in the upper left-hand corner. Additionally, the "keyword/ Item# search" form is now enabled for our specific account. When submitted, it displays the pricing and inventory quantities available to our account.

Currently their website lets you browse the inventory by category and then paginate through the results (it cannot show all products on one page). We need to follow each category in the select menu "ddlCategory" individually, download all the data on the page in the specified format, and continue to the next page of results if one exists.

Crawling first result page of the first category searched, "Accessories" [login to view URL]
This page displays the information we want to store in a delimited file. We need to trim and store the Model #, Manufacturer, Description, Availability, and Reseller Price columns. Each table row becomes a new line in the delimited file. Take note of the Availability column: it shows the total quantity in stock followed by an "I" icon. When you hover over this "I" icon, it displays a breakdown of which warehouse locations the product is stored in.
For example: 18 (the "I" tooltip says 14 - NY, 4 - NV, meaning 14 units in stock in New York and 4 units in stock in Nevada). We need to store both the total quantity available and the individual location quantities, with a column for each warehouse location.

Crawling second/additional result page(s) of the first category searched, "Accessories" (page 2+) [login to view URL]
Perform the same process as Step 2, downloading and storing all the inventory data, and continue to the next page if it exists. (Note: on the saved version of this page that I provided, the JavaScript that shows the individual warehouse breakdown is not working; it will of course work on the live site.)

Crawling first result page of an additional LARGE category searched, "Plasma Displays" [login to view URL] (interim refine page) [login to view URL] (actual results page)
For some categories that contain a substantial number of products, clicking "SEARCH" does not display results right away. Instead it brings you to another "search plasma displays" form where you can refine your results and search by attributes. We do not care to do this; we simply want to press the "GO" button, which displays all the products under that category in the same manner as Step 2.

Crawling second/additional result page(s) of the additional LARGE category searched, "Plasma Displays" [login to view URL]
Perform the same process as Step 2, downloading and storing all the inventory data, and continue to the next page if it exists.
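The category-and-pagination loop described in the walk-through could be sketched roughly as follows. This is only an illustration in Python: `crawl_categories` and the `fetch_page` callback are hypothetical names, and the callback stands in for whatever code actually submits the "ddlCategory" form, presses "GO" on the interim refine page when one appears, and parses the result table.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical callback: fetch_page(category, page_number) -> (rows, has_next_page).
# The real version would talk to the live site; it is not implemented here.
FetchPage = Callable[[str, int], Tuple[List[List[str]], bool]]


def crawl_categories(categories: Iterable[str],
                     fetch_page: FetchPage) -> Iterator[List[str]]:
    """Yield every table row from every result page of every category."""
    for category in categories:
        page = 1
        while True:
            rows, has_next = fetch_page(category, page)
            for row in rows:
                # Trim surrounding whitespace from each cell, as the brief asks.
                yield [cell.strip() for cell in row]
            if not has_next:
                break  # no further result pages in this category
            page += 1
```

Keeping the site-specific fetching behind a callback means the loop itself can be exercised against the saved local copies before the live login details are available.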
The end result needs to be a comma-delimited file. Example result for parsing of the example link [login to view URL]:

Model Number, Manufacturer, Description, Reseller Price, Total Available Qty, Location NY Qty, Location NV Qty, Location XX Qty
ACE615, ADCOM, ACE-615 ILS SURGE (120V), 315.00, 12, 12, 0, 0
TRAVEL CS/42"PANASON, CALZONE CASE CO, TRAVEL CASE 42" PANASONIC, 345.33, 0, 0, 0, 0
FSD-4100, CHIEF MANUFACTURING, FSD-4100, 97.39, 0, 0, 0, 0
CMA-0608, CHIEF MANUFACTURING, 6'-8' ADJUSTABLE PLATE, 93.39, 0, 0, 0, 0
RC-1PXL, ELECTROGRAPH SYSTEMS, 24-BUTTON SWITCH PANEL FOR VS-1XL, 104.76, 0, 0, 0, 0
RC-1XL, ELECTROGRAPH SYSTEMS, NEW MODEL NUMBER (WAS VS-1XL) REMO, 104.76, 0, 0, 0, 0
FRAME-O, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 245.35, 0, 0, 0, 0
FRAME-W, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 14.89, 5, 5, 0, 0

Notice that on the website some products show a quantity and some say "call for availability". We need to be able to map whatever text is in that field to a text/numerical equivalent. For example, in this implementation we define "Call for availability" as 0. Also, because they are always adding and changing warehouse locations, we need to leave room at the end of the delimited file for new locations. When text is found in the quantity-available field, we look up its equivalent and apply that value to all the location columns as well. For example, "call for availability" results in 0, 0, 0, 0 (Total Quantity Available, Location 1 Qty, Location 2 Qty, Location 3 Qty). We should make room for up to 10 warehouse locations (0, 0, 0, 0, 0, 0, 0, 0, 0, 0). When a quantity is not defined for an indexed warehouse, we replace it with zero. In this example, "Call for availability" means the product is not in stock, so we mark it and all warehouse locations as 0.
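The availability rules above might be expressed along these lines. This is a sketch under assumptions, not a specification: the function names are made up for illustration, the tooltip text is assumed to look like "14 - NY, 4 - NV", and unrecognised text is assumed to map to 0 just as "Call for availability" does.

```python
import re
from typing import Dict, List

# Mapping from non-numeric availability text to a quantity. Only the
# "call for availability" entry comes from the brief; extend as needed.
TEXT_QUANTITIES = {"call for availability": 0}
MAX_LOCATIONS = 10  # room for up to 10 warehouse columns, per the brief


def parse_breakdown(tooltip: str) -> Dict[str, int]:
    """Parse tooltip text such as '14 - NY, 4 - NV' into {'NY': 14, 'NV': 4}."""
    pairs = re.findall(r"(\d+)\s*-\s*([A-Za-z]+)", tooltip)
    return {location: int(quantity) for quantity, location in pairs}


def availability_columns(cell: str, tooltip: str,
                         locations: List[str]) -> List[int]:
    """Return [total, then one quantity per location column (10 columns)]."""
    text = cell.strip()
    if text.isdigit():
        total = int(text)
        per_location = parse_breakdown(tooltip)
        # Locations missing from the tooltip count as zero.
        quantities = [per_location.get(loc, 0) for loc in locations]
        quantities += [0] * (MAX_LOCATIONS - len(locations))
    else:
        # Text value: look up its equivalent (assumed 0 if unknown) and
        # apply it to the total and to every location column.
        total = TEXT_QUANTITIES.get(text.lower(), 0)
        quantities = [total] * MAX_LOCATIONS
    return [total] + quantities
```

With the known locations given as `["NY", "NV", "XX"]`, a cell of `18` with tooltip `14 - NY, 4 - NV` yields 18, 14, 4 followed by eight zeros, and a "Call for availability" cell yields eleven zeros.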
I also need to be able to control the delimiter used in the output file (I have used a comma in this illustration for ease). I also need to be able to control the delay between page navigations (in milliseconds). A database should not be necessary; a simple config file is fine.

We need to get this project completed ASAP. We have several data crawlers that need to be created; the winner of this project can expect future work developing similar crawlers.
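One possible shape for the "simple config file" is an INI file read with Python's standard `configparser`, with rows written through the standard `csv` module. The section name, key names, and helper functions below are all illustrative assumptions, not part of the brief.

```python
import configparser
import csv
import time

# Hypothetical config file contents; section and key names are made up.
EXAMPLE_CONFIG = """
[crawler]
username = your-user
password = your-password
delimiter = ,
delay_ms = 1500
"""


def load_settings(text: str) -> dict:
    """Read login, delimiter, and page delay from INI-style text."""
    # interpolation=None so a '%' in the password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["crawler"]
    return {
        "username": section["username"],
        "password": section["password"],
        "delimiter": section.get("delimiter", fallback=","),
        # The brief specifies milliseconds; time.sleep() expects seconds.
        "delay_seconds": section.getint("delay_ms", fallback=0) / 1000.0,
    }


def write_rows(rows, out, delimiter: str) -> None:
    """Write rows to an open text file using the configured delimiter."""
    csv.writer(out, delimiter=delimiter).writerows(rows)


def pause_between_pages(delay_seconds: float) -> None:
    """Wait the configured time before requesting the next page."""
    time.sleep(delay_seconds)
```

Note that `csv.writer` quotes any field containing the delimiter or a double quote (for example the model number TRAVEL CS/42"PANASON), so the output stays parseable whichever delimiter is configured.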
Project ID: 29093

About the project

1 proposal
Remote project
Active 19 years ago

1 freelancer is bidding an average of $95 USD for this job
I have developed site crawlers in the past. These crawlers can handle cookie-based sessions, JavaScript URLs, and HTTP/HTML redirects. I can use my existing codebase to complete this project. This project can be implemented in Java. With Java you can run it on your desktop and move it to a server if you want to automate it in the future. Please let me know if I can provide more information.
$95 USD in 5 days
5.0 (2 reviews)
4.2

About the client

Brooklyn, United States
5.0
19
Payment method verified
Member since Jun 23, 2005

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)