Big data project

$30-250 AUD

Closed
Posted almost 2 years ago

Paid on delivery
In this project, you will develop an Oozie workflow to process and analyze a large volume of flight data.

• Instructions:
1. Form a project team of four students (including yourself).
2. Install Hadoop/Oozie on your AWS VMs.
3. Download the Airline On-time Performance data set (flight data set) from the period of October 1987 to April 2008 on the following website: [login to view URL]:10.7910/DVN/HG7NV7
4. Design, implement and run an Oozie workflow to find out
   a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule;
   b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and
   c. the most common reason for flight cancellations.

• Requirements:
1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode.
2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time.
3. Run your workflow to analyze the data in a progressive manner with an increment of 1 year, i.e. the first year (1987), the first 2 years (1987-1988), the first 3 years (1987-1989), …, and the total of 22 years (1987-2008), on the maximum allowed number of VMs, and measure each corresponding workflow execution time.

• Submission (all in a zipped file: [login to view URL]):
1. A [login to view URL] text file that lists all the commands you used to run your code and produce the required results in a fully distributed mode
2. An [login to view URL] text file that stores the final results from all the runs
3. The source code of your MapReduce programs (including the JAR files) and any other programs you might have developed and included in the workflow
4. The Oozie workflow XML file
5. A project report in PDF that includes:
   a. A diagram that shows the structure of your Oozie workflow
   b. A detailed description of the algorithm you designed to solve each of the problems
   c. A performance measurement plot that compares the workflow execution time in response to an increasing number of VMs used for processing the entire data set (22 years) and an in-depth discussion on the observed performance comparison results
   d. A performance measurement plot that compares the workflow execution time in response to an increasing data size (from 1 year to 22 years) and an in-depth discussion on the observed performance comparison results
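The three questions in instruction 4 can each be treated as one map step that emits (key, (numerator, count)) pairs and one reduce step that sums them per key. The sketch below shows only that key/value logic in plain Python, run locally on a tiny made-up sample. It is not the Hadoop MapReduce or Oozie code the project asks for. It assumes the column names used in the commonly distributed CSV files of this data set (UniqueCarrier, ArrDelay, Origin, Dest, TaxiIn, TaxiOut, Cancelled, CancellationCode) and "NA" for missing values. The 10-minute on-schedule threshold, the function names, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumption: a flight is "on schedule" if it arrives at most 10 minutes late.
ON_TIME_THRESHOLD_MIN = 10


def map_on_time(row):
    """Emit (carrier, (on_time_flag, 1)) for rows with a usable arrival delay."""
    try:
        delay = int(row["ArrDelay"])
    except (KeyError, ValueError):
        return  # missing ("NA") or malformed delay: skip the record
    yield row["UniqueCarrier"], (1 if delay <= ON_TIME_THRESHOLD_MIN else 0, 1)


def map_taxi(row):
    """Emit (airport, (minutes, 1)): TaxiOut counts for Origin, TaxiIn for Dest."""
    for airport_col, taxi_col in (("Origin", "TaxiOut"), ("Dest", "TaxiIn")):
        try:
            minutes = int(row[taxi_col])
        except (KeyError, ValueError):
            continue  # taxi time not recorded for this flight
        yield row[airport_col], (minutes, 1)


def map_cancellation(row):
    """Emit (cancellation_code, (1, 1)) for cancelled flights with a known code."""
    code = row.get("CancellationCode")
    if row.get("Cancelled") == "1" and code not in (None, "", "NA"):
        yield code, (1, 1)


def run_job(rows, mapper):
    """Simulate shuffle + reduce locally: sum the (numerator, count) pairs per key."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        for key, (num, cnt) in mapper(row):
            totals[key][0] += num
            totals[key][1] += cnt
    return {key: (num, cnt) for key, (num, cnt) in totals.items()}


def ranked_ratios(totals):
    """Return [(key, numerator / count)] sorted from highest to lowest ratio."""
    ratios = ((key, num / cnt) for key, (num, cnt) in totals.items())
    return sorted(ratios, key=lambda kv: kv[1], reverse=True)


def analyze(rows, top_n=3):
    """Run the three local 'jobs' and collect the answers the brief asks for."""
    on_time = ranked_ratios(run_job(rows, map_on_time))
    taxi = ranked_ratios(run_job(rows, map_taxi))
    cancellations = run_job(rows, map_cancellation)
    return {
        "most_on_time": on_time[:top_n],
        "least_on_time": on_time[-top_n:],
        "longest_taxi": taxi[:top_n],
        "shortest_taxi": taxi[-top_n:],
        "top_cancellation_code": max(
            cancellations, key=lambda code: cancellations[code][1], default=None
        ),
    }


# Made-up sample rows, only to exercise the functions above.
SAMPLE = """Year,UniqueCarrier,ArrDelay,Origin,Dest,TaxiIn,TaxiOut,Cancelled,CancellationCode
1987,AA,5,JFK,LAX,4,20,0,
1987,AA,45,JFK,ORD,6,30,0,
1987,UA,-3,ORD,LAX,8,10,0,
1987,UA,NA,ORD,JFK,NA,NA,1,B
1987,DL,NA,LAX,JFK,NA,NA,1,B
1987,DL,0,LAX,ORD,5,12,0,
1987,AA,NA,JFK,LAX,NA,NA,1,A
"""

if __name__ == "__main__":
    sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    for name, value in analyze(sample_rows).items():
        print(name, value)
```

Emitting a (numerator, count) pair rather than a precomputed average keeps the reduce step a plain sum, so partial sums can be combined in any order. In a real workflow each mapper/reducer pair would presumably become one of the three required MapReduce jobs, with the ranking done in the reducer's cleanup or in a follow-up step.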
Project ID: 33639738

About this project

7 proposals
Remote project
Active 2 years ago

7 freelancers are bidding on this job at an average price of $303 AUD
Hi there! I have vast experience in Data Science and Hadoop, and I am confident that I can produce quality content for you that will be of interest to your readers. I am available for a full-time writing position, and I am happy to work on a freelance basis if that is what you prefer. I look forward to hearing from you soon! Thank you for your time. Thanks & regards - Wilfred
$100 AUD in 1 day
0.0 (0 reviews)
Hi. I have already done exactly the same project for an airline company in Europe. I have 20+ years of professional IT experience, including work with Big Data and Hadoop ecosystem technologies. We can start working on it as soon as possible. Thank you.
$1,000 AUD in 10 days
0.0 (0 reviews)
Big data/Hadoop expert. I am ready to develop an Oozie workflow to process and analyze a large volume of flight data. Please send me a message in chat to discuss further details about this project. Thank you.
$222.22 AUD in 6 days
0.0 (0 reviews)
Hi, I have 5+ years of experience with machine learning algorithms and have worked on multiple projects in this field. I can certainly do your project as you like. Please contact me to discuss further. Have a nice day.
$140 AUD in 7 days
0.0 (0 reviews)

About the client

Cairo, Egypt
4.9
39
Payment method verified
Member since October 25, 2018

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)