Deadline: Thursday, 10 Nov 2016
Use Python, pandas, NumPy and scikit-learn.
For visualizations, you will not need anything more complex than scatter plots, histograms or line plots. Provide a single IPython notebook that contains the code for all the answers, with a separate tab for each question. For each task, also write up your answers in a .txt, .doc or .pdf file and submit it along with your code.
1. I have provided you with a dataset called data1. It contains a train and a test dataset. Use a suitable method to predict the “Value” given the features (there are 100 features, a number of which are redundant). Evaluate and present your results using an appropriate error measure.
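As a rough illustration of task 1, the sketch below shows one possible approach among several (Lasso or ridge regression would be equally reasonable): compress the redundant features with PCA, fit ordinary least squares on the components, and report RMSE on held-out data. The file layout of data1 is not described here, so the example runs on synthetic data built to mimic 100 features with redundancy; the variable names and the 99% variance threshold are assumptions, not part of the brief.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for data1 (assumption): 20 independent features plus
# 80 noisy linear combinations of them, giving 100 features in total.
rng = np.random.default_rng(0)
n_samples, n_base, n_redundant = 600, 20, 80
base = rng.normal(size=(n_samples, n_base))
mixing = rng.normal(size=(n_base, n_redundant))
redundant = base @ mixing + 0.01 * rng.normal(size=(n_samples, n_redundant))
X = np.hstack([base, redundant])
y = base @ rng.normal(size=n_base) + rng.normal(size=n_samples)  # stand-in "Value"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize, keep the principal components explaining 99% of the variance,
# then fit a plain linear regression on those components.
model = make_pipeline(StandardScaler(), PCA(n_components=0.99), LinearRegression())
model.fit(X_train, y_train)

# Error measure: root mean squared error, compared with a predict-the-mean baseline.
pred = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
baseline_pred = np.full_like(y_test, y_train.mean())
baseline_rmse = float(np.sqrt(mean_squared_error(y_test, baseline_pred)))
n_components = int(model.named_steps["pca"].n_components_)

print(f"components kept: {n_components} of {X.shape[1]} features")
print(f"test RMSE: {rmse:.3f} (mean baseline: {baseline_rmse:.3f})")
```

With the real data, the synthetic block would be replaced by loading the provided train and test sets with pandas, and the model would be fitted on the train set only.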
2. I have provided you with two datasets in data2.zip. For each dataset:
a. Analyze the data using an appropriate visualization.
b. Use an appropriate method to cluster similar data points together. Justify why you picked the specific method for each dataset.
c. Output the clustered points using an appropriate visualization.
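For task 2, a minimal sketch of how the choice of clustering method might follow from the shape of the data. The contents of data2.zip are not described here, so two synthetic 2-D datasets stand in for them (an assumption): compact blobs, where k-means is a natural fit, and interleaved half-moons, where a density-based method such as DBSCAN tends to work better. The `eps`, `min_samples` and `n_clusters` values and the helper `plot_clusters` are illustrative choices, not prescribed ones.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs, make_moons
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the two datasets in data2.zip (assumption).
blobs_X, blobs_y = make_blobs(
    n_samples=300, centers=[(-5, -5), (0, 5), (5, -5)], cluster_std=1.0, random_state=0
)
moons_X, moons_y = make_moons(n_samples=300, noise=0.05, random_state=0)

# Compact, roughly spherical groups: k-means with the number of groups seen in the plot.
blobs_scaled = StandardScaler().fit_transform(blobs_X)
blobs_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(blobs_scaled)

# Curved, non-convex groups: DBSCAN follows density and labels outliers as -1.
moons_scaled = StandardScaler().fit_transform(moons_X)
moons_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(moons_scaled)

# Ground truth exists only because the data is synthetic; real data would need
# a visual check or an internal measure such as the silhouette score instead.
blobs_ari = adjusted_rand_score(blobs_y, blobs_labels)
moons_ari = adjusted_rand_score(moons_y, moons_labels)
print(f"k-means on blobs, adjusted Rand index: {blobs_ari:.2f}")
print(f"DBSCAN on moons, adjusted Rand index: {moons_ari:.2f}")


def plot_clusters(X, labels, title):
    """Scatter plot of 2-D points colored by cluster label (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily so the rest runs without it

    fig, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=labels, s=15)
    ax.set_title(title)
    ax.set_xlabel("feature 1")
    ax.set_ylabel("feature 2")
    return fig
```

In the notebook, calling `plot_clusters(moons_X, np.zeros(len(moons_X)), "raw data")` before clustering and `plot_clusters(moons_X, moons_labels, "DBSCAN")` afterwards would cover parts a and c with the same scatter plot.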
13 freelancers are bidding on average $196 for this job
Hey there, I'm a professional Python software developer. I have expertise in data analysis using Python libraries such as scikit-learn, pandas and NumPy. Ping me for discussion.
Hi, I am proficient in Python and Data Science. I would like an opportunity to work with you on this project. Please feel free to ping me on chat. My bid is an estimate and may change based on scope of work. Thanks!