R Code to Calculate Random Forest Out-of-Bag Estimate of Error (Revised Price)

$30-250 USD

Closed

Posted

almost 8 years ago

Paid on delivery

Will pay $50 for the project to start immediately and be completed within 24 hours (by June 18th, 11:30pm GMT), plus a $10 bonus if completed within the next 4.5 hours (by June 18th, 4:00am GMT). I am available to work with you by chat until then. Preference will be given to freelancers who have R and Random Forest experience. You will need to be very familiar with Random Forests and R, as I am not and cannot provide much assistance.

Essentially, I am looking for a small enhancement of the Random Forest process in the R GUI called Rattle. From what I can tell by looking at the R add-in called Party, it includes a number of functions that might mean adding perhaps 5-15 additional lines of code to what I already have (although I could certainly be off on that estimate).

Using Rattle, I can easily select my dataset (see below), choose a single Y and the random seed, and choose the ratio of training to testing data. Next, I execute the RF (Random Forest) model, choosing only the number of trees (default is 500) and the number of predictors (default is the integer part of the square root of the m total predictors). From this, R (through Rattle's code) gives me the Out-of-Bag Error and the traditional 2x2 classification grid for both training and testing data. Not including the 5 seconds it takes R to run the code, I can set up this scenario from scratch in less than 1 minute.

Due to Rattle's limitations, I can only execute for a single Y at a time. This issue, as well as the inability to aggregate those Out-of-Bag results, is my problem. The algorithm above is outlined very succinctly at [login to view URL]~dzeng/BIOS740/[login to view URL] on the first page under the title "The algorithm" and is covered in the listed points 1, 2, 3 and 1. Essentially, what I need done is the very next point they list, which says:

2. Aggregated the OOB predictions. (On the average, each data point would be out-of-bag around 36% of the times, so aggregate these predictions.) Calculate the error rate, and call it the OOB estimate of error rate.

However, what I am really after is the PPV (Positive Predictive Value, i.e. how often a predicted 1 for Yn is correct) and not the global OOB error of the models, because my data is skewed towards y-values of 0. I am therefore more interested in the raw prediction counts, so that I can calculate error rates myself.

I will supply a CSV data sample of ~4000 observations (~50/50 training/testing split) with multiple binary Y's, multiple binary X's, and one continuous X (an integer ranging from 0 to ~30) for each observation. I can even supply the R code from Rattle for the procedure I am currently using.

I would like your R code to accept the following inputs from me:

-observations in the format: Observation #, Y1…Yn, X1…Xm
-random seed value
-number of trees value (default is 500)
-number of predictors to be randomly sampled (default is the integer part of the square root of the m total predictors)
-number of rows at the bottom of the data list to use as holdout data (to be scored each round)
-number of rounds (which will be ~1,000 – 1,000,000)

I would like your R code to supply the following output to me:

-CSV file with the full original data plus the aggregated OOB prediction totals (for both training and testing data) for each observation for each Y (i.e. the number of times the OOB prediction was 0 and the number of times the OOB prediction was 1, for each observation for each Y)

If you happen to be aware of an open source R GUI that will already do all of the above for me (and that I can understand and use), you can just help me install it and will not need to supply the R code. As long as it works for me, the project will be considered completed.
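To make the requested bookkeeping concrete, here is a minimal sketch of per-observation OOB vote counting and a PPV computed from those counts. It is written in Python with only the standard library so it runs anywhere. It is not the R deliverable and not Rattle's or Party's method. The names `fit_stump`, `oob_vote_counts`, and `ppv_from_counts` are hypothetical, the "tree" is a deliberately trivial stand-in (majority class per value of one random predictor), and taking the majority OOB vote as the prediction is an assumption about how the counts would be turned into a PPV. Holdout scoring and multiple Y's are left out.

```python
import random
from collections import Counter


def fit_stump(X, y, rows, rng):
    """Toy stand-in for a tree: majority class per value of one random predictor."""
    j = rng.randrange(len(X[0]))  # pick a single predictor at random
    by_value = {}
    for i in rows:
        by_value.setdefault(X[i][j], Counter())[y[i]] += 1
    overall = Counter(y[i] for i in rows).most_common(1)[0][0]

    def predict(x):
        seen = by_value.get(x[j])
        # Fall back to the in-bag majority if this value never appeared in-bag.
        return seen.most_common(1)[0][0] if seen else overall

    return predict


def oob_vote_counts(X, y, n_trees=500, seed=42):
    """Per observation, return [times predicted 0, times predicted 1],
    counting only models whose bootstrap sample did not contain it."""
    rng = random.Random(seed)
    n = len(y)
    counts = [[0, 0] for _ in range(n)]
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        model = fit_stump(X, y, in_bag, rng)
        in_bag_set = set(in_bag)
        for i in range(n):
            if i not in in_bag_set:  # observation i is out-of-bag for this model
                counts[i][model(X[i])] += 1
    return counts


def ppv_from_counts(counts, y):
    """PPV = TP / (TP + FP), predicting 1 when the OOB votes for 1 outnumber those for 0."""
    tp = fp = 0
    for (n0, n1), truth in zip(counts, y):
        if n1 > n0:
            if truth == 1:
                tp += 1
            else:
                fp += 1
    return tp / (tp + fp) if (tp + fp) else None


# Small synthetic example: two binary predictors, Y equal to the first one.
X = [[i % 2, (i // 2) % 2] for i in range(40)]
y = [row[0] for row in X]
counts = oob_vote_counts(X, y, n_trees=200, seed=1)
oob_share = sum(a + b for a, b in counts) / (len(y) * 200)
print(counts[0], round(oob_share, 3), ppv_from_counts(counts, y))
```

The printed OOB share should land near the "around 36%" figure quoted above, since each row is left out of a bootstrap sample with probability (1 - 1/n)^n. In R, the `randomForest` package stores comparable per-observation OOB votes in the `votes` component of a fitted classification forest, as raw counts when it is called with `norm.votes = FALSE`. Whether that fits the Rattle or Party workflow described here would need to be checked.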

Algorithm

Machine Learning (ML)

Mathematics

R Programming Language

Statistics

Project ID: 10800664

About this project

6 proposals

Remote project

Active 8 years ago

6 freelancers are bidding on this job at an average price of $236 USD

@Softeria

I have been a statistics tutor for the last 5 years and have expertise in statistical analysis. I can show you some of my previous analyses. I have a solid grasp of random variables, probability distributions, sampling, and various tests. I also took a course on Data Engineering and Artificial Intelligence. I know all data mining techniques (prediction and classification) and data analysis techniques. I have worked with K-means, ID3, Bayes' theorem, confusion matrices, the Hungarian algorithm, and so on. My research was on Rough Set Theory. The tools I use are SPSS, Excel, Minitab, Weka, and R for programming. Thank you for considering my proposal.

$300 USD in 5 days

4.9

(77 reviews)

6.1

@garimamiet

Dear Sir, I have more than 8 years of experience in R programming. I can provide the most suitable solution for your requirement. Looking forward to further discussion.

$222 USD in 1 day

5.0

(8 reviews)

4.1

@leplas

I am Herilalaina RASOLONJATOVO from Madagascar, and I am an expert in Data Analysis using R programming and I can help you do this project according to your specification. I have done several projects in this field in the past and locally in my country. I have gone through your requirements and I am ready to start work as soon as possible. Please respond to give a go ahead so work can start. Thank you in advance, Herilalaina RASOLONJATOVO.

$200 USD in 1 day

4.9

(2 reviews)

2.1

@pinetree202

I am an expert in data structures, algorithms, and so on. If you select me, you will be lucky. Please keep in touch.

$150 USD in 1 day

2.9

(3 reviews)

1.0

@atulsubhash1

Hi, I have extensive experience in R and I am the author of muRandomForest (a product for the leading analytics player Mu Sigma). I can complete this in the next 5 hours. Thanks, Atul

$250 USD in 3 days

0.0

(0 reviews)

0.0

@tayyabsaeed1

I have 3 years of experience in this area. Most of my previous work has been in healthcare, or more precisely in analyzing patient data to develop algorithms for automated disorder detection. I have developed many novel algorithms for detecting seizures, cancer, and other conditions. I am an expert in MATLAB and R programming only.

$222 USD in 3 days