
Applying text algorithm again, part1

$750-1500 USD

In progress
Posted over 7 years ago


Paid on delivery

Note: The first three payments/milestones below fall under “Applying text algorithm again, part1”, and the second three under “Applying text algorithm again, part2”.

The aim of this project is to improve a classification+clustering method developed in a previous project, based on some technical issues that were identified.

1. The team will send the freelancer a list of irrelevant words.
2. As agreed at the end of the previous project, the same overall method should be carried out, with the same tools. As before, we will try three different levels of tolerance for the algorithm. We will try five versions of algorithm+clustering to deal with the irrelevant-word problem:
   i. considering only words which occur in fewer than 10% of entries to create the algorithm and clustering
   ii. considering only words which occur in fewer than XX% of entries (another cut-off chosen after the results from iii.) to create the algorithm and clustering
   iii. ignoring words from the list of irrelevant words #1 to create the algorithm and clustering
   iv. ignoring words from the list of irrelevant words #2 to create the algorithm and clustering
   v. a version combining the use of the best list of irrelevant words and the best frequency cut-off
   (In all versions we use the tool to fix typos. For each version we test three tolerance levels.)
3. In each case, compute the silhouette score both on predicted codes and on clusters.
4. The team will assess the results using the list of irrelevant words #1 and, if necessary, modify the list so that the algorithm and clustering can be re-run (version ii. using list of irrelevant words #2).
5. Once the algorithm and clustering are finalized: assign a predicted code to each cluster by comparing the "mean cluster sentence" with all code descriptions (from initial learning dataset + additional codebook) and choosing the best-matching code description.
6. The project ends when the algorithm and the clustering perform satisfactorily. The team will then receive from the freelancer the code/tools allowing them to re-run the exact same algorithm and clustering in the future and adjust them if necessary.

Payment plan:
First: $278 for 2i, including algorithm results, clustering results, silhouette score, and predicted code assigned to each cluster
Second: Same, for 2ii
Third: Same, for 2iii
Fourth: Same, for 2iv
Fifth: Same, for 2v
Sixth: $280 for all code/tools necessary to re-run and adjust the algorithm, clustering, silhouette score calculation, and the assignment of codes to clusters
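The word-filtering, silhouette and code-assignment steps in the description could be sketched roughly as follows. The posting does not name the tools used in the previous project, so this is a standard-library-only illustration, not the team's method: all function names are hypothetical, the bag-of-words representation and cosine distance are assumptions, "mean cluster sentence" is read here as the mean word-count vector of a cluster, and the typo-fixing tool and tolerance levels are not modelled.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(entries, max_doc_fraction=0.10, irrelevant_words=()):
    """Keep words found in fewer than max_doc_fraction of the entries
    and not present in the irrelevant-word list."""
    doc_freq = Counter()
    for entry in entries:
        doc_freq.update(set(tokenize(entry)))  # count each word once per entry
    limit = max_doc_fraction * len(entries)
    blocked = {w.lower() for w in irrelevant_words}
    return sorted(w for w, n in doc_freq.items() if n < limit and w not in blocked)


def vectorize(text, vocabulary):
    """Bag-of-words count vector restricted to the vocabulary."""
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocabulary]


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def silhouette_score(vectors, labels):
    """Mean silhouette coefficient; needs at least two distinct labels.
    The labels may be cluster ids or predicted codes."""
    scores = []
    for i, (v, lab) in enumerate(zip(vectors, labels)):
        by_label = {}
        for j, (u, other) in enumerate(zip(vectors, labels)):
            if j != i:
                by_label.setdefault(other, []).append(cosine_distance(v, u))
        if lab not in by_label:  # singleton group scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(by_label[lab]) / len(by_label[lab])  # mean distance within own group
        b = min(sum(d) / len(d) for k, d in by_label.items() if k != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def assign_code(cluster_texts, code_descriptions, vocabulary):
    """Return the code whose description is closest to the cluster's mean vector."""
    vectors = [vectorize(t, vocabulary) for t in cluster_texts]
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    return min(
        code_descriptions,
        key=lambda code: cosine_distance(
            mean, vectorize(code_descriptions[code], vocabulary)
        ),
    )
```

Under these assumptions, versions i. and ii. would differ only in `max_doc_fraction`, versions iii. and iv. only in `irrelevant_words`, and version v. would pass both. Calling `silhouette_score` once with cluster ids and once with predicted codes as `labels` gives the two scores asked for in step 3.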
Project ID: 12583580

About the project

6 proposals
Remote project
Active 7 years ago

6 freelancers are bidding on average $1,227 USD for this job
Master of Science and professional statistician here to help.
$1,500 USD in 20 days
5.0 (45 reviews)
6.5
Hi, I am a very experienced statistician and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several companies and have done projects requiring high-level quantitative analysis and data interpretation skills to study trends and time behaviour and to compare variables in the data. I can do advanced analysis in SPSS, R, WEKA and Excel, including machine learning, hypothesis testing, forecasting, t-tests and ANOVA. Looking forward to discussion. Best regards, Suyash
$1,500 USD in 20 days
4.1 (50 reviews)
6.3
Note: I have previous experience in similar work. Hi, I'm basically an electronics engineer and an expert in Python. Surely I could help you. Come to chat for more discussion. Thank you.
$1,250 USD in 20 days
5.0 (5 reviews)
3.2
We can discuss. I can do it in R. Regards
$1,250 USD in 20 days
4.9 (3 reviews)
3.0
Data Analyst/Scientist with 6+ years of experience in R, SPSS, STATA, SAS and MINITAB. I have been doing descriptive and inferential statistics. Key techniques are: regression models, binary logistic models, factor analysis, cluster analysis, neural networks, and parametric and non-parametric tests. Good at data visualization using data mining techniques like CRT, QUEST, etc. Please refer to my clients' feedback (5-star rated). Kindly reach out to me for further discussion.
$1,111 USD in 15 days
0.0 (0 reviews)
0.0

About the client

New York, United States
5.0
4
Payment method verified
Member since August 24, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)