Python Guru needed / mathematician for BK-trees and Levenshtein distance

Completed · Posted 4 years ago · Paid on delivery

We want to write a program that shows how one word can be changed into another step by step: starting from an initial word, we move through intermediate words until we reach a final word of our choice, in the minimum number of steps. At each step, consecutive words must differ by exactly one letter: we either delete a letter, insert a letter, or change a letter.

For example, say we want to go from the word spring to the word summer. The sequence of conversions will be: spring -> string -> sting -> ting -> tine -> time -> timer -> dimer -> dimmer -> simmer -> summer

To do this, we can treat all the words of a dictionary as the vertices of a graph. Two vertices (words) are linked if they differ by one letter, under the rule mentioned above. Our problem then reduces to finding the shortest path between two words in the graph. In fact, however, we do not have to build the whole graph to solve our problem. It suffices to have a way to identify, for any word, its neighbors, that is, the words that differ from it by one character.

Levenshtein distance

The first step in this approach is to have a way to measure how much two words differ, so we can decide whether two words differ by a single insertion, deletion, or letter change. The metric that gives us exactly the number of insertions, deletions, or letter changes required to convert one word into another is called the Levenshtein distance.
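As a rough illustration of the metric, here is a minimal sketch that follows the textbook recursive definition of the Levenshtein distance. The function name `lev_recursive` is made up for this example, and the code is meant only to show what is being counted; it takes exponential time and is not suitable for a whole dictionary.

```python
def lev_recursive(a: str, b: str) -> int:
    """Levenshtein distance taken directly from the recursive definition."""
    if not a:
        return len(b)  # insert every remaining letter of b
    if not b:
        return len(a)  # delete every remaining letter of a
    if a[0] == b[0]:
        # First letters match, so they cost nothing.
        return lev_recursive(a[1:], b[1:])
    return 1 + min(
        lev_recursive(a[1:], b),      # delete a[0]
        lev_recursive(a, b[1:]),      # insert b[0] in front of a
        lev_recursive(a[1:], b[1:]),  # change a[0] into b[0]
    )


print(lev_recursive("spring", "string"))  # 1
print(lev_recursive("food", "cook"))      # 2
```

Two words are neighbors in the sense used here exactly when this function returns 1.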

Although we can implement the Levenshtein distance calculation by following the recursive definition, this is not efficient, because the recursive calls compute over and over the distances between prefixes that have already been calculated. In practice, to calculate the Levenshtein distance we use the Wagner-Fischer algorithm. The idea is that we use a table to store the distances between all the prefixes of the first string and all the prefixes of the second string, and fill the table progressively.
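The table-filling idea can be sketched as follows. This is one possible way to write the Wagner-Fischer algorithm, assuming unit cost for every insertion, deletion, and change; the name `levenshtein` and the table name `d` are choices made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Wagner-Fischer: d[i][j] is the distance between a[:i] and b[:j]."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]

    # Distance from a prefix to the empty string is the prefix length.
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a[i-1]
                d[i][j - 1] + 1,         # insert b[j-1]
                d[i - 1][j - 1] + cost,  # keep or change the letter
            )
    return d[-1][-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("spring", "string"))   # 1
```

Each cell is computed once from three already-filled cells, so the running time is proportional to the product of the two word lengths.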

BK-trees

The second step in solving our problem is to find an efficient way to locate the neighbors of any word on the fly. Such a way is given by a data structure called the BK-tree (named after its inventors, Burkhard and Keller). A BK-tree is defined as follows: we select any element as the root.

Each node of the tree can have any number of children, say n, corresponding to n subtrees. The k-th subtree of a node contains the elements whose distance from the node is equal to k.

Let's see how such a tree is built, so that we better understand its form.

Let's start with the word food.

We insert the word good. The distance of good from food is 1, so good becomes a child of food, and under that branch we will put every word that is 1 away from food. To show this, we put the label 1 on the branch. Then we add the word cook. The distance of cook from food is 2, so cook becomes a child of food through another branch, where all the words 2 away from food will go. We put the label 2 on the new branch.

We insert the word fowl. The distance of fowl from food is 2, so we follow branch 2 of food to the cook node. There we create a new subtree: the distance of fowl from cook is 3, so fowl becomes a child of cook through a new branch labeled 3.

We insert the word spoon. The distance of spoon from food is 3, so spoon becomes a child of food through a new branch labeled 3, where all the words 3 away from food will go.

We insert the word fork. It has a distance of 2 from food, so we go to the cook node. Fork has a distance of 2 from cook, so it becomes a child of cook through a new branch labeled 2.

BK-trees are useful because they allow us to quickly find the words that have a Levenshtein distance of at most a limit r from a word that interests us.
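The construction walked through above, together with the bounded search, might look roughly like this. It is a sketch, not a reference implementation: the class name `BKTree`, its `add` and `search` methods, and the node layout (a tuple of a word and a dict from branch label to child) are all assumptions made for this example. The pruning in `search` relies on the triangle inequality, which the Levenshtein distance satisfies.

```python
def levenshtein(a, b):
    """Compact Wagner-Fischer that keeps only two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class BKTree:
    def __init__(self, distance):
        self.distance = distance
        self.root = None  # a node is (word, {branch_label: child_node})

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})  # new branch labeled d
                return
            node = child  # descend along the existing branch d

    def search(self, word, r):
        """Return every stored word within distance r of word."""
        if self.root is None:
            return []
        found, stack = [], [self.root]
        while stack:
            candidate, children = stack.pop()
            d = self.distance(word, candidate)
            if d <= r:
                found.append(candidate)
            # Only branches labeled between d - r and d + r can hold matches.
            for label, child in children.items():
                if d - r <= label <= d + r:
                    stack.append(child)
        return found


tree = BKTree(levenshtein)
for w in ["food", "good", "cook", "fowl", "spoon", "fork"]:
    tree.add(w)

print(sorted(tree.search("fool", 1)))  # ['food', 'fowl']
```

Calling `search(word, 1)` and discarding the word itself gives the neighbors of a word without comparing it against the whole dictionary.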

The third step in the approach is to search for the path in the graph. This can be done with a modification of Dijkstra's algorithm. There's a tiny bit more. Any help would be appreciated.
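A sketch of the path search, under a few assumptions: every edge has weight 1, neighbors are generated on demand rather than from a prebuilt graph, and the hypothetical helper `neighbors` simply scans a small word list (in a full solution a BK-tree query with limit 1 would take its place). The name `shortest_ladder` and the toy word list are also invented for this example.

```python
import heapq


def levenshtein(a, b):
    """Compact Wagner-Fischer that keeps only two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def neighbors(word, words):
    # Stand-in for a BK-tree lookup: brute-force scan of the word list.
    return [w for w in words if levenshtein(word, w) == 1]


def shortest_ladder(start, goal, words):
    """Dijkstra with unit edge weights; returns the list of words or None."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, word = heapq.heappop(heap)
        if word == goal:
            path = [word]
            while word in prev:  # walk back to the start
                word = prev[word]
                path.append(word)
            return path[::-1]
        if d > dist[word]:
            continue  # stale heap entry
        for nxt in neighbors(word, words):
            nd = d + 1
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = word
                heapq.heappush(heap, (nd, nxt))
    return None  # goal is not reachable


words = ["spring", "sprint", "string", "sting", "sing", "ting", "food"]
print(shortest_ladder("spring", "ting", words))
# ['spring', 'string', 'sting', 'ting']
```

Because every step costs the same, a plain breadth-first search would find the same shortest paths; the priority queue is kept here only to stay close to the Dijkstra formulation mentioned in the description.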

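Machine Learning (ML) · Mathematics · Matlab and Mathematica · Python

Project ID: #19337511

About the project

3 proposals · Remote project · Active 4 years ago

Awarded to: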

hemsingh1

Machine Learning, Mathematics, Matlab and Mathematica, Python skilled professional with the required expertise. Can help to deliver the solution as needed on the expected lines. Please private message me to discuss the ...

€23 EUR in 1 day
(3 reviews)
2.4

3 freelancers are bidding on this job, at an average of €99/hour

VirtualBrainInc

Hello! I have briefly read the description on /python-guru-needed-mathematician development project, and I can deliver as per the requirements however I need us to discuss for more clarity on the details, deadl ...

€250 EUR in 1 day
(7 reviews)
4.2
ramakrishnan9294

Hi, I can complete this project in around 2-3 days due to my (technical) background and I've a small team as well in order to deliver projects efficiently and with utmost quality.

€24 EUR in 3 days
(0 reviews)
0.0