Class StringDistance


  • public class StringDistance
    extends Object
    Author:
    pavlidis
    • Constructor Detail

      • StringDistance

        public StringDistance()
    • Method Detail

      • editDistance

        public static int editDistance​(String s,
                                       String t)
        The edit distance counts the differences between two strings, where we would count a difference not only when strings have different characters but also when one has a character whereas the other does not. The formal definition follows.

        For a string s, let s(i) stand for its ith character. For two characters a and b, define r(a, b) = 0 if a = b. Let r(a, b) = 1, otherwise.

        Assume we are given two strings s and t of length n and m, respectively. We are going to fill an (n+1) x (m+1) array d with integers such that the low right corner element d(n+1, m+1) will furnish the required values of the Levenshtein distance L(s, t).

        The definition of entries of d is recursive. First set d(i, 0) = i, i = 0, 1,..., n, and d(0, j) = j, j = 0, 1, ..., m. For other pairs i, j use (1) d(i, j) = min(d(i-1, j)+1, d(i, j-1)+1, d(i-1, j-1) + r(s(i), t(j)))

        That last step is also described as:

      • Set cell d[i,j] of the matrix equal to the minimum of:
        a. The cell immediately above plus 1: d[i-1,j] + 1.
        b. The cell immediately to the left plus 1: d[i,j-1] + 1.
        c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

        (Description partly cribbed from http://www.cut-the-knot.org/do_you_know/Strings.shtml and http://www.merriampark.com/ld.htm)

Parameters:
a -
b -
Returns: