Posts
Common Optimizations for Levenshtein Edit Distance
I have been studying algorithms to compute the Levenshtein edit distance between two strings, which is defined as the minimum number of “edits” required to transform one string into another string. This is useful for a variety of things, from DNA sequence alignment to spell correction to address validation. I am studying it to resolve misspelled and OCR’ed scientific names of organisms to their real names, a famous problem in the field.
read morePosts
Making a Pratt Parser Generator Part 1
A brief history of the Pratt parsing algorithm The history of programming language parsers is dominated by the thorny challenge of parsing expressions, mathematical expressions in particular, taking into account the precedence of operators in the expressions. Modern formal language theory began with the work of Noam Chomsky in the 1950s, in which Chomsky lays out a mathematical framework for linguistics. Under this mathematical framework, languages exist within a hierarchy of languages defined according to how difficult the language is to parse.
read more