Soundex, Metaphone and Miracode are phonetic algorithms for indexing words and phrases by their sound when pronounced in English. The basic aim is for words with the same pronunciation to be encoded to the same string, so that they can be matched despite differences in spelling.

They are necessarily complex algorithms with many rules and exceptions, because English spelling and pronunciation are complicated by historical changes in pronunciation and by words borrowed from many languages.

Soundex was developed by Robert C. Russell, originally for a US census, and patented in 1918 (US patent 1,261,167). The Soundex code for a name consists of a letter followed by three digits: the letter is the first letter of the name, and the digits encode the remaining consonants. Similar-sounding consonants share the same digit; for example, the labial consonants B, F, P and V are all encoded as 1. Vowels can affect the coding, but are never coded directly unless they appear at the start of the name.
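
The coding scheme described above can be sketched in Python. This is a minimal illustration, not a reference implementation. It assumes the consonant groups and the H/W handling of the commonly used "American Soundex" variant; the text above confirms only that B, F, P and V share the code 1. The function name soundex and the table _CODES are chosen for this example.

```python
# Consonant groups of the common "American Soundex" variant (an assumption;
# the article itself only states that B, F, P and V are encoded as 1).
_CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        _CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the first letter followed by three digits, e.g. 'R163'."""
    # Keep only the letters A-Z; everything else is ignored.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""

    first = letters[0]
    digits = []
    # Code of the most recent coded letter; the first letter counts too,
    # so 'Pfister' does not emit a 1 for the F.
    prev = _CODES.get(first, "")

    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code:
            # Adjacent consonants with the same code are coded once.
            if code != prev:
                digits.append(code)
            prev = code
        elif ch not in "HW":
            # A vowel (or Y) is not coded, but it separates repeated codes,
            # so the same digit may be emitted again after it.
            prev = ""
        # H and W are skipped without separating repeated codes.

    # Pad with zeros and truncate to one letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    for example in ["Robert", "Rupert", "Ashcraft", "Tymczak", "Pfister"]:
        print(example, soundex(example))
```

Under these assumptions, "Robert" and "Rupert" receive the same code, which is the behaviour the algorithm aims for. The vowel handling shows how vowels affect the coding without being coded themselves: in "Tymczak" the A separates Z from K, so the digit 2 appears twice.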

Although 'Soundex algorithm' refers to a specific algorithm, it is often used (incorrectly) when 'phonetic algorithm' is intended.

Metaphone was developed by Lawrence Philips as a response to deficiencies in the Soundex algorithm. Metaphone is available as a built-in function in a number of systems, including later versions of PHP.

See also:

  • Porter stemming algorithm
