An algorithm specifies how to quickly identify names that approximately match any specified name when searching a list or database of geographic names. Based on comparisons of the digraphs (ordered letter pairs) contained in geographic names, this algorithmic technique identifies approximately matching names by applying an artificial but useful measure of name similarity. A digraph index enables computer name searches that are carried out using this technique to be fast enough for deployment in a Web application. This technique, which is a member of the class of n-gram algorithms, is related to, but distinct from, the soundex, PHONIX, and metaphone phonetic algorithms. Despite this technique's tendency to return some counterintuitive approximate matches, it is an effective aid for fast, inclusive searches for geographic names when the exact name sought, or its correct spelling, is unknown.
Additional Publication Details
USGS Numbered Series
Fast, Inclusive Searches for Geographic Names Using Digraphs