com.swabunga.spell.engine
Class DoubleMeta
java.lang.Object
com.swabunga.spell.engine.DoubleMeta
- All Implemented Interfaces:
- Transformator
- public class DoubleMeta
- extends java.lang.Object
- implements Transformator
A phonetic encoding algorithm that takes an English word and computes a phonetic version of it. This
allows for phonetic matches in a spell checker. This class is a port of the C++ DoubleMetaphone() class,
which was intended to return two possible phonetic translations for certain words, although the Java version
only seems to be concerned with one, making the "double" part erroneous.
The source code for the original C++ can be found
here: http://aspell.sourceforge.net/metaphone/
DoubleMetaphone first normalizes the input string, for example by uppercasing it. Then, to
create the key, the function traverses the input string in a while loop, sending successive characters into a large
switch statement. Before determining the appropriate pronunciation, the algorithm considers the context
surrounding each character within the input string.
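The traversal described above can be illustrated with a minimal sketch. The class name ToyPhoneticSketch, the encode method, and the handful of rules are hypothetical and chosen only to show the shape of the loop (normalize, walk the string, switch on each character, peek at the neighbour). They are not the rule table used by DoubleMeta.

```java
import java.util.Locale;

// Hypothetical toy encoder: shows the normalize / while / switch structure only.
final class ToyPhoneticSketch {

    static String encode(String word) {
        // Normalize the input first, as the description above says.
        String in = word.toUpperCase(Locale.ROOT);
        StringBuilder key = new StringBuilder();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            // Context: look one character ahead, if there is one.
            char next = (i + 1 < in.length()) ? in.charAt(i + 1) : '\0';
            switch (c) {
                case 'A': case 'E': case 'I': case 'O': case 'U':
                    // Toy rule: a vowel only contributes at the start of the word.
                    if (i == 0) {
                        key.append('A');
                    }
                    i++;
                    break;
                case 'P':
                    // Toy rule: "PH" sounds like F, a lone P stays P.
                    if (next == 'H') {
                        key.append('F');
                        i += 2;
                    } else {
                        key.append('P');
                        i++;
                    }
                    break;
                case 'V':
                    // Toy rule: V and F share one code.
                    key.append('F');
                    i++;
                    break;
                default:
                    if (Character.isLetter(c)) {
                        key.append(c);
                        // Toy rule: a doubled letter is consumed as one.
                        i += (next == c) ? 2 : 1;
                    } else {
                        i++; // ignore anything that is not a letter
                    }
            }
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("phone")); // FN
        System.out.println(encode("fone"));  // FN
    }
}
```

Under these toy rules a word and a plausible misspelling of it collapse to the same key, which is the property a spell checker relies on when it looks for phonetic matches.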
Things that were changed:
- The alternate flag could be set to true but was never checked, so it was removed.
- Why was this class serializable?
- The primary, in, length and last variables can be initialized locally in the
process method, with references passed around to the appropriate methods. As such, there are
no class variables, so this class is thread-safe and could also be static final.
- The SlavoGermanic function was called repeatedly in the process function; it is now called only once.
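The last two changes can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: StatelessEncoderSketch, looksSlavoGermanic and the toy rules are hypothetical, and the substrings tested in the check are assumed for the example. The point is only that all working state is local and that the word-level check is evaluated once and then reused.

```java
import java.util.Locale;

// Hypothetical sketch: no instance fields, so concurrent calls cannot interfere.
final class StatelessEncoderSketch {

    // Stand-in for the Slavo-Germanic check. The substrings tested here are
    // an assumption for illustration, not taken from this page.
    static boolean looksSlavoGermanic(String in) {
        return in.contains("W") || in.contains("K") || in.contains("CZ");
    }

    static String process(String word) {
        // Working state lives in locals instead of class variables.
        final String in = word.toUpperCase(Locale.ROOT);
        final int length = in.length();
        final boolean slavoGermanic = looksSlavoGermanic(in); // evaluated once per word
        final StringBuilder primary = new StringBuilder();

        for (int i = 0; i < length; i++) {
            char c = in.charAt(i);
            if ("AEIOU".indexOf(c) >= 0) {
                continue; // toy rule: drop vowels
            }
            if (c == 'J') {
                // Toy rule: reuse the cached flag instead of re-scanning the word.
                primary.append(slavoGermanic ? 'J' : 'H');
            } else {
                primary.append(c);
            }
        }
        return primary.toString();
    }

    public static void main(String[] args) {
        System.out.println(process("Jan"));   // HN
        System.out.println(process("Janek")); // JNK
    }
}
```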
Method Summary
char[] | getReplaceList() | gets the list of characters that should be swapped in to the misspelled word in order to try to find more suggestions.
java.lang.String | transform(java.lang.String word) | Take the given word, and return the best phonetic hash for it.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DoubleMeta
public DoubleMeta()
transform
public final java.lang.String transform(java.lang.String word)
- Take the given word, and return the best phonetic hash for it.
Vowels are minimized as much as possible, and consonants
that have similar sounds are converted to the same consonant;
for example, 'v' and 'f' are both converted to 'f'.
- Specified by:
transform
in interface Transformator
getReplaceList
public char[] getReplaceList()
- Description copied from interface:
Transformator
- Gets the list of characters that should be swapped in to the misspelled word
in order to try to find more suggestions.
In general, this list represents all of the unique phonetic characters
for this Transformator.
The replace list is used in the getSuggestions method.
Each letter in the misspelled word is replaced in turn with each character from
this list to try to generate more suggestions, which implies l*n tries,
if l is the size of the string, and n is the size of this list.
In addition to that, each of these characters is also added to the misspelled word.
- Specified by:
getReplaceList
in interface Transformator
- Returns:
- char[] of characters that misspelled words should try as replacements to get more suggestions
- See Also:
Transformator.getReplaceList()
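The l*n figure in the description of getReplaceList can be made concrete with a small sketch. The class ReplaceListSketch and its two methods are hypothetical and are not the getSuggestions implementation. In particular, inserting a character at every position of the word is an assumption, since the description only says that each character is added to the word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a replace list can generate candidate spellings.
final class ReplaceListSketch {

    // Replace each of the l letters with each of the n characters: l*n candidates.
    static List<String> substitutions(String word, char[] replaceList) {
        List<String> out = new ArrayList<>();
        char[] chars = word.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            char original = chars[i];
            for (char r : replaceList) {
                chars[i] = r;
                out.add(new String(chars));
            }
            chars[i] = original; // restore before moving to the next position
        }
        return out;
    }

    // Assumed reading of "added to the word": insert at every position,
    // including both ends, which gives (l+1)*n candidates.
    static List<String> insertions(String word, char[] replaceList) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i <= word.length(); i++) {
            for (char r : replaceList) {
                out.add(word.substring(0, i) + r + word.substring(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        char[] list = {'A', 'F', 'K'};
        System.out.println(substitutions("cat", list).size()); // 9
        System.out.println(insertions("cat", list).size());    // 12
    }
}
```

With a three-letter word and a three-character list this yields 9 substitution candidates, matching l*n. The work grows with both the word length and the list size, so a short replace list keeps the search manageable.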