Spelling Suggestions via the Bisect Module
April 13, 2015
Those who set out to implement their own search and retrieval systems soon learn that some features are tricky to build. I read “Typos in Search Queries at Khan Academy.”
The author states:
The idea is simple. Store a hash of each word in a sorted array and then do binary search on that array. The hashes are small and can be tightly packed in less than 2 MB. Binary search is fast and allows the spell checking algorithm to service any query.
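To make the quoted idea concrete, here is a minimal sketch in Python using the standard bisect module. It is not the Khan Academy code. The names word_hash, build_index, is_known, edits1 and suggest are hypothetical, and the choice of a 32-bit hash taken from MD5 and of single-edit candidates for suggestions are assumptions made for illustration. At 4 bytes per hash, a 2 MB array would hold roughly 500,000 entries.

```python
import bisect
import hashlib
from array import array


def word_hash(word: str) -> int:
    """Return a 32-bit hash of a word (assumption: first 4 bytes of MD5)."""
    digest = hashlib.md5(word.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_index(words) -> array:
    """Pack the sorted, de-duplicated hashes into a compact array.

    Type code "I" is an unsigned int, which is 4 bytes on common platforms.
    """
    return array("I", sorted({word_hash(w) for w in words}))


def is_known(index: array, word: str) -> bool:
    """Binary search for the word's hash in the sorted array."""
    h = word_hash(word)
    i = bisect.bisect_left(index, h)
    return i < len(index) and index[i] == h


def edits1(word: str) -> set:
    """All strings one delete, transpose, replace or insert away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def suggest(index: array, word: str) -> list:
    """Return the word itself if known, else known single-edit candidates."""
    if is_known(index, word):
        return [word]
    return sorted(c for c in edits1(word) if is_known(index, c))


if __name__ == "__main__":
    index = build_index(["search", "query", "typo", "spelling", "bisect"])
    print(suggest(index, "serch"))
```

Because only hashes are stored, two different strings can share a hash, so a lookup may occasionally accept a string that is not in the word list. A wider hash lowers that risk at the cost of a larger array.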
What is not included in the write-up is detail about the time required and the frustration experienced to implement what some senior managers assume is trivial. Yep, search is not too tough when the alleged “expert” has never implemented a system.
With education struggling to teach the three Rs, software that caulks the leaks in users’ ability to spell is a must-have.
Stephen E Arnold, April 13, 2015