So the other day I was trying to work out how to translate pictures of food menus into text without typing them in. Don’t ask why… This led me down the dark path of trying to find an optical character recognition (OCR) library that I could quickly use.. something that I thought was a fairly new feature to appear in the ‘app age’.
Well, it turns out that all of these apps, such as Google Translate, Barclays Pingit and others, can be traced back to the same library, called Tesseract, which was originally written in 1984.
What is nuts is the sheer universality of Tesseract. Just about everything which claims to have text recognition as a feature is backed by it. At one point, I was expecting that some super powerful whizz kids with supercomputers sat inside of Google or Microsoft had come up with a clever artificial intelligence, symbolic new kinds of science and evolved automata pattern recognition. Nope!
It blows my mind that some code written over a quarter of a century ago, which has most likely outlasted the original author, is now being billed as a major feature.
Anyway, here is the Node.js module and the original Tesseract library.
Tesseract was originally developed in a garden shed in Bristol and then adopted by Hewlett-Packard Laboratories between 1985 and 1994, with some more changes made in 1996 to port it to Windows, and some C++izing in 1998.
In 2005 Tesseract was open sourced by HP. Since 2006 it has been developed by Google. You can see the GitHub code base updates since then at https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes
So next time you are taking a picture of your credit card or trying to translate something into Welsh, remember you are using something from the Cold War era that predates wifi.. and even the World Wide Web!