Choosing an OCR system

August 10, 2007

Hi everybody,

Following the comments on my last post about the state of OCR in Gnome, I feel the need to clarify where I stand on supporting proprietary software.

Manifesto: Gnome Scan is part of Gnome and thus part of GNU. Yes, Gnome Scan has GNU in its name, and that’s not for fashion. Gnome Scan’s goal is to provide a libre scanning infrastructure for the GNU OS on top of (rocking) Gnome technologies: Gtk+, GEGL, etc. Gnome Scan also uses non-GNU free software, such as SANE for accessing scanners and now OCRopus for OCR.

Some will say: « Why choose OCRopus? OCR-Shop or IRIS Toolkit rocks! »

Yes, free OCR engines are years behind proprietary software; that’s true. However, using a proprietary solution won’t help them catch up. Paying for an SDK in order to add value to proprietary software, without even receiving any income in return, is just crazy! It’s up to the respective companies to provide support for their software. Please don’t complain that I don’t use your proprietary software. I fully accept that Gnome OCR must make room for every OCR engine, simply because none of them is perfect (especially the libre ones).

The comments about supporting different OCR engines are fair. Taking this feedback into account, I plan to build an API for Gnome OCR in a modular fashion, just as GtkPrint does for printing and Gnome Scan does for scanning. This is a change from my preliminary plan to provide this library in OCRopus itself. However, I’m pretty sure I will only support OCRopus myself. For now, just like SANE in Gnome Scan up to 0.4, Gnome Scan uses OCRopus and only OCRopus (i.e. hardcoded) for OCR. Even worse, AbiScan itself uses OCRopus directly. This is an experimental solution; comments are welcome.
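
To make the modular idea a bit more concrete, here is a rough sketch in Python of what "one API, many engines" could look like. This is only an illustration and not the Gnome OCR API: the names OcrBackend, OcrRegistry and DummyBackend are all hypothetical, and the dummy engine merely stands in for a real one such as OCRopus.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class OcrBackend(ABC):
    """Hypothetical interface that every OCR engine plugin would implement."""

    name: str

    @abstractmethod
    def recognize(self, image: bytes) -> str:
        """Return the text recognized in the given image data."""


class OcrRegistry:
    """Tracks the available backends so the application never hardcodes one."""

    def __init__(self) -> None:
        self._backends: Dict[str, OcrBackend] = {}

    def register(self, backend: OcrBackend) -> None:
        # Backends are looked up by name, so two engines can coexist.
        self._backends[backend.name] = backend

    def names(self) -> List[str]:
        return sorted(self._backends)

    def recognize(self, image: bytes, engine: str) -> str:
        try:
            backend = self._backends[engine]
        except KeyError:
            raise LookupError(f"no OCR backend named {engine!r}") from None
        return backend.recognize(image)


class DummyBackend(OcrBackend):
    """Stand-in engine (an assumption for illustration); a real backend
    would hand the image to an actual OCR engine here."""

    name = "dummy"

    def recognize(self, image: bytes) -> str:
        return f"<{len(image)} bytes of image data>"


registry = OcrRegistry()
registry.register(DummyBackend())
print(registry.recognize(b"\x00\x01\x02", engine="dummy"))
```

With a layout like this, the application only talks to the registry, and anyone who wants another engine supported can write and maintain their own backend without the core caring which engine does the work.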

Asking for a libre OCR API is very important; that is one of Gnome Scan’s values. OCRopus and the other libre OCR engines need love. Don’t refuse them what they need 😉

I hope everyone understands my point of view and Gnome Scan’s goals, and feels free to comment. Feedback makes me happy :).

Regards,
Étienne.