Choosing an OCR system

August 10, 2007

Hi everybody,

Following the comments on my last post about the state of OCR in Gnome, I feel the need to clarify where I stand on supporting proprietary software.

Manifesto: Gnome Scan is part of Gnome and thus part of GNU. Yes, Gnome Scan has GNU in its name, and that’s not for fashion. Gnome Scan’s goal is to provide a libre scanning infrastructure for the GNU OS on top of (rocking) Gnome technologies: Gtk+, GEGL, etc. Gnome Scan also uses non-GNU free software, such as SANE for accessing scanners and, now, OCRopus for OCR.

Some would say: « Why choose OCRopus? OCR-Shop or IRIS Toolkit rocks! »

Yes, free OCR engines are years behind proprietary software; right. However, using a proprietary solution won’t help them. Paying for an SDK in order to add value to proprietary software, without even receiving any income in return, is just crazy! It’s up to the respective companies to provide support for their software. Please don’t complain that I don’t use your proprietary software. I do accept that Gnome OCR must make room for every OCR engine, simply because none is perfect (especially the libre ones).

The comments about supporting different OCR engines are fair. Taking this feedback into account, I plan to build an API for Gnome OCR in a modular fashion, just as GtkPrint does for printing and Gnome Scan does for scanning. This is a change from my preliminary plan to provide this library in OCRopus itself. However, I’m pretty sure I will only support OCRopus myself. Just as Gnome Scan was hardcoded to SANE up to 0.4, Gnome Scan currently uses OCRopus and only OCRopus (i.e. hardcoded) for OCR. Even worse, AbiScan itself uses OCRopus directly. That’s an experimental solution; comments are welcome.
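
To give an idea of what "modular" could mean here, below is a minimal sketch in Python. Nothing in it is the real Gnome OCR API (which does not exist yet): `OcrEngine`, `OcrRegistry` and `FakeEngine` are names made up for this illustration. The point is only that the application talks to a registry of backends instead of hardcoding one engine.

```python
from abc import ABC, abstractmethod


class OcrEngine(ABC):
    """Interface each backend would implement (hypothetical, for illustration)."""

    name = "abstract"

    @abstractmethod
    def recognize(self, image_path):
        """Return the recognized document as a string."""


class OcrRegistry:
    """Holds the available backends so the application never hardcodes one."""

    def __init__(self):
        self._engines = {}

    def register(self, engine):
        self._engines[engine.name] = engine

    def names(self):
        return sorted(self._engines)

    def recognize(self, engine_name, image_path):
        try:
            engine = self._engines[engine_name]
        except KeyError:
            raise LookupError(f"no OCR backend named {engine_name!r}") from None
        return engine.recognize(image_path)


class FakeEngine(OcrEngine):
    """Stand-in backend used only to exercise the registry; it performs no OCR."""

    name = "fake"

    def recognize(self, image_path):
        return f"<html><body>text of {image_path}</body></html>"
```

With such a design, supporting another engine means writing one more `OcrEngine` subclass and registering it; the application code calling `registry.recognize(...)` does not change.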

Asking for a libre OCR API is very important. That’s one of the values of Gnome Scan. OCRopus and libre OCR engines need love. Don’t refuse them what they need 😉

I hope everyone understands my point of view and Gnome Scan’s goals, and that nobody is afraid to comment. Feedback makes me happy :).

Regards,
Étienne.

State of OCR in Gnome

August 7, 2007

Hi everybody,

My work on flegita-gimp does not mean I forgot OCR which is, IMHO, the first-class feature of scanning. Writing AbiScan cleared up my vision of how to design the Gnome OCR UI. Before writing AbiScan, I was wondering how to integrate OCR into Gnome Scan. I was really worried because Gnome Scan is designed to pass an image (as a GeglBuffer) to the application, not text or HTML or whatever the OCR output format is. So I decided to write AbiScan and use ocropus directly instead of going through Gnome Scan.

This led me to figure out how Gnome will get OCR and an OCR UI. AbiScan uses the ocropus command-line tool; the idea is to use a library providing a common OCR UI instead. This library should be shipped by OCRopus. Why OCRopus and not Gnome Scan? Because I think this library depends more on OCRopus than on Gnome Scan. I may provide an OCR sink in Gnome Scan to help plug Gnome Scan and OCRopus together, but that does not cover the whole UI and OCR interaction part, which should rely heavily on OCRopus itself, just like the OCRopus command-line tool.

Publishing AbiScan seems to have raised questions from users. At the risk of repeating the OCRopus website, let me explain a bit about OCRopus. OCRopus is not an OCR engine. OCRopus is a document analysis and OCR system. Instead of writing its own OCR engine, it uses existing ones, especially tesseract, but more are to come. The difference between OCRopus and an OCR engine is exactly the same as between HTML and plain text. HTML contains semantics, formatting and the text itself, while plain text contains only… text! So if you ever read a comparison between OCRopus and e.g. gocr or ocrad, you can laugh at it. Well, in fact, ocrad has a minimal layout analyser for text columns, but it is not as advanced as the OCRopus layout analyser.
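
The HTML versus plain text difference can be made concrete with a small Python sketch. The markup in `SAMPLE` is a hand-written example in the style of hOCR (class names such as `ocr_line`, bounding boxes in the `title` attribute); it is not actual OCRopus output, and the helper names are mine. The parser is deliberately simplistic: it assumes line spans are not nested and that `title` holds only a `bbox`.

```python
from html.parser import HTMLParser

# Hand-written sample in the hOCR style; not real OCRopus output.
SAMPLE = (
    '<div class="ocr_page">'
    '<p class="ocr_par">'
    '<span class="ocr_line" title="bbox 10 10 200 30">Hello world</span>'
    '<span class="ocr_line" title="bbox 10 40 200 60">second line</span>'
    '</p></div>'
)


def parse_bbox(title):
    """Turn 'bbox x0 y0 x1 y1' into a tuple of ints, or None if malformed."""
    parts = title.split()
    if len(parts) == 5 and parts[0] == "bbox":
        return tuple(int(value) for value in parts[1:])
    return None


class LineCollector(HTMLParser):
    """Collects the text and bounding box of every ocr_line span."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "span" and attributes.get("class") == "ocr_line":
            self._current = {"bbox": parse_bbox(attributes.get("title", "")), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.lines.append(self._current)
            self._current = None


def extract_lines(markup):
    """Structured view: each line keeps its position on the page."""
    parser = LineCollector()
    parser.feed(markup)
    parser.close()
    return parser.lines


def plain_text(markup):
    """Plain-text view: the layout information is thrown away."""
    return "\n".join(line["text"] for line in extract_lines(markup))
```

`extract_lines(SAMPLE)` keeps where each line sits on the page, which is what a word processor needs to rebuild the layout; `plain_text(SAMPLE)` is all a bare OCR engine would give you.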

Regards,
Étienne.

AbiScan Preview

August 6, 2007

Hi all,

After about one week of lazy effort, I managed to produce a preliminary version of AbiScan on top of OCRopus. I produced a screencast of direct OCR import into an Abiword frame. This is very buggy, but very exciting too :).

I must thank the #abiword people, especially Dominic Lachowicz, Marc Mauer, Martin Sevior, jean, sum1 and Hubert Figuière. Thanks also go to the OCRopus and Gegl people for their work and advice.

I provide an AbiScan patch against abiword-plugins SVN. The plugin does not work if abiword uses the G_MODULE_BIND_LAZY flag; this is a bug in abiscan, not in abiword. I also provide a patch against abiword SVN that removes the g_module_open flags, but hopefully it will never need to be merged.

If you want to try it, follow these steps:

  1. Install tesseract-ocr from SVN, with the patch I provide in the tesseract BTS;
  2. Install ocropus;
  3. Install Gegl SVN;
  4. Install Gnome Scan SVN;
  5. Install abiword SVN with the g-module-open-flags.diff patch;
  6. Install abiword-plugins SVN with the abiscan.diff patch;
  7. Launch Abiword;
  8. Choose Insert > Import from scanner and follow the steps.

Warning: it’s really buggy.

  • Gnome Scan does not handle the device list very well if you launch the dialog several times.
  • OCRopus does not provide any API, so the plugin uses system() and isn’t able to monitor progress. OCRopus might take a very long time.
  • Sometimes, it eats tons of memory.
  • Currently, it loses formatting, due to a bug in pasteFromBuffer() for HTML import. I had to choose between pasting into the existing document and losing formatting, or opening the temporary OCRopus HTML file directly.
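
About the system() limitation above: without an API, the best a caller can do is run the tool as a child process and at least keep the UI informed that it is still running. Here is a rough Python sketch of that idea (the plugin itself is not written in Python, and `run_with_heartbeat` is a name I made up). It does not report real progress, since the tool gives none, only elapsed time. The demo at the bottom uses the Python interpreter as a stand-in command rather than a real OCR tool.

```python
import subprocess
import sys
import time


def run_with_heartbeat(command, on_tick, poll_interval=0.5):
    """Run a command-line tool, calling on_tick(elapsed_seconds) while it runs.

    Returns the tool's standard output; raises RuntimeError on a non-zero exit.
    """
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    started = time.monotonic()
    while True:
        try:
            # communicate() keeps draining the pipes, so a chatty tool cannot
            # block on a full pipe while we wait for it.
            output, errors = process.communicate(timeout=poll_interval)
            break
        except subprocess.TimeoutExpired:
            on_tick(time.monotonic() - started)
    if process.returncode != 0:
        raise RuntimeError(
            f"command failed with status {process.returncode}: {errors.strip()}"
        )
    return output


if __name__ == "__main__":
    # Stand-in for a slow OCR run: sleeps one second, then prints some HTML.
    stand_in = [sys.executable, "-c", "import time; time.sleep(1); print('<html></html>')"]
    html = run_with_heartbeat(
        stand_in, lambda seconds: print(f"still running after {seconds:.1f}s")
    )
    print(html)
```

A real library API with progress callbacks would still be much better; this only makes the wait less opaque.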

Bug reports are very welcome; please file them against the gnome-scan product in Gnome Bugzilla, under the abiscan component. Note that OCRopus prefers 150 dpi images.

Anyway, this is a rough draft showing the key features provided by Gnome Scan and OCRopus: tight integration into the application and advanced OCR.


Regards,
Étienne


E Ultreïa !

Back from scouting

July 30, 2007

Hi all Gnome lovers,

I’m back from 17 days of scouting in nature. This was great. I published some photos of the camp at Faye in Nièvre. I came back last Friday and was exhausted.

I haven’t resumed Gnome Scan development yet. I’ll take the time to think about the future of Gnome Scan, especially OCR. Sadly, there has not been much work on OCRopus during the past 3 weeks. I wonder how to pass data to the application. OCRopus output is HTML with OCR tags. That’s useful but not very clean. I wonder how to integrate that into AbiWord.

So, my plan for the end of the summer is to implement OCR, rotation, and the Gimp and Abiword plugins.

Investigating OCR

July 6, 2007

Hi,

Since 0.5.1, I’ve been investigating OCR for Gnome through Gnome Scan. The most advanced software is OCRopus. OCRopus is to tesseract what HTML is to plain text. And in fact, OCRopus outputs HTML :). OCRopus is currently based on the famous tesseract OCR engine, but some hocr code is in the repo, and more is to come.

Just as with Gegl, I make sure that even the alpha software I use in gnome-scan is at least packaged, either in an official repository or by myself. Tesseract is packaged, but OCRopus requires tesseract SVN, which does not build and has a bug (its public headers include config.h). The build failure has a patch waiting for inclusion, and I provide a patch fixing the latter issue.

The main issue is that OCRopus has never published a release. It only has an SVN repo. OCRopus uses Jam, without any rule to generate a distribution package (like make distcheck from automake). After some research, I decided to add an automake build system to OCRopus. Tesseract has two build systems: one for developers, one for distributors. Automake is still the best solution for distributors.

I posted a fairly big but incomplete patch to the OCRopus mailing list, which received some attention. It seems that tesseract and OCRopus are very closely tied. Neither is very active. I hope that gnome-scan will tickle those projects’ developers and bring them users. OCRopus has only a command-line tool; it should provide a library. Tesseract provides 11 libraries; it should provide only one!

So, gnome-scan OCR is kind of blocked by upstream projects that need some love. Both projects need to modify their design a bit. I think I will provide a “protocropus” plugin for gnome-scan, which will be a simple bridge to the various command-line OCR tools or libraries (including OCRopus) without depending on OCRopus.
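
Here is roughly how I picture such a bridge, as a small Python sketch. Everything in it is an assumption for illustration: the tool names and their arguments in `COMMAND_TEMPLATES` are placeholders, not the command lines of any real OCR tool, and `build_command` is a made-up helper. The bridge only knows a command template per tool, so it depends on none of them.

```python
import shlex

# Placeholder templates: these tool names and options are invented for the
# example and do not describe any real OCR program.
COMMAND_TEMPLATES = {
    "example-layout-ocr": "example-layout-ocr --html {image}",
    "example-plain-ocr": "example-plain-ocr {image}",
}


def build_command(engine, image_path, templates=COMMAND_TEMPLATES):
    """Return the argument list to run the given engine on image_path."""
    try:
        template = templates[engine]
    except KeyError:
        raise LookupError(f"no command template for {engine!r}") from None
    # Split first, substitute afterwards, so that a path containing spaces
    # stays a single argument.
    return [part.format(image=image_path) for part in shlex.split(template)]
```

Adding a tool is then a matter of adding one template, e.g. `build_command("example-plain-ocr", "my scan.png")` yields an argument list ready to be handed to a process launcher.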

Étienne.


Hi all,

I’m not blogging a lot since I produce a weekly report on gnome-soc-list. Also, the motivation is not that high. I am continuing the new preview area system. It is quite nice. It uses GtkStyle colors and is extensible. It’s not easy to rewrite such software. On the one hand you get excitement and motivation from new features and huge improvements, but on the other hand it’s a bit of duplication, a waste of time, etc.

After the preview area, the big job is processing. I prepared that part of the software a bit, but wasn’t able to work out exactly what I wanted. I guess I will provide a basic processor for rotation and color correction; however, I don’t know how to make it really modular. Should I use multiple processors? Also, I wonder how to take hardware color correction into account, as well as calibration.

Anyway, I would like to write OCR as soon as possible. Writing a dumb plugin for gedit or abiword would be fun. I tested OCRopus. Well, tesseract SVN did not build, so ocropus couldn’t be built. Tesseract development does not seem active. I know some people are working on it, but not in the official SVN. Finally, I found a patch in their bug tracker which solved the problem. Tesseract and OCRopus definitely need some feedback. This is where gnome-scan comes in.

Even if OCRopus is far from ready for distribution, I am investigating it. Gnome Scan and Gegl are in a similar state. I hope this will help and motivate their developers. Some projects really need some love. Let’s give them what they need.

OCRopus will probably require an additional dialog with a preview of the result. I bet the new preview area will be suitable for this thanks to its modular design. The new preview area will also fit exotic needs like multiple areas without bloating the library.

Étienne.

Good news

November 28, 2006

Hi all,

After quite some time of hibernation, I have woken up the development process of Gnome Scan. Gnome Scan hit its 100th revision! This revision is the first contributed patch, from Olaf Leidinger. It adds a --disable-gnome option for xfce people and others. I’m also very pleased to see a new Swedish translation and an upcoming Dutch translation.

Gnome Scan seems to be spreading more and more. How surprised I was to see that Foresight Linux 0.9.4 includes Gnome Scan/Flegita! Gnome Scan 0.3.1 is now in feisty/universe. I even saw an unofficial ebuild. A bunch of people talk about Gnome Scan all around the web in French and German (some consider it a good replacement for xsane; I find Gnome Scan at too early a stage to fully replace xsane).

Emmanuel Fleury, associate professor at Bordeaux I University, proposed the Gnome-OCR project to his students. The project will start in January. The idea is to link tesseract-OCR and Gnome Scan. I hope this project will end up with an Abiword plugin. Even if I mainly work on flegita, the purpose of Gnome Scan is not to be a standalone app; flegita is just the first step. As long as Gnome Scan has no stable API, I don’t want to provide too many unmanageable plugins. I don’t want to end up with a 7.0.0 libtool version for gnome-scan 1.0 :).

I guess that 0.3.2 will arrive as a Christmas gift to the FOSS world :). It will include nice features that may make Gnome Scan a reasonable replacement for xsane for basic end users.

So yes, Gnome Scan is still alive! And very active!

Étienne.