Investigating OCR

July 6, 2007

Hi,

Since 0.5.1, I have been investigating OCR for Gnome through Gnome Scan. The most advanced software is OCRopus. OCRopus is to Tesseract what HTML is to plain text. And in fact, OCRopus outputs HTML :). OCRopus is currently based on the famous Tesseract OCR engine, but some hOCR code is in the repo, and more is to come.

Just like with Gegl, I make sure that even the alpha software I use in gnome-scan is at least packaged, either in an official repository or by myself. Tesseract is packaged, but OCRopus requires Tesseract SVN, which does not build and has a bug (public headers include config.h). The build failure has a patch waiting for inclusion, and I provided a patch fixing the latter issue.

The main issue is that OCRopus has never published a release. It only has an SVN repo. OCRopus uses Jam, without any rule to generate a distribution package (like make distcheck from automake). After some research, I decided to add an automake build system to OCRopus. Tesseract has two build systems: one for developers, one for distributors. Automake is still the best solution for distributors.

I posted a quite big but incomplete patch to the OCRopus mailing list, which received some attention. It seems that the Tesseract and OCRopus projects are very close to each other. Neither is very active. I hope that gnome-scan will stimulate the developers of those projects and bring them users. OCRopus only has a command line tool; it should provide a library. Tesseract provides 11 libraries; it should provide only one!

So, gnome-scan OCR is kind of blocked by upstream projects that need some love. Both projects need to modify their design a bit. I think I will provide a “protocropus” plugin for gnome-scan, which will be a simple bridge between the various command line OCR tools or libraries (including OCRopus), without depending on OCRopus.
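
To make the bridge idea more concrete, here is a minimal Python sketch. It is not gnome-scan code: OcrTool, OcrBridge and the "{image}" placeholder are hypothetical names, and the sketch assumes that each tool prints the recognized text on standard output, which would have to be checked for every real tool. The Python interpreter is used as a stand-in for an OCR executable so that the example runs anywhere.

```python
import shutil
import subprocess
import sys
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OcrTool:
    """A command-line OCR tool (hypothetical description format).

    argv is the command to run; "{image}" is replaced by the image path.
    Assumption: the tool prints the recognized text on stdout.
    """
    name: str
    argv: List[str]

    def available(self) -> bool:
        # Usable only if the executable can be found.
        return shutil.which(self.argv[0]) is not None

    def recognize(self, image_path: str) -> str:
        cmd = [arg.replace("{image}", image_path) for arg in self.argv]
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return done.stdout


class OcrBridge:
    """Registry of tools, so the caller never depends on one OCR engine."""

    def __init__(self) -> None:
        self._tools: Dict[str, OcrTool] = {}

    def register(self, tool: OcrTool) -> None:
        self._tools[tool.name] = tool

    def available_tools(self) -> List[str]:
        return sorted(n for n, t in self._tools.items() if t.available())

    def recognize(self, name: str, image_path: str) -> str:
        return self._tools[name].recognize(image_path)


# Stand-in "OCR tool": the Python interpreter echoing the file name.
# A real entry would name an actual OCR executable and its arguments.
bridge = OcrBridge()
bridge.register(OcrTool("fake", [sys.executable, "-c",
                                 "import sys; print('text of ' + sys.argv[1])",
                                 "{image}"]))
bridge.register(OcrTool("missing", ["no-such-ocr-tool-xyz", "{image}"]))

print(bridge.available_tools())
print(bridge.recognize("fake", "page.png"))
```

The point of the registry is that a missing tool simply does not show up in the list, so the plugin keeps working with whatever is installed.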

Étienne.

Picasa for Linux

June 28, 2007

Hi,

Most of you have read the latest news about Picasa for Linux. Apart from it being just Wine-linked software, thanks to Google for providing regular repositories for tons of distros. It is amazing to see such a company taking FOSS into account, trying to understand the way it works instead of trying to use Windows® methods on Libre systems.

Anyway, the key point about Picasa is its scan feature. You may have found it or not. I guess they use an internal SANE/TWAIN bridge. I have to admit the scanner support is really rough and buggy (or quick and dirty, if you prefer).

Device options are dumped with one tab per option group. There is no preview area. The overall UI is messy, inconsistent and unfriendly. I’m not criticizing this because it’s proprietary; I bet even the Google people behind it think the same.

However, it is quite obvious why they came to such a result. After writing two SANE frontends (counting the entire gnome-scan rewrite for the 0.6 release), I can say that SANE demands huge work from frontend developers, for various reasons:

  • options are very primitive and not directly suitable for UI (see screenshot: four options for selecting the scan area). However, these options are still declared with “high level” metadata such as title, description, etc., suitable for building a UI. Options are even grouped.
  • backends are very, very inconsistent. See my various posts on the sane-devel list. SANE 1.X lacks tons of recommendations. It is normal that SANE 1.0 did not provide all the recommendations for today’s scanners. However, no update has been made since the publication of the SANE 1.0 standard. That’s the lame part. Since then, all backends implement options with a lot of originality, making the frontend developer’s life hell.

I would add another problem: poor integration into the host OS. SANE claims to be very portable, and it actually is. However, portability does not mean integrating everywhere with only generic code reduced to the least common denominator. It should mean using each OS’s native models and structures, all of this abstracted behind internal models and structures. That’s true for the abstraction layer, device probe, network, etc.

I really like SANE. It’s lightweight. It’s extensible. It’s well designed. Yes, it’s well designed! However, its implementation lacks consistency and real portability.

Étienne.


Hi all,

I’m not blogging a lot since I produce a weekly report on gnome-soc-list. Also, the motivation is not that high. I am continuing the new preview area system. It is quite nice. It uses GtkStyle colors and is extensible. It is not easy to rewrite such software. On the one hand you get excitement and motivation from new features and huge improvements, but on the other hand, it’s a bit of duplication, waste of time, etc.

After the preview area, the big job is processing. I prepared that part of the software a bit, but wasn’t able to know exactly what I wanted. I guess I will provide a basic processor for rotation and color correction; however, I don’t know how to make it really modular. Should I use multiple processors? Also, I wonder how to take hardware color correction into account, as well as calibration.

Anyway, I would like to write OCR as soon as possible. Writing a dumb plugin for gedit or Abiword would be fun. I tested OCRopus. Well, Tesseract SVN did not build, so OCRopus can’t be built. Tesseract development does not seem active. I know some people are working on it, but not in the official SVN. Finally, I found a patch in their bug tracker which solved the problem. Tesseract and OCRopus definitely need some feedback. This is where gnome-scan comes in.

Even if OCRopus is still far from ready for distribution, I am investigating it. Gnome Scan and Gegl are in a similar state. I hope this will help and motivate their developers. Some projects really need some love. Let’s give them what they need.

OCRopus should require an additional dialog with a preview of the result. I bet the new preview area will be suitable for this thanks to its modular design. The new preview area will also fit exotic needs like multiple areas without bloating the library.

Étienne.

Hi all of you !

It’s time for a release after the reset of the project. Gnome Scan 0.5.1 has almost all the features of 0.4 and tons of improvements, both user-visible and internal. Highlights of this release are the plugin system, dynamic UI building, Gegl-based processing, multi-threaded programming, file acquisition and more. Regressions from 0.4 are: no rotation, no Gimp plugin (please notify me if you find others). Read the NEWS and ChangeLog for more information.

I am producing this release in order to get feedback. In particular, I would like feedback on SANE support, UI design, responsiveness, as well as the API. If anyone feels like writing the Gimp plugin, that would be a great test for the new design.

Update: I fixed packaging, thanks to Karel Demeyer and Anonymous #1.

Cheers !
Étienne.

Hi,

Today I release version 0.4.1 of gnome-scan. This is a bugfix release. It handles resolution enums and drops dual resolution handling. Thanks to Jean-François Fortin for reporting the bug and Philipp Sadleder for the patch.

Gnome Scan 0.4 is far too hardcoded. It’s incredible to see how gnome-scan 0.5 changes this. I really don’t want to maintain the 0.4 branch; it’s a waste of time. I won’t maintain any past branch with a 0.X version number.

You can test gnome-scan 0.5 SVN by using the Ubuntu packages I provide at

deb http://bersace03.free.fr/ubuntu feisty universe

The repository contains gnome-scan SVN and all the dependencies (babl and gegl SVN). Deb sources are available.

Currently, gnome-scan 0.5 implements device and sink configuration. Single and multiple acquisition work (the SANE backend does not support multiple acquisition yet). There is no preview area yet, nor rotation or any processing. Also, beware that flegita overwrites existing files. Anyway, with gnome-scan 0.5, you can get a taste of all the new features and improvements. I let you discover them ;).

Regards,
Étienne

Hi,

Some of you may have seen the changes in the past days. The Gnome Scan project continues its migration from Gna! hosting to Gnome hosting. The first step was the wiki, which has always been on Gnome Live; then the bug tracking system, with the Gnome Bugzilla; then the SVN repos. And now, the web site. I took the opportunity to update the webpage. I dropped the documentation, added some screenshots and some propaganda.

I made a shiny transition page at Gna! which will remain as long as search engines show it on the first result page. I could have just pointed the GNOME Scan link on the Gnome projects page to the old page at Gna! However, the Gna! project page uses “gnomescan”, not “gnome-scan”, and the Gna! homepage URL begins with “home.gna.org”, which is a bit ugly (gnomescan.gna.org would have been better).

The new webpage is still translated into French. The English version has received corrections from Ori. Many thanks to him. Other translations are incomplete or nonexistent. If you want to translate, contact me: bersace at gnome.org. (Please don’t use comments to contact me; that’s useless, since I can only answer you through comments.)

Don’t hesitate to update your links to point to the new Gnome Scan homepage address !

The next steps are the blog and the download area. For the first, the simple trick is to add the blog to the planet. However, I have my own blog where I may gather all my posts. The second is more difficult than getting Gnome SVN access. For now, I will send my tarballs to the release maintainer (aka Vincent Untz).

Apart from the migration, what’s up on the gnome-scan front? … Many things!!! The Gnome Scan rework is very deep. I polished the Gnome Scan-ng specs, I even produced a class diagram!!! Some key points of the new design:

  • Backend has four stages: configure, acquire, process, sink
  • Frontend has two stages (two dialogs): configure, acquire
  • Plugins for scanner and sink.
  • Huge reuse of existing work: GLib, Gtk+, Gegl, etc. The entire library contains 10 classes (excluding plugins, which just reimplement a base class) for far more functionality and extensibility. Gnome Scan 0.4 used 12 libraries for less functionality and was all static.
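
The four backend stages and the two plugin kinds can be pictured with a small Python sketch. This is only an illustration of the design points above, not the real Gnome Scan API (which is C and GObject based): Scanner, Sink, run_job and the toy list-of-lists image are all hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Callable, List

Image = List[List[int]]  # toy image: rows of pixel values


class Scanner(ABC):
    """Scanner plugin: covers the configure and acquire stages."""

    @abstractmethod
    def configure(self, options: dict) -> None: ...

    @abstractmethod
    def acquire(self) -> Image: ...


class Sink(ABC):
    """Sink plugin: receives the processed image (file, application, ...)."""

    @abstractmethod
    def write(self, image: Image) -> None: ...


def run_job(scanner: Scanner, processors: List[Callable[[Image], Image]],
            sink: Sink, options: dict) -> None:
    scanner.configure(options)      # 1. configure
    image = scanner.acquire()       # 2. acquire
    for process in processors:      # 3. process
        image = process(image)
    sink.write(image)               # 4. sink


class FakeScanner(Scanner):
    """Stand-in plugin returning a fixed 2x2 image."""

    def configure(self, options: dict) -> None:
        self.options = dict(options)

    def acquire(self) -> Image:
        return [[0, 1], [2, 3]]


class MemorySink(Sink):
    """Stand-in plugin keeping the images in memory."""

    def __init__(self) -> None:
        self.images: List[Image] = []

    def write(self, image: Image) -> None:
        self.images.append(image)


def rotate_cw(image: Image) -> Image:
    # Rotate 90 degrees clockwise: reverse the rows, then transpose.
    return [list(row) for row in zip(*image[::-1])]


sink = MemorySink()
run_job(FakeScanner(), [rotate_cw], sink, {"resolution": 300})
print(sink.images)
```

Since plugins only reimplement a base class, adding a new scanner or sink does not touch the pipeline itself.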

I won’t expand on new features; this will be the subject of another post. You can check out the latest development version at http://svn.gnome.org/svn/gnome-scan/trunk . You’ll need the babl and gegl libraries. I debianized these libraries and uploaded them to REVU. Expect them to be in universe at least in feisty+1. Note that I published them to my packages directory.

Étienne.

Hi all,

Some of you may be disappointed by this, but I’m not. Gnome Scan won’t be added to Gnome 2.18. Vincent Untz’s strong objection was the non-Gnome hosting (moving to Gnome hosting has been my primary goal since the Gnome SVN migration). The real objection is that Gnome Scan is far too immature for release. I expect huge and deep changes (API breaks, features, etc.). Following the Gnome schedule is too restrictive for Gnome Scan, which is still at an early stage. So keeping Gnome Scan out of the Gnome schedule allows more active development.

See the Gnome Scan inclusion proposal at desktop-devel-list.

Happy new year !

Arabic translation

January 8, 2007

Hi all,

Thanks to Djihed Afifi, gnome-scan now has an Arabic translation! I produced a 0.4.0.2 release for that. Note that I prefer receiving .po files by mail instead of waiting for translation team acknowledgment in Rosetta …

Étienne.

Hi all !!

Update: emergency 0.4.0.1 release: fixes a build failure & includes the latest Swedish translation

For Christmas, I expected a 0.3.2 release, but the enhancements and API breakage are so significant that I decided to make a major release. I codenamed this release « Is your app people-ready ? » as a wink at the Microsoft advertisements of the past months: « Are you people ready ? ». I really hate these ads, which don’t expose product quality, but “feelings” and “concepts”. That’s manipulation. This release of Gnome Scan adds nice smart behaviour that makes it very suitable for daily use, both for basic and advanced users.

Highlights

  • The scan dialog has been entirely reviewed and is now very consistent with the Gtk print dialog. The new dialog implements a mechanism to include extra widgets inside the dialog instead of building an entire dialog from provided widgets. This adds consistency between apps using Gnome Scan, and reduces the amount of code in apps/plugins.
  • Gnome Scan now implements a smart way of selecting the area and rotation for you. Select the device, select the source, select the format and the page orientation. Gnome Scan will compute whether the document must be rotated in order to fit the scanner, and centered if you use the ADF.
  • The Gnome Scan preview now allows you to resize only a “custom size” area. So if you choose A4 paper, you will just be able to move the area. Select Custom size again to adjust the area.
  • The preview area now shows a “document” icon in the top left corner instead of a centered application icon. This is far more useful when you set the rotation, in order to know where your document is, even if you didn’t trigger preview acquisition.
  • Gnome Scan now handles X/Y resolution. Gnome Scan tries to determine whether the device allows dual resolution and lets you unsync the resolutions.
  • The Gimp plugin now allows you to choose the layer name before scanning. The acquisition dialog is now 4 times smaller!!
  • Updated translations. Many thanks to Philip Sadleder for the German translation, Gil Forcada for the Catalan translation and Daniel Nylander for the Swedish translation!
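
The smart area selection described above boils down to a small geometric decision. Here is a rough Python sketch of it, under my own assumptions: place_page and Placement are hypothetical names, all sizes are in millimetres, the area is aligned to the top left corner on a flatbed, and it is centered horizontally when center is set (the ADF case). The real implementation may differ.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    x: float        # left offset of the scan area on the device, in mm
    y: float        # top offset, in mm
    width: float    # scan area width on the device, in mm
    height: float   # scan area height on the device, in mm
    rotated: bool   # True if the page lies rotated by 90 degrees


def place_page(page_w, page_h, dev_w, dev_h, center=False):
    """Fit a page_w x page_h page on a dev_w x dev_h device (all in mm)."""
    if page_w <= dev_w and page_h <= dev_h:
        w, h, rotated = page_w, page_h, False
    elif page_h <= dev_w and page_w <= dev_h:
        # Only fits sideways: scan it rotated, rotate back in software later.
        w, h, rotated = page_h, page_w, True
    else:
        raise ValueError("page does not fit the device in either orientation")
    # Assumption: an ADF centers the sheet, a flatbed aligns it to the corner.
    x = (dev_w - w) / 2 if center else 0.0
    return Placement(x, 0.0, w, h, rotated)


# A4 landscape (297 x 210) on an example 216 x 297 device, fed through the ADF.
print(place_page(297, 210, 216, 297, center=True))
```

With these example values the page only fits sideways, so the sketch reports a rotated 210 x 297 area shifted 3 mm from the left edge.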

Under the hood

  • GnomeScanDialog has been heavily reviewed. It now inherits (again) from GtkDialog, like GtkPrintUnixDialog. GnomeScanDialog now uses a GtkNotebook in order to separate fields. This allows a tab completely dedicated to preview. Contrary to printing, scan preview is interactive: the user can choose area and rotation, which is why I chose to use a tab instead of a separate window. The preview tab is shown only if the user selects the “Flatbed” source.
  • GnomeScanDialog implements a “front widget” mechanism that allows developers to add a custom widget in the General tab, below the source and format selectors. There is no API yet for adding/hiding custom tabs.
  • Introducing GnomeScanAreaSelector, a smart widget that allows the user to choose document format and orientation. When a device is selected, the widget computes a list of available formats that fit the device geometry. The widget computes whether the document needs to be rotated to fit the device geometry and sets up the context in order to rotate the document back to the right orientation. This behaviour applies with both flatbed and ADF sources. GnomeScanAreaSelector uses the new Gtk+ 2.10 GtkPaperSize API, which uses the PWG standard for sizes and names.
  • Gnome Scan now stores whether an area is user defined (custom) or not. This allows GnomeScanPreviewArea to hide the resize handles if the area is not user defined.
  • GnomeScanContext now handles dual resolution. However, I based the device dual resolution capability on the SANE 2 standard. SANE 1 backends may implement it differently. That’s one of the problems with SANE.
  • GnomeScanPreview and GnomeScanAdvancedPreview have been dropped. They largely duplicated each other. GnomeScanDialog now provides a single preview GUI.
  • GnomeScannerSelector has been dropped in favor of GnomeScanListStore, which can feed either a GtkTreeView (implemented) or a GtkComboBox. GnomeScanDialog now uses a GtkTreeView fed by a GnomeScanListStore.
  • GnomeScanOptionWidget has also been dropped. This widget only automatically added a field before a widget. It didn’t allow keeping fields aligned. Now the user can use either a GtkTable or a frame (like GnomeScanDialog).
  • Added a --disable-gnome option in order to build flegita for the Xfce desktop. (Patch from Olaf Leidinger).
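
As an illustration of how a list of formats fitting the device geometry can be computed, here is a minimal Python sketch. It uses a plain dictionary of well-known paper sizes in millimetres instead of the GtkPaperSize API, and the function names (fits, available_formats) as well as the ordering by area are my own assumptions, not what GnomeScanAreaSelector does internally.

```python
# ISO 216 and US Letter sizes in millimetres (portrait: width, height).
PAPER_SIZES = {
    "A3": (297.0, 420.0),
    "A4": (210.0, 297.0),
    "A5": (148.0, 210.0),
    "Letter": (215.9, 279.4),
}


def fits(page, device):
    """True if the page fits the device as-is or rotated by 90 degrees."""
    (pw, ph), (dw, dh) = page, device
    return (pw <= dw and ph <= dh) or (ph <= dw and pw <= dh)


def available_formats(device):
    """Names of the stock formats the device can scan, largest area first."""
    names = [name for name, size in PAPER_SIZES.items() if fits(size, device)]
    return sorted(names,
                  key=lambda n: PAPER_SIZES[n][0] * PAPER_SIZES[n][1],
                  reverse=True)


# Example device geometry: a 216 x 297 mm scan bed.
print(available_formats((216.0, 297.0)))
```

A format that only fits sideways stays in the list, which matches the idea of rotating the document back afterwards.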

How to help ?

  • Please test the smart behaviour. If your device does not center the document in the ADF, I should review the process.
  • If you can’t find a useful stock paper size, please ask for it to be added.
  • Please translate this major alpha release.
  • If your device handles dual resolution, please test whether Gnome Scan detects this capability.
  • Spread the word, ask for a package in your distribution or package it!

About Gnome Scan and Gnome

I failed to follow the Gnome 2.18 schedule. Mostly because I’m new to free software development, I do not yet manage my work as I want. I often let Gnome Scan hibernate for a few weeks, then develop it for one week and produce a release. I wish I could spend that week just before tarballs are due.
Also, Gnome Scan still has deep API changes coming. I wonder whether I will modify the API to be more consistent with the GtkPrint API, as far as possible.

The Gnome SVN migration will soon be completed. I wish to switch to hosting that is as Gnome as possible for SVN, FTP, translation, …

Future

For the next releases, I intend to implement a few features.

  • Select maximal possible area.
  • Auto select area.
  • Gamma correction.
  • Highlight/Shadow point
  • The return of GnomeScanOptionSet.
  • Device specific options tab in GnomeScanDialog.
  • Depth selection.
  • Colorspace selection.
  • Whatever you asked for …

SANE and HAL

Currently, SANE manages two tasks: probe and access. That’s a bad thing. SANE must let the OS do the probe and ask it for access. SANE should provide a simple glue between various OSes and drivers so that OSes can load a driver on plug, monitor buttons, and trigger acquisition. I hope that SANE 2 will move toward this design, or scanners in HAL will remain a dream (or cost a fork, which I really do not want). After the 1.0 release, I plan to implement one or two drivers for SANE (my father owns about 9 scanners at home, from all-in-one printers to various pens), especially for a nice business card reader. This will allow me to dive deeply into SANE, hoping to make constructive criticism of the SANE 2 design. Martin Owens is working hard on Scanner in HAL; I hope that by this summer, Gnome Scan will use HAL.

Allan, a very active SANE developer, said that Linux distro users often do not use HAL. I really think that’s a mistake. All major distros ship HAL, and tons of other distros do too. SANE can handle the probe, but must allow OS integration. Mac OS X has nothing for scanners. Vista has very nice scanner handling (especially probe and detection), but drivers are lacking or unusable; also, the Vista scan UI is a joke (not people ready!). Free desktops have a huge opportunity to make a difference against proprietary OSes; dropping “probe at each launch” and adding hotplug+button support is a must-have feature to fulfill this goal, along with OCR.
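
To show what dropping “probe at each launch” could look like from the application side, here is a small Python sketch. It is purely hypothetical: DeviceRegistry and its methods are names I made up, and nothing here calls HAL or SANE. The idea is only that the OS pushes plug, unplug and button events, and the application keeps an up-to-date device list instead of probing.

```python
from typing import Callable, Dict, List


class DeviceRegistry:
    """Keeps the scanner list up to date from OS hotplug events.

    Hypothetical class: a real implementation would receive these
    events from the OS hardware layer instead of probing at launch.
    """

    def __init__(self) -> None:
        self._devices: Dict[str, str] = {}   # device id -> driver name
        self._button_handlers: List[Callable[[str, str], None]] = []

    def device_added(self, device_id: str, driver: str) -> None:
        # Called on plug: the OS already knows which driver to load.
        self._devices[device_id] = driver

    def device_removed(self, device_id: str) -> None:
        self._devices.pop(device_id, None)

    def on_button(self, handler: Callable[[str, str], None]) -> None:
        self._button_handlers.append(handler)

    def button_pressed(self, device_id: str, button: str) -> None:
        # Events from unknown (unplugged) devices are ignored.
        if device_id not in self._devices:
            return
        for handler in self._button_handlers:
            handler(device_id, button)

    def devices(self) -> List[str]:
        return sorted(self._devices)


registry = DeviceRegistry()
pressed: List[str] = []
registry.on_button(lambda dev, button: pressed.append(f"{dev}:{button}"))

registry.device_added("usb-scanner-1", "driver-a")
registry.button_pressed("usb-scanner-1", "scan")   # could trigger acquisition
registry.device_removed("usb-scanner-1")
registry.button_pressed("usb-scanner-1", "scan")   # ignored: device is gone
print(pressed, registry.devices())
```

With such a design the frontend opens instantly with the current device list, and a button press on the scanner can start an acquisition without any polling.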

Merry Christmas and see you in 2007!

Étienne.

Good news

November 28, 2006

Hi all,

After quite some time of hibernation, I have woken up the development process of Gnome Scan. Gnome Scan hit its 100th revision! This revision is the first contributed patch, from Olaf Leidinger. It adds a --disable-gnome option for Xfce people and others. I’m also very pleased to see a new Swedish translation and an upcoming Dutch translation.

Gnome Scan seems to spread more and more. How surprised I was to see that Foresight Linux 0.9.4 includes Gnome Scan/Flegita! Gnome Scan 0.3.1 is now in feisty/universe. I even saw an unofficial ebuild. A bunch of people talk about Gnome Scan all around the web in French and German (some consider it a good replacement for xsane; I find Gnome Scan at too early a stage to fully replace xsane).

Emmanuel Fleury, associate professor at Bordeaux-I university, submitted a Gnome-OCR project to his students. The project will start in January. The idea is to link tesseract-OCR and Gnome Scan. I hope this project will end up with an Abiword plugin. Even if I mainly work on flegita, the purpose of Gnome Scan is not a standalone app, but flegita is the first step. As long as Gnome Scan has no stable API, I don’t want to provide too many unmanageable plugins. I don’t want to end up with a 7.0.0 libtool version for gnome-scan 1.0 :).

I guess that 0.3.2 will arrive as a Christmas gift to the FOSS world :). It will include nice features that may make Gnome Scan a reasonable replacement for xsane for basic end users.

So yes, Gnome Scan is still alive ! And very active !

Étienne.