Hi everybody,

I was away last week and did not work much on Gnome Scan. I met Lionel Dricot and Raphaël Slinkx in Louvain-la-Neuve. Funny 🙂 I’m just back home, catching up on mail, RSS, and such.

Weekly Report

The past two weeks were busy with various improvements to Gnome Scan, particularly printing (using GtkPrint). I added the code to handle multiple actions (mail to come soon). That is not easy, since GtkPrint is not well suited to such a special case: configure scan and print, then acquire and print. It is designed more for a workflow like configure > scan > configure > print.

Final Report

No doubt this SoC has been full of new features and improvements for Gnome Scan; however, there is still room for a lot of development.

The bad

  • I still haven’t reached the 0.6 stage (i.e. all features from Gnome Scan 0.4.1 reimplemented).
  • No real processing at all (rotation, deskew, gamma, etc.). I would like to use the CPU only where the hardware falls short; this is one reason why I haven’t implemented it yet.
  • I need another project (temporarily named libgnocr) to provide a modular OCR API and UI (primarily on top of OCRopus). Contributions are welcome.
  • Preview is neither stable nor optimized.
  • Page orientation is not yet handled (adding a button is not enough to add a feature :/).
  • Memory leaks.
  • Printing sucks (for now).
  • Documentation is incomplete (but we have screenshots).
  • I didn’t attend GUADEC 2007 🙁

The good

  • Far far far better SANE support.
  • Far far far better scanner handling (no more hardcoded options, etc.).
  • Multi-threaded (no news about thread safety; comments welcome)
  • Gegl-based, for efficient handling of huge images
  • AbiScan: the power of Gnome Scan and OCRopus in AbiWord
  • Flegita Gimp: you can already use Gegl in Gimp!
  • Preliminary printing support
  • Preview acquisition monitoring
  • Modular preview area
  • Module loading system, allowing backends to be installed or uninstalled (e.g. choosing TWAIN or SANE, removing the file backend, etc.)

I’m not satisfied with the amount of work I did during August. I plan to work full-time on Gnome Scan next week, and I should also code a bit this week. 🙂 Gnome Scan is about 12,000 lines of code (excluding headers).

The best thing was the contact with other people. First with Vincent Untz, my mentor, who kept his distance except when I needed him :). Then came all the developers from external projects like Gegl, OCRopus and AbiWord. The same goes for -hackers and the members of various GIMPNet channels. All these people have my heartfelt thanks. The same goes for users and their very important feedback (thanks, Jean-François Fortin!), and for you, dear reader and commenter.

Thanks and see you soon!

Étienne.


E Ultreïa !

State of OCR in Gnome

August 7, 2007

Hi everybody,

My work on flegita-gimp does not mean I forgot OCR, which is, IMHO, the first-class feature of scanning. Writing AbiScan clarified my vision of how to design the Gnome OCR UI. Before writing AbiScan, I was wondering how to integrate OCR into Gnome Scan. I was really worried because Gnome Scan is designed to pass an image (as a GeglBuffer) to the application, not text, HTML or whatever the OCR output format is. I decided to write AbiScan and use ocropus directly instead of going through Gnome Scan.

This led me to work out how Gnome will get OCR and an OCR UI. AbiScan uses the ocropus command-line tool; the idea is to use a library providing a common OCR UI instead. This library should be shipped by OCRopus. Why OCRopus and not Gnome Scan? Because I think this library depends more on OCRopus than on Gnome Scan. I may provide an OCR sink in Gnome Scan that helps plug Gnome Scan and OCRopus together, but it would not cover the UI and OCR interaction, which should rely heavily on OCRopus itself, just like the OCRopus command-line tool.

Publishing AbiScan seems to have raised questions from users. At the risk of repeating the OCRopus website, let me explain a bit about OCRopus. OCRopus is not an OCR engine. OCRopus is a document analysis and OCR system. Instead of writing its own OCR engine, it uses existing ones, especially tesseract, with more to come. The difference between OCRopus and an OCR engine is exactly the same as between HTML and plain text. HTML contains semantics, formatting and the text itself, while plain text contains only… text! So if you ever read a comparison between OCRopus and e.g. gocr or ocrad, you can laugh at it. Well, in fact, ocrad has a minimal layout analyser for text columns, but it is not as advanced as the OCRopus layout analyser.
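
To make the HTML versus plain text comparison concrete, here is a minimal Python sketch that reads layout information back out of OCR-annotated HTML. It assumes hOCR-style markup, where each text line is an element with class="ocr_line" and a title attribute of the form "bbox x0 y0 x1 y1"; the exact markup emitted by a given OCRopus revision may differ. LineCollector and parse_bbox are illustrative names, not part of any OCRopus API.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


def parse_bbox(title):
    """Extract (x0, y0, x1, y1) from a title like 'bbox 10 20 300 40'."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class LineCollector(HTMLParser):
    """Collect (bbox, text) for every element marked class="ocr_line"."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._depth = 0      # nesting depth inside the current line element
        self._bbox = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self._depth:
            self._depth += 1
        elif "ocr_line" in (attrs.get("class") or "").split():
            self._depth = 1
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Normalise whitespace before storing the line text.
            text = " ".join("".join(self._chunks).split())
            self.lines.append((self._bbox, text))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_lines(html):
    """Return a list of (bbox, text) tuples found in the given HTML."""
    collector = LineCollector()
    collector.feed(html)
    return collector.lines
```

A plain OCR engine would hand back only the text; the bounding boxes recovered here are the kind of layout information that makes the HTML output richer.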

Regards,
Étienne.

New flegita-gimp for r400

August 7, 2007

Hi everyone,

I rewrote flegita-gimp on top of Gnome Scan 0.5.2. It landed in SVN as the 400th revision! The plugin was very easy to write, reusing code from the previous one. One thing to note: it is the first ever use of Gegl inside the Gimp! Keep in mind that Gegl is designed as the future of Gimp. In the future, Gnome Scan and Gimp should speak natively using GeglBuffer, but for Gimp 2.3, I had to translate the GeglBuffer into a GimpPixelRgn.

Using GeglBuffer allows manipulating images of unlimited size from high-resolution scans (provided there is no memory leak). I wonder whether it makes sense to add advanced options such as layer opacity or layer mode. I already added an entry field for naming the layer, which avoids having to rename it after the scan. Feedback welcome.
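
What keeps huge scans tractable is working on one rectangular tile at a time instead of loading the whole image. The following Python sketch only illustrates that idea, assuming a buffer-to-buffer copy can proceed tile by tile; iter_tiles, copy_region and the read_tile/write_tile callbacks are hypothetical names, not Gegl or Gimp API.

```python
def iter_tiles(width, height, tile_w=64, tile_h=64):
    """Yield (x, y, w, h) rectangles that cover a width x height image."""
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            # Tiles on the right and bottom edges may be smaller.
            yield x, y, min(tile_w, width - x), min(tile_h, height - y)


def copy_region(read_tile, write_tile, width, height, tile_w=64, tile_h=64):
    """Copy an image tile by tile; return the number of tiles copied.

    read_tile(rect) returns the pixel data for rect, and
    write_tile(rect, data) stores it in the destination.
    Only one tile's worth of data is held at any time.
    """
    count = 0
    for rect in iter_tiles(width, height, tile_w, tile_h):
        write_tile(rect, read_tile(rect))
        count += 1
    return count
```

Memory use then depends on the tile size rather than on the scan resolution.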

I provide a screencast video of flegita-gimp 0.5.2 showing the two features provided by the plugin: scan as new image and scan as layer. flegita-gimp should receive improvements such as better integration with undo/redo.

If you want to test it, that’s easy (compared to AbiScan) :

  1. install Babl and Gegl from SVN
  2. install Gimp 2.3 or SVN
  3. install Gnome Scan SVN
  4. launch the Gimp and use /File->Acquisition->Scan or /File->Scan as Layer to trigger the plugin.

This is the second plugin on top of Gnome Scan. I’m quite happy with that. Gnome Scan is really cool to use. It really needs debugging and polishing, but the API actually rules.

Regards.

AbiScan Preview

August 6, 2007

Hi all,

After about one week of lazy effort, I managed to produce a preliminary version of AbiScan on top of OCRopus. I produced a screencast video of direct OCR import into an AbiWord frame. This is very buggy, but very exciting too :).

I must thank the #abiword people, especially Dominic Lachowicz, Marc Mauer, Martin Sevior, jean, sum1 and Hubert Figuière. Thanks go to the OCRopus and Gegl people for their work and advice.

I provide an AbiScan patch against abiword-plugins SVN. The plugin does not work if AbiWord uses the G_MODULE_BIND_LAZY flag; this is a bug in AbiScan, not in AbiWord. I also provide a patch against abiword SVN removing the g_module_open flags, but hopefully it will never be merged.

If you want to try it, follow these steps:

  1. Install tesseract-ocr from SVN, with the patch I provided in the tesseract BTS;
  2. Install ocropus;
  3. Install Gegl SVN;
  4. Install Gnome Scan SVN;
  5. Install abiword SVN with the g-module-open-flags.diff patch;
  6. Install abiword-plugins SVN with the abiscan.diff patch;
  7. Launch AbiWord;
  8. Choose Insert > Import from scanner and follow the steps.

Warning: this is really buggy.

  • Gnome Scan does not handle the device list very well if you open the dialog several times.
  • OCRopus does not provide any API, so the plugin uses system() and is not able to monitor progress. OCRopus might take a very long time.
  • Sometimes it eats tons of memory.
  • Currently it loses formatting, due to a bug in pasteFromBuffer() for HTML import. I had to choose between pasting into the existing document and losing formatting, or opening the temporary OCRopus HTML file directly.
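
The "no API, so shell out" pattern from the second point can be sketched as follows. AbiScan itself calls system() from the plugin; this Python version is only an illustration of the same approach, and run_ocr is a hypothetical helper. The command to run is left to the caller, since the exact OCRopus command line is not given here. As with system(), there is no progress information, but a timeout at least bounds a very long run.

```python
import subprocess


def run_ocr(command, timeout=None):
    """Run an external OCR command and return what it wrote to stdout.

    command is an argument list (no shell involved). The call blocks
    until the tool exits. Raises RuntimeError on a nonzero exit status
    and subprocess.TimeoutExpired if the timeout (seconds) is exceeded.
    """
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode output to str
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"OCR command failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout
```

Monitoring real progress would need cooperation from the tool itself, which is one more argument for OCRopus shipping a library.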

Bug reports are very welcome; please file bugs against the gnome-scan product in Gnome Bugzilla, under the abiscan component. Note that OCRopus prefers 150 dpi images.

Anyway, this is a rough draft with the key features provided by Gnome Scan and OCRopus: tight integration into the application and advanced OCR.


Regards,
Étienne


E Ultreïa !

Back from scouting

July 30, 2007

Hi all Gnome lovers,

I’m back from 17 days of scouting in nature. It was great. I published some photos of the camp at Faye in Nièvre. I came back last Friday and was exhausted.

I haven’t resumed Gnome Scan development yet. I’ll take the time to think about the future of Gnome Scan, especially OCR. Sadly, there was not much work on OCRopus during the past 3 weeks. I wonder how to pass data to the application. OCRopus output is HTML with OCR tags. That’s useful but not very clean. I wonder how to integrate that in AbiWord.

So, my plan for the end of the summer is to implement OCR, rotation, and the Gimp and AbiWord plugins.

Hi,

On Monday, I will take off for more than two weeks. I will look after about 41 boys at scout camp from the 10th to the 27th. The last few weeks have seen terribly slow progress on Gnome Scan. Even though the TODO list is far from empty, I didn’t have the motivation to get things done. This comes after about 3 months of hard work on Gnome Scan. It’s time to take a break. Scouting is perfect for that. Playing with boys in nature will make me forget gnome-scan for some time!

During those weeks of fun, I hope that gnome-scan related projects such as OCRopus, tesseract or even Gegl will get some improvements. I hope to get my motivation back in late July. August will be a rush for gnome-scan, for the final evaluation and the 0.5.2 release. 0.6 will land for Christmas, just like 0.4. That may be enough to get it included in Gnome 2.22!?

While Gnome Scan 0.4 was more of a “proof of concept”, Gnome Scan 0.5 is a solid base for future development. I intend to begin writing a tutorial and, why not, a lightning talk for GUADEC 2008. 2.24 or even 2.22 may be an important step for scanning in Gnome. I’m really looking forward to a desktop with rocking printing, burning and scanning dialogs in a lot of applications.

Anyway, I’ll be off for the next three weeks after 3 months of heavy development. The future is full of promise. I hope to find things changed after the scout camp, and to find the Gnome Love again 🙂

Photo of two friends at Chamechaude, in the French Alps, during a winter scout camp.

Étienne.

Investigating OCR

July 6, 2007

Hi,

Since 0.5.1, I have been investigating OCR for Gnome through Gnome Scan. The most advanced software is OCRopus. OCRopus is to tesseract what HTML is to plain text. And in fact, OCRopus outputs HTML :). OCRopus is currently based on the famous tesseract OCR engine, but some hocr code is in the repo, and more is to come.

Just as with Gegl, I make sure that even the alpha software I use in gnome-scan is at least packaged, either in an official repository or by myself. Tesseract is packaged, but OCRopus requires tesseract SVN, which does not build and has a bug (public headers include config.h). The build failure has a patch waiting for inclusion, and I provided a patch fixing the latter issue.

The main issue is that OCRopus has never published a release. It has only an SVN repo. OCRopus uses Jam without any rule to generate a distribution package (like make distcheck from automake). After some research, I decided to add an automake build system to OCRopus. Tesseract has two build systems: one for developers, one for distributors. Automake is still the best solution for distributors.

I posted a fairly big but incomplete patch to the OCRopus mailing list, which received some attention. It seems that tesseract and OCRopus are very close to each other. Neither is very active. I hope that gnome-scan will nudge the developers of those projects and bring them closer to users. OCRopus has only a command-line tool; it should provide a library. Tesseract provides 11 libraries; it should provide only one!

So, gnome-scan OCR is more or less blocked by upstream projects that need some love. Both projects need to modify their design a bit. I think I will provide a “protocropus” plugin for gnome-scan: a simple bridge between the various command-line OCR tools or libraries (including OCRopus) that does not depend on OCRopus.

Étienne.

Picasa for Linux

June 28, 2007

Hi,

Most of you have read the latest news about Picasa for Linux. Even though it is just Wine-linked software, thanks go to Google for providing regular repositories for tons of distros. It is amazing to see such a company taking FOSS into account, trying to understand the way it works instead of trying to use Windows® methods on Libre systems.

Anyway, the key point about Picasa is its scan feature. You may or may not have found it. I guess they use an internal SANE/TWAIN bridge. I have to admit the scanner support is really rough and buggy (or quick and dirty, if you prefer).

Device options are dumped with one tab per option group. There is no preview area. The overall UI is messy, inconsistent and unfriendly. I’m not criticizing this because it’s proprietary; I bet even the Google people behind it think the same.

However, it is quite obvious why they ended up with such a result. After writing two SANE frontends (counting the entire gnome-scan rewrite for the 0.6 release), I can say that SANE demands a huge amount of work from frontend developers, for various reasons:

  • options are very primitive and not directly suitable for a UI (see screenshot: four options for selecting the scan area). However, these options are still declared with “high-level” metadata such as title, description, etc., suitable for building a UI. Options are even grouped.
  • backends are very, very inconsistent. See my various posts on the sane-devel list. SANE 1.x lacked tons of recommendations. It is normal that SANE 1.0 did not provide all the recommendations for today’s scanners. However, no update has been made since the publication of the SANE 1.0 standard. That’s the lame part. Since then, all backends implement options with a lot of originality, making the frontend developer’s life hell.
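
The grouping mentioned in the first point can be turned into UI structure fairly mechanically. In the SANE API, an option descriptor of type SANE_TYPE_GROUP marks the start of a group, and the options that follow belong to it. The Python sketch below mimics that with a simplified, hypothetical Option record (not a real SANE binding); group_options and the "General" fallback title are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Simplified stand-in for a SANE option descriptor (hypothetical)."""
    name: str
    title: str
    kind: str  # "group", "bool", "int", "fixed", "string" or "button"


def group_options(options, default_title="General"):
    """Split a flat option list into [(group_title, [options])].

    A "group" entry opens a new group; every following option belongs
    to it until the next "group" entry. Options listed before any group
    marker land in a default group. Order is preserved.
    """
    groups = []
    current = None
    for opt in options:
        if opt.kind == "group":
            current = (opt.title, [])
            groups.append(current)
            continue
        if current is None:
            current = (default_title, [])
            groups.append(current)
        current[1].append(opt)
    return groups
```

Each resulting group could become a tab or a frame. The hard part described above remains: four raw coordinate options still have to be recognised and replaced by a single scan-area selector, and that is where backend inconsistency hurts.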

I would add another problem: poor integration with the host OS. SANE claims to be very portable, and it actually is. However, portability does not mean integrating everywhere with only generic code reduced to the least common denominator. It should mean using each OS’s native models and structures, all abstracted behind internal models and structures. That’s true for the abstraction layer, device probing, networking, etc.

I really like SANE. It’s lightweight. It’s extensible. It’s well designed. Yes, it’s well designed! However, its implementation lacks consistency and real portability.

Étienne.


Hi all,

I’m not blogging a lot since I produce a weekly report on gnome-soc-list. Also, the motivation is not that high. I am continuing with the new preview area system. It is quite nice. It uses GtkStyle colors and is extensible. It is not easy to rewrite such software. On the one hand you get excitement and motivation from new features and huge improvements, but on the other hand, it’s a bit of duplication, wasted time, etc.

After the preview area, the big job is processing. I prepared that part of the software a bit, but wasn’t able to work out exactly what I wanted. I guess I will provide a basic processor for rotation and color correction; however, I don’t know how to make it really modular. Should I use multiple processors? Also, I wonder how to take hardware color correction into account, as well as calibration.

Anyway, I would like to write OCR as soon as possible. Writing a dumb plugin for gedit or AbiWord would be fun. I tested OCRopus. Well, tesseract SVN did not build, so OCRopus couldn’t be built. Tesseract development does not seem active. I know some people are working on it, but not in the official SVN. Finally, I found a patch in their bug tracker which solved the problem. Tesseract and OCRopus definitely need some feedback. This is where gnome-scan comes in.

Even though OCRopus is far from ready for distribution, I am looking into it. Gnome Scan and Gegl are in a similar state. I hope this will help and motivate their developers. Some projects really need some love. Let’s give them what they need.

OCRopus should require an additional dialog with a preview of the result. I bet the new preview area will be suitable for this thanks to its modular design. The new preview area will also fit exotic needs like multiple areas without bloating the library.

Étienne.

Hi all of you !

It’s time for a release after the reset of the project. Gnome Scan 0.5.1 has almost all the features of 0.4 and tons of both user-visible and internal improvements. Highlights of this release are the plugin system, dynamic UI building, Gegl-based processing, multi-threaded programming, file acquisition and more. Regressions from 0.4 are: no rotation, no Gimp plugin (please notify me if you find others). Read the NEWS and ChangeLog for more information.

I am producing this release in order to get feedback. In particular, I would like feedback on SANE support, UI design, responsiveness and the API. If anyone feels like writing the Gimp plugin, that would be a great test for the new design.

Update: I fixed packaging, thanks to Karel Demeyer and Anonymous #1.

Cheers !
Étienne.