DFN Workshop 2011

I had the opportunity to attend the 18th DFN Workshop (I wonder what that link will look like next year) and since it’s a great event I don’t want you to miss out. Hence I’ll try to sum up the talks and the happenings.

It was the second year for the conference to take place in the Hotel Grand Elysee in Hamburg, Germany. I was unable to attend last year, so I didn’t know the venue. But I am impressed. It is very spacious, friendly and well maintained. The technical equipment seems to be great and everything worked really well. I am not too sure whether that is the work of the hotel or the Linux Magazin though.

After a welcome reception which provided a stock of caffeine that should last all day long, the first talk was given by Dirk Kollberg from Sophos. Actually his boss was supposed to give the talk but cancelled on short notice, so he had to jump in. He basically talked about scareware and how it has become big business.

He claimed that malware used to be cyber graffiti but has nowadays turned into cyber war, Stuxnet being a good indicator of that. The newest trend, he said, was that a binary would not only be compressed or encrypted by a packer, but that the packer itself would use special techniques like OpenGL functions. That is a problem for the simulators commonly used in antivirus products.

He investigated a big Ukrainian company (Innovative Marketing) that produced a lot of scareware and was in fact very well organised. Apparently not from a security point of view though, because he claimed to have retrieved a lot of information via unauthenticated HTTP. And I mean a lot: from the company’s employee address book, through ER diagrams of internal databases, to holiday pictures of the employees. Almost unbelievable. He also discovered a server that malware was distributed from and was able to retrieve the statistics page which showed how much traffic the page made and which clients with which IPs were connecting. He claimed to have periodically scraped that page and compiled a map of IPs per country; the animation covered about 90 scraped days. I was really wondering why he didn’t contact the ISP to get that thing shut down. So I asked during Q&A and he answered that it would have been of no use to Sophos because they wouldn’t have been able to gain more insight. That is obviously very selfish: instead of doing good for the whole Internet community, they only care about themselves.

The presentation style was a bit weird indeed. He showed and commented on a pre-made video which took 30 minutes of his 50 minutes of presentation time. I found that rather bold. What’s next? A pre-spoken video which he’ll just play while standing on the stage? Really sad. But the worst part was when he showed private photos of the guy from that Ukrainian company, which he had found by accident. I also told him that I found it disgusting that he pilloried that guy in public and showed off his private life. The people in the audience applauded.

A coffee break made us calm down.

The second talk, about Smart Grids, was given by Klaus Mueller. Apparently Smart Grids are supposed to be the next big thing in urban power networks. A Smart Grid is supposed to be a power *and* communications network, and the household or every device in it would be able to communicate, e.g. to report or adapt its power consumption.

He depicted several attack scenarios and drew multiple catastrophic conclusions, e.g. what happens if that Smart Grid system is remotely controllable (which it is by design) and also remotely exploitable, so that you could turn off the power supply for a home or a whole house?
The heart of the Smart Grid system seems to be the so-called Smart Meters which would ultimately replace traditional, mechanical power consumption meters. These Smart Meters would of course be designed to be remotely controllable, because you will have an electric car which you only want to charge when power is at its cheapest, i.e. at night. Hence, the power supplier needs to be able to tell you when to turn on the car charger, the dishwasher or the washing machine.

Very scary if you ask me. And even worse: apparently you can already get Smart Meters right now! For some weird reason, he didn’t look into them. I would have thought that if he was interested in that, he would buy such a device and open it up. He didn’t even have a good excuse, i.e. no time or legal reasons. He gave a talk about attack scenarios on a system which is already partly deployed, but without actually having had a look at the rolled-out thing. That’s funny…

The next guy talked about Smart Grids as well, but this time more from a privacy point of view, although I was not really convinced. He proposed a scheme to anonymously submit power consumption data. The problem is that the Smart Meter submits power consumption data *very* regularly, i.e. every 15 minutes, and that the power supplier must not know exactly how much power was consumed in each and every interval. I follow and highly appreciate that. After all, you can tell exactly when somebody comes back home, turns the TV on, puts something in the fridge, makes food, turns the computer on and off and goes to bed. Those kinds of profiles are dangerous, albeit very useful for the supplier. Anyway, he settled on submitting aggregated usage data to the supplier and pulled self-made protocols out of thin air instead of looking into the huge body of cryptographic protocols designed for anonymous or pseudonymous communication. During Q&A I told him that I had the impression the proposed protocols and the crypto had been designed on a Sunday evening in front of the telly, and asked whether he had actually looked at any well-reviewed cryptographic protocols. He hadn’t. Not at all. Instead he had made up some random protocols which he thought were sufficient. But of course they were not, which became quite clear during the Q&A. How can you submit a talk about privacy and propose a protocol without actually looking at existing crypto protocols beforehand?! Weird dude.

The second to last speaker was a bit off, too. He had interesting ideas though and I think he was technically competent. But he first talked about home routers getting hacked and becoming part of a botnet, then switched to the PCs behind the router becoming part of a botnet, to then talk about installing an IDS on every home router which not only tells the ISP about potential intrusions but is also controllable by the ISP, as in “you look like you’re infected with a bot, let’s throttle your bandwidth”. I didn’t really get the connection between those topics.

But both ideas are a bit weird anyway. Firstly, your ISP sees the exact traffic it routes to you anyway. Hence there is no need to install an IDS on your home router, because the ISP already has the information, and their IDS will be much more reliable than some crap IDS deployed on a crap Linux running on crappy hardware. Secondly, having an ISP which is able to control your home router to shape, shut down or otherwise influence your traffic is really off the wall. At least it is today. If he assumes the home router and the PCs behind it to be vulnerable, he can’t trust the home router to deliver proper IDS results anyway. Why would we then want the ISP to act upon potentially malicious data coming from a potentially compromised home router? And well, at least in the paper he submitted he tried to do an authenticated boot (in userspace?!) so that no hacked firmware could be booted, but that would require the software in the firmware to be secure in the first place, otherwise the brilliantly booted device would be hacked at runtime as per the first assumption.

But I was so confused about him talking about different things that the best question I could have asked would have been what he was talking about.

Finally somebody with practical experience gave a talk: Stefan Metzger presented how they handle security incidents at the Leibniz Rechenzentrum. He showed us their formal steps and how they were implemented. At the heart of their system sits OSSIM, which aggregates several IDSs and provides a neat interface to search and filter. It wasn’t all too interesting though, mainly because he talked very sleepily.

The day ended with a lot of food, beer and interesting conversations 🙂

The next day started with Joerg Voelker talking about iPhone security. Being interested in mobile security myself, I really looked forward to that talk. However, I was really disappointed. He showed what more or less cool stuff he could do with his phone, e.g. setting an alarm or reading email… Since it was so cool, everybody had one. He also told us what important data lives on such a phone. After he had built up his motivation, which took very long and involved many pictures of supposedly cool applications, he showed us which security features the iPhone allegedly has, i.e. code signing, hardware and file encryption, and a sandbox for the processes. He read out the list without indicating any problems with those technologies, but he eventually said that pretty much everything was broken: you can jailbreak the thing to make it run unsigned binaries, get a dump of the disk with dd without having to provide the encryption key, and use other methods that render the protection mechanisms useless. Yet he suffered from a massive cognitive dissonance, because he kept praising the iPhone and how cool it was.
When he mentioned the sandbox, I got suspicious, because I had never heard of such a thing on the iPhone. So I asked him whether he could provide details on that. But he couldn’t. It appears to be a policy thing: your application can very well read and write data outside the directory it is supposed to use; Apple just rejects applications when they see them accessing files they shouldn’t.
I also asked him which of the protection mechanisms Apple ships on the iPhone actually work. He claimed that, with the exception of the file encryption, none do. I told him that the file encryption is proprietary code and that it appears to be a deliberate user experience design that the user does not need to provide a password for syncing files; hence a master key decrypts files while syncing.

That leaves me with the impression that an enthusiastic Apple fanboy needed to justify his iPhone usage (hey, it’s cool) without actually having had a deeper look at how stuff works.

A refreshing talk was given by Liebchen on physical security. He presented ways and methods to get into buildings using very simple tools. He is part of RedTeam Pentesting and apparently gets hired to break into buildings in order to get hold of machines, data or the network. He told funny stories about how they broke in. Their tools include a “Keilformgleiter“, “Tuerfallennadeln” or a “Tuerklinkenangel“.
Once you’re in, you might encounter glass offices, which have the advantage that, since passwords are commonly written on Post-its and stuck to the monitor, you can snoop the passwords using a big lens!

Peter Sakal presented a so-called “Rapid In-Depth Security Framework” which he developed (or so). He gave an introduction to secure software development and the steps to take in order to ship a reasonably secure product. But all of that was very high level and not really useful in real life. I think his main point was that he had classified around 300 fuzzers, and if you needed one, you could call him and ask. I expected way more, because he teased us with a framework and introduced the whole fuzzing topic, but didn’t actually deliver any framework. I really wonder how the term “framework” even made it into the title of his talk. Poor guy. He also presented softscheck.com on every slide, which now makes a good entry in my AdBlock list…

Fortunately, Christoph Wegener was a good speaker. He talked about “Cloud Security 2.0” and started off with an introduction to Cloud Computing. He explained that several different types exist, i.e. “Infrastructure as a Service” (IaaS), e.g. EC2 or Dropbox, “Platform as a Service” (PaaS), e.g. AppEngine, and “Software as a Service” (SaaS), e.g. GMail or Twitter. He drew several attack scenarios and kept stressing that you need to trust the provider if you want to do serious stuff. Hence, that was the unspoken conclusion, you must not use cloud services.

Lastly, Sven Gabriel gave a presentation about Grid security. Apparently, he supervises boatloads of nodes in a grid and showed how he and his team manage to do so. Since I don’t operate 200k nodes myself, I didn’t find it relevant, albeit interesting.

To conclude the DFN Workshop: it’s a nice conference with a lot of nice people, but it needs to improve content-wise.

OCRing a scanned book

I had the pleasure of using a “Bookeye” book scanner. It’s a huge device which helps with scanning things like books or folders. It’s very quick and very easy to use. I got a huge PDF out of the good hundred pages that I scanned.

Unfortunately the light was very bright and the scanner scanned “through” the open pages, revealing the back sides of the pages. That’s not very cool and I couldn’t really dim the light or put a sheet between the pages.
Also, it doesn’t do OCR, but my main point in digitising this book was to actually have it searchable and copy&pastable.

There seem to be multiple options to do OCR on images:

tesseract

covered already

ocropus

Apparently this is supposed to be tesseract on steroids, as it can recognise text on paper with different layouts and everything.
Since it’s a bit painful to compile, I’d love to share my experiences, hoping they will become useful to somebody.

During compilation of ocropus, you might run into issues like this or that, so be prepared to patch the code.


cd /tmp/
svn checkout http://iulib.googlecode.com/svn/trunk/ iulib
cd iulib/
./configure --prefix=/tmp/libiu-install
make && make install


cd /tmp/
wget http://www.leptonica.com/source/leptonlib-1.67.tar.gz -O- | tar xvzf -
cd leptonlib*/
./configure --prefix=/tmp/leptonica
make && make install


cd /tmp/
svn checkout http://ocropus.googlecode.com/svn/trunk/ ocropus
cd ocropus/
# This is due to this bug: http://code.google.com/p/ocropus/issues/detail?id=283
# (the leptheaders helper must be found via $PATH, so make sure ~/bin is in there)
mkdir -p ~/bin
cat > ~/bin/leptheaders <<EOF
#!/bin/sh
echo /tmp/leptonica/include/leptonica/
EOF
chmod a+x ~/bin/leptheaders
./configure --prefix=/tmp/ocropus-install --with-iulib=/tmp/libiu-install/
make && make install

muelli@bigbox /tmp $ LD_LIBRARY_PATH=/tmp/ocropus-install/lib/:/tmp/leptonica/lib/ ./ocropus-install/bin/ocroscript --help
usage: ./ocropus-install/bin/ocroscript [options] [script [args]].
Available options are:
  -e stat  execute string 'stat'
  -l name  require library 'name'
  -i       enter interactive mode after executing 'script'
  -v       show version information
  --       stop handling options
  -        execute stdin and stop handling options
muelli@bigbox /tmp $

However, I can’t do anything, because I can’t make Lua load the scripts from the share/ directory of the prefix. Too sad. It looked very promising.

Cuneiform

This is an interesting one. It’s a BSD-licensed Russian OCR software that was once one of the leading tools for OCR.
Interestingly, it’s the most straightforward thing to install, compared to the other tools listed here.

bzr branch lp:cuneiform-linux
cd cuneiform-linux/
mkdir build
cd build/
cmake .. -DCMAKE_INSTALL_PREFIX=/tmp/cuneiform
make
make install

This is supposed to produce some sort of HTML which we can glue to a PDF with the following tool.
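For a single page, a minimal invocation sketch might look like this (assuming the build above; the -f hocr switch selects the hOCR output format that we need below, and the input file name is made up):

LD_LIBRARY_PATH=/tmp/cuneiform/lib64/ /tmp/cuneiform/bin/cuneiform -l eng -f hocr -o page.hocr page.bmp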

hocr2pdf

Apparently it takes “HTML annotated OCR data” and bundles that, together with the image, into a PDF.


cd /tmp/
svn co http://svn.exactcode.de/exact-image/trunk ei
cd ei/
./configure --prefix=/tmp/exactimage
make && make install

That, however, failed for me like this:

  LINK EXEC objdir/frontends/optimize2bw
/usr/bin/ld: objdir/codecs/lib.a: undefined reference to symbol 'QuantizeBuffer'
/usr/bin/ld: note: 'QuantizeBuffer' is defined in DSO /usr/lib64/libgif.so.4 so try adding it to the linker command line
/usr/lib64/libgif.so.4: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
make: *** [objdir/frontends/optimize2bw] Error 1

Adding “LDFLAGS += -lgif” to the Makefile fixes that. I couldn’t find a bug tracker, hence I reported this issue via email but haven’t heard back yet.
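Once it builds, gluing the recognised text and the page image together might look like this (a sketch, reusing the hypothetical page.bmp and page.hocr from the cuneiform example above; -s tells hocr2pdf to place the text sloppily rather than drawing single glyphs):

/tmp/exactimage/bin/hocr2pdf -i page.bmp -s -o page.pdf < page.hocr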

Although the hOCR format seems to be the only option that lets you know where in the file the text appears, no OCR program, except cuneiform and a patched tesseract, seems to support it 🙁

gscan2pdf

As a full suite, it can import pictures or PDFs and use an OCR program mentioned above (tesseract or gocr). The whole thing can then be saved as a PDF again.
Results with gocr are not so good: I can’t really copy and paste stuff. Searching does kinda work though.

Using Tesseract, however, doesn’t work quite well:

 Tesseract Open Source OCR Engine
tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
sh: line 1:  6187 Aborted                 (core dumped) tesseract /tmp/4jZN0oNbB1/dLNBLkcjph.tif /tmp/4jZN0oNbB1/_4BdZMfGXJ -l fra
*** unhandled exception in callback:
***   Error: cannot open /tmp/4jZN0oNbB1/_4BdZMfGXJ.txt
***  ignoring at /usr/bin/gscan2pdf line 12513.
Tesseract Open Source OCR Engine
tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
sh: line 1:  6193 Aborted                 (core dumped) tesseract /tmp/4jZN0oNbB1/ELMbnDkaEI.tif /tmp/4jZN0oNbB1/C47fuqxX3S -l fra
*** unhandled exception in callback:
***   Error: cannot open /tmp/4jZN0oNbB1/C47fuqxX3S.txt
***  ignoring at /usr/bin/gscan2pdf line 12513.

It doesn’t seem to be able to work with cuneiform 🙁

Archivista Box

This is actually an appliance and you can download an ISO image.
Running it is straightforward:

cd /tmp/
wget 'http://downloads.sourceforge.net/project/archivista/archivista/ArchivistaBox_2010_IV/archivista_20101218.iso?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Farchivista%2F&ts=1295436241&use_mirror=ovh'
qemu -cdrom /tmp/archivista_20101218.iso -m 786M -usb

Funnily enough, the image won’t boot with more than 786MB of RAM; qemu just reports the CPU to be halted after a while. Quite weird. If it does work, it boots up a Firefox with a nice WebUI which seems to be quite functional. However, I can’t upload my >100MB PDF, probably because it’s a web-based thing and either the server rejects big uploads or the CGI just times out, or a mixture of both.

Trying to root this thing is more complex than usual. Apparently you can’t just give “init=/bin/sh” as a boot parameter, as it doesn’t make a difference. So I tried to have a look at the ISO image. There is fuseiso to mount ISO images in userspace; CDEmu, unfortunately, doesn’t seem to be packaged for Fedora. Not surprisingly, there was a SquashFS inside that ISO9660 filesystem. Unfortunately, I didn’t find any SquashFS FUSE implementation 🙁 And even with elevated privileges, I can’t mount that thing *sigh*:

$ file ~/empty/live.squash
/home/muelli/empty/live.squash: Squashfs filesystem, little endian, version 3.0, 685979128 bytes, 98267 inodes, blocksize: 65536 bytes, created: Sat Dec 18 06:54:54 2010
$ sudo mount ~/empty/live.squash /tmp/empty/
mount: wrong fs type, bad option, bad superblock on /dev/loop1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
$ dmesg | tail -n 2
[342853.796364] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[342853.796726] SQUASHFS error: Major/Minor mismatch, older Squashfs 3.0 filesystems are unsupported

But unsquashfs helped to extract the whole thing onto my disk. They used “T2” to bundle everything into a CD and packaged the software mentioned above. Unfortunately, very old versions are used, i.e. cuneiform in version 0.4.0 as opposed to 1.0.0. Hence, I don’t really consider it very useful to poke around that thing.
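For the record, the extraction boils down to something like this (unsquashfs from squashfs-tools reads the old 3.0 format even though my kernel refused to mount it; the target directory is made up):

unsquashfs -d /tmp/archivista-root ~/empty/live.squash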

It’s a huge thing worth exploring though. It all seems to come from this SVN repository: svn://svn.archivista.ch/home/data/archivista/svn.

WatchOCR

For some reason, they built an ISO image as well, probably to run it as an appliance.

cd /tmp/
wget http://www.watchocr.com/files/watchocr-V0.6-2010-12-10-en.iso
qemu -cdrom /tmp/watchocr-V0.6-2010-12-10-en.iso -m 1G

The image boots up a web browser which shows a web interface to the WebOCR functionality.
I extracted the necessary scripts, which wrap tools like cuneiform, ghostscript and friends. Compared to the Archivista box, the scripts here are rather simple. Please find webocr and img2pdf. They also use an old cuneiform, 0.8.0, which is older than the version from Launchpad.

However, in my QEMU instance, the WatchOCR box took a very long time to process my good hundred-page PDF.

Some custom script

This one tries to do the job and did it in fact quite well, although it’s quite slow as well. It lacks proper support for spawning multiple commands in parallel.

After you have installed the dependencies as mentioned above, you can run it:

wget http://www.konradvoelkel.de/download/pdfocr.sh
PATH="/tmp/exactimage/bin/:/tmp/cuneiform/bin/:$PATH" LD_LIBRARY_PATH=/tmp/cuneiform/lib64/ sh -x pdfocr.sh buch-test-1.pdf 0 0 0 0 2500 2000 fra SomeAuthor SomeTitle

The script, however, doesn’t really work for me, probably because of some quoting issues:

+ pdfjoin --fitpaper --tidy --outfile ../buch-test-1.pdf.ocr1.pdf 'pg_*.png.pdf'
          ----
  pdfjam: This is pdfjam version 2.08.
  pdfjam: Reading any site-wide or user-specific defaults...
          (none found)
  pdfjam ERROR: pg_*.png.pdf not found
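The error suggests that the script passes the glob pattern to pdfjoin in single quotes, so the shell never expands it and pdfjam looks for a file literally named pg_*.png.pdf. Presumably, leaving the pattern unquoted at that point in the script fixes this step:

pdfjoin --fitpaper --tidy --outfile ../buch-test-1.pdf.ocr1.pdf pg_*.png.pdf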

Having overcome that problem, the following pdfjoin doesn’t work for an unknown reason. After having replaced pdfjoin manually, I realised that the script samples the pages down, makes them monochrome and rotates them! Hence, no OCR was possible and the final PDF was totally unusable *sigh*.

It’s a mess.

To conclude…

I still don’t have a properly OCRed version of my scanned book, because the tools are not very well integrated. I believe that programs like pdftk, imagemagick, unpaper, cuneiform, hocr2pdf and pdfjam do their individual jobs very well. But it appears that they are not knit together very well to form a useful tool for OCRing a given PDF. Requirements would be, for example, that there is no loss of quality of the scanned images, that the number of programs to be called is reduced to a minimum and that everything is able to do batch processing. So far, I couldn’t find anything that fulfils these requirements. If you know anything or have a few moments to bundle the necessary tools together, please tell me :o) The necessary pieces are all there, as far as I can see. It just needs someone to integrate everything nicely.
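To make that wish a bit more concrete, here is a rough sketch of the pipeline I have in mind. It assumes the cuneiform and hocr2pdf builds from above are in the PATH, that your cuneiform build can read PNG (otherwise convert the pages to BMP first), and that pdftoppm from poppler is available; it is a sketch, not a polished tool:

#!/bin/sh
# Sketch: split a PDF into page images, OCR each page to hOCR with
# cuneiform, glue text and image back together per page with hocr2pdf
# and join the per-page PDFs with pdfjoin.
set -e
in="$1"; lang="${2:-ger}"
work=$(mktemp -d)
pdftoppm -r 300 -png "$in" "$work/pg"
for img in "$work"/pg-*.png; do
    cuneiform -l "$lang" -f hocr -o "$img.hocr" "$img"
    hocr2pdf -i "$img" -s -o "$img.pdf" < "$img.hocr"
done
pdfjoin --fitpaper --outfile "${in%.pdf}.ocr.pdf" "$work"/pg-*.png.pdf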

LaTeX leaflet and background colours

I was playing around with LaTeX’s leaflet class to produce brochures, leaflets or flyers, however you’d like to call them. Basically a DIN A4 sheet in portrait mode with three “columns” which I wanted to feel like pages. The backside needs to be upside down and the “pages” need to be properly ordered for the whole thing to be printed and folded properly.

So I had a look at the manual and noticed that it uses background colour for pages. I wanted that, too.

As the manual reads, you can use \AddToBackground to add stuff to the background. But what do you add if you want a page to have a background colour? Well, Wikibooks says to use \pagecolor. But that colours the whole DIN A4 paper and not just one virtual page in a column on the DIN A4 sheet.

I browsed around and didn’t find any real explanation but an example. At least the code uses different colours for different virtual pages and it just works. Nice.


So whenever you want to have a background colour on a single column with the leaflet class, use

\usepackage[usenames,dvipsnames]{color}
 
\AddToBackground{1}{
    \put(0,0){\textcolor{green}{\rule{\paperwidth}{\paperheight}}}}
\AddToBackground{2}{
    \put(0,0){\textcolor{red}{\rule{\paperwidth}{\paperheight}}}}
\AddToBackground{3}{
    \put(0,0){\textcolor{blue}{\rule{\paperwidth}{\paperheight}}}}
\AddToBackground{4}{
    \put(0,0){\textcolor{Magenta}{\rule{\paperwidth}{\paperheight}}}}
\AddToBackground{5}{
    \put(0,0){\textcolor{Orange}{\rule{\paperwidth}{\paperheight}}}}
\AddToBackground{6}{
    \put(0,0){\textcolor{Fuchsia}{\rule{\paperwidth}{\paperheight}}}}

It doesn’t seem to be possible to have coloured virtual pages *and* a background picture spanning the whole DIN A4 page. I tried several things, including playing around with the wallpaper package, but I didn’t have any success so far. One could split the background up into three pieces and include one of those on each page, but that’s really ugly and hacky. I don’t like that.

I kinda got it working using eso-pic and transparent, but the result is messy, because the image, which is supposed to be in the background, is in the foreground. And even with transparency, it looks bad. Just like a stamp, not a watermark.

I also tried to make the pages’ background colour transparent, but then placing the background image is very idiotic: I would have to put \AddToShipoutPicture at exactly the right place in the TeX file instead of defining it in the headers somewhere *sigh*
But anyway, it still wouldn’t work correctly, as the image, which is supposed to be in the background, would be rendered *on top* of the first virtual page on each physical page, making the colours look very weird.

So I stepped back and didn’t really want to use LaTeX anymore, and had a look at pdftk instead. It is able to put a watermark behind a given PDF, provided the PDF has a transparent background. I changed my Makefile to read like this (which is not necessarily beautiful, but I still want to share my experience):

Logo390BG-DINA4-180.pdf: Logo390BG-DINA4.pdf
        # Expand background to two pages and rotate second page by 180 deg
        pdftk I=$< cat I1 I1D output $@
 
broschuere-print.pdf: broschuere.pdf Logo390BG-DINA4-180.pdf *.tex
        # Doesn't work with pdftk 1.41, but with pdftk 1.44.
        pdftk broschuere.pdf multibackground Logo390BG-DINA4-180.pdf output $@

That worked quite well: the result looked just how it’s supposed to be.

But I wasn’t quite happy having to use external tools. I want my LaTeX to do as much as possible so that I don’t have to rely on external circumstances. Also, my Fedora doesn’t ship a pdftk version that is able to do the multibackground. So I had another look, and by now it is almost obvious: just put the background picture at (0,0), and *then* draw the background colour. Note that virtual pages 2 and 5 make up the first column on a physical page. Hence, we draw the background picture there and scale it by three, to make it span across the physical page.
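Note that, besides the color package from above, this needs graphicx (for \includegraphics) and the transparent package (for \transparent) in the preamble:

\usepackage{graphicx}
\usepackage{transparent}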

\AddToBackground{1}{
    \put(0,0){\transparent{0.5}{\textcolor{green}{\rule{\paperwidth}{\paperheight}}}}
}
\AddToBackground{2}{
    \put(0,0){
        \includegraphics[width=3\paperwidth]{Logo390BG}%
    }
    \put(0,0){%
        \transparent{0.5}{\textcolor{red}{\rule{\paperwidth}{\paperheight}}}}
}
\AddToBackground{3}{
    \put(0,0){\transparent{0.5}{\textcolor{blue}{\rule{\paperwidth}{\paperheight}}}}}
\AddToBackground{4}{
    \put(0,0){\transparent{0.5}{\textcolor{Magenta}{\rule{\paperwidth}{\paperheight}}}}%
}
\AddToBackground{5}{
    \put(0,0){
        \includegraphics[width=3\paperwidth]{Logo390BG}%
    }
    \put(0,0){%
        \transparent{0.5}{\textcolor{Orange}{\rule{\paperwidth}{\paperheight}}}
    }
}
\AddToBackground{6}{
    \put(0,0){\transparent{0.5}{\textcolor{Fuchsia}{\rule{\paperwidth}{\paperheight}}}}}

FOSS.in last edition 2010

I had the pleasure of being invited to FOSS.in 2010. As I was there to represent parts of GNOME, I feel obliged to report what actually happened.

The first day was really interesting. It was very nice to see so many people having a real interest in Free Software. It was mostly students that I talked to, and they said that Free Software was by far not a big topic at colleges in India.

Many people queued up to register for the conference. That’s very good to see. Apparently, around 500 people showed up to share the Free Software love. The usual delays in the conference setup were there as expected 😉 So the opening ceremony started quite late and began, as usual, with the lighting of the lamp.

Danese from the Wikimedia Foundation started the conference with her keynote on the technical aspects of Wikipedia.

She showed that there is a lot of potential for Wikipedia in India, because so far there has been a technical language barrier in Wikipedia’s software. Also, companies like Microsoft have spent loads of time and money on wiping out a free (software) culture, hence not many Indians got the idea of free software or free content, and many were simply not aware of the free availability of Wikipedia.

According to Danese, Wikipedia is a top-5 website, right after companies like Google or Facebook. And compared to the other top websites, the Wikimedia Foundation has by far the fewest employees: around 50, compared to the tens of thousands of employees the other companies have. She also described the openness of Wikipedia in almost every aspect. Even the NOC is quite open to the outside world; you can supposedly see the network status. Also, all the documentation about all the internal processes is on the web, so you could learn a lot about the Foundation if you wanted to.

She presented several methods and technologies which help them scale the way Wikipedia does, as well as some very nerdy details like the Squid proxy setup or customisations they made to MySQL. They are also working on offline delivery methods, because many people in the world do not have continuous internet access, which makes browsing the web pretty hard.

After the lunch break, Balbir Singh told us about caching in virtualised environments. He introduced a range of problems that come with virtualisation, for example the lack of memory, and that all the assumptions about caches that Linux makes are broken when virtualising.
Basically the problem is that if a Linux guest runs on a Linux host, both of them cache, say, the hard disk. This is, of course, not necessary, and he proposed two strategies to mitigate the problem. One of them was to use a memory balloon driver and give the kernel a hint that pages allocated for caching should be dropped earlier.

Lenny then talked about systemd and claimed that it is socket-based activation that makes it so damn fast. It was inspired by Apple’s launchd and performs quite well.

Afterwards, I went to the MeeGo room where they gave away t-shirts and Rubik’s Cubes. I was shown a technique for solving the Rubik’s Cube and I tried it. I wasn’t too successful though, but it’s still very interesting. I can’t recall the methods and ways to solve the cube, but there are tutorials on the internet.

Rahul talked about failures he has seen in Fedora. He claimed that Fedora was the first project to adopt a six month release cycle. He questioned whether six months is actually a good time frame, and the governance modalities were questioned as well. The veto right in the Fedora Board was prone to misuse. Early websites were ugly and not very inviting; by now, the website is more appealing and should invite the audience to contribute. MoinMoin was accused of not being as good as MediaWiki, simply because Wikipedia uses MediaWiki. Not very good reasoning in my opinion.

I was invited to do a talk about Security and Mobile Devices (again). I had a very interested audience, which made for an interesting Q&A session. People still come up to me with questions and ideas. I just love that. You can find the slides here.

As we are on mobile security: I wrote a tiny program for my N900 to sidejack Twitter accounts. It’s a bit like Firesheep, but does Twitter only (for now) and it actually posts a nice message. But I’ve also been pwned myself… 😉

But more on that in a separate post.


Unfortunately, the FOSS.in team announced that this will be the last FOSS.in they organise. That’s very sad, because it was a lot of fun with a very interesting set of people. They claim that they are burnt out and that if one person is missing, nothing works, because everyone knows exactly what role to take and what to do. I don’t really like this reasoning, because it reveals that the bus factor is extremely low. This, however, should be one of the main concerns when doing community work. Hence, the team is to blame for not having taken care to increase the bus factor, thus leading FOSS.in to a dead end. Very sad. But thanks anyway for the last FOSS.in. I am very proud of having attended it.

BAföG, PDF and Evince – Decrypted PDF documents

In Germany, students may apply for BAföG, which basically means they receive money for their studies. In order to apply, you have to fill out lots of forms. They provide PDFs with forms that you can, at least in theory, fill out. Well, filling them out with Evince works quite well, but saving doesn’t: it complains that the document is encrypted. WTF?

It’s a form provided by the government. You wouldn’t think that there is anything subject to DRM there, or that they would stop you from actually saving a filled document. Producing the document in the first place was paid for by us citizens, so I’d fully expect to at least be allowed to save the filled form. I don’t request the sources of that document (well, I like the idea, but I probably couldn’t do anything with them anyway), only that my government helps me fill out all those forms and doesn’t unnecessarily restrict me.

So I wrote to those folks at the office, stating that they had accidentally restricted saving the form. I received an answer quite quickly:

unfortunately, this is not an accident. The saveability of the forms is subject to a rights concept of the program vendor, according to which, above a certain number of downloads, saving the forms is no longer possible free of charge.

However, various freeware programs offer the possibility of saving the available forms on your own PC. As an example, a corresponding software package available for free download is named on the website.

(That’s my translation of their German answer.) So, it’s not an accident. The “program vendor’s rights concept” is responsible for that, and if many people actually download the PDF file, that Digital Restrictions Management requires the office to not allow people to save the forms. Erm. Yes. I haven’t verified this, but I fully expect the authoring software “Adobe LiveCycle Designer ES 8.2” to have a very weird license that makes us citizens suffer from those stupid restrictions. This, ladies and gentlemen, is why we need Free Software. And we need governments to stop using proprietary software with such retarded licenses.

Apparently, there are a few DRM technologies within PDF. One of them is a set of stupid flags inside the document that tell you whether you are allowed to, say, print the document or fill in its forms. And it was heavily discussed what to do about those, because they can be silently ignored.

Anyway, I came across Ubuntu bug 477644 which mentions QPDF, a tool to manipulate PDFs while preserving their content. So if you go and download all those PDFs with forms and do a “qpdf --decrypt input.pdf output.pdf” on them, you can save your filled form.

pushd /tmp/
for f in 1 1_anlage_1 1_anlage_2 2 3 4 5 6 7 8; do
wget --continue "http://www.das-neue-bafoeg.de/intern/upload/formblaetter/nbb_fbl_${f}.pdf"
qpdf --decrypt "/tmp/nbb_fbl_${f}.pdf" "/tmp/nbb_fbl_${f}_decrypted.pdf"
done
popd

I’ve prepared that, and you can download the fillable and savable decrypted BAföG forms from here:

Hope you can use it.

MeeGo Conference 2010 in Dublin

The MeeGo Conference 2010 took place from 2010-11-15 until 2010-11-17, and it was quite good. I don’t think I have seen so much money being put into a conference so far. That’s not to be read as a complaint though 😉

The conference provided loads of things, e.g. lunch, which was apparently sponsored by Novell. It was very good: yummy lamb stew, cooked salmon and veg, finished off with loads of ice cream and coffee. Very delicious. Breakfast was provided by Codethink, as far as I can tell. The first reception, in the evening, was held by Collabora, and drinks and food were provided. That was, again, very nice and a perfect opportunity to meet and chat with people. In fact, I met a lot of old folks that I haven’t seen for at least half a year. But with the KDE folks entering the scene, I’ve also met a few new interesting people.

The venue itself is very interesting and they definitely know how to accommodate conference attendees. It’s a stadium and very spacious. There were an awful lot of stadium staff taking care of us. The rooms were well equipped, although I badly missed power sockets.

The second evening was spent in the Guinness Storehouse, an interesting museum which tells you how Guinness is made. They also have a bar upstairs, and food, drinks and music were provided. I guess the Guinness couldn’t have been better 🙂

The third evening was spent in the stadium itself, watching Ireland play Norway. Football, that is. There was a reception with drinks and food downstairs in the Presidents Suite. They even handed out scarves reading “MeeGo Conference”. That was quite decadent. Anyway, I only saw the first half, because I was at the bar for the second, enjoying Guinness and gin tonic 😉

Having sorted out the amenities (described in more detail here), let’s have a look at the talks that were given. I actually attended a few, although I would have loved to visit more.

Enterprise Desktop – Yan Li talked about his work on making MeeGo enterprise-ready, meaning support for VPNs, Exchange mail, large LDAP address books, etc. His motivation is to bring MeeGo to his company, Intel. It’s not quite there yet, but apparently there is an Enterprise MeeGo which has a lot of fixes already, which were pushed upstream but are not yet packaged in MeeGo. His strategy for bringing the devices to the people was to not try to replace people’s old devices, but rather to give them an additional device to play with. Interesting approach, and I’d actually like to see the results in a year or so.

Compliance – There is a draft specification, but the final one will be ready soon. If you want to be compliant, you have to ensure that you are using the MeeGo API (Qt, OpenGL ES, …) only. That will keep you compatible for the whole minor version series. There will also be profiles (think: Handset, Netbook) which define additional APIs or available screen real estate. In return, you are allowed to use the MeeGo name and the logo. Your man asked the audience to try the compliance tools, give feedback and review the handset profile draft.

Security – There will be MSSF, a Mobile Simplified Security Framework, in MeeGo 1.2. It’s a MAC system which is supposed to go mainline. So yes, it is yet another security framework in Linux, and I didn’t really understand why it’s necessary. There’ll be a “Trusted Execution Environment” (TrEE) as well, which means that the device has to have a TPM with a hardwired key that you can neither see nor exchange. I don’t necessarily like TPMs. Besides all that, “Simplified Mandatory Access Control” (SMACK) will be used. It is supposedly like SELinux, but doesn’t suck as much. Everything (processes, network packets, files, I guess other IPC, …) will get labels, and the policies will be simple, something like “Application 1 has a red label, and only if you have a red label, too, can you talk to Application 1”. We’ll see how that’ll work. On top of all that, an integrity protection system (IMA) will be used to load and execute signed binaries only.

Given all that, I don’t like the development in this direction. It clearly is not about the security of the person owning the device in question, but about protecting the content mafia. It’s a clear step in the direction of Digital Restrictions Management (DRM) under the label of protecting the user’s data. I’m not saying that they are trying to hide it, but they are not calling it by its right name either.

A great surprise was to see Intel and Nokia handing out Lenovo IdeaPads to everybody. We were asked to put MeeGo on the machine, effectively removing the Windows installation. Three years ago, when I got my X61s, it was a piece of cake to return your Windows license. By now, things might have changed. We’ll see. I’ll scratch the license sticker off the laptop, write a letter to Lenovo and see what happens. Something like this (copied from here):

Lenovo Deutschland GmbH
Gropiusplatz 10
70563 Stuttgart

Subject: Return of a Windows license

Dear Sir or Madam,

I hereby return the Windows license acquired together with a Lenovo notebook, in accordance with the End User License Agreement (EULA) of Microsoft Windows.

The Windows EULA grants me the right to a refund of the price of the Windows license from the manufacturer of the product with which I acquired the license, provided that the bundled Windows license was not activated and registered at startup and the EULA was not accepted. I did not agree to the EULA, as it contains numerous points that are unacceptable to me, for example:

– Activating the software sends hardware information to Microsoft (section 2 of the EULA).
– “Internet-based services” such as the “Windows update feature” can be shut off by Microsoft at any time (section 7 of the EULA). De facto, there is thus no right to security updates.

Instead, I decided in favour of the competing product Ubuntu, as it is of better quality and has a more consumer-friendly EULA.

In the past, you have refused other Lenovo customers the return of their Windows license, arguing that the Windows operating system acquired with the device is an “integral part” of the product and that the Windows license can only be returned together with the product as a whole.

This view is not correct, for the following reasons:
– Windows licenses are also sold individually; tying software to a specific hardware device (OEM contract) is not permissible under German law. [1]
– The notebook in question can also be used productively with other, individually available operating systems (among them Ubuntu). Your products in particular run superbly with Ubuntu (with very few exceptions).
– However, the notebook at hand cannot be purchased without a Windows license, or without any operating system at all.

Furthermore, I am aware of several cases in which you have successfully refunded Windows licenses based on the form I am using.

I therefore ask you to refund me the cost of the Windows license and to take back the acquired Windows license on its own.

Alternatively, please let me know how I can return the device as a whole.

Yours sincerely,

[1] Cf. the ruling of the German Federal Court of Justice (BGH), I ZR 244/97, of 6 July 2000
(http://tiny.cc/IZR24497 and http://www.jurpc.de/rechtspr/20000220.htm).

The performance of MeeGo on that device is actually extremely bad. WiFi is probably the only thing that works out of the box. The touchpad can’t click, the screen doesn’t rotate, the buttons on the screen don’t do anything, locking the screen doesn’t work either, there is no on-screen keyboard, multi-touch doesn’t work with the screen, and the accelerometer doesn’t work. It’s almost embarrassing. But Chromium kinda works. Of course, it can’t actually do all the fancy GMail stuff like phone or video calls.

The window management is a bit weird. If you open a browser, it gets maximised and you get a title bar for the window, and you can drag the title bar to unmaximise the window. But if you then open a new browser window, it’ll be opened in a new “zone”. Hence, it’s quite pointless to have a movable browser window with a title bar. In fact, you can put multiple (arbitrary) windows in one zone if you manually drag and drop them from the “zones” tab, which is accessible via a Quake-style top panel. If you put multiple windows into one zone, the window manager doesn’t tile them. By the way: if you’re using the touchscreen only, you can’t easily open this top panel, because you can’t easily reach the *very* top of the screen. I hope that many people will have a look at these issues now and eventually fix them. Anyway, thanks Intel and Nokia 🙂

jOEpardy released as Free Software

As mentioned in an earlier post, I was investigating the possibility of setting jOEpardy free. It’s a Java program that lets you hold a Jeopardy session based on an XML file. It has been used quite a few times and is pretty stable. A boatload of credits go to Triphoenix, who coded an awful lot without very much time, probably lacking sleep or coffee or even both. Thanks.

So to make the announcement: jOEpardy is GPLv3+ software (*yay*) and you can download the code via Mercurial here: https://hg.cryptobitch.de/joepardy. I don’t intend to make tarball or binary releases as I (at your option) either don’t have the time or simply don’t see a need.

But to actually use it, you want to have some buzzers, although you could play it with a regular keyboard. At the end of the day, you need to generate keycodes for a “1”, a “2”, a “3” or a “4”. If you’re nerdy enough, you can get yourself an emergency button from the nearest hardware store and solder some tiny serial logic to it. Then you can read that serial signal and convert it to X events via XTest.
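A tiny sketch of that last step, assuming the button shows up as /dev/ttyUSB0 and that xdotool (which synthesises key events via XTest) is installed; the device path and the emitted key are made up:

# read one byte per button press and fake a "1" key press
# (you may need to configure the port with stty first)
while read -r -n 1 _; do
    xdotool key 1
done < /dev/ttyUSB0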

You’ll figure smth out 😉

Stip-OUT report

For my recent year in Dublin I got a “STIP-OUT” scholarship from my local university and I was supposed to write a tiny report. So here it comes (in German though).

I won’t go into too much detail about the course I attended just yet, but in a nutshell: Dublin was a nice experience, personally and technically. Going to Dublin for a year abroad is a good option to just “get out”, because although it is different, it is still European enough. However, people and universities work differently. Everything is very open but also very stressful.

Oh srsly? 300MBs for a scanner driver (/.-)

My granny asked me to bring her a driver for her all-in-one scanner thingy, because it would take her too long to download it herself. Well, I wasn’t too sure whether it’s HP’s fault for not supporting the generic device classes, or Windows 7‘s fault for not implementing the USB printer or scanner class drivers (but they should). However, I didn’t think a driver could be that huge. HP expects you to download a whopping 290MB! For making their product work!

But they are serious. You cannot download anything smaller than that. ๏̯͡๏ I thought they were kidding me. Must be a very complicated device… Well, I’m copying their BLOBs onto a pendrive now…

The beauty of a free (Maemo) handset

During GUADEC, I of course wanted to use my N900. But since the PR1.2 update, the Jabber client wouldn’t connect to the server anymore, because OpenSSL doesn’t honour imported CAs. So the only option to make it connect is to ignore SSL errors. But as I’m naturally paranoid, I didn’t dare to connect… It’s a nerdy conference with a lot of hackers, after all.

Fortunately, I had all those nice Collaborans next to me and I could ask loads of (stupid?) questions. Turns out that the Jabber client (telepathy-gabble) on the N900 is a bit old and uses loudmouth and not wocky.

So I brought my SDK back to life (jeez, it’s very inconvenient to do stuff with that scratchbox setup 🙁 ) and I was surprised that apt-get source libloudmouth1-0 was sufficient to get the code, and that apt-get build-dep libloudmouth1-0 && dpkg-buildpackage -rfakeroot built the package. Almost easy (I had to fix loads of dependency issues, but it then worked out).

As neither I nor the Collaborans knew how to integrate with the Certificate Manager, I just wanted to make OpenSSL aware of the root CA which I intended to drop somewhere in ~/.certs or so.

After a couple of busy conference days I found out that code which implements the desired functionality already existed, but was commented out. So I adapted that, and now loudmouth imports certificates from /home/user/.maemosec-certs/ssl-ca before it connects (my first attempt used /home/user/.config/telepathy/trusted-cas.pem or /home/user/.config/telepathy/certs instead, a file with all root CAs PEM encoded and a plain certificate directory). The directory that is actually used is the one the certificate manager puts certificates into after you’ve imported them; if you add certs by hand, you have to put PEM or DER encoded certs in there and then run c_rehash in it, because just loading any .pem or .der file would have been too easy to work with. It was hard for me to understand OpenSSL’s API. This article helped me a bit though, so you might find it useful, too.
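For illustration, importing a root CA by hand might look like this (the certificate file name is made up; c_rehash creates the hash symlinks OpenSSL expects in such a directory):

cp MyRootCA.pem /home/user/.maemosec-certs/ssl-ca/
cd /home/user/.maemosec-certs/ssl-ca/ && c_rehash .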

So if you want the Jabber client on your N900 to connect to an SSL/TLS secured server that uses a root CA that is not in the built-in certificate store, grab the .deb here. You can, of course, get the source as well.

Turns out that there is a workaround mentioned in bug 9355, so you might consider it easier to modify the system files yourself instead of letting the package manager do it.

The bottom line being that it’s wonderful to be allowed to study the code. It’s wonderful to be allowed to fix stuff. And it’s wonderful to be allowed to redistribute the software, even with my own modifications. And it will stay that way for the lifetime of that piece of software. I do love Free Software.

Creative Commons Attribution-ShareAlike 3.0 Unported
This work by Muelli is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.