Taint Tracking for Chromium

I forgot to blog about one of my projects. I had actually already talked about it more than one year ago and we had a paper at USENIX Security.

Essentially, we built a protection against DOM-based Cross-site Scripting (DOMXSS) into Chromium. We did that by detecting whenever potentially attacker provided strings become JavaScript code. To that end, we made the HTML rendering engine (WebKit/Blink) and the JavaScript engine taint aware. That is, we identified sources of values that an attacker could control (think window.name) and marked all strings coming from those sources as tainted. Then, during parsing of JavaScript, we check whether the string to be compiled is actually tainted. If that is indeed the case, then we abort the compilation.

That description is a bit simplified. For example, not compiling code because it contains some fragments of the URL would break a substantial number of Web sites. It’s an unfortunate fact that many Web sites either eval code containing parts of the URL or do a document.write with a string containing parts of the URL. The URL, in our attacker model, can be controlled by the attacker. So we must be more clever about aborting compilation. The idea was to only allow literals in JavaScript (like true, false, numbers, or strings) to be compiled, but not “code”. So if a tainted (sub)string compiles to a string: fine. If, however, we compile a tainted string to a function call or an operation, then we abort. Let me give an example of an allowed compilation and a disallowed one.


<HTML>

<TITLE>Welcome!</TITLE>
Hi

<SCRIPT>
var pos=document.URL.indexOf("name=")+5;
document.write(document.URL.substring(pos,document.URL.length));
</SCRIPT>

<BR>
Welcome to our system

</HTML>

Which is from the original report on DOM-based XSS. You see that nothing bad will happen when you open http://www.vulnerable.site/welcome.html?name=Joe. However, opening http://www.vulnerable.site/welcome.html?name=alert(document.cookie) will lead to attacker provided code being executed in the victim’s context. Even worse, when opening with a hash (#) instead of a question mark (?) then the server will not even see the payload, because Web browsers do not transmit it as part of their request.

“Why does that happen?”, you may ask. We see that the document.write call got fed a string derived from the URL. The URL is assumed to be provided by the attacker. The string is then used to create new DOM elements. In the good case, it’s only a simple text node, representing text to be rendered. That’s a perfectly legit use case and we must, unfortunately, allow that sort of usage. I say unfortunate, because using these APIs is inherently insecure. The alternative is to use createElement and friends to properly inject DOM nodes. But that requires comparatively much more effort than using the document.write. Coming back to the security problem: In the bad case, a script element is created with attacker provided contents. That is very bad, because now the attacker controls your browser. So we must prevent the attacker provided code from execution.

You see, tracking the taint information is a non-trivial effort and must be done beyond newly created DOM nodes and multiple passes of JavaScript (think eval(eval(eval(tainted_string)))). We must also track the taint information not on the full string, but on each character in order to not break existing Web applications. For example, if you first concatenate with a tainted string and then remove all tainted characters, the string should not be marked as tainted. This non-trivial effort manifests itself in the over 15000 Lines of Code we patched Chromium with to provide protection against DOM-based XSS. These patches, as indicated, create, track, propagate, and evaluate taint information. Also, the compilation of JavaScript has been modified to adhere to the policy that tainted strings must only compile to literals. Other policies are certainly possible and might actually increase protection (or increase compatibility without sacrificing security). So not only WebKit (Blink) needed to be patched, but also V8, the JavaScript engine. These patches add to the logic and must be execute in order to protect the user. Thus, they take time on the CPU and add to the memory consumption. Especially the way the taint information is stored could blow up the memory required to store a string by 100%. We found, however, that the overhead incurred was not as big as other solutions proposed by academia. Actually, we measure that we are still faster than, say, Firefox or Opera. We measured the execution speed of various browsers under various benchmarks. We concluded that our patched version added 23% runtime overhead compared to the unpatched version.

xss-runtime

As for compatibility, we crawled the Alexa Top 10000 and observed how often our protection mechanism has stopped execution. Every blocked script would count towards the incompatibility, because we assume that our browser was not under attack when crawling. That methodology is certainly not perfect, because only shallowly crawling front pages does not actually indicate how broken the actual Web app is. To compensate, we used the WebKit rendering tests, hoping that they cover most of the important functionality. Our results indicate that scripts from 26 of the 10000 domains were blocked. Out of those, 18 were actually vulnerable against DOM-based XSS, so blocking their execution happened because a code fragment like the following is actually indistinguishable from a real attack. Unfortunately, those scripts are quite common 🙁 It’s being used mostly by ad distribution networks and is really dangerous. So using an AdBlocker is certainly an increase in security.


var location_parts = window.location.hash.substring(1).split(’|’);
var rand = location_parts[0];
var scriptsrc = decodeURIComponent(location_parts[1]);
document.write("<scr"+"ipt src=’" + scriptsrc + "’></scr"+"ipt>");

Modifying the WebKit for the Web parts and V8 for the JavaScript parts to be taint aware was certainly a challenge. I have neither seriously programmed C++ before nor looked much into compilers. So modifying Chromium, the big beast, was not an easy task for me. Besides those handicaps, there were technical challenges, too, which I didn’t think of when I started to work on a solution. For example, hash tables (or hash sets) with tainted strings as keys behave differently from untainted strings. At least they should. Except when they should not! They should not behave differently when it’s about querying for DOM elements. If you create a DOM element from a tainted string, you should be able to find it back with an untainted string. But when it comes to looking up a string in a cache, we certainly want to have the taint information preserved. I hence needed to inspect each and every hash table for their usage of tainted or untainted strings. I haven’t found them all as WebKit’s (extensive) Layout tests still showed some minor rendering differences. But it seems to work well enough.

As for the protection capabilities of our approach, we measured 100% protection against DOM-based XSS. That sounds impressive, right? Our measurements were two-fold. We used the already mentioned Layout Tests to include some more DOM-XSS test cases as well as real-life vulnerabilities. To find those, we used the reports the patched Chromium generated when crawling the Web as mentioned above to scan for compatibility problems, to automatically craft exploits. We then verified that the exploits do indeed work. With 757 of the top 10000 domains the number of exploitable domains was quite high. But that might not add more protection as the already existing built in mechanism, the XSS Auditor, might protect against those attacks already. So we ran the stock browser against the exploits and checked how many of those were successful. The XSS Auditor protected about 28% of the exploitable domains. Our taint tracking based solution, as already mentioned, protected against 100%. That number is not very surprising, because we used the very same codebase to find vulnerabilities. But we couldn’t do any better, because there is no source of DOM-based XSS vulnerabilities…

You could, however, trick the mechanism by using indirect flows. An example of such an indirect data flow is the following piece of code:


// Explicit flow: Taint propagates
var value1 = tainted_value === "shibboleth" ? tainted_value : "";
// Implicit flow: Taint does not propagate
var value2 = tainted_value === "shibboleth" ? "shibboleth" : "";

If you had such code, then we cannot protect against exploitation. At least not easily.

For future work in the Web context, the approach presented here can be made compatible with server-side taint tracking to persist taint information beyond the lifetime of a Web page. A server-side Web application could transmit taint information for the strings it sends so that the client could mark those strings as tainted. Following that idea it should be possible to defeat other types of XSS. Other areas of work are the representation of information about the data flows in order to help developers to secure their applications. We already receive a report in the form of structured information about the blocked code generation. If that information was enriched and presented in an appealing way, application developers could use that to understand why their application is vulnerable and when it is secure. In a similar vein, witness inputs need to be generated for a malicious data flow in order to assert that code is vulnerable. If these witness inputs were generated live while browsing a Web site, a developer could more easily assess the severity and address the issues arising from DOM-based XSS.

FOSDEM 2016

It the beginning of the year and, surprise, FOSDEM happened 🙂 This year I even managed to get to see some talks and to meet people! Still not as many as I would have liked, but I’m getting there…

Lenny talked about systemd and what is going to be added in the near future. Among many things, he made DNSSEC stand out. I not sure yet whether I like it or not. One the one hand, you might get more confidence in your DNS results. Although, as he said, the benefits are small as authentication of your bank happens on a different layer.

Giovanni talked about the importance of FOSS in the surveillance era. He began by mentioning that France declared the state of emergency after the Paris attacks. That, however, is not in line with democratic thinking, he said. It’s a tool from a few dozens of years ago, he said. With that emergency state, the government tries to weaken encryption and to ban any technology that may be used by so called terrorists. That may very well include black Seat cars like the ones used by the Paris attackers. But you cannot ban simple tools like that, he said. He said that we should make our tools much more accessible by using standard FLOSS licenses. He alluded to OpenSSL’s weird license being the culprit that caused Heartbleed not to have been found earlier. He also urged the audience to develop simpler and better tools. He complained about GnuPG being too cumbersome to use. I think the talk was a mixed bag of topics and got lost over the many topics at hand. Anyway, he concluded with an interesting interpretation of Franklin’s quote: If you sacrifice software freedom for security you deserve neither. I fully agree.

In a terrible Frenglish, Ludovic presented on Python’s async and await keywords. He said you must not confuse asynchronous and parallel execution. With asynchronous execution, all tasks are started but only one task finishes at a time. With parallel execution, however, tasks can also finish at the same time. I don’t know yet whether that description convinces me. Anyway, you should use async, he said, when dealing with sending or receiving data over a (mobile) network. Compared to (p)threads, you work cooperatively on the scheduling as opposed to preemptive scheduling (compare time.sleep vs. asyncio.sleep).

Aleksander was talking on the Tizen security model. I knew that they were using SMACK, but they also use a classic DAC system by simply separating users. Cynara is the new kid on the block. It is a userspace privilege checker. A service, like GPS, if accessed via some form of RPC, sends the credentials it received from the client to Cynara which then makes a decision as to whether access is allowed or not. So it seems to be an “inside out” broker. Instead of having something like a reference monitor which dispatches requests to a server only if you are allowed to, the server needs to check itself. He went on talking about how applications integrate with Cynara, like where to store files and how to label them. The credentials which are passed around are a SMACK label to identify the application. The user id which runs the application and privilege which represents the requested privilege. I suppose that the Cynara system only makes sense once you can safely identify an application which, I think, you can only do properly when you are using something like SMACK to assign label during installation.

Daniel was then talking about his USBGuard project. It’s basically a firewall for USB devices. I found that particularly interesting, because I have a history with USB security and I do know that random USB devices pose a problem. We are also working on integrating USB blocking capabilities with GNOME, so I was keen on meeting Daniel. He presented his program, what it does, and how to use it. I think it’s a good initiative and we should certainly continue exploring the realm of blocking USB devices. It’s unfortunate, though, that he has made some weird technological choices like using C++ or a weird IPC system. If it was using D-Bus then we could make use of it easily :-/ The talk was actually followed by Krzyzstof who I reported on last time, who built USB devices in software. As I always wanted to do that, I approached him and complained about my problems doing so 😉

Chris from wolfSSL explained how they do testing for their TLS implementation. wolfSSL is 10 years old and secures over 1 billion endpoints, he said. Most interestingly, they have interoperability testing with other TLS implementations. He said they want to be the most well tested TLS library available which I think is a very good goal! He was a very good speaker and I really enjoyed learning about their different testing strategies.

I didn’t really follow what Pam was talking about implicit trademark and patent licenses. But it seems to be an open question whether patents and trademarks are treated similarly when it comes to granting someone the right to use “the software”. But I didn’t really understand why it would be a question, because I haven’t heard about a case in which it was argued that the right on the name of the software had also been transferred. But then again, I am not a lawyer and I don’t want to become one…

Jeremiah referred on safety-critical FOSS. Safety critical, he said, was functional safety which means that your device must limp back home at a lower gear if anything goes wrong. He mentioned several standards like IEC 61508, ISO 26262, and others. Some of these standards define “Safety Integrity Levels” which define how likely risks are. Some GNU/Linux systems have gone through that certification process, he said. But I didn’t really understand what copylefted software has to do with it. The automotive industry seems to be an entirely different animal…

If you’ve missed this year’s FOSDEM, you may want to have a look at the recordings. It’s not VoCCC type quality like with the CCCongress, but still good. Also, you can look forward to next year’s FOSDEM! Brussels is nice, although they could improve the weather 😉 See you next year!

Talking on Searchable Encryption at 32C3 in Hamburg, Germany

This year again, I attended the Chaos Communication Congress. It’s a fabulous event. It has become much more popular than a couple of years ago. In fact, it’s so popular, that the tickets (probably ~12000, certainly over 9000) have been sold out a week or so after the sales opened. It’s gotten huge.

This year has been different than the years before. Not only were you able to use your educational leave for visiting the CCCongress, but I was also giving a talk. Together with my colleague Christian Forler, we presented on Searchable Encryption. We had the first slot on the last day. I think that’s pretty much the worst slot in the schedule you could get 😉 Not only because the people are in Zombie mode, but also because you have received all those viruses and bacteria yourself. I was lucky enough, but my colleague was indeed sick on day 4. It’s a common thing to be sick after the CCCongress :-/ Anyway, we have hopefully entertained the crowd with what I consider easy slides, compared to the usual™ Crypto talk. We used a lot imagery and tried to allude to funny stuff. I hope people enjoyed it. If you have seen it, don’t forget to leave feedback! It was hard to decide on the appropriate technical level for the almost 1800 people. The feedback we’ve received so far is mixed, so I guess we’ve hit a good spot. The CCCongress was amazingly organised for speakers. They really did care for us and made sure everything was right. So everything was perfect expect for pdfpc which crashed whenever it was meant to display a certain slide… I used Evince then and it worked…

The days at the CCCongress were intense as you might be able to tell from the Fahrplan. It generally started at about 12:00 and ended at about 01:00. And that’s only the talks. You can’t avoid bumping into VIP (very interesting people) and thus spend time in the hallway. And then you have these amazing parties. This year, they had motor-homes and lasers in the dance hall (last year it was a water cannon…). Very crazy atmosphere. It’s highly recommended to spend a night there.

Anyway, day 1 started for me with the Keynote by Fatuma Musa Afrah. The speaker stretched her time a little, I felt. At the beginning I couldn’t really grasp what her topic was or what she wanted to tell us. She repeatedly told us that we had to “kill the time together” which killed my sympathy to some extent. The conference’s motto was Gated Communities. She encouraged us to find ways to open these gates by educating people and helping them. She said that we have to respect each other irrespective of the skin colour or social status. It was only later that she revealed being refugee who came to Germany. Although she told us that it’s “Newcomers”, not “refugees”. In fact, she grew up in Kenya where she herself was a newcomer. She fled to Kenya, so she fled twice. She told us stories about her arriving and living in Germany. I presume she is breaking open the gates which separate the communities she’s living in, but that’s speculation. In a sense she was connecting her refugee community with our hacker community. So the keynote was interesting for that perspective.

Joanna Rutkowska then talked about trustworthy laptops. The basic idea is to have no state on the laptop itself, i.e. no place where malware could be injected. The state should instead be kept on a personal storage medium, like an SD card or a pen drive. She said that laptops are inherently not trustworthy. Trust, she said can be broken up into Trusted, Secure, and Trustworthy. Secure is resistant to attacks. Trusted is something we, as Security community, do not want to have, like a Trusted third party. Trustworthy, she said, is something different, like the Intel Management Engine which might be resistant to attacks, yet it is not acting in the interest of the user. Application level security is meaningless, she said, when we cannot trust the Operating System, because it is the trusted part. If it is compromised then every effort is not useful. Her project, Qubes OS, attempts to reduce the Trusted Computing Base. What is Operating system to the application, is the hardware to the Operating system. The hardware, she said, has been assumed to be trusted. A single malicious peripheral, like a malicious wifi module, can compromise the whole personal computer, the whole digital life, she said. She was referring to problems with Intel x86 platforms. Present Intel processors integrate everything on the main chip. The motherboard has been made more or less only a holder for the CPU and the memory. The construction of those big chips is completely opaque. We have no control over what is inside. And we cannot look inside, even if we wanted to. The firmware is being loaded during boot from a discrete element on the mainboard. We cannot, however, verify what firmware really is on the chip. One question is how to enforce read-only-ness of the system or how to upload your own firmware. For many years, she and others believed that TPM, TXT, or UEFI secure boot could solve that problem. But all of them have shown to fail horribly, she said. Unfortunately, she didn’t mention how so. So as of today, there is no such thing as a secure boot. Inside the processor is a management engine which special, because it is the perfect entry for backdooring and zombification of personal computing. By zombification, she means that the involvement of the Apps (vs. OS vs. Hardware) is decreasing heavily and make the hardware have much more of a say. She said that Intel wants to make the Hardware fully control your computing by having much more logic in the management engine. The ME is, in a sense, a gated community, because you cannot, whatsoever, inspect it, tinker with it, or otherwise get in touch. She said that the war is lost on X86. Even if we didn’t have the management engine. Again, she didn’t say why. Her proposal is to move all those moving firmware parts out to a trusted storage. It was an interesting perspective on what I think is a “simple” Free Software problem. Because we allow proprietary software, we now have the problem to see what is loaded into the hardware. With Free Software we’d still have backdoors in hardware, but assuming that most functionality is encoded in firmware, we could see and modify the existing firmware and build, run, and share our “better” firmware.

Ilja van Sprundel talked about Windows driver security or rather their attack surface. I’m not necessarily interested in Windows per se, but getting some lower level knowledge sounded intriguing. He gave a more high level overview of what to do and what to not do when doing driver development for Windows. The details, he said, matter. For example whether the IOManager probes a buffer in *that* instance. The Windows kernel is made of several managers, he said. The Windows Driver Model (WDM) is the standard model for how drivers are written. The IO Manager proxies requests from user to (WDM drivers. It may or may not validate arguments. Another central piece in the architecture are IO Request Packets (IRPs). They are being delivered from the IO Manager to the driver and contain all the necessary information for the operation in question. He went through the architecture really fast and it was hard for a kernel newbie like me to follow all the concepts he mentioned. Interestingly though, the IO Manager seems to also care about transferring the correct amount of memory from userspace to kernel space (e.g. makes sure data does not overflow) if you want it to using METHOD_BUFFERED. But, as he said, most of the drivers use METHOD_NEITHER which does not check anything at all and is the endless source of driver bugs. It seems as if KMDF is an alternative framework which makes it harder to have bugs. However, you seem to need to understand the old framework in order to use that new one properly. He then went on to talk about the actual attack surface of drivers. The bugs are either privilege escalation, denial of service, or information leak. He said that you could avoid the problem of integer overflow by using the intsafe library. But you have to use them properly! Most importantly, you need to check their return type and use the actual values you want to have been made safe. During creation of a device, a driver can call either IoCreateDeviceSecure with an SDDL string or use an INF file to ACL the device. That is, however, done either rarely or wrongly, he said. You need to think about who needs to have access to your device. When you work with the IOManager, you need to check whether Irp->MdlAddress is NULL which can happen, he said, if it’s a zero sized buffer. Similarly, when using the safer METHOD_BUFFERED mentioned earlier, Irp->AssociatedIrp.SystemBuffer can also be NULL. So avoid having that false sense of security when using that safe API. Another area of bugs is the cancellation of IRPs. The userland can cancel requests which apparently is not handled gracefully by drivers and leads to deadlocks, memory leaks, race conditions, double frees, and other classes of bugs. When dealing with data from userland, you are supposed to “probe” the memory which is basically checking whether the pointers are valid and in the expected range. If you don’t do that, it’ll lead to you writing to arbitrary kernel memory. If you do validate the data from userspace, make sure you don’t fetch it again from user space assuming that it hasn’t changed. There might be race between your check and your usage (TOCTOU). So better capture, validate, and use the data. The same applies when using MDLs. That, however, is more tricky, because you have a double mapping and you are using a kernel pointer. So it is very subtle. When you do memory allocation you can either use ExAllocatePool or ExAllocatePoolWithQuota. The latter throws an exception instead of returning NULL. Your exception or NULL pointer handling needs to be double checked, he said. It was a very technical talk on Windows which was way out of my comfort zone. I only understood a tiny fraction of what he was presenting. But I liked it for the new insight on Windows drivers and that the same old classes of bug have not died yet.

High up on my list of awaited talks was the talk on train systems by the SCADA strangelove people. Railways, he said, is the biggest system built by mankind. It’s main components are signals and switches. Old switches are operated manually by pure force. Modern switches are interlocked with signals such that the signals display forbidden entry when switches are set in certain positions. On tracks, he said, signals are transmitted over the actual track by supplying them with AC or DC. The locomotive picks up the signals and supplies various systems with them. The Eurostar, they said, has about seven security systems on board, among them a “RPS”, a Reactor Protection System which alludes to nuclear trains… They said that lately the “Bahn Automatisierungssystem (SIBAS)” has been updated to use much more modern and less proprietary soft- and hardware such as VxWorks and x86 with ELF binaries as well as XML over HTTP or SS7. In the threat model they identified, they see several attack vectors. Among them are making someone plug a malicious USB device in controlling machines in some operation center. He showed pictures from supposedly real operation centers. The physical security, he said, is terrible. With close to no access control. In one photograph, he showed a screenshot from a documentary aired on TV which showed credentials sticking on the screen… Even if the security is quite good, like good physical security and formally proven programs, it’s still humans who write the software, he said, so there will be bugs to be exploited. For example, he showed screenshots of when he typed “railway” into Shodan and the result included a good number of railway stations. Another vector is GSM-R. If you jam the train’s GSM-R connection, the train will simply stop. Also, you might be able to inject SIM toolkit malware. Via that vector, you might make the modem identify as, e.g. a keyboard and then penetrate further into the systems. Overall an entertaining talk, but the claims were a bit over the top. So no real train hacking just yet.

The talk on memory corruption by Mathias Payer started off by saying that software is unsafe and insecure. Low level languages trade type safety and memory safety for performance. A large set of legacy applications, he said, are prone to memory vulnerabilities. There are, he continued too many bugs to find and fix manually. So we need a runtime system to ensure security. An invalid dereference or an out of bounds pointer is the core of memory unsafety problems. But according to the C language, he claimed, it’s only a violation if the invalid pointer is read, written to, or freed. So during runtime, there are tons and tons of dangling pointers which is perfectly fine. With such a vulnerability a control-flow attack could be executed. Several defenses exist: Data Execution Prevention prevents code from actually being executed. Address Space Layout Randomisation scrambles the memory locations of executable code which makes it harder to successfully exploit a vulnerable. Stack canaries are special values which are supposed to detect overflowing writes. Safe exception handlers ensure that exception code paths follow predefined patterns. The DEP can only work together with ASLR, he said. If you broke ASLR, you could re-use existing code; as it turns out, people do break ASLR every now and then. Two new mechanisms are Stack Integrity and Code Flow Integrity. Stack Integrity enforces to return to the actual caller by having a shadow stack. He didn’t mention how that actually works, though. I suppose you obtain a more secret stack address somewhere and switch the stack pointer before returning to check whether the return address is still correct. Control Flow Integrity builds a control flow graph during compilation and for every control flow change it checks at run time whether the target address is allowed. Apparently, many CFI implementations exist (eleven were shown). He said they’ve measured those and IFCC and Lockdown performed rather badly. To show how all of the protection mechanisms fail, he presented printf-oriented programming. He said that printf was Turing complete and presented a domain specific language. They have built a brainfuck interpreter with snprintf calls. Another rather technical talk by a good speaker. I remember that I was already impressed last year when he presented on these new defense mechanisms.

DJB and Tanja Lange started their “late night show” by bashing TLS as a “gigantic clusterfuck”. They were presenting on quantum computing and cryptography. They started by mentioning that the D-Wave quantum computer exists, but it’s not useful, he said. It doesn’t do the basic things, and can only do limited computations. It can especially not perform Shor’s algorithm. So there’s no “Shor monster coming”. They recommended the Timeline of Quantum Computing as a good reference of the massive research effort going into quantum computing. If there was a general quantum computer pretty much every public key scheme deployed on the Internet today will be broken. But also symmetric schemes are under attack due to Grover’s algorithm which speeds up brute force algorithms significantly. The solution could be physical crypto like using strong (physical) locks. But, he said, the assumptions of those systems are already broken. While Quantum Key Distribution is secure under certain assumptions, those assumptions are off, he said. Secure schemes that survive the quantum era were the topic of their talk. The first workshop on that workshop happened in 2006 and efforts are still being made, e.g. with EU projects on the topic. The time it takes for a crypto scheme to gain significant traction has been long, so far. They gave ECC as an example. It has been introduced in the 1980s, but it’s only now that it’s taking over the deployed crypto on the Internet. So the time it takes is long. They gave recommendations on what to do to have connections that are secure “for at least the next hundred years”. These include at least 256 bit keys for symmetric encryption. McEliece with binary Goppa codes n=6960 k=5413 t=119. An efficient implementation of such a code based scheme is McBits, she said. Hash based signatures with, e.g. XMSS or SPHINCS-256. All you need for those is a proper hash function. The stuff they recommend for the next 100 years, like the McEliece system, are things from the distant past, she said. He said that Post Quantum Cryptography will be the standard in a couple of years from now so he urged the cryptographers in the audience to “get used to this stuff”.

Next on my list was Markus’ talk on Landesverrat which is the incident of Netzpolitik.org being investigated for revealing secret documents. He referred on the history of the case, how it came around that they were suspected of revealing secret documents. He said that one of their believes is to publish their sources, even the secret ones. They want their work to be critically reviewed and they believe that it is only possible if the readers can inspect the sources. The documents which lead to the criminal investigations were about finances of the introduction of the XKeyscore software. Then, the president of the “state security” filed a case against because of revealing secret documents. They wanted to publish the investigation files, but they couldn’t see them, because they were considered to be more secret than the documents they have already published… From now on, he said, they are prepared for the police raiding their offices, which I suppose is good standard preparation. They were lucky, he said, because their case fell into the regular summer low of news which could make the case become quite popular in the media. A few weeks earlier or later and they were much less popular due to the refugees or Greece. During the press coverage, they had a second battleground where they threw out a Russian television team who entered their offices without having called or otherwise introduced themselves… For the future, he wants to see changes in what is considered to be a state secret. He doesn’t want the government to decide what such a secret is. He also wants to have much more protection for whistle blowers. Freedom of press should also hold for people who do not blog for their “occupation”, but also hobbyists.

Vincent Haupert was then talking on App-based TAN online banking methods. It’s a classic two factor method: Not only username and password, but also a TAN. These TAN methods have since evolved in various ways. He went on to explain the general online banking process: You log in with your credentials, you create a new wire transfer and are then asked to provide a TAN. While ChipTAN would solve many problems, he said, the banking industry seems to want their customers to be able to transfer money everywhere™. So you get to have two “Apps” on your mobile computer. The banking app and a TAN app. However, malware in “official” app stores are a reality, he said. The Google Playstore cannot protect against malware, as a colleague of him demonstrated during his bachelor thesis. This could also been by the “Brain Test” app which roots your device and then loads malware. Anyway, they hijacked the connection from the banking app to modify the recipient of the issued wire transfer and the TAN being pushed on the device. They looked at the apps and found that they “protected” their app with “Promon Shield“. That seems to be a strong obfuscation framework. Their attack involved tricking the root and hooks detection. For the root detection they check on the file system for certain binaries. He could simply change the filenames and was good to go. For the hooks (Xposed) it was pretty much the same with the exception of a few filenames which needed more work. With these modifications they could also “hack” the newer version 1.0.7. Essentially the biggest part of the problem is that the two factors are on one device. If the attacker hijacks that one device then ,

The talk by Christian Schaffner on Quantum Cryptography was introducing the audience to quantum mechanics. He said that a qubit can be imagined as the direction of a polarised photon. If you make the direction of the photons either horizontal or vertical, you can imagine that as representing 0 or 1. He was showing an actual physical experiment with a laser pointer and polarisation filters showing how the red dot of the laser pointer is either filtered or very visible. He also showed how actually measuring the polarisation changes the state of the photons! So yet another filter made the point in the back brighter. That was a bit weird, but that’s quantum mechanics. He showed a quantum random number generator based on that technology. One important concept is the no-cloning theorem which state that you can make a perfect copy of a quantum bit. He also compared current and “post quantum” crypto systems against efficient classical attackers and efficient quantum attackers. AES, SHA, RSA (or discrete logs) will be broken by quantum attacks. Hash-based signatures, McEliece, and lattice-based cryptography he considered to be resistant against quantum based attacks. He also mentioned that Quantum Key Distribution systems will also be against an exhaustive attacker who applies brute force. QKD is based on the no-cloning theorem so an eavesdropper cannot see the same bits as the communicating parties do. Finally, he asked how you could prove that you have been at a certain location to avoid the pizza delivery problem (i.e. to be certain about the place of delivery).

Fefe was talking on privileges. He said that software will be vulnerable. Various techniques should be applied such as simply fixing all the bugs (haha…) or make exploitation harder by applying ASLR or ROP protection. Another idea is to put the program in a straight jacket and withdraw privileges. That sounds a lot like containerisation. Firstly, you can drop your privileges from superuser down to the least privileges you need, then do privilege separation. Another technique is the admin confining the app in a jail instead of the app confining itself. Also, you can implement access control via a broker service by splitting up your process into, say, a left half which opens and reads files and a right half which processes data. When doing privilege separation, the idea is to split up the process into several separately running programs. Jailing is like firewall rules for syscalls which, he said, is impossible for complex programs. He gave Firefox as an example of it being impossible to write a rule set for. The app containing itself is like a werewolf chaining itself to the wall before midnight, he said. You restrict yourself from opening files, creating socket, or from attaching yourself as a debugger to other processes. The broker service is probably like a reference monitor. He went on showing how old-school privilege dropping works. You could do it as easily as seteuid(getuid()), but that’s not enough, because there is the saved UID, so you need to setresuid and not forget to check the return code. Because the call can fail if, for example, the target UID had already been running too many processes for its quota. He said that you should fail the build if your target platform does not provide setresuid. However, dropping privileges is more than setting your UID. It’s also about freeing resources you don’t necessarily need. Common approaches to jailing your process are to have a fake filesystem with only the necessary files, so your process cannot ever access anything that it shouldn’t. On Linux, that would probably be chroot. However, you can escape using fchdir . Also, mounting your /proc into the chroot, information about the host is exposed. So you need to do more work than calling chroot. The BSDs, he said, have Securelevel which is a kernel mode that only increases which withdraws certain privileges. They also have jails which is a chroot on steroids, he said. It leaks some information due the PIDs, though, he said.

The next talk was on Shellphish, an automatic exploitation framework. This is really fascinating stuff. It’s been used for various Capture the Flag contests which are basically about hacking other teams’ software services. In fact, the presenters were coming from the UCSB which is hosting the famous iCtF. They went from solving security challenges to developing a system which solves security challenges. From a vulnerability binary, they (automatically) develop an exploit and a patched binary which is immune to the exploit, but preserves the functionality of the program. They automatically find vulnerabilities, patches, and test both the exploits and the patches. For the automated vulnerability component, they presented Angr. It has a symbolic execution engine looking for memory accesses outside allocated regions and unconstrained instruction pointer which is a jump controlled by user input (JMP eax). They have written a paper for NDSS about “Augmenting Fuzzing Through Selective Symbolic Execution“. Angr is a Python library and they showed how to use it for identifying the overhyped Back to 28 vulnerability. Actually, there is too much state for a regular symbolic executor to find this problem. Angr does “veritesting“. He showed that his Angr script found the vulnerability by him having excluded many paths of execution that don’t really generate new state with a few lines of code. He didn’t show though what the lines of code were and how he determined how the states are not adding any new information.

The next talk was given by the people behind Intelexit was about convincing NSA agents to stop their work and serve democracy instead. They rented a van with big mottoes printed on them, like “Listen to your heart, not to private phone calls”. They also stuck the constitution on the “constitution protection office” which then got torn apart. Another action they did was to fly over the dagger complex and to release flyers about leaving the secret services. They want to have a foundation helping secret service agents to leave their job or to blow the whistle. They also want an anonymous call service where agents can call to talk about their job. I recommend browsing their photos.

Another artsy talk was on a cheap Facebook army. Actually it was on Instagram followers. The presenter is an artist himself and he’d buy Instagram followers for fellow artists “to make them all equal”. He dislikes the fact that society seems to measure the value or quality of art in followers or likes on social media.

Around the CCCongress were also other artsy installations like this one called “machine learning”:

It’s been a fabulous event. I really admire the people organising this event each and every year. Thank you so much and see you next year, hopefully.

Using NetworkManager to export your WiFi settings as a barcode

With my new phone, I needed to migrate all the WiFi settings. For some reason, it seems to be hard to export WiFi configuration from Android and import it in another. The same holds true for GNOME, I guess.

The only way of getting WiFi configuration into your Android phone (when not being able to write the wpa_supplicant file) seems to be barcodes! When using the barcode reader application, you can scan a code in a certain format and the application would then create a wifi configuration for you.

I quickly cooked up something that allows me to “export” my laptop’s NetworkManager WiFis via a QR code. You can run create_barcode_from_wifi.py and it creates a barcode of your currently active configuration, if any. You will also see a list of known configurations which you can then select via the index. The excellent examples in the NetworkManager’s git repository helped me to get my things done quickly. There’s really good stuff in there.

I found out that I needed to explicitely render the QR code black on white, otherwise the scanning app wouldn’t work nicely. Also, I needed to make the terminal’s font smaller or go into fullscreen with F11 in order for the barcode to be printed fully on my screen. If you have a smaller screen than, say, 1360×768, I guess you will have a problem using that. In that case, you can simply let PyQRCode render a PNG, EPS, or SVG. Funnily enough, I found it extremely hard to print either of those formats on an A4 sheet. The generated EPS looks empty:

Printing that anyway through Evince makes either CUPS or my printer die. Converting with ImageMagick, using convert /tmp/barcode.eps -resize 1240x1753 -extent 1240x1753 -gravity center -units PixelsPerInch -density 150x150 /tmp/barcode.eps.pdf
makes everything very blurry.

Using the PNG version with Eye of GNOME does not allow to scale the image up to my desired size, although I do want to print the code as big as possible on my A4 sheet:

Now you could argue that, well, just render your PNG bigger. But I can’t. It seems to be a limitation of the PyQRCode library. But there is the SVG, right? Turns out, that eog still doesn’t allow me to print the image any bigger. Needless to say that I didn’t have inkscape installed to make it work… So I went ahead and used LaTeX instead

Anyway, you can get the code on github and gitlab. I guess it might make sense to push it down to NetworkManager, but as I am more productive in writing Python, I went ahead with it without thinking much about proper integration.

After being able to produce Android compatible WiFi QR codes, I also wanted to be able to scan those with my GNOME Laptop to not having to enter passwords manually. The ingredients for a solution to this problem is parsing the string encoded as a barcode and creating a connection via the excellent NetworkManager API. Creating the connection is comparatively easy, given that an example already exists. Parsing the string, however, is a bit more complex than I initially thought. The grammar of that WiFi encoding language is a bit insane in the sense that it allows multiple encodings for the same thing and that it is not clear to encode (or decode) certain networks. For example, imagine your password is 12345678. The encoding format now wants to know whether that is ASCII characters or the hex encoded passphrase (i.e. the hex encoded bytes 0x12,0x34,0x56,0x78). In the former case, the encoded passphrase must be quoted with double quotes, e.g. P:"12345678";. Fair enough. Now, let’s imagine the password is "12345678" (yes, with the quotes). Then you need to hex encode that ASCII string to P:22313233343536373822. But, as it turns out, that’s not what people have done, so I have seen quite a few weird QR codes for Wifis out there 🙁

Long story short, the scan_wifi_code.py program should also scan your barcode and create a new WiFi connection for you.

Do you have any other ideas how to migrate wifi settings from one device to another?

On GNOME Governance

On 2015-10-04 it was announced that the governing body of the GNOME Foundation, the Board, has a vacant seat. That body was elected about 15 weeks earlier. The elections are very democratic, they use an STV system to make as many votes as possible count. So far, no replacement has been officially announced. The question of what strategy to use in order to find the replacement has been left unanswered. Let me summarise the facts and comment on the strategy I wish the GNOME project to follow.

The STV system used can be a bit hard to comprehend, at first, so let me show you the effects of an STV system based on the last GNOME elections. With STV systems, the electorate can vote for more than one candidate. An algorithm then determines how to split up the votes, if necessary. Let’s have a look at the last election’s first votes:

elections-initial-votes

We see the initial votes, that is, the number of ballots in which a candidate was chosen first. If a candidate gets eliminated, either because the number of votes is sufficient to get elected or because the candidate has the least votes and cannot be elected anymore, the vote of the ballot is being transferred onto the next candidate.

In the chart we see that the electorate chose to place 19 or more votes onto two candidates who got directly elected. Overall, six candidates have received 13 or more votes. Other candidates have at least 30% less votes than that. If we had a “simple” voting mechanism, the result would be the seven candidates with the most votes. I would have been one of them.

But that’s not how our voting system works, because, as we can see below, the picture of accumulated votes looks differently just before eliminating the last candidate (i.e. me):

elections-final-votes

If you compare the top seven now, you observe that one candidate received votes from other candidates who got eliminated and managed to get in.

We can also see from the result that the final seat was given to 17.12 votes and that the first runner-up had 16.44 votes. So this is quite close. The second runner-up received 10.39 votes, or 63% of the votes of the first runner-up (so the first runner-up received 158% of the votes of the second runner-up).

We can also visually identify this effect by observing a group of eight which accumulated the lion’s share of the votes. It is followed by a group of five.

Now one out of the seven elected candidates needed to drop out, creating a vacancy. The Foundation has a set of rules, the bylaws, which regulate vacancies. They are pretty much geared towards maintaining an operational state even with a few directors left and do not mandate any particular behaviour, especially not to follow the latest election results.

But.

Of course this is not about what is legally possible, because that’s the baseline, the bare minimum we expect to see. The GNOME Foundation’s Board is one of the few democratically elected bodies. It is a well respected entity in industry as well as other Free Software communities. I want it to stay that way. Most things in Free Software are not very democratic; and that’s perfectly fine. But we chose to have a very democratic system around the governing body and I think that it would leave a bad taste if the GNOME Foundation chooses to not follow these rather young election results. I believe that its reputation can be damaged if the impression of forming a cabal, of not listening to its own membership, prevails. I interpret the results as a strong statement of its membership for those eight candidates. GNOME already has to struggle with accusations of it not listening to its users. I’d rather want to get rid of it, not fueling it by extending it to its electorate. We’re in the process of acquiring sponsors for our events and I don’t think it’s received well if the body ignores its own processes. I’d also rather value the membership more rather than producing arguments for those members who chose to not vote at all and try to increase the number of members who actually vote.

LinuxCon Europe 2015 in Dublin

sponsor

The second day was opened by Leigh Honeywell and she was talking about how to secure an Open Future. An interesting case study, she said, was Heartbleed. Researchers found that vulnerability and went through the appropriate vulnerability disclosure channels, but the information leaked although there was an embargo in place. In fact, the bug proofed to be exploited for a couple of months already. Microsoft, her former employer, had about ten years of a head start in developing a secure development life-cycle. The trick is, she said, to have plans in place in case of security vulnerabilities. You throw half of your plan away, anyway, but it’s good to have that practice of knowing who to talk to and all. She gave a few recommendations of which she thinks will enable us to write secure code. Coders should review, learn, and speak up if they feel uncomfortable with a piece of code. Managers could take up on what she called “smells” when people tend to be fearful about their code. Of course, MicroSoft’s SDL also contains many good practices. Her minimal set of practices is to have a self-assessment in place to determine if something needs security review, have an up-front threat modelling that is kept up to date as things evolve, have a security checklist like Mozilla’s or OWASP’s, and have security analysis built into CI process.

Honeywell

The container panel was led by Jeo Zonker Brockmeier who started the discussion by stating that we’ve passed the cloud hype and containers are all the rage now. The first question he shot at the panellists was whether containers were ready at all to be used for production. The panellists were, of course, all in agreement that they are, although the road ahead is still a bit bumpy. One issue, they identified, was image distribution. There are, apparently, two types of containers. Application containers and System containers. Containers used to be a lightweight VM with a full Linux system. Application Containers, on the other hand, only run your database instance. They see application containers as replacing Apps in the future. Other services like databases are thus not necessarily the task of Application containers. One of the panellists was embracing dockerhub as a similar means to RPM or .deb packages for distributing software, but, he said, we need to solve the problem of signing and trusting. He was comparing the trust issue with packages he had installed on his laptop. When he installed a package, he didn’t check what was inside the packages his OS downloaded. Well, I guess he missed that people put trust in the distribution instead of random people on the Internet who put up an image for everybody to download. Anyway, he wanted Docker to be a form of trusted entity like Google or Apple are for their app stores which are distributing applications. I don’t know how they could have missed the dependency resolution and the problem of updating lower level libraries, maybe that problem has been solved already…

Container Panel

Intel’s Mark was talking on how Open Source was fuelling the Internet of Things. He said that trust was an essential aspect of devices that have access to personal or sensitive data like access to your house. He sees the potential in IoT around vaccines which is a connection I didn’t think of. But it makes somewhat sense. He explained that vaccines are quite sensitive to temperature. In developing countries, up to 30% of the vaccines spoil, he said, and what’s worse is that you can’t tell whether the vaccine is good. The IoT could provide sensors on vaccines which can monitor the conditions. In general, he sees the integration of diverse functionality and capabilities of IoT devices will need new development efforts. He didn’t mention what those would be, though. Another big issue, he said, was the updatetability, he said. Even with smaller devices, updates must not be neglected. Also, the ability of these devices to communicate is a crucial component, too, he said. It must not be that two different light bulbs cannot talk to their controller. That sounds like this rant.

IoT opps

Next, Bradley talked about GPL compliance. He mentioned the ThinkPinguin products as a pristine example for a good GPL compliant “complete corresponding source”. He pointed the audience to the Compliance.guide. He said that it’s best to avoid the offer for source. It’s better to include the source with the product, he said, because the offer itself creates ongoing obligations. For example, your call centre needs to handle those requests for the next three years which you are probably not set up to do. Also, products have a typically short lifespan. CCS requires good instructions how to build. It’s not only automated build tools (think configure, make, make install). You should rather think of a script as a movie or play script. The test to use on your potential CCS is to give your source release to another developer of some other department and try whether that person can build the code with your instructions. Anyway, make install does usually not work on embedded anyway, because you need to flash the code. So make sure to include instructions as to how to get the software on the device. It’s usually not required to ship the tool-chain as long as you give instructions as to what compiler to use (and how it was configured). If you do include a compiler, you might end up having more obligations because GCC, for example, is itself GPL licensed. An interesting question came up regarding specialised hardware needed to build or flash the software. You do not need to include anything “tool-chain-like” as long as you have instructions as to the requirements what the user needs to obtain.

Bradley

Samsung’s Krzysztof was talking about USB in Linux. He said, it is the most common external interface in the world. It’s like the Internet in the sense that it provides services in a client-server architecture. USB also provides services. After he explained what the USB actually is how the host interacts with devices, he went on to explain the plug and play aspect of USB. While he provided some rather low-level details of the protocol, it was a rather high level in the sense that it was still the very basic USB protocol. He didn’t talk too much on how exactly the driver is being selected, for example. He went on to explain the BadUSB attack. He said that the vulnerability basically results from the lack of user interaction when plugging in a device and loading its driver. One of his suggestions were to not connect “unknown devices”, which is hard because you actually don’t know what “services” the device is implementing. He also suggested to limit the number of input sources to X11. Most importantly, though, he said that we’d better be using device authorisation to explicitly allow devices before activating them. That’s good news, because we are working on it! There are, he said, patches available for allowing certain interfaces, instead of the whole device, but they haven’t been merged yet.

USB

Jeff was talking about applying Open Source Principles to hardware. He began by pointing out how many processors you don’t get to see, for example in your hard disk, your touchpad controller, or the display controller. These processors potentially exfiltrate information but you don’t really know what they do. Actually, these processors are about owning the owner, the consumer, to then sell them stuff based on that exfiltrated big data, rather than to serve the owner, he said. He’s got a project running to build devices that you not only own, but control. He mentioned IoT as a new battleground where OpenHardware could make an interesting contestant. FPGAs are lego for hardware which can be used easily to build your functionality in hardware, he said. He mentioned that the SuperH patents have now expired. I think he wants to build the “J-Core CPU” in software such that you can use those for your computations. He also mentioned that open hardware can now be what Linux has been to the industry, a default toolkit for your computations. Let’s see where his efforts will lead us. It would certainly be a nice thing to have our hardware based on publicly reviewed designs.

Open Hardware

The next keynote was reserved for David Mohally from Huawei. He said he has a lab in which they investigate what customers will be doing in five to ten years. He thinks that the area of network slicing will be key, because different businesses needs require different network service levels. Think your temperature sensor which has small amounts of data in a bursty fashion while your HD video drone has rather high volume and probably requires low latency. As far as I understood, they are having network slices with smart meters in a very large deployment. He never mentioned what a network slice actually is, though. The management of the slices shall be opened up to the application layer on top for third parties to implement their managing. The landscape, he said, is changing dramatically from what he called legacy closed source applications to open source. Let’s hope he’s right.

Huawei

It was announced that the next LinuxCon will happen in Berlin, Germany. So again in Germany. Let’s hope it’ll be an event as nice as this one.

Intel Booth

HP Booth

LinuxCon Europe – Day 1

attendee registration

The conference was opened by the LinuxFoundation’s Executive Jim Zemlin. He thanked the FSF for their 30 years of work. I was a little surprised to hear that, given the differences between OpenSource and Free Software. He continued by mentioning the 5 Billion Dollar report which calculates how much “value” the projects hosted at Linux Foundation have generated over the last five years. He said that a typical product contains 80%, 90%, or even more Free and Open Source Software. He also extended the list of projects by the Real Time Collaborative project which, as far as I understood, effectively means to hire Thomas Gleisxner to work on the Real Time Linux patches.

world without Linux

The next, very interesting, presentation was given by Sean Gourley, the founder of Quid, a business intelligence analytics company. He talked about the limits of human cognition and how algorithms help to exploit these limits. The limit is the speed of your thinking. He mentioned that studies measured the blood flow across the brain when making decisions which found differences depending on how proficient you are at a given task. They also found that you cannot be quicker than a certain limit, say, 650ms. He continued that the global financial market is dominated by algorithms and that a fibre cable from New York to London costs 300 million dollars to save 5 milliseconds. He then said that these algorithms make decisions at a speed we are unable to catch up with. In fact, the flash crash of 2:45 is inexplicable until today. Nobody knows what happened that caused a loss of trillions of dollars. Another example he gave was the crash of Knight Capital which caused a loss of 440 million dollars in 45 minutes only because they updated their trading algorithms. So algorithms are indeed controlling our lives which he underlined by saying that 61% of the traffic on the Internet is not generated by humans. He suggested that Bots would not only control the financial markets, but also news reading and even the writing of news. As an example he showed a Google patent for auto generating social status updates and how Mexican and Chinese propaganda bots would have higher volume tweets than humans. So the responsibilities are shifting and we’d be either working with an algorithm or for one. Quite interesting thought indeed.

man vs. machine

Next up was IBM on Transforming for the Digital Economy with Open Technology which was essentially a gigantic sales pitch for their new Power architecture. The most interesting bit of that presentation was that “IBM is committed to open”. This, she said, is visible through IBM’s portfolio and through its initiatives like the IBM Academic Initiative. OpenPower Foundation is another one of those. It takes the open development model of software and takes it further to everything related to the Power architecture (e.g. chip design), she said. They are so serious about being open, that they even trademarked “Open by Design“…

IBM sales pitch

Then, the drone code people presented on their drone project. They said that they’ve come a long way since 2008 and that the next years are going to fundamentally change the drone scene as many companies are involved now. Their project, DroneCode, is a stack from open hardware to flight control and the next bigger thing will be CAN support, which is already used in cards, planes, and other vehicles. The talk then moved to ROS, the robot operating system. It is the lingua franca for robotic in academia.

Drones

Matthew Garret talked on securing containers. He mentioned seccomp and what type of features you can deprive processes of. Nowadays, you can also reason about the arguments for the system call in question, so it might be more useful to people. Although, he said, writing a good seccomp policy is hard. So another mechanism to deprive processes of privileges is to set capabilities. It allows you to limit the privileges in a more coarse grained way and the behaviour is not very well defined. The combination of capabilities and seccomp might have surprising results. For example, you might be allowing the mknod() call, but you then don’t have the capability to actually execute it or vice versa. SELinux was next on his list as a mechanism to secure your containers. He said that writing SELinux policy is not the most fun thing in the world. Another option was to run your container in a virtual machine, but you then lose some benefits such as introspection of fine grained control over the processes. But you get the advantages of more isolation. Eventually, he asked the question of when to use what technology. The performance overhead of seccomp, SELinux, and capabilities are basically negligible, he said. Fully virtualising is usually more secure, he said, but the problem is that you have more complex infrastructure which tend to attract bugs. He also mentioned GRSecurity as a means of protecting your Linux kernel. Let’s hope it’ll be merged some day.

Containers

Canonical’s Daniel Watkins then talked on cloud-init. He said it runs in three stages. Init, config, and final in which init sets up networking, config does the actual configuration of your services, final is for the things that eventually need to be done. The clound-init architecture is apparently quite flexible and versatile. You can load your own configuration and user-data modules so that you can set up your cloud images as you like. cloud-init allows you get rid of custom images such that you can have confidence in your base image working as intended. In fact, it’s working not only with BSDs but also with Windows images. He said, it is somewhat similar to tools like Ansible, so if you are already happily using one of those, you’re good.

cloud-init

An entertaining talk was given by Florian Haas on LXC and containers. He talked about tricks managing your application containers and showed a problem when using a naive chroot which is that you get to see the host processes and networking information through the proc filesystem. With LXC, that problem is dealt with, he said. But then you have a problem when you update the host, i.e. you have to take down the container while the upgrade is running. With two nodes, he said, you can build a replication setup which takes care of failing over the node while it is upgrading. He argued that this is interesting for security reasons, because you can upgrade your software to not be vulnerable against “the latest SSL hack” without losing uptime. Or much of it, at least… But you’d need twice the infrastructure to run production. The future, he said, might be systemd with it’s nspawn tool. If you use systemd all the way, then you can use fleet to manage the instances. I didn’t take much away, personally, but I guess managing containers is all the rage right now.

LXC

Next up was Michael Hausenblas on Filesystems, SQL and NoSQL with Apache Mesos. I had briefly heard of Mesos, but I really didn’t know what it was. Not that I’m an expert now, but I guess I know that it’s a scheduler you can use for your infrastructure. Especially your Apache stack. Mesos addresses the problem of allocating resources to jobs. Imagine you have several different jobs to execute, e.g. a Web server, a caching layer, and some number crunching computation framework. Now suppose you want to increase the number crunching after hours when the Web traffic wears off. Then you can tell Mesos what type of resources you have and when you need that. Mesos would then go off and manage your machines. The alternative, he said, was to manually SSH into the machines and reprovision them. He explained some existing and upcoming features of Mesos. So again, a talk about managing containers, machines, or infrastructure in general.

Mesos

The following Kernel panel didn’t provide much information to me. The moderation felt a bit stiff and the discussions weren’t really enganged. The topics mainly circled around maintainership, growth, and community.

Kernel Panel

SuSE’s Ralf was then talking on DevOps. He described his DevOps needs based on a cycle of planning, coding, building, testing, releasing, deploying, operating, monitoring, and then back to planning. When bringing together multiple projects, he said, they need to bring two independent integration loops together. When doing DevOps with a customer, he mentioned some companies who themselves provide services to their customers. In order to be successful when doing DevOps, you need, he said, Smart tools, Process automation, Open APIs, freedom of choice, and quality control are necessary. So I guess he was pitching for people to use “standards”, whatever that exactly means.

SuSE DevOps

I awaited the next talk on Patents and patent non aggression. Keith Bergelt, from OIN talked about ten years of the Open Invention Network. He said that ten years ago Microsoft sued Linux companies to hinder Linux distribution. Their network was founded to embrace patent non-aggression in the community. A snarky question would have been why it would not be simply enough to use GPLv3, but no questions were admitted. He said that the OIN has about 1750 licensees now with over a million patents being shared. That’s actually quite impressive and I hope that small companies are being protected from patent threats of big players…

OIN

That concluded the first day. It was a lot of talks and talking in the hallway. Video recordings are said to be made available in a couple of weeks. So keep watching the conference page.

Sponsors

IBM Booth

mrmcd 2015

I attended this year’s mrmcd, a cozy conference in Darmstadt, Germany. As in the previous years, it’s a 350 people event with a relaxed atmosphere. I really enjoy going to these mid-size events with a decent selection of talks and attentive guests.

The conference was opened by Paolo Ferri’s Keynote. He is from the ESA and gave a very entertaining talk about the Rosetta mission. He mentioned the challenges involved in launching a missile for a mission to be executed ten years later. It was very interesting to see what they have achieved over a few hundred kilometers distance. Now I want to become a space pilot, too 😉

The next talk was on those tracking devices for your fitness. Turns out, that these tracking devices may actually track you and that they hence pose a risk for your privacy. Apparently fraud is another issue for insurance companies in the US, because some allow you to get better rates when you upload your fitness status. That makes those fitness trackers an interesting target for both people wanting to manipulate their walking statistics to get a better premium for health care and attackers who want to harm someone by changing their statistics.

Concretely, he presented, these devices run with Bluetooth 4 (Smart) which allows anyone to see the device. In addition, service discovery is also turned on which allows anyone to query the device. Usually, he said, no pin is needed anymore to connect to the device. He actually tested several devices with regard to several aspects, such as authentication, what data is stored, what is sent to the Internet and what security mechanisms the apps (for a phone) have been deployed. Among the tested devices were the XiaomMi Miband, the Fitbit, or the Huawei TalkBand B1. The MiBand was setting a good example by disabling discovery once someone has connected to the device. It also saves the MAC address of the phone and ignores others. In order to investigate the data sent between a phone and a band, they disassembled the Android applications.

Muzy was telling a fairytale about a big data lake gone bad.
He said that data lakes are a storage for not necessarily structured data which allow extraction of certain features in an on-demand fashion and that the processed data will then eventually end up in a data warehouse in a much more structured fashion. According to him, data scientists then have unlimited access to that data. That poses a problem and in order to secure the data, he proposed to introduce another layer of authorization to determine whether data scientists are allowed to access certain records. That is a bit different from what exists today: Encrypt data at rest and encrypt in motion. He claimed that current approaches do not solve actual problems, because of, e.g. key management questions. However, user rights management and user authorization are currently emerging, he said.

Later, he referred on Apache Spark. With big data, he said, you need to adapt to a new programming paradigm away from a single worker to multiple nodes, split up work, handling errors and slow tasks. Map reduce, he said, is one programming model. A popular framework for writing in a such a paradigm is Apache’s Hadoop, but there are more. He presented Apache Spark. But it only begins to make sense if you want to analyse more data than you can fit in your RAM, he said. Spark distributes data for you and executes operations on it in a parallel manner, so you don’t need to care about all of that. However, not all applications are a nice fit for Spark, he mentioned. He gave high performance weather computations as such as example. In general, Spark fits well if IPC not required.

The conference then continued with two very interesting talks on Bahn APIs. derf presented on public transport APIs like EFA, HAFAS, and IRIS. These APIs can do things like routing from A to B or answer questions such as which trains are running from a given station. However, these APIs are hardly documented. The IRIS-system is the internal Bahn-API which is probably not supposed to be publicly available, but there is a Web page which exposes (bits) of the API. Others have used that to build similar, even more fancy things. Anyway, he used these APIs to query for trains running late. The results were insightful and entertaining, but have not been released to the general public. However, the speakers presented a way to query all trains in Germany. Long story short: They use the Zugradar which also contains the geo coordinates. They acquired 160 millions datasets over the last year which is represented in 80GB of JSON. They have made their database available as ElasticSearch and Kibana interface. The code it at Github. That is really really good stuff. I’m already in the process of building an ElasticSearch and Spark cluster to munch on that data.

Yours truly also had a talk. I was speaking on GNOME Keysign. Because the CCC people know how to run a great conference, we already have recordings (torrent). You get the slides here. Those of you who know me don’t find the content surprising. To all others: GNOME Keysign is a tool for signing OpenPGP Keys. New features include the capability to sign keys offline, that is, you present a file with a key and you have it signed following best practices.

Another talk I had, this time with a colleague of mine, was on Searchable Encryption. Again, the Video already exists. The slides are probably less funny than they were during the presentation, but hopefully still informative enough to make some sense out of them. Together we mentioned various existing cryptographic schemes which allow you to have a third party execute search operations on your encrypted data on your behalf. The most interesting schemes we showed were Song, Wagner, Perrig and Cash et al..

Thanks again to the organisers for this nice event! I’m looking forward to coming back next year.

GUADEC 2015 in Gothenburg, Sweden

This summer, GUADEC, the GNOME Users and Developers Conference took place in Gothenburg, Sweden. It’s a lovely city, especially in summer, with nice people, excellent beers, and good infrastructure. Fun fact: Unisex toilet seem to be very popular in Gothenburg. The conference was hosted in sort of a convention centre and was well equipped to serve our needs. I guess we’ve been around 150 people to come together in order to discuss and celebrate our favourite Free Software project: GNOME.

One of the remarkable talks I attended was given by Matthias Kirschner from the FSFE presented on software freedom and how is concerned about the computer as a general purpose machine. So his talk was title “The computer as a Universal Machine”. He was afraid that the computing machines we are using become more and more special purpose devices rather than a general purpose machine. He gave examples of how he thinks that has happened, like corporations hiding the source code or otherwise limit access to change the behaviour of the computing machines we are using. Other examples were media with Digital Restrictions Management. Essentially it is about removing features instead of widening the functionality. As such, SIM locks also served an example. With SIM locks, you cannot change your SIM card when, say, you are on holidays. More examples he gave were the region code of DVDs or copy restrictions on CD-ROMs. He was also referring to the Sony CD story from a couple of years ago when they infected buyers of their CD-ROMs or the Amazon fiasco where they deleted books on their reader devices. Essentially, these companies are trying to put the user into the back-seat when it comes to take control over your devices.

While protecting the owner of the computer sounds useful in a few scenarios, like with ATMs, it can be used against the owner easily, if the owner cannot exercise control over what the machine considers trusted. A way to counter this, he said, is to first simply not accept the fact that someone else is trying to limit the amount of control you can exercise over your machines. Another thing to do, according to him, is to ask for Free Software when you go shopping, like asking for computers with a pre-installed GNU/Linux system. I liked most parts of the talk, especially because of the focus on Free Software. Although I also think that for most parts he was preaching to the choir. But I still think that it’s important to remind ourselves of our Free Software mission.

Impressively enough, you can already watch most of the Videos! It’s quite amazing that they have already been cut and post-process so that we can watch all the things that we missed. I am especially looking forward to Christian’s talk on Builder and the Design session.

I really like going to GUADEC, because it is so much easier and more pleasant to communicate with people in-person rather than on low bandwidth channels such as IRC or eMail. I could connect my students with all these smart people who know much more about the GNOME stack than I do. And I was able to ask so many things I hadn’t understood. Let’s hope there will be GUADEC next year! If you are interested in hosting next year’s edition, you should consider submitting a bid!

On my travel back I realised that the Frankfurt Airport is running Ubuntu:

I want to thank the GNOME Foundation for sponsoring my travel to GUADEC 2015.
Sponsored by GNOME!

Unboxing a Siswoo C55

For a couple of days now, I am an owner of a Siswoo Longbow C55. It’s a 5.5″ Chinese smartphone with an interesting set of specs for the 130 EUR it costs. For one, it has a removable battery with 3300mAh. That powers the phone for two days which I consider to be quite good. A removable battery is harder and harder to get these days :-/ But I absolutely want to be able to replace the battery in case it’s worn out, hard reboot it when it locks up, or simply make sure that it’s off. It also has 802.11a WiFi which seems to be rare for phones in that price range. Another very rare thing these days is an IR interface. The Android 5.1 based firmware also comes with a remote control app to control various TVs, aircons, DVRs, etc. The new Android version is refreshing and is fun to use. I don’t count on getting updates though, although the maker seems to be open about it.

The does not have NFC, but something called hotknot. The feature is described as being similar to NFC, but works with induction on the screen. So when you want to connect two devices, you need to make the screens touch. I haven’t tried that out yet, simply because I haven’t seen anyone with that technology yet. It also does not have illuminated lower buttons. So if you’re depending on that then the phone does not work for you. A minor annoyance for me is the missing notification LED. I do wonder why such a cheap part is not being built into those cheap Chinese phones. I think it’s a very handy indicator and it annoys me to having to power on the screen only to see whether I have received a message.

I was curious whether the firmware on the phone matches the official firmware offered on the web site. So I got hold of a GNU/Linux version of the flashtool which is Qt-based BLOB. Still better than running Windows… That tool started but couldn’t make contact with the phone. I was pulling my hair out to find out why it wouldn’t work. Eventually, I took care of ModemManager, i.e. systemd disable ModemManager or do something like sudo mv /usr/share/dbus-1/system-services/org.freedesktop.ModemManager1.service{,.bak} and kill modem-manager. So apparently it got in the way when the flashtool was trying to establish a connection. I have yet to find out whether this

/etc/udev/rules.d/21-android-ignore-modemmanager.rules

works for me:

ACTION!="add|change|move", GOTO="mm_custom_blacklist_end"
SUBSYSTEM!="usb", GOTO="mm_custom_blacklist_end"
ENV{DEVTYPE}!="usb_device", GOTO="mm_custom_blacklist_end"
ATTR{idVendor}=="0e8d", ATTR{idProduct}=="2000", ENV{ID_MM_DEVICE_IGNORE}="1"
LABEL="mm_custom_blacklist_end"

I “downloaded” the firmware off the phone and compared it with the official firmware. At first I was concerned because they didn’t hash to the same value, but it turns out that the flash tool can only download full blocks and the official images do not seem to be aligned to full blocks. Once I took as many bytes of the phone’s firmware as the original firmware images had, the hash sums matched. I haven’t found a way yet to get full privileges on that Android 5.1, but given that flashing firmware works (sic!) it should only be a matter of messing with the system partition. If you have any experience doing that, let me know.

The device performs sufficiently well. The battery power is good, the 2GB of RAM make it unlikely for the OOM killer to stop applications. What is annoying though is the sheer size of the device. I found 5.0″ to be too big already, so 5.5″ is simply too much for my hands. Using the phone single handedly barely works. I wonder why there are so many so huge devices out there now. Another minor annoyance is that some applications simply crash. I guess they don’t handle the 64bit architecture well or have problems with Android 5.1 APIs.

FWIW: I bought from one of those Chinese shops with a European warehouse and their support seems to be comparatively good. My interaction with them was limited, but their English was perfect and, so far, they have kept what they promised. I pre-ordered the phone and it was sent a day earlier than they said it would be. The promise was that they take care of the customs and all and they did. So there was absolutely no hassle on my side, except that shipping took seven days, instead of, say, two. At least for my order, they used SFBest as shipping company.

Do you have any experience with (cheap) Chinese smartphones or those shops?

Creative Commons Attribution-ShareAlike 3.0 Unported
This work by Muelli is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported.