So much hard work from the Fedora community went into Fedora 11 to make your life better. Mad props to the Fedora Artwork team too; the graphics in F11 are stunning. I started at Red Hat in 2003 during the Fedora Core 1 cycle, and with every Fedora release it’s been a great pleasure to watch how the community continues to grow rapidly and contributes so much more each release. Fedora has truly been a community project for years now, and Fedora 11 shows just how great the community can be when everyone pulls together. People are awesome. Which brings us to the next stage…
Packaging²
If you have the right tools, tools that help you do what you came to do and don’t get in your way. And that’s what Fedora Community is; it’s the next step in helping people make a better Fedora. Building better software like Fedora only gets you so far; to keep getting better you need to make the people that make the software better. That means giving the community the tools it needs to be more efficient, turn ideas into features, and collaborate more effectively. Fedora Community helps fill that need. It only gets better from here.
HAL is dead; all hail udev
Over the past few days I’ve exorcised HAL from NetworkManager’s ‘master’ branch. Instead, we go bare-metal with libgudev. cgit’s diffstat lies, here are the real numbers:
Net loss of 1511 lines of code. Not bad for a few days’ work. Besides killing HAL, this patch merges nm-system-settings into NetworkManager. Why do you care? Here’s why: fewer running processes, less latency, and cleaner internal code. We just keep scaling up from here. Next up: nm-applet and ModemManager.
When we were adding multiple active device support to NetworkManager 0.7, Bryan Clark and Mike Langlie redesigned much of the applet and kicked out some great mock-ups. Turns out GtkMenu doesn’t work well for multiple connections or multiple active devices. While it’s served us well, the applet’s GtkMenu-based code should be sent to the slaughterhouse and rendered down into tasty, tasty animal fat. Even though the new design work was done in January 2008, I only blogged about the mock-ups in June last year because I suck. I’ve had them sitting in my inbox for 18 months, because between getting NM 0.7 out, mobile broadband, and all the rest of the awesomeness that 0.7.1 brought, there hasn’t been any time to sit down and actually do the redesign. But that shouldn’t stop discussions about how excellent nm-applet can become.
In the Operating Room
Ignore the title bar. These are old; we don’t want a title-bar any more. Instead, just a custom widget like the Volume applet. So here you go:
The user has an active wired connection, and an inactive wireless device. Note that only your favorite wifi networks (ie, ones you’ve explicitly connected to before) appear in the list. People often see 10 or 20 wifi networks, and listing all of those in the menu is mostly useless. There’s a tradeoff: easy first-time wifi network discovery would now take one more click (of “Show”), but only listing the networks you actually use makes the UI a lot cleaner. And nicer for netbooks. The new design also shows more relevant detail, like security settings, in plain language.
Here the user has connected to a favorite network, and the wired interface is now inactive. Grouping the security information, the SSID, and the actions the user can perform gives a certain focus the old applet could not. Irrelevant information simply doesn’t show up.
If you’ve used a VPN with this network before (or tied it to the wifi network) it should probably show up too. This part needs more thought, because VPNs are really independent of the underlying network connection, but often should be “tied” to a specific connection or two. But what it gets right is to group the things you probably want to do near each other.
So what if you do want to see everything? Well, hit the “Show” button and you get the list of course. We can probably do better than a scrollbar here, but whatever; they’re mockups. Maybe we can be more intelligent about scanning too.
An Updated Face for 2009
2008 is not 2009. And that’s why Jon McCann and I sat down a few weeks ago and worked through things people care about now. Hot off the whiteboard after a trip through the Gimp:
Immediately you’ll notice the strong resemblance to Bryan and Mike’s work from over a year ago. To their credit, Jon and I think the same core concepts still hold. We took a look at how mobile broadband would fit in. We split out the icons to show each device type individually, though this needs more thought too, since you don’t care that much what your wifi signal strength is when you’re on 3G. But you do care how good your 3G is when it’s not connected, since you want to know whether you can even hop onto 3G or not in the first place. Thoughts? Comments? Flames?
Windows 7++?
Much to our surprise, Windows 7 looks a lot like what Bryan and Mike sketched out over 18 months ago. Sooo ahead of their time. Yeah, the coloring is different, and yeah it’s got a bunch of questionable Microsoft bling, but Windows 7’s network applet looks and feels a lot like the nm-applet mockups from January 2008. But I think we can do better, by making networking clearer and more concise. Apple’s Airport applet is probably too minimal, but Microsoft’s is probably too complex. Somewhere in the middle is where nm-applet was planning to be and should be: get you connected with a minimum of steps, and put what you need in one convenient place.
Everyday Simplicity
Sounds like something Martha Stewart would sell at K-Mart, but it’s what software should aim for. Let the users do what they came to do, then get the hell out of the way and let them do it. Don’t show options that the users don’t need on a daily basis, but make them available elsewhere in a click or two. Keep the interaction streamlined, simple, and clean. Don’t clutter it up with unnecessary options. If it’s not used on a daily basis, it probably shouldn’t be seen on a daily basis. That’s what nm-applet should do.
So let’s make it do that. Prototyping the concepts in a Python applet would be a great first step. After being kicked around the court a few times, we implement them in nm-applet. Debuting NetworkManager 0.8 with a sexy new interface would make your momma proud. Any takers?
And you need some internet. You know, to giggle over the funny lolcats like a vapid schoolgirl or figure out just what Bobby Love is really up to these days or God forbid, get some actual hacking done. What’s a person to do? You could pull out your iPhone, but the guy across the table with the Sierra card says AT&T is shite on this part of the corridor. Besides, you don’t have an iPhone, and the iPhone can’t tether anyway. Useless. Or, you could pull out your T-Mobile G1 Googlephone and make rlove proud. Oh wait, it can’t tether either. Nice try.
You could get out your cable and hook up a real phone that can tether, like most other phones on the planet besides the iPhone and the G1. But then again…
Wires Suck
That’s where Bluetooth tethering comes in. There are a few reasons Bluetooth support hadn’t yet got into NetworkManager, mostly related to lack of time and good planning of the user experience. Bastien and I talked over a lot of it last summer in Portland and came up with a strategy, which was spelled out in various places, but nobody really followed it through. Bluetooth support can’t just be a hack; it shouldn’t be “Type your modem device name here”-style fail; it should be well-integrated into the desktop experience. And that requires doing the right things in the right places. The first step towards that goal was ModemManager, which pulls all the modem code out of NetworkManager into a nicely architected daemon that abstracts the hardware differences. The next step was making NetworkManager talk to BlueZ.
But Bluetooth Kicks Ass
Since Bastien apparently had nothing better to do, and since his favorite team was probably losing as hard as only they can and thus pointless to watch, he showed up with a pile of patches adding Bluetooth bits to NetworkManager. At the end of the week, we’d got core Bluetooth PAN working pretty well on master, while DUN has to wait a bit until some ModemManager issues get sorted out. Next up is creating the seamless Bluetooth desktop experience from pair to air by adding the necessary bits to the applet and connection editor. But the heavy lifting on the NetworkManager side is mostly done. Thanks to Bastien, NetworkManager 0.8 will ship with native Bluetooth support that doesn’t suck.
Since NetworkManager 0.7 came out, there’s one issue that’s been causing confusion with lots of users: hashed network keys. That passphrase you type into the box when you connect to a WiFi network using any OS isn’t what actually gets used; instead it’s hashed to come up with the real key. There are a few different ways to enter an encryption key for a WiFi network, so bear with me:
Hex: works with both WEP and WPA, and is the most compatible since it actually is exactly what gets sent to the driver as the encryption key. For WEP, this is either a 10 character (for 40-bit WEP) or a 26 character (for 104-bit WEP) string composed of hexadecimal characters. For WPA it’s a 64 character hexadecimal string. Typing in 64 hexadecimal characters gets old pretty fast, which leads us to…
Passphrase: a string of arbitrary characters that is hashed into the actual key to be used. WEP passphrases have no real size restrictions, and are repeated into a 64-byte buffer before being hashed with MD5. At least the creators of WPA learned from experience, specifying that WPA passphrases are between 8 and 63 characters inclusive, which means you can actually autodetect whether it’s a passphrase or a hex key, unlike WEP passphrases. WPA passphrases are hashed using SHA-1 (PBKDF2 with HMAC-SHA-1, salted with the network’s SSID) into the real 256-bit encryption key.
ASCII key: Thanks, Lucent. The original WaveLAN cards used passphrases of 5 or 13 ASCII characters, which some drivers and people still use for God knows what reason. To “hash” it, take the ASCII value of each character (one byte, or two hex digits) and stuff them into a buffer. Not secure at all.
Apple passwords: in their infinite wisdom, Apple chose a completely different hashing mechanism for WEP. This means that to connect a non-Apple computer to an Airport WEP network, you need the “Compatible Network Password”, ie the hexadecimal WEP key. At least they stuck with the standard for WPA.
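To make the key types above concrete, here is a minimal Python sketch of the three non-Apple conversions. The function names are made up for illustration. The WPA part follows the standard derivation (PBKDF2 with HMAC-SHA-1, the SSID as salt, 4096 iterations, 32 bytes out). The WEP passphrase part assumes the common de facto 104-bit scheme, which keeps the first 13 bytes of the MD5 digest; that truncation is an assumption, and 40-bit WEP passphrases use a different algorithm that is not shown here.

```python
import hashlib


def wpa_psk(passphrase: str, ssid: str) -> str:
    """Derive the 256-bit WPA key (64 hex characters) from a passphrase."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("WPA passphrases are 8 to 63 characters")
    key = hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
    )
    return key.hex()


def wep104_passphrase_key(passphrase: str) -> str:
    """Assumed de facto 104-bit WEP scheme: repeat the passphrase into a
    64-byte buffer, MD5 it, and keep the first 13 bytes (26 hex characters)."""
    data = passphrase.encode("utf-8")
    if not data:
        raise ValueError("empty passphrase")
    buf = (data * (64 // len(data) + 1))[:64]
    return hashlib.md5(buf).digest()[:13].hex()


def wep_ascii_key(key: str) -> str:
    """WEP ASCII key: the raw ASCII bytes are the key, written out as hex."""
    if len(key) not in (5, 13):
        raise ValueError("WEP ASCII keys are 5 or 13 characters")
    return key.encode("ascii").hex()
```

Note that the WPA key depends on the SSID, so the same passphrase gives a different hex key on every network, while the WEP ASCII "hash" is nothing more than a change of notation.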
The huge pain with WEP is that you simply cannot autodetect what type of key the user has entered. Since WEP passphrases can also be composed of 10 or 26 hexadecimal characters, it’s impossible to differentiate between a WEP hex key, a WEP passphrase, or a WEP 104-bit ASCII key. Which means the user has to know what WEP key type they are using. FAIL. They also have to know whether the network uses 40-bit or 104-bit encryption, and whether it uses Shared Key authentication or Open System authentication. That’s 12 different possible WEP configurations.
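A small sketch of why autodetection works for WPA and not for WEP, using only the length and character rules described above. The function names and the returned labels are hypothetical, not anything from NetworkManager itself.

```python
import string
from typing import List

HEX_DIGITS = set(string.hexdigits)


def is_hex(secret: str) -> bool:
    return bool(secret) and all(c in HEX_DIGITS for c in secret)


def wpa_key_type(secret: str) -> str:
    """WPA is unambiguous: 64 hex characters is a key, 8-63 is a passphrase."""
    if len(secret) == 64 and is_hex(secret):
        return "hex"
    if 8 <= len(secret) <= 63:
        return "passphrase"
    raise ValueError("not a valid WPA hex key or passphrase")


def wep_key_candidates(secret: str) -> List[str]:
    """WEP is ambiguous: return every interpretation the input could have."""
    candidates = []
    if len(secret) in (10, 26) and is_hex(secret):
        candidates.append("hex")
    if len(secret) in (5, 13):
        candidates.append("ascii")
    if secret:
        # Passphrases have no real size restrictions, so anything qualifies.
        candidates.append("passphrase")
    return candidates
```

For WPA the function always has exactly one answer. For WEP, a 10-character hex string comes back as both "hex" and "passphrase", which is exactly the question the user ends up having to answer.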
WEP == MASSIVE USER FAIL
In any case, NetworkManager 0.7 required pre-hashed keys for reasons I don’t accurately remember, possibly related to bad trips from the NM 0.6 API that I mis-designed. So the applet hashed your passphrase right after you entered it and stored the hashed key in the keyring. Unfortunately, when the driver failed to connect and NetworkManager asked for your secrets again, all you saw was something you certainly don’t remember typing in. While this actually was your passphrase, and it would work when you hit OK, it certainly was confusing.
Change We Can Believe In
As of Saturday, you’ll always see what you typed in. The real fix is to simply connect the first time and never ask for your passphrase again, but a failed connection is almost always due to driver and supplicant bugs that can and should be fixed; I’ve spent weeks of my life doing just that. Of course, that can only reliably happen in open-source drivers; at least when we find the bugs we can fix them. Which is why you really don’t want any of these.
Seriously Huawei. Your firmware engineers need a huge punch in the face. And then some re-education. With firmware version 11.100.17.00.114 at least, there are two complete stupidities that deserve a huge public mocking.
First, the response to AT+GCAP is simply “+CIS707-A, +MS, +ES, +DS, +FCLASS”. No, it’s not prefixed with “+GCAP: ” like every other modem on the planet, which I’m pretty sure the relevant standards (TIA/EIA/IS-131, TIA/EIA-602, and V.250) require:
Extended syntax result codes shall be prefixed by the “+” character to avoid duplication of basic format result codes specified in TIA-602 and by manufacturers. Following the “+” character, the name of the result code appears; result code names shall follow the same rules as command names (see 5.4.1).
The EC121 engineers apparently decided they couldn’t even follow their own documented responses for the CM300/CM320/CM350 for example. I’m assuming they don’t rewrite the firmware every time they make a new part, but I never underestimate the capacity for stupidity.
Second, when it encounters a command it doesn’t like, it sometimes returns “COMMAND NOT SUPPORT”. Seriously. Not “ERROR” or “ERR” like every other modem on the planet. Maybe they were so high on crack they forgot the “ED” too.
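For illustration, here is roughly what tolerating both quirks looks like. This is a hedged Python sketch, not NetworkManager’s actual probing code; the function name and the set of error strings are assumptions based only on the replies quoted above (plus the standard “+CME ERROR” prefix).

```python
from typing import List, Optional

# Replies treated as failure. "COMMAND NOT SUPPORT" is the firmware quirk
# described above; the set itself is an illustrative assumption.
ERROR_REPLIES = {"ERROR", "COMMAND NOT SUPPORT"}


def parse_gcap(lines: List[str]) -> Optional[List[str]]:
    """Return the capability tokens from an AT+GCAP reply, or None on error.

    Accepts both the standard "+GCAP: +CGSM, ..." form and the bare
    "+CIS707-A, ..." form that lacks the "+GCAP: " prefix.
    """
    for raw in lines:
        line = raw.strip()
        if not line or line == "OK":
            continue
        if line in ERROR_REPLIES or line.startswith("+CME ERROR"):
            return None
        if line.startswith("+GCAP:"):
            line = line[len("+GCAP:"):].strip()
        if line.startswith("+"):
            return [tok.strip() for tok in line.split(",") if tok.strip()]
    return None
```

Feeding it the exact reply quoted above yields the five capability tokens, and a “COMMAND NOT SUPPORT” reply is treated like any other error instead of being waited on until a timeout.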
Seriously, what the fuck? Stop it. No really, stop being dumb.
Tons of bugs fixed and new features implemented due to popular demand.
Support for more mobile broadband devices and phones
Plays better with stupid wifi and ethernet drivers
Support for rfc3442 classless static routes
The default “Auto eth0” connection is now read/write
Compatibility fixes for 802.1x PEAP authentication and 3G/PPP connections
Reduced wakeups for power saving awesomeness
Ability to deny specific devices the default route
More correct display of wifi signal strength
Custom IPv4 settings for mobile broadband connections
More informative display of network device state
Less annoying password behavior in the vpnc plugin
OpenVPN HMAC authentication and IP configuration fixes
This release fixes more than 50 bugs, including 17 from Fedora, 22 from GNOME, 6 from Ubuntu, and 3 from Debian. Packages are already in updates-testing for Fedora. If you don’t use Fedora, and your distro doesn’t have 0.7.1 soon, then you need to harass them until they get it 🙂
… to shoot myself in the head. Some mobile broadband cards are like a nice, quiet child that does everything you tell them to do; they’d even clean out your family’s slurry tank if you asked. Unfortunately, most of the cards you just want to throw right into the slurry tank strapped to the side of a large brick. Hopefully the tank is full, and the card doesn’t have a snorkel, not that a snorkel would help it much.
Yes, there are standards. But as we all know, given 10 people and a standard, you’ll end the day with 12 or 13 differently behaving “standards-compliant” implementations. People suck. You’d think it would be easy to agree on an AT command for “prefer 3G / prefer 2G / 3G only / 2G only”. NO SIMPLE FOR YOU. But NetworkManager has to work around huge amounts of stupid. Here’s a run-down of some of the mobile broadband hardware that’s available today and what about it sucks.
HUAWEI (PHEAR THE DRAGON)
Europe apparently got carpet-bombed with these things. They provide two usable serial ports; one for data and another for stuff like signal strength, mode switches, etc. But asking it anything on the second port makes the modem cry, grab its toys, and run home to tell mommy what you’ve done. This caused problems with the new modem capability probing code in NetworkManager 0.7.1. Thanks guys (not). Dropping unhandled input on the floor would apparently have been too easy. And, of course, they use AT^SYSCFG with some magic numbers to indicate 2G/3G preference. That said, Huawei does participate upstream and proactively adds IDs and support for their hardware.
Qualcomm Gobi (NEW HOTNESS ALERT)
Apparently now all the rage State-side (though they’ve been out for a while); even in the ultra-small Atom-based Poulsbong-smoking Sony Vaio P series. These parts can do GSM/HSPA and CDMA/EVDO, depending on the firmware they load. Now that’s pretty cool. There’s even a driver for them (qcserial) queued up in gregkh’s tree. Unfortunately, because Qualcomm still can’t get their head out of their ass you won’t get signal strength, cell tower reports, or mode change signals, since the driver only exposes one serial port which is used for PPP (it might support GSM multiplexing, in which case this rant is wrong). Everything else seems to get done from userspace with libusb and a proprietary protocol. WTF is so awesome about proprietary protocols? You get to sell people an SDK for $20,000 or something? Nice try.
Modern Sierra (MAGICALLY DELICIOUS)
Driven by the ‘sierra’ driver (surprise!), these cards expose multiple serial ports, two or more of which accept AT commands. Only one of these ports has a full AT interpreter and gets used for PPP; the other ports get used for signal strength and GPS during the PPP session. I hear new Sierra gear is switching to the tty+netdevice model though, so these will be old-but-not-busted soon. But of course, somebody took a huge pull off the bong, and came up with AT!SELRAT for 2G/3G preference. Yay! Variation #2!
Old-School Sierra (OLD BUT NOT BUSTED)
Sweet bliss. Works like a champ. 16-bit PCMCIA. HSDPA even. GSM multiplexing support makes up for the fact that it only exposes one serial port to the OS (even though we don’t support userspace muxing yet). It’s been supported in NetworkManager since, like, day #1. Like the newer Sierra cards, it also uses AT!SELRAT, so at least Sierra is consistent. Which is more than I can say for some other hardware I’ve seen.
Option “HSO” (THE NEED FOR SPEED)
PPP sucks; it’s only between you and the card, not over the air. So why bother? Which is why Option killed PPP dead. These devices expose multiple AT-capable ttys, and an ethernet network interface. Do the setup on the AT ports, do the data in high speed on the network interface. This is the current trend. Sierra is going to do it soon. So is Huawei. But unfortunately, everyone does the authentication and the IP configuration differently. And Option’s 2G/3G preference command is AT_OPSYS. Variant #3! Go to hell. In any case, big thanks to Option for providing me with hardware and also working with the upstream kernel community; you guys rock.
Ericsson F3507g (SWEDISH INVASION)
Dude, you got a Dell Mini? If you’ve coughed up for the 5530 Mobile Broadband option, it’s probably got one of these inside. The Sony Ericsson MD300 is the same hardware. For once, somebody uses standard interfaces too; these parts expose multiple cdc-acm serial ports (like most mobile phones), and one cdc-ether network device used for data. The interesting thing is that to get an IP address, you use DHCP on the ethernet interface. We don’t yet know how to set 2G/3G preference, but you can get it with AT*ERINFO. All hail variant #4. This is getting ridiculous. At least Ericsson pays people to make their stuff work with Linux, though the AT reference document is NDA-encumbered. Need to hit somebody with the cluebat for that.
BUSlink SCWi275u (DEAR GOD DON’T BUY THIS)
Really. If you find one, put it out of its misery by burning it alive. Yeah, it’s really old, and it’s only GPRS, and it’s from the land before time when they put WiFi into cellular modems because nobody had it onboard. And hey, its firmware is as clueless about standards as Qualcomm is about Open Source. But it works fine with NetworkManager 🙂
As you can see, nobody in this industry talks to each other, and none of the carriers care about making it easier to write software for the devices they sell. Everywhere you look there are silos, walled gardens, and revenue stream protection. But that’s where NetworkManager comes in.
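The practical consequence of all those variants is a per-vendor lookup table somewhere in the stack. A minimal Python sketch, using only the command names mentioned in the run-down above; the vendor keys and function name are made up, and the command arguments are deliberately left out because this post doesn’t give them.

```python
from typing import Optional

# Vendor-specific AT command used to SET 2G/3G preference (names only;
# the arguments are vendor "magic numbers" not covered here).
MODE_PREFERENCE_COMMANDS = {
    "huawei": "AT^SYSCFG",
    "sierra": "AT!SELRAT",
    "option-hso": "AT_OPSYS",
}

# Vendors where only a way to READ the current mode is known so far.
MODE_QUERY_ONLY_COMMANDS = {
    "ericsson": "AT*ERINFO",
}


def mode_preference_command(vendor: str) -> Optional[str]:
    """Return the command name for setting 2G/3G preference, if known."""
    return MODE_PREFERENCE_COMMANDS.get(vendor.lower())
```

Three vendors, three different commands, and a fourth where only the query side is known: there is no single standard command to fall back on.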
The Bright, Shiny New Mobile Future
NM 0.7 delivered the promise of Mobile Broadband. We took a limited set of devices (ie, no phones) and made those work out of the box. Now it’s time to get bigger, faster, and stronger. We can’t support everything in the current architecture inside NetworkManager, so Tambet started a new project called ModemManager. All mobile broadband handling gets punted out to ModemManager (similar to how WiFi is handled with wpa_supplicant), making the NetworkManager core simpler, easier to maintain, and more robust. ModemManager provides a nice D-Bus API for everything modem-related; data connections, SMS, phonebook, signal strength, GPS, etc. It rocks. It’s more flexible. It spews out cute, cuddly kittens by the thousands. It’s definitely the right architecture and the way forward.
The Slightly Less-Bright Now
But until ModemManager drops some awesome on y’all, we need to better support modems in NetworkManager 0.7.x too. A few problems we’ve been tackling over the past few months:
multiple serial ports – most modems provide more than one port; but nothing tells you what that port gets used for. Sometimes asking the port what its purpose in life is doesn’t work either. So we have to special-case some modems in the udev prober, and some in NetworkManager. This gets as ugly as your first girlfriend/boyfriend.
modem capabilities – this is why your mobile phone didn’t work with NetworkManager 0.7.0. We need to know whether the modem is CDMA/EVDO or GSM/HSPA since the operation and UI need to change based on which kind of modem it is. Previously we used hal-info’s 10-modem.fdi, which simply doesn’t scale. Asking the modem freaks some of them out (ie, Huawei) and others just lie for various reasons. So with NM 0.7.1, we probe serial ports with a udev helper and are careful not to touch things that shouldn’t be touched.
modem init strings – because, of course, consistent handling of initialization strings between devices would just be too easy. Some devices puke up half-eaten puppies when given the same init string that every other device on the planet supports. No standardization here. So NetworkManager 0.7.1 tries different init strings until one works.
registration commands – some Huawei modems want to use AT+CGREG instead of AT+CREG. Yeah, I know why it seems to think it can be special, but it’s not. It’s just plain stupid. And this seems to change based on firmware version of all things. Dear God, why do you toy with me? So in lieu of finding a Huawei engineer and asking them what the fuck they were thinking, we hacked around it for now.
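The “try things until one works” approach from the last two items boils down to a loop like the following. This is a sketch under assumptions: `send` stands in for whatever actually writes to the serial port and returns the modem’s reply, and the init strings in the example are illustrative, not the ones NetworkManager uses.

```python
from typing import Callable, Iterable, Optional


def first_working(candidates: Iterable[str],
                  send: Callable[[str], str]) -> Optional[str]:
    """Send each candidate command in order; return the first one the modem
    answers with OK, or None if every candidate is rejected."""
    for command in candidates:
        reply = send(command)
        if reply.strip().endswith("OK"):
            return command
    return None


if __name__ == "__main__":
    # A fake modem that chokes on the longer init strings (hypothetical).
    def fake_modem(command: str) -> str:
        return "OK" if command == "ATZ" else "ERROR"

    init_strings = ["ATZ E0 V1 X4 &C1", "ATZ E0 V1", "ATZ"]
    print(first_working(init_strings, fake_modem))
```

The same helper covers the registration case: pass it the two query commands (AT+CREG? and AT+CGREG?) in order of preference and remember whichever one the firmware accepts.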
We’ve gotten most of this worked out in the NetworkManager 0.7.1 release candidate series. And all this crap is exactly why NM 0.7.1 isn’t out yet. For example, when NM 0.7.1-rc1 broke people’s Huawei cards due to modem probing freaking out the firmware, I spent $100 for a Huawei E160G off eBay. It took a week to get here, and two days to fix it.
But that’s why NetworkManager rocks; we pony up the cash to make sure our shit works. Users appreciate that.
Or test out packages by your favorite distro. If they don’t have testing packages for 0.7.1-rc3 already, they are lame. Fedora packages here for F9 and F10.
The other day while chilling beside the pool on my private island (A), I decided to head into Port Nelson (B) to check up on my various offshore accounts. Financial crisis and all you see; that Stanford thing last week really had me worried. A laptop hibernation and a short helicopter ride later, I’m in the branch office and need to look up a few things pertaining to my net worth. But upon resume, NetworkManager started reconnecting to my villa’s access point, which was all the way back on my island. WTH!!!??!?!
This problem has been around for a long time. Pretty much since the beginning of time. I looked at it last year and concluded that it wasn’t NetworkManager. This time it really annoyed me, so I made a bet with my porter that I’d figure it out by the time I left to hit up this party in Bailey Town. He’s cool like that. I got to keep my money. It still wasn’t NetworkManager.
See, drivers timestamp wifi networks they know about. That way you can figure out if the network was last seen a second ago, 7 seconds ago, or so long ago that it’s dead to me. But they all use a kernel counter called ‘jiffies’ to do that. And ‘jiffies’ doesn’t increment across suspend/resume. See where I’m going with this?
So the next scan after resume, all the old networks are mixed in with the new networks, and you simply can’t tell which ones are old and which ones are new. They all look like they were scanned within the past 10 seconds. The last AP you were connected to looks like a great candidate to try, no matter where it is.
The solution is to age the scan results with the amount of time spent in suspend. This keeps both normal laptops (where you’ll usually be suspended for a while) and OLPC-style laptops (where suspend can happen for sub-second durations) happy. The patches are queued for 2.6.30, and I’ve backported them to 2.6.27, 2.6.28, and 2.6.29. They are also a prerequisite for making NetworkManager just try harder to associate when the connection fails, which I know annoys a lot of people, including myself.
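The idea is simple enough to sketch in a few lines of Python. This is not the kernel patch, just an illustration of the aging step; the class, function names, and the 10-second freshness window are assumptions, and `last_seen` is measured on a clock that (like jiffies) stops while suspended.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScanResult:
    ssid: str
    last_seen: float  # seconds, on a clock that does not tick during suspend


def age_scan_results(results: List[ScanResult], suspended_for: float) -> None:
    """On resume, push every cached result back in time by the suspend length."""
    for result in results:
        result.last_seen -= suspended_for


def fresh_results(results: List[ScanResult], now: float,
                  max_age: float = 10.0) -> List[ScanResult]:
    """Keep only results seen within the last max_age seconds."""
    return [r for r in results if now - r.last_seen <= max_age]


if __name__ == "__main__":
    cache = [ScanResult("villa", last_seen=100.0)]
    # Suspend at clock time 101, spend an hour in the helicopter, resume.
    age_scan_results(cache, suspended_for=3600.0)
    cache.append(ScanResult("branch-office", last_seen=102.0))
    print([r.ssid for r in fresh_results(cache, now=103.0)])
```

Without the aging step, the villa’s access point would look three seconds old after resume and pass the freshness check; with it, the result is an hour stale and gets dropped. Because the adjustment is just the measured suspend duration, sub-second suspends barely change anything.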
Problem solved, party attended.
The big lesson? When something is wrong with the drivers, fix the drivers. Don’t hack around it like a helpless tool. And if you can’t fix the driver, well… then why did you mindlessly stuff $50 bills into Broadcom’s thong in the first place?