Archive for the ‘Fedora’ Category

Good Touch, Bad Touch: /etc/hosts

Thursday, December 16th, 2010

Raise 'em up really high (stina jonsson, cc-by-nc-2.0)

So I’ll bet you thought touching a simple little file would be simple and a little code right?  So very wrong.  This is Linux we’re talking about, and if this stuff was simple, we’d already be enjoying our gin peartinis on the white sand beaches of St. Maarten with the latest dead tree from Johanna Lindsey.

See, your hostname needs to map to an IP address assigned to your computer, otherwise stuff gets angry.  Like X11, unless you have this hack.  Quite a few other things are also broken enough to look up the hostname to determine the machine’s IP address.  If your hostname isn’t in /etc/hosts, and it isn’t resolvable by DNS, or if DNS is down, or if your network cable got unplugged, or if it doesn’t map to an address assigned to your machine, or if /etc/resolv.conf isn’t set up right, this stuff just breaks.  That’s a lot of ifs.
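To make that pile of ifs concrete, here is a minimal Python sketch of the check a fragile program effectively makes: look up the hostname, then see whether the answer is an address the machine actually owns. The function name `hostname_maps_locally`, the injectable `resolver` argument, and the idea of passing in a set of local addresses are all illustrative assumptions, not anything taken from X11 or NetworkManager.

```python
import socket


def hostname_maps_locally(hostname, local_addrs, resolver=socket.gethostbyname):
    """Return True if hostname resolves to one of this machine's addresses.

    resolver is injectable so the check can be exercised without touching
    /etc/hosts or DNS; by default the system resolver is used.
    """
    try:
        addr = resolver(hostname)
    except OSError:
        # socket.gaierror is an OSError: the name is not in /etc/hosts,
        # DNS is down, the cable is unplugged, and so on.
        return False
    # The name resolved, but it only helps if the address is really ours.
    return addr in local_addrs
```

Calling `hostname_maps_locally(socket.gethostname(), {"127.0.0.1", "127.0.1.1"})` would run the check against the real resolver; a real caller would also want to include the addresses assigned to each network interface.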

Bad Touch

So since mid-2008, NetworkManager has tried to keep /etc/hosts updated with your current hostname, and since earlier this year, to map the current hostname to your default interface’s IP address.  Despite having 31 unit tests and fixing a bunch of bugs the code still doesn’t make everyone happy.  The Debian people want the hostname mapped to 127.0.1.1 not 127.0.0.1 (Fedora) or 127.0.0.2 (old Debian).  Which is fine.  Other people get touchy when stuff changes even if their special changes get preserved.  That’s also fine.  Others want to let DNS handle the hostname resolution even though that creates 3 more ways your machine can inexplicably hang.  And I’m tired of piling hacks on top of code that’s already really ugly and complicated.  Thank God for unit tests.

Good Touch

So here’s what we’re going to do.  After a third-quarter huddle with my linebackers, we’ll be removing all the code in NetworkManager that touches /etc/hosts. Gone are the bits that add your hostname.  Gone are the bits that remove your old hostname if it changes.  You now have all the rings of power.  Distros should use the “recommends” functionality of their package system to install nss-myhostname, which ensures that your hostname is always resolvable to a local IP address.  And if for some reason you don’t like that, you can uninstall it and keep /etc/hosts all to yourself.  Everyone wins!

And the best part?  I get to delete code.  I just love doing that.

Don’t Try to Run, Honey

Thursday, September 23rd, 2010

Excuse me sir! Which way to the DNS?

We periodically get mails and feature requests for making NetworkManager play better with a local caching nameserver.  Why would you want to run one, you ask?  Simple: speed, latency, and split DNS.  Of these, the first two are the most important.  It turns out that DNS service on many ISPs just sucks.  Besides returning utterly useless yet supposedly “helpful” web pages for non-existent domains that you simply mistyped, they are often just glacially slow.  A huge shout out to Qwest in Portland for making the Interwebs last year feel like getting all my fingernails gradually pulled off with a pair of red-hot pliers.  I can’t update my Facebooks and browse my collegehumor with lookups that take a second or two.  Especially on high latency connections like 3G or satellite, running a local caching nameserver makes things considerably snappier.

dnsmasq makes it trivially easy.  You can do it with BIND too, but like everything involving BIND, it’s certainly not trivially easy.  We actually tried this about 3 or 4 years ago with NetworkManager 0.6 but it just wasn’t time yet and the implementation wasn’t that great.  Oh yeah, there’s also DNSSEC which various people want to deploy.

Here’s How It’s Gonna Be

Cue fully-integrated, seamless local caching nameserver support for NetworkManager 0.8.2.  If you have dnsmasq installed and set the “dns=dnsmasq” key in your /etc/NetworkManager/NetworkManager.conf file then you’re all set.  Distros can enable this by default, which we’ll be doing in Fedora 15 and later.  Now you’ll get a local caching nameserver that will also do split DNS when you’re connected to a VPN, so that queries for resources on the secure network go to the VPN nameservers, and everything else goes to your upstream ISP.  And the results get cached for speed.  This already works great with dnsmasq, but there are still a few issues with the BIND plugin that mean it’s not quite ready yet.
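The split DNS decision itself fits in a few lines of Python. This is only a sketch of the routing idea, not how dnsmasq or NetworkManager implement it; the function name, the example domain, and the nameserver addresses (picked from documentation-reserved ranges) are all made up for illustration.

```python
def pick_nameservers(query, vpn_domains, vpn_servers, default_servers):
    """Route queries for VPN-owned domains to the VPN's nameservers and
    everything else to the default (upstream ISP) nameservers."""
    name = query.rstrip(".").lower()
    for domain in vpn_domains:
        domain = domain.strip(".").lower()
        # Match the domain itself or any name underneath it, but not a
        # different domain that merely ends with the same letters.
        if name == domain or name.endswith("." + domain):
            return vpn_servers
    return default_servers
```

With `vpn_domains=["corp.example.com"]`, a lookup for `wiki.corp.example.com` goes to the VPN servers while `www.example.org` goes upstream. Caching would sit on top of this and is left out here.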

Plus, it’s a plugin-style architecture so it’s easy to create new plugins for services that might want to be aware of your network connection’s DNS servers for prefetching or whatever.  Or if djbdns floats your boat, make a plugin!  It’s pretty simple.

You’re a Fine Piece of Real-Estate

Which brings us to a 0.8.2 release.  In keeping with the goal of speeding up minor point releases we’re going to push out a 0.8.2 really, really soon.  We’ve spent a ton of time on polish and bug fixing and everyone should get a piece of the action.  Then, we’ll start concentrating more heavily on NM 0.9 and pushing the architecture forward while simplifying the API dramatically, all in preparation for an awesome GNOME 3.

N900: $349 from Dell Small Business until Monday

Saturday, August 7th, 2010

Another month, another Dell deal on the N900.  I’m pretty [1] happy with mine, and for only $349 you too can help support a company that actually contributes back to open-source instead of just tossing shit over the wall.  As open-source developers we should vote with our money, and just like we don’t buy Broadcom wifi cards because they Just Don’t Care ®, so should we avoid phones from companies with similar attitudes.  That means you should get an N900.  Use coupon code P3MK8L7D80B0F0 at the checkout.

[1] now if only the sodding email client would actually cache my mail instead of hitting up the server every time I open my inbox…  What gives, Nokia?

Determination That Is Incorruptible

Monday, August 2nd, 2010

Networking is never done... (via earthkath)

Whoever said networking was boring?  Actually, I hope it is boring for users, so boring in fact that they can ignore it completely, get on with their life, and accomplish all sorts of magical things.  But enabling that magic is never dull, and it’s never done. There’s always a new technology or device to enable, more configurations to cover, and changing usage patterns to adapt to.  And another giant leap along that road is…

NetworkManager 0.8.1

… which we released a few days ago.  Tarballs are in the usual places.  Hit up the packages for Fedora, Debian, and Ubuntu.  This release is the culmination of a ton more effort than just the minor version bump signifies, and a huge thanks goes out to everyone involved in the features, code, and testing.  As always, this release nails the top feature request and piles in a bit of something for everyone else.

  • Bluetooth Dialup Networking (DUN) – the #1 user requested feature; you set it up just like Bluetooth PAN except you check a different box at the end
  • nmcli – a command-line interface to almost everything NM does
  • Mobile Broadband Status – signal strength, roaming, carrier name, and access technology shown for your convenience
  • Enhanced IPv6 support – with better DHCPv6 and tons of fixes to IPv6 operation
  • Logging and debugging – log verbosity and domains are now highly granular; make NM as quiet or verbose as you desire

Overall we’ve had 650 commits, 80 bugs fixed, and almost 20,000 lines of code changed since 0.8. That’s a ton of great stuff.  And we’ll continue to land yet more great stuff in 0.8.2.  Let us know what you want!

Next Stop: Simplification 0.9

We’ve decided the benefits of user settings are outweighed by the simplicity of having all your configuration stored and managed by the core daemon.  So Daniel Gnoutcheff is spearheading the effort to kill user settings as a Google Summer of Code project, and he’s kicking major ass.  We’re reworking NetworkManager into the one-stop-shop for all your configuration needs, making it radically simpler to create custom user interfaces for controlling and configuring your network, enabling great fast user switching and finer-grained administration.  It makes NetworkManager smaller, faster, and easier to interact with.  We’re going to base a great GNOME Shell network experience off this architecture and make KDE and XFCE developers’ lives easier at the same time.

This is a huge effort, and best of all should get rid of way more code than it adds.  I love waking up to the smell of freshly killed code even if I wrote it.  I don’t think I can overstate how much easier it’ll be to talk to, work with, and understand NetworkManager when this is done.  It’s gonna be great.  Even magical.

Not a Jackass Episode #1

Thursday, July 15th, 2010
Donkey with a circle and slash

Straight from the horse's mouth (via Lamerie)

Why WEXT Sucks Episode #52,334

The world only needs a few jackasses and I’d like to think I’m not one of them.  So instead of being a jackass and making fun of people who bought the wrong hardware, tonight I’m going to throw a bone to everyone who mistakenly bought a Broadcom WiFi card thinking that Broadcom cares about open-source and that any bugs you had with their binary driver would be fixed in a timely manner.

In a great example of how WEXT is underspecified, the frequency returned from SIOCGIWFREQ has been interpreted to mean one of two things depending on the driver you have.  Some drivers report the associated channel, while others report the tuned channel.  Of course during a scan the card tunes to a bunch of different channels.  So when you hit up SIOCGIWFREQ you have no idea what the card is going to report.

Some configurations use the same BSSID/SSID combination on different bands.  Thus we need to know what the associated frequency is so we can match up the exact AP the card’s talking to with an entry in the scan list.  Otherwise the scan list doesn’t represent any sort of reality, and that’s not a good thing.  If the card reports the tuned frequency when it’s background scanning or finding a better roaming candidate then the match will fail.
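Here is a rough Python sketch of that matching problem. It is not the code in NetworkManager; the `ScanEntry` type, the function name, and the fallback rule (accept a BSSID/SSID-only match when it is unambiguous) are assumptions chosen to show why the reported frequency matters.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScanEntry:
    bssid: str
    ssid: str
    freq_mhz: int


def find_current_ap(scan_list: List[ScanEntry], bssid: str, ssid: str,
                    reported_freq_mhz: int) -> Optional[ScanEntry]:
    """Match the associated AP against the scan list.

    Prefer an exact BSSID/SSID/frequency match.  If the driver reported the
    tuned frequency instead of the associated one, that match fails, so fall
    back to BSSID/SSID alone, but only when exactly one entry qualifies.
    """
    same_ap = [e for e in scan_list
               if e.bssid.lower() == bssid.lower() and e.ssid == ssid]
    for entry in same_ap:
        if entry.freq_mhz == reported_freq_mhz:
            return entry
    if len(same_ap) == 1:
        return same_ap[0]
    # Same BSSID/SSID on several bands and a frequency we can't trust:
    # there is no way to tell which entry the card is really talking to.
    return None
```

The last branch is the dual-band case: with the same BSSID/SSID on 2.4 GHz and 5 GHz and a bogus frequency from the driver, the lookup has to give up.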

Tossing the Bone

What’s the only thing more common than a dual-band single BSSID/SSID network configuration?  If you guessed “drivers which make talking to that network hard” then you win a big wet donkey kiss from an ugly goddamn donkey.  So in complete violation of my Fix the Stupid Drivers Instead of Hacking Around Them policy I’ve checked a fix into NetworkManager that handles this situation better.  If you ever saw:

NetworkManager[666]: <info> (wlan0): roamed from BSSID 11:22:33:44:66 (cakehole) to (none) ((none))

then I just fixed 15% of your problem.  You’re welcome.  The other 85% is your proprietary driver.  The real fix for this is to use the much more capable nl80211/cfg80211 kernel interfaces instead of WEXT.  That still doesn’t help all you proprietary driver users out there, because Broadcom pretty much ignores upstream kernel wireless advances.  So next time spend another $5 and make your life easier by getting an Intel or Atheros card instead.

Eat Burgers on the Short Bus

Friday, May 7th, 2010

Every so often I get questions about D-Bus and I end up giving a mini-lesson on what D-Bus is and how it’s used.  It took me a while to wrap my head around D-Bus long ago so I don’t expect everyone to pick up D-Bus like Yo-Yo Ma sight-reading.  So this is D-Bus, simplified.

D-Bus is just an IPC mechanism, but it layers a few concepts on top of plain message-passing. It took me some time to understand how the D-Bus object model really works (long ago of course), so don’t worry about it if you don’t completely understand how it all fits together yet.

  • service: a program that responds to requests from clients. Each service is identified by a “bus name” which clients use to find the service and send requests to it. The bus name usually looks like org.foobar.Baz. A program can claim more than one bus name; NM claims org.freedesktop.NetworkManager and org.freedesktop.NetworkManagerSystemSettings, and each is a distinct service that provides different functionality to clients.
  • object: a method of organizing distinct entities, just like programming languages have objects. Each object is uniquely identified by an “object path” (somewhat like an opaque pointer) that often looks like /org/foobar/Baz/235235. Each request sent to the service must be directed to a specific object. Many services have a base object with a well-known path that you use to bootstrap your communication with the service.
  • interface: each request belongs to an interface, which is simply a way of logically separating different functionality. It works the same way an “interface” does in object-oriented systems like Java, C++, or GLib: a specific API that completely different objects can implement, so the caller doesn’t need to know what type the object is, just what methods the interface defines. Interface names often look like D-Bus service names, but have no relation to them.
  • method call: a request for an operation or information that a client sends to the service; method calls are defined by an Interface and are sent to a service’s objects.
  • signal: a message broadcast from a service to any listening client.

Putting it All Together

Say you have a binary called “mcdonaldsd” that provides a D-Bus service called org.fastfood.McDonalds. Clients that want to talk to this service use the bus name org.fastfood.McDonalds to direct requests to mcdonaldsd.  mcdonaldsd provides a base object called /org/fastfood/McDonalds. This object implements the org.fastfood.McDonalds interface, which defines these method calls:

  • GetItems
  • Order

GetItems() returns an array of object paths representing all the things on the menu that you can order. So if you call it you’ll get something like this in return:

[ '/org/fastfood/McDonalds/Item/0', '/org/fastfood/McDonalds/Item/1' ]

Each of these returned object paths is a pointer to an object; mcdonaldsd probably even implements these as objects internally using Java or C++ or GObject or whatever. These objects are probably completely different (one may be a burger, one may be a drink, the other could be fries) but they all implement a common interface: org.fastfood.McDonalds.Item.

The org.fastfood.McDonalds.Item interface defines the following method calls:

  • GetName
  • GetType (returns either TYPE_BURGER, TYPE_DRINK, or TYPE_FRIES)
  • GetPrice
  • Consume

So even if you don’t know what exact type of object /org/fastfood/McDonalds/Item/0 is, you still can get a lot of information about it, enough to decide whether you want to order it or not.

Assume that item “0” is a “BigMac” and item “1” is “Coke”. These are clearly different objects, but each has a name, a type, a price, and can be consumed.  Next, since each item is different (even though they all implement the common org.fastfood.McDonalds.Item interface) each item will implement other interfaces that define functionality specific to that type of item.

So item “0” (BigMac) implements the org.fastfood.McDonalds.Item.Burger interface which has the following methods:

  • Unwrap
  • AddMustard
  • RemovePickle (nobody likes those stupid limp pickles anyway)

And item “1” (Coke) implements the org.fastfood.McDonalds.Item.Drink interface which has the following methods:

  • PutLidOn
  • InsertStraw
  • RemoveStraw

Remember, since both objects *also* implement the base org.fastfood.McDonalds.Item interface, you can use the Consume() method to consume both items. But clearly, you don’t want to include the InsertStraw() method on the generic org.fastfood.McDonalds.Item interface, because all items implement that interface, and it would be pretty funny if you tried to call InsertStraw() on the BigMac object.  People would stare.

So interfaces are about logically separating method calls that have specific functionality, and thus any object that wants that functionality can implement that interface, instead of having every object type duplicate every call the interface defines.

So, with python-esque pseudocode:

import sys

# Get a local proxy for the remote mcdonaldsd service.
# get_service() is pseudocode, and TYPE_BURGER / TYPE_DRINK are assumed
# to be constants defined by the service.
bus = get_service("org.fastfood.McDonalds")
mcds = bus.get_object("org.fastfood.McDonalds", "/org/fastfood/McDonalds")
burger_path = None
drink_path = None
# Let's read all the menu items
menu_items = mcds.GetItems()
for object_path in menu_items:
    item = bus.get_object("org.fastfood.McDonalds.Item", object_path)
    print("Item: %s price %s" % (item.GetName(), item.GetPrice()))
    # Now let's figure out what we want to order; we'll order
    # the first burger we find and the first drink we find, but
    # only one of each.  We just had breakfast so we're not that
    # hungry.
    item_type = item.GetType()
    if item_type == TYPE_BURGER and burger_path is None:
        burger_path = object_path
    elif item_type == TYPE_DRINK and drink_path is None:
        drink_path = object_path
    # We've found a burger and drink on the menu, let's order them
    if burger_path and drink_path:
        break

# Did this place not get their latest deliveries or something?
if not burger_path or not drink_path:
    print("This restaurant doesn't have enough food for me.")
    sys.exit(1)
food = mcds.Order([burger_path, drink_path])
if len(food) != 2:
    print("Oops, not enough money or something. Need to get a job.")
    sys.exit(1)
# Yay, we got our order.  Now we take off the damn pickle.
burger = bus.get_object("org.fastfood.McDonalds.Item.Burger", burger_path)
burger.RemovePickle()
# And we're taking this to go, so we need a lid and straw for the drink
drink = bus.get_object("org.fastfood.McDonalds.Item.Drink", drink_path)
drink.InsertStraw()
try:
    drink.PutLidOn()
except Exception:
    print("Oops, straw already inserted!")
    # We were distracted by the smell of the burger and put the
    # straw in before we put the lid on.  Oops.  Take the straw
    # out, put the lid on, and then re-insert the straw
    drink.RemoveStraw()
    drink.PutLidOn()
    drink.InsertStraw()
# All ready.  Now we can walk out, sit on the curb, and consume the
# burger and drink; note that even though the burger and drink proxies
# were created with D-Bus interfaces specific to their food type, we
# don't really need to create another interface just to call the
# generic Consume() method which both the burger and drink implement.
# Just give the method call the generic interface.
burger.Consume(dbus_interface="org.fastfood.McDonalds.Item")
drink.Consume(dbus_interface="org.fastfood.McDonalds.Item")
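
The pseudocode needs a running mcdonaldsd, which sadly doesn't exist. If you want something you can actually run, here's a toy in-process stand-in with no bus at all: it only models the idea that an object path points at an object, and that each request names an interface plus a method that the object may or may not implement. The class layout, the `call()` helper, and the prices are invented for illustration; real D-Bus bindings look different.

```python
ITEM = "org.fastfood.McDonalds.Item"
BURGER = ITEM + ".Burger"
DRINK = ITEM + ".Drink"


class Item:
    # Maps each interface this object implements to the methods it defines.
    methods = {ITEM: {"GetName", "GetPrice", "Consume"}}

    def __init__(self, name, price):
        self.name, self.price, self.consumed = name, price, False

    def call(self, interface, method, *args):
        # A request names both an interface and a method; refuse anything
        # this object doesn't implement on that interface.
        if method not in self.methods.get(interface, ()):
            raise LookupError("%s has no %s.%s" % (self.name, interface, method))
        return getattr(self, method)(*args)

    def GetName(self):
        return self.name

    def GetPrice(self):
        return self.price

    def Consume(self):
        self.consumed = True


class Burger(Item):
    methods = {**Item.methods, BURGER: {"RemovePickle"}}

    def __init__(self, name, price):
        super().__init__(name, price)
        self.has_pickle = True

    def RemovePickle(self):
        self.has_pickle = False


class Drink(Item):
    methods = {**Item.methods, DRINK: {"InsertStraw"}}

    def __init__(self, name, price):
        super().__init__(name, price)
        self.has_straw = False

    def InsertStraw(self):
        self.has_straw = True


# Object paths map to objects, much like opaque pointers.  Prices are made up.
objects = {
    "/org/fastfood/McDonalds/Item/0": Burger("BigMac", 3.99),
    "/org/fastfood/McDonalds/Item/1": Drink("Coke", 1.49),
}
```

Both objects answer `call(ITEM, "Consume")`, but `objects["/org/fastfood/McDonalds/Item/0"].call(DRINK, "InsertStraw")` raises `LookupError`, which is the in-process version of people staring.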

What You Don’t Know About NetworkManager Part 1: Configuration

Friday, April 30th, 2010

It's a D-Bus Party!

A tale of two services…

The “settings service” is a core concept of NetworkManager.  There are two settings services: the system settings service and the user settings service.  These are just D-Bus services that provide stored network configuration to NM and apps like nm-connection-editor, nm-applet, knetworkmanager, and anything else.  The job of a settings service is to store configuration in some manner (GConf, KConfig, keyfiles, ifcfg, /etc/network/interfaces, whatever) and translate that into a format apps understand.  That’s it.

Why are there two of them?  Well, mainly because you don’t want every connection usable by everyone.  Do you want your kids starting your work VPN tunnel to your secret CIA front-company?  Or your metered 3G to watch online cartoons?  Probably not.  Those connections get stored in your user settings service where only you can use them.  But connections that anyone can use, like your home WiFi or ethernet, should be system connections and thus available to everyone.

What uses these services?

First, NetworkManager uses them to get the list of networks which you’ve connected to.  So it can reconnect you to them.  That’s pretty fundamental.  When you connect to a new network, the settings service (usually nm-applet or knetworkmanager) creates a new connection config and sends that to NM, which then connects you.

Second, any application that wants to know about network configuration can.  Note that they cannot read your passwords unless you allow them to via PolicyKit; there’s a good amount of security built into the system to make sure your passwords aren’t discovered and sold off by Nigerian hackers.  nm-connection-editor lets you edit this list through a UI.  nmcli reads this list to show you active connections and their details in the terminal.  An application like Evolution could read the list and start pulling your work email only when you’re connected to the VPN.  The possibilities are endless.

The system settings service is special

Partially because it’s built into NetworkManager, but also because it’s privileged, the system settings service can do stuff the user settings services can’t.  First, it’s trusted because the storage it uses (ifcfg files, /etc/network/interfaces, keyfiles, etc) cannot be modified by normal users.  You have to prove yourself with PolicyKit before you can modify system settings, and in this way unprivileged users can’t mess with your network configuration.

Second, the system settings service is tasked with interpreting your normal distro config files and turning the configuration format you’re familiar with into data all apps can use.  And this is where the magic lies. In a happy rainbow-filled world, NetworkManager can take your configuration stanzas in /etc/network/interfaces or /etc/sysconfig/network-scripts/ifcfg-eth0 and apply them to your network device, and everything works just like you expect it to.  You don’t even know NetworkManager is there.  This intelligence is provided by distro-specific plugins.

Each distro should have a plugin that understands the native configuration format.  We have plugins for SUSE, Debian, Ubuntu, and Red Hat.  There’s also a generic plugin called ‘keyfile’ that writes .ini-style files to /etc/NetworkManager/system-connections and can be used as a backup if any of the plugins you enable are incapable of saving configuration.  Plugins get enabled through the NetworkManager config file, one of /etc/NetworkManager/NetworkManager.conf (the new name) or /etc/NetworkManager/nm-system-settings.conf (the old name).  And you can stack plugins; since the ‘ifupdown’ (Debian/Ubuntu) plugin can’t write out any configuration yet, adding the ‘keyfile’ plugin allows changed connections to be saved as keyfiles instead.
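As a rough illustration of plugin stacking, here's a small Python sketch that reads the plugin list out of a config file's text. The sample contents reflect my understanding of the NetworkManager.conf layout (a `[main]` section with a comma-separated `plugins` key); treat that layout and the helper name `enabled_plugins` as assumptions and check the documentation for your version.

```python
import configparser

# Assumed layout: ifupdown reads the distro config, keyfile catches anything
# ifupdown can't write back out.
SAMPLE_CONF = """\
[main]
plugins=ifupdown,keyfile
"""


def enabled_plugins(conf_text):
    """Return the settings plugins listed in the [main] section, in order."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("main", "plugins", fallback="")
    return [p.strip() for p in raw.split(",") if p.strip()]
```

Order matters in the sense described above: the distro plugin comes first, and `keyfile` is there as the backup that can actually save changes.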

Make the Editor Your Slave

All it really wants is to love you

You don’t have to edit the config files directly unless you want to; the connection editor provides a convenient interface to all the network configuration.  But since the system settings service is privileged and writes system-wide configuration you’ll need to be authorized through PolicyKit to change it.  Look for /usr/share/polkit-1/actions/org.freedesktop.network-manager-settings.system.policy to find out which privileges there are and what the default access level is.  Read up on PolicyKit to find out how to customize the privileges for your installation or your organization.  If you can’t change the “Available to all users” checkbox for a connection, chances are you’re not authorized or PolicyKit can’t determine who you are.  You should either fix that, or talk to your system administrator :)

So how do I talk to a settings service?

If you’re an app developer, there are three important resources at your disposal:

  • the NetworkManager setting specification, which details what the connection configuration contains and what values each member has
  • the python examples, which show how to talk to a settings service and get the information you need
  • the mailing list, which provides quick, useful help when you get stuck

Suggestions for better examples and documentation focus are greatly appreciated.  It’s not supposed to be hard.  It’s supposed to be fun to add network awareness to your apps.

Tell me more!

No.  Not yet.  Later.  I can only do so much in a week.

Qualified to Satisfy You

Thursday, April 22nd, 2010

I spent a lot of March beating the shit out of ModemManager.  When the beatdown was done and the dust settled, out came a boatload of stuff people want.  It ain’t gonna win design awards, but your wildest 3G fantasies just got rocked. Hard.

Access Technology

Access wha? Wanna see how fast you be cruisin’ those intarwebs fool?  That’s why you want some “access technology” dropped on your desktop, right up on your 32px hot-pink GNOME panel.  Ask and you get it delivered.  Most modems can tell whether they’re connected to the tower using EDGE, UMTS, HSxPA, 1x, EVDO, or whatever, and now you see it too.  It’s a great way to figure out just how bad your provider’s network buildout is.  Yeah that means you, T-Mobile USA.

Now if only I had some EVDO...

AT&T for the 3G Win

Hope you like EDGE, 'cause with TMO that's about all you get

Avoiding the Roaming Shaft

You work hard for the cash money.  You don’t want some punk roaming network up and grabbing the bills straight outta your pocket.  So if you check the magic button, ModemManager won’t connect to a roaming network.  If you’re on the home network and get handed off to a roaming one, ModemManager will kill the connection.  Dead.
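The roaming rule boils down to a tiny decision table. Here's a Python sketch of it; this is not ModemManager's code, and the function names, the `block_roaming` flag, and the "home"/"roaming" state labels are assumptions made for the example.

```python
def may_connect(block_roaming, registration):
    """Return True if a data connection is allowed in this registration state.

    registration is "home" or "roaming"; anything else (searching, denied,
    unknown) is treated as not connectable.
    """
    if registration == "home":
        return True
    if registration == "roaming":
        return not block_roaming
    return False


def on_registration_change(connected, block_roaming, registration):
    """Pick an action when the modem reports a new registration state."""
    if connected and not may_connect(block_roaming, registration):
        # Handed off from the home network to a roaming one: kill it.  Dead.
        return "disconnect"
    return "no-op"
```

So `on_registration_change(True, True, "roaming")` returns `"disconnect"`, while the same handoff with the button unchecked leaves the connection alone.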

Just push the button

If you feel like you’re getting shafted, look for the roaming badge on the applet icon, or check the menu before you connect. Kick roaming in the ass or whatever you feel like doing.  Maybe you like roaming too much.  But at least now you have choice.

Roaming is fun for your wallet

You Got a Preference

Do you hate 3G?  Do you just loooove the 2G action?  Fine, have it your way.  Choose your mode preference.  But remember: every time you pick GPRS God kills a kitten.  Think of the kittens.

Don't. Kill. Kittens.

Slave to the Provider Info

Long ago Antti Kaijanmäki started the mobile broadband provider database to build up a free, open, easily usable list of mobile broadband provider details.  Besides being the core of the mobile broadband wizard, it’s now used for grabbing the provider name when we can’t get it elsewhere.

For example, every CDMA 1x base station broadcasts a System ID, and we look this SID up in the provider info and pull out the provider name.  And on the GSM side, if the card is stupid (or your provider didn’t set the SIM correctly) then sometimes the alpha form of the PLMN will be missing, but the MCC/MNC won’t be, and we look that up in the provider database to get a pretty network name too.
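In Python terms the lookup is about this simple. The dict below is a stand-in for the provider database, whose real file format isn't shown here, and every entry in it is a placeholder, not real carrier data; the function name and arguments are likewise invented for the sketch.

```python
# Placeholder data only: the SID, the MCC/MNC pair, and both names are made up
# for illustration and do not describe any real provider.
PLACEHOLDER_DB = {
    "cdma_sids": {12345: "Example CDMA Carrier"},
    "gsm_plmns": {("001", "01"): "Example GSM Carrier"},
}


def provider_name(db, sid=None, mcc=None, mnc=None, fallback=None):
    """Look up a display name by CDMA SID, or by GSM MCC/MNC.

    Returns fallback when nothing matches, so the caller can still show
    whatever the device itself reported.
    """
    if sid is not None and sid in db["cdma_sids"]:
        return db["cdma_sids"][sid]
    if mcc is not None and mnc is not None:
        return db["gsm_plmns"].get((mcc, mnc), fallback)
    return fallback
```

MCC and MNC are kept as strings so leading zeros survive.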

So that’s a wrap for what will be ModemManager 0.4 in a few weeks.  But if you want a preview, check out Fedora 12, 13, or rawhide.

Mobile Broadband and Qualcomm Proprietary Protocols

Thursday, April 15th, 2010

NO PROTOCOL FOR YOU (via bassclarinetist)

There are two major mobile broadband technology families: GSM/UMTS (which three quarters of the world uses) and CDMA/EVDO (used by the rest).  Keep in mind that UMTS uses CDMA as the radio technology, but incompatibly from CDMA/EVDO.

Back to School

GSM is a TDMA (Time Division Multiple Access) technology; communication is divided into a number of slots in which specific devices talk.  Each slot contains voice, data, or signalling information.  When it’s not your turn, you can’t talk.  Pretty simple, but given that it’s a TDMA technology, it’s prone to multipath interference and hard capacity limits.  You also have to carefully plan out your cell layout to ensure that adjacent towers don’t use the same frequency.

CDMA, on the other hand, is an ingenious spread-spectrum technology.  It’s got a great back story with movie stars and a war and stuff.  In contrast to GSM, in a CDMA system every user talks at the same time.  Each user is given a unique sequence of zeros and ones called a “spreading code” which is used to modulate the data stream over a certain frequency range (hence the spread-spectrum part).  On the receive side, when you know a user’s spreading code you apply it to the RF signal and retrieve the original data.  Each user in the cell just sees every other user’s signal as slightly increased background noise.  This is why CDMA is extremely robust against snooping and multipath interference, and why its capacity gracefully degrades as cell utilization increases.
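The spreading-code trick is easier to believe once you see it run. This Python sketch has two users transmit at the same time with two short orthogonal codes, then recovers each user's bits from the combined signal. It is a toy: the 4-chip codes and function names are chosen for illustration, and real systems use far longer codes plus all the RF machinery left out here.

```python
# Two orthogonal 4-chip spreading codes: their dot product is zero.
CODE_A = [1, -1, 1, -1]
CODE_B = [1, 1, -1, -1]


def spread(bits, code):
    """Map bits {0, 1} to symbols {-1, +1} and multiply each by the code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def combine(*signals):
    """Everyone talks at once: the air just adds the signals up."""
    return [sum(chips) for chips in zip(*signals)]


def despread(signal, code):
    """Correlate the combined signal with one user's code to recover bits."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits
```

With these idealized orthogonal codes the other user cancels out exactly. With longer pseudo-random codes the cancellation is only approximate, and the leftover is the slightly increased background noise described above.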

What about Qualcomm?

Qualcomm holds many of the patents on CDMA since they spent a ton of time and money turning CDMA into a viable cellular radio technology 20 years ago.  They are also one of the largest sellers of cellular chipsets in the world.  We as open-source developers have to care, because their stuff shows up in tons of the devices we support.  Users don’t like being told “no”.

Most mobile broadband devices (Qualcomm’s included) appear as USB interfaces providing two or more serial ports.  One port is usually AT-command capable.  If you’re lucky, you get a secondary AT-capable port to use for signal quality and status while the primary port is using PPP for data transmission.  Most GSM/UMTS modems have a second AT port.  Most CDMA modems do not.

So when your device only has one AT-capable port, what language do the other ports speak?

Proprietary Protocol #1: QMI

This protocol is found on newer Qualcomm chipsets like the MSM7k series that show up in Android handsets and Qualcomm Gobi data cards.  Google exposed some of the QMI protocol in the Android drivers.  Other details have recently turned up through the Gobi Linux driver sources, though Qualcomm doesn’t distribute sources for the “QCQMI DLKM” that probably contains the protocol mechanics.  It shouldn’t be too hard to reverse-engineer most of the protocol given these sources and a USB sniffer, but nobody has had the time yet.  QMI uses an HDLC-type framing which is quite common in proprietary mobile broadband protocols: a CRC-16 and 0x7E terminates a frame, and the frame is escaped such that 0x7E doesn’t show up in the data.  But since we haven’t reverse-engineered QMI yet, it isn’t the main focus of this post.
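For the curious, here's what HDLC-type framing generally looks like, as a Python sketch. Only the CRC-16 plus 0x7E terminator comes from the description above; the escape byte (0x7D, XOR 0x20), the CRC variant (CRC-16/X-25), and the little-endian CRC placement are assumptions borrowed from the common PPP-style async framing, not confirmed details of any Qualcomm protocol.

```python
FLAG = 0x7E        # frame terminator (from the text)
ESCAPE = 0x7D      # assumption: PPP-style escape byte
ESCAPE_XOR = 0x20  # assumption: escaped bytes are XORed with 0x20


def crc16_x25(data):
    """CRC-16/X-25: reflected polynomial 0x8408, init and final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def encode_frame(payload):
    """Append the CRC, escape FLAG/ESCAPE bytes, and terminate with FLAG."""
    body = payload + crc16_x25(payload).to_bytes(2, "little")
    out = bytearray()
    for byte in body:
        if byte in (FLAG, ESCAPE):
            out += bytes([ESCAPE, byte ^ ESCAPE_XOR])
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)


def decode_frame(frame):
    """Undo encode_frame(); raise ValueError on a malformed frame."""
    if not frame or frame[-1] != FLAG:
        raise ValueError("missing frame terminator")
    body = bytearray()
    escaped = False
    for byte in frame[:-1]:
        if escaped:
            body.append(byte ^ ESCAPE_XOR)
            escaped = False
        elif byte == ESCAPE:
            escaped = True
        else:
            body.append(byte)
    if escaped or len(body) < 2:
        raise ValueError("truncated frame")
    payload, crc = bytes(body[:-2]), int.from_bytes(body[-2:], "little")
    if crc16_x25(payload) != crc:
        raise ValueError("bad CRC")
    return payload
```

Because the CRC is escaped along with the payload, 0x7E can only ever appear as the last byte of an encoded frame, which is what makes it usable as a terminator.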

Proprietary Protocol #2: DM

Diagnostic Monitor is an older protocol found in most Qualcomm devices.  I’ve been interested in QCDM for a while, since without it, you can’t get signal strength and status from most CDMA devices while connected.  So I’ve been trawling the web for the past couple years looking for anything related to QCDM, and I finally hit the jackpot last fall:  the GPL sources for the Sprint-branded Linksys WRT54G3G-V2 router, which have since disappeared.  They include a GPL-licensed tool called ‘nvtlstatus’ which implements various pieces of the QCDM protocol.  The code is complete junk (as you’d expect from many embedded device manufacturers with schedules to hit) but it worked.

There’s also a sketchy Chinese package called “CDMA_Test.rar” that includes lists of the NVRAM items and some of the DM command numbers.  While not GPL, we can use the command numbering and structure definitions because it falls under the phonebook and interoperability copyright exceptions.  Additionally, there’s the TCL-based (ick) “RTManager” tool that implements some interesting QCDM commands, which, while we can’t use any of the code, is useful for structure field names that I hadn’t already guessed. Finally, some guy did some reverse engineering of Novatel devices on Windows and built up a list of commands, subsystems, and NVRAM locations that were useful for confirming what I found in the other sources.

So through a combination of reverse engineering and these sources I wrote libqcdm, which we now use extensively in ModemManager for controlling CDMA devices.

DM Commands

Since DM is a pretty old protocol (2000 and possibly earlier), many of the commands are purely historical and currently unused.  The most interesting ones are:

  • DIAG_CMD_VERSION_INFO: grabs firmware build dates and version information
  • DIAG_CMD_ESN: grabs the CDMA device’s ESN, which is essentially the IMEI of a CDMA device
  • DIAG_CMD_NV_READ and DIAG_CMD_NV_WRITE: NVRAM read/write commands, see below
  • DIAG_CMD_SUBSYS: subsystem commands; see below
  • DIAG_CMD_STATUS_SNAPSHOT: gives information about the current state and registration of the device on the CDMA 1x network

But given that many aren’t really used anymore, Qualcomm started running out of command IDs a long time ago…

Subsystems

So Qualcomm used command 75 (DIAG_CMD_SUBSYS) to extend the number of available commands; this command takes a subsystem selector and a subsystem command ID, thus getting around the original 8-bit command ID limitation.
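
As a minimal sketch of the idea, a subsystem request starts with the command byte 75, followed by the subsystem selector and the subsystem command ID.  The field widths used here (a one-byte subsystem selector and a two-byte little-endian command ID) are assumptions for illustration, and `build_subsys_header` is a hypothetical helper, not a libqcdm function.

```python
import struct

DIAG_CMD_SUBSYS = 75  # command number from the text


def build_subsys_header(subsys_id: int, subsys_cmd: int) -> bytes:
    """Pack a subsystem request header.

    ASSUMED layout: 1-byte command, 1-byte subsystem selector,
    2-byte little-endian subsystem command ID.
    """
    if not 0 <= subsys_id <= 0xFF:
        raise ValueError("subsystem selector must fit in one byte")
    if not 0 <= subsys_cmd <= 0xFFFF:
        raise ValueError("subsystem command ID must fit in two bytes")
    return struct.pack("<BBH", DIAG_CMD_SUBSYS, subsys_id, subsys_cmd)


# Hypothetical subsystem 0x05, command 0x0102 (placeholder values only).
header = build_subsys_header(0x05, 0x0102)
```

Under those assumptions a single top-level command fans out into tens of thousands of subsystem commands per subsystem, which is the whole point of the trick.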

There are a number of standard subsystems (Call Manager, HDR Manager, WCDMA, GSM, GPS, etc) but each manufacturer generally implements their own subsystem too.  In this way QCDM isn’t that different from AT commands; while supposedly standardized, each manufacturer inevitably implements a bunch of proprietary commands for their own device because the specs simply don’t cover everything.  This just makes our life harder.

The currently identified subsystems are:

  • Call Manager: the most important command here reports the general state of the device, including the registered SID/NID, the terminal state (online/offline), the network mode (2G/3G), and various preferences that control which network the mobile registers with.  This is what we use to determine online/offline mode for CDMA devices since there aren’t any “standard” AT commands we can use to detect both 1x and EVDO registration.  Other commands start and end voice or data calls.
  • HDR (High Data Rate, ie EVDO): the most important command here provides EVDO state, which is mostly taken from the state machines specified in the IS-856 standard.  This lets us figure out if the modem is registered on the EVDO network or the CDMA 1x network.
  • Novatel: only implemented on Novatel Wireless devices, obviously.  But it provides access to a lot of stuff we want: the Extended Roaming Indicator (ERI) which shows detailed roaming state, the current access-technology the device is using (AMPS, digital, IS-95, CDMA 1x, EVDO r0, EVDO rA, etc), the voice mail and SMS indicators, and more.
  • ZTE: for ZTE devices, obviously. I actually did reverse engineer this one using a ZTE AC2726 kindly provided by Huzaifas S. from Red Hat India.  All we’ve got so far is the signal strength; the other fields of the command are unknown.

There are also GSM and WCDMA subsystems used with Qualcomm UMTS chipsets, but since most UMTS devices have multiple AT-capable ports we’re less interested in using QCDM there.

NVRAM Locations

Each device has a number of NVRAM locations in which it stores various parameters like mode preference, roaming, home networks, radio parameters, and a whole bunch of other stuff.  Not all devices implement every location.  There are a couple thousand of them, but I’ve only included the locations that we actually use in libqcdm.  The ones we currently use are:

  • DIAG_NV_MODE_PREF: sets the mode preference: analog (ie AMPS), digital (TDMA), CDMA 1x, or EVDO (HDR)
  • DIAG_NV_DIR_NUMBER: retrieves your Mobile Directory Number (MDN), aka your phone #
  • DIAG_NV_ROAM_PREF: controls whether your device will roam on a partner network or not

The values each contains took a bit of time to reverse-engineer using the Sprint connection manager, 3 different Sprint CDMA cards, and some USB traces, but now we’ve got the important parts.

Pulling It All Together

Earlier this year we had a number of bugs from Russian, Indian, and Czech Fedora users where ModemManager simply wouldn’t connect.  MM is pretty clever (a good thing) but the IS-707 AT commands aren’t useful enough to tell us what we need (not good).  The IS-707 standard AT+CAD? and AT+CSS commands really apply to the CDMA 1x network, not the EVDO network, and all these users had EVDO-only plans.  So when ModemManager checked AT+CSS and found that the device wasn’t registered, we sat around polling the registration state for a while.  The modem was already registered on the EVDO network, but not on a CDMA 1x network; of course AT+CSS doesn’t tell us that so MM got it wrong.

The real fix was to utilize QCDM and ask the Call Manager whether the modem was online or not, and if so, whether it had a 1x or an EVDO connection.  Sounds simple, but it took a lot of work to get there.
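
The difference between the two approaches can be sketched roughly like this.  The `CallManagerState` fields and both function names are hypothetical stand-ins for whatever the Call Manager status command actually returns; this is an illustration of the decision, not ModemManager’s code.

```python
from dataclasses import dataclass


@dataclass
class CallManagerState:
    """Hypothetical summary of a QCDM Call Manager status reply."""
    online: bool    # terminal state: online vs. offline
    has_1x: bool    # registered on a CDMA 1x network
    has_evdo: bool  # registered on an EVDO network


def registered_via_at(css_registered: bool) -> bool:
    # AT+CSS only reflects CDMA 1x registration, so this is all it can say.
    return css_registered


def registered_via_qcdm(css_registered: bool, cm: CallManagerState) -> bool:
    # Trust AT+CSS when it says yes; otherwise ask the Call Manager,
    # which also knows about EVDO registration.
    if css_registered:
        return True
    return cm.online and (cm.has_1x or cm.has_evdo)


# An EVDO-only plan: AT+CSS reports "not registered", the Call Manager disagrees.
evdo_only = CallManagerState(online=True, has_1x=False, has_evdo=True)
```

With the EVDO-only state above, the AT-only check keeps waiting for a registration that never shows up, while the QCDM-aware check sees the modem is already online.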

Next, since most CDMA devices only expose one AT-capable port, we need a way to get signal strength from the device while it’s connected and the primary port is talking PPP.  I’ll cover that in another blog post; stay tuned.  We still don’t have a good way to figure out which EVDO revision (either 0 or A) we’re using, nor can we get a reliable roaming indicator yet.

All of this is built in Fedora 12, 13, and rawhide if you’d like to take it for a spin.

The Kernel Side

Many devices provide the AT port via the standard CDC-ACM serial mechanism, which is picked up automatically by the kernel drivers.  But their QCDM-capable ports are only exposed via vendor-specific USB interfaces, so I created the qcaux driver to handle these ports; it’s in the 2.6.34 kernel.  With qcaux.ko and a recent version of ModemManager stuff will Just Work.

Why You Care

First a big shout to Qualcomm for keeping this shit secret.  NOT.  Double-plus-shout-out for keeping QMI secret; it’s a pretty simple protocol and there’s not much there worth keeping under wraps.  It might be nice to let open-source developers actually talk to your hardware.

With that out of the way, you care because we now have better support for a whole bunch of mobile broadband devices.  We even have support for CDMA signal strength while connected for the vast majority of CDMA devices that only expose one AT port.  I’ll talk about that later, since it’s quite an interesting story.

Why Sierra Wireless Rocks and Qualcomm Doesn’t

Buy Sierra stuff.  It’s top quality and they actually care about open-source, unlike Qualcomm’s mobile broadband division.  Last year I initiated a dialogue with Sierra about releasing some details of their proprietary Command and Status (CnS) protocol.  Being able to talk CnS to their modems gets us a lot that AT commands and even QCDM don’t provide, like roaming indicator, access technology, and RSSI.

And guess what?  They actually listened, did the work, and put the documentation under a Creative Commons license too.  I hear it’ll show up soon on their support site if it’s not there already (document #2131024, “CDMA 1xEV-DO CnS Reference”).

Sierra rocks.  Now if only Qualcomm would do it too…

Few Surprised at New Evidence of Staging Driver Suckage

Thursday, November 19th, 2009

Thomas Johnson (High School Janitor)

Oh yeah, I’ve seen that code.  It’s worse than what I clean up in the bathrooms after Prom or Homecoming.  The kids get high and drunk and party too hard and puke all over the place.  I deal with enough vomit from 7:30 to 6; I wouldn’t touch the staging drivers with a mop twice as long as the one I have at work.

Just Say No

Thomas just found out that none of the “staging” wifi drivers will work with hidden access points because they don’t set the IW_SCAN_CAPA_ESSID capability bit.  Furthermore, the most popular “staging” drivers (for the Ralink hardware used in many netbooks) don’t even have specific SSID scanning capability at all.

Why do you care?  Hidden APs don’t broadcast their network ID, which misinformed people think is more secure (hint: it’s not).  Before a driver can associate to the network, it needs to discover available APs and capabilities, which requires a probe-request, which exposes the network ID to everyone anyway.  But that requires driver support which none of the staging drivers have.

I fixed this issue upstream two years ago by adding IW_SCAN_CAPA_ESSID to Wireless Extensions.  Of course the staging WiFi drivers that many distros enable never got fixed because the vendor it came from didn’t bother to work with the community in the first place.  And people wonder why they don’t work.
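
The capability check itself is just a bit test.  Here is a minimal sketch of how a client might use it to decide whether a hidden network is reachable; the 0x01 value is an assumption based on the Wireless Extensions header, and the function names are hypothetical.

```python
IW_SCAN_CAPA_ESSID = 0x01  # ASSUMED value, mirroring linux/wireless.h


def supports_ssid_scan(scan_capa: int) -> bool:
    """True if the driver advertises scanning for a specific SSID."""
    return bool(scan_capa & IW_SCAN_CAPA_ESSID)


def scan_plan(scan_capa: int, ssid: str, hidden: bool) -> str:
    """Describe what kind of scan can find the given network."""
    if not hidden:
        return "broadcast scan"
    if supports_ssid_scan(scan_capa):
        return f"probe-request scan for '{ssid}'"
    return "cannot find hidden network with this driver"
```

A driver that never sets the bit always ends up in the last branch for hidden APs, no matter how well the rest of it works.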

Broadly speaking, staging WiFi drivers come in two flavors: (a) old dried gum from under the cafeteria table (drivers with a future), and (b) fresh vomit from the hung-over kid in your math class (those without a future).

The drivers with a future (winbond, rtl81xx) are or will be based on the kernel-standard mac80211 wireless stack, which implements the 802.11 WiFi specification in the kernel.  Since they use the standard mac80211 stack, they get all these nice features like probe-scanning and the correct capability bits for free.  All you have to do is work on supporting the hardware itself.

The drivers without a future (rt2860, rt2870, rt3070, rt3090, wlan-ng, vt665x) are based on forks of the ancient ieee80211 stack that Intel’s ipw2x00 drivers forked from the hostap driver.  Each of these drivers includes its own copy of the core ieee80211 stack, forked at different times and with different hacks.  When a bug shows up, that means 4x the work, and 4x the chance for the fix to slip through the cracks.  Which is why these drivers have no future.  They are a maintenance nightmare.  Besides, they have crap like this:

pAdapter->StaCfg.bScanReqIsFromWebUI = TRUE;

It just blows my mind why people think staging wifi drivers are a great idea.  There’s a reason staging drivers set the TAINT_CRAP flag in your kernel: that’s literally what they are.

So what’s the right thing to do?

There’s one huge reason why dead-end staging drivers are a bad idea: there aren’t enough developers.  So do you spend the effort you do have on maintaining unmaintainable shit code?  Or do you spend it on fixing the code that has a future?  Most of the time you can’t do both.

If you choose to maintain the staging drivers, then things become worse over time since the staging code is simply less tested and less maintainable.  So you continue to drop hacks and fixes onto an ever-growing steaming pile of manure.  Nobody cares much about the driver (because it doesn’t use the standard kernel interfaces and thus doesn’t have a future), so your staging driver never benefits from all the great feature work and bug fixing that the mac80211 and wireless developers are doing.

But if you choose to help fix the upstream drivers that do use mac80211 (like rt2x00), and thus have a future, maybe for a few months some users won’t have great wireless.  But they didn’t before either.  But then 6 months later, all the users get great wireless with features like power saving, background scanning, WiFi Direct, Bluetooth 3, access-point mode, etc.  Those things will never be done to the staging drivers, because those drivers are a dead-end maintenance nightmare, because their code is awful, and because they don’t use the standard kernel wireless stack.

I know I’d invest the effort where it helps users the most, even if it means a few more months of subpar driver support while the official upstream drivers get fixed and the staging drivers go untouched.  That’s how things actually get better when you can’t fix everything at once.