June 2011 – Dan Williams’ blog

The Incredible Magical Pantech UML290

The Basics

Along with the LG VL600 this modem was the launch device for the Verizon 4G LTE network late last year. Despite being quite large (over twice the size of a normal 3G modem) it’s not a bad device and performs quite well in speed tests. Inside is a Qualcomm MDM9600 chipset providing both CDMA 1xRTT and EVDO on the standard North American 850 MHz Cellular and 1900 MHz PCS bands, and LTE on Verizon’s Upper 700 MHz C-block band. This device cannot roam internationally.

Linux Support

The UML290 exposes four USB interfaces: a standard CDC-ACM AT command port which supports PPP, a QCDM port, a WMC port, and a raw IP network port. Of these, only the AT command and the QCDM ports are really usable in Linux. You can connect to the LTE network using standard ETSI 27.007 GSM-style AT commands like AT+CGDCONT and ATD#99* and such. Connections to the 3G EVDO network can be made with the standard ATD#777 command. Unfortunately, the PPP functionality does not support data connection handoff between the EVDO and LTE networks, so you have to break the connection and reconnect with the appropriate ATD command when necessary. Why is that?

To allow seamless operation between the EVDO and LTE networks Verizon upgraded parts of their core network to eHRPD. HRPD (High Rate Packet Data) is the new name for HDR (High Data Rate) which was the old name for the IS-856 standard developed by Qualcomm ten years ago for high speed 3G packet data. EVDO (Evolution Data Only) is just the marketing name for all that. eHRPD stands for “evolved” or “enhanced” HRPD and essentially drops in pieces of the LTE core network modified to work with older EVDO protocols. Normally your device uses the eHRPD protocol when starting a data session since both the network and the modem support it. But when you use traditional CDMA PPP via ATD#777 the session is between pppd on your computer and the packet data gateway in the network, in contrast to GSM/WCDMA/LTE where the PPP session is only between pppd and the modem itself, not over the air. My theory here is that to maintain backwards compatibility or for some other reason, PPP data sessions using ATD#777 only allow HRPD, and thus handoffs between EVDO and LTE don’t work because the LTE side doesn’t like the older HRPD.

This leads to the problem where you, as the user, have to poke values into the NV_HDRSCP_FORCE_AT_CONFIG_I NVRAM item to manually switch between HRPD and eHRPD just to get connected. Why does this matter? Because the only way to connect to the EVDO network on Linux is with a direct PPP data session using ATD#777. That sucks.

All Hail WMC (wait, what?)

Hardware often makes me want to dress all in black, sit at the end of the bar, drink, and cry. Often Matthew Garrett is right there with me so at least I have company on my trip to black, black oblivion. The hope is that talking to the UML290 on the WMC port and using the modem’s native network interface makes this stupid handoff problem just go away because the modem firmware takes care of the data session protocols and handoffs when you’re not using direct PPP. But that means that we need to reverse engineer both the WMC protocol and the network interface. I’ll drink to that.

It turns out the network interface appears to just be passing raw IP packets over USB. At least that’s what the Windows USB traces tell me unless I’ve had to much Jacky D in which case they just look like Care Bears and rainbows. Qualcomm posted some driver patches for the “smd_rmnet” driver for Android devices that describe a “raw IP” mode for RMNET interfaces that lead me to believe I’m on the right track here. We’ll see.

The WMC bits are the best part though. This Pantech-specific (as far as I can tell) protocol that has been around at least since 2005 since I’ve got an Audiovox PC5740 that uses it and a Pantech PX-500 on Sprint that looks similar yet different. WMC is just another binary protocol; essentially encoding structs on the wire but with a bunch of stupid at the front and some idiot at the end. It’s got a frame start marker of 0xC8, except when there’s more shit at the front. It’s got a frame terminator of 0x7E, except when it doesn’t. It gets HDLC escaped, except when even control characters get escaped instead of just the escape characters. It’s got standard command numbers, except when it doesn’t.

The basic WMC frame starts with 0xC8. The PC5740 and the PX-500 both accept plain WMC requests like this. The UML290 on the other hand uses just about the most convoluted format I can think of. I’d really love to know why. I hope there’s a good reason. Instead the Verizon connection manager sends the WMC packet prefixed with “AT*WMC=”, then 0xC8, and then a bunch of binary data. And not only are the HDLC escape characters escaped, all control characters under 0x20 are escaped too. Even better, the request terminates with a 0x0D instead of the standard 0x7E. So you end up with something looking like this:

41542a574d433dc87d2a87b80d

and when all the framing and shit is removed, it comes down to a single byte: 0x0A. That’s it. Really. Why is this so hard? It’s USB for crying out loud. We’re not on serial links anymore where if somebody picks up the telephone downstairs you get a bunch of garbage in your XMODEM transfer.

It gets better. There’s a CRC-16 at the end, which is pretty standard with these sorts of binary modem protocols. Qualcomm writes the original firmware for all these modems anyway and they all include a Qualcomm DIAG port which speaks a protocol using the standard HDLC framing with CRC-16 (polynomial 0x8408 and seed of 0xFFFF) and a frame terminator of 0x7E. So you’d think they’d re-use those bits. THINK AGAIN. Perhaps because they woke up one day and decided to make life hard for everyone on the planet, the Pantech engineers working on the UML290 decided to use a CRC-16 initial seed of 0xAAFE. What the fuck? Even the PC5740 and the PX-500 use a standard HDLC CRC-16 seed of 0xFFFF like just about everything else on the planet.

But it gets better. The responses from the UML290 don’t bother to include a valid CRC-16; instead it’s just 0x3030. Wow, class work guys. I’m sure there’s good reason for that. Or not. At least the PC5740 and PX-500 get points for valid CRCs.

Which begs the question: why do people still use these serial protocols? Every other piece of USB-connected wireless hardware I’ve seen, from WiFi devices to WiMAX cards, don’t bother with this serial framing shit at all. Even for firmware uploads. They just push packed structs up and down the wire. USB already has a 16-bit CRC check for data packets. Let’s re-invent the wheel for no good reason just because it’s fun.

Why do mobile broadband modems have to be different? Why all the framing and escaping and general eye gouging with shards broken glass? Why duplicate what USB already does? If your modem doesn’t use USB, doesn’t that protocol already have integrity protection and error checking? Cause if it doesn’t you’re already in for a world of hurt.

As an embedded engineer you just have to wake up one morning and say “This is fucking stupid.” But I suppose that’s not something a 6-month product cycle allows. Which is why, as open-source engineers that have to talk to hardware, we tend to drink. And then cry a lot.

NetworkManager and Dual-stack Addressing

Dodge the pig! (via the|G|™ under CC BY-NC-ND 2.0)

The big reason that NetworkManager 0.9 is slower to connect than NM 0.8 is that we flipped IPv6 addressing on by default. That means that when you connect to a new network and that network supports IPv6 autoconfiguration via router advertisements you’ll get IPv6 connectivity. But if that network doesn’t support IPv6 then you’ll spin for 60 seconds or so waiting for a router advertisement because there’s nothing on the network that listens to the IPv6 autoconf solicitations that the kernel puts out when the link comes up. You can fix that but changing the IPv6 addressing method to “Ignore” in nm-connection-editor if you know your network doesn’t support IPv6.

Why don’t we bring up IPv4 and just wait for IPv6 to happen in the background? That’s a great question; I’m glad I asked it. First, it requires some small changes in NetworkManager’s D-Bus interface to add connected states for both IPv4 and IPv6 simultaneously so that applications can listen for when each stack’s connectivity is available. That’s trivial. It could be done tomorrow. It’s not a technical problem at all.

But second, it requires applications to be smarter about what resources they require and to do smart things when those resources aren’t available. And that apparently happens when solid gold pigs start dropping out of the sky. I hope you have falling-gold-pig insurance for your car. But app authors often don’t make their applications smarter and more network aware because hey, that’s more work for them, and hey, people haven’t requested this yet, and hey, that’s one more D-Bus API I need to depend on and I don’t know what else.

NetworkManager says it’s connected via a global “State” property. That property is a logical OR of both IPv4 and IPv6 connectivity. If one is connected then the State property is NM_STATE_CONNECTED. Great, right? But if NM flips the state to CONNECTED when IPv4 completes but IPv6 is still waiting, then your favorite IRC application will try to connect to your IPv6-enabled IRC server. Except IPv6 isn’t up yet so it fails. And you get mad because shit doesn’t magically work.

And then what happens if IPv6 fails? Do we fail the entire connection? Or do we just keep listening for IPv6 router advertisements and when one comes in configure the interface? Currently there’s a setting called ‘failure fatal’ for both IPv4 and IPv6 that lets you determine that behavior; it defaults to TRUE for IPv4 and FALSE for IPv6 since so many networks don’t yet have IPv6 enabled. But this really is something we shouldn’t have to care much about.

And that brings us back to applications. When NetworkManager adds dual-stack connected state, which is actually pretty trivial to do, the applications have to listen to that and care so that your life is better. If the app has an IPv6 address and NM indicates that IPv6 isn’t yet available, the app needs to wait until NM says it is available. Same for IPv4. The problem is that nobody ever seems to bother with this sort of intelligence at the application level, but that’s where it’s really needed, since the connection manager has no idea what servers you’re connecting to and whether or not they are IPv4 or IPv6.

As a side rant about application intelligence, apps should also allow you to associate resources (like internal VPN-only mail servers) with NetworkManager VPN connection UUIDs so that they only check the mail on your corporate VPN when NM says your VPN connection is up. You can do that now. It’s been there for years. But nobody bothers to write that sort of useful support into applications either. Where does the application’s responsibility for intelligence begin? Useful insights on where that line gets drawn are most welcome. So are comments about how hot Colin Walter’s mom is.

gnote performance

I’ve been using gnote as my daily job status tool for a few years now, and it’s great. I love it. I have 900+ notes. But every day when I create a new note it hangs for 10 seconds, and again after typing the note’s title and hitting return. This machine isn’t slow (Core 2 Duo 1.86GHz) so it’s got to be gnote.

So we fire up sysprof. And for both operations (creating a new note, changing the title) we find the culprit to be the add_keyword() function, called from gnote::TrieController::update(). It appears to be mostly add_match_at_state() checking for equality of something. Full sysprof data available upon request.

I like gnote a lot; this is a minor annoyance but one I hit every day. If anyone optimizes this I will owe you something, and I’m a great person to have owing you something.