The world only needs a few jackasses and I’d like to think I’m not one of them. So instead of being a jackass and making fun of people who bought the wrong hardware, tonight I’m going to throw a bone to everyone who mistakenly bought a Broadcom WiFi card thinking that Broadcom cares about open-source and that any bugs you had with their binary driver would be fixed in a timely manner.
In a great example of how WEXT is underspecified, the frequency returned from SIOCGIWFREQ has been interpreted to mean one of two things depending on the driver you have. Some drivers report the associated channel, while others report the tuned channel. Of course during a scan the card tunes to a bunch of different channels. So when you hit up SIOCGIWFREQ you have no idea what the card is going to report.
Some configurations use the same BSSID/SSID combination on different bands. Thus we need to know what the associated frequency is so we can match up the exact AP the card’s talking to with an entry in the scan list. Otherwise the scan list doesn’t represent any sort of reality, and that’s not a good thing. If the card reports the tuned frequency when it’s background scanning or finding a better roaming candidate then the match will fail.
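To make the failure concrete, here is a minimal sketch of the matching step in Python. The `ScanEntry` type and the `find_associated_ap` helper are hypothetical names made up for illustration, not NetworkManager's actual code, and the BSSID and frequencies are example values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScanEntry:
    """One AP from the scan list (hypothetical structure)."""
    bssid: str
    ssid: str
    freq_mhz: int


def find_associated_ap(scan_list: List[ScanEntry], bssid: str, ssid: str,
                       freq_mhz: int) -> Optional[ScanEntry]:
    """Return the scan entry matching BSSID, SSID, and frequency, or None."""
    for entry in scan_list:
        if (entry.bssid, entry.ssid, entry.freq_mhz) == (bssid, ssid, freq_mhz):
            return entry
    return None


# The same BSSID/SSID pair advertised on both bands.
scan_list = [
    ScanEntry("00:11:22:33:44:55", "cakehole", 2437),
    ScanEntry("00:11:22:33:44:55", "cakehole", 5180),
]

# Driver reports the associated frequency: the match succeeds.
associated = find_associated_ap(scan_list, "00:11:22:33:44:55", "cakehole", 5180)

# Driver reports whatever channel a background scan is tuned to: no match.
mid_scan = find_associated_ap(scan_list, "00:11:22:33:44:55", "cakehole", 2462)
```

With BSSID and SSID identical on both bands, the frequency is the only thing that tells the two entries apart, so a driver that reports the tuned channel makes the lookup come back empty.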
Tossing the Bone
What’s the only thing more common than a dual-band single BSSID/SSID network configuration? If you guessed “drivers which make talking to that network hard” then you win a big wet donkey kiss from an ugly goddamn donkey. So in complete violation of my Fix the Stupid Drivers Instead of Hacking Around Them policy I’ve checked a fix into NetworkManager that handles this situation better. If you ever saw:
NetworkManager: <info> (wlan0): roamed from BSSID 11:22:33:44:66 (cakehole) to (none) ((none))
then I just fixed 15% of your problem. You’re welcome. The other 85% is your proprietary driver. The real fix for this is to use the much more capable nl80211/cfg80211 kernel interfaces instead of WEXT. That still doesn’t help all you proprietary driver users out there, because Broadcom pretty much ignores upstream kernel wireless advances. So next time spend another $5 and make your life easier by getting an Intel or Atheros card instead.
Every so often I get questions about D-Bus and I end up giving a mini-lesson on what D-Bus is and how it’s used. It took me a while to wrap my head around D-Bus long ago so I don’t expect everyone to pick up D-Bus like Yo-Yo Ma sight-reading. So this is D-Bus, simplified.
D-Bus is just an IPC mechanism, but it layers a few concepts on top of plain message-passing. It took me some time to understand how the D-Bus object model really works (long ago of course), so don’t worry if you don’t completely understand how it all fits together yet.
service: a program that responds to requests from clients. Each service is identified by a “bus name” which clients use to find the service and send requests to it. The bus name usually looks like org.foobar.Baz. A program can claim more than one bus name; NM claims org.freedesktop.NetworkManager and org.freedesktop.NetworkManagerSystemSettings, and each is a distinct service which provides different functionality to clients.
object: a method of organizing distinct entities, just like programming languages have objects. Each object is uniquely identified by an “object path” (somewhat like an opaque pointer) that often looks like /org/foobar/Baz/235235. Each request sent to the service must be directed to a specific object. Many services have a base object with a well-known path that you use to bootstrap your communication with the service.
interface: each request belongs to an interface, which is simply a way of logically separating different functionality. It works the same way an “interface” does in object-oriented systems like Java or C++ or GLib: a specific API that completely different objects can implement, so the caller doesn’t need to know what type the object is, just what methods the interface defines. Interface names often look like D-Bus service names, but have no relation to them.
method call: a request for an operation or information that a client sends to the service; method calls are defined by an Interface and are sent to a service’s objects.
signal: a message broadcast from a service to any listening client.
Putting it All Together
Say you have a binary called “mcdonaldsd” that provides a D-Bus service called org.fastfood.McDonalds. Clients that want to talk to this service use the bus name org.fastfood.McDonalds to direct requests to mcdonaldsd. mcdonaldsd provides a base object called /org/fastfood/McDonalds. This object implements the org.fastfood.McDonalds interface, which defines these method calls:
GetItems() returns an array of object paths representing all the things on the menu that you can order. So if you call it you’ll get something like /org/fastfood/McDonalds/Item/0, /org/fastfood/McDonalds/Item/1, and /org/fastfood/McDonalds/Item/2 in return.
Each of these returned object paths is a pointer to an object; mcdonaldsd probably even implements these as objects internally using Java or C++ or GObject or whatever. These objects are probably completely different (one may be a burger, one may be a drink, the other could be fries) but they all implement a common interface: org.fastfood.McDonalds.Item.
The org.fastfood.McDonalds.Item interface defines the following method calls:
GetType (returns either TYPE_BURGER, TYPE_DRINK, or TYPE_FRIES)
So even if you don’t know what exact type of object /org/fastfood/McDonalds/Item/0 is, you still can get a lot of information about it, enough to decide whether you want to order it or not.
Assume that item “0” is a “BigMac” and item “1” is “Coke”. These are clearly different objects, but each has a name, a calorie count, a price, and can be consumed. Next, since each item is different (even though they all implement the common org.fastfood.McDonalds.Item interface) each item will implement other interfaces that define functionality specific to that type of item.
So item “0” (BigMac) implements the org.fastfood.McDonalds.Item.Burger interface which has the following methods:
RemovePickle (nobody likes those stupid limp pickles anyway)
And item “1” (Coke) implements the org.fastfood.McDonalds.Item.Drink interface, which has drink-specific methods such as InsertStraw().
Remember, since both objects *also* implement the base org.fastfood.McDonalds.Item interface, you can use the Consume() method to consume both items. But clearly, you don’t want to include the InsertStraw() method on the generic org.fastfood.McDonalds.Item interface, because all items implement that interface, and it would be pretty funny if you tried to call InsertStraw() on the BigMac object. People would stare.
So interfaces are about logically separating method calls that have specific functionality, and thus any object that wants that functionality can implement that interface, instead of having every object type duplicate every call the interface defines.
So, with python-esque pseudocode:
# Get local proxy for the remote mcdonaldsd service
bus = get_service("org.fastfood.McDonalds")
mcds = bus.get_object("org.fastfood.McDonalds", "/org/fastfood/McDonalds")

burger_path = None
drink_path = None

# Let's read all the menu items
menu_items = mcds.GetItems()
for object_path in menu_items:
    item = bus.get_object("org.fastfood.McDonalds.Item", object_path)
    print("Item: %s price %s" % (item.GetName(), item.GetPrice()))

    # Now let's figure out what we want to order; we'll order
    # the first burger we find and the first drink we find, but
    # only one of each. We just had breakfast so we're not that
    # hungry.
    item_type = item.GetType()
    if item_type == TYPE_BURGER and burger_path is None:
        burger_path = object_path
    elif item_type == TYPE_DRINK and drink_path is None:
        drink_path = object_path

    # We've found a burger and drink on the menu, let's order them
    if burger_path and drink_path:
        break

# Did this place not get their latest deliveries or something?
if not burger_path or not drink_path:
    print("This restaurant doesn't have enough food for me.")
    raise SystemExit(1)

food = mcds.Order([burger_path, drink_path])
if len(food) != 2:
    print("Oops, not enough money or something. Need to get a job.")
    raise SystemExit(1)

# Yay, we got our order. Now we take off the damn pickle.
burger = bus.get_object("org.fastfood.McDonalds.Item.Burger", burger_path)
burger.RemovePickle()

# And we're taking this to go, so we need a lid and straw for the drink
# (PutLidOn and RemoveStraw are assumed method names)
drink = bus.get_object("org.fastfood.McDonalds.Item.Drink", drink_path)
try:
    drink.PutLidOn()
    drink.InsertStraw()
except Exception:
    print("Oops, straw already inserted!")
    # We were distracted by the smell of the burger and put the
    # straw in before we put the lid on. Oops. Take the straw
    # out, put the lid on, and then re-insert the straw
    drink.RemoveStraw()
    drink.PutLidOn()
    drink.InsertStraw()

# All ready. Now we can walk out, sit on the curb, and consume the
# burger and drink; note that even though the burger and drink proxies
# were created with D-Bus interfaces specific to their food type, we
# don't really need to create another interface just to call the
# generic Consume() method which both the burger and drink implement.
# Just give the method call the generic interface.
burger.Consume(dbus_interface="org.fastfood.McDonalds.Item")
drink.Consume(dbus_interface="org.fastfood.McDonalds.Item")
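The pseudocode above needs a real mcdonaldsd to talk to. As a rough stand-in, here is a small runnable sketch that models the same object/interface split entirely in memory. `FakeBus` and `Proxy` are hypothetical helpers, not part of any D-Bus binding, and `get_object()` takes an interface name as its first argument only because the pseudocode does; the method names are the ones mentioned in the text.

```python
ITEM = "org.fastfood.McDonalds.Item"
BURGER = ITEM + ".Burger"
DRINK = ITEM + ".Drink"

# Which method names each interface defines (names taken from the text).
INTERFACES = {
    ITEM: {"GetName", "GetType", "Consume"},
    BURGER: {"RemovePickle"},
    DRINK: {"InsertStraw"},
}


class BigMac:
    implements = {ITEM, BURGER}

    def __init__(self):
        self.has_pickle = True
        self.consumed = False

    def GetName(self):
        return "BigMac"

    def GetType(self):
        return "TYPE_BURGER"

    def Consume(self):
        self.consumed = True

    def RemovePickle(self):
        self.has_pickle = False


class Coke:
    implements = {ITEM, DRINK}

    def __init__(self):
        self.has_straw = False
        self.consumed = False

    def GetName(self):
        return "Coke"

    def GetType(self):
        return "TYPE_DRINK"

    def Consume(self):
        self.consumed = True

    def InsertStraw(self):
        self.has_straw = True


class Proxy:
    """Exposes only the methods of one interface on one object."""

    def __init__(self, obj, interface):
        if interface not in obj.implements:
            raise TypeError("object does not implement " + interface)
        self._obj = obj
        self._methods = INTERFACES[interface]

    def __getattr__(self, name):
        if name.startswith("_") or name not in self._methods:
            raise AttributeError(name)
        return getattr(self._obj, name)


class FakeBus:
    """Maps object paths to objects, like a service would."""

    def __init__(self, objects):
        self._objects = objects

    def get_object(self, interface, path):
        return Proxy(self._objects[path], interface)


big_mac, coke = BigMac(), Coke()
bus = FakeBus({
    "/org/fastfood/McDonalds/Item/0": big_mac,
    "/org/fastfood/McDonalds/Item/1": coke,
})

# Type-specific calls go through the type-specific interfaces...
bus.get_object(BURGER, "/org/fastfood/McDonalds/Item/0").RemovePickle()
bus.get_object(DRINK, "/org/fastfood/McDonalds/Item/1").InsertStraw()

# ...while Consume() works on both through the generic Item interface.
for path in ("/org/fastfood/McDonalds/Item/0", "/org/fastfood/McDonalds/Item/1"):
    bus.get_object(ITEM, path).Consume()

# Asking the BigMac for the Drink interface fails. People would stare.
try:
    bus.get_object(DRINK, "/org/fastfood/McDonalds/Item/0")
    straw_in_burger = True
except TypeError:
    straw_in_burger = False
```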
The “settings service” is a core concept of NetworkManager. There are two settings services: the system settings service and the user settings service. These are just D-Bus services that provide stored network configuration to NM and apps like nm-connection-editor, nm-applet, knetworkmanager, and anything else. The job of a settings service is to store configuration in some manner (GConf, KConfig, keyfiles, ifcfg, /etc/network/interfaces, whatever) and translate that into a format apps understand. That’s it.
Why are there two of them? Well, mainly because you don’t want every connection usable by everyone. Do you want your kids starting your work VPN tunnel to your secret CIA front-company? Or your metered 3G to watch online cartoons? Probably not. Those connections get stored in your user settings service where only you can use them. But connections that anyone can use, like your home WiFi or ethernet, should be system connections and thus available to everyone.
What uses these services?
First, NetworkManager uses them to get the list of networks which you’ve connected to. So it can reconnect you to them. That’s pretty fundamental. When you connect to a new network, the settings service (usually nm-applet or knetworkmanager) creates a new connection config and sends that to NM, which then connects you.
Second, any application that wants to know about network configuration can. Note that they cannot read your passwords unless you allow them to via PolicyKit; there’s a good amount of security built into the system to make sure your passwords aren’t discovered and sold off by Nigerian hackers. nm-connection-editor lets you edit this list through a UI. nmcli reads this list to show you active connections and their details in the terminal. An application like Evolution could read the list and start pulling your work email only when you’re connected to the VPN. The possibilities are endless.
The system settings service is special
Partially because it’s built into NetworkManager, but also because it’s privileged, the system settings service can do stuff the user settings services can’t. First, it’s trusted because the storage it uses (ifcfg files, /etc/network/interfaces, keyfiles, etc) cannot be modified by normal users. You have to prove yourself with PolicyKit before you can modify system settings, and in this way unprivileged users can’t mess with your network configuration.
Second, the system settings service is tasked with interpreting your normal distro config files and turning the configuration format you’re familiar with into data all apps can use. And this is where the magic lies. In a happy rainbow-filled world, NetworkManager can take your configuration stanzas in /etc/network/interfaces or /etc/sysconfig/network-scripts/ifcfg-eth0 and apply them to your network device, and everything works just like you expect it to. You don’t even know NetworkManager is there. This intelligence is provided by distro-specific plugins.
Each distro should have a plugin that understands the native configuration format. We have plugins for SUSE, Debian, Ubuntu, and Red Hat. There’s also a generic plugin called ‘keyfile’ that writes .ini-style files to /etc/NetworkManager/system-connections and can be used as a backup if any of the plugins you enable are incapable of saving configuration. Plugins get enabled through the NetworkManager config file, one of /etc/NetworkManager/NetworkManager.conf (the new name) or /etc/NetworkManager/nm-system-settings.conf (the old name). And you can stack plugins; since the ‘ifupdown’ (Debian/Ubuntu) plugin can’t write out any configuration yet, adding the ‘keyfile’ plugin allows changed connections to be saved as keyfiles instead.
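As a rough illustration of plugin stacking, the sketch below parses a config fragment and lists the enabled plugins in order. The `[main]` section and `plugins=` key are written from memory of the config format, so treat them as assumptions and check your own config file; `enabled_plugins` is a hypothetical helper.

```python
import configparser

# Example contents for /etc/NetworkManager/NetworkManager.conf.
# The [main] section and plugins= key are assumed, not quoted from the docs.
EXAMPLE_CONF = """
[main]
plugins=ifupdown,keyfile
"""


def enabled_plugins(conf_text):
    """Return the plugin names from the config text, in the order listed."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("main", "plugins", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


plugins = enabled_plugins(EXAMPLE_CONF)
```

Here 'ifupdown' reads the native Debian/Ubuntu configuration and 'keyfile' sits behind it to save anything the first plugin can't write out.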
Make the Editor Your Slave
You don’t have to edit the config files directly unless you want to; the connection editor provides a convenient interface to all the network configuration. But since the system settings service is privileged and writes system-wide configuration you’ll need to be authorized through PolicyKit to change it. Look for /usr/share/polkit-1/actions/org.freedesktop.network-manager-settings.system.policy to find out which privileges there are and what the default access level is. Read up on PolicyKit to find out how to customize the privileges for your installation or your organization. If you can’t change the “Available to all users” checkbox for a connection, chances are you’re not authorized or PolicyKit can’t determine who you are. You should either fix that, or talk to your system administrator.
So how do I talk to a settings service?
If you’re an app developer, there are three important resources at your disposal:
the NetworkManager setting specification, which details what the connection configuration contains and what values each member has
the python examples, which show how to talk to a settings service and get the information you need
the mailing list, which provides quick, useful help when you get stuck
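To give a feel for what talking to a settings service looks like, here is a hedged sketch. The object path, interface names, and the ListConnections()/GetSettings() methods are written from memory of the 0.8-era D-Bus API, so treat every one of them as an assumption and check the specification and the python examples before relying on them. `connection_ids`, `FakeProxy`, and `FakeBus` are hypothetical helpers; the fakes exist only so the sketch runs without a bus.

```python
SETTINGS_PATH = "/org/freedesktop/NetworkManagerSettings"
SETTINGS_IFACE = "org.freedesktop.NetworkManagerSettings"
CONNECTION_IFACE = "org.freedesktop.NetworkManagerSettings.Connection"
SYSTEM_SERVICE = "org.freedesktop.NetworkManagerSystemSettings"


def connection_ids(bus, service_name=SYSTEM_SERVICE):
    """Return the 'id' of every connection a settings service provides."""
    settings = bus.get_object(service_name, SETTINGS_PATH)
    ids = []
    for path in settings.ListConnections(dbus_interface=SETTINGS_IFACE):
        connection = bus.get_object(service_name, path)
        config = connection.GetSettings(dbus_interface=CONNECTION_IFACE)
        ids.append(str(config["connection"]["id"]))
    return ids


class FakeProxy:
    """Stand-in for a D-Bus proxy object; returns canned replies."""

    def __init__(self, replies):
        self._replies = replies

    def __getattr__(self, method):
        if method.startswith("_") or method not in self._replies:
            raise AttributeError(method)
        return lambda dbus_interface=None: self._replies[method]


class FakeBus:
    """Stand-in for a bus connection; maps object paths to fake proxies."""

    def __init__(self, objects):
        self._objects = objects

    def get_object(self, service_name, path):
        return self._objects[path]


fake_bus = FakeBus({
    SETTINGS_PATH: FakeProxy({"ListConnections": [SETTINGS_PATH + "/0"]}),
    SETTINGS_PATH + "/0": FakeProxy(
        {"GetSettings": {"connection": {"id": "Home WiFi"}}}),
})
ids = connection_ids(fake_bus)

# With dbus-python installed and NetworkManager running, the real call
# would look something like:
#   import dbus
#   print(connection_ids(dbus.SystemBus()))
```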
Suggestions for better examples and documentation focus are greatly appreciated. It’s not supposed to be hard. It’s supposed to be fun to add network awareness to your apps.
Tell me more!
No. Not yet. Later. I can only do so much in a week.
I spent a lot of March beating the shit out of ModemManager. When the beatdown was done and the dust settled, out came a boatload of stuff people want. It ain’t gonna win design awards, but your wildest 3G fantasies just got rocked. Hard.
Access wha? Wanna see how fast you be cruisin’ those intarwebs fool? That’s why you want some “access technology” dropped on your desktop, right up on your 32px hot-pink GNOME panel. Ask and you get it delivered. Most modems can tell whether they’re connected to the tower using EDGE, UMTS, HSxPA, 1x, EVDO, or whatever, and now you see it too. It’s a great way to figure out just how bad your provider’s network buildout is. Yeah that means you, T-Mobile USA.
Avoiding the Roaming Shaft
You work hard for the cash money. You don’t want some punk roaming network up and grabbing the bills straight outta your pocket. So if you check the magic button, ModemManager won’t connect to a roaming network. If you’re on the home network and get handed off to a roaming one, ModemManager will kill the connection. Dead.
If you feel like you’re getting shafted, look for the roaming badge on the applet icon, or check the menu before you connect. Kick roaming in the ass or whatever you feel like doing. Maybe you like roaming too much. But at least now you have choice.
You Got a Preference
Do you hate 3G? Do you just loooove the 2G action? Fine, have it your way. Choose your mode preference. But remember: every time you pick GPRS God kills a kitten. Think of the kittens.
Slave to the Provider Info
Long ago Antti Kaijanmäki started the mobile broadband provider database to build up a free, open, easily usable list of mobile broadband provider details. Besides being the core of the mobile broadband wizard, it’s now used for grabbing the provider name when we can’t get it elsewhere.
For example, every CDMA 1x base station broadcasts a System ID, and we look this SID up in the provider info and pull out the provider name. And on the GSM side, if the card is stupid (or your provider didn’t set the SIM correctly) then sometimes the alpha form of the PLMN will be missing, but the MCC/MNC won’t be, and we look that up in the provider database to get a pretty network name too.
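The two lookups can be sketched like this. The XML below is made-up data, and its element and attribute names are only loosely modelled on the provider database, so treat the layout as an assumption; `build_indexes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up entries; element and attribute names are assumptions.
EXAMPLE_DB = """
<serviceproviders>
  <country code="xx">
    <provider>
      <name>Example GSM Co</name>
      <gsm><network-id mcc="001" mnc="01"/></gsm>
    </provider>
    <provider>
      <name>Example CDMA Co</name>
      <cdma><sid value="12345"/></cdma>
    </provider>
  </country>
</serviceproviders>
"""


def build_indexes(xml_text):
    """Build SID -> name and (MCC, MNC) -> name lookup tables."""
    by_sid, by_plmn = {}, {}
    root = ET.fromstring(xml_text)
    for provider in root.iter("provider"):
        name = provider.findtext("name")
        for sid in provider.iter("sid"):
            by_sid[int(sid.get("value"))] = name
        for net in provider.iter("network-id"):
            by_plmn[(net.get("mcc"), net.get("mnc"))] = name
    return by_sid, by_plmn


by_sid, by_plmn = build_indexes(EXAMPLE_DB)

cdma_name = by_sid.get(12345)           # SID broadcast by the base station
gsm_name = by_plmn.get(("001", "01"))   # MCC/MNC when the alpha name is missing
unknown = by_sid.get(99999)             # not in the database
```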
So that’s a wrap for what will be ModemManager 0.4 in a few weeks. But if you want a preview, check out Fedora 12, 13, or rawhide.
There are two major mobile broadband technology families: GSM/UMTS (which three quarters of the world uses) and CDMA/EVDO (used by the rest). Keep in mind that UMTS uses CDMA as the radio technology, but incompatibly from CDMA/EVDO.
Back to School
GSM is a TDMA (Time Division Multiple Access) technology; communication is divided into a number of slots in which specific devices talk. Each slot contains voice, data, or signalling information. When it’s not your turn, you can’t talk. Pretty simple, but given that it’s a TDMA technology, it’s prone to multipath interference and hard capacity limits. You also have to carefully plan out your cell layout to ensure that adjacent towers don’t use the same frequency.
CDMA, on the other hand, is an ingenious spread-spectrum technology. It’s got a great back story with movie stars and a war and stuff. In contrast to GSM, in a CDMA system every user talks at the same time. Each user is given a unique sequence of zeros and ones called a “spreading code” which is used to modulate the data stream over a certain frequency range (hence the spread-spectrum part). On the receive side, when you know a user’s spreading code you apply it to the RF signal and retrieve the original data. Each user in the cell just sees every other user’s signal as slightly increased background noise. This is why CDMA is extremely robust against snooping and multipath interference, and why its capacity gracefully degrades as cell utilization increases.
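A toy version of the idea, assuming two users with short orthogonal spreading codes and bits written as +1/-1. Real systems use much longer codes and an actual radio, so this is only a sketch of the spread/despread arithmetic; all names here are made up.

```python
# Two orthogonal spreading codes: their element-wise products sum to zero.
CODE_A = [1, 1, -1, -1]
CODE_B = [1, -1, 1, -1]


def spread(bits, code):
    """Multiply every data bit (+1 or -1) by the whole spreading code."""
    return [bit * chip for bit in bits for chip in code]


def despread(signal, code):
    """Correlate the combined signal against one code to recover its bits."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if corr > 0 else -1)
    return bits


bits_a = [1, -1, 1]
bits_b = [-1, -1, 1]

# Both users transmit at the same time; the air just adds the signals.
on_air = [a + b for a, b in zip(spread(bits_a, CODE_A), spread(bits_b, CODE_B))]

recovered_a = despread(on_air, CODE_A)
recovered_b = despread(on_air, CODE_B)
```

Because the codes are orthogonal, the other user's contribution cancels out in the correlation. With codes that are only nearly orthogonal, it shows up as the slightly increased background noise described above.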
What about Qualcomm?
Qualcomm holds many of the patents on CDMA since they spent a ton of time and money turning CDMA into a viable cellular radio technology 20 years ago. They are also one of the largest sellers of cellular chipsets in the world. We as open-source developers have to care, because their stuff shows up in tons of the devices we support. Users don’t like being told “no”.
Most mobile broadband devices (Qualcomm’s included) appear as USB interfaces providing two or more serial ports. One port is usually AT-command capable. If you’re lucky, you get a secondary AT-capable port to use for signal quality and status while the primary port is using PPP for data transmission. Most GSM/UMTS modems have a second AT port. Most CDMA modems do not.
So when your device only has one AT-capable port, what language do the other ports speak?
Proprietary Protocol #1: QMI
This protocol is found on newer Qualcomm chipsets like the MSM7k series that show up in Android handsets and Qualcomm Gobi data cards. Google exposed some of the QMI protocol in the Android drivers. Other details have recently turned up through the Gobi Linux driver sources, though Qualcomm doesn’t distribute sources for the “QCQMI DLKM” that probably contains the protocol mechanics. It shouldn’t be too hard to reverse-engineer most of the protocol given these sources and a USB sniffer, but nobody has had the time yet. QMI uses an HDLC-type framing which is quite common in proprietary mobile broadband protocols: a CRC-16 and 0x7E terminates a frame, and the frame is escaped such that 0x7E doesn’t show up in the data. But since we haven’t reverse-engineered QMI yet, it isn’t the main focus of this post.
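For the curious, HDLC-type framing of this general shape can be sketched as follows. Only the trailing CRC-16 and the 0x7E terminator come from the description above. The CRC variant (the X.25 one), the little-endian CRC byte order, and the 0x7D/XOR-0x20 escaping are assumptions borrowed from conventional asynchronous HDLC, not something confirmed for any particular Qualcomm protocol.

```python
FLAG = 0x7E        # frame terminator (from the text)
ESCAPE = 0x7D      # assumed escape byte, as in conventional async HDLC
ESCAPE_XOR = 0x20  # assumed escape transformation


def crc16(data: bytes) -> int:
    """CRC-16, X.25 variant: reflected 0x1021, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def encapsulate(payload: bytes) -> bytes:
    """Append the CRC, escape special bytes, and terminate with 0x7E."""
    crc = crc16(payload)
    body = payload + bytes([crc & 0xFF, crc >> 8])
    out = bytearray()
    for byte in body:
        if byte in (FLAG, ESCAPE):
            out += bytes([ESCAPE, byte ^ ESCAPE_XOR])
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)


def decapsulate(frame: bytes) -> bytes:
    """Undo encapsulate(); raise ValueError on a malformed frame."""
    if not frame or frame[-1] != FLAG:
        raise ValueError("frame is not terminated by 0x7E")
    body = bytearray()
    stream = iter(frame[:-1])
    for byte in stream:
        if byte == ESCAPE:
            following = next(stream, None)
            if following is None:
                raise ValueError("dangling escape byte")
            byte = following ^ ESCAPE_XOR
        body.append(byte)
    if len(body) < 2:
        raise ValueError("frame too short to hold a CRC")
    payload = bytes(body[:-2])
    if crc16(payload) != (body[-2] | (body[-1] << 8)):
        raise ValueError("CRC mismatch")
    return payload


payload = bytes([0x00, 0x7E, 0x7D, 0x42])
frame = encapsulate(payload)
```

After escaping, the only 0x7E left in `frame` is the terminator, which is what lets a receiver find frame boundaries in a raw byte stream.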
Proprietary Protocol #2: DM
Diagnostic Monitor is an older protocol found in most Qualcomm devices. I’ve been interested in QCDM for a while, since without it, you can’t get signal strength and status from most CDMA devices while connected. So I’ve been trawling the web for the past couple years looking for anything related to QCDM, and I finally hit the jackpot last fall: the GPL sources for the Sprint-branded Linksys WRT54G3G-V2 router, which have since disappeared. They include a GPL-licensed tool called ‘nvtlstatus’ which implements various pieces of the QCDM protocol. The code is complete junk (as you’d expect from many embedded device manufacturers with schedules to hit) but it worked.
There’s also a sketchy Chinese package called “CDMA_Test.rar” that includes lists of the NVRAM items and some of the DM command numbers. While not GPL, we can use the command numbering and structure definitions because it falls under the phonebook and interoperability copyright exceptions. Additionally, there’s the TCL-based (ick) “RTManager” tool that implements some interesting QCDM commands, which, while we can’t use any of the code, is useful for structure field names that I hadn’t already guessed. Third, some guy did some reverse engineering of Novatel devices on Windows and built up a list of commands, subsystems, and NVRAM locations that were useful for confirming what I found in the other sources.
So through a combination of reverse engineering and these sources I wrote libqcdm, which we now use extensively in ModemManager for controlling CDMA devices.
Since DM is a pretty old protocol (2000 and possibly earlier), many of the commands are purely historical and currently unused. The most interesting ones are:
DIAG_CMD_VERSION_INFO: grabs firmware build dates and version information
DIAG_CMD_ESN: grabs the CDMA device’s ESN, which is essentially the IMEI of a CDMA device
DIAG_CMD_NV_READ and DIAG_CMD_NV_WRITE: NVRAM read/write commands, see below
DIAG_CMD_SUBSYS: subsystem commands; see below
DIAG_CMD_STATUS_SNAPSHOT: gives information about the current state and registration of the device on the CDMA 1x network
But given that many aren’t really used anymore, Qualcomm started running out of command IDs a long time ago…
So Qualcomm used command 75 (DIAG_CMD_SUBSYS) to extend the number of available commands; this command takes a subsystem selector and a subsystem command ID, thus getting around the original 8-bit command ID limitation.
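A sketch of what such a request header might look like on the wire. Only the command number 75 comes from the text; the field widths (one byte of subsystem selector, two bytes of little-endian subsystem command ID) and the example subsystem value are assumptions for illustration, and `subsys_header` is a hypothetical helper.

```python
import struct

DIAG_CMD_SUBSYS = 75  # from the text


def subsys_header(subsys_id: int, subsys_cmd: int) -> bytes:
    """Pack a subsystem request header.

    Assumed layout: 1-byte command, 1-byte subsystem selector,
    2-byte little-endian subsystem command ID.
    """
    return struct.pack("<BBH", DIAG_CMD_SUBSYS, subsys_id, subsys_cmd)


EXAMPLE_SUBSYS = 0x0F  # hypothetical subsystem selector
header = subsys_header(EXAMPLE_SUBSYS, 0x0102)
```

Under that assumed layout, each subsystem gets its own 16-bit command space instead of fighting over the original 256 command IDs.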
There are a number of standard subsystems (Call Manager, HDR Manager, WCDMA, GSM, GPS, etc) but each manufacturer generally implements their own subsystem too. In this way QCDM isn’t that different from AT commands; while supposedly standardized, each manufacturer inevitably implements a bunch of proprietary commands for their own device because the specs simply don’t cover everything. This just makes our life harder.
The currently identified subsystems are:
Call Manager: the most important command here reports the general state of the device, including the registered SID/NID, the terminal state (online/offline), the network mode (2G/3G), and various preferences that control which network the mobile registers with. This is what we use to determine online/offline mode for CDMA devices since there aren’t any “standard” AT commands we can use to detect both 1x and EVDO registration. Other commands start and end voice or data calls.
HDR (High Data Rate, ie EVDO): the most important command here provides EVDO state, which is mostly taken from the state machines specified in the IS-856 standard. This lets us figure out if the modem is registered on the EVDO network or the CDMA 1x network.
Novatel: only implemented on Novatel Wireless devices, obviously. But it provides access to a lot of stuff we want: the Extended Roaming Indicator (ERI) which shows detailed roaming state, the current access-technology the device is using (AMPS, digital, IS-95, CDMA 1x, EVDO r0, EVDO rA, etc), the voice mail and SMS indicators, and more.
ZTE: for ZTE devices, obviously. I actually did reverse engineer this one using a ZTE AC2726 kindly provided by Huzaifas S. from Red Hat India. All we’ve got so far is the signal strength; the other fields of the command are unknown.
There are also GSM and WCDMA subsystems used with Qualcomm UMTS chipsets, but since most UMTS devices have multiple AT-capable ports we’re less interested in using QCDM there.
Each device has a number of NVRAM locations in which it stores various parameters like mode preference, roaming, home networks, radio parameters, and a whole bunch of other stuff. Not all devices implement every location. I’ve only included the locations that we actually use in libqcdm, but there are a couple thousand. The ones we currently use are:
DIAG_NV_MODE_PREF: sets the mode preference: analog (ie AMPS), digital (TDMA), CDMA 1x, or EVDO (HDR)
DIAG_NV_DIR_NUMBER: retrieves your Mobile Directory Number (MDN), aka your phone #
DIAG_NV_ROAM_PREF: controls whether your device will roam on a partner network or not
The values each contains took a bit of time to reverse-engineer using the Sprint connection manager, 3 different Sprint CDMA cards, and some USB traces, but now we’ve got the important parts.
Pulling It All Together
Earlier this year we had a number of bugs from Russian, Indian, and Czech Fedora users where ModemManager simply wouldn’t connect. MM is pretty clever (a good thing) but the IS-707 AT commands aren’t useful enough to tell us what we need (not good). The IS-707 standard AT+CAD? and AT+CSS commands really apply to the CDMA 1x network, not the EVDO network, and all these users had EVDO-only plans. So when ModemManager checked AT+CSS and found that the device wasn’t registered, we sat around polling the registration state for a while. The modem was already registered on the EVDO network, but not on a CDMA 1x network; of course AT+CSS doesn’t tell us that so MM got it wrong.
The real fix was to utilize QCDM and ask the Call Manager whether the modem was online or not, and if so, whether it had a 1x or an EVDO connection. Sounds simple, but it took a lot of work to get there.
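Boiled down, the change in logic looks something like this. `CallManagerState` and its fields are a hypothetical summary of what the Call Manager reports, not libqcdm's actual structures.

```python
from dataclasses import dataclass


@dataclass
class CallManagerState:
    """Hypothetical summary of a QCDM Call Manager state reply."""
    online: bool
    has_1x: bool
    has_evdo: bool


def registered_old(css_registered: bool) -> bool:
    """Old check: trust AT+CSS, which only knows about CDMA 1x."""
    return css_registered


def registered_new(state: CallManagerState) -> bool:
    """New check: online with either a 1x or an EVDO connection."""
    return state.online and (state.has_1x or state.has_evdo)


# An EVDO-only plan: AT+CSS says "not registered", but the modem is fine.
evdo_only = CallManagerState(online=True, has_1x=False, has_evdo=True)
old_answer = registered_old(css_registered=False)
new_answer = registered_new(evdo_only)
```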
Next, since most CDMA devices only expose one AT-capable port, we need a way to get signal strength from the device while it’s connected and the primary port is talking PPP. I’ll cover that in another blog post; stay tuned. We still don’t have a good way to figure out which EVDO revision (either 0 or A) we’re using, nor can we get a reliable roaming indicator yet.
All of this is built in Fedora 12, 13, and rawhide if you’d like to take it for a spin.
The Kernel Side
Many devices provide the AT port via the standard CDC-ACM serial mechanism, which is picked up automatically by the kernel drivers. But their QCDM-capable ports are only exposed via vendor-specific USB interfaces, so I created the qcaux driver to handle these ports; it’s in the 2.6.34 kernel. With qcaux.ko and a recent version of ModemManager stuff will Just Work.
Why You Care
First a big shout to Qualcomm for keeping this shit secret. NOT. Double-plus-shout-out for keeping QMI secret; it’s a pretty simple protocol and there’s not much there worth keeping under wraps. It might be nice to let open-source developers actually talk to your hardware.
With that out of the way, you care because we now have better support for a whole bunch of mobile broadband devices. We even have support for CDMA signal strength while connected for the vast majority of CDMA devices that only expose one AT port. I’ll talk about that later, since it’s quite an interesting story.
Why Sierra Wireless Rocks and Qualcomm Doesn’t
Buy Sierra stuff. It’s top quality and they actually care about open-source, unlike Qualcomm’s mobile broadband division. Last year I initiated a dialogue with Sierra about releasing some details of their proprietary Command and Status (CnS) protocol. Being able to talk CnS to their modems gets us a lot that AT commands and even QCDM don’t provide, like roaming indicator, access technology, and RSSI.
And guess what? They actually listened, did the work, and put the documentation under a Creative Commons license too. I hear it’ll show up soon on their support site if it’s not there already (document #2131024, “CDMA 1xEV-DO CnS Reference”).
Sierra rocks. Now if only Qualcomm would do it too…
After more than a year of heavy development, NetworkManager 0.8 was unleashed on the world a few weeks ago. While we obviously couldn’t make everyone happy just yet, this release includes a ton of great stuff. Much of it is under the hood, so while it won’t dazzle you in a blinding flash of light, it should still make your head explode.
Bluetooth: PAN is now supported for connecting to the Internet with 3G. git master (ie, 0.8.1) got DUN support three months ago already. Take it for a drive in Fedora 13 and other recent distros.
IPv6: we’ve added support for static and autoconfigured IPv6. Welcome to 2010. git master (ie 0.8.1) has support for DHCPv6 if you’re using the ISC dhclient 4.0 or above. 2007 called Debian and wants dhclient 3.0.x back. So until Debian upgrades dhclient to something recent, only Fedora, SUSE and a few other distros get DHCPv6.
udev: we had a party last week, and we stabbed HAL in the face and buried it out back in the woods. All hardware detection is done with udev now. Stuff should just work more smoothly.
802.1x certificates: we’ve fixed the long-standing bugs with multi-certificate files. Whether your .pem file contains one certificate or 50, it’ll work.
system settings service: nm-system-settings is dead. NetworkManager ate it. One less process to run, less memory used, and a simpler architecture. This eliminates the 4-second delay waiting to figure out if hot-plugged hardware should be ignored by NM or not. Faster network connections for you.
3G: and best of all, we’ve punted our mobile broadband handling to ModemManager. Just like wpa_supplicant handles all the wifi, modem-manager handles your 3G modems. It’s so much more capable than NM 0.7 that there’s a huge street party about how great it is. ModemManager lets us implement tons of oft-requested features like roaming, 2G/3G mode preference, signal strength display, access technology, etc. It’s neat. Fedora 13 has most of this right now.
And finally, we’ve rocked the documentation world. There are manpages everywhere. There’s tons of new documentation on the wiki. If you have a question, chances are you can find something about it there. Best of all, there’s a new Debugging Guide that should cover almost all aspects of debugging NetworkManager and how to get good information to help fix your bugs. We’ll love you longtime if you look at it before submitting a bug report.
With this kind of base to build on, we’ve literally got a truckload of really awesome shit queued up for NetworkManager 0.8.1. More on that later.
I’ll also be starting a series called “What You Don’t Know About NetworkManager Could Fill A Keg”, in which we’ll explore various random stuff about NetworkManager that you should know, but probably don’t because we never got around to telling you.
Thomas: first, you can use NM on simple servers, or you can revert to the ‘network’ service. But there shouldn’t be anything preventing you from using NM on a server. We’ve tried hard to expand NM’s capabilities in that direction while still retaining the ease-of-use everyone is accustomed to.
But let’s examine your issue. NM starts the network asynchronously, while ‘service network’ does not. That obviously means that by the time the system gets to drbd’s init script, the network may or may not be up yet. The typical solution to this on Fedora is to set NETWORKWAIT=yes in /etc/sysconfig/network, which causes NM’s initscript to block for 30 seconds or until a network connection is brought up (whichever is first). Or better, drop a dispatcher script into /etc/NetworkManager/dispatcher.d that kicks drbd when the network goes up or down. Quite a few Fedora services already have these, even sendmail!
‘man NetworkManager’ has more information on dispatcher scripts. NM runs scripts in /etc/NetworkManager/dispatcher.d when various network events occur, which you can use to kick drbd to life when the network comes up or goes down.
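A dispatcher script can be any executable, so here is a minimal sketch of one in Python. It assumes the script is passed the interface name and the action ("up" or "down") as its two arguments, as the manpage describes. The `service drbd start`/`service drbd stop` commands are placeholders for whatever actually kicks drbd on your system, and `command_for` is a hypothetical helper.

```python
#!/usr/bin/env python3
# Hypothetical /etc/NetworkManager/dispatcher.d/ script for drbd.
import subprocess
import sys


def command_for(action):
    """Map a dispatcher action to a command to run (placeholder commands)."""
    if action == "up":
        return ["service", "drbd", "start"]
    if action == "down":
        return ["service", "drbd", "stop"]
    return None  # ignore actions we don't care about


def main(argv):
    """argv[1] is the interface name, argv[2] the action."""
    action = argv[2]
    command = command_for(action)
    if command is None:
        return 0
    return subprocess.call(command)


# Only act when invoked with the interface and action arguments.
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```

The interface name in `argv[1]` is ignored here; a real script would probably check it so that drbd only reacts to the interface it actually depends on.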
The main problem is initscripts and daemons that simply aren’t smart enough to figure out what resources they require (like network connectivity) and wait until that requirement is available to them. Instead, they quit. And since we don’t have a real event system yet (though Upstart promises to be that) the services don’t get restarted when their resources become available.
In a perfect world, the initscript would parse the config file and figure out that it needs ‘eth0’ to be connected (or maybe just some interface with the IP address 18.104.22.168). Then, when NM brings up eth0 or when some interface gets 22.214.171.124, Upstart would look through that initscript’s dependencies, notice that they are now fulfilled, and start the service. When eth0 goes down, Upstart stops the service. Kittens frolic in the sweet fields of clover.
Next, you probably want to make sure that your connections are ‘system’ connections, that is, available to all users and before login. For Fedora, this means creating a typical ifcfg file for your network configuration, which NetworkManager will then pick up and use to activate your device at boot time. You can do this either by creating the ifcfg file yourself, or by using nm-connection-editor and checking the “Available to all users” button. See this wiki page for more information.
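That perfect-world behavior is easy to model as a toy. The sketch below is not how Upstart or NetworkManager actually work; the Service class and apply_network_state function are made-up names that only illustrate the idea of starting and stopping services as their declared network requirement comes and goes.

```python
class Service:
    """A service that declares which interface it needs (hypothetical model)."""

    def __init__(self, name, required_interface):
        self.name = name
        self.required_interface = required_interface
        self.running = False


def apply_network_state(services, connected_interfaces):
    """Start or stop services so they match the connected interfaces.

    Returns the list of (service name, action) transitions that were made.
    """
    transitions = []
    for service in services:
        wanted = service.required_interface in connected_interfaces
        if wanted and not service.running:
            service.running = True
            transitions.append((service.name, "start"))
        elif not wanted and service.running:
            service.running = False
            transitions.append((service.name, "stop"))
    return transitions


# eth0 comes up, then goes away again.
drbd = Service("drbd", "eth0")
print(apply_network_state([drbd], {"eth0"}))
print(apply_network_state([drbd], set()))
```

The point of the model is that the service never has to poll or give up: something else tracks the requirement and reacts when it changes.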
Do you have either of these devices? Or do you want to buy one? And then send it to me? I’ll even PayPal you up to $100 for it. Let me know! I believe the Yota device can still be purchased, but the XOHM variant stopped being sold last year after the XOHM/Clear merger.
“Oh yeah, I’ve seen that code. It’s worse than what I clean up in the bathrooms after Prom or Homecoming. The kids get high and drunk and party too hard and puke all over the place. I deal with enough vomit from 7:30 to 6; I wouldn’t touch the staging drivers with a mop twice as long as the one I have at work.”
Just Say No
Thomas just found out that none of the “staging” wifi drivers will work with hidden access points because they don’t set the IW_SCAN_CAPA_ESSID capability bit. Furthermore, the most popular “staging” drivers (for the Ralink hardware used in many netbooks) don’t even have specific SSID scanning capability at all.
Why do you care? Hidden APs don’t broadcast their network ID, which misinformed people think is more secure (hint: it’s not). Before a driver can associate to the network, it needs to discover available APs and capabilities, which requires a probe-request, which exposes the network ID to everyone anyway. But that requires driver support which none of the staging drivers have.
I fixed this issue upstream two years ago by adding IW_SCAN_CAPA_ESSID to Wireless Extensions. Of course the staging WiFi drivers that many distros enable never got fixed because the vendor it came from didn’t bother to work with the community in the first place. And people wonder why they don’t work.
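For the curious, the check itself is just a bit test. The sketch below assumes you already have the driver’s scan capability byte (the scan_capa field of the range data a WEXT driver returns for SIOCGIWRANGE); fetching it through the ioctl is not shown, and the function name is made up for illustration. The constant values are the ones from the Wireless Extensions header.

```python
# Values from the Wireless Extensions header (linux/wireless.h).
IW_SCAN_CAPA_NONE = 0x00
IW_SCAN_CAPA_ESSID = 0x01


def can_scan_specific_ssid(scan_capa):
    """True if the driver advertises specific-SSID (probe) scanning."""
    return bool(scan_capa & IW_SCAN_CAPA_ESSID)


# A driver that never sets the bit cannot be trusted to find hidden APs.
print(can_scan_specific_ssid(IW_SCAN_CAPA_NONE))
print(can_scan_specific_ssid(IW_SCAN_CAPA_ESSID))
```

A driver that leaves the bit clear gives callers no way to know that a probe-scan for a hidden network will actually happen, which is exactly the situation with the staging drivers.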
Broadly speaking, staging WiFi drivers come in two flavors: (a) old dried gum from under the cafeteria table (drivers with a future), and (b) fresh vomit from the hung-over kid in your math class (those without a future).
The drivers with a future (winbond, rtl81xx) are or will be based on the kernel-standard mac80211 wireless stack, which implements the 802.11 WiFi specification in the kernel. Since they use the standard mac80211 stack, they get all these nice features like probe-scanning and the correct capability bits for free. All you have to do is work on supporting the hardware itself.
The drivers without a future (rt2860, rt2870, rt3070, rt3090, wlan-ng, vt665x) are based on forks of the ancient ieee80211 stack that Intel’s ipw2x00 drivers forked from the hostap driver. Each of these drivers includes its own copy of the core ieee80211 stack, forked at different times and with different hacks. When a bug shows up, that means 4x the work, and 4x the chance for the fix to slip through the cracks. Which is why these drivers have no future. They are a maintenance nightmare. Besides, they have crap like this:
pAdapter->StaCfg.bScanReqIsFromWebUI = TRUE;
It just blows my mind that people think staging wifi drivers are a great idea. There’s a reason staging drivers set the TAINT_CRAP flag in your kernel: that’s literally what they are.
So what’s the right thing to do?
There’s one huge reason why dead-end staging drivers are a bad idea: there aren’t enough developers. So do you spend that effort on maintaining unmaintainable shit code? Or do you spend it on fixing the code that has a future? Most of the time you can’t do both.
If you choose to maintain the staging drivers, then things become worse over time since the staging code is simply less tested and less maintainable. So you continue to drop hacks and fixes onto an ever-growing steaming pile of manure. Nobody cares much about the driver (because it doesn’t use the standard kernel interfaces and thus doesn’t have a future), so your staging driver never benefits from all the great feature work and bug fixing that the mac80211 and wireless developers are doing.
But if you choose to help fix the upstream drivers that do use mac80211 (like rt2x00), and thus have a future, maybe for a few months some users won’t have great wireless. But they didn’t before either. But then 6 months later, all the users get great wireless with features like power saving, background scanning, WiFi Direct, Bluetooth 3, access-point mode, etc. Those things will never be done to the staging drivers, because those drivers are a dead-end maintenance nightmare, because their code is awful, and because they don’t use the standard kernel wireless stack.
I know I’d invest the effort where it helps users the most, even if it means a few more months of subpar driver support while the official upstream drivers get fixed and the staging drivers go untouched. That’s how things actually get better when you can’t fix everything at once.