How to get backtraces from a window manager

This may sound obvious, but I only just thought of it.

Suppose you make a change to Metacity which causes it to segfault on startup. What you’d ordinarily do is to load it into gdb and have a look at what’s going on in the backtrace with the bt command. But you can’t do that, because it will keep Metacity suspended and so no new window manager will be spawned, and that means that you’ll be running without a window manager. You can get around the problem by sshing into your computer from elsewhere, or by running Xnest or similar, or by using a virtual machine. But here’s a much, much simpler way.

Firstly, create a file called test.gdb containing the text
run --replace
bt

Then simply give the command
tthurman@haematite:metacity$ gdb src/metacity --batch -x test.gdb
[Thread debugging using libthread_db enabled]
[New Thread 0xb7131720 (LWP 16959)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7131720 (LWP 16959)]
0xb7943a6e in g_error_free () from /usr/lib/libglib-2.0.so.0
#0 0xb7943a6e in g_error_free () from /usr/lib/libglib-2.0.so.0
#1 0xb7943adc in g_clear_error () from /usr/lib/libglib-2.0.so.0
#2 0x080a70ba in meta_frame_style_draw (style=0x8119c78, widget=0x8122090, drawable=0x80f4e20, x_offset=0, y_offset=0, clip=0x0, fgeom=0xbfc67384, client_width=1365, client_height=718, title_layout=0x80e1850, text_height=17, button_states=0xbfc67784, mini_icon=0x8123418, icon=0x8123398) at ui/theme.c:4553
[...]
#21 0xb79561e7 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#22 0x08070992 in main (argc=1, argv=0xbfc68514) at core/main.c:479
tthurman@haematite:metacity$

Easy as that.
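
If you find yourself doing this often, the two steps can be wrapped up. What follows is only a minimal sketch in Python, not anything that ships with Metacity: the function names are made up for illustration, and it assumes nothing beyond the gdb options used above (--batch and -x).

```python
import shlex


def gdb_script_text(program_args=("--replace",)):
    """Return the text of a gdb command file: run the program, then backtrace."""
    run_line = "run " + " ".join(shlex.quote(arg) for arg in program_args)
    return run_line + "\nbt\n"


def gdb_batch_command(binary, script_path):
    """Build the argument list for a non-interactive gdb run driven by the script."""
    return ["gdb", binary, "--batch", "-x", script_path]


if __name__ == "__main__":
    # Print what would be written and run, without launching anything.
    print(gdb_script_text(), end="")
    print(" ".join(gdb_batch_command("src/metacity", "test.gdb")))
```

To use it for real, write the script text to test.gdb and hand the argument list to subprocess.run; because gdb exits as soon as the script finishes, the session is left free for another window manager to start.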

If the user will not come to the window, the window shall come to the user

Suppose you have two workspaces, and a window on each one. You’re looking at window A, so clearly window B is offscreen. You click something on window A, and window A attempts to present window B to you. What does that mean?

Let’s have two concrete examples:

  • 0x01: You’ve clicked a link in Pidgin’s buddy window, and it’s attempting to present the chat window to you.
  • 0x02: You’ve clicked a link in Evolution, and it’s attempting to present Firefox to you.

In 0x01, you want to stop looking at the old workspace and look at the new one.  But you don’t want the windows to move off their workspaces.  You want everything to stay where it is.

This is the way upstream Metacity currently works throughout.  However, since Firefox is a tabbed browser (I know Firefox has had tabs since 2002), people have been asking whether this is the wisest course all the time.  In case 0x02 above, in the old days, the browser would just have launched a new window in your workspace.  People don’t like that now, because they want all their tabs in the same window.  But if the user gets shoved onto the workspace of the existing window and then we add a new tab, eventually they’ll close it and then wonder where their mail went. (At least, that’s how I understand their argument; perhaps I’m mistaken.)  As a compromise, downstream Metacity has now been patched in Ubuntu, Fedora, and possibly other places to make the window demand attention when this happens (i.e. go pulsy on the taskbar).

So we have multiple options when this happens:

  • Bring the window to the user, always.
  • Bring the user to the window, always.  (This is what we do now.)
  • Make the window demand attention– in other words, apply the downstream patch.  This is not the path of least resistance, since judging by recent feedback it appears to really annoy anyone using, say, Pidgin.
  • Tell the target application to deal with it.  This would mean that Firefox could open a new window if you were on a workspace where it had no windows open and open a new tab if you were on a workspace where it had one already.  It would mean finding some way of dealing with windows that didn’t co-operate.  It would also mean, alone among all these solutions, that we’d have to find a way of communicating with the target application.
  • Ask the summoning application to give us a hint as to which of these it would like.  This is my (Thomas’s) favourite solution.  It will need a change to the EWMH.
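
To make the first three options and the hint idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: the EWMH has no such hint today, and the names PresentPolicy, Outcome and present are invented for illustration. With no hint it falls back to what Metacity does now, which is to bring the user to the window.

```python
from enum import Enum
from typing import NamedTuple, Optional


class PresentPolicy(Enum):
    BRING_WINDOW_TO_USER = 1   # what case 0x02 (Evolution to Firefox) wants
    BRING_USER_TO_WINDOW = 2   # what case 0x01 (Pidgin) wants
    DEMAND_ATTENTION = 3       # the downstream patch


class Outcome(NamedTuple):
    user_workspace: int
    window_workspace: int
    demands_attention: bool


def present(user_ws: int, window_ws: int,
            hint: Optional[PresentPolicy] = None) -> Outcome:
    """Decide what happens when a window on another workspace is presented."""
    # Assumption: with no hint, keep the current upstream behaviour.
    policy = PresentPolicy.BRING_USER_TO_WINDOW if hint is None else hint
    if user_ws == window_ws:
        return Outcome(user_ws, window_ws, False)   # nothing needs to move
    if policy is PresentPolicy.BRING_WINDOW_TO_USER:
        return Outcome(user_ws, user_ws, False)     # the window changes workspace
    if policy is PresentPolicy.BRING_USER_TO_WINDOW:
        return Outcome(window_ws, window_ws, False) # the user changes workspace
    return Outcome(user_ws, window_ws, True)        # nobody moves; taskbar pulses
```

The point of the sketch is only that the summoning application supplies the policy per request, so Pidgin and Evolution can each get the behaviour that suits them.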

Things which are not solutions:

  • Allowing the user to pick one and then requiring them to stick with it.  As Havoc said, this is basically giving them a choice between “break Pidgin” and “break Firefox”.
  • Window matching.  We do not do window matching.  We are not about to start for an issue as small as this.  That’s what devilspie is for.

Want to join in the argument fun?  Dive in at GNOME bug 482354.  The water’s lovely.

Photo credit: rofanator.

The overview series: Drag and drop. You complain, we explain.

If there are two overlapping windows on the screen, people would like to be able to pick up an object from the lower window and drag it to the upper without bringing the lower window to the front, because if that happens the lower window will obscure the upper, and you won’t have anywhere to drag to. In this instance, we would like the same behaviour as Microsoft Windows: if the click starts a drag, raise the lower window on button release; if it doesn’t, raise the window on button press as normal.

However, Metacity (along with most other window managers) doesn’t currently do this, for want of a way to know whether the click starts a drag. This is really something that only the application owning that window can tell us. (It is possible for the user to tell us what they think, by holding down AltGr at the start of the drag. That may not be an official feature. It’s not really ideal either way.)

This whole question is something we’ve been batting around for six years now and it probably ought to be fixed one way or another. Over that time, people have also given a few other reasons for wanting to pick stuff up from lower windows, such as the ability to copy text from the lower window and paste it into the upper, or scrolling the lower window’s scrollbars: GNOME bug 76672 deals with this more general case, which we shan’t discuss further here. Let’s concentrate on the most common problem, represented by GNOME bug 80984: not raising the source window when a drag and drop begins. What isn’t a solution to our problem?

What isn’t a solution

  • Always raising the lower window only on release, not on click (suggested by many people). This would solve the problem at the cost of weirding everyone out, not just breaking the expectations of existing Metacity users and users from other window managers in the world of free software, but also the expectations of Mac and Windows people.
  • Only raising a window when you click on the frame and not the insides, which was raised in GNOME bug 86108. This is a bad idea for similar reasons to the last.
  • Having a magic kind of window that Metacity promises never to raise; then the client will decide whether to raise itself or not based on whether the click was the start of a drag operation. This is how Sawfish does or did it. It’s a bad idea because it rather defeats the purpose of having a window manager if clients are going to manage their own windows, and besides applications can’t raise their own windows in Metacity anyway.

What is a solution
What needs to happen is this:

  1. We figure out a way for other clients to tell the window manager that a click in their window was the start of some kind of drag-and-drop operation.
  2. At this point, the fact that Metacity doesn’t understand this message suddenly becomes a bug in Metacity. So we fix the window manager to understand this.
  3. At this point, the fact that none of the applications out there understand how to tell Metacity about this becomes a bug in those applications, but we can’t do anything much about it without fixing the toolkits like GTK. So we do that.
  4. Now we can actually fix all the applications separately. The bug for fixing Nautilus at this point is GNOME bug 132339.

Clearly we can’t get 2, 3, and 4 sorted until we have 1 down, so let’s just talk about that for the moment. Back in 2004, Lubos Lunak (the maintainer of KDE’s window manager) proposed the first plan to do this, called _NET_WM_TAKE_ACTIVITY (a misleading name, since it’s about taking focus and not activity). When a window other than the topmost one was clicked, the window manager would send it _NET_WM_TAKE_ACTIVITY, which it would remember; after that, nothing would happen until the button was released. If the click had actually begun a drag-and-drop operation, that was all well and good, but if it hadn’t, the client should send it on to the root window and the window manager would raise the window after all. In GNOME bug 152952, Elijah Newren wrote a patch for Metacity implementing this plan.
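
The flow of that proposal, as described above, can be modelled in a few lines of Python. This is a toy simulation and not X11 code: the classes and method names are invented, and real client messages are replaced by plain method calls.

```python
class WindowManager:
    """Toy window manager that records which windows it has raised."""

    def __init__(self):
        self.raised = []

    def on_press(self, client):
        # Click on a window that isn't topmost: don't raise yet,
        # just send the client the TAKE_ACTIVITY message.
        client.receive_take_activity()

    def on_root_message(self, client):
        # The client passed the message on to the root window:
        # the click was an ordinary one, so raise after all.
        self.raised.append(client.name)


class Client:
    """Toy client that remembers a pending TAKE_ACTIVITY until button release."""

    def __init__(self, name, wm):
        self.name = name
        self.wm = wm
        self.pending = False

    def receive_take_activity(self):
        self.pending = True

    def on_release(self, started_drag):
        if self.pending and not started_drag:
            self.wm.on_root_message(self)
        self.pending = False


if __name__ == "__main__":
    wm = WindowManager()
    lower = Client("lower", wm)
    wm.on_press(lower)
    lower.on_release(started_drag=False)
    print(wm.raised)
```

Notice that in this model the raise can only ever happen at button release, which is exactly the property people later objected to.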

Lubos’s original plan had a few infelicities, some of which were discussed in this meeting. It means that the window is raised when you release the mouse button, which is bad for reasons we discussed above.  It also means that a lot of policy is decided ahead of time: for example, some people would like their window manager to raise the lower window while they were copying text from it, and then drop it back down when they were done, but not do the same thing for drag-and-drop.  There was working code for KDE and GNOME, but many people, including the GTK hackers, objected because of all the problems mentioned above.  In the end it didn’t make it into the EWMH standard, although some parts of the KDE libraries appear still to accept it to some extent.

Elijah then proposed to fix the problem with a new message type called _NET_WM_MOUSE_ACTION. With this plan, a client would send _NET_WM_MOUSE_ACTION through to the root window as soon as any button was pressed or released on it, telling the window manager what kind of action the click meant: it could be “nothing special” or “drag-and-drop”, but also “text selection” or “scrollbar drag” or “generic thing that I don’t want to explain right now but involves not raising me”.  Lubos agreed that this was a better plan, but it died even earlier in committee, and as far as I know was never implemented anywhere.
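
A sketch of how a window manager might act on such a message, in Python. The action kinds come from the description above; the enum names, the should_raise function and the particular policy are assumptions made for illustration, since the proposal was never standardised and its wire format isn't covered here.

```python
from enum import Enum


class MouseAction(Enum):
    NOTHING_SPECIAL = "nothing special"
    DRAG_AND_DROP = "drag-and-drop"
    TEXT_SELECTION = "text selection"
    SCROLLBAR_DRAG = "scrollbar drag"
    GENERIC_NO_RAISE = "generic, just don't raise me"


class ButtonEvent(Enum):
    PRESS = "press"
    RELEASE = "release"


def should_raise(action: MouseAction, event: ButtonEvent) -> bool:
    """One possible raise policy; the window manager is free to choose another."""
    if action is MouseAction.NOTHING_SPECIAL:
        return event is ButtonEvent.PRESS      # ordinary click: raise at once
    if action is MouseAction.DRAG_AND_DROP:
        return event is ButtonEvent.RELEASE    # drag: raise when it ends
    return False                               # selection, scrolling, etc.
```

Because the client only reports what the click meant, the decision about what to do with each kind of action stays in the window manager, which is the main difference from the earlier plan.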

It seems to me that the best thing to do, if we can, is to go with a partial fix using _NET_WM_MOUSE_ACTION which allows us to heal this obvious problem.  Then we can carry on later and fix specific problems.  Elijah has said that _NET_WM_MOUSE_ACTION needed a great deal of work to implement on the GTK side; the closest thing we have so far to working code is a patch he then posted.  This does still need working on, preferably by someone who understands the internals of GDK (could this be you, gentle reader?).

A similar but not identical problem is the issue of raising windows when they are a drag target; this is covered in GNOME bug 112308.

Next in the overview series: why getting stacking exactly right is hard and what we’re going to do about it.

Photo by pbo31, cc-by-nc-nd.

Bug hitlist, first week of May

Hooray, hooray, the first of May, Metacity unit and regression testing begins today. Okay, so it doesn’t scan as well as the original, but it’s almost as exciting. Well, perhaps. I (Thomas) am planning the framework at present, have something fairly solid in my head now, and will probably post something up here in the next few days when I have it planned out so everyone else can argue it :)

As well as this, here are the bugs on the front burner for me for this week: if you want something else fixed, talk to us about it (and send us patches!):

* GNOME bug 499996 (possibly aka Debian bug 443933 possibly aka Debian bug 460712)
* Launchpad bug 221144 (possibly aka Debian bug 476386) (may be related to the above, although perhaps not)
* Launchpad bug 216049 – restore to wrong workspace
* GNOME bug 468075 aka Launchpad bug 133541 – vertical maximisation ignores struts, apparent regression
* GNOME bug 528927 — something to do with _net_wm_state_demands_attention, details a little unclear
If you think it should be different, feel free to advocate below.

Are there distributions not using Launchpad which have a systematic way to link to upstream bugs that we could make use of for keeping people informed?

This is the first time we’ve used the new “bug hitlist” tag, which may become partly automated like Metacity Journal is. If you’re working on a bug in Metacity and you have an @gnome.org address, feel free to post bug hitlist entries; if you are and you don’t, feel free to comment to one of them and ask for one to be made or added to.

I shall post Metacity Journal tomorrow, I think, but I just wanted to acknowledge the particular good contributions that Santanu Chatterjee has been making on fixing problems with keyboard grabs during drag-and-drop.

2008-04-27: Metacity given enough eyeballs

Ye Olde Fighting Cocks, St Albans. Photo by Gary Houston, public domain.

Eric Raymond has proposed in his essay “The Cathedral and the Bazaar” that given enough eyeballs, all bugs are shallow, using Linux as a particular example. The idea may be true in the abstract, but in practice even if a project has many users it won’t necessarily have enough eyeballs, and most users of the program would just never be inspired to take printouts of the code to bed: sometimes this is because most users of the program aren’t programmers, other times, because the code is badly-commented and obscure, and still other times because maintenance, frankly, often doesn’t seem immediately as much fun as new development. (Not that long ago, someone sent PostSecret the back of a packet of Viagra where they’d scribbled: “These are for my wife’s benefit. When I’m with my new love interest, I don’t need them.”)

But does this mean that we should just throw out reliable code every five years? By no means! That would just mean we had to go over and over through teething troubles and never reach maturity. What we need is a way to make this code maintainable, so that it may asymptotically approach perfection, firstly by maintainable coding practices and secondly by having external support systems. (Knuth has the asymptotic perfection approach to development of TEΧ, of course; interestingly he also recently said that the idea of unit testing didn’t appeal to him because he rarely needed feedback on what would work and what wouldn’t; James Cape pointed out that this runs rather counter to the idea of code maintenance.) Elijah used the analogy of being dropped in an unfamiliar town with no street signs, but in the end it should be possible to drop people in the town and expect them to find their way, given some help.

The kernel does have this kind of structure, of course, and that’s part of why it can be used as an example of the “given enough eyeballs” dictum in the first place. The challenge for the rest of us is to find ways to help paint the street signs, both through keeping the code maintainable and through external structures.

This is all written from Thomas’s private opinion, and not the opinion of Metacity or GNOME or anyone else, really. But this is here to say that a year ago Thomas, in fixing a complicated bug, accidentally removed Metacity’s ability to stack up several small windows in a cascade. On Friday, in GNOME bug 529925, Erwann Chenede found the two missing lines and put them back in. A release is imminent. I am heartened that Erwann read the code, arrived out of the blue, and found the bug. And I would like Metacity to have the sort of code where people take the printouts to bed to read.

(Ah, so good to see the script picked a photo of the Fighting Cocks for this entry where I’ve had many happy conversations over many pints.)

Bugs

Much busy activity, especially with making sure our bug tracker knows about everything Launchpad knows about Metacity; in particular,

  • GNOME bug 530056 – someone says Metacity becomes a zombie process when you start evince. Anyone else seen this? It’s never happened to me.

Checkins on trunk

Links

Hmm, kind of quiet.

Translations

  • On branches/gnome-2-22: et by plaes, he by yairhr, nn by eskildh, tr by bcicek
  • On trunk: es by jorgegonz, he by yairhr, nn by eskildh, sl by mateju, tr by bcicek

Zenity

This post is a presentation of the ideas behind GNOME bug 521914.

At present, we ship a program called metacity-dialog, which is often to be found as the sole occupant of /usr/lib/metacity, and it gets spawned on the rare occasions when Metacity needs to ask the user a question. For example, if you attempt to close a window, Metacity asks the permission of the program which owns it; if nothing is heard back from that program within a reasonable time, Metacity spawns metacity-dialog to ask the user whether the window should be closed by force.

There are only three occasions when Metacity needs to ask such questions, and metacity-dialog has ad hoc code for each, with the usual problems attendant on ad hoc code. When there’s any difficulty with metacity-dialog (see, for example, Debian bug 427406), reproducing the fault can be pretty difficult.

GNOME also includes another program called zenity, whose purpose is to pop up and ask questions in just this way. Zenity is stable, polished, and well-understood. With the closing of GNOME bug 335763, zenity’s abilities are now a superset of metacity-dialog’s. It would be a simple matter to remove metacity-dialog from the codebase, and use zenity instead.
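
For a feel of what the replacement might look like, here is a small Python sketch of building a zenity question and reading back the answer. The function names and the example wording are made up; the only things assumed about zenity are its --question, --text and --title options and that it exits with status 0 when the user agrees.

```python
def zenity_question_argv(text, title=None):
    """Build the argument list for a yes/no question asked through zenity."""
    argv = ["zenity", "--question", "--text=" + text]
    if title is not None:
        argv.append("--title=" + title)
    return argv


def user_agreed(exit_status):
    """zenity --question exits with 0 if the user accepted, non-zero otherwise."""
    return exit_status == 0


if __name__ == "__main__":
    # Hypothetical wording; this only prints the command, it does not run it.
    print(zenity_question_argv("This window is not responding. Force it to quit?",
                               title="Window not responding"))
```

In practice you would pass the list to subprocess.run and feed its returncode to user_agreed.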

The only difficulty I can see is that in some distros it may not always be true that if Metacity is installed, zenity will be too, and it may be difficult to add this as a dependency. Of course, they can always patch and keep metacity-dialog; what do you folks think?

Photo by Dave Spellman; cc-by-nc.

Session management

It’s time to talk about session management. It might even be interesting.

The basic idea is that when you log in, you want your desktop to look like it did when you logged out. There is a program called the session manager which starts up all the programs which were running when you logged out, and tells them that they’re being restored. It is smart enough to call the window manager first. The specs which govern this, if you really want to read them, are §5 of the ICCCM and XSMP.

You can compile Metacity with session management turned off, but most people don’t. If you’re running a copy of Metacity which knows about session management, and you’re running under a session manager, then the session manager will tell Metacity to save state when you log out, and give it a session string. Metacity will write out a single file named after that string, including the position, minimised/maximised state, size, and so on of each window. When you log back in, the session manager tells Metacity the name of the file, and it will reload all the information.

This is the icky part: remember how the session manager runs the window manager before any of the applications? Metacity now has a list of the windows that were open before you logged out, and none of them have been opened again yet, because the applications haven’t run. So Metacity keeps a list, and when the applications start and open their windows, Metacity will place them how they were beforehand.

So what we have here is window matching. Gentle reader, when you hear the words “window matching”, you have my permission to scream bloody murder. There is simply currently no good way to do it, and so we don’t. What do we match on? The window’s title, its role, its class, or what? We don’t do window matching in Metacity, because there is no good way to do it at present. Rather, we leave that to power-user tools like Devil’s Pie, where people who really want this behaviour and have enough time to specify exactly what they want in detail can do so.

But, as you’ve noted, we do do window matching in Metacity: we just pretend we don’t. And we don’t in general, but session management forces our hand. Every time we open a window (if there’s session management going on), we make a list of session windows which were opened by the application with the same session client ID and have the same resource name and class and the same role setting. Then we use the settings of any matching window which has the same title; if we don’t find one, we use the settings of any matching window which has the same type; if we don’t find one, we give up on window matching. (“The same” above includes the case where they’re both null.)
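
The matching rules just described can be written down as a short Python sketch. It is a paraphrase of the prose above rather than Metacity's actual C code, and the type and field names are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SavedWindow:
    client_id: Optional[str]
    res_name: Optional[str]
    res_class: Optional[str]
    role: Optional[str]
    title: Optional[str]
    window_type: Optional[str]
    settings: dict


def find_saved_settings(new: SavedWindow,
                        saved: List[SavedWindow]) -> Optional[dict]:
    """Return the saved settings to apply to a newly opened window, if any."""
    # Step 1: same session client ID, resource name, class and role.
    # None == None is True, so "both null" counts as the same.
    candidates = [
        s for s in saved
        if (s.client_id, s.res_name, s.res_class, s.role)
        == (new.client_id, new.res_name, new.res_class, new.role)
    ]
    # Step 2: prefer a candidate with the same title.
    for s in candidates:
        if s.title == new.title:
            return s.settings
    # Step 3: otherwise fall back to one with the same window type.
    for s in candidates:
        if s.window_type == new.window_type:
            return s.settings
    # Step 4: give up on window matching.
    return None
```

Even in this tidy form you can see the weakness: two windows from the same application with the same title and type are indistinguishable, so which saved position each one gets depends on nothing but order.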

That was pretty complicated. Add to that complexity the fact that most applications don’t even set the role and resource settings to anything useful, and you begin to see why we (supposedly) don’t do window matching in Metacity.

Now, there are five possible ways forward for us:

1) Carry on as we are.

But we have this directory “~/.metacity”, which clutters up people’s home directories and has to be cleared if they want to remove all state. “~/.metacity” only ever contains the directory “sessions”, anyway.

2) Move metacity session data to “~/.gnome2/metacity/session”.

You might think this was the obvious way forward, but it isn’t going to happen because…

3) Move metacity session data to “~/.config/metacity/session”.

As GNOME bug 518596 points out, freedesktop.org is standardising hidden directory names across all desktops. The name to use based on this idea depends on whether our session files are configuration data, cached data, or some other kind of data; this has been discussed on that bug and consensus seems to be that they’re configuration data.
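
If the files do count as configuration data, then the directory is not strictly "~/.config" but whatever XDG_CONFIG_HOME says, with "~/.config" as the default when it is unset or empty. A minimal Python sketch of that lookup follows; the function name is invented, and it skips the rule that relative paths in the variable should be ignored.

```python
import os
from pathlib import Path


def metacity_session_dir(environ=None, home=None):
    """Work out where session files would live under the XDG base directory rules."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    configured = environ.get("XDG_CONFIG_HOME")
    # Unset or empty means the default of $HOME/.config.
    config_home = Path(configured) if configured else home / ".config"
    return config_home / "metacity" / "session"
```

Passing the environment and home directory in as arguments is only there to make the function easy to try out without touching your real settings.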

However, if we’re going to move to another directory, we should also consider:

4) Move metacity session data to “~/.config/metacity/session” and change the format.

The existing format is based on GMarkup, but does pretty much exactly what GMarkup is bad at. It is also generally rather inelegant. It would be very simple indeed to replace it with a GKeyFile-based system; there’s almost a one-to-one mapping with the API in many places. We’d know that files in the new place were in the new format and files in the old place in the old, and after a few years perhaps we could drop support for reading back the files in the old place (how long do people keep session files around for, anyway?).
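
To give an idea of the shape of a key-file based session file, here is a sketch using Python's configparser, which reads and writes a similar group-and-key layout. It is a stand-in for GKeyFile and not a claim about compatibility with it, and the group and key names are hypothetical; no actual format has been decided.

```python
import configparser
import io


def dump_session(windows):
    """Serialise a list of per-window settings dicts, one group per window."""
    parser = configparser.ConfigParser(interpolation=None)
    for index, window in enumerate(windows):
        # Hypothetical group naming: "window 0", "window 1", ...
        parser["window %d" % index] = {key: str(value) for key, value in window.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_session(text):
    """Read the groups back as a list of dicts of strings, in file order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [dict(parser[section]) for section in parser.sections()]


if __name__ == "__main__":
    print(dump_session([{"title": "Terminal", "x": 10, "y": 20, "maximized": False}]))
```

Everything comes back as a string, so a real implementation would want typed getters for the numbers and booleans, which is one of the things GKeyFile's API already provides.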

But the most radical solution is:

5) Abolish session management (or at least session management within the window manager).

This suggestion has come from a number of people. The reasons given so far are: XSMP has so many problems that we’re better off not implementing it at all than implementing it partially; we really shouldn’t be even attempting to do this because it involves window matching, which is an impossible problem anyway; Sven Herzberg apparently gave a talk at last year’s GUADEC that said that applications should be responsible for restoring their windows because they know how to do so and the window manager can’t know in general.

Photo by Mark Norman Francis, cc-by-nc.