If there are two overlapping windows on the screen, people would like to be able to pick up an object from the lower window and drag it to the upper without bringing the lower window to the front, because if that happens the lower window will obscure the upper, and you won’t have anywhere to drag to. In this instance, we would like the same behaviour as Microsoft Windows: if the click starts a drag, raise the lower window on button release; if it doesn’t, raise the window on button press as normal.
However, Metacity (along with most other window managers) doesn’t currently do this, for want of a way to know whether the click starts a drag. This is really something that only the application owning that window can tell us. (It is possible for the user to tell us what they think, by holding down AltGr at the start of the drag. That may not be an official feature. It’s not really ideal either way.)
This whole question is something we’ve been batting around for six years now and it probably ought to be fixed one way or another. Over that time, people have also asked to be able to pick stuff up from lower windows for a few other reasons, such as copying text from the lower window and pasting it into the upper, or scrolling the lower window’s scrollbars: GNOME bug 76672 deals with this more general case, which we shan’t discuss further here. Let’s concentrate on the most common problem, represented by GNOME bug 80984: not raising the source window when a drag and drop begins.
What isn’t a solution
- Always raising the lower window only on release, not on click (suggested by many people). This would solve the problem at the cost of weirding everyone out, not just breaking the expectations of existing Metacity users and users from other window managers in the world of free software, but also the expectations of Mac and Windows people.
- Only raising a window when you click on the frame and not the insides, which was raised in GNOME bug 86108. This is a bad idea for similar reasons to the last.
- Having a magic kind of window that Metacity promises never to raise; then the client will decide whether to raise itself or not based on whether the click was the start of a drag operation. This is how Sawfish does or did it. It’s a bad idea because it rather defeats the purpose of having a window manager if clients are going to manage their own windows, and besides applications can’t raise their own windows in Metacity anyway.
What is a solution
What needs to happen is this:
1. We figure out a way for other clients to tell the window manager that a click in their window was the start of some kind of drag-and-drop operation.
2. At this point, the fact that Metacity doesn’t understand this message suddenly becomes a bug in Metacity. So we fix the window manager to understand this.
3. At this point, the fact that none of the applications out there understand how to tell Metacity about this becomes a bug in those applications, but we can’t do anything much about it without fixing the toolkits like GTK. So we do that.
4. Now we can actually fix all the applications separately. The bug for fixing Nautilus at this point is GNOME bug 132339.
Clearly we can’t get 2, 3, and 4 sorted until we have 1 down, so let’s just talk about that for the moment. Back in 2004, Lubos Lunak (the maintainer of KDE’s window manager) proposed the first plan to do this, called _NET_WM_TAKE_ACTIVITY (a misleading name, since it’s about taking focus and not activity). When a window other than the topmost one was clicked, the window manager would send it _NET_WM_TAKE_ACTIVITY, which it would remember; after that, nothing would happen until the button was released. If the click had actually begun a drag-and-drop operation, that was all well and good, but if it hadn’t, the client should send it on to the root window and the window manager would raise the window after all. In GNOME bug 152952, Elijah Newren wrote a patch for Metacity implementing this plan.
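To make the handshake concrete, here is a minimal sketch of the _NET_WM_TAKE_ACTIVITY flow as described above, written as a plain Python simulation rather than real X11 code. The class and method names (WindowManager, Client, button_press, and so on) are invented for illustration and are not taken from Metacity, KWin, or Elijah’s patch; only the message name comes from the proposal.

```python
from dataclasses import dataclass

TAKE_ACTIVITY = "_NET_WM_TAKE_ACTIVITY"


@dataclass
class WindowManager:
    # Window names ordered from bottom to top (hypothetical model).
    stack: list

    def button_press(self, client):
        """A click landed on `client`: defer the raise if it is not topmost."""
        if self.stack[-1] != client.name:
            client.receive(TAKE_ACTIVITY)

    def root_message(self, message, window_name):
        """The client forwarded the message to the root window: raise it."""
        if message == TAKE_ACTIVITY:
            self.stack.remove(window_name)
            self.stack.append(window_name)


@dataclass
class Client:
    name: str
    pending: bool = False

    def receive(self, message):
        # Remember that the window manager is waiting on our verdict.
        if message == TAKE_ACTIVITY:
            self.pending = True

    def button_release(self, wm, started_drag):
        # Only forward the message if the click did not begin a drag.
        if self.pending and not started_drag:
            wm.root_message(TAKE_ACTIVITY, self.name)
        self.pending = False


wm = WindowManager(stack=["lower", "upper"])
lower = Client("lower")

wm.button_press(lower)
lower.button_release(wm, started_drag=True)
stack_after_drag = list(wm.stack)

wm.button_press(lower)
lower.button_release(wm, started_drag=False)
stack_after_click = list(wm.stack)
```

In this model a click that turns into a drag leaves the stacking alone, while an ordinary click raises the lower window only once the button is released.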
Lubos’s original plan had a few infelicities, some of which were discussed in this meeting. It means that the window is raised when you release the mouse button, which is bad for reasons we discussed above. It also means that a lot of policy is decided ahead of time: for example, some people would like their window manager to raise the lower window while they were copying text from it, and then drop it back down when they were done, but not do the same thing for drag-and-drop. There was working code for KDE and GNOME, but many people, including the GTK hackers, objected because of the problems mentioned above. In the end it didn’t make it into the EWMH standard, although some parts of the KDE libraries appear still to accept it to some extent.
Elijah then proposed to fix the problem with a new message type called _NET_WM_MOUSE_ACTION. With this plan, a client would send _NET_WM_MOUSE_ACTION through to the root window as soon as any button was pressed or released on it, telling the window manager what kind of action the click meant: it could be “nothing special” or “drag-and-drop”, but also “text selection” or “scrollbar drag” or “generic thing that I don’t want to explain right now but involves not raising me”. Lubos agreed that this was a better plan, but it died even earlier in committee, and as far as I know was never implemented anywhere.
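One way to picture the difference from the earlier plan: the client only reports what kind of action the click was, and the window manager keeps the policy. The sketch below is a Python illustration under that assumption. The enum names and the EXAMPLE_POLICY table are hypothetical; the proposal as summarised here names the kinds of action but not how any given window manager should react to each one.

```python
from enum import Enum


class MouseAction(Enum):
    # The kinds of action listed in the proposal; identifiers are invented.
    NOTHING_SPECIAL = "nothing special"
    DRAG_AND_DROP = "drag-and-drop"
    TEXT_SELECTION = "text selection"
    SCROLLBAR_DRAG = "scrollbar drag"
    GENERIC_NO_RAISE = "generic, do not raise"


class RaiseOn(Enum):
    PRESS = "press"
    RELEASE = "release"
    NEVER = "never"


# Assumed example policy, chosen only for illustration. A different
# window manager (or user preference) could fill this table differently.
EXAMPLE_POLICY = {
    MouseAction.NOTHING_SPECIAL: RaiseOn.PRESS,
    MouseAction.DRAG_AND_DROP: RaiseOn.RELEASE,
    MouseAction.TEXT_SELECTION: RaiseOn.RELEASE,
    MouseAction.SCROLLBAR_DRAG: RaiseOn.NEVER,
    MouseAction.GENERIC_NO_RAISE: RaiseOn.NEVER,
}


def should_raise(event, action, policy=EXAMPLE_POLICY):
    """Return True if the window should be raised now.

    `event` is "press" or "release"; `action` is what the client
    reported for that button event.
    """
    return policy[action].value == event
```

For example, should_raise("press", MouseAction.DRAG_AND_DROP) is False under this table, while should_raise("release", MouseAction.DRAG_AND_DROP) is True. Swapping in another table changes the behaviour without touching the clients, which is the point of leaving policy with the window manager.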
It seems to me that the best thing to do, if we can, is to go with a partial fix using _NET_WM_MOUSE_ACTION which allows us to heal this obvious problem. Then we can carry on later and fix specific problems. Elijah has said that _NET_WM_MOUSE_ACTION needed a great deal of work to implement on the GTK side; the closest thing we have so far to working code is a patch he then posted. This does still need working on, preferably by someone who understands the internals of GDK (could this be you, gentle reader?).
A similar but not identical problem is the issue of raising windows when they are a drag target; this is covered in GNOME bug 112308.
Next in the overview series: why getting stacking exactly right is hard and what we’re going to do about it.
Photo by pbo31, cc-by-nc-nd.
What exactly is the Windows behavior if I click and hold on the lower window on a spot where I could be initiating a drag? Does it wait until I either release (indicating no drag => raise window) or move the mouse (indicating a drag => don’t raise)?
@Andy:
I haven’t actually tested this on a Windows machine, but I’m pretty sure that the behaviour is that if something COULD be a drag, it’s treated as one until we know it isn’t. Otherwise the whole thing would be unworkable.
Hmm…maybe I should have posted some updates somewhere.
_NET_WM_MOUSE_ACTION didn’t really “die in committee”; it died because no one bothered implementing it. The EWMH wasn’t meant for adding “pie in the sky ideas”. Since I was suggesting it, it really was up to me to come up with an implementation. Thus, it really was more my fault than anyone else’s that it has never yet been adopted.
_NET_WM_MOUSE_ACTION was overly aggressive in its original goal and would be insane to try to implement. But someone suggested selectively implementing the useful part of it (do not raise on a click that starts a drag-and-drop operation, otherwise raise on button release), or maybe I came up with that, but anyway it either should be or already was part of my last proposal.
Also, note that my patches (combined across the different bug reports) were functional. The gtk+ one was a bit ugly (though not really all that long), and there may have been a bug or two with them, but they did work. I was using it on my machines. I really don’t think there was much work left, and was convinced that I could finish it in a day or two if I ever had time. However, (1) I was too overloaded with too many things at the time and didn’t view this as top priority (release-team was for me), (2) I was trying to encourage others to get involved (I figured making a functional demonstration would allow someone else to finish it off), (3) compiz came along and discouraged me from pursuing metacity as it looked for a while to me that it’d become a dead end (I didn’t realize how resilient metacity was despite its lack of the cool bling), and probably most importantly, (4) the horizontal/vertical maximization thingy came up and utterly sucked the life out of metacity for me.
@Elijah:
I hope the tone of this piece doesn’t sound like an attack on you (or anyone): it wasn’t intended as such, especially since you did a lot of good work on all this. It was merely supposed to be an overview of everything I could find out while searching for what had happened on the matter so that we had everything in one place to move forwards, because people kept asking me about it. If I have said anything you think I should reconsider (that isn’t ameliorated by having your comment below it), let me know where so I can see about rewriting it.
(One thing, though: even if Metacity had been an evolutionary dead end and no patches were made against its code, it would still have been worth fixing this problem in the EWMH and getting GTK to support whatever fix was used, so that Compiz and kwin and so on had the problem solved.)
Do you think we should ask the GTK people to accept what patches we have against GTK already and then work on from there?
I didn’t take it as an attack at all. I just thought there were some pieces that needed clarifying: you said the suggestion “died in committee”, suggesting that it would be hard to make use of it (I thought the list would have accepted using the basic version of _NET_WM_MOUSE_ACTION, which is close to _NET_WM_TAKE_ACTIVITY in terms of work but gets around the hackishness of the latter proposal that gtk+ developers objected to). Also, you suggested there was lots of work left, and I don’t think that is the case.
As far as trying to get the gtk+ developers to accept the patches as they are, I don’t think that’s a good idea. I listed a few things that needed to be cleaned up at the end of comment 28 of bug 154260, and those should be tackled first. (And I’m not so sure there’s even any work needed on the win32 or osx side; people there probably already have everything working just fine.)
Oh, good, I was a bit worried. Okay, so I know where I should be working. Thanks!
@Elijah:
Oh, the reason I said it would take a great deal of work was a quote from you here:
“Owen did not like the hack at all, and thus _NET_WM_MOUSE_ACTION was born. Unfortunately, that one was tough with lots and lots of work needed for it.”
Maybe I misunderstood the context or something, or took it out of chronological sequence.
You might want to contact the Metisse people ( http://insitu.lri.fr/metisse/ ), who worked on this overlapping subject in one of their papers: http://insitu.lri.fr/~roussel/publications/CHI07-rocknroll.pdf . It is handled by Metisse, with ugly hacks, but I’m sure they’ll be happy if some wmspec solution could be found.
You can check a screencast at:
http://www.dailymotion.com/relevance/search/metisse%2Bmandriva/video/x11e0x_mandriva-linux-2007-metisse-copy-pa_tech
I think the issue here might be related to two other issues:
1) Not being able to click a button that was enabled while cursor was already over it
2) Not being able to prevent click-through for critical buttons in a lowered window (for example a “delete all” button should be drawn as enabled but should only react to clicking and hovering when the window is the active one, the same way similar buttons behave on OS X)
All three of these could be addressed if GDK was able to track the cursors (think MPX) for all interactive controls and set appropriate hints on the window (mouse1 is over a draggable element, do not raise unless released, mouse2 is over a button that could lead to accidental data loss, mouse3 works as usual).
Metisse (see URL) does this by peeling windows like a piece of paper on a drag operation and then rolling the window back. Another solution :)
@Rémi:
But the point is, how does Metisse know that this IS a drag operation?
@Thomas:
That quote of mine was from Oct 2005; comment 28 of bug 154260 is from May 2006. The original _NET_WM_MOUSE_ACTION, to be fully implemented, would take an unrealistic amount of work, but implementing just the sane subset is already nearly done. So, yeah, it was a chronological thing. :-)
@Thomas,
Metisse currently listens for DnD events using XFixes (although a better solution such as _NET_WM_MOUSE_ACTION is *definitely* needed).
As for the interaction, we are currently implementing the following inside Metisse’s window manager:
– during the drag, hovering and staying still over a window will bring it to the top after a timeout,
– if, after the drop, the user clicks on the destination window (within say 500ms), then the destination window stays on top,
– if the user does not click before the timeout, the window stacking is restored to what it was before the drag operation started.
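The three rules above can be sketched as a small Python model. This is an illustration only, not Metisse’s code: the class name, method names, and the hover timeout value are assumptions (no hover figure is given here), and only the 500ms click window comes from the list.

```python
class DragStacking:
    HOVER_TIMEOUT_MS = 700  # assumed value; no figure is given above
    CLICK_WINDOW_MS = 500   # "within say 500ms"

    def __init__(self, stack):
        # Window names ordered from bottom to top.
        self.stack = list(stack)
        # Remember the stacking as it was before the drag started.
        self.saved = list(stack)

    def hover(self, window, still_ms):
        """Raise `window` if the pointer has rested on it long enough."""
        if still_ms >= self.HOVER_TIMEOUT_MS and self.stack[-1] != window:
            self.stack.remove(window)
            self.stack.append(window)

    def drop(self, click_delay_ms=None):
        """Finish the drag.

        `click_delay_ms` is the time between the drop and a click on the
        destination window, or None if the user never clicked.
        """
        if click_delay_ms is None or click_delay_ms > self.CLICK_WINDOW_MS:
            self.stack = list(self.saved)
        return self.stack


kept = DragStacking(["a", "b", "c"])
kept.hover("a", still_ms=800)
kept_result = kept.drop(click_delay_ms=200)

restored = DragStacking(["a", "b", "c"])
restored.hover("a", still_ms=800)
restored_result = restored.drop()
```

Here the first session keeps the raised window on top because the click arrived in time, and the second falls back to the stacking saved before the drag.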
The folding operation Frederic and I both described is only used in Metisse for text selection operations (see Fred’s URLs) but it can also be used for drag-and-drop operations: http://www.lri.fr/~dragice/foldndrop/ (this works great but we are working on a faster solution for Metisse, one that could be more easily picked up by Metacity)
Can drag-and-drop not be connected to the clipboard? If we define drag-and-drop as putting something into the clipboard and pasting it somewhere, then moving from one window to another should be easy: if the pointer enters a window with the left mouse button already held down, be ready to paste that something from the clipboard. The action is distinguished from dragging a window around by the presence or absence of the left click on the window frame or the pressing of Alt.
Just my opinion as a user. I do not code in C or anything that complex.
@Robert:
You seem to be talking about raising other windows when the mouse enters them during DND operations. That’s not really what this post was about (though Thomas did refer to it; see the bug 112308 reference near the end). And the problem in that case is that we don’t know when the mouse enters the window, due to how grabs & X11 work. (Though, yes, the Metisse folks apparently figured out something clever with the XFixes extension…)
The problem being discussed is about not raising windows on mouse button press, when that button press could begin a DND operation. (It would instead be raised on mouse button release, if no DND operation was started between the button press and button release.) The problem here is that the WM needs some kind of hint from the app about whether certain button press/release combos could start a DND operation and whether they actually did. That info isn’t currently available to the WM.
The first idea that came to mind was the same as Rémi’s: adding a timeout; probably that would still be hack-ish but it would be a nice workaround while waiting for a better solution.