Christian Neumair

Snappy GnomeVFS

Motivated by all sorts of rants about GnomeVFS file transfer performance and a bug report I investigated the GnomeVFS Xfer process. A few tweaks vastly improved my sftp file transfer experience (literally by orders of magnitude). Patch, rationale.

All your permissions are belong to us

You may have read my blog postings [1,2] covering the design of a new Nautilus permissions GUI.

I sat back and did nothing for some time, and things got clearer.
My first approach failed because it didn’t break with our old permissions GUI semantics, my second approach failed because its advanced mode was totally hosed, not being compliant with our UnixPowerForDesktop project. No advanced user would ever be willing to poke around with a few fiddly buttons to get a “+rwX”. Besides, the dropdowns were to cluttered.

I invested some hours to refurbish my second patch, and I hope the result of my work will convince both, the advanced users who know precisely how chmod and the UNIX permissions work, and the beginners who want to make a downloaded script executable, or share a file with somebody else.

The basic mode was shamelessly stolen from Thunar. Those guys really had some very good interface ideas, if you’re interested in file management usability, make sure to check out their home page. We may even steal even more of the ideas, like a simple dropdown for MIME type associations.

On a sidenote: The “inventor” of the dropdown permission box widget arrangement was – as far as I know – Apple, so don’t sue us you brave Thunar developers :).

I just split out the executable toggle button into three separate buttons, because it may not be to uncommon to demand a very special executability triple for a file, like ??x??x??-, for instance for some script.

The advanced mode is chmod. You have a little entry, you enter your desired mode and we use chmod’s parser to convert it into some permission modification blob we just throw at each of the modified files. This transfers the well-known and beloved chmod semantics for any of your GnomeVFS-browsable remote shares.

Use case 1:

Joe downloaded a script and wants to make it executable.

Use case 2:

Jimmy doesn’t want to share a whole directory with the other users, like he did before.

Now, this is a directory, so what about recursion? As he clicks “None”, a dialog pops up asking for confirmation and explaining the options. I’m not sure on the wording, as you may have guessed from the previous text I’m not a native English speaker. Feedback highly appreciated! I also made sure to forgive the user – when clicking the [X] icon, the action is cancelled, without cluttering the button arrangement, thus being a “Yes/No” decision with cancellation point.

Jimmy reads the text carefully, and confirms.

Use case 3:

Some random *NIX admin stumbles across Nautilus and wants to try out its permission GUI. Having read that GNOME is really the crap, he tries it out anyway.

“Holy crap” he thinks when seeing the GUI, it’s not even suitable for chmodding my files. Where did -R go?. He’s frustrated, but clicks the “Details” button. His input focus is immediately taken to the chmod entry. As he enters text, the crappy GUI is desensitized to denote that he now has full control over the permissions.

He enters his chmod command, and presses enter (whether an apply button should be added isn’t decided, but usability in a keynav sense often means reduction to something shell-alike).

Look, this question isn’t as stupid and cryptic as the first one :).

The patch is available in bugzilla, hopefully we’ll get something similar into Nautilus 2.15, maybe we’ll even stuff the views with some accelerators for reaching the chmod entry with two key presses.

Sorry, no ACL GUI yet, but it’s on the TODO list.

ioctl, fsync – how to flush block device buffers?

Maybe somebody with deep kernel knowledge could help me to sort out the following problem I mentioned some time ago (bug report):

When ejecting USB mass storage, the volume monitor notifies the file manager that an unmount happened, although the USB stick is not yet ejected, but just unmounted. The difference in this particular case is that the buffers on the USB stick are still not yet flushed, i.e. not all data has definitly been written to the stick.

Because of the said notification, the USB stick icon is removed from the screen and the user is suggested that it is OK to remove the stick, although the buffer is not yet flushed and the “eject” command wasn’t even run.

We have multiple ways of resolving this:

a) block unmount signal, i.e. delay it until the whole ejection process is over. Problems: The ejection runs in a different thread, which does not know anything about the volume, but just about its URI. We’d have to introduce mutex-protected hash tables and some extra glue code. Not particularly attractive.

b) tell the underlying operating (linux in my case) that it should flush the storage buffer before the unmount takes place, thus ensuring a clean unmount experience.
Unfortunately, I was unable to figure out the right syscall, although I fiddled around with the kernel for a very notable amount of time.

candidate I: fsync(filedes)

POSIX claims:

“The fsync() function shall request that all data for the open file
descriptor named by fildes is to be transferred to the storage device
associated with the file described by fildes.”

Oh well, it is blocking and returns, but /dev/foo isn’t flushed, no matter what parameter combination I try out. I wonder whether the interpretation of “transferred to the storage device” should be interpreted as of “data is consistent” or as of “data was piped through some connection”.

candidate II: sync()

Works, flushes buffers for all block devices. Problematic, though: It will sync even unrelated block devices, which may be a huge problem with many devices, maybe unmounted concurrently (at least to the user).

candidate III: ioctl(filedes, BLKFLSBUF, 0)

This one sounds promising. When investigating the issue I hoped this would work since e2fsprogs also uses this. However, I was wrong. While according to sys/mount.h it is part of “(t)he read-only stuff” the actual kernel code checks for the user having admin caps, and sets errno to EACCESS even if the user has rw permissions to the block device file.

For non-generic block devices, patches were submitted by others as mentioned in a comment in the said bug report, which reduce the amount of permissions required to flush a buffer. I wonder whether there is a traditional way of dealing with unflushed buffers in UNIX, because this has a lot to do with permission models which is where things get compilicated.

It would really be nice to have a simple portable way of achieving what I want: flushed buffers for a distinct block device.

Anybody? 🙂

studies, USB mounts

Absolved a practical C training within the scope of my studies. It was really trivial, and mostly boiled down to abusing printf/scanf. Scored 100% in roughly 40% of the available time. vim is such a great tool. Let’s see how I work out in the first few real tests, namely circuitry, electrophysics and advanced maths for engineers.

Crispin felt kind enough to provide me with an USB stick to implement a progress dialog for unmount operations in Nautilus. It just cost 60 pence to send it from the UK to Germany by air mail. Tempi Mutantur. The issue is that users won’t know when data is entirely written to the disk (which is done async), and it is safe to remove the drive. Filed a bug report against GnomeVFS, because it doesn’t tell us when the drive is really ready to be removed. I really wonder whether as of writing there is any solution to this problem in the kernel/HAL/GnomeVFS stack.

When things go really bad…and nobody notices it for years

The following analysis is wrong. The callback organization is done in a not very obvious way and not very well documented but does NOT POSE ANY RISK OF LOSING DATA! DON’T PICK UP THIS STORY!

This is really a horrific story about design glitches which should remind you of double-checking that imperative programming forces you to take extremely care of what you’re doing.

GnomeVFS has an async file operation API, which ensures that your application does not block until the operation is finished. It offers you two callback hooks, one of them is called “synchronous” and the other one “asynchronous”. since 2001, the code works like:
For some transfer phases, both the sync and the async callbacks are called, and for others, whether only the sync callback is called or both depends on whether the async call was already called within a particular period of time. The callbacks may flag what further action should be taken in the Xfer process. Its interpretation depends on the overall state and progress of the xfer machinery.

It turns out that the original idea probably was that the sync callback is used for the really important stuff, being called on every single change of the xfer state machine, while the async callback was mainly meant to be used for expensive user interface updates. Unfortunately, the code is used in a way (cf. call_progress_often_internal and call_progress_uri) which does incorporate both the async and the sync callback’s return value, but the async callback is called after the sync callback, and so takes precedence over the sync retval. Calling the async callback after the sync one may make sense in some situations, like informing the user that something just changed in the sync callback, but the fact that the sync callback’s return value is overwritten by the async callback’s retval can really be a problem, considering that – as proven by call_progress_often_internal – it can’t always be predicted inside the sync callback whether the async callback will be called right afterwards or not.

In short, to work around this design glitch, you’ll be forced to either use only a sync or an async callback, or use both and let the async callback be aware of the sync callback’s last retval through some internal variable, so that it can overwrite the callback’s retval if desired. Notice that each time the async callback is called, the sync callback was called right before, so either the sync retval can be returned again or some intentionally override value.

Shockingly, the current Nautilus Xfer code does all its error handling in the async callback, which is not aware of the sync callback’s response, resulting in random behavior depending on the time since the last async invocation and the actual XFer state.

With some good luck, the async code is called each time the transfer does something important, but the GnomeVFS XFer copy_file, copy_directory code and some more rather important ones use the callback invocations which only have the async callback invocated if it wasn’t done already within a particular period, which is probably due to performance considerations.

Under some circumstances the async code is not invoked, and the Nautilus sync code blindly returns 0 in some situations where it is absolutely not desirable.
No error handling is done in Nautilus’ sync callback, it’s response does not depend on the state of the state machine, and whether its return value is used or not depends on the period since the last invocation.

Conclusions
a) GnomeVFS has design flaws
b) Nautilus has design flaws
c) fixing a partially broken or unintuitive API concept is very hard if not impossible, even if the API itself is powerful
d) Data loss is no good

Consequences
a) improve GnomeVFS docs, maybe change async/sync callback handling
b) fix Nautilus by making the async callback sync-aware and moving important stuff into the sync callback (done with some luck, needs testing)
c) write proper GnomeVFSXfer API documentation (TODO)

Update

I’m not so sure whether I got the whole sync/async process right anymore. According to the GnomeVFSProgressCallbackState, the async callback is called “periodically every few hundred miliseconds and whenever user interaction is needed”. I don’t like that architecture at all, and am inclined to modify it, so that the user always has to specify a sync callback, and the async callback would be optional with its retval being ignored.

Maybe Christian Kellner also was right some months ago when he concluded that a new GnomeVFS async file operation API is needed.

Debian/GFDL – could they do better?

I really wonder why GFDL-licensed documentation is a problem if it doesn’t contain invariant sections. Maybe somebody could shed some light on the issue, I accused the Debian people of distinguishing themselves at the cost of a pragmatic policy.

Another approach to recursive permissions

You may have read my recent blog entry dealing with a patch adding recursive permission caps to Nautilus. I got some very helpful comments which pointed out why the last approach failed.
The combo box that was added to the permissions grid was too confusing for newbies. It doesn’t mean anything to them. For nerds however, it doesn’t provide enough caps because they want to edit the permissions of files and folders at the same time (cf. chmod -R +rX). So obviously – although usability experts usually don’t like the idea – a basic/advanced mode separation was needed. This was also necessary because for common computer users only binaries can be executed, and they don’t easily grasp that the x flag for a directory means listability, while not providing this implementation detail to experienced users will confuse those.

So how would the new approach look like? This might be very disappointing for those of you expecting a creative approach, but I simply replicated the GUI concept from MacOS that was also adopted by KDE. For reducing the complexity of the involved GUI descriptions, one cannot select files and folders at the same time and edit their permissions. It will simply cause too much headache, because “Read” means rx for directories and r for files, which is the reason why in basic mode, there is no “Execute” terminology used for directory permissions. Furthermore, un advanced mode it would cause many strings like “Apply foo to the selected files and the files in the selected folders and their subfolders”, requiring the user to scan the help text again and again to get what he is actually doing.

Note that the screenshots shown in this entry are not HIG-compliant, because the dialog helpers weren’t modified to respect it. I’ve filed a bug report against this issue which has some screenshots and is waiting for your usability-related comments.

Use case 1: Joe Doe downloads a binary and wants to make it executable.

He brings up the “Access Rights” tabs and sees the following:

He then clicks the first “Access” combo box, and selects “Read, Execute” or “Read, Write, Execute” from the list.

Note that if a particular permission combination (like wx) is not available, the combo box has no preselected entry, and allows you to set the permissions to any of the “sane” permission combinations in the combo. When modifying folder permissions in basic mode, you’re asked for confirmation (and are offered cancellation, recursive application and toplevel application). For the sake of sanity, +X is always added to “Read”/”Write” folder permissions, but it is taken care to not mess up with the +x flag of the files in the folder while still setting them [+-][rw] as desired.

Use case 2: Foo (too lazy to open a shell and use chmod) wants to apply g+rX to a directory hierarchy.

He brings up the “Access Rights” tabs and sees basically the same as Joe Doe:

“What the hell”, he thinks. This dialog doesn’t offer anything! Goddamnit, the GNOME morons even removed the last useful feature from GNOME.
He is pissed off by the fact that the combo box just offers the crap it offered to Joe Doe – but wait, it even offers *less*. Just three shitty items, and where is my +X?

OK, so he toggles the “Details” button expecting that he’ll soon use apt-get to remove that crappy GNOME shit from his and all his clients’ computers.

Hah, now THAT’S what I call a chmod GUI. While he is a bit disappointed that they didn’t yet figure out how to properly do keynav (more buttons than possible accels) and that he has to press “Execute” and “Read” twice in the folder section and additionally “Read” once in the file section, he does it.

Next, he presses OK – and voila – g+rX is done.

Note that the file details dialog looks like the folder section of the folder dialog of the folder details dialog, and error handling is still not yet done. The patch also lacks some cleanup, but it’s too late for that.

I know, nowadays people demand ACLs, sudo caps when there are unresolvable permission problems and such, but I really think we’re making some decent progress.

Comments are again welcome, and don’t forget to grab the patch and play with it.

Nautilus: Now with recursive permission changes

I sat down yesterday and figured out how to implement recursive permission changes in Nautilus. The result of my work can be found in the Bugzilla, it wasn’t much work after all. The whole NautilusDirectory machinery was really well thought-out, kudos to its designers.

It’s not yet entirely decided how to do error handling, in theory we could use a cancel/retry dialog, but I think it isn’t too common to have a folder you own with many files you don’t own, and without a file operation sudo framework you’re always forced to do a sudo chmod to continue.

I’m really looking forward to usability-related comments.

Oh, and I know that the dialog doesn’t conform to the HIG, I’ll cook a patch for that issue soon.

Update

Be sure to also check out the comments , many people made very helpful suggestions. I disagree with some of the comments that an explicit “Apply” button buys us much.
It rather turns out that the whole current “Permissions” tab approach is rather unintuitive due to UNIX permission obfuscation. I’ve also found a Thunar file manager wiki page containing an insigthful comparison of different UNIX permissions GUI approaches, and I think the KDE/Apple approach is really better than the others. Maybe we should adopt it and explicitly ask the user whether he wants to apply the changes to the directory contents when changing some permissions. We should probably not mess with the “x” part of permissions at all when changing directory permissions. If people did -x, they did it for a distinct reason, and an admin’s approach to file management can obviously never be mapped to a GUI.

We want YOU!

You’re a frequent GNOME user, and want to help out? You can help us! If you have problems with your GNOME software, report them if they’ve not yet been reported.
People doing GNOME development suffer from a constant lack of time. They are typically working full-time, and their GNOME commitment is limited to their spare time. Therefore, they need people who do the prelimitary work for them. We have to know about the users’ problems to tackle them. Unfortunately, without managing and sorting all their wishes by priority and category, we can’t deal with those reports. That’s where you can help out! Help us to not loose track of all the interesting and cool ideas our users have! Join the GNOME BugSquad! Just pick a software product you really like and follow the instructions.
We really have to join forces to improve our software, let’s do it!

Regarding Nautilus, I’d like also like to thank in particular some very active users and developers who got more and more involved into Nautilus, by filing bugs, triaging them and writing patches, including Vidar Braut Haarr, Christian Kirbach, Reinout van Schouwen, Jaap A. Haitsma, Fabio Bonelli, Teppo Turtiainen, Nelson Benitez and many many others I forgot.

It’s nice to see that Martin Wehner could also invests some of his very very limited spare time into triaging and fixing some bugs, so a previously inactive maintainer is back on stage again.

Good desktop experience is all about getting the details right :).

XComposite required for semi-transparency

thos: IIRC, a compositing manager together with the XComposite extension is in fact required for having a semi-transparent (thus arbitrarily-shaped) window.

There is another X extension available allowing arbitrarily shaped windows without any semi-transparency (each pixel is either displayed or omitted). It uses a bitmap, i.e. you explicitly set which pixels to display and which to “carve out”. It probably takes care of that by sending expose events to the underlying windows, and overlaying the results afterwards. This would still mean on-screen drawing, though, compared to off-screen XComposite drawing.

On a sidenote: I think the problem with XComposite is that even if the library is loaded, there is no guarantee that the underlying graphics hardware supports it, and that the user gets semi-transparent output.

For instance, some people demanded semi-transparent drag icons for Nautilus, but it turned out that finding out whether compositing is actually done is nontrivial. I made a proposal on the xdg freedesktop.org mailing list for finding out whether a compositing manager is actually running, but nobody has proposed a freedesktop spec yet.