Having discussed how to evolve GNOME on LUGRadio, I figured that an interesting proposal for a Google Summer of Code project would be to fork GTK+. Many people in the community have talked about how GNOME 3 would need to happen outside the current structures. I also think that nothing fundamental will change in GNOME without a new GTK+ giving the impetus for such a change. So an enterprising student could put together a proposal for taking GTK+ and trying to make a ‘Beryl’ version. The goal need not be to create something that would actually become GTK+, but instead to come up with changes to GTK+ that enable some stunning graphical effects inside GTK+ applications, kind of like what they are doing with Beryl at the window manager level. So the ‘fork’ would not care about things like maintainability, portability or sensibility, but instead try to enable some select demo applications to do some amazing-looking things. Enlightenment (which also sports a GUI toolkit these days) would be a good source of ideas for some cool effects, Beryl another. Another idea could be to try to integrate librsvg with GTK+ and use it to do interesting things. The goal of such projects should simply be to try to inspire the GNOME community into taking the leap.
When GNOME originally came out, its theming capabilities essentially set the bar for letting users and developers change the look and feel of their desktop. Let’s try to do so again :)
And to make it clear: by ‘fork’ I don’t mean an actual fork in the sense of a new project meant for a life of its own, but more of a wild and wacky experimental branch.
I think that much of the GTK+ code must be rewritten, especially to make it render with vector graphics, as Cairo does, but I’m not so sure about equipping it with hilarious effects; I think those are only needed at the WM level.
I think *Compiz* was the example you were looking for ;)
If the widgets are drawn using vectors, we should be able to zoom them correctly. I really like the zoom plug-in of Beryl, but it’s only zooming the bitmap of the widget.
And what about having 3D widgets? For sure, it would be a nightmare to design them.
Not just a rewrite. If you really want to innovate, or more importantly, allow innovation, the focus of the GTK toolkit needs to change.
Currently, the focus is that the developer chooses the actual user interface. We gain consistency using conventions and the HIG. When we update these conventions we have to recreate the interfaces. But even so, the consistency isn’t anywhere near perfect. Preferences are not always found in the same place, or even under the same name. Toolbars are not always located in the same place. Confirmation questions are not always asked for the same things or handled in the same way. The list goes on and on. Secondly, it’s a burden on the programmer, because they need to implement the HIG over and over again for each app. A lot of duplicated work goes into this.
Why not define a completely new API based upon SEMANTIC interfaces? Let a programmer describe ‘what’ the type of interaction would be, instead of defining ‘how’ the interaction should take place. So instead of talking about Windows, Buttons and Grids, talk about Dialogs, Actions, Browsers, Tasks, Forms, Questions, etc.
At first this API would translate the semantic UI definition of a program, based on the HIG, into calls to the normal GTK+ toolkit. Let’s call this higher-level library GSI (gnome-semantic-interface). So it would work like this:
Program -> GSI -> GTK+
Example program: Rhythmbox
Rhythmbox defines a ‘browser’ that selects a current ‘playlist’, which in turn selects a current ‘song’. The GSI knows nothing about playlists or songs, but it does know about Browsers (tree-based selection) and Lists. The programmer would then define the actions that can be performed on a browser-node (a media-source in this case) and on a list-node (the play-list in this case).
These actions also have some sort of priority. Based on all this information the GSI creates a user interface, with high-priority actions as buttons and lower-priority actions as right-click options, while offering all the options as menu options as well, in a structured way. The program throws events of different types. Some are informational messages, some are progress-related, etc. The GSI decides what to put in the status bar, what to pop up as a notification message and what to tell the user using a modal message box.
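To make the priority idea a bit more concrete, here is a rough Python sketch of how such an engine might decide where actions and events end up. Everything in it is an assumption made for illustration: the names `Action`, `placements` and `route_event`, the priority threshold and the event-to-surface mapping are all hypothetical, and none of it is an existing GTK+ or GNOME API.

```python
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    INFO = "info"
    PROGRESS = "progress"
    ERROR = "error"


@dataclass(frozen=True)
class Action:
    name: str
    target: str    # e.g. "browser-node" or "list-node"
    priority: int  # higher means more prominent


# Hypothetical threshold; a real engine would derive this from the HIG.
BUTTON_PRIORITY = 8


def placements(action: Action) -> list[str]:
    """Decide where an action is shown; every action is also put in the menu."""
    if action.priority >= BUTTON_PRIORITY:
        return ["button", "menu"]
    return ["context-menu", "menu"]


def route_event(kind: EventKind) -> str:
    """The engine, not the program, picks the surface for each event type."""
    # This mapping is an assumption chosen for the example.
    return {
        EventKind.PROGRESS: "status-bar",
        EventKind.INFO: "notification",
        EventKind.ERROR: "modal-dialog",
    }[kind]


actions = [
    Action("play", "list-node", 9),
    Action("add-to-queue", "list-node", 5),
]
for action in actions:
    print(action.name, placements(action))
print(route_event(EventKind.PROGRESS))
```

The program only states what the actions are and how important they are; swapping the threshold or the mapping changes the whole desktop without touching the program.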
At first we should just try to get as close as possible to the current gnome-desktop. However, people can then also experiment by creating their own GSI engine implementing the same API. This way they can test HIG improvements. They want a single window for every task? They can create that. They want big monolithic applications? They can create that. Not just for one app, but for the whole desktop.
There could also be a GSI that creates a native Windows or Qt app using the conventions found in those environments. GNOME isn’t about GTK+ anymore, it’s about the HIG. That is its strength. Unfortunately, we put all the burden on the programmer.
This would also help a lot in making a GNOME desktop work with accessibility tools or on a mobile platform where there is less space. Currently we need to redesign the interface of programs to work with these.
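As a toy illustration of interchangeable engines, the Python sketch below feeds the same task list to two engines with different window policies. The `Engine` interface and both engine classes are hypothetical names invented for this example; the only point is that the program's description stays the same while the policy changes.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical GSI engine interface: same input, different UI policy."""

    @abstractmethod
    def windows(self, app: str, tasks: list[str]) -> list[str]:
        """Return a description of each window the engine would open."""


class WindowPerTaskEngine(Engine):
    def windows(self, app: str, tasks: list[str]) -> list[str]:
        # One window for every task.
        return [f"{app}: {task}" for task in tasks]


class MonolithicEngine(Engine):
    def windows(self, app: str, tasks: list[str]) -> list[str]:
        # One big window that holds all tasks.
        return [f"{app}: " + " | ".join(tasks)]


tasks = ["browse", "play", "edit-tags"]
for engine in (WindowPerTaskEngine(), MonolithicEngine()):
    print(type(engine).__name__, engine.windows("Rhythmbox", tasks))
```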
Of course I understand that it won’t be easy to come up with a good semantic API that really does cover most use-cases (if not all). Do we want to abstract stuff like undo & redo? Typical file operations such as open/save/close/exit? Stuff like help & about? Typical game stuff (start game, exit, highscores, etc.)?
At least for now I think it’s best if it only deals with the layout, though. It should offer all these typical interface use-cases, but not implement them beyond creating a default UI and sending/receiving semantic events about it. So programs might declare that they want to use the default file operations and receive events about them, but the program itself should still take care of opening, saving and closing the file.
Then, even if we are able to define an API, there is still the implementation issue. Is it a library-like implementation, or a server-like implementation using D-Bus messaging? Or perhaps we put the semantic UI in some sort of XML file that you can just load into a program? I would actually vote for the D-Bus type of communication because it’s so programming-language neutral. But I can imagine some issues with it.
All right, that was a long rant. I hope I did not bore you all ;-) But this has been playing in my head for some time now.
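A minimal sketch of that split, in Python and purely under assumptions: `SemanticUI`, `use_default_file_operations`, `connect`, `emit` and the event names are all made up for illustration. The engine would own the Open/Save/Close UI and only report semantic events, while the program keeps doing the real file work in its handlers.

```python
from collections import defaultdict
from typing import Callable


class SemanticUI:
    """Hypothetical stand-in for the GSI: owns the UI, emits semantic events."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def use_default_file_operations(self) -> list[str]:
        # A real engine would build the default UI here; we only name the events.
        return ["file-open", "file-save", "file-close"]

    def connect(self, event: str, handler: Callable[[str], None]) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, path: str) -> None:
        # Called by the engine when the user triggers the matching UI element.
        for handler in self._handlers[event]:
            handler(path)


opened: list[str] = []


def on_open(path: str) -> None:
    # The program, not the engine, is responsible for actually opening the file.
    opened.append(path)


ui = SemanticUI()
ui.use_default_file_operations()
ui.connect("file-open", on_open)
ui.emit("file-open", "/tmp/example.txt")  # simulate the user choosing a file
print(opened)
```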
Re-architecting gtk+ is not a job for a student who works for one summer. It’s a job for a single brilliant and highly experienced person, or a very small team.
It’s the bazaar. He is giving it a try. If it’s a good improvement, it will be adopted. If not, then it will stay what it was: an attempt.
Secondly, lack of experience can also help solve the tunnel-vision syndrome commonly found in APIs created by experienced users. Good programming practices change, use-cases evolve and priorities may switch. Too much familiarity with a subject is not always a good thing. For each design mistake an experienced developer will not make, there is a design mistake he will copy.
So perhaps he can create a good start. Then an experienced developer comes along and sanitizes it.
At least give him a chance. In the worst case scenario, he just learns a lot. In the best case, we have a new and improved GTK toolkit.
I have had almost _exactly_ the same idea in my mind for a while. It seems to me that graphical toolkits still haven’t experienced the separation of presentation from content that CSS brought to HTML. When people write modern websites they describe the content using the right semantics rather than worry about presentation. Presentation (and, more importantly, _changing_ presentation) is then a formality.
>> Why not define a completely new API based upon SEMANTIC interfaces.
I kind of like: libhig
However it would happen, I really really like this idea.
@ Simon Gray & Patrick Hallinan
I’m glad to hear I’m not the only one ;-)
But I would like to emphasize something. I am not just talking about layout. I’m talking about behaviour as well.
CSS is not the best example because of this. CSS doesn’t control how the URL is formed, nor whether to display just icons, a list or a grid of patterns. It doesn’t control any part of the behaviour (client-side or server-side). It only controls the visual aspects of it.
This is also why I think it has never been done. The only exception I can think of is zenith, but it deals with the simplest type of interface: dialogs. But zenith dialogs can be terminal-gui, command-line and GTK.
It is going to be a lot harder to do the same for the type of interfaces found on the gnome-desktop. Especially if it should be flexible enough to create a consistent and usable terminal and voice-controlled interface as well. How do we get the rhythmbox-gnome-interface and an equally usable command-line interface from the same semantic definition?
I’m not even worried about the browser-type programs (webbrowser, filebrowser, musicbrowser, photobrowser, etc.). But what about graphical programs like the GIMP? Or music creation tools like ‘Jokosher’? Or a spreadsheet? Hell, what about the calculator? It’s going to be difficult to come up with a semantic API that actually creates the calculator interface we are all used to. On the other hand, it should be general enough that it doesn’t require a use-case for each type of application that exists out there, or it would not really serve a purpose anymore.
I think such a refactoring might be one of the most important next steps for the Linux desktop.
That said, I think having this engineered by an inexperienced summer student is absurd. We should have our very best people on it! Let the summer student take over the current version maintenance tasks, freeing up the maintainers to experiment.
What I don’t understand is why does gdk exist if cairo is cross platform drawing code? Wouldn’t it be a reasonable progression to just kill all deprecated stuff in gtk and switch the backend to cairo? Wouldn’t this cut down the maintainer work? Wouldn’t this provide the opportunities for bling?
benoror: Window Managers can only do graphic bling at the window level. GTK needs to evolve to allow graphic bling at a *widget* level (just like vista/avalon and E17 did)
Instead of thinking about a semantic API, why not think about a semantic presentation language? This way, you can parse it with a simple function call and you will just need functions to change its attributes.
A semantic language would be a lot more flexible, I admit. But it will also integrate less well into the current toolchain.
I suspect you are thinking about a pure functional language that allows us to define the semantics using combinators and higher-order functions. Which is very cool, but it will also limit the audience that ‘gets’ it. If everyone were able to work within such a context, why isn’t Haskell used more? It pretty much provides everything we’re talking about here, and within Haskell, everything integrates nicely.
Lots of good ideas on this page. As a current KDE user and former Gnome user, it gives me hope for GTK and Gnome. Aside from Nautilus which was born outside of the traditional Gnome womb, Gnome does have this “old” look and feel to it.
Cairo will probably provide short-term improvements to the look and feel, but it is going to take a wishful, hopeful, risk-taking approach to keep Gtk and Gnome fresh in the long term. The “fork” idea is really a great idea because it gives you the possibility of failure. If you have no tolerance for failure, you are less likely to try anything new.
The one important thing that GTK+ needs is a .gtkrc option to give its scrollbars Athena-style behavior.
The current scrollbars are “intuitive” but dysfunctional and unergonomic.
Here is what Athena-style scrollbars offer in contrast:
a) control the amount of scrolling for one click: scroll by just few lines without having to “drag”, the most RSI-prone mouse operation.
b) change the direction of scrolling without having to move the mouse, using the same scroll amount as for scrolling in the other direction
It is a pity that a good design from literally eternities ago has not been picked up. It is ok if that behavior requires additional configuration _if_ that configuration is available for serious users.
The whole Beryl/Compiz junk does not offer half of the added ergonomics and productivity for daily use, yet will cost a thousand times the effort.
It is a pity that eye candy is considered so much more important than injury-avoiding ergonomic scrollbar semantics.
A semantic interface, using some sort of XML format for the UI and then displaying it based on a libHIG along with some kind of CSS-style rules, would be awesome.
Primarily because you could write different rules for screens of different sizes and shapes, so instead of having to create a custom UI when working on a device with more limited functionality than a desktop, you could just create a set of rules for that device.
I’m a bit late to this discussion, I admit, but I’ve been looking for a way to get more involved in GNOME at large. Does this idea have a mailing list yet?
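Something along these lines, as a rough Python sketch: the XML vocabulary (`interface`, `action`, `priority`), the `RULES` table and both function names are hypothetical, invented only to show one semantic description yielding different UIs per device. The XML parsing uses the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic description of an application's actions.
UI_XML = """
<interface app="example">
  <action name="play" priority="9"/>
  <action name="add-to-queue" priority="5"/>
  <action name="show-properties" priority="2"/>
</interface>
"""

# Hypothetical per-device rules: (max screen width in px, min priority for a button).
RULES = [
    (480, 9),    # phone-sized screen: only the most important action
    (1024, 5),   # tablet-sized screen
    (None, 2),   # anything larger: show everything
]


def min_button_priority(screen_width: int) -> int:
    """Pick the first rule whose width limit fits the screen."""
    for max_width, min_priority in RULES:
        if max_width is None or screen_width <= max_width:
            return min_priority
    raise ValueError("no rule matched")


def visible_buttons(xml_text: str, screen_width: int) -> list[str]:
    """Return the actions that get a button on a screen of the given width."""
    threshold = min_button_priority(screen_width)
    root = ET.fromstring(xml_text)
    return [
        action.get("name")
        for action in root.iter("action")
        if int(action.get("priority")) >= threshold
    ]


for width in (320, 800, 1920):
    print(width, visible_buttons(UI_XML, width))
```

Supporting a new device would then mean adding a row to the rules, not redesigning each program's interface.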