Ubuntu Brainstorm Announced!

The Ubuntu QA community have put together an awesome new resource for Ubuntu users and developers – Ubuntu Brainstorm.  This allows you to suggest ideas for improvements, and to vote on the ideas others have suggested.

We have of course been inspired by the IdeaStorm site from our good friends at Dell but modified the concept to fit our needs.

The development team can now take the pulse of the most pressing user issues and propose the ideas as topics at the Ubuntu Development Summits and ultimately as specifications. Ubuntu development is in turn driven by detailed specifications written up in the wiki and tracked as blueprints in Launchpad.

An idea on Brainstorm can easily be linked to a Launchpad blueprint as well as to a bug or a forum discussion thread. In this way we expect to bridge the locations where ideas are often submitted now, as forum posts or bug reports, with the blueprint format they should be expressed in to be implemented.

How to (and why) supervise forking processes

Yesterday’s celebratory blog post demonstrated that Upstart is now able to supervise processes that fork into the background, as most daemons do. Now that the code has undergone a little more testing, and been pushed into the archive, it’s worth explaining a little bit more of the background as to the how, and why, we do this.

The why is easiest to answer first. Daemons are normally written to fork, usually twice; this detaches them from the terminal, process group and session that they were spawned from so that they remain running after the user logs out. The fork isn’t just mechanism though; over time a convention has emerged that daemons don’t go into the background until their initialisation is complete and they’re ready to receive connections (if that’s their bag).

Simply adding an option to remain in the foreground might appear to eliminate the need to deal with the problem, but this also takes away the notification that the daemon is ready for use. Over time this signal can be replaced with other notifications: registering a known D-Bus name, or simply raising SIGSTOP; but these require code changes that need to be agreed with upstream first. Making code changes also assumes that we have the code. Whether we like it or not, sysadmins will often have the need to run proprietary daemons — or even simply older versions of software where the patch is too invasive.
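The SIGSTOP idea is easy to picture in C. What follows is only a minimal sketch of that notification, not Upstart's code; spawn_ready_daemon and wait_until_ready are names made up for the example, and the "daemon" does nothing but pause.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical daemon that stays in the foreground and raises SIGSTOP
 * once its (here, empty) initialisation is complete. */
pid_t
spawn_ready_daemon (void)
{
        pid_t pid = fork ();

        if (pid == 0) {
                /* ... initialisation would happen here ... */
                raise (SIGSTOP);        /* "I am ready" */
                pause ();               /* pretend to serve requests */
                _exit (0);
        }

        return pid;
}

/* Supervisor side: block until the child stops itself, then resume it.
 * Returns false if the child died before signalling readiness. */
bool
wait_until_ready (pid_t pid)
{
        int status;

        if (waitpid (pid, &status, WUNTRACED) != pid)
                return false;
        if (! WIFSTOPPED (status) || WSTOPSIG (status) != SIGSTOP)
                return false;

        return kill (pid, SIGCONT) == 0;
}
```

The supervisor never needs the daemon to fork: the stop itself is the "ready" message, and SIGCONT sets the daemon going again.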

So that’s why we have to do it, now how do we?

This is one of the reasons that building the service supervisor into init, rather than having it as a separate process, makes sense. Init has a few special kernel-provided buffs, one of which is that orphaned processes are reparented to it. When you run a daemon from the command-line, the process is initially your child; it forks once and the parent dies, the new child is now orphaned, and thus reparented to init. (Most daemons now run setsid and fork a second time. This is to ensure that if they open a tty device, they don’t unexpectedly become its owner.) Init, like any other process, receives notification about its children through wait, so it will know when daemons terminate; the “must have” of supervision.
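For reference, the fork, setsid, fork dance looks roughly like this. It is a sketch under assumptions: spawn_detached is an invented name, and the pipe that reports the final pid back exists only so the example can show the result; real daemons don't normally do that, which is rather the point of this post.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a detached process the way a classic daemon does and return
 * its pid, or -1 on error. */
pid_t
spawn_detached (void)
{
        int   fds[2];
        pid_t child, daemon_pid = -1;

        if (pipe (fds) < 0)
                return -1;

        child = fork ();
        if (child < 0)
                return -1;

        if (child == 0) {
                close (fds[0]);
                setsid ();              /* new session, no controlling tty */

                daemon_pid = fork ();   /* second fork: not a session leader */
                if (daemon_pid > 0) {
                        if (write (fds[1], &daemon_pid, sizeof daemon_pid)
                            != (ssize_t) sizeof daemon_pid)
                                _exit (1);
                        _exit (0);      /* orphan the daemon */
                } else if (daemon_pid < 0) {
                        _exit (1);
                }

                close (fds[1]);
                pause ();               /* the "daemon" itself */
                _exit (0);
        }

        close (fds[1]);
        if (read (fds[0], &daemon_pid, sizeof daemon_pid)
            != (ssize_t) sizeof daemon_pid)
                daemon_pid = -1;
        close (fds[0]);
        waitpid (child, NULL, 0);       /* reap the intermediate process */

        return daemon_pid;
}
```

The returned process lives in a new session but is not its leader, and its original parent is already dead, so it is init that will hear about its death.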

So if all daemons are our children we are notified when they terminate and why; we can compare their exit status or signal against a list of known good ones, and choose whether we need to respawn the dead job or mark it as stopped normally.
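As a rough illustration of that comparison (the struct and function names here are hypothetical, not Upstart's own), the decision can be made directly from the status that wait hands back:

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One acceptable way for a job to end: a given exit code, or a given
 * signal. */
struct normal_exit {
        bool by_signal;
        int  value;
};

/* Decide from a wait() status whether the job should be respawned. */
bool
should_respawn (int status, const struct normal_exit *normal, size_t count)
{
        bool by_signal = WIFSIGNALED (status);
        int  value;

        if (by_signal)
                value = WTERMSIG (status);
        else if (WIFEXITED (status))
                value = WEXITSTATUS (status);
        else
                return false;           /* stopped or continued, not dead */

        for (size_t i = 0; i < count; i++)
                if (normal[i].by_signal == by_signal
                    && normal[i].value == value)
                        return false;   /* ended in a known good way */

        return true;
}

/* Helper for trying it out: run a child that exits with `code` and
 * return the status reported by waitpid(). */
int
status_of_exit (int code)
{
        int   status = -1;
        pid_t pid = fork ();

        if (pid == 0)
                _exit (code);
        if (pid > 0)
                waitpid (pid, &status, 0);

        return status;
}
```

With a list containing exit code 0 and SIGTERM, a child that exits 0 is treated as stopped normally, while one that exits 1 would be respawned.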

This isn’t enough though, all we get is the process id of the dead child. We still need to relate that back to a job somehow. One way to do that is to use waitid with the WNOWAIT flag, leaving the process on the table so we can examine /proc to find out more about it. This seems like quite a reasonable approach, we can then match a process to a job by details such as what binary it was actually running. Unfortunately this only works for singleton processes where we’re guaranteed that only one of them exists, both at the job level and at the process-level itself; should the process fork, even to run another child, we could accidentally consider it to have died. Daemons need to be able to run their own children, or even have pools of them to use; and we also need to be able to run multiple copies of daemons where we can support it.

So we really do need to know the process id of the actual daemon process we should be supervising. Unfortunately any method of passing this back to init, even relatively common ones like writing it to a pid file, isn’t sufficiently standard or reliable to do this kind of work.

Ideally the kernel would just tell init when a process was reparented to it, providing both the child process id and that of its previous parent. Such a notification doesn’t exist today, though getting one into the kernel mainline would be a nice project; difficult if there’s only one implementation using it.

If we can’t have that, a syscall that would allow us to watch a process and find out when it forks would be the second-best thing. We’d have the previous process id since we were watching it, and we’d hopefully be able to obtain the new child process id from this.

Happily that syscall exists, and I suspect you use it all the time if you’re a developer; it’s a bit of a mad leap to using it inside init, but as you can see, it works rather nicely. All we need do is watch the process, and follow it each time it spawns a new child. We stop watching as soon as we have followed twice (once if a different option is used), or if the process runs a different binary by itself. And thus we can know the process id of daemons we spawned, even if they attempt to detach from their parent process which they’ll just be reparented to anyway.

What’s the syscall? Oh, hmm, is that the time? Got to go! Alright, it’s ptrace.
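Here is a minimal sketch of the trick, assuming Linux and the PTRACE_O_TRACEFORK option. It is not the code in Upstart: spawn_traced and follow_forks are names invented for this example, the child simply forks twice like a typical daemon, and the "stop if it runs a different binary" case is left out.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn a traced child that forks twice and then pauses; returns the
 * pid of the first process, or -1 on error. */
pid_t
spawn_traced (void)
{
        int   status;
        pid_t pid = fork ();

        if (pid == 0) {
                ptrace (PTRACE_TRACEME, 0, NULL, NULL);
                raise (SIGSTOP);        /* let the parent set options */
                if (fork () > 0)
                        _exit (0);
                if (fork () > 0)
                        _exit (0);
                pause ();
                _exit (0);
        }
        if (pid < 0)
                return -1;

        if (waitpid (pid, &status, 0) != pid || ! WIFSTOPPED (status))
                return -1;
        if (ptrace (PTRACE_SETOPTIONS, pid, NULL,
                    (void *) (long) PTRACE_O_TRACEFORK) < 0)
                return -1;
        if (ptrace (PTRACE_CONT, pid, NULL, NULL) < 0)
                return -1;

        return pid;
}

/* Follow `pid` across `forks` forks; returns the final pid, or -1. */
pid_t
follow_forks (pid_t pid, int forks)
{
        int status;

        while (forks > 0) {
                if (waitpid (pid, &status, 0) != pid || ! WIFSTOPPED (status))
                        return -1;      /* died before forking */

                if ((status >> 8) == (SIGTRAP | (PTRACE_EVENT_FORK << 8))) {
                        unsigned long child = 0;

                        ptrace (PTRACE_GETEVENTMSG, pid, NULL, &child);
                        ptrace (PTRACE_DETACH, pid, NULL, NULL);

                        /* The new child is traced automatically and
                         * begins life stopped; wait for that stop. */
                        pid = (pid_t) child;
                        if (waitpid (pid, &status, 0) != pid)
                                return -1;
                        if (--forks > 0)
                                ptrace (PTRACE_CONT, pid, NULL, NULL);
                } else {
                        /* Some other signal: pass it on and keep going. */
                        ptrace (PTRACE_CONT, pid, NULL,
                                (void *) (long) WSTOPSIG (status));
                }
        }

        ptrace (PTRACE_DETACH, pid, NULL, NULL);
        return pid;
}
```

Calling follow_forks (spawn_traced (), 2) should hand back the pid of the process that ends up sitting in pause, which is exactly the one a supervisor wants to watch. The first process is still our own child, so it needs reaping with waitpid once it has exited.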

Supervising forking processes


quest /tmp# cat test.c
#include <sys/types.h>

#include <stdlib.h>
#include <unistd.h>

int
main (int   argc,
      char *argv[])
{
        pid_t pid;

        pid = fork ();
        if (pid > 0)
                exit (0);

        pid = fork ();
        if (pid > 0)
                exit (0);

        pause ();
        exit (0);
}
quest /tmp# gcc -Wall -g -O0 -o test test.c

quest /tmp# cat /etc/event.d/test
wait for daemon
exec /tmp/test

quest /tmp# start test
test (#0) goal changed from stop to start
test (#0) state changed from waiting to starting
event_new: Pending starting event
Handling starting event
event_finished: Finished starting event
test (#0) state changed from starting to pre-start
test (#0) state changed from pre-start to spawned
process_spawn: Spawned main process 6380 for test (#0)
Active test (#0) main process (6380)
test (#0) main process (6380) forked new child 6381
test (#0) main process (6381) forked new child 6382
test (#0) state changed from spawned to post-start
test (#0) state changed from post-start to running
event_new: Pending started event
Handling started event
event_finished: Finished started event

Ubuntu Desktop Developer

Continuing my mission to put together a kick-ass team to develop the Ubuntu Desktop, the following position is now up on the website:

Posting Date & ID: September 2007 UDD
Job Location: Your home with broadband. Some international travel will be required.
Job Summary: To adapt and develop the GNOME desktop to improve the Ubuntu user experience.

Key responsibilities and accountabilities:

  • Use open source development methods to create, select and adapt software to produce innovative user experiences and address the common problems of desktop computing
  • Extend the desktop platform as necessary to support development
  • Work with designers, artists and other developers to develop ideas and complete the project
  • Involve the community of development projects, teams and Ubuntu supporters to incorporate a range of perspectives and ideas
  • Take ownership of many aspects of the desktop user experience (“look and feel”) in Ubuntu
  • Follow projects and trends in user interface design in the open source world, integrate the best technologies into Ubuntu and ensure their quality
  • Analyse, triage and respond to bug reports

Required skills and experience:

  • A keen and insightful eye for user interaction
  • A passion for intuitive, usable and visually appealing interfaces
  • A strong desire to produce distinctive ideas that make Ubuntu stand out from the crowd
  • Experience with the GNOME development platform and desktop environment and technologies such as GTK+
  • Some experience with mainstream graphics technologies such as OpenGL and Cairo in the C programming language
  • Ability to be productive in a globally distributed team through self-discipline and self-motivation, delivering according to a schedule
  • Familiarity with open source development tools and methodology, especially those in common use for Ubuntu and Debian package maintenance

How to apply

Please send a cover letter and CV with references to hr@canonical.com. Please indicate in your submission the role for which you are applying. We prefer to receive applications and CVs/Resumes in either PDF or plain text format.

Online Desktop

Havoc’s keynote at GUADEC was extremely interesting, especially for how it polarised the people present.

Several people seemed very upset with the notion that f-spot should be replaced by flickr, but I think that was a problem with the way that Havoc presented the message, and not the underlying idea.

Instead consider f-spot and flickr as sharing the same collection of data, and being two different ways to view and manage it; with changes from one appearing in the other. The mechanism isn’t important.

Consider the following:

  • While out and about, I take a picture with my camera phone.
  • On coming home, the phone is within bluetooth range of my laptop (with both enabled).
  • The laptop sees the new picture, so announces the availability of the new picture.
  • f-spot is subscribed to those announcements, and causes the picture to be copied into my local f-spot library, with the meta-data adjusted to indicate the local cache (as well as the origin).
  • flickr is also subscribed, so the picture is automatically uploaded to my flickr account.
  • At some point in the past, a friend on Facebook changed their mobile number; this was detected and the change announced.
  • e-d-s was subscribed, so automatically adjusted my contacts.
  • And my phone sync service is subscribed, so now that my phone is in range, its contact list is updated too.

Now, isn’t that cool?

Something for everybody

According to the current issue (#93) of Linux Format, Ubuntu 7.04 (“Feisty Fawn”) is “…a dull release for Ubuntu, leaving Fedora to storm ahead…” (p. 23) whilst “shaping up to be one of the most innovative Linux distro releases of the year.” (p. 38)

Especially amusing for me is that, with Upstart, they “seldom notice any difference in boot speed” (p. 42), yet “Ubuntu 7.04 boots up in record time, leaving other Linux distros in the dust.” (p. 22)

(As anyone who’s ever read anything about Upstart will know, Ubuntu still uses the SysV-rc scripts so there should be no difference in speed at this point. Funnily enough, they identified the reason Ubuntu boots fast in the same issue; “Changing the /bin/sh symlink to point to Dash instead of Bash can significantly shorten boot times” (p. 33) — unfortunately they simultaneously claim that Dash is only “almost POSIX compliant”, without explaining why they think it isn’t.)

In this modern world, the lack of any editorial direction or basic research into what’s being printed is quite refreshing.

Upstart 0.3

For the last couple of months, both at the Ubuntu Developer Summit in Mountain View and on the #upstart IRC channel, we’ve been discussing the changes we want to make to upstart for the Feisty Fawn release of Ubuntu.

This will ship with a version of upstart based on the 0.3 series (it may end up getting called 0.5 before release); the primary goal for this is to have an init system that is suitable for general standalone use in any Linux distribution.

I’ll be giving a talk at linux.conf.au 2007 in Sydney with that aim; I hope to persuade at least one other major Linux distribution that it’s the right solution.

A complete list of the specifications and bugs being targeted for the 0.3 release can be found in Launchpad.

The rest of this post will introduce some of the shiniest new things.

Writing Jobs

Upstart takes care of starting, supervising and stopping daemons itself; unlike in the init script system where you have to write code to do that yourself, often using a helper like start-stop-daemon. All you need to do is give the path to, and arguments for, the binary you wish to be started.

exec /usr/bin/dbus-daemon

Some jobs, especially quick tasks, will usually be written as shell scripts. To save having to write a separate file and invoke it, you can include shell script code directly in the job file instead of using the exec stanza.

script
    echo /usr/share/apport/apport > /proc/sys/kernel/crashdump-helper
end script

Usually it’s not sufficient to just start a binary and wish it well; you frequently need something to be run before it is started to prepare the system, and sometimes something after it terminates to clean up again.

For these purposes, additional snippets of shell code can be given — to be run before the binary is started, and after it has finished. Unlike init scripts, these do not need to start or stop the daemon itself; that’s done automatically based on the exec stanza.

pre-start script
    mkdir -p /var/run/dbus
    chown messagebus:messagebus /var/run/dbus
end script

post-stop script
    rm -f /var/run/dbus/pid
end script

For consistency, executables may be specified with pre-start exec and post-stop exec instead of shell scripts as above.

It’s sometimes useful to be able to run something after the binary has been started; for example, you may wish to attempt to connect to the daemon to determine whether it is ready to serve requests. post-start script or post-start exec can be used for this.

post-start script
    # wait for listen on port 80
    while ! nc -q0 localhost 80 </dev/null >/dev/null 2>&1; do
        sleep 1;
    done
end script

It’s also useful to be able to notify a daemon that it may be about to be stopped, or delay it for a while. pre-stop script or pre-stop exec can be used for this.

pre-stop script
    # disable the queue, wait for it to become empty
    fooctl disable
    while fooq >/dev/null; do
        sleep 1
    done
end script

Events

Events are now quite a bit more detailed than in previous versions; they’re still named with simple strings that are up to the system sending the event, but they can now include arguments and environment variables which are passed through to jobs being started or stopped as a result.

initctl emit network-interface-up eth0 -DIFADDR=00:11:D8:98:1B:37

This command will now output all of the effects of this event, and will not terminate until the event has been fully handled inside upstart.

Events such as the above can be used by jobs that examine the event arguments and environment within their script:

start on network-interface-up
script
    [ "$1" = lo ] && exit 0
    grep -q "$IFADDR" /etc/network/blacklist && exit 0
    # etc.
end script

or matched directly in the start on and stop on stanzas:

start on block-device-added sda*
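To make the matching concrete, here is a small sketch of how an emitted event might be compared against such a stanza. It assumes the argument patterns are ordinary shell-style globs, which the sda* example suggests but this post doesn't spell out; event_matches is a made-up name, not Upstart's API.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Does an emitted event (name plus arguments) satisfy a "start on" line
 * made of an event name followed by zero or more argument patterns?
 * Both arrays are NULL-terminated; extra event arguments are ignored. */
bool
event_matches (const char *want_name, const char *const *patterns,
               const char *name, const char *const *args)
{
        if (strcmp (want_name, name) != 0)
                return false;

        for (size_t i = 0; patterns[i] != NULL; i++) {
                if (args[i] == NULL)
                        return false;   /* event has too few arguments */
                if (fnmatch (patterns[i], args[i], 0) != 0)
                        return false;
        }

        return true;
}
```

Under that assumption a block-device-added event for sda1 matches the stanza above, while one for sdb1 does not.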

The events generated by job state changes have also changed. Previously both jobs and events shared the same namespace, which not only caused confusion but actually caused some problems when one accidentally named a job after an event.

The two primary events generated are now simply called started and stopped; they inform you that a job is fully up and running, or fully shut down again. The name of the job is received as an argument to this event.

start on started dbus

The started event is not emitted until the post-start task (described above) has finished; so the post-start task can delay other jobs from starting because they can’t yet connect to the daemon.

Likewise the stopped event is not emitted until after the post-stop task has finished.

The other two events emitted by a job are special; they are the starting and stopping events. The reason they are special is that the job is not permitted to start or stop until the event has been handled.

This means that if you have a task to perform when your database server is stopped, but before it’s actually terminated, it’s as simple as:

start on stopping mysql
exec /usr/bin/backup-db.py

MySQL won’t be terminated until the backup has finished.

This is especially useful for daemons that depend on each other. For example, HAL needs DBUS: it shouldn’t be started until DBUS is running, and DBUS should not be stopped until HAL has been terminated. All the HAL job needs is:

start on started dbus
stop on stopping dbus

Likewise if tomcat is installed, Apache should not be started until tomcat is running; and tomcat should not be stopped until apache has been terminated. All the tomcat job needs is:

start on starting apache
stop on stopped apache

Failure

Nothing goes smoothly all of the time; sometimes tasks the job runs will fail, or the daemon itself will die. As well as providing the ability for a crashed daemon to be automatically restarted, upstart ensures that other jobs are notified with a special failed argument to the stopping and stopped events.

start on stopped typo failed
script
    echo "typo failed again :-( " | mail -s "typo failed" root
end script

And if any job started or stopped by an event fails, it’s possible to discover that the event itself failed.

start on network-interface-up/failed

States

While tasks such as configuring a network interface, or checking and mounting a block device, are usually performed as a result of events, services are more complicated.

Services normally need to be running while the system is in a certain state, not just when a particular event occurs. Therefore upstart allows you to describe arbitrarily complex system states by referring to events that define their changes.

For example, many services should be running only while the filesystem is mounted, and at least one network device is up. We have events to indicate the changes into and out of these states; we just need to combine them:

from fhs-filesystem-mounted until fhs-filesystem-unmounted
and from network-up until network-down

The until operator defines a period between two events, the and operator ensures we’re within both of these periods.
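One way to picture these operators is as tiny state machines fed by events. The sketch below is purely illustrative, with made-up names, and is not how upstart implements them:

```c
#include <stdbool.h>
#include <string.h>

/* A period that opens on one event and closes on another. */
struct period {
        const char *from;
        const char *until;
        bool        active;
};

/* Feed one event name to a period, updating whether we are inside it. */
void
period_handle (struct period *p, const char *event)
{
        if (strcmp (event, p->from) == 0)
                p->active = true;
        else if (strcmp (event, p->until) == 0)
                p->active = false;
}

/* "a and b": the job should be running only while both are active. */
bool
periods_and (const struct period *a, const struct period *b)
{
        return a->active && b->active;
}
```

Feeding fhs-filesystem-mounted and then network-up to both periods makes the combined state true; a later network-down makes it false again, even though the filesystem is still mounted.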

Perhaps we need to be running while any display manager is:

from started gdm until stopping gdm
or from started kdm until stopping kdm

Or maybe we only want to be run if a network interface comes up before bind9 has been started:

on network-interface-up and from startup until started bind9

These “complex event configurations” can appear in any job file; and any job file itself can serve as a reference for other jobs. They will be started and stopped at the same time as the named job:

with apache

Omitting the exec or script stanza from a job file means that it simply defines a state that can serve as a reference for others. As such, the multiuser state is simply a job file that defines it.

As an added bonus, these states can still have pre-start, post-stop, etc. scripts.

Slippery Slopes

One of the most interesting things about slippery slopes is how you never seem to be standing at the top of them, looking down. The slope seems fine at the top, and it’s only once you start down it that you realise this could end up with some broken limbs.

When Ubuntu was formed, Debian were having a debate about how to treat GFDL documentation. It was their opinion that the GFDL was inherently non-free, and they’ve since taken steps to remove all such licensed documentation from their main distribution. We took a more pragmatic approach, and decided that it maintained the spirit of freedom, and thus we continue to this day to ship that documentation in our main distribution.

A similar discussion resulted in the handling of data files such as graphics, icons, fonts, etc. We decided that such things didn’t necessarily need to ship with corresponding source code, as frequently they don’t have any such thing or when they do, it’s just as easy to modify the data file directly.

The slope didn’t seem at all slippery back then.

Then came the issue of firmware, binary blobs in the kernel which are uploaded into a flash (or similar) chip in the hardware. Could we distribute these? On one hand, these blobs have always existed, they just used to be in ROM in the hardware; the move to firmware doesn’t change that. On the other hand, they’re machine code and if we had the source, we could improve the hardware as well.

And what if we didn’t distribute them? Our users would be stuck without being able to use some fairly (to them) critical parts of their computer.

In the end, the argument that firmware isn’t inherently any less free on the disk than in the ROM won, so we opted to continue to ship it.

Perhaps that slope is a bit slippery, but we’ve got a good foothold.

Of course, at that point somebody notices the binary “Hardware Access Layer” in the Atheros WiFi card driver. It’s not firmware, it’s run on the host processor, and is separate to “comply with FCC law”. (The ipw3945 driver has a binary daemon that allegedly performs the same legal function.)

Again, if we don’t distribute that, a large section of laptop users will not be able to use their WiFi cards. A compromise was reached; because the driver is necessary we’d ship it, but in a special restricted component that makes it absolutely clear that it’s not completely free. Users could choose to remove that component and any packages from it, to keep their system untainted.

Ok, foothold wasn’t as strong as we thought; tumbled a bit, but we’re definitely on solid ground now!

That’s what we thought, anyway. Unfortunately it seems that there’s a point a little bit lower down the slope which has a fantastic vista. The views from there are just incredible, people are saying, much prettier than where we are now. The only trouble is that we’re not sure there’s a foothold down there, if we try for the better view, we could end up broken at the bottom.

I’m talking, of course, about the NVIDIA binary X driver. (Some reports/blogs/etc. indicate we’re also considering the ATI fglrx driver, this isn’t true — that driver doesn’t support AIGLX, so it’s not being considered.)

We’ve shipped this driver in our restricted section, but not enabled it by default. It’s been there for people who want it to switch on, if they know how, but the default driver has always been the free (albeit obfuscated) one in the Xorg distribution.

The problem is that users do not need this driver; they can get decent enough 2D graphics support from the free(ish) driver. In the long term, they may even get decent 3D graphics support from the nouveau driver effort.

What’s the problem then? Simple, other operating systems use the 3D GPU to make the desktop seriously beautiful. If Linux doesn’t catch up and do the same, then we’ll be considered obsolete again.

And just to drive the point home, some of our Linux friends shipped similar support in their last releases. They don’t enable the NVIDIA binary driver, but this means that a large percentage of their user base can’t get the bling without manual hackery.

We needed a way to catch up with both the commercial operating systems and other Linux distributions; we have a policy of not doing our own software development, but only packaging what others have developed, so the only way for us to get ahead was to package something that others wouldn’t.

Which brings us back to the NVIDIA binary driver. If we install that by default, we’ll be bringing a 3D desktop to more people. And we’ll gain a step ahead of the other distributions.

Will our users care? To be brutally honest, I think the answer is no! In fact, I suspect our users will largely love us for this decision. Most probably already install the NVIDIA driver anyway, because they think it’s better, or because (sadly, like me) they have a card combination not supported by the free one.

Will this make any difference to the effort to get NVIDIA to free up the driver, or at least the specs? Sorry, but to be honest again, I don’t think it’ll make one little difference. Linux distributions have been refusing to install it for years, and yet NVIDIA haven’t budged in their position.

Perhaps a new tactic is required. Maybe if we do install it, we’ll be more likely to be chosen by OEMs as we can actually support the hardware they install. Then later, we may be able to actually affect their decision as to what hardware they install, and maybe then NVIDIA will pay attention.

Will this change the perception of Ubuntu in the Linux developer community? I’m not sure about this one; I think that those who already feel strongly about the distribution of binary drivers are probably already pretty grumpy at us distributing things like the Atheros and ipw3945 drivers. I suspect this will change the opinion of a lot of people who’ve been on the fence until now, probably equally in both directions.

Will we be able to sleep at night?

Despite all of the above, personally I still think that installing and using the nvidia driver by default, when the nv driver would do, is the wrong decision.

If the nv driver doesn’t work, I’m willing to accept the nvidia driver being used; provided that there’s some message informing the user what’s happened, why it has happened, and which alternate graphics cards they can purchase if they aren’t willing to accept a non-free binary driver.

If the nv driver is good enough for 2D, I would prefer that we instead disabled the 3D desktop effects for this group of users (by default). A similar message could explain why this is disabled, again which alternate graphics cards provide this by default, but also provide a button for the user to enable it if they wish. While we should make the correct moral decision for the defaults, we shouldn’t stand in the way of users who wish to make a different decision for themselves.

Later as the nouveau driver becomes stable, we may be able to activate 3D support for nvidia users by default.

I think that preserves our current foothold, we’d only activate it if there is no free alternative. Where there is, we’d be educating users about why they may wish to consider alternatives to NVIDIA in future, while at the same time not getting too much in their way if they want to see the better view down the slope.

Unfortunately it’s not my decision, and I suspect that the lure of the bling will win out. With any luck, we’ll find a foothold there, and the fallout of doing so won’t be too bad. I’m just worried that once we compromise on this, we’ll start compromising on other things … would we replace Firefox with a non-free web browser that rendered web pages “better”?

The slippery slope only gets steeper from here on …

What we'll get in feisty

This post is a sequel to my “What I want in edgy+1” post, which was written when the developer summit was first announced. Now that the summit (and the following company All Hands meeting) is over, and we’re all back home, this seems as good a time as any to review what was discussed and get a good idea of what feisty might look like.

I’ve touched on the problem of predicting time-based releases before. It’s both the gift and the curse of a time-based release schedule that work not completed in time can be deferred to a later release. So take the following with a pinch of salt, some of this may still not make it.

General Themes

The general theme of dapper was to be a release that could be supported for a long term, conservatism was the goal. We did do some quite exciting work under the hood, such as the switch away from hotplug to a fully udev based system, but in general it wasn’t innovative or ground-breaking.

Edgy was intended to be more ground-breaking, but the practical matter of having only a few months to develop it and our own pride in shipping something that still worked meant that it turned out as a shinier, improved dapper.

So what’s feisty going to be like? Judging from the discussions at UDS, and the specifications that have been written, the general theme of feisty is to lead the way again with new technologies.

The Desktop

For the users, perhaps the most obvious change will be the active use of 3D acceleration to draw the desktop where hardware can support it (the issue of binary drivers has not yet been resolved).

Windows are more visually distinct from each other through shadows behind them, and transparency for the non-active windows. The relationship between different workspaces/viewports is much clearer as the transition is animated on a cube or sliding pane.

And for the bling crowd, windows can wobble, burn, explode or dissolve.

There are two different compositors being considered at this point, compiz and beryl; we’re likely to decide which to use at Feature Freeze based on how well they’ve been fixed, developed and supported until that point.

Underneath the hood, the configuration of the X server will be simpler and more robust; so even the worst case will not leave you confined to a console without any help.

Networking

Networking in feisty should be a much more pleasurable experience. The Network Manager project, which has been waiting on the side lines for a couple of releases, may finally get a shot at being in the default installation. For the average user, this makes switching between wired and wireless networks, including setting up WEP and WPA much, much easier.

And what if there’s no network infrastructure around? Out of the box support for RFC 3927 link-local networks, and multi-cast DNS resolution (aka. Zeroconf), means that you just need to agree on a network name with others around you to be able to communicate.

Of course, once you’re on a network, you still need to be able to share files and access local services. The integration of the Avahi project gives you one-click access to other people’s shared music or files; and lets you share your own, should you choose to do so.

Customisations

One of the most commonly encountered problems with edgy was the difficulty of installing various common packages that aren’t part of the default installation, especially codecs. Projects such as Automatix attempt to tackle this, but can cause problems with upgrading to later releases.

Some effort will be going into feisty to make performing these common customisations much simpler, including being able to install codecs or viewers by just trying to open the file.

Boot Sequence

A long-running project within Ubuntu has been to get the boot and shutdown sequences as fast and efficient as possible. At the time we started, it was common for a Linux distribution to boot in a mere two or three minutes.

If you thought edgy booted fast, wait until you see feisty.

Feisty is the release where we take full advantage of Upstart, not only bringing the system up as fast as possible but also more robustly than we can do today.

And if that weren’t enough, it should look slicker too; without some of the nasty flickering and mode changes that happen today.