Upstart 0.3

For the last couple of months, both at the Ubuntu Developer Summit in Mountain View and on the #upstart IRC channel, we’ve been discussing the changes we want to make to upstart for the Feisty Fawn release of Ubuntu.

This will ship with a version of upstart based on the 0.3 series (it may end up getting called 0.5 before release); the primary goal for this series is to have an init system that is suitable for general standalone use in any Linux distribution.

I’ll be giving a talk at linux.conf.au 2007 in Sydney with that aim; I hope to persuade at least one other major Linux distribution that it’s the right solution.

A complete list of the specifications and bugs being targeted for the 0.3 release can be found in Launchpad.

The rest of this post will introduce some of the shiniest new things.

Writing Jobs

Upstart takes care of starting, supervising and stopping daemons itself, unlike the init script system where you have to write code to do that yourself, often using a helper like start-stop-daemon. All you need to do is give the path to, and arguments for, the binary you wish to be started.

exec /usr/bin/dbus-daemon

Some jobs, especially quick tasks, will usually be written as shell scripts. To save having to write a separate file and invoke it, you can include shell script code directly in the job file instead of using the exec stanza.

script
    echo /usr/share/apport/apport > /proc/sys/kernel/crashdump-helper
end script

Usually it’s not sufficient to just start a binary and wish it well; you frequently need something to be run before it is started to prepare the system, and sometimes something after it terminates to clean up again.

For these purposes, additional snippets of shell code can be given — to be run before the binary is started, and after it has finished. Unlike init scripts, these do not need to start or stop the daemon itself; that’s done automatically based on the exec stanza.

pre-start script
    mkdir -p /var/run/dbus
    chown messagebus:messagebus /var/run/dbus
end script

post-stop script
    rm -f /var/run/dbus/pid
end script

For consistency, executables may be specified with pre-start exec and post-stop exec instead of shell scripts as above.

It’s sometimes useful to be able to run something after the binary has been started; for example, you may wish to attempt to connect to the daemon to determine whether it is ready to serve requests. post-start script or post-start exec can be used to do this.

post-start script
    # wait for listen on port 80
    while ! nc -q0 localhost 80 </dev/null >/dev/null 2>&1; do
        sleep 1;
    done
end script
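The loop above will wait for as long as it takes, which is fine when the daemon is certain to come up. If you’d rather give up after a while, something like the following sketch could be used inside the post-start script instead. The wait_until name and its arguments are made up for this example, and what upstart does when a post-start script fails isn’t covered here.

```shell
# wait_until TRIES COMMAND [ARGS...]
# Run COMMAND about once a second until it succeeds, giving up after
# TRIES attempts. Hypothetical helper, not something upstart provides.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Success of the probe command means the daemon is answering.
        if "$@" </dev/null >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Example use inside a post-start script:
#   wait_until 30 nc -q0 localhost 80 || exit 1
```

With 30 tries this probes port 80 once a second for roughly half a minute before giving up.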

It’s also useful to be able to notify a daemon that it may be about to be stopped, or delay it for a while. pre-stop script or pre-stop exec can be used for this.

pre-stop script
    # disable the queue, wait for it to become empty
    fooctl disable
    while fooq >/dev/null; do
        sleep 1
    done
end script

Events

Events are now quite a bit more detailed than in previous versions; they’re still named with simple strings that are up to the system sending the event, but they can now include arguments and environment variables which are passed through to jobs being started or stopped as a result.

initctl emit network-interface-up eth0 -DIFADDR=00:11:D8:98:1B:37

This command will now output all of the effects of this event, and will not terminate until the event has been fully handled inside upstart.

Events such as the above can be used by jobs that examine the event arguments and environment within their script:

start on network-interface-up
script
    [ "$1" = lo ] && exit 0
    grep -q "$IFADDR" /etc/network/blacklist && exit 0
    # etc.
end script

or matched directly in the start on and stop on stanzas:

start on block-device-added sda*
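The script-based check shown earlier can be tried outside upstart by wrapping the same logic in a function and supplying the argument and variable by hand. This is only a sketch: it assumes the interface name arrives as the first argument and IFADDR in the environment, as in that job; the should_configure name and the blacklist path parameter are invented for the example, and matching whole lines (rather than substrings) is an assumption about the file’s format.

```shell
# should_configure IFACE BLACKLIST
# Succeeds if the interface should be handled, fails if it should be skipped.
# IFADDR is read from the environment, as it would be inside the job script.
should_configure() {
    iface=$1
    blacklist=$2
    addr=${IFADDR:-}
    # Skip the loopback interface.
    if [ "$iface" = lo ]; then
        return 1
    fi
    # Skip addresses listed, one per line, in the blacklist file.
    if [ -n "$addr" ] && [ -r "$blacklist" ] &&
        grep -qxF -- "$addr" "$blacklist"; then
        return 1
    fi
    return 0
}

# Example:
#   IFADDR=00:11:D8:98:1B:37 should_configure eth0 /etc/network/blacklist
```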

The events generated by job state changes have also changed. Previously both jobs and events shared the same namespace, which not only caused confusion but actually caused some problems when one accidentally named a job after an event.

The two primary events generated are now simply called started and stopped; they inform you that a job is fully up and running, or fully shut down again. The name of the job is received as an argument to this event.

start on started dbus

The started event is not emitted until the post-start task (described above) has finished; so the post-start task can delay other jobs from starting because they can’t yet connect to the daemon.

Likewise the stopped event is not emitted until after the post-stop task has finished.

The other two events emitted by a job are special; they are the starting and stopping events. The reason they are special is that the job is not permitted to start or stop until the event has been handled.

This means that if you have a task to perform when your database server is stopped, but before it’s actually terminated, it’s as simple as:

start on stopping mysql
exec /usr/bin/backup-db.py

MySQL won’t be terminated until the backup has finished.

This is especially useful for daemons that depend on each other. For example, HAL needs DBUS: it shouldn’t be started until DBUS is running, and DBUS should not be stopped until HAL has been terminated. All the HAL job needs is:

start on started dbus
stop on stopping dbus

Likewise, if tomcat is installed, Apache should not be started until tomcat is running, and tomcat should not be stopped until Apache has been terminated. All the tomcat job needs is:

start on starting apache
stop on stopped apache

Failure

Nothing goes smoothly all of the time; sometimes tasks the job runs will fail, or the daemon itself will die. As well as providing the ability for a crashed daemon to be automatically restarted, upstart ensures that other jobs are notified with a special failed argument to the stopping and stopped events.

start on stopped typo failed
script
    echo "typo failed again :-( " | mail -s "typo failed" root
end script

And if any job started or stopped by an event fails, it’s possible to discover that the event itself failed.

start on network-interface-up/failed

States

While tasks such as configuring a network interface, or checking and mounting a block device, are usually performed as a result of events, services are more complicated.

Services normally need to be running while the system is in a certain state, not just when a particular event occurs. Therefore upstart allows you to describe arbitrarily complex system states by referring to events that define their changes.

For example, many services should be running only while the filesystem is mounted, and at least one network device is up. We have events to indicate the changes into and out of these states; we just need to combine them:

from fhs-filesystem-mounted until fhs-filesystem-unmounted
and from network-up until network-down

The until operator defines a period between two events; the and operator ensures we’re within both of these periods.
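To make the semantics concrete, here’s a rough model of that expression in plain shell. It is only a sketch of the behaviour described above, not how upstart implements it; the track_state function is invented for the example and simply reads event names from standard input, one per line.

```shell
# track_state: model "from fhs-filesystem-mounted until fhs-filesystem-unmounted
# and from network-up until network-down" over a stream of event names.
# Prints a line whenever the combined state is entered or left.
track_state() {
    fs=0 net=0 running=0
    while read -r event; do
        # Each from/until pair is a flag that one event sets and another clears.
        case $event in
            fhs-filesystem-mounted)   fs=1 ;;
            fhs-filesystem-unmounted) fs=0 ;;
            network-up)               net=1 ;;
            network-down)             net=0 ;;
        esac
        # "and" means we must be inside both periods at once.
        if [ "$fs" -eq 1 ] && [ "$net" -eq 1 ]; then
            want=1
        else
            want=0
        fi
        if [ "$want" -ne "$running" ]; then
            running=$want
            if [ "$running" -eq 1 ]; then
                echo "start after $event"
            else
                echo "stop after $event"
            fi
        fi
    done
}

# Example:
#   printf '%s\n' fhs-filesystem-mounted network-up network-down | track_state
```

The service is started only once both periods have begun, and stopped as soon as either of them ends.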

Perhaps we need to be running while any display manager is:

from started gdm until stopping gdm
or started kdm until stopping kdm

Or maybe we only want to be run if a network interface comes up before bind9 has been started:

on network-interface-up and from startup until started bind9

These “complex event configurations” can appear in any job file; and any job file itself can serve as a reference for other jobs. They will be started and stopped at the same time as the named job:

with apache

Omitting the exec or script stanza from a job file means that it simply defines a state that can serve as a reference for others. As such, the multiuser state is simply a job file that defines it.

As an added bonus, these states can still have pre-start, post-stop, etc. scripts.

Slippery Slopes

One of the most interesting things about slippery slopes is how you never seem to be standing at the top of them, looking down. The slope seems fine at the top, and it’s only once you start down it that you realise this could end up with some broken limbs.

When Ubuntu was formed, Debian were having a debate about how to treat GFDL documentation. It was their opinion that the GFDL was inherently non-free, and they’ve since taken steps to remove all such licensed documentation from their main distribution. We took a more pragmatic approach, and decided that it maintained the spirit of freedom, and thus we continue to this day to ship that documentation in our main distribution.

A similar discussion resulted in the handling of data files such as graphics, icons, fonts, etc. We decided that such things didn’t necessarily need to ship with corresponding source code, as frequently they don’t have any such thing or when they do, it’s just as easy to modify the data file directly.

The slope didn’t seem at all slippery back then.

Then came the issue of firmware, binary blobs in the kernel which are uploaded into a flash (or similar) chip in the hardware. Could we distribute these? On one hand, these blobs have always existed, they just used to be in ROM in the hardware; the move to firmware doesn’t change that. On the other hand, they’re machine code and if we had the source, we could improve the hardware as well.

And what if we didn’t distribute them? Our users would be stuck without being able to use some fairly (to them) critical parts of their computer.

In the end, the argument that firmware isn’t inherently any less free on the disk than in the ROM won, so we opted to continue to ship it.

Perhaps that slope is a bit slippery, but we’ve got a good foothold.

Of course, at that point somebody notices the binary “Hardware Access Layer” in the Atheros WiFi card driver. It’s not firmware, it’s run on the host processor, and is separate to “comply with FCC law”. (The ipw3945 driver has a binary daemon that allegedly performs the same legal function.)

Again, if we don’t distribute that, a large section of laptop users will not be able to use their WiFi cards. A compromise was reached; because the driver is necessary we’d ship it, but in a special restricted component that makes it absolutely clear that it’s not completely free. Users could choose to remove that component and any packages from it, to keep their system untainted.

Ok, foothold wasn’t as strong as we thought; tumbled a bit, but we’re definitely on solid ground now!

That’s what we thought, anyway. Unfortunately it seems that there’s a point a little bit lower down the slope which has a fantastic vista. The views from there are just incredible, people are saying, much prettier than where we are now. The only trouble is that we’re not sure there’s a foothold down there; if we try for the better view, we could end up broken at the bottom.

I’m talking, of course, about the NVIDIA binary X driver. (Some reports/blogs/etc. indicate we’re also considering the ATI fglrx driver; this isn’t true. That driver doesn’t support AIGLX, so it’s not being considered.)

We’ve shipped this driver in our restricted section, but not enabled it by default. It’s been there for people who want it to switch on, if they know how, but the default driver has always been the free (albeit obfuscated) one in the Xorg distribution.

The problem is that users do not need this driver; they can get decent enough 2D graphics support from the free(ish) driver. In the long term, they may even get decent 3D graphics support from the nouveau driver effort.

What’s the problem then? Simple: other operating systems use the 3D GPU to make the desktop seriously beautiful. If Linux doesn’t catch up and do the same, then we’ll be considered obsolete again.

And just to drive the point home, some of our Linux friends shipped similar support in their last releases. They don’t enable the NVIDIA binary driver, but this means that a large percentage of their user base can’t get the bling without manual hackery.

We needed a way to catch up with both the commercial operating systems and other Linux distributions; we have a policy of not doing our own software development, but only packaging what others have developed, so the only way for us to get ahead was to package something that others wouldn’t.

Which brings us back to the NVIDIA binary driver. If we install that by default, we’ll be bringing a 3D desktop to more people. And we’ll gain a step ahead of the other distributions.

Will our users care? To be brutally honest, I think the answer is no! In fact, I suspect our users will largely love us for this decision. Most probably already install the NVIDIA driver anyway, because they think it’s better, or because (sadly, like me) they have a card combination not supported by the free one.

Will this make any difference to the effort to get NVIDIA to free up the driver, or at least the specs? Sorry, but to be honest again, I don’t think it’ll make one little difference. Linux distributions have been refusing to install it for years, and yet NVIDIA haven’t budged in their position.

Perhaps a new tactic is required. Maybe if we do install it, we’ll be more likely to be chosen by OEMs as we can actually support the hardware they install. Then later, we may be able to actually affect their decision as to what hardware they install, and maybe then NVIDIA will pay attention.

Will this change the perception of Ubuntu in the Linux developer community? I’m not sure about this one. I think that those who already feel strongly about the distribution of binary drivers are probably already pretty grumpy at us distributing things like the Atheros and ipw3945 drivers. I suspect this will change the opinion of a lot of people who’ve been on the fence until now, probably equally in both directions.

Will we be able to sleep at night?

Despite all of the above, personally I still think that installing and using the nvidia driver by default, when the nv driver would do, is the wrong decision.

If the nv driver doesn’t work, I’m willing to accept the nvidia driver being used; provided that there’s some message informing the user what’s happened, why it has happened, and which alternate graphics cards they can purchase if they aren’t willing to accept a non-free binary driver.

If the nv driver is good enough for 2D, I would prefer that we instead disabled the 3D desktop effects for this group of users (by default). A similar message could explain why this is disabled, again which alternate graphics cards provide this by default, but also provide a button for the user to enable it if they wish. While we should make the correct moral decision for the defaults, we shouldn’t stand in the way of users who wish to make a different decision for themselves.

Later as the nouveau driver becomes stable, we may be able to activate 3D support for nvidia users by default.

I think that preserves our current foothold; we’d only activate it if there is no free alternative. Where there is, we’d be educating users about why they may wish to consider alternatives to NVIDIA in future, while at the same time not getting too much in their way if they want to see the better view down the slope.

Unfortunately it’s not my decision, and I suspect that the lure of the bling will win out. With any luck, we’ll find a foothold there, and the fallout of doing so won’t be too bad. I’m just worried that once we compromise on this, we’ll start compromising on other things … would we replace Firefox with a non-free web browser that rendered web pages “better”?

The slippery slope only gets steeper from here on …

What we'll get in feisty

This post is a sequel to my “What I want in edgy+1” post, which was written when the developer summit was first announced. Now that the summit (and the following company All Hands meeting) is over, and we’re all back home, this seems as good a time as any to review what was discussed and get a good idea of what feisty might look like.

I’ve touched on the problem of predicting time-based releases before. It’s both the gift and the curse of a time-based release schedule that work not completed in time can be deferred to a later release. So take the following with a pinch of salt; some of this may still not make it.

General Themes

The general theme of dapper was to be a release that could be supported for the long term; conservatism was the goal. We did do some quite exciting work under the hood, such as the switch away from hotplug to a fully udev based system, but in general it wasn’t innovative or ground-breaking.

Edgy was intended to be more ground-breaking, but the practical matter of having only a few months to develop it and our own pride in shipping something that still worked meant that it turned out as a shinier, improved dapper.

So what’s feisty going to be like? Judging from the discussions at UDS, and the specifications that have been written, the general theme of feisty is to lead the way again with new technologies.

The Desktop

For the users, perhaps the most obvious change will be the active use of 3D acceleration to draw the desktop where hardware can support it (the issue of binary drivers has not yet been resolved).

Windows are more visually distinct from each other through shadows behind them, and transparency for the non-active windows. The relationship between different workspaces/viewports is much clearer as the transition is animated on a cube or sliding pane.

And for the bling crowd, windows can wobble, burn, explode or dissolve.

There are two different compositors being considered at this point, compiz and beryl; we’re likely to decide which to use at Feature Freeze based on how well they’ve been fixed, developed and supported until that point.

Underneath the hood, the configuration of the X server will be simpler and more robust; so even the worst case will not leave you confined to a console without any help.

Networking

Networking in feisty should be a much more pleasurable experience. The Network Manager project, which has been waiting on the sidelines for a couple of releases, may finally get a shot at being in the default installation. For the average user, this makes switching between wired and wireless networks, including setting up WEP and WPA, much, much easier.

And what if there’s no network infrastructure around? Out of the box support for RFC 3927 link-local networks, and multicast DNS resolution (aka Zeroconf), means that you just need to agree on a network name with others around you to be able to communicate.

Of course, once you’re on a network, you still need to be able to share files and access local services. The integration of the Avahi project gives you one-click access to other people’s shared music or files; and lets you share your own, should you choose to do so.

Customisations

One of the most commonly encountered problems with edgy was the difficulty of installing various common packages that aren’t part of the default installation, especially codecs. Projects such as Automatix attempt to tackle this, but can cause problems with upgrading to later releases.

Some effort will be going into feisty to make performing these common customisations much simpler, including being able to install codecs or viewers by just trying to open the file.

Boot Sequence

A long-running project within Ubuntu has been to get the boot and shutdown sequences as fast and efficient as possible. At the time we started, it was common for a Linux distribution to boot in a mere two or three minutes.

If you thought edgy booted fast, wait until you see feisty.

Feisty is the release where we take full advantage of Upstart, not only bringing the system up as fast as possible but also more robustly than we can do today.

And if that weren’t enough, it should look slicker too; without some of the nasty flickering and mode changes that happen today.

The Edgy Dance

Last week, a few of us gathered at Canonical’s London offices to oversee the final release preparations. This basically consists of testing the various candidate CD images and performing both install and upgrade tests on them.

As you can imagine, three people performing repeated tests of edgy means that the fabled startup sound got many, many playings.

A tradition started.

Now, whenever you hear that sound, remember to get up and dance, wave your arms in the air or just tap your fingers on the table.

Automatix and Upgrading

It also seems that several of the dapper to edgy upgrade problems are caused by the use of Automatix, a tool to perform common customisations to Ubuntu, such as replacing the pre-installed software with alternatives and installing packages that Ubuntu is unable to pre-install due to patent or other legal issues.

Henrik has a few good points about this; however, I feel that it’s also important to remember that the Ubuntu community does not only consist of the core developers.

Automatix, and its like, are by their very definition, tools to reduce the amount of your system that the core developers will support. The default set of installed packages is not arbitrary, and one may be selected over your preferred solution simply because we do not have the expertise in the team to deal with the other, or even because the other is not supported upstream!

We therefore rely on the wider community to take ownership of these packages, and support them within the community structure.

Support, in the development sense, doesn’t just consist of security updates either; it also consists of keeping the software up to date, fixing bugs and, most importantly of all, testing it before we release.

The right approach to making sure that Automatix users are not bitten again during the edgy to feisty upgrade in six months’ time is for members of the community to come together and form a team to support it. The existing Automatix team in Launchpad is probably a good start.

One of the goals of this team should be to make sure that throughout feisty’s development cycle, upgrading from an edgy box with Automatix installed works flawlessly. Where it doesn’t, they should make an effort to ensure that useful bugs are filed (e.g. “foo 1.1-2 contains same file as bar 1.0-1 but neither Replaces nor Conflicts it”) so that the problems can be fixed.

Likewise where community members suggest that a user install software from outside the main component, or even outside the Ubuntu repository entirely, they should keep in mind that they’re likely to cause that user problems when it’s time for upgrade.

If you’re running a repository of your own right now, have you considered that you need to start testing upgrades from edgy with your packages installed to feisty? Testing when feisty releases is too late!

Before Upgrading to Edgy

It seems that some people with heavily customised Ubuntu installations have had problems upgrading from dapper to edgy. While we do test upgrades as much as we can, there’s no way to test every possible permutation, so problems do creep in.

Here’s a checklist to perform before upgrading to minimise any problems you might have:

  • Make sure you have the ubuntu-minimal and ubuntu-desktop packages installed (Kubuntu, Xubuntu, etc. should use the appropriate meta-packages for their distribution). This may require removing some replacement programs you have installed, but you can always put those back after the upgrade process.
  • Check for any locally installed packages; you can use aptitude search '~i!~Oubuntu' to get a list of them. Some of these may cause conflicts with the upgrade process, so it may be worth removing them and putting them back after the upgrade.
  • If you have manually installed any software from upstream, and not used Ubuntu packages (especially the Nvidia or ATI binary drivers), revert those to the Ubuntu-provided versions before upgrading.
  • Use the update manager, as described in the release notes, and not apt-get dist-upgrade. While the latter may work eventually, it will require more manual tinkering than the automatic upgrader.

If, after taking all of these precautions, your upgrade still fails, please file a bug report, and try to include as much information as possible. Provide the list of packages that failed, and if possible the error message provided by them. Provide /var/log/dpkg.log and the files in /var/log/dist-upgrade.
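Gathering those files can be scripted. The sketch below bundles them into a single archive to attach to the bug report; the collect_upgrade_logs name, the output file name and the optional root argument (handy for trying it out against a copy of the log directory) are assumptions of this example, not part of any Ubuntu tool.

```shell
# collect_upgrade_logs OUT [ROOT]
# Archive var/log/dpkg.log and var/log/dist-upgrade (whichever exist)
# from under ROOT (default /) into the gzip-compressed tarball OUT.
collect_upgrade_logs() {
    out=$1
    root=${2:-/}
    # Build the list of paths (relative to ROOT) that actually exist.
    set --
    if [ -e "$root/var/log/dpkg.log" ]; then
        set -- "$@" var/log/dpkg.log
    fi
    if [ -d "$root/var/log/dist-upgrade" ]; then
        set -- "$@" var/log/dist-upgrade
    fi
    if [ "$#" -eq 0 ]; then
        echo "no upgrade logs found under $root" >&2
        return 1
    fi
    tar -czf "$out" -C "$root" "$@"
}

# Example:
#   collect_upgrade_logs "$HOME/upgrade-logs.tar.gz"
```

Some of the log files may only be readable by root, in which case the function needs to be run with the appropriate privileges.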

Not That Edgy

Now that Ubuntu 6.10 (“The Edgy Eft”) has been released, we’re starting to see reviews of it; while largely positive, one common theme is that Edgy isn’t quite as edgy as people were expecting.

Mark’s original announcement is certainly the likely reason for this expectation. In it he set the scene for a bold, brash, bleeding edge release to counter the boring dapper release.

Unfortunately, the simple truth is that reality set in. When planning the release schedule for edgy, we realised that if we wanted to get back to our original six-monthly release schedule, we were only going to have four months in which to develop it.

That’s still enough time to throw everything to the wind, and shove out a “release” at the last moment when the CD happens to be installable. It’d be edgy in the extreme sense.

Unfortunately, while that would be exciting, we felt that such a release would ruin Ubuntu’s reputation. It’d be a release that, for all intents and purposes, would only be interesting to Ubuntu developers.

Mark has already touched on this in his blog, citing a conversation he had with Matt (the Ubuntu CTO). Especially noteworthy is the mention that the kinds of itches that developers get are not the same as those users get. We get itches because the installer still relies on devfs-style paths, or because it’s not possible to boot the system without race-conditions. None of these things are noticeable to the end-user.

We’re drawing up the list of topics to be discussed at UDS Mountain View in two weeks’ time; this is as good a guide as any for what we’re thinking about for feisty, the next release. At the end of that summit, we’ll have a list of approved specifications, assigned to developers throughout the community for implementation in the feisty schedule.

Obviously some of those won’t make it due to time constraints, but the best thing about a six-monthly release cycle is that they’re not delayed for long.

What I want in edgy+1

Now that edgy has been frozen for the beta release in a week’s time, and the next developer summit, where the plans for the next release will be discussed, has been announced, I think it’s the perfect time to start thinking about the kinds of features we want to see in that release.

At our recent sprint in Wiesbaden, Mark reminded us that when Ubuntu first released we were leading the field in many of the components we installed by default. Many users came to us because we provided Linux 2.6, hotplug/udev and Project Utopia by default, along with the latest GNOME release and development framework. Now everybody does that, but we led the way.

With the dapper release, we produced something that could be supported for a long time. With edgy we’ve scratched our itches and fixed things under the hood that have been causing us problems. edgy+1 seems a good time to integrate the hottest new technologies and lead the way again.

So what do I think is exciting, and what do I want in edgy+1?

telepathy, farsight and galago

Anybody who has used the 2006 software release for the Nokia 770 Internet Tablet will have seen these in action already.

They’re, in my opinion, some of the most exciting projects currently being worked on; and ripe for integration into Ubuntu.

Telepathy is a communication framework built on DBus that allows any application to establish communication with users all around the world, using any protocol from IRC to MSN to VoIP.

It uses Farsight to provide the codecs and protocol support on top of the GStreamer media framework.

And Galago glues it all together; providing presence information about contacts and allowing users to specify their own status.

Galago Presence Applet

This means that you’ll be able to contact your contacts. When your calendar alarm tells you that it’s a colleague’s birthday, you’d be able to click their name and see a selection of different ways that you can contact that person right now: sending an e-mail, opening an MSN or Jabber window, or even establishing a VoIP or SIP call to them.

The most exciting thing is that because every application on the desktop has easy access to this, people are coming up with fantastic ideas for using it!

zeroconf

By this, I mean zero configuration networking. Something the OLPC project appears to be going to do very well indeed.

Where no existing network infrastructure exists, your computer should still be able to communicate with other nearby computers. It should use ad-hoc wireless networks, link-local IP addresses and server-free DNS resolution with libnss-mdns.

Play games with your friends on the train without touching a button. Chat with other people in a lecture hall or at a conference where the existing network is unavailable. Hassle-free home and office networking without any complicated set up.

Networking that “just works”.

avahi

Whether your computer is on an infrastructure or ad-hoc network, it should be able to locate resources on the network and share its own to other people.

Avahi allows applications to, via DBus, do this by using a multicast form of DNS, compatible with Apple’s Bonjour protocol.

The immediately obvious benefits of this are that one can locate such things as file servers without any complicated directory; more fun things can be done too, such as sharing your music from RhythmBox with other users on the network.

Combine it with telepathy, etc. and you can establish communication with anyone else, with a wide range of different protocols.

The immediate problem here is that we currently have a “no open ports” policy which, if we’re going to give the users the opportunity to establish ad-hoc networks, is almost certainly a very good thing. We already have exceptions to that policy for DNS and DHCP; it may be that Avahi (mDNS) joins that list. Another option is a “mobile-phone like” interface for enabling and disabling discovery and sharing.

Bluetooth

I don’t think we should stop discovery and sharing at 802.* networking either. Linux actually has a pretty decent bluetooth stack, it’s something that we should start taking advantage of.

Your computer should be able to discover paired or known bluetooth devices when in range without intervention and communicate with them.

If I attempt to establish a network connection, and my phone is nearby, that should be offered as a method for getting on the Internet. If possible, it should also be offered through telepathy as a method for contacting people in my address book!

Synchronisation

These technologies together give us a really big possibility: automatic synchronisation of users’ information.

When I bring my laptop home, it should automatically update my desktop with any changes I’ve made to my address book or calendar, and any shared files, etc.

Likewise it should automatically be updated with any changes made on the network, to a company address book on a server, for example.

And why stop there? Don’t just share your music, synchronise it so that your laptop, desktop and iPod have the same selection available at all times.

Hassle-free backups are an obvious win here.

But we’re stopping again. Why are my mobile phone and PDA excluded from this? Via bluetooth, my address book and calendar should be automatically updated from these, and these should be automatically updated themselves.

If I add a new number to my phone’s memory, it should appear in my computer address book; and my computer should offer me the ability to contact that person (using VoIP, for example).