Upstart 0.5: Job Environment

In my previous post on Upstart 0.5, I talked about the ways you can define a service for Upstart to manage and introduced the different processes in a job’s lifecycle. In this post, I’ll look into the detail of those processes and their environment.

Upstart ensures that each process it runs has a sane, safe and predictable environment. By default each process is run in a new process group and session, but not as a leader of that process group or session (otherwise the process would have to be careful on all open() calls to make sure it didn’t suddenly own any ttys it opened); the standard input, output and error file descriptors are bound to /dev/null; the PATH environment variable is set to a sensible default, and the TERM variable inherited from the kernel, otherwise no other variables are set; and all resource limits and the like are inherited from init itself.

There are, of course, many ways to customise this environment from the job definition:

  • Jobs may run as a process group and session leader (normally getty likes this).
  • Jobs may have standard file descriptors sent to /dev/console and may be the owner of /dev/console (so they receive Ctrl-C).
  • Jobs may specify custom resource limits, umask, “nice” level, working directory and chroot directory.
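
To make the default of "in a new session and process group, but leading neither" concrete, here is a minimal sketch of one conventional way to get there: call setsid() in an intermediate child, then fork once more. It is only an illustration in Python, not Upstart's own code, and the function name spawn_detached is made up for the example.

```python
import os

def spawn_detached():
    """Fork a process that sits in a fresh session and process group
    without leading either; return its (pid, sid, pgid)."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(read_fd)
        os.setsid()                  # this child now leads a new session and group
        if os.fork() > 0:
            os._exit(0)              # the leader exits straight away
        # The grandchild stays in that session and group but leads neither.
        report = "%d %d %d" % (os.getpid(), os.getsid(0), os.getpgrp())
        os.write(write_fd, report.encode())
        os._exit(0)
    os.close(write_fd)
    os.waitpid(child, 0)             # reap the short-lived leader
    data = b""
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    pid, sid, pgid = (int(field) for field in data.split())
    return pid, sid, pgid

pid, sid, pgid = spawn_detached()
print(pid != sid and pid != pgid)
```

Because the surviving process is not a session leader, opening a tty cannot make that tty its controlling terminal, which is the point of the default.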

Environment Variables

To say that jobs only have the PATH and TERM environment variables set is quite a fallacy; these are just the two variables that all jobs always have set. In fact, the additional environment variables for a job are very important to Upstart, since they are the primary method of telling that job how it should behave.

To illustrate this, take an instance of the getty service; it needs to know which tty it should use. We could invent some kind of common configuration or parameter database (or D-Bus service) for this kind of thing, with the job being able to run commands to interrogate it, etc. but that’s entirely unnecessary. UNIX already gives us the functionality we need in environment variables, which you’ve probably noticed your shell documentation calls parameters anyway.

In our getty example, we would store the tty in the TTY environment variable, and then the job definition is nice and simple to understand:


exec /sbin/getty 38400 $TTY

So environment variables can be set from a number of sources: the built-in PATH and TERM variables will always be set; others can be set from the job definition (which can specify to inherit the value from init’s environment); and finally environment can come from the start request for the job. I’ll explain more on the latter in later posts, but for now, it suffices to demonstrate that we’d start our getty example with:


# start getty TTY=tty1
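
As a rough sketch of how those three sources could be layered, the following Python builds an environment and hands it to a child whose standard input is /dev/null. The default PATH and TERM values, the rule that later sources override earlier ones, and the name job_environment are all assumptions made for the illustration; the post does not spell them out.

```python
import subprocess

# Assumed defaults for illustration; the values Upstart really uses may differ.
BUILTIN_ENV = {
    "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
    "TERM": "linux",
}

def job_environment(job_env, request_env):
    """Layer the three sources; later sources win (an assumption)."""
    env = dict(BUILTIN_ENV)
    env.update(job_env)        # variables named in the job definition
    env.update(request_env)    # variables given with the start request
    return env

env = job_environment(job_env={}, request_env={"TTY": "tty1"})
result = subprocess.run(
    ["/bin/sh", "-c", 'echo "getty would open $TTY"'],
    env=env,                           # nothing else leaks in from our own environment
    stdin=subprocess.DEVNULL,
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,
    universal_newlines=True,
    check=True,
)
print(result.stdout.strip())
```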

So Upstart allows you to define the job’s true life cycle, including any setup and cleanup it needs to perform before and after the daemon is running; and it allows you to define the environment that daemon runs in, so you don’t have to worry about unexpected situations. In the next post, I’ll talk about how you can manage the lifetime of a job, looking at things such as singletons and respawning.

Upstart 0.5: Job Lifecycle

Next month I am hoping to release Upstart 0.5.0, the culmination of almost a year’s worth of work on it. Comparatively, the version that shipped in edgy (0.2.x) was simply an essay to figure out the basics and the version in feisty through hardy (0.3.x) a first draft. The new version has been stripped back to the very basics and rebuilt to correct the problems we found with the earlier versions, and to make sure it can handle real world uses as simply and elegantly as possible.

Over the next few weeks, I’ll be writing about the new version; both how it has improved from previous versions and how it compares to what else is out there.

Introduction

First we’ll look at how Upstart allows you to manage the lifecycle of services and tasks (collectively jobs) that you wish to manage. We’ll use the D-Bus daemon as an example service, simply because it’s a modern, well-behaved service that we’re all familiar with.

With SystemV RC, we would have had a single /etc/init.d/dbus file accepting both start and stop as arguments. It may have looked something like this:


case "$1" in
    start)
        start-stop-daemon --start --pidfile /var/run/dbus.pid /usr/sbin/dbus-daemon
        ;;
    stop)
        start-stop-daemon --stop --pidfile /var/run/dbus.pid
        ;;
esac

As you’re well aware, the simple act of starting a daemon and stopping it again is not so simple this way. You nearly always end up requiring some kind of helper like start-stop-daemon, and rely on accurate PID files and the like.

Upstart, like just about every other modern service manager (but strangely, not SMF), takes care of all of this hard work for you. Instead of defining how to start and stop a service you just define what to start. Here’s how you’d define the same service in Upstart:


exec /usr/sbin/dbus-daemon

Setup and teardown

Of course, we all know that no service definition is ever that simple. I massively simplified the SystemV example for the purposes of documentation. In reality, we frequently need to do various things to set up the system for the daemon and clean up again afterwards. The original start shell code probably looks more like this (and even now, I’m simplifying for space):


mkdir /var/run/dbus
chown messagebus.messagebus /var/run/dbus

/usr/bin/dbus-uuidgen --ensure

start-stop-daemon --start --pidfile /var/run/dbus.pid /usr/sbin/dbus-daemon

We need a directory for socket files, etc. and to create the machine id if missing. And likewise to shut it down, we need to clean up:


start-stop-daemon --stop --pidfile /var/run/dbus.pid

rm -rf /var/run/dbus

And this is where most init replacements fall down (especially launchd). In fact, ironically, you’ll often find the developers using their minimal service definitions when they talk about how fast their system can boot. You can boot really fast if you don’t start anything properly.

Obviously I wouldn’t be pointing this out if Upstart didn’t allow you to do this properly; we’ll extend our minimal service definition to include the set up and tear down code necessary.


pre-start script
    mkdir /var/run/dbus
    chown messagebus.messagebus /var/run/dbus

    /usr/bin/dbus-uuidgen --ensure
end script

exec /usr/sbin/dbus-daemon

post-stop script
    rm -rf /var/run/dbus
end script

Before, we defined just one process in a job’s lifecycle, known as the main process. Our new definition defines two more, the pre-start and post-stop processes. We’ve chosen to define them as shell scripts embedded in the definition; we could have defined them as binaries to execute if we preferred (using pre-start exec), and we could have defined the main process as a script (using script...end script).

As their name suggests, these processes are run before the main process is started and after it has been stopped respectively. In fact, Upstart guarantees more than that:

  • For every time that the job is started, the post-stop process will be run.
  • For every time that the main process is run, the pre-start process will have been completed successfully first.

It might seem a little strange that the post-stop process will always run but the pre-start process doesn’t have as strong a guarantee. This is because it’s possible for the job to be stopped immediately after it is started. Should that happen, Upstart will not run the main process since there’s no need, and therefore will also not run the pre-start process; however to ensure the system is clean, it always runs the post-stop process.

These guarantees also provide sane restart behaviour. If you restart a job, the main process is killed, the post-stop process is run, then the pre-start process is run again before the main process. If you cancel a restart (by stopping the job again) after the post-stop process has been run, it will always be run again.
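
The two guarantees can be written down as a toy model. This is only a sketch of the ordering rules described above, not how Upstart implements them; run_job and its parameters are invented for the illustration.

```python
def run_job(stopped_early=False, pre_start_ok=True):
    """Return the processes run for one start of a job, in order."""
    ran = []
    if not stopped_early:
        # The job was not stopped straight after being started.
        ran.append("pre-start")
        if pre_start_ok:
            # main only ever runs after a successful pre-start
            ran.append("main")
    # post-stop runs for every start, whatever happened before it
    ran.append("post-stop")
    return ran

print(run_job())
print(run_job(stopped_early=True))
print(run_job(pre_start_ok=False))
```

In every branch the list ends with post-stop, and main never appears without a completed pre-start before it.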

Spawned, Running and Killed

Upstart makes important distinctions in the state of the main process; it does not necessarily assume that just because the exec() syscall has succeeded that the process is in a suitable running state. Likewise, it does not assume that just because the kill() syscall has succeeded that the process is no longer running.

The latter is easy to understand: delivering the TERM signal to a running process normally just invokes its own termination handler, which may perform any number of activities before cleanly shutting down. Upstart waits for the actual child signal signifying termination before running the post-stop script; until that point the process is considered merely “killed”. Obviously too long in the “killed” state means Upstart delivers the much more hardcore KILL signal, but that’s adjustable.
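
The TERM-then-KILL pattern looks roughly like this from the supervisor's side. It is a sketch using Python's subprocess module rather than anything from Upstart itself; stop_process, kill_timeout and the deliberately stubborn child are made up for the example.

```python
import signal
import subprocess
import sys

def stop_process(proc, kill_timeout):
    """Send TERM, wait for the child to really exit, fall back to KILL."""
    proc.terminate()                       # SIGTERM: the process is merely "killed"
    try:
        proc.wait(timeout=kill_timeout)    # wait for the real termination
    except subprocess.TimeoutExpired:
        proc.kill()                        # SIGKILL cannot be caught or ignored
        proc.wait()
    return proc.returncode

# A deliberately stubborn child that ignores TERM, then reports it is set up.
STUBBORN = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)

proc = subprocess.Popen([sys.executable, "-c", STUBBORN], stdout=subprocess.PIPE)
proc.stdout.readline()                     # wait until TERM is being ignored
code = stop_process(proc, kill_timeout=0.5)
proc.stdout.close()
print(code == -signal.SIGKILL)
```

A negative return code is how subprocess reports death by signal, so the caller can tell whether TERM was enough or KILL was needed.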

The former is harder to understand since the new binary is in memory and is probably at least initialising, but that’s the point: it isn’t yet ready for other jobs to use. In the SystemV script, this wasn’t an issue, since we could generally rely on daemons (well behaved ones anyway) to follow the convention that they should not fork() until initialisation was completed successfully.

Since Upstart forks and supervises its own processes, it generally prefers that daemons do not fork() and remain as the pid they were given when started. So how do jobs signify that they are ready? There are a few ways:

  • By forking as before. As I’ve talked about before, Upstart can supervise processes that fork, and it will wait for that to happen before assuming the process is ready.
  • By raising the STOP signal. For jobs marked with expect stop, Upstart will wait for this and, once it is received, will send the process the CONT signal and assume that it is now ready.
  • By registering a D-Bus name. An early 0.5.x release will wait for a particular D-Bus name to be registered, and not assume that the job is ready until it has done so.
  • By calling listen(). Again, planned for an early 0.5.x release, Upstart will use the same mechanism it uses to follow forks to watch for the listen() system call.
  • With a post-start script, more on that in a second.
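
The STOP signal handshake from the list above can be sketched from both sides in a few lines. This is an illustration of the mechanism in Python, not Upstart's code; start_and_wait_ready and example_daemon are hypothetical names.

```python
import os
import signal

def start_and_wait_ready(daemon_main):
    """Fork daemon_main and wait until it raises SIGSTOP to say it is ready."""
    pid = os.fork()
    if pid == 0:
        try:
            daemon_main()
            os._exit(0)
        except BaseException:
            os._exit(1)
    _, status = os.waitpid(pid, os.WUNTRACED)   # also report stopped children
    if not os.WIFSTOPPED(status):
        return pid, False                       # it died before becoming ready
    os.kill(pid, signal.SIGCONT)                # let it carry on
    return pid, True

def example_daemon():
    # ...initialisation would happen here...
    os.kill(os.getpid(), signal.SIGSTOP)        # "I am ready"
    # ...and the main loop here; this example simply returns.

pid, ready = start_and_wait_ready(example_daemon)
_, status = os.waitpid(pid, 0)                  # reap it once it exits
print(ready)
```

The daemon never forks and keeps the pid it was given, which is what makes this approach attractive to a supervisor.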

The last two processes

I’ve introduced the three processes that most jobs will tend to use, but there are also two more which will be somewhat rarer but are probably the most powerful of them all. These are the post-start and pre-stop processes, and they’re interesting because they’re run while the main process is running.

The post-start process, as its name suggests, is run after the main process has been spawned and any event we were expecting (see above) has happened. The job will not be considered ready until the post-start process completes, thus a common use for it is to interrogate the daemon or send it commands it can only act on once it’s running.

The pre-stop process is run when a request to stop the job occurs (this means it is not run if the main process terminates on its own), and the process is not killed until it finishes. It receives information about the request, and can cause that request to be ignored (thus leaving the job running). Another common use is to send the daemon commands before it receives the TERM signal.

Next…

So that’s a look at the ways we can define the lifecycle of an Upstart job. In the next couple of posts we’ll look at the environment and session of jobs, and then at matters such as respawning and singletons.

How to (and why) supervise forking processes

Yesterday’s celebratory blog post demonstrated that Upstart is now able to supervise processes that fork into the background, as most daemons do. Now that the code has undergone a little more testing, and been pushed into the archive, it’s worth explaining a little bit more of the background as to the how, and why, we do this.

The why is easiest to answer first. Daemons are normally written to fork, usually twice; this detaches them from the terminal, process group and session that they were spawned from so that they remain running after the user logs out. The fork isn’t just mechanism though; over time a convention has arisen that daemons don’t go into the background until their initialisation is complete and they’re ready to receive connections, if that’s their bag.

Simply adding an option to remain in the foreground might appear to eliminate the need to deal with the problem, but this also takes away the notification that the daemon is ready for use. Over time this signal can be replaced with other notifications: registering a known D-Bus name, or simply raising SIGSTOP; but these require code changes that need to be agreed with upstream first. Making code changes also assumes that we have the code. Whether we like it or not, sysadmins will often have the need to run proprietary daemons — or even simply older versions of software where the patch is too invasive.

So that’s why we have to do it, now how do we?

This is one of the reasons that building the service supervisor into init, rather than having it as a separate process, makes sense. Init has a few special kernel-provided buffs, one of which is that orphaned processes are reparented to it. When you run a daemon from the command-line, the process is initially your child; it forks once and the parent dies, the new child is now orphaned, and thus reparented to init. (Most daemons now run setsid and fork a second time. This is to ensure that if they open a tty device, they don’t unexpectedly become its owner.) Init, like any other process, receives notification about its children through wait so will know when daemons terminate; the “must have” of supervision.

So if all daemons are our children we are notified when they terminate and why; we can compare their exit status or signal against a list of known good ones, and choose whether we need to respawn the dead job or mark it as stopped normally.
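
Comparing a dead child's status against a list of known good ones might look like the sketch below. The "known good" sets and the names should_respawn and wait_status_of are assumptions for the illustration; a real job definition would supply its own lists.

```python
import os
import signal

# Hypothetical "known good" lists; a real job definition would supply these.
NORMAL_EXIT_CODES = {0}
NORMAL_SIGNALS = {signal.SIGTERM}

def should_respawn(status):
    """Decide from a raw wait() status whether the job died abnormally."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status) not in NORMAL_EXIT_CODES
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status) not in NORMAL_SIGNALS
    return False

def wait_status_of(action):
    """Run action in a child process and return its raw wait status."""
    pid = os.fork()
    if pid == 0:
        action()
        os._exit(0)
    return os.waitpid(pid, 0)[1]

clean = wait_status_of(lambda: None)
failed = wait_status_of(lambda: os._exit(3))
crashed = wait_status_of(lambda: os.kill(os.getpid(), signal.SIGKILL))
print(should_respawn(clean), should_respawn(failed), should_respawn(crashed))
```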

This isn’t enough though; all we get is the process id of the dead child. We still need to relate that back to a job somehow. One way to do that is to use waitid with the WNOWAIT flag, leaving the process on the table so we can examine /proc to find out more about it. This seems like quite a reasonable approach; we can then match a process to a job by details such as what binary it was actually running. Unfortunately this only works for singleton processes where we’re guaranteed that only one of them exists, both at the job level and at the process level itself; should the process fork, even to run another child, we could accidentally consider it to have died. Daemons need to be able to run their own children, or even have pools of them to use; and we also need to be able to run multiple copies of daemons where we can support it.
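
For the curious, the waitid and WNOWAIT trick can be tried out from Python on Linux. This is a sketch only: peek_dead_child is an invented name, it assumes /proc is mounted, and it assumes the calling process has no other children that might terminate first.

```python
import os

def peek_dead_child():
    """Look at a terminated child without reaping it, then reap it."""
    info = os.waitid(os.P_ALL, 0, os.WEXITED | os.WNOWAIT)
    # The zombie is still in the process table, so /proc can be consulted.
    with open("/proc/%d/stat" % info.si_pid) as stat:
        # The state letter follows the parenthesised command name.
        state = stat.read().rpartition(")")[2].split()[0]
    os.waitpid(info.si_pid, 0)           # now actually reap it
    return info.si_pid, info.si_status, state

child = os.fork()
if child == 0:
    os._exit(7)
pid, status, state = peek_dead_child()
print(pid == child, status, state)
```

Until the final waitpid() the process is a zombie, so its /proc entry is still there to be read.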

So we really do need to know the process id of the actual daemon process we should be supervising. Unfortunately any method of passing this back to init, even a relatively common one like writing it to a pid file, isn’t sufficiently standard or reliable to do this kind of work.

Ideally the kernel would just tell init when a process was reparented to it, providing both the child process id and that of its previous parent. Such a notification doesn’t exist today, though it would be a nice project to try and get it into the kernel mainline; difficult if there’s only one implementation using it.

If we can’t have that, a syscall that would allow us to watch a process and find out when it forks would be the second-best thing. We’d have the previous process id since we were watching it, and we’d hopefully be able to obtain the new child process id from this.

Happily that syscall exists, and I suspect you use it all the time if you’re a developer; it’s a bit of a mad leap to using it inside init, but as you can see, it works rather nicely. All we need do is watch the process, and follow it each time it spawns a new child. We stop watching as soon as we have followed twice (once if a different option is used), or if the process runs a different binary by itself. And thus we can know the process id of daemons we spawned, even if they attempt to detach from their parent process which they’ll just be reparented to anyway.

What’s the syscall? Oh, hmm, is that the time? Got to go! Alright, it’s ptrace.

Supervising forking processes


quest /tmp# cat test.c
#include <sys/types.h>

#include <stdlib.h>
#include <unistd.h>

int
main (int   argc,
      char *argv[])
{
        pid_t pid;

        pid = fork ();
        if (pid > 0)
                exit (0);

        pid = fork ();
        if (pid > 0)
                exit (0);

        pause ();
        exit (0);
}
quest /tmp# gcc -Wall -g -O0 -o test test.c

quest /tmp# cat /etc/event.d/test
wait for daemon
exec /tmp/test

quest /tmp# start test
test (#0) goal changed from stop to start
test (#0) state changed from waiting to starting
event_new: Pending starting event
Handling starting event
event_finished: Finished starting event
test (#0) state changed from starting to pre-start
test (#0) state changed from pre-start to spawned
process_spawn: Spawned main process 6380 for test (#0)
Active test (#0) main process (6380)
test (#0) main process (6380) forked new child 6381
test (#0) main process (6381) forked new child 6382
test (#0) state changed from spawned to post-start
test (#0) state changed from post-start to running
event_new: Pending started event
Handling started event
event_finished: Finished started event

Something for everybody

According to the current issue (#93) of Linux Format, Ubuntu 7.04 (“Feisty Fawn”) is “…a dull release for Ubuntu, leaving Fedora to storm ahead…” (p. 23) whilst “shaping up to be one of the most innovative Linux distro releases of the year.” (p. 38)

Especially amusing for myself is that, with Upstart, they “seldom notice any difference in boot speed” (p. 42), yet “Ubuntu 7.04 boots up in record time, leaving other Linux distros in the dust.” (p. 22)

(As anyone who’s ever read anything about Upstart will know, Ubuntu still uses the SysV-rc scripts so there should be no difference in speed at this point. Funnily enough, they identified the reason Ubuntu boots fast in the same issue; “Changing the /bin/sh symlink to point to Dash instead of Bash can significantly shorten boot times” (p. 33) — unfortunately they simultaneously claim that Dash is only “almost POSIX compliant”, without explaining why they think it isn’t.)

In this modern world, the lack of any editorial direction or basic research into what’s being printed is quite refreshing.

Upstart can now replace sysvinit

Today I reached another milestone in the development of upstart: the packages in universe can now replace the existing sysvinit package.

Before trying this, make sure your installation is up to date as we’ve had to split out some parts of sysvinit into a new sysvutils package. If you’re up to date, and want to try it out, install the upstart and upstart-compat-sysv packages from universe.

Note that the first reboot after you’ve installed the packages (from sysvinit to upstart) will be a little tricky … use reboot -f.

If your system boots and shuts down normally, everything’s working just fine. Note that both will be somewhat more quiet than you’re used to, unless you have usplash running.

Throughout the rest of this entry, I’ll try to answer some of the questions and comments that I’ve received since the last post.

Events

As I talked about previously, upstart is an event-based init daemon. Events are the primary force for having your services and tasks started and stopped at the appropriate time and in the appropriate order.

So what are events and where do they come from? (Note that this part is under development, so may change in later releases).

Events are just simple strings that may be sent by any process when the state of something it is tracking changes. They have no state or longevity, and if, when queued, they do not cause any job state changes, then they have no effect unless they are sent again.

Jobs can list which events cause them to be started if they are not already running and which events cause them to be stopped if they are running. Multiple start and stop events may be listed, in which case the first to occur changes the job until the next one occurs.
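
A toy model of that matching, for illustration only: the Job class and emit function below are invented for this sketch and say nothing about how upstart stores jobs internally.

```python
class Job:
    def __init__(self, name, start_on=(), stop_on=()):
        self.name = name
        self.start_on = set(start_on)
        self.stop_on = set(stop_on)
        self.running = False

    def handle(self, event):
        """Apply one event; return True if it changed the job's state."""
        if not self.running and event in self.start_on:
            self.running = True
            return True
        if self.running and event in self.stop_on:
            self.running = False
            return True
        return False

def emit(event, jobs):
    """Deliver an event once; it is not remembered afterwards."""
    return [job.name for job in jobs if job.handle(event)]

jobs = [Job("udev", start_on={"startup"}, stop_on={"shutdown"})]
print(emit("shutdown", jobs))   # job is not running, so nothing happens
print(emit("startup", jobs))
print(emit("startup", jobs))    # already running, so the repeat has no effect
```

An event that changes nothing is simply dropped, which is the "no state or longevity" property in miniature.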

upstart itself generates the following system events:

  • “startup”, on system boot.
  • “shutdown”, when the system is about to be shut down.
  • “stalled”, when there are no jobs running and no events in the queue.

The shutdown tool included in the package also causes one of the following events to be sent once the “shutdown” event has been handled:

  • “reboot”,
  • “halt”,
  • “poweroff”,
  • “maintenance” (aka. going into “single user” mode),
  • any user-defined event with shutdown -e event.

Jobs also generate events whenever they change state; this is the primary source of events for ordering:

  • “jobname/start”, when the job is first started.
  • “jobname/started”, once the job is running.
  • “jobname/stop”, when the job is first stopped.
  • “jobname/stopped”, once the job has stopped.
  • “jobname”, for services this is generated once it is running; for tasks this is generated once it has finished.

And as mentioned, any other process on the system may send events through the control socket or just by using initctl trigger EVENT. For now this is just the event string, however it’s intended that the event may include other details including environment variables and even file descriptors.

Typical example

To clarify how it all hangs together, here’s an example (using fictional names) of how the tasks and events can be arranged to provide race-free mounting of filesystems.

  • “udev” service started on the “startup” event.
  • udev daemon is configured to send a “new-block-device” event whenever a new block device is added to the system.
  • “checkfs” task is started on the “new-block-device” event to check the filesystem.
  • “mountfs” task is started when the “checkfs” task has finished and mounts the filesystem if listed in /etc/fstab
  • “filesystem-mounted” event is generated whenever a filesystem is mounted.
  • “fstab” task is started on the “filesystem-mounted” event, it checks the list of mounted filesystems against /etc/fstab and if all are mounted, generates the “writable-filesystem” event.
  • other services and tasks would be started on the “writable-filesystem” event.

By breaking this job into these small tasks, we can see how the pieces fit together. Because everything is now done on events, there are no race conditions; we know that any filesystem listed in /etc/fstab will be checked and mounted.

The only reason they wouldn’t be is if there’s an error of some kind, and that means you have larger problems anyway and the system administrator would have a shell to fix it. Of course, the moment they finish checking the filesystem and mount it, the boot process would carry on.
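
The chain of fictional tasks above can be simulated with a simple event queue. This is a sketch of the idea, not of upstart's implementation: the FSTAB contents, the device names and the boot function are made up, and each task is reduced to a single branch.

```python
from collections import deque

# Hypothetical /etc/fstab contents: device -> mount point.
FSTAB = {"/dev/sda1": "/", "/dev/sdb1": "/home"}

def boot(devices):
    """Run the event chain for block devices appearing in the given order."""
    queue = deque([("startup", None)])
    mounted = set()
    announced = False
    log = []
    while queue:
        event, device = queue.popleft()
        log.append(event)
        if event == "startup":
            # "udev" service: one event per block device it finds
            queue.extend(("new-block-device", d) for d in devices)
        elif event == "new-block-device":
            # "checkfs" task runs; its finishing is itself an event
            queue.append(("checkfs", device))
        elif event == "checkfs":
            # "mountfs" task: mount only what /etc/fstab lists
            if device in FSTAB:
                mounted.add(device)
                queue.append(("filesystem-mounted", device))
        elif event == "filesystem-mounted":
            # "fstab" task: announce once everything listed is mounted
            if not announced and mounted >= set(FSTAB):
                announced = True
                queue.append(("writable-filesystem", None))
    return log

print(boot(["/dev/sdc1", "/dev/sdb1", "/dev/sda1"]))
```

Whatever order the devices turn up in, "writable-filesystem" is only generated after every filesystem in the table has been checked and mounted, and it never appears if one is missing.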

There’s no reason that any of these events need to be generated by the upstart daemon itself; it can receive them from any other daemon on the system such as udev, acpid, etc. This keeps the focus of the init daemon narrow.

A large part of the future development will be working out exactly what kinds of events we want init itself to generate, what kinds we want to come from elsewhere, and what the contents of an event can be.

Getting Involved

If you want to get involved with trying to nudge the direction of upstart development, you can join the upstart-devel mailing list at http://lists.netsplit.com/.

Or if you just want to grab the source code, tarballs are published at http://people.ubuntu.com/~scott/software/upstart/ and the bzr archive is at http://bazaar.launchpad.net/~keybuk/upstart/main

Upstart in Universe

Upstart is a replacement for the init daemon, the process spawned by the kernel that is responsible for starting, supervising and stopping all other processes on the system.

The existing daemon is based on the one found in UNIX System V, and is thus known as sysvinit. It separates jobs into different “run levels” and can either run a job when particular run levels are entered (e.g. /etc/init.d/rc 2) or continually during a particular run level (e.g. /sbin/getty).

The /etc/init.d/rc script is also based on the System V one (and is in the sysv-rc package); it simply executes the stop then start scripts found in /etc/rcN.d (where N is the run level) in numerical order.
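
As a small sketch of that ordering: by the usual convention, stop links are named K plus two digits and start links S plus two digits. The directory listing and the rc_plan helper below are hypothetical and only illustrate the fixed, precomputed order.

```python
def rc_plan(entries):
    """Order the links of one /etc/rcN.d directory the way rc runs them."""
    kills = sorted(name for name in entries if name.startswith("K"))
    starts = sorted(name for name in entries if name.startswith("S"))
    return [(name, "stop") for name in kills] + [(name, "start") for name in starts]

# Hypothetical directory listing; the two digits give the order.
plan = rc_plan(["S20dbus", "K01gdm", "S10sysklogd", "README", "S99rc.local"])
for name, action in plan:
    print(name, action)
```

Nothing in this plan can react to what actually happens during boot; the order is decided entirely by the file names.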

Why change it?

Running a fixed set of scripts, one after the other, in a particular order has served us reasonably well until now. However as Linux has got better and better at dealing with modern computing (arguably Linux’s removable device support is better than Windows’ now) this approach has begun to have problems.

The old approach works as long as you can guarantee when in the boot sequence things are available, so you can place your init script after that point and know that it will work. Typical ordering requirements are:

  • Hard drive devices must have been discovered, initialised and partitions detected before we try and mount from /etc/fstab.
  • Network devices must have been discovered and initialised before we try and start networking.

This worked ten years ago, why doesn’t it work now? The simple answer is that our computers have become far more flexible:

  • Drives can be plugged in and removed at any point, e.g. USB drives.
  • Storage buses allow more than a fixed number of drives, so they must be scanned for; this operation frequently does not block.
  • To reduce power consumption, the drive may not actually be spun up until the bus scan so will not appear for an even longer time.
  • Network devices can be plugged in and removed at any point.
  • Firmware may need to be loaded after the device has been detected, but before it is usable by the system.
  • Mounting a partition in /etc/fstab may require tools in /usr which is a network filesystem that cannot be mounted until after networking has been brought up.

We’ve been able to hack the existing system to make much of this possible, however the result is chock-full of race conditions and bugs. It was time to design a new system that can cope with all of these things without any problems.

What we needed was an init system that could dynamically order the start up sequence based on the configuration and hardware found as it went along.

Design of upstart

upstart is an event-based init daemon; events generated by the system cause jobs to be started and running jobs to be stopped. Events can include things such as:

  • the system has started,
  • the root filesystem is now writable,
  • a block device has been added to the system,
  • a filesystem has been mounted,
  • at a certain time or repeated time period,
  • another job has begun running or has finished,
  • a file on the disk has been modified,
  • there are files in a queue directory,
  • a network device has been detected,
  • the default route has been added or removed.

In fact, any process on the system may send events to the init daemon over its control socket (subject to security restrictions, of course) so there is no limit.

Each job has a life-cycle which is shown in the graph below:

[Figure: upstart_state.png, the job life-cycle graph]

The two states shown in red (“waiting” and “running”) are rest states; normally we expect the job to remain in these states until an event comes in, at which point we need to take action to get the job into the next state.

The other states are temporary states; these allow a job to run shell script to prepare for the job itself to be run (“starting”) and clean up afterwards (“stopping”). For services that should be respawned if they terminate before an event that stops them is received, they may run shell script before the process is started again (“respawning”).

Jobs leave a state because the process associated with them terminates (or gets killed) and move to the next appropriate state, following the green arrow if the job is to be started or the red arrow if it is to be stopped. When a script returns a non-zero exit status, or is killed, the job will always be stopped. When the main process terminates and the job should not be respawned, the job will also always be stopped.

As already covered, events generated by the init daemon or received from other processes cause jobs to be started or stopped; also manual requests to start or stop a job may be received.

The communication between the init daemon and other processes is bi-directional, so the status of jobs may be queried and even changes of state to all jobs may be received.

How does it differ from launchd?

launchd is the replacement init system used in MacOS X developed as an “Open Source” project by Apple. For much of its life so far, the licence has actually been entirely non-free and thus it has only recently become interesting with the licence change.

Much of the goal of both systems appears initially to be the same; they both start jobs based on system events, however the launchd system severely limits the events to only the following:

  • system startup,
  • file modified or placed in queue directory,
  • particular time (cron replacement),
  • connection on a particular port (inetd replacement).

Therefore it does not actually allow us to directly solve the problems we currently have; we couldn’t mount filesystems once the “filesystem checked” event has been received, we couldn’t check filesystems when the block device is added and we certainly couldn’t start daemons once the complete filesystem (as described by /etc/fstab) is available and writable.

The launchd model expects the job to “sit and wait” if it is unable to start, rather than provide a mechanism for the job to only be started when it doesn’t need to wait. Jobs that need /usr to be mounted would need to spin in a loop waiting for /usr to be available before continuing (or use a file in a tmpfs to indicate it’s available, and use that modification as the event).

This is not especially surprising given that Apple have a high degree of control over both their hardware and the actual underlying operating system; they don’t need to deal with the wide array of different configurations that we have in the Linux world.

Had the licence been sufficiently free at the point we began development of our own system, we would probably have extended launchd rather than implement our own. At the point Apple changed the licence, our own system was already more suitable for our purposes.

How does it differ from initng?

Initng by Jimmy Wennlund is another replacement init daemon intended to replace the sysvinit system used by Linux. It is a dependency-based system, where upstart is an event-based system.

The notion of a dependency-based system is interesting to talk about at this point. Jobs declare dependencies on other jobs that need to happen before the job itself can be started. Starting the job causes its dependencies to be started first, and their dependencies, and so on. When jobs are stopped, any running jobs that nothing depends on any longer can themselves be stopped.

It’s a neat solution to the problem of ordering a fixed boot sequence and the problem of keeping the number of running processes to a minimum needed.

However this means that you need to have goals in mind when you boot the system, you need to have decided that you want gdm to be started in order for it, and its dependencies, to be started. Initng uses run levels to ensure this happens, where a run level is a list of goal jobs that should be running in that run level.

It’s also not clear how the dependencies interact with the different types of job; a dependency on Apache would need the daemon to be running where a dependency on “checkroot” would need the script to have finished running. Upstart handles this by using different events (“apache running” vs. “checkroot stopping”).

Again while interesting, Initng does not solve the problems that we wanted to solve. It can reorder a fixed set of jobs, but cannot dynamically determine the set of jobs needed for that particular boot.

A typical example would be that if the only dependency on the job that configures networking is the mount network filesystems job, then should that job fail or not be a goal (e.g. because there are no network filesystems to be mounted) the result is that network devices themselves will not be configured. You could make everything a goal, and just use the dependencies to determine the order; however, this is less efficient than just ordering the existing sysv-rc scripts (which can be done at install time).

Another example is that often you simply don’t know whether something is a dependency or not without reading other configuration. For example, the mount network filesystems job may be a dependency of everything under /usr, or may just be a dependency of anything that allows the user to log in if it only mounts /home.

The difference in model can be summed up as “initng starts with a list of goals and works out how to get there, upstart starts with nothing and finds out where it gets to.”
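The other half of that summary can be sketched the same way. This is a toy event loop in Python, not upstart's implementation or its job file syntax; the job names, the event names and the "<job> started" convention are all assumptions made for illustration. Jobs only say which event starts them, and the set of jobs that ends up running depends on which events actually occur.

```python
from collections import deque


def run_events(external_events, start_on):
    """Process events until none are pending; return jobs in start order."""
    pending = deque(external_events)
    started = []
    while pending:
        event = pending.popleft()
        for job, trigger in start_on.items():
            if trigger == event and job not in started:
                started.append(job)
                # Starting a job is itself an event other jobs may react to.
                pending.append(f"{job} started")
    return started


# Hypothetical job table: each job names the single event that starts it.
START_ON = {
    "checkroot": "startup",
    "networking": "network-device-added",
    "mount-network-filesystems": "networking started",
    "udev": "startup",
}

# A boot where a network device shows up...
print(run_events(["startup", "network-device-added"], START_ON))
# ...and one where it never does.
print(run_events(["startup"], START_ON))
```

In this sketch networking is configured because a device appeared, not because some goal job happened to depend on it, which is the behaviour the networking example above is asking for.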

How does it differ from Solaris SMF?

SMF is another approach to replacing init, developed by Sun for the Solaris operating system. Like initng, it’s a dependency-based system, so see above for the differences between those systems and upstart.

SMF’s main focus is service management: making sure that once services are running, they stay running, and allowing the system administrator to query and modify the states of jobs on the system.

Upstart provides the same set of functionality in this regard: services are respawned when they fail, and system administrators can at any time query the state of running services and adjust the state to their liking.
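As a rough illustration of what “respawned when they fail” means, here is a small supervision loop in Python. It is a sketch under assumptions, not how upstart or SMF actually do it: the function names, the respawn limit and the rule that a non-zero exit code counts as a failure are all choices made for this example.

```python
import subprocess


def run_command(argv):
    """Run a command to completion and return its exit code."""
    return subprocess.call(argv)


def supervise(argv, max_respawns=3, run=run_command):
    """Run argv, restarting it each time it exits with a failure.

    Returns the list of exit codes seen. Gives up after `max_respawns`
    restarts so that a permanently broken service cannot loop forever.
    """
    exit_codes = []
    for _ in range(max_respawns + 1):
        code = run(argv)
        exit_codes.append(code)
        if code == 0:  # assumption: zero means the job finished cleanly
            break
    return exit_codes
```

Calling supervise with the argument list of a real command would run it and restart it on failure; the run parameter exists only so the loop can be exercised without spawning processes.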

Will it replace cron, inetd, etc?

The goal of upstart is to replace those daemons, so that there is only one place (/etc/event.d) where system administrators need to configure when and how jobs should be run.

In fact, the goal is that upstart should also replace the “run event scripts” functionality of any daemon on the system. Daemons such as acpid, apmd and Network Manager would send events to init instead of running scripts themselves with their own peculiar configuration and semantics.

A system administrator who only wanted a particular daemon to be run while the computer was on AC power would simply need to edit /etc/event.d/daemon and change “on startup” to “on ac power”.

What about compatibility?

There are a lot of system administrators out there who have learned how Linux works already and will not want to learn again immediately; there are also a large number of books that cover the existing software and won’t cover upstart for at least a couple of years.

For this reason, compatibility is very important. upstart will continue to run the existing scripts for the foreseeable future so that packages will not need to be updated until the author wants.

Compatibility command-line tools that behave like their existing equivalents will also be implemented; a system administrator would never need to know that crontab -e is actually changing upstart jobs.

Does it use D-BUS?

“To D-BUS people, every problem seems like a D-BUS problem.”
– Erik Troan

The UNIX philosophy is that something should do just one job, and do it very well. upstart’s one job is starting, supervising and stopping other jobs; D-BUS’s one job is passing messages between other jobs.

D-BUS does provide a mechanism for services to be activated when the first message is sent to them, thereby starting other jobs. Some people have taken this idea and extended it to suggest that all a replacement init system need do is register jobs with D-BUS and turn booting into a simple matter of message parsing.

This seems wrong to me: D-BUS would need to be extended to supervise these services and provide means for them to be restarted and stopped, as well as deal with being process #1, which means cleaning up after children whose parents have died, etc. It seems far simpler to arrange for D-BUS to send an event to init when it needs a service to be started, and for D-BUS to focus on being a very good message passing system.

The IPC mechanism used by upstart is not currently D-BUS because of various problems, however it’s always been expected that even if init itself doesn’t communicate with D-BUS directly, there would be a D-BUS proxy that would ensure messages about all init jobs and events would be given to D-BUS and D-BUS clients could send messages to init to query and change the state of jobs.

What is the implementation plan?

Because this is process #1 we are changing, we want to make sure that we get it right. Therefore instead of releasing a fully-featured daemon and configuration to the world, we’re developing it in the following stages:

  1. Principal development; at the end of this stage the daemon has been implemented and can manage jobs as described.
  2. Replacement of /sbin/init while running the existing sysv-rc scripts. This is the shake-down test of the daemon, can it perform the same job as the existing sysvinit daemon without any regressions?
  3. /etc/rcS.d scripts replaced by upstart jobs. These constitute the majority of tasks for booting the system into at least single-user mode, and contain many of the current ordering problems and race conditions. If the daemon solves the problems here, it will be a success.
  4. Other daemons’ scripts replaced by upstart jobs on a package-by-package basis; this will be an ongoing effort during which upstart will continue running the existing sysv-rc scripts as well as its own jobs. During this time the event system may be tweaked to ensure it truly solves the problems we need.
  5. Replacement of cron, atd, anacron and inetd. This will happen alongside the above and result in a single place to configure system jobs.
  6. Modification of other daemons and processes to send events to init instead of trying to run things themselves.

The current plan is that we will be at least part of the way into stage #3 by the time edgy is released, with that release shipping with upstart as the init daemon and the most critical rcS scripts being run by it to correct the major problems.

For edgy+1 we hope to have completed stage #5 and be at least part of the way into the implementation of stage #6. From the start of development of edgy+2, no new packages will be accepted unless they provide upstart jobs instead of init scripts and init scripts will be considered deprecated.

What state is it in now?

The init daemon has been written and is able to manage jobs as described above, receiving events on the control socket to start and stop them. This has now been uploaded to the Ubuntu universe component in the upstart package for testing before it becomes the init daemon.

We welcome any experienced users who want to help test this; install the package and follow the instructions in /usr/share/doc/upstart/README.Debian to add a boot option that will use upstart instead of init. If your system boots and shuts down normally (other than a slightly more verbose boot without usplash running) then it is working correctly.

Other types of events will be added as required during development and testing. Currently only a basic client tool (initctl) has been written; compatibility tools such as shutdown will be written over the next week or two before it replaces our sysvinit package.