wiki:DevelopmentPlan

Version 15 (modified by dustin, 14 months ago)

This page sketches the future of Buildbot. This is an open-source project with no full-time developers, so no timeline is in place. Instead, this serves as a roadmap for important projects so that we can focus our efforts on the next step or two. Along the way we will, of course, fix many defects and make several releases.

Goals

  • Efficient, scalable operation
  • Provide well-defined APIs for users to build on
  • Store all state in a database
  • Real-time updates as builds proceed
  • Modern, fast web interface
  • Pluggable backends to suit different use-cases
  • Support distributed masters with no loss in functionality

We will make a significant compatibility break at 0.9.0:

  • Process objects (e.g., BuildRequest) and status objects (e.g., BuildStepStatus) are no longer part of the configuration interface
  • Most custom subclasses (status listeners, schedulers, change sources, but notably not steps) will need to be rewritten to use the data api
  • WebStatus will be replaced with 'www', with a different configuration and customization interface.
  • Various plugins like the status client and debug client will be replaced with implementations using different interfaces

Target Architecture

Buildbot's per-master architecture is illustrated in buildbot-architecture.png. From bottom to top, it operates as follows. (Note that the Inkscape source for the image is in https://github.com/buildbot/buildbot/tree/master/media)

db

The db layer implements a pluggable, abstracted, persistent storage layer. All masters in a cluster have access to the same persistent storage, although Buildbot does not make any strong assumptions about the timing of updates to that storage. For example, a replicated MySQL database may have up to a few seconds' lag between a write from one master and a read of that data on another master.

The db api presented by this layer is a Python-only, asynchronous API, and is very closely tied to the data being stored.

mq

The mq layer implements a pluggable message queueing system. Depending on your background, you can think of this as a single message bus supporting pattern-matching against structured message topics, or in AMQP terms a single topic exchange. The mq layer itself is content-agnostic. The mq layer interacts with the db layer to update per-master clocks, which are used by the data layer to provide consistency guarantees.

The mq api simply allows callers to publish and consume messages.
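As a rough illustration of the topic-exchange idea described above, here is a minimal in-memory sketch. The class name SimpleMq, the publish/consume signatures, and the dotted topic names are assumptions made for this example, not the actual mq api. Patterns follow AMQP topic conventions: `*` matches exactly one word and `#` matches zero or more words.

```python
class SimpleMq:
    """Hypothetical in-memory stand-in for a single topic exchange."""

    def __init__(self):
        self._consumers = []  # list of (pattern_words, callback)

    @staticmethod
    def _matches(pattern, topic):
        # pattern and topic are tuples of words.
        if not pattern:
            return not topic
        head, rest = pattern[0], pattern[1:]
        if head == '#':
            # '#' may swallow zero or more words of the topic.
            return any(SimpleMq._matches(rest, topic[i:])
                       for i in range(len(topic) + 1))
        if not topic:
            return False
        if head == '*' or head == topic[0]:
            return SimpleMq._matches(rest, topic[1:])
        return False

    def consume(self, pattern, callback):
        """Register callback(topic, message) for topics matching pattern."""
        self._consumers.append((tuple(pattern.split('.')), callback))

    def publish(self, topic, message):
        """Deliver message to every consumer whose pattern matches topic."""
        words = tuple(topic.split('.'))
        for pattern, callback in list(self._consumers):
            if self._matches(pattern, words):
                callback(topic, message)


# Usage: the mq layer never inspects the message body (content-agnostic).
mq = SimpleMq()
mq.consume('build.*.finished', lambda topic, msg: print(topic, msg))
mq.publish('build.12.finished', {'results': 0})
mq.publish('build.12.started', {})  # no consumer matches this topic
```

A real implementation would sit behind a pluggable backend; the sketch only shows the matching and delivery contract that callers would see.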

data

The data layer combines access to stored state and messages, ensuring consistency between them, and exposing a well-defined API that can be used both internally and externally. Using caching and the clock information provided by the db and mq layers, this layer ensures that its callers can easily receive a dump of current state plus updates to that state, without missing or duplicating updates.
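One common way to deliver "current state plus updates, without missing or duplicating" is to subscribe before reading the state, and then discard any update whose clock value is already reflected in the dump. The sketch below shows that technique with a single in-process counter. StateFeed, follow, and the integer clock are hypothetical names for illustration, and the real design (per-master clocks, caching) is more involved.

```python
class StateFeed:
    """Hypothetical source of state plus clock-stamped update messages."""

    def __init__(self):
        self.clock = 0
        self.state = {}
        self.subscribers = []

    def update(self, key, value):
        # Every change bumps the clock and is announced with that stamp.
        self.clock += 1
        self.state[key] = value
        message = (self.clock, key, value)
        for deliver in list(self.subscribers):
            deliver(message)

    def snapshot(self):
        # Return the state together with the clock value it reflects.
        return self.clock, dict(self.state)


def follow(feed, on_update):
    """Return a dump of current state; later changes go to on_update."""
    pending = []
    buffer_msg = pending.append
    feed.subscribers.append(buffer_msg)   # 1. subscribe first: nothing is missed
    as_of, state = feed.snapshot()        # 2. then read the state

    def deliver(message):
        clock, key, value = message
        if clock > as_of:                 # 3. skip updates already in the dump
            on_update(key, value)

    # Swap the buffer for live delivery, then replay what was buffered.
    feed.subscribers[feed.subscribers.index(buffer_msg)] = deliver
    for message in pending:
        deliver(message)
    return state


feed = StateFeed()
feed.update('builder-1', 'idle')
updates = []
dump = follow(feed, lambda key, value: updates.append((key, value)))
feed.update('builder-1', 'building')
```

After this runs, `dump` holds the state as of subscription time and `updates` holds only the change made afterwards.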

The data api is divided into three sections:

  1. ro ("read only"), which allows reading any state and subscribing to any messages;
    • getters - fetching data from the db API
    • subscriptions - subscribing to messages from the mq layer
  2. control, which allows state to be changed in specific ways by sending appropriate messages (e.g., stopping a build); and
  3. rw ("read/write"), which allows direct updates to state while sending appropriate messages

The ro section is exposed everywhere. Access to the control section should be authenticated at higher levels, as the data layer does no authentication. The rw section is for use only by the process layer -- all external interaction with that layer should be done via the rw section.
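The three sections can be pictured with a small sketch. Everything here is assumed for illustration: the class DataSketch, its method names, and the topic strings are not the real data api. The point is only the division of responsibility: ro reads and subscribes, control sends a message without touching state, and rw changes state and sends the matching message.

```python
class DataSketch:
    """Hypothetical illustration of the ro / control / rw split."""

    def __init__(self):
        self._state = {}        # stands in for the db layer
        self._subscribers = []  # stands in for the mq layer

    def _send(self, topic, message):
        for callback in list(self._subscribers):
            callback(topic, message)

    # ro section: getters and subscriptions, exposed everywhere
    def get_build_state(self, build_id):
        return self._state.get(('build', build_id))

    def subscribe(self, callback):
        self._subscribers.append(callback)

    # control section: request a change by sending a message only;
    # authentication is expected to happen in a higher layer
    def stop_build(self, build_id):
        self._send('control.build.stop', {'build': build_id})

    # rw section: update state directly and announce the change;
    # intended for the process layer only
    def set_build_state(self, build_id, state):
        self._state[('build', build_id)] = state
        self._send('build.updated', {'build': build_id, 'state': state})


data = DataSketch()
topics = []
data.subscribe(lambda topic, message: topics.append(topic))
data.set_build_state(7, 'running')   # rw: state changes, message sent
data.stop_build(7)                   # control: message sent, state untouched
```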

See DevelopmentPlan-DataApi for more detail.

status listeners

This layer contains components that consume state changes and present that information as desired. The prototypical example is the IRC client. If implemented in the Buildmaster process, these components can use the Python data API. Otherwise, they can be built atop the language-agnostic APIs exposed by the www layer.

www

Technically, the www layer is a status listener, but it is complex enough to deserve its own box in the diagram. This layer exposes the data API externally, via typical web technologies like a REST API, WebSockets, comet, and so on. It also contains and serves the static content that comprises the Buildbot web UI, although that content can just as well be served by a dedicated application like Apache httpd or nginx. This layer is responsible for web-based authentication of access to the control section of the data api.

process

This little box holds all of the interesting parts of Buildbot. Change sources, Schedulers, Builders, Builds, Steps, and so on are all part of the process layer. In fact, these are all distinct components within this layer: change sources use the rw section of the data api to add changes, and do not interact directly with schedulers. Where necessary, schedulers use the ro section to learn about new changes, and then use the rw section to schedule new builds. Likewise, builders are alerted to new build requests by the data layer, and feed the results of the subsequent builds back into the data layer.
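The decoupling described above, where components talk only to the data layer and never to each other, can be sketched as follows. All class names, method names, and topic strings are hypothetical, and the in-memory Data class merely stands in for the data layer.

```python
class Data:
    """Hypothetical stand-in for the data layer (state plus messages)."""

    def __init__(self):
        self.changes, self.buildrequests, self.builds = [], [], []
        self._subs = []  # list of (topic, callback)

    def subscribe(self, topic, callback):      # ro section
        self._subs.append((topic, callback))

    def _send(self, topic, message):
        for wanted, callback in list(self._subs):
            if wanted == topic:
                callback(message)

    def add_change(self, change):              # rw section
        self.changes.append(change)
        self._send('change.new', change)

    def add_buildrequest(self, request):       # rw section
        self.buildrequests.append(request)
        self._send('buildrequest.new', request)

    def add_build(self, build):                # rw section
        self.builds.append(build)
        self._send('build.finished', build)


class ChangeSource:
    def __init__(self, data):
        self.data = data

    def found(self, revision):
        # Knows nothing about schedulers; it only records the change.
        self.data.add_change({'revision': revision})


class Scheduler:
    def __init__(self, data, builder_name):
        self.data, self.builder_name = data, builder_name
        data.subscribe('change.new', self.on_change)

    def on_change(self, change):
        self.data.add_buildrequest(
            {'builder': self.builder_name, 'revision': change['revision']})


class Builder:
    def __init__(self, data, name):
        self.data, self.name = data, name
        data.subscribe('buildrequest.new', self.on_request)

    def on_request(self, request):
        if request['builder'] != self.name:
            return
        # A real builder would run steps; here the result is hard-coded.
        self.data.add_build(dict(request, results='success'))


data = Data()
Scheduler(data, 'runtests')
Builder(data, 'runtests')
ChangeSource(data).found('abc123')
```

Because each component depends only on the data layer, a scheduler on one master could in principle react to a change recorded by another master.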

Implementation

See the Roadmap for a mapping of the following to particular versions.

Implement and use mq api

This entails implementing the mq api and converting what are now implemented as subscriptions to use the api -- specifically, changes, buildsets, and build requests. This leaves the task of coordinating messages and db changes to the master, which operates directly against the db api and the mq api. The StatusReceiver interface is adapted so that the new messages are properly translated into method calls.
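The translation from messages to method calls might look roughly like the adapter below. The topic strings, the routing table, and the receiver's method names are assumptions for this sketch (loosely modeled on listener-style callbacks) and should not be read as the actual StatusReceiver interface.

```python
class MessageAdapter:
    """Hypothetical adapter: turns mq messages into receiver method calls."""

    # topic -> name of the method to call on the receiver (illustrative only)
    ROUTES = {
        'change.new': 'changeAdded',
        'buildset.new': 'buildsetSubmitted',
        'buildrequest.new': 'requestSubmitted',
    }

    def __init__(self, receiver):
        self.receiver = receiver

    def on_message(self, topic, message):
        """Dispatch one message; return True if a method was called."""
        name = self.ROUTES.get(topic)
        method = getattr(self.receiver, name, None) if name else None
        if method is None:
            return False  # unknown topic, or receiver does not care
        method(message)
        return True


class PrintingReceiver:
    """Example receiver that implements only some of the callbacks."""

    def __init__(self):
        self.calls = []

    def changeAdded(self, change):
        self.calls.append(('changeAdded', change))

    def requestSubmitted(self, request):
        self.calls.append(('requestSubmitted', request))


receiver = PrintingReceiver()
adapter = MessageAdapter(receiver)
handled = [
    adapter.on_message('change.new', {'revision': 'abc123'}),
    adapter.on_message('buildset.new', {'bsid': 1}),   # not implemented
    adapter.on_message('log.chunk', {}),               # unknown topic
]
```

Keeping the routing in one table makes it easy to retire the adapter once listeners consume messages directly.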

Include status in the db, mq, and data apis

The db api currently stops somewhere around source stamps and builds (and leaves a particularly ragged edge, sadly). The API should be extended to cover everything currently described as "status" - builds, steps, logs, and so on - in the same API. This does *not* entail moving that data into a database, but does entail a sufficiently flexible design to make that move later. The process layer gets better-defined here, as it begins updating state via the data api, with corresponding messages sent via the mq api and state changes made via the db api.

Because the db api is asynchronous, this step will break all of the existing users of the (mostly synchronous, unspecified, and horribly convoluted) status interface. So this is a moment when a release is not possible. However, the rest is downhill.

Implement the www layer

This means implementing the REST API and methods for consuming messages, along with some client-side JavaScript to make it easy to use there.

Implement the web UI

Related, but in a separate project, we'll need to build a good user experience on the web.

Rewrite status plugins

The remaining status plugins will need to be rewritten to use the data api.

Not Included

There are a number of other improvements that can be made in parallel with the above, but are not related:

  • message-based master/slave communication
  • declarative configuration
