Ticket #48 (closed enhancement: fixed)
Slave concurrency
| Reported by: | dustin | Owned by: | warner |
|---|---|---|---|
| Priority: | major | Milestone: | 0.7.6 |
| Version: | 0.7.5 | Keywords: | |
| Cc: |
Description
Some patches I sent to the mailing list, to allow configurable limits on slave concurrency.
Attachments
Change History
comment:2 Changed 5 years ago by dustin
apply in this order:
use-objects-in-config.patch canStartBuild.patch track-slavebuilders.patch slavebuilder-concurrency.patch
Changed 5 years ago by dustin
-
attachment
canStartBuild.patch
added
updated version to apply against CVS HEAD
comment:4 Changed 5 years ago by dustin
Greg Ward suggests I add some detail on these patches. Here it is! Some of this is drawn from the patch headers, which Trac seems not to show.
use-objects-in-config.patch
This patch changes the bot configuration to use objects instead of tuples. This will allow us to move some bot-specific functionality into BuildSlave, and also allow users to subclass BuildSlave for their own purposes. The existing configuration format is supported as a backward-compatibility measure.
* buildbot/buildslave.py: add new BuildSlave class, suitable for use
in master.cfg
* buildbot/master.py: use the BotPerspective subclass, BuildSlave,
as a configuration source; support consumption of successors' souls
as a reconfiguration technique
* docs/buildbot.texinfo buildbot/scripts/sample.cfg: documentation
* buildbot/test/test_config.py: update to use new bot configuration
canStartBuild.patch
Add a new function, canStartBuild, to build slaves. This allows slave-level (rather than SlaveBuilder?-level) control over whether or not builds can run. The implementation in this patch just returns True, which preserves the existing behavior (no limit on concurrency in a given slave).
* buildbot/buildslave.py: add canStartBuild()
* buildbot/process/builder.py: call BuildSlave's canStartBuild()
* buildbot/test/test_run.py: corresponding tests
track-slavebuilders.patch
This patch adds machinery to track the SlaveBuilders? running on a given slave. This information is required for concurrency control, as a means to determine just what *is* currently running on the slave.
* buildbot/buildslave.py: add addSlaveBuilder() and removeSlaveBuilder()
* buildbot/process/builder.py: call BuildSlave's addSlaveBuilder() and
removeSlaveBuilder()
* test/test_slaves.py: corresponding tests
slavebuilder-concurrency.patch
The patch that makes it all worthwhile -- the BuildSlave class, which is instantiated in the configuration file, gains an optional concurrency parameter which will limit the total number of builds that will run simultaneously on that slave. The existing behavior (unlimited concurrency) is the default.
This change also required a change to the build-starting mechanism, since the completion of a build in one builder may allow a build in another builder to start.
* buildbot/buildslave.py buildbot/process/builder.py: Add a concurrency
limit to the BuildSlave class, and use that to limit the total
number of concurrent builds on a given BuildSlave.
* buildbot/scripts/sample.cfg: describe new concurrency_limit keyword arg
* buildbot/process/builder.py: call the botmaster's maybeStartAllBuilds
when a build is finished, to potentially trigger a build from another
builder.
* buildbot/test/test_run.py: corresponding tests
* docs/buildbot.texinfo: documentation
comment:5 Changed 5 years ago by dustin
bug #63 was a duplicate, more or less, of the first patch in this bug.
comment:6 Changed 5 years ago by warner
first off, yay! patches with unit tests and docs! wonderful!
secondly, I love the idea of long-lived BuildSlave objects which keep track of information about specific build slaves. Given the sorts of uses we can put these to, I prefer your approach (including the replacing-your-predecessor logic) to the simpler form I did in #63. I went with the simpler form because I had smaller goals and wanted something that didn't require as much thought about the update-the-config logic (in the hopes of letting 0.7.6 ship sooner), but I think I'm going to replace it with your approach.
In particular, being able to recover from connection interruptions (#25) would be improved by having a place to put connection- or event- sequence numbers that outlives the connection or the build. So in the longer run, I think we want this.
thirdly: while having a BuildSlave object is a good starting point for things like load-average-sensitive and policy-based build dispatch, does attachment:slavebuilder-concurrency.patch add any immediate functionality that we couldn't get by using a SlaveLock? with maxCount > 1 ?
I like this approach in general, I'm just trying to decide whether it should go in 0.7.6 or in the following release, so I'm balancing risks vs benefits.
thanks,
-Brian
comment:7 Changed 5 years ago by warner
I keep asking this question and I keep forgetting the answer: the reason that this approach is more powerful than a SlaveLock? is that the slave-dispatcher has the opportunity to choose a slave that isn't already busy. Doing this with a SlaveLock? means that it will sometimes pick a slave that happens to be busy, then the build will stall while it waits for the lock to free up. But the dispatcher could notice that slave3 is idle (or at least below its job limit) and preferentially assign builds to it rather than sending them to the busy ones.
Maybe this time I'll remember the answer long enough to get this code merged and thus make the question go away altogether... :)
Changed 5 years ago by dustin
-
attachment
use-objects-in-config.patch
added
updated to apply against HEAD
comment:8 Changed 5 years ago by dustin
Here's an updated version of use-objects-in-config.patch, to apply against HEAD. This incoroprates my patches to subclass BotPerspective? and make the objects created in the config file active participants in the buildbot processing, while keeping your new syntax and so on.
You hit the nail on the head as to why this approach beats a lock -- thanks.
From the sound of it, it's too late to split these patches up into separate tickets -- it sounds like you're considering merging or not merging the whole lot (which is, I think, the right way to look at it). So I'll upload updated versions of the remaining patches too.
Changed 5 years ago by dustin
-
attachment
track-slavebuilders.2.patch
added
updated to apply against HEAD
Changed 5 years ago by dustin
-
attachment
slavebuilder-concurrency.2.patch
added
updated to apply against HEAD
comment:9 follow-up: ↓ 10 Changed 5 years ago by warner
ok, I'm going to apply these, but I have to make some changes along the way. Starting with use-objects-in-config, I'm thinking the BuildSlave class and BotPerspective? should merge (since there's no use for a BotPerspective? in isolation), and they should move out to a separate file (since putting import statements in an init.py makes me nervous). The test_config.py entry that uses cbots? was testing for backwards compatibility, so we should leave that in for a release or two (but update the test to insure that BuildSlave instances are retained when they're supposed to be and not when they're not.
Also, consumeTheSoulOfYourSuccessor isn't actually called anywhere.. we'll need some more logic in the update-config routine to notice that a BuildSlave should be updated instead of replaced.
I'll see about hacking on this a little bit tonight.
comment:10 in reply to: ↑ 9 Changed 5 years ago by dustin
Replying to warner:
ok, I'm going to apply these, but I have to make some changes along the way. Starting with use-objects-in-config, I'm thinking the BuildSlave class and BotPerspective? should merge (since there's no use for a BotPerspective? in isolation), and they should move out to a separate file (since putting import statements in an init.py makes me nervous). The test_config.py entry that uses cbots? was testing for backwards compatibility, so we should leave that in for a release or two (but update the test to insure that BuildSlave instances are retained when they're supposed to be and not when they're not.
All of these changes sound fine -- I wasn't that happy with the consume* functionality, anyway.
comment:11 Changed 5 years ago by warner
I'm running tests on my modified form of the first change now.. should commit in a few minutes. I moved the new merged BuildSlave class out to buildbot/buildslave.py, and updated docs/tests to match. I also changed loadConfig_Slaves to compare the new c['slaves'] against the old by comparing tuples of (name, password, class), since a change to any of the three requires a detach/reattach cycle (if they changed the password, that's an indication that they *don't* want the old slave reconnecting, unless/until they change the password on the slave too).
I renamed that consume* method to be just 'update', since the functionality is still useful.. BuildSlave instances are long-lived but mutable (as long as you're changing something other than name/password/class), so updating the existing instance using the specification found in the config file is the best thing to do. I renamed it from consume* to update (despite an obvious fondness for the imagery :) because the word "Successor" isn't quite accurate: in this case the new instance found in c['slaves'] is replacing the old one (since the old one is long lived), but rather the old one is mutated to make it look just like the new one, then the new one is thrown away.
I still need to think about whether we should be using ComparableMixin here.. given the name/password/class comparison, I'm not sure it is useful or even correct to play around with equality on this class.
comment:12 Changed 5 years ago by warner
ok, I've incorporated the first patch (use-objects-in-config), in [aeffc77a34747c38c4d66a2ec7fe9663c9947dfb]. I'm working on the next one now.
comment:13 Changed 5 years ago by warner
- Status changed from new to closed
- Resolution set to fixed
- Milestone set to 0.7.6
I've just finished merging in the last of these, in [2125831d5cfbae353bbb504710cfafef6efce56c]. I changed the name of the parameter to max_builds= (instead of concurrency_limit=), and cleaned up the unit tests a bit.
thanks!
![[Buildbot Logo]](/chrome/site/header-text-transparent.png)