Opened 10 years ago

Closed 8 years ago

#744 closed enhancement (wontfix)

"queued" status for dependent builders

Reported by: tfogal
Owned by:
Priority: minor
Milestone: 0.8.+
Version: master
Keywords: buildcoord
Cc: warner

Description

I've got a scheduler which is dependent on my main build:

 sched_visit = Scheduler(
   name="sched-build",
   branch=None,
   treeStableTimer=2*60,
   builderNames=["main-debian-lenny-amd64", "main-osuse-11.1-amd64",
                 "main-osx-10.5-x86"]
 )
 sched_tests_databases = Dependent(
   name="sched-tests-databases",
   upstream=sched_visit,
   builderNames = ["test-db-s-debian-lenny-amd64",
                   "test-db-s-osuse-11.1-amd64",
                   "test-db-s-osx-10.5-x86"]
 )

These share the same set of slaves; for example:

 tjf_builders.append({
   'name': "main-debian-lenny-amd64",
   'slavenames': ["tjf-debian-lenny-amd64"],
   'builddir': "main-%s" % "lenny-amd64",
   'env': {'VISITARCH' : "bbot", 'BBDIR' : "lenny-amd64" },
   'factory': f_incremental_visit,
   'locks': [single_bld.access('counting')]
 })
 tjf_builders.append({
   'name': "test-db-s-debian-lenny-amd64",
   'slavenames': ["tjf-debian-lenny-amd64"],
   'builddir': "test-db-s-%s" % "lenny-amd64",
   'env': {'VISITARCH' : "bbot", 'BBDIR' : "lenny-amd64" },
   'factory': f_tests_db,
   'locks': [single_bld.access('counting')]
 })

In the /builders/ web interface, the test-db-s builders show a status of "building" while the main- builders are running. This is a bit confusing; it would be nicer if they said "Queued", or something else to indicate that they are not running yet but will run at some point.

Change History (8)

comment:1 Changed 10 years ago by tfogal

Correction/update: I was confused because I couldn't seem to reproduce this after submitting the ticket... yet I obviously saw it.

It turns out this situation only arises if a new commit comes in while "sched-build" is building. Even then, the dependent builder still lists itself as "idle". It only goes to "building" status (while actually doing nothing, blocked waiting on its upstream) when "sched-build" starts the next build.

comment:2 Changed 9 years ago by dustin

  • Cc warner added

Is this in 0.8.0, then?

I don't really understand how the state tracking works with the new Dependent. Warner?

comment:3 Changed 9 years ago by dustin

  • Milestone changed from undecided to 0.8.+

I also don't understand how *schedulers* are waiting, but *builders* show a status like this - do they just arbitrarily pick one of the schedulers that can trigger them, and show that scheduler's status?

comment:4 Changed 9 years ago by dustin

  • Keywords buildcoord added
  • Milestone changed from 0.8.+ to 0.8.1

comment:5 Changed 9 years ago by dustin

  • Milestone changed from 0.8.2 to 0.8.3

comment:6 Changed 9 years ago by dustin

  • Milestone changed from 0.8.3 to 0.8.+

comment:7 Changed 8 years ago by callek

Resolve as intentional? We only want to show queued when we are SURE we want the build and no slaves are free... right?

comment:8 Changed 8 years ago by dustin

  • Resolution set to wontfix
  • Status changed from new to closed

Are maxBuilds or locks in use here? I suspect that what's happening is this:

  • sched-build gets a change, and creates a buildset with requests for all "main-*" builders
  • dependent scheduler attaches itself to that buildset, but this is not reflected in any status display
  • those builders start running -- correctly reflected in the status ("main-*" are building, "test-*" are idle)
  • sched-build gets another change, and creates a new, similar buildset
  • all builds in the first buildset complete, and the dependent scheduler creates a buildset with requests for all "test-*" builders
  • the next buildset in the queue is the second sched-build-created buildset, so the "main-*" builders start again, and get the slaves. At this point, "main-*" are building and "test-*" are idle.
  • once that's done, the master starts the "test-*" builds based on the dependent's buildset, but those immediately block on the locks preventing slave concurrency. At this point, "main-*" are busy and "test-*" are busy -- locks are not specially reflected in the build's status.
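The sequence above can be sketched as a toy model. This is only an illustration under stated assumptions: `SlaveLock`, `Builder`, `status()` and `proposed_status()` are hypothetical names invented here, not Buildbot APIs, and the model assumes one slave guarded by a single-slot lock, with a build marked as started before it tries to take the lock.

```python
# Toy model only: none of these classes or methods are Buildbot APIs.

class SlaveLock:
    """Single-slot lock standing in for the shared `single_bld` lock."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, builder_name):
        # Grant the lock only if nobody holds it.
        if self.owner is None:
            self.owner = builder_name
            return True
        return False

    def release(self, builder_name):
        if self.owner == builder_name:
            self.owner = None


class Builder:
    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.started = False   # the master has started a build
        self.has_lock = False  # the build actually holds the slave lock

    def start_build(self):
        # Assumption: the build counts as started before the lock is taken.
        self.started = True
        self.has_lock = self.lock.try_acquire(self.name)

    def finish_build(self):
        self.lock.release(self.name)
        self.started = False
        self.has_lock = False

    def status(self):
        # Behaviour reported in this ticket: lock waits are invisible.
        return "building" if self.started else "idle"

    def proposed_status(self):
        # Behaviour requested in this ticket: separate waiting from running.
        if not self.started:
            return "idle"
        return "building" if self.has_lock else "queued"


lock = SlaveLock()
main = Builder("main-debian-lenny-amd64", lock)
test = Builder("test-db-s-debian-lenny-amd64", lock)

main.start_build()   # first change: main build runs and holds the lock
# a second change arrives here and is queued for the main builder
main.finish_build()  # first buildset completes; the dependent requests a test build
main.start_build()   # the second main request was queued first and takes the lock
test.start_build()   # the test build starts but blocks on the lock

print(main.status(), test.status())
print(main.proposed_status(), test.proposed_status())
```

In this model the first line of output reports both builders as "building", which is the confusing display described in the ticket, while the second line reports the test builder as "queued". Producing that second view is only possible because the toy `Builder` knows whether it holds the lock, which is the information the status code does not have easy access to.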

It might be nice to display this differently, but I don't think that's practical at this point -- the lock logic is not very well exposed to the status. So I'm going to close as wontfix. I expect that this will look a little better in the brave new world of 0.9.x, and we can revisit the idea at that time.
