Opened 3 years ago

Closed 3 years ago

#2960 closed undecided (fixed)

Builder page shows that some builds exist, but their own pages don't

Reported by: Ben
Owned by: tardyp
Priority: major
Milestone: 0.9.0
Version: 0.8.9
Keywords: web


I'm having trouble with the pages of some builds (always the latest one): the build is listed on the builder page, but its own page cannot be shown and I end up on the page with the next lower build number instead. As a consequence, when I click on a build number, I sometimes land on the page of the previous build.

This sometimes happens when a build has been forced and the redirect sends me directly to the wrong build page, or when I am curious and want to see the page of a build that has just started, and cannot see it.

I set json_cache_seconds=0, thinking some cache was playing tricks on me, but it didn't change anything.

It sometimes works perfectly fine: I force a build and get sent directly to the page of the build in progress.

Tell me which information I should provide!

Change History (9)

comment:1 Changed 3 years ago by dustin

  • Keywords web added; www removed

comment:2 Changed 3 years ago by tardyp

  • Owner set to tardyp
  • Status changed from new to accepted

Yes, it is a bug that I am aware of.

It is related to the buildbot service's internal cache. The first time it tries to get the build, it gets a 404 and stores that in the cache. After that, it never refreshes the cached entry for that build.
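
The failure mode described here can be sketched roughly as follows. This is a simplified, synchronous illustration and not the actual buildbot service code: `NaiveCache`, `Result`, `lookup`, and the `builds/5` path are hypothetical names, and the real service is assumed to be asynchronous.

```typescript
// Hypothetical sketch of a cache that memoizes every response, errors included.
type Result<T> = { status: 200; data: T } | { status: 404 };

class NaiveCache<T> {
  private entries = new Map<string, Result<T>>();

  constructor(private fetch: (path: string) => Result<T>) {}

  get(path: string): Result<T> {
    let hit = this.entries.get(path);
    if (hit === undefined) {
      hit = this.fetch(path);
      // Bug: a 404 is stored exactly like a successful response,
      // so the backend is never asked about this path again.
      this.entries.set(path, hit);
    }
    return hit;
  }
}

// Fake backend (assumption for the sketch): build 5 does not exist yet.
const backend = new Map<string, { number: number }>();
const lookup = (path: string): Result<{ number: number }> => {
  const data = backend.get(path);
  return data ? { status: 200, data } : { status: 404 };
};

const cache = new NaiveCache(lookup);
const before = cache.get("builds/5");   // 404, and the 404 is now cached
backend.set("builds/5", { number: 5 }); // the build is forced and now exists
const after = cache.get("builds/5");    // still the cached 404
```

In this sketch the second lookup never reaches the backend, which matches the symptom of a build that is listed on the builder page but whose own page cannot be shown.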

The code is very generic, so this is probably not only related to builds. It is not very easy to debug, and I haven't got the time to fix it yet.

The workaround is to just reload the page.

comment:3 Changed 3 years ago by Ben

You're right, I was able to reproduce the problem by first visiting the page of the next build number (+1) and then forcing a build. I automatically landed on the page of the build before the one I had just forced (cache hit -> 404 -> id - 1).

I was also able to mitigate the problem by commenting out lines 27 and 28 here:

I believe we are caching (memoizing, to be precise) at the wrong level. restangular is based on $http, and $http also implements caching. I believe that $http can handle request caching better than _.$http#caching

restangular itself recommends using $http caching:

comment:4 Changed 3 years ago by tardyp

The way it is done allows us not only to cache, but also to manage updates automatically. Several attempts to listen to the same build will not only avoid creating additional HTTP requests for the data, but also avoid registering duplicate update events for that build.

This gives a strong impression of speed. Once a build page has been seen, you can go back to it instantly. Updates even keep working in the background, as the events are unregistered only a few minutes after the last consumer has unbound the data.
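
A rough sketch of this idea: shared objects with a consumer count and a grace period before release. It is illustrative only. `SharedBindings`, `Entry`, `graceMs`, and `sweep` are hypothetical names, time is passed in explicitly to keep the sketch deterministic, and the update-event subscription itself is not modeled.

```typescript
interface Entry<T> {
  data: T;
  consumers: number;        // how many views currently use this object
  idleSince: number | null; // time of the last unbind, or null while in use
}

class SharedBindings<T> {
  private entries = new Map<string, Entry<T>>();
  loads = 0; // number of backend requests actually made

  constructor(private load: (path: string) => T, private graceMs: number) {}

  bind(path: string): T {
    let entry = this.entries.get(path);
    if (entry === undefined) {
      this.loads++;
      entry = { data: this.load(path), consumers: 0, idleSince: null };
      this.entries.set(path, entry);
    }
    entry.consumers++;
    entry.idleSince = null;
    return entry.data;
  }

  unbind(path: string, now: number): void {
    const entry = this.entries.get(path);
    if (entry === undefined || entry.consumers === 0) return;
    entry.consumers--;
    if (entry.consumers === 0) entry.idleSince = now;
  }

  // Drop entries that have had no consumer for longer than the grace period.
  sweep(now: number): void {
    this.entries.forEach((entry, path) => {
      if (entry.idleSince !== null && now - entry.idleSince > this.graceMs) {
        this.entries.delete(path);
      }
    });
  }
}

const bindings = new SharedBindings((path: string) => ({ path, state: "running" }), 5000);
const a = bindings.bind("builds/5");
const b = bindings.bind("builds/5"); // same object, no second request
bindings.unbind("builds/5", 0);
bindings.unbind("builds/5", 0);      // last consumer leaves at t = 0
bindings.sweep(1000);                // within the grace period: kept
const c = bindings.bind("builds/5"); // instant, still the same object
bindings.unbind("builds/5", 2000);
bindings.sweep(10000);               // idle for 8000 ms > 5000 ms: dropped
const d = bindings.bind("builds/5"); // triggers a new request
```

Returning the same object to every consumer is what lets the bindings survive navigation, and the grace period is what makes going back to a recently viewed page feel instant.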

comment:5 Changed 3 years ago by Ben

Sure, I understand that relying on the $http cache would (maybe?) create a new object for every request, whether it comes from the cache or from the db, and that the current approach ensures that the same object is returned for all subsequent requests, keeping the bindings and the rest.

The point is that, because we are caching at the wrong level, we are caching the wrong thing (a 404). That is an architectural problem that can hardly be worked around in the current situation without adding much unnecessary complexity, since we need information about the status of the underlying HTTP request, which is a few levels below. That is quite difficult to achieve without violating layers.

As it is now, it just doesn't work. I do recognize the simplicity of the current solution and its huge advantages, but we have to move on to a suitable solution, even if that means, at first, no caching at all.


comment:6 Changed 3 years ago by tardyp

I really don't think it is an architectural problem. It is just an implementation bug that needs to be solved. A 404 error should propagate through the layers, be converted to a proper exception, and be retried at the next bind() call.
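
A minimal sketch of that behaviour, under stated assumptions: `RetryingCache` and `NotFoundError` are hypothetical names, only the `bind()` name comes from this discussion, and the code is synchronous for brevity whereas the real service is assumed to be asynchronous. It is not the fix that was eventually merged.

```typescript
// Hypothetical error type for "resource does not exist (yet)".
class NotFoundError extends Error {
  constructor(public readonly path: string) {
    super(`404 Not Found: ${path}`);
    this.name = "NotFoundError";
  }
}

class RetryingCache<T> {
  private entries = new Map<string, T>();

  // `fetch` returns undefined when the backend answers 404 (assumption).
  constructor(private fetch: (path: string) => T | undefined) {}

  bind(path: string): T {
    const hit = this.entries.get(path);
    if (hit !== undefined) return hit;

    const data = this.fetch(path);
    if (data === undefined) {
      // Surface the 404 to the caller, but do not remember it:
      // the next bind() will ask the backend again.
      throw new NotFoundError(path);
    }
    this.entries.set(path, data);
    return data;
  }
}

const store = new Map<string, { number: number }>();
const builds = new RetryingCache((path: string) => store.get(path));

let firstError = "";
try {
  builds.bind("builds/5"); // the build does not exist yet
} catch (e) {
  firstError = (e as Error).name;
}

store.set("builds/5", { number: 5 });  // the build now exists
const build = builds.bind("builds/5"); // retried, succeeds, and is cached
const again = builds.bind("builds/5"); // served from the cache
```

Only successful responses are stored, so the shared-object behaviour is kept for existing builds while a missing build is looked up again on the next attempt.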

You cannot cache a build at the HTTP level until it is actually complete, and most of the things a user is actually interested in aren't finished, which is why caching at this level is efficient. If you go from a running build page back to the builder page, and then back to the build page again, you can see that it is instant and that the build status has been updated in the background. I think this UX is very nice.

For now you can try it without the memoize: everything will work as expected, but the UI will feel a lot slower.

If there is an architecture problem I would like to fix, it is the fact that collections are not handled properly. We store a representation of the list, and another representation for each item of the list, if requested independently.

I would like to go even further and store the cache in a browser-backed db. This is one of my next projects: to completely rework the buildbot API.

  • Make it less generic, using CoffeeScript classes, especially for each resource type, with a method to know whether the resource is immutable (so that it can be cached permanently, and subscribing to update events is unneeded).
  • Make a separate bower library out of it, with fakes, so that it can be included in the plugins' test dependencies. For now each plugin includes its own version of the fakes for the buildbot service.
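
The first point could look roughly like the following. This is only a sketch of the idea, written in TypeScript rather than CoffeeScript. `Resource`, `BuildResource`, `BuildData`, and `cachePolicy` are hypothetical names, and treating a build as immutable once its `complete` flag is set is an assumption based on the remark above that a build cannot be cached until it is complete.

```typescript
// Assumed shape of a build record; only the fields used here.
interface BuildData {
  number: number;
  complete: boolean;
}

abstract class Resource<T> {
  constructor(public readonly data: T) {}

  // True when the resource can no longer change on the server.
  abstract isImmutable(): boolean;
}

class BuildResource extends Resource<BuildData> {
  isImmutable(): boolean {
    return this.data.complete;
  }
}

// Immutable resources can be cached permanently and need no update events;
// everything else stays "live" and keeps its subscription.
function cachePolicy(resource: Resource<unknown>): "permanent" | "live" {
  return resource.isImmutable() ? "permanent" : "live";
}

const finished = cachePolicy(new BuildResource({ number: 4, complete: true }));
const running = cachePolicy(new BuildResource({ number: 5, complete: false }));
```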

comment:7 Changed 3 years ago by sa2ajj

My two cents: the UI is not complete yet and I do not believe in premature optimisations (and caching at this stage would fall in this category).

comment:8 Changed 3 years ago by tardyp

I disagree that finding a proper way of caching is premature. For me it is very important to experiment with an efficient data model from the start. I have seen so many web apps that become very slow as soon as your client is overseas (e.g. Rational Team Concert).

You might not see the difference caching makes if you are on the same network as your master, but as soon as I tried to deploy on heroku, the UI started to be slow.

This design follows the principles of what Gmail or Google Docs do. If you open one mail, you first get a load latency, but when you go back and forth, mails that are already loaded are still there. This way of doing things is not something you can just patch in later. You have to design it from the start.

That said, I think we have something that is more or less working. Rewriting it from scratch now (as I suggested earlier) is probably a bit early; I would rather wait until after 0.9 is out, so that we have time to learn more from the current design.

comment:9 Changed 3 years ago by Mikhail Sobolev <mss@…>

  • Resolution set to fixed
  • Status changed from accepted to closed

In 7e049a91549dece82830aa27807d02c74ccb8ffd:

Merge pull request #1306 from tardyp/t2960

buildbot.service: better handle error

Fixes ticket:2960
