Opened 5 years ago

Closed 5 years ago

#2851 closed task (fixed)

Host "latest" metabuildbot on bb infra

Reported by: dustin
Owned by: sa2ajj
Priority: critical
Milestone: sys - on-bb-infra
Version: 0.8.9
Keywords: ansible
Cc: sysadmin@…

Description

nine.buildbot.buildbot.net is currently hosted on an EC2 instance on Dustin's account, and is running a recent version of master.

So, re-hosting this on buildbot infra is important as that EC2 instance isn't especially cheap.

For the moment, we can continue to update this to newer versions manually. Eventually, it would be great to have it automatically update periodically -- say, nightly.

Change History (34)

comment:1 Changed 5 years ago by dustin

Config is at https://github.com/buildbot/metabbotcfg/ in the 'nine' branch.

comment:2 Changed 5 years ago by dustin

  • Owner set to dustin
  • Status changed from new to assigned

comment:3 Changed 5 years ago by sa2ajj

[Relevant] questions from #3064:

  • which configurations would we like to have?
  • do we have any existing slaves that we can use right away (after pointing them to the right master)?

The goal is to get answers and prepare an Ansible playbook that'd make it happen.

comment:4 Changed 5 years ago by sa2ajj

  • Owner changed from dustin to sa2ajj
  • Status changed from assigned to accepted

comment:5 Changed 5 years ago by sysadmins

  • Cc sysadmin@… added

comment:6 Changed 5 years ago by Ben

Eat your own dog food, eh !

comment:7 Changed 5 years ago by sa2ajj

Let's see how tasty it is :)

comment:8 Changed 5 years ago by dustin

There was some discussion on IRC about this. Since it's what we can do now, and since OS flavor really doesn't matter to Buildbot, let's run it in a FreeBSD jail for now.

I can see good reasons to want to do deployments with Docker: it means users can install that same docker image and see what we see. We can switch to that later, though.

comment:9 Changed 5 years ago by sa2ajj

It looks like nine w/ sqlite is slowish.

What database are we going to use with it?

MySQL (w/ MySQL connector) or PostgreSQL (w/ pg8000)?

PostgreSQL is not available yet (see #3068).

Last edited 5 years ago by sa2ajj

comment:10 Changed 5 years ago by dustin

I don't mind which we use. The existing metabuildbot is using postgres, if that helps at all. On the other hand, mysql is running now and postgres isn't..

comment:11 Changed 5 years ago by Ben

I don't understand the comment about sqlite. I can't really link any slowness with the database backend in my install. Care to elaborate on your findings ?

As long as master isn't stable, I believe it's easier to stick to sqlite, it makes it way easier to delete the db should something go wrong for instance ...

comment:12 Changed 5 years ago by sa2ajj

  • Keywords ansible added

Well, there are no findings. It's just that while accessing the nine installation at the new place (which is not yet available as http://nine.buildbot.net) we _felt_ that it works slowly.

In any case, the nine branch of metabuildbot uses sqlite and I will stick to it. So we can actually see if it's slow for the real setup.

comment:13 Changed 5 years ago by sa2ajj

Well, the installation is available at http://140.211.10.244/

However when the waterfall page is selected, it never goes beyond "loading" notification.

comment:14 Changed 5 years ago by Ben

However when the waterfall page is selected, it never goes beyond "loading" notification.

That's just because the db is empty ... But you could make an issue about that, it counts as 'first impression when installing buildbot', and I don't think that's negligible ...

comment:15 Changed 5 years ago by sa2ajj

What part of the database is empty?

There's one change registered there.

comment:16 Changed 5 years ago by Ben

The part that should be displayed on the waterfall: the builds.

comment:17 Changed 5 years ago by sa2ajj

After connecting old slaves to the new master, we've got a lot of interesting tracebacks. I'll try to collect those into separate tickets, for now check: service2.buildbot.net:/usr/local/jail/nine.buildbot.net/home/bbmaster/master/nine/twistd.log

comment:18 Changed 5 years ago by sa2ajj

The filed defects: #3094 (related to #2714), #3095 and #3096 (these two seem to be related to #2818).

Last edited 5 years ago by sa2ajj

comment:19 Changed 5 years ago by sa2ajj

Just a note for those who watch this ticket: the site is available at http://nine.buildbot.net

comment:20 Changed 5 years ago by sa2ajj

I tried to upgrade to the latest master and it looks like GH:1417 is required before I can continue.

comment:21 Changed 5 years ago by Ben

Visiting http://nine.buildbot.net I got:

GET http://nine.buildbot.net/styles.css net::ERR_CONTENT_LENGTH_MISMATCH
GET http://nine.buildbot.net/scripts.js net::ERR_CONTENT_LENGTH_MISMATCH

comment:22 Changed 5 years ago by sa2ajj

I tried several times and I can't reproduce this. :(

(I'm using newish Firefox.)

comment:23 Changed 5 years ago by sa2ajj

Interestingly, djmitche and I started to see a similar problem, and in the corresponding error_log there are plenty of lines like this:

2014/12/07 17:32:19 [crit] 32474#0: *34786 mkdir() "/var/tmp/nginx/proxy_temp/5" failed (2: No such file or directory) while reading upstream, client: 188.238.241.82, server: nine.buildbot.net, request: "GET /scripts.js HTTP/1.1", upstream: "http://192.168.80.244:8010/scripts.js", host: "nine.buildbot.net", referrer: "http://nine.buildbot.net/"

After restarting nginx, things seem to be normal.

So it needs to be watched to see if the problem re-occurs...
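
To watch for a recurrence, the error_log could be scanned for the same kind of entry. What follows is a minimal sketch, assuming the log format of the single line quoted above; the regular expression and the function name find_mkdir_failures are illustrative and not part of any nginx tooling.

```python
import re

# Matches nginx "[crit]" entries where mkdir() failed, capturing the path,
# the errno number and the errno text. The pattern is an assumption derived
# from the one log line quoted in this ticket, not a full nginx log grammar.
MKDIR_FAILED = re.compile(
    r'\[crit\].*?mkdir\(\) "(?P<path>[^"]+)" failed '
    r'\((?P<errno>\d+): (?P<reason>[^)]+)\)'
)


def find_mkdir_failures(lines):
    """Return a (path, errno, reason) tuple for each mkdir() failure found."""
    failures = []
    for line in lines:
        match = MKDIR_FAILED.search(line)
        if match:
            failures.append(
                (match.group("path"), int(match.group("errno")), match.group("reason"))
            )
    return failures


if __name__ == "__main__":
    # Shortened copy of the entry quoted above.
    sample = (
        '2014/12/07 17:32:19 [crit] 32474#0: *34786 mkdir() '
        '"/var/tmp/nginx/proxy_temp/5" failed (2: No such file or directory) '
        'while reading upstream, client: 188.238.241.82, server: nine.buildbot.net'
    )
    for path, errno, reason in find_mkdir_failures([sample]):
        print(path, errno, reason)
```

Feeding it the lines of the real error_log (for example via open(...) on the log file) would show which temporary directories went missing and how often.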

comment:24 Changed 5 years ago by dustin

I'd guess tmpclean is to blame -- likely nobody looked at the host for a while, so /var/tmp/nginx's timestamp got old enough to clean.

comment:25 Changed 5 years ago by sa2ajj

Is it run by cron? (I could not find anything by running fgrep -lR tmpclean . in /etc/ and in /usr/local/etc)

comment:26 Changed 5 years ago by dustin

/etc/periodic/daily/110.clean-tmps

Config says

# 110.clean-tmps
daily_clean_tmps_enable="NO"                            # Delete stuff daily
daily_clean_tmps_dirs="/tmp"                            # Delete under here
daily_clean_tmps_days="3"                               # If not accessed for
daily_clean_tmps_ignore=".X*-lock .X11-unix .ICE-unix .font-unix .XIM-unix"
daily_clean_tmps_ignore="$daily_clean_tmps_ignore quota.user quota.group .snap"
daily_clean_tmps_ignore="$daily_clean_tmps_ignore .sujournal"

So it's limited to /tmp and not enabled by default anyway. So maybe not a good guess.
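
The same reasoning can be checked mechanically. This is a rough sketch, assuming the settings are simple name="value" assignments as in the excerpt above, and assuming the cleanup only runs when daily_clean_tmps_enable is set to YES; parse_periodic_conf and could_clean are hypothetical helper names, and the parsing is a simplification of real sh syntax.

```python
import re

# name="value" assignments, optionally followed by a trailing comment.
ASSIGNMENT = re.compile(r'^\s*(\w+)="([^"]*)"')
# $name references inside a value, as used to extend the ignore list.
VARIABLE = re.compile(r'\$(\w+)')


def parse_periodic_conf(text):
    """Parse simple sh-style assignments, expanding $name from earlier lines."""
    values = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if not match:
            continue  # comment, blank line, or syntax this sketch ignores
        name, raw = match.groups()
        values[name] = VARIABLE.sub(lambda m: values.get(m.group(1), ""), raw)
    return values


def could_clean(values, path):
    """Return True if the cleanup is enabled and path lies under a cleaned dir."""
    if values.get("daily_clean_tmps_enable", "NO").upper() != "YES":
        return False
    dirs = values.get("daily_clean_tmps_dirs", "").split()
    return any(path == d or path.startswith(d.rstrip("/") + "/") for d in dirs)


if __name__ == "__main__":
    conf = '''
# 110.clean-tmps
daily_clean_tmps_enable="NO"                            # Delete stuff daily
daily_clean_tmps_dirs="/tmp"                            # Delete under here
daily_clean_tmps_days="3"                               # If not accessed for
'''
    settings = parse_periodic_conf(conf)
    print(could_clean(settings, "/var/tmp/nginx"))
```

With the values quoted above, the check fails on both counts for /var/tmp/nginx: the job is disabled, and the directory is not under /tmp.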

comment:27 Changed 5 years ago by sa2ajj

Should we move the temporary files of nginx to some other place then? Or add ignores?

(My inclination is the first option)

comment:28 Changed 5 years ago by dustin

Let's try moving them, just to see if it helps..

comment:29 Changed 5 years ago by sa2ajj

(For the record, the parameter that needs to be changed is proxy_temp_path.)

comment:30 Changed 5 years ago by sa2ajj

Moved as part of #38.

comment:31 Changed 5 years ago by Ben

It's working fine this morning, both via nine.b.n and the IP.

comment:32 Changed 5 years ago by sa2ajj

Ok, the PR is ready for the final review and then this ticket can be closed. Further improvements will be dealt with in new tickets...

comment:33 Changed 5 years ago by sa2ajj

nine.buildbot.net is now running exactly what's configured by the above-mentioned PR.

comment:34 Changed 5 years ago by sa2ajj

  • Resolution set to fixed
  • Status changed from accepted to closed

The above-mentioned PR is now merged! :)
