Opened 6 years ago

Closed 6 years ago

#3002 closed task (fixed)

Host DNS internally

Reported by: dustin
Owned by: dustin
Priority: major
Milestone: sys - on-bb-infra
Version: 0.8.9
Keywords: ansible
Cc: sa2ajj

Description

Forward DNS is currently hosted on dns.he.net, under my account. Reverse is hosted on, I believe, RTEMS systems.

Let's host these in a hidden primary on Buildbot infrastructure, with a long TTL in the SOA and public secondaries on external services -- he.net offers this service for free, and many of us know other services that will happily secondary for us.

Change History (37)

comment:1 Changed 6 years ago by sa2ajj

So we basically need an authoritative primary.

Should we use what FreeBSD comes with by default?

comment:2 Changed 6 years ago by dustin

I think that's BIND, right? Yeah, that'd be my preference. Amar, would this end up on a service machine, or in an 'ns.buildbot.net' jail, or what?

comment:3 Changed 6 years ago by sa2ajj

So BIND or Unbound? :)

(Yesterday I installed a local copy of FreeBSD (so I would not mess things up) and started to read the Handbook.)

From the Handbook:

In FreeBSD 10, the Berkeley Internet Name Domain (BIND) has been removed from the base system and replaced with Unbound. Unbound as configured in the FreeBSD Base is a local caching resolver. BIND is still available from The Ports Collection as dns/bind99 or dns/bind98. In FreeBSD 9 and lower, BIND is included in FreeBSD Base. The FreeBSD version provides enhanced security features, a new file system layout, and automated chroot(8) configuration. BIND is maintained by the Internet Systems Consortium.

comment:4 Changed 6 years ago by sa2ajj

Disregard my last comment.

So BIND it is.

comment:5 Changed 6 years ago by dustin

(for those following along at home, Unbound is a recursive resolver only -- it cannot be authoritative)

comment:6 Changed 6 years ago by sa2ajj

(that's why I said "disregard" :))

comment:7 Changed 6 years ago by sa2ajj

Dustin, can you dump the current zone file for buildbot.net somewhere?

comment:8 Changed 6 years ago by sa2ajj

  • Version 0.8.9 deleted

comment:9 Changed 6 years ago by dustin

  • Version set to 0.8.9

; buildbot.net Dumped Thu Nov 20 08:53:34 2014
;
buildbot.net.	86400	IN	SOA	ns1.he.net. hostmaster.he.net. (
					2014110307	;serial
					10800		;refresh
					1800		;retry
					604800		;expire
					86400	)	;minimum
buildbot.net.	7200	IN	NS	ns1.he.net.
buildbot.net.	7200	IN	NS	ns2.he.net.
buildbot.net.	7200	IN	NS	ns3.he.net.
buildbot.net.	7200	IN	NS	ns4.he.net.
buildbot.net.	7200	IN	NS	ns5.he.net.
buildbot.net.	7200	IN	A	63.245.223.35
buildbot.buildbot.net.	7200	IN	CNAME	ds0210.flosoft-servers.net.
docs.buildbot.net.	7200	IN	CNAME	ds0210.flosoft-servers.net.
nine.buildbot.net.	7200	IN	TXT	"demo server - Pierre Tardy"
www.buildbot.net.	7200	IN	CNAME	buildbot.net.
xy2dwhhhx5d4.trac.buildbot.net.	7200	IN	CNAME	gv-d47fzorbndt4i4.dv.googlehosted.com.
meetings.buildbot.net.	43200	IN	CNAME	ds0210.flosoft-servers.net.
mx.buildbot.net.	86400	IN	A	140.211.10.235
buildbot.net.	86400	IN	MX	10 mx.buildbot.net.
lists.buildbot.net.	86400	IN	A	140.211.10.241
trac.buildbot.net.	86400	IN	A	140.211.10.240
rc.buildbot.net.	86400	IN	CNAME	ds0210.flosoft-servers.net.
nine.buildbot.buildbot.net.	86400	IN	A	54.243.62.208
ftp.buildbot.net.	86400	IN	A	140.211.10.243
bot.buildbot.net.	86400	IN	A	140.211.10.242
service1.buildbot.net.	86400	IN	A	140.211.10.230
service2.buildbot.net.	86400	IN	A	140.211.10.231
service3.buildbot.net.	86400	IN	A	140.211.10.232
vm1.buildbot.net.	86400	IN	A	140.211.10.233
mac1.buildbot.net.	86400	IN	A	140.211.10.234
git.buildbot.net.	86400	IN	A	140.211.10.236
docs-new.buildbot.net.	86400	IN	A	140.211.10.237
www-new.buildbot.net.	86400	IN	A	140.211.10.238 
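
For readers less used to raw SOA values, here is a minimal Python sketch that converts the timers in the dump above into readable durations. The numbers are copied from the SOA record as dumped; the helper name `humanize` is made up for illustration. The expire value is the one most relevant to a hidden-primary setup, since it bounds how long secondaries keep answering for the zone when they cannot reach the primary.

```python
# SOA timer values copied from the buildbot.net dump above (in seconds).
SOA_TIMERS = {
    "refresh": 10800,
    "retry": 1800,
    "expire": 604800,
    "minimum": 86400,
}


def humanize(seconds):
    """Render seconds as days/hours/minutes/seconds, skipping zero parts."""
    parts = []
    for label, size in (("d", 86400), ("h", 3600), ("m", 60), ("s", 1)):
        count, seconds = divmod(seconds, size)
        if count:
            parts.append(f"{count}{label}")
    return " ".join(parts) or "0s"


for name, value in SOA_TIMERS.items():
    print(f"{name:8} {value:>7} = {humanize(value)}")
```

With these values, secondaries check in every 3 hours, retry after 30 minutes on failure, and give up on the zone after 7 days without contact.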

comment:10 Changed 6 years ago by sa2ajj

Dustin, why did you put version back? :) It's not really related to this ticket...

comment:11 Changed 6 years ago by skelly

Should internal addresses be hosted as well? The internal MySQL database is only accessible internally but it needs something in /etc/hosts to resolve the name. It seems simpler to not have to maintain DNS and /etc/hosts.

comment:12 Changed 6 years ago by sa2ajj

I think we should host the internal zone as well. Otherwise, what's the point of having a CM system?

comment:13 Changed 6 years ago by sa2ajj

It does not have to be the same server though.

I heard something about split zones, but I have zero experience with that.

comment:14 Changed 6 years ago by sa2ajj

  • Keywords ansible added

comment:15 follow-up: ↓ 28 Changed 6 years ago by dustin

I've done split horizon DNS before ("views" in BIND's terminology). In fact, I run them at home. You'll lose your mind if the same host resolves differently in two different places. So if you have disjoint sets of names and IPs in your internal and external views, then the only reason to keep them separate is to obscure infrastructure details. Since we're doing everything else in a public repo, there doesn't seem to be much point to that.

All of which is to say, let's do a single view, containing both public and private names.
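
The hazard described above (the same name resolving differently in two views) can be checked mechanically. The following is a rough sketch only: the view contents are hypothetical placeholder names and addresses, not taken from the Buildbot zone, and `conflicting_names` is an illustrative helper rather than anything from BIND.

```python
def conflicting_names(internal, external):
    """Return names present in both views that map to different addresses."""
    shared = internal.keys() & external.keys()
    return sorted(name for name in shared if internal[name] != external[name])


# Hypothetical views; names and addresses are placeholders for illustration.
internal_view = {
    "db.example.net.": "10.0.0.5",
    "www.example.net.": "10.0.0.80",
}
external_view = {
    "www.example.net.": "192.0.2.80",
    "ftp.example.net.": "192.0.2.21",
}

print(conflicting_names(internal_view, external_view))
```

If the two sets of names are disjoint, the function returns an empty list, which is the case where merging everything into a single view loses nothing except obscurity.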

comment:16 Changed 6 years ago by sa2ajj

Dustin, how do we let secondaries fetch the zone: anyone with a key, or anyone from the given addresses?

comment:17 Changed 6 years ago by sa2ajj

One more thing: which service host do we allocate for this function?

comment:18 Changed 6 years ago by sa2ajj

(And, for the record, the work in progress is available in this PR.)

comment:19 Changed 6 years ago by dustin

  • Cc verm added

Our zonefile is, more or less, available for download from GitHub, so I don't think it makes sense to restrict AXFR very much. Just source address should be sufficient.

I don't have a service host in mind -- I'd leave that to Amar, although he's mostly AFK lately so in the absence of an answer, pick one. It's always possible to move later.
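
To make the "just source address" policy for AXFR concrete, here is a small Python sketch of the check a server would apply to a transfer request. It is an illustration of the idea, not BIND configuration: the allow list uses documentation address ranges as stand-ins for the real secondaries, and `may_transfer` is a hypothetical helper.

```python
import ipaddress

# Hypothetical allow list; these are documentation ranges, not real secondaries.
ALLOWED_TRANSFER_SOURCES = [
    ipaddress.ip_network("192.0.2.0/28"),
    ipaddress.ip_network("2001:db8::/48"),
]


def may_transfer(source):
    """Return True if the requesting address falls inside an allowed network."""
    address = ipaddress.ip_address(source)
    return any(
        address.version == network.version and address in network
        for network in ALLOWED_TRANSFER_SOURCES
    )


print(may_transfer("192.0.2.5"))    # inside 192.0.2.0/28
print(may_transfer("192.0.2.200"))  # outside the /28
```

No shared key is involved, which matches the reasoning above: the zone contents are public anyway, so the restriction only needs to keep arbitrary hosts from pulling full transfers from the primary.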

comment:20 Changed 6 years ago by verm

I'm still around, I just can't type much. :(

The delegation for name services is service1. All the services have been delegated already; I thought I sent an email about this to the sysadmin list a long time ago.

comment:21 Changed 6 years ago by dustin

Thanks!

I thought I copied all of the data from ML posts into docs/, so I must have missed something. If you come across it, please forward it to me and I'll get it in there.

comment:22 Changed 6 years ago by verm

I'll just re-generate the data and send it again to the list.

comment:23 Changed 6 years ago by sa2ajj

Yes, it is on GitHub, so anyone who wants it will be pounding them, not this poor ns in a jail.

comment:24 Changed 6 years ago by sa2ajj

(and I saw the comment regarding what address to allow to do xfrs...)

comment:25 Changed 6 years ago by sa2ajj

  • Owner set to sa2ajj
  • Status changed from new to accepted

comment:26 Changed 6 years ago by sa2ajj

The PR got merged to master.

Now we need to get a jail and install stuff in it.

comment:27 Changed 6 years ago by sa2ajj

Ok, the jail is now available. I'll check how things work in it tomorrow morning.

comment:28 in reply to: ↑ 15 Changed 6 years ago by verm

Replying to dustin:

I've done split horizon DNS before ("views" in BIND's terminology). In fact, I run them at home. You'll lose your mind if the same host resolves differently in two different places. So if you have disjoint sets of names and IPs in your internal and external views, then the only reason to keep them separate is to obscure infrastructure details. Since we're doing everything else in a public repo, there doesn't seem to be much point to that.

All of which is to say, let's do a single view, containing both public and private names.

This is already set up to do that: *.int.buildbot.net and *.nfs.buildbot.net. Views/horizons are only good for VPN use; otherwise, for services in our case, it's confusing, as you never know what the IP is, just the hostname.

comment:29 Changed 6 years ago by verm

  • Cc sa2ajj added; verm removed
  • Owner changed from sa2ajj to verm

I set up NSD with our zone, converted to the abbreviated format.

I disabled the local control since there is little chance of us using it very often. Otherwise the zone does work:

[verm@peach ~]$ nslookup 140.211.10.236 www.buildbot.net
www.buildbot.net        canonical name = buildbot.net.
Name:   buildbot.net
Address: 63.245.223.35

What is the setup going to be? I have put my own nameserver in as secondary. I will have a third nameserver up here in Toronto within the next month or so; however, the secondary sits at he.net, which is quite stable.

I'll take ownership of the ticket. I will be around tomorrow to get this done; a cutover should be trivial.

comment:30 follow-up: ↓ 31 Changed 6 years ago by dustin

Mikhail already committed some Ansible code for this - https://github.com/buildbot/buildbot-infra/pull/19. It was using bind, not nsd, but otherwise was, I think, ready to run in the jail. From comment 27, he is planning to do so tomorrow morning (so, in a few hours).

I don't mind nsd vs. bind, but I do want things to be done with Ansible from the start. So, if you'd like to convert, that should come in the form of a pull req. Rough consensus and working code and all that.

Also, since this is a new jail, I'd like to see it built from creation by Ansible, so that we know everything works. That means deleting and re-creating it -- but with Ansible, that shouldn't be any trouble, right?

comment:31 in reply to: ↑ 30 Changed 6 years ago by verm

Replying to dustin:

Mikhail already committed some Ansible code for this - https://github.com/buildbot/buildbot-infra/pull/19. It was using bind, not nsd, but otherwise was, I think, ready to run in the jail. From comment 27, he is planning to do so tomorrow morning (so, in a few hours).

I didn't know that, which is my fault. If he wants to switch it, that's fine. I've added the configs to /usr/local/etc/ and added an abbreviated master config.

I don't mind nsd vs. bind, but I do want things to be done with Ansible from the start. So, if you'd like to convert, that should come in the form of a pull req. Rough consensus and working code and all that.

I don't have time to do it due to my current situation. So I am OK with whatever solution is deployed.

Also, since this is a new jail, I'd like to see it built from creation by Ansible, so that we know everything works. That means deleting and re-creating it -- but with Ansible, that shouldn't be any trouble, right?

I sent the usage to the config list. It's not an issue, just a pain. I use ezjail to create the jail, then use FreeBSD to manage the jail using /etc/jail.conf.

We can create the jails by hand, avoiding ezjail altogether. I am working on this for RTEMS and will share that back to Buildbot when it's done. For now I think we should stick with what we have to avoid duplicating work.

comment:32 Changed 6 years ago by dustin

  • Owner changed from verm to sa2ajj
  • Status changed from accepted to assigned

It seems like this has stalled out a bit. nsd is installed and running, but has some bugs (missing trailing '.'). Yet we have configs committed to use BIND, including a zonefile which appears to be missing those bugs, at least.

I'm going to make an executive decision and say we're going with BIND for now, so let's drop that jail and re-create it (using #3034) using the config already committed to the repo.
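
For context on why a missing trailing '.' counts as a bug: in a zone file, a name without a trailing dot is relative and gets the zone origin appended. The sketch below illustrates that rule in Python. The thread does not say which records were affected in the nsd setup, so the inputs are generic examples, and `qualify` is a hypothetical helper that assumes a non-root origin ending in a dot.

```python
def qualify(name, origin):
    """Expand a zone-file name the way a DNS server treats relative names."""
    if name == "@":
        return origin  # "@" stands for the origin itself
    if name.endswith("."):
        return name  # already fully qualified, left untouched
    return f"{name}.{origin}"  # relative name: origin is appended


ORIGIN = "buildbot.net."
print(qualify("www", ORIGIN))            # www.buildbot.net.
print(qualify("buildbot.net.", ORIGIN))  # buildbot.net.
print(qualify("buildbot.net", ORIGIN))   # buildbot.net.buildbot.net.
```

The last line shows the failure mode: a target written as `buildbot.net` instead of `buildbot.net.` silently becomes a name under the zone that almost certainly does not exist.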

comment:33 Changed 6 years ago by sa2ajj

I have not re-created the jail yet (I still need to see how things work and probably re-arrange something), however the updated 'dns role' PR was applied to ns1.

nsd was stopped and disabled in that jail.

comment:34 Changed 6 years ago by sa2ajj

  • Owner changed from sa2ajj to dustin

As agreed with Dustin, he'll need some time for reconfiguring 'he.net' DNS.

comment:35 Changed 6 years ago by dustin

I think the process is:

  • delete zone on HE
  • set up zone as secondary on HE

which will incur some downtime. I'll also need to make sure that the HE servers can correctly sync from ns.buildbot.net. I'll test it out with a temp zone of some sort.
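
One way to check that secondaries have synced is to compare SOA serials, for example from the output of `dig +short SOA <zone> @<server>` run against each server. The sketch below only parses and compares such lines; it does not run dig. The server names and SOA lines are hypothetical, and both helper names are made up for illustration.

```python
def soa_serial(dig_short_line):
    """Extract the serial from one line of `dig +short SOA` output."""
    # Expected fields: mname rname serial refresh retry expire minimum
    fields = dig_short_line.split()
    if len(fields) != 7:
        raise ValueError(f"unexpected SOA line: {dig_short_line!r}")
    return int(fields[2])


def out_of_sync(primary_line, secondary_lines):
    """Return the secondaries whose serial differs from the primary's."""
    expected = soa_serial(primary_line)
    return sorted(
        server
        for server, line in secondary_lines.items()
        if soa_serial(line) != expected
    )


# Hypothetical sample data for a temporary test zone.
primary = "ns.example.net. hostmaster.example.net. 2014122702 10800 1800 604800 86400"
secondaries = {
    "secondary-a.example.org": "ns.example.net. hostmaster.example.net. 2014122702 10800 1800 604800 86400",
    "secondary-b.example.org": "ns.example.net. hostmaster.example.net. 2014122701 10800 1800 604800 86400",
}

print(out_of_sync(primary, secondaries))
```

An empty result after bumping the serial on the primary would suggest that transfers are working; a secondary that stays on the old serial points to a notify or transfer problem.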

comment:36 Changed 6 years ago by sa2ajj

(It's ns1.buildbot.net.)

comment:37 Changed 6 years ago by dustin

  • Resolution set to fixed
  • Status changed from assigned to closed

OK, this is in place, and things seem to be resolving:

knuth ~ # dig @8.8.8.8 ns1.buildbot.net

; <<>> DiG 9.9.5 <<>> @8.8.8.8 ns1.buildbot.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43632
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;ns1.buildbot.net.              IN      A

;; ANSWER SECTION:
ns1.buildbot.net.       21599   IN      A       140.211.10.236

;; Query time: 62 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Dec 27 09:59:43 EST 2014
;; MSG SIZE  rcvd: 61

(ns1 wasn't in the manually-configured zones on he.net, so that's evidence that it's using the slaved zone)
