55 Commits

Author SHA1 Message Date
Zuul
e56cbdcee3 Merge "Run nodepool launchers with ansible and containers" 2020-05-06 14:21:53 +00:00
Zuul
9b1161e051 Merge "Set up robots.txt on lists servers" 2020-04-30 18:42:55 +00:00
Monty Taylor
e0619f17f1 Run nodepool launchers with ansible and containers
We don't run start in prod normally but we do need to run
it in the gate.

Change-Id: Iec50684280409eb978bf5638bf74ae16fad8aa26
2020-04-30 17:37:22 +00:00
Zuul
fdfbc3d0b9 Merge "Run zookeeper cluster in nodepool jobs" 2020-04-29 22:22:37 +00:00
Monty Taylor
8d7075b02f Run zookeeper cluster in nodepool jobs
Rather than running a local zookeeper, just run a real zookeeper.
Also, get rid of nb01-test and just use nb04 - what could possibly
go wrong?

Dynamically write zookeeper host information to nodepool.yaml

So that we can run an actual zk using the new zk role on hosts in
ansible inventory, we need to write out the ip addresses of the
hosts that we build in zuul. This means having the info baked in
to the file in project-config isn't going to work.
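
A minimal sketch of the idea, in Python for illustration only (the change itself presumably does this with Ansible templating). The function names and the example address are hypothetical; the `zookeeper-servers` key with `host` and `port` entries follows the nodepool configuration format.

```python
def zookeeper_servers(addresses, port=2181):
    """Build one zookeeper-servers entry per host address."""
    return [{"host": addr, "port": port} for addr in addresses]


def render_zookeeper_block(addresses, port=2181):
    """Render the entries as a YAML fragment for nodepool.yaml.

    The addresses are only known once the test nodes exist, which is
    why the block is generated at run time instead of being committed.
    """
    lines = ["zookeeper-servers:"]
    for server in zookeeper_servers(addresses, port):
        lines.append(f"  - host: {server['host']}")
        lines.append(f"    port: {server['port']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 203.0.113.10 is a documentation address, not a real host.
    print(render_zookeeper_block(["203.0.113.10"]))
```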

We can do this in prod too, it shouldn't hurt anything.

Increase timeout for run-service-nodepool

We need to fix the playbook, but we'll do that after we get the
puppet gone.

Change-Id: Ib01d461ae2c5cec3c31ec5105a41b1a99ff9d84a
2020-04-29 16:18:25 -05:00
Clark Boylan
eeac5467c3 Set up robots.txt on lists servers
This sets up a robots.txt on our lists servers. To start, this file
prevents the SEMrush bot from indexing our lists, as that has been
causing lists.openstack.org to OOM with many listinfo processes started
by Apache.
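
As a rough illustration of what such a rule does, the sketch below checks a robots.txt with Python's standard `urllib.robotparser`. The file content, the `SemrushBot` user-agent string and the URL are assumptions made for the example; the deployed file is not shown in this log.

```python
from urllib.robotparser import RobotFileParser

# Assumed content; the real file on the lists servers may differ.
ROBOTS_TXT = """\
User-agent: SemrushBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://lists.openstack.org/cgi-bin/mailman/listinfo"
    print(is_allowed(ROBOTS_TXT, "SemrushBot", url))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

Note that robots.txt is advisory: it only helps against crawlers that choose to honor it.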

We've avoided this OOM by manually configuring this robots.txt. Other
things we have ruled out are bup and incoming email causing the qrunners
to grow unexpectedly large. We are fairly confident this bot is the
trigger.

Note this fixes testing by adding 'hieradata' to set listpassword var.

Depends-On: https://review.opendev.org/724389
Change-Id: Id4f6739a8cf6a01f9796fa54c86ba1af3e31fecf
2020-04-29 17:48:13 +00:00
Monty Taylor
767e001cd6 Run test playbooks with more forks
As we add jobs that have more nodes in them, we need to make
sure we're running ansible with enough forks that the jobs
don't take forever.

Change-Id: I2b5bf55bd65eaf0fc2671f5379bd0cb5c3696f87
2020-04-29 12:04:22 -05:00
Monty Taylor
05b0587871 Add nodepool node key
Change-Id: I28ccb83fc984190b1ce8e3e18c5945209fcb2387
2020-04-24 17:54:50 -05:00
Zuul
b21a8e58cf Merge "Run Zuul using Ansible and Containers" 2020-04-24 16:31:42 +00:00
Zuul
1b2d113c0f Merge "Split eavesdrop into its own playbook" 2020-04-24 15:02:34 +00:00
Monty Taylor
f0b77485ec Run Zuul using Ansible and Containers
Zuul is publishing lovely container images, so we should
go ahead and start using them.

We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.

Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.

Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
2020-04-24 09:18:44 -05:00
Monty Taylor
9fd2135a46 Split eavesdrop into its own playbook
Extract eavesdrop into its own service playbook and
puppet manifest. While doing that, stop using jenkinsuser
on eavesdrop in favor of zuul-user.

Add the ability to override the keys for the zuul user.

Remove openstack_project::server, it doesn't do anything.

Containerize and ansiblize accessbot. The structure of
how we're doing it in puppet makes it hard to actually
run the puppet in the gate. Run the script in its own
playbook so that we can avoid running it in the gate.

Change-Id: I53cb63ffa4ae50575d4fa37b24323ad13ec1bac3
2020-04-23 14:34:28 -05:00
Zuul
0b46f403ec Merge "Rearrange set-hostnames and cloud-init removal" 2020-04-21 20:18:55 +00:00
Monty Taylor
68b50ca05b Rearrange set-hostnames and cloud-init removal
In launch-node, we run two playbooks that aren't part of base.
One sets the system's hostname and removes cloud-init, the other
runs unattended update.

We need to run the hostname setting in our functional tests so
that the hosts behave as expected, but running the cloud-init
removal is a little weird, since our test nodes already don't
have it.

Make it so that set-hostname actually just sets the hostname,
and then run it in run-base. For running puppet, we need the
host to have the correct hostname.

Move cloud-init removal to the base-server role. Also move
the autoremove into base-server, since it's probably a nice
way to get rid of excess things.

Change-Id: I53cb8c515444a7d73b839e799c5794b067429daa
2020-04-21 13:18:24 -05:00
James E. Blair
f7bf07a03d Use real passwords for meetpad
The docker containers expect this now and refuse to start with
fake passwords.

Change-Id: I4c4bd243c9684e3987eeb99e4c66d31a882336a0
2020-04-20 09:05:51 -07:00
Monty Taylor
ebae022d07 Use project-config from zuul instead of direct clones
We use project-config for gerrit, gitea and nodepool config. That's
cool, because we can clone that from zuul too and make sure that each
prod run we're doing runs with the contents of the patch in question.

Introduce a flag file that can be touched in /home/zuulcd that will
block zuul from running prod playbooks. By default, if the file is
there, zuul will wait for an hour before giving up.

Rename zuulcd to zuul

To better align prod and test, name the zuul user zuul.

Change-Id: I83c38c9c430218059579f3763e02d6b9f40c7b89
2020-04-15 12:29:33 -05:00
Monty Taylor
c117c1106d Update install-ansible away from /opt/system-config
So that we can start running things from the zuul source rather
than update-system-config and /opt/system-config, we need to
install a few things onto the host in install-ansible so that the
ansible env is standalone.

This introduces a split execution path. The ansible config is
now all installed globally onto the machine by install-ansible
and does not reference a git checkout.

For running ad-hoc commands, an ansible.cfg is introduced inside
the root of the system-config dir. So if ansible-playbook is
executed with PWD==/opt/system-config it will find that ansible.cfg,
it will take precedence, and any content from system-config
will take precedence.

As a followup we'll make /opt/system-config/ansible.cfg written
out by install-ansible from the same template, and we'll update
the split to make ansible only work when executed from one of
the two configured locations, so that it's clear where we're
operating from.

Change-Id: I097694244e95751d96e67304aaae53ad19d8b873
2020-04-14 14:54:23 -05:00
Monty Taylor
b23515c623 Make a new dockerized etherpad.opendev.org
Upstream likes building the settings file into the image, but that's
less exciting, let's bind-mount ours in.

Depends-On: https://review.opendev.org/717491/
Change-Id: Ia1894d884ef2a84e1282345b77fe07bf8898f367
2020-04-07 11:10:57 -05:00
Monty Taylor
2e6cf25e5d Rename bridge.yaml to install-ansible.yaml
We have a bridge.yaml and a service-bridge.yaml and it keeps
being confusing. Rename bridge.yaml to install-ansible.yaml to make
it clear what it is that it actually does.

Add a soft-depend on it for manage-projects, because if
something updates with the ansible config, we want it to
happen before running manage-projects.

Change-Id: Ia7c8dd0e32b2c4aaa674061037be5ab66d9a3581
2020-04-01 14:14:55 -05:00
Ian Wienand
e7f1062d51 Add install zookeeper role; use for nodepool-builder testing
This adds a simple role to install Zookeeper.

Add an option to nodepool-base to use this role to install Zookeeper.

Use this in the nodepool-builder gate testing where we are just
validating that the nodepool-builder container starts and is ready to
accept connections.  It needs a zookeeper to talk to, even though it
is not going to do anything.

Change-Id: I4ae89a51e454be4ee53ad4e04407162aaa8d9f9a
2020-03-06 14:02:52 +11:00
Monty Taylor
bbe8086726 Use LE certs for Apache
We're getting LE certs for the hosts now, use them in the apache
config. Also add the redirects.

Change-Id: I67d33b4c542182a2474ac0d2416357541b1c3a47
2020-02-13 10:31:59 -06:00
Monty Taylor
4de5f79599 Add Apache to Ansible for Gerrit
When we run gerrit, we also need to run Apache.

Change-Id: Ia2f1494808bd29d83e041e224cb2eb5fc406a93b
2020-02-03 07:57:36 -06:00
Ian Wienand
f49fc87f95 afs-client: move reduced cache to group variable
For gate testing we need the smaller AFS cache size applied to
everything that might install openafs, not just the mirror nodes.
Move the definition to the afs-client group.

Change-Id: Id27efd2f12f5ac3f351f65fa1ae513624a53df90
2019-12-16 15:34:12 +11:00
Zuul
29019411eb Merge "Run a gerrit container on review-dev01" 2019-12-15 19:00:21 +00:00
Clark Boylan
5392f8a27c Manage opendev.org cert with LE
This is the first step in managing the opendev.org cert with LE. We
modify gitea01.opendev.org only to request the cert so that if this
breaks the other 7 giteas can continue to serve opendev.org. When we are
happy with the results we can merge the followup change to update the
other 7 giteas.

Depends-On: https://review.opendev.org/694182
Change-Id: I9587b8c2896975aa0148cc3d9b37f325a0be8970
2019-11-18 12:07:10 -08:00
James E. Blair
4f9720e76e Run a gerrit container on review-dev01
This runs gerrit in a container on review-dev01 using podman.

Remove an unused web_server.py file that we found while copying it
from puppet to ansible.

Change-Id: I399d3cf8471bc8063022b0db0ff81718b2ee2941
2019-10-29 08:29:17 +09:00
Ian Wienand
912dff49e7 Set zuul_work_dir for tox testing
Setting this to system-config allows us to run the base tests as 3rd
party ci for projects like testinfra.

Change-Id: I2d15df154dcdc7c5da6c3326fbecec2146201164
2019-09-09 09:44:43 +10:00
Ian Wienand
814e4be128 Ansible roles for backup
This introduces two new roles for managing the backup-server and hosts
that we wish to back up.

Firstly the "backup" role runs on hosts we wish to back up.  This
generates and configures a separate ssh key for running bup and
installs the appropriate cron job to run the backup daily.

The "backup-server" role runs on the backup server (or, indeed
servers).  It creates users for each backup host, accepts the remote
keys mentioned above and initialises bup.  It is then ready to receive
backups from the remote hosts.

This eliminates a fairly long-standing requirement for manual setup of
the backup server users and keys; this section is removed from the
documentation.

testinfra coverage is added.

Change-Id: I9bf74df351e056791ed817180436617048224d2c
2019-08-05 16:59:57 +10:00
Ian Wienand
814b42f616 Set openafs cache sizes for mirror/mirror-update
Set the openafs cache values to the same as the puppet set values for
openafs-client role users.

Change-Id: I5a58673cad8df2a1e8dddb592c322e751d7f2ac5
2019-07-19 12:04:26 -07:00
Ian Wienand
82c6dec4fa Disable cloud launcher cron job during CI
This takes a similar approach to the extant ansible_cron_install_cron
variable to disable the cron job for the cloud launcher when running
under CI.

If you happen to have your CI jobs running when the cron job decides to
fire, you end up with a harmless but confusing failed run of the cloud
launcher (that has tried to contact real clouds) in the ARA results.

Use the "disabled" flag to ensure the cron job doesn't run.  Using
"disabled" means we can still check that the job was installed via
testinfra however.
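
A small sketch of why "disabled" still allows the installation check: a disabled entry remains in the crontab, just commented out. The helper names, schedule and command below are hypothetical, and the sketch assumes the disabled entry is written as the same line prefixed with `#`.

```python
def cron_line(schedule, command, disabled=False):
    """Render a crontab entry; a disabled entry is commented out."""
    line = f"{schedule} {command}"
    return f"#{line}" if disabled else line


def job_installed(crontab, command):
    """True if the command appears in the crontab, active or not."""
    return any(command in line for line in crontab.splitlines())


def job_active(crontab, command):
    """True only if the command appears on an uncommented line."""
    return any(
        command in line and not line.lstrip().startswith("#")
        for line in crontab.splitlines()
    )


if __name__ == "__main__":
    # Hypothetical schedule and command, for illustration only.
    tab = cron_line("0 * * * *", "/usr/local/bin/example-launcher", disabled=True)
    print(tab, job_installed(tab, "example-launcher"), job_active(tab, "example-launcher"))
```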

Convert ansible_cron_install_cron to a similar method using disable,
document the variable in the README and add a test for the run_all.sh
script in crontab too.

Change-Id: If4911a5fa4116130c39b5a9717d610867ada7eb1
2019-07-16 15:01:55 +10:00
James E. Blair
ee3b273876 Exclude ansible_python_interpreter from write-inventory
Zuul now includes an ansible_python_interpreter hostvar in every
host in its inventory.  It defaults to python2.  The write-inventory
role, which takes the Zuul inventory and makes an inventory for
the fake bridge server in the gate passes that through.  Because it's
in /etc/ansible/inventory.yaml, it overrides any settings which may
arrive via group vars, but this is the way we set the interpreter
for all the hosts on bridge (we do not do so in the actual inventory
file).

To correct this, tell write-inventory to strip the
ansible_python_interpreter variable when it writes out the new
inventory.  This restores the behavior to match what happens on
the real bridge host.  One instance of setting the interpreter
for the fake "trusty" host used in base platform tests is moved to
a hostvars file to match the rest of the real hosts.
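
The stripping step can be sketched as a filter over per-host variables. This is a hypothetical Python illustration, not the write-inventory role's code; the function name, host name and variable values are made up for the example.

```python
def strip_hostvars(hostvars, exclude=("ansible_python_interpreter",)):
    """Return a copy of the per-host variables without the excluded keys.

    With the key dropped from the generated inventory, the interpreter
    set via group vars is no longer overridden.
    """
    return {
        host: {k: v for k, v in host_vars.items() if k not in exclude}
        for host, host_vars in hostvars.items()
    }


if __name__ == "__main__":
    # Example values only; 192.0.2.5 is a documentation address.
    zuul_inventory = {
        "example01.opendev.org": {
            "ansible_host": "192.0.2.5",
            "ansible_python_interpreter": "python2",
        },
    }
    print(strip_hostvars(zuul_inventory))
```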

Change-Id: I60f0acb64e7b90ed8af266f21f2114fd598f4a3c
2019-07-10 10:10:02 -07:00
Ian Wienand
b85282c046 Move rsync mirror updates to new opendev.org mirror-update host
This move was prompted by wishing to expose the mirror update logs for
the rsync updates so that debugging problems does not require a root
user (note: not actually done in this change; will be a follow-on).

Rather than start hacking at puppet, the rsync mirror scripts make a
nice delineation point for starting an Ansible-first/Bionic update.

Most magic is included in the scripts, so there is not much more to do
than copy them.  The host uses the existing kerberos and openafs roles
and copies the key material into place (to be added before merge).

Note the scripts are removed from the extant puppet so we don't have
two updates happening simultaneously.  This will also require a manual
clean to remove the cron jobs as a once-off when merging.

The other part of mirror-update is the reprepro based scripts for the
various debuntu repositories.  They are left as future work for now.

Testing is added to ensure dependencies and scripts are all in place.

Change-Id: I525ac18b55f0e11b0a541b51fa97ee5d6512bf70
2019-07-02 16:42:33 +10:00
Ian Wienand
d33105535a Separate openafs CI mirror
This is an intermediate step to having both kafs and openafs testing
in the gate; this just makes it clear which host is which.

Change-Id: I8cd006227ed47ad5f2c5eec664083477dd7ba397
2019-06-17 15:56:09 +10:00
Zuul
1fe34e00d4 Merge "Add control plane clouds to nodepool builder clouds.yaml" 2019-06-04 20:15:24 +00:00
Monty Taylor
ff1b8a94c6 Add control plane clouds to nodepool builder clouds.yaml
In order to have nodepool build images and upload them to control
plane clouds, add them to the clouds.yaml on the nodepool-builder
hosts. Keep them out of the launcher configs by splitting the config
templates. So that we can keep our copies of things to a minimum,
create a group called "control-plane-clouds" and put bridge and nb0*
in it.

There are clouds mentioned in here that we no longer use; a followup
patch will clean those up.

NOTE: Requires shifting the clouds config dict from
host_vars/bridge.openstack.org.yaml to group_vars/control-plane-clouds.yaml
in the secrets on bridge.

Needed-By: https://review.opendev.org/640044
Change-Id: Id1161bca8f23129202599dba299c288a6aa29212
2019-05-23 14:34:10 -05:00
Ian Wienand
670107045a Create opendev mirrors
This implements mirrors to live in the opendev.org namespace.  The
implementation is Ansible native for deployment on a Bionic node.

The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so as not to run
puppet on the hosts.

The kerberos and openafs client parts do not need any updating and
work on the Bionic host.

The hosts are set up to provision certificates for themselves from
letsencrypt.  Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.

The new "mirror" role is a port of the existing puppet mirror.pp.  It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.

The vhost configuration is also ported from the extant puppet.  It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support.  The other ports
are left alone for now, but can be updated in due course.

Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue.  We can update our mirror setup
scripts to point to https resources as appropriate.

Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e
2019-05-21 11:08:25 +10:00
James E. Blair
8ad300927e Split the base playbook into services
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.

Zuul should be able to handle missing projects now, so move it out of
the puppet_git playbook and into puppet.

Make the base playbook be merely the base roles.

Make service playbooks for each service.

Remove the run-docker job because it's covered by service jobs.

Stop testing that puppet is installed in testinfra. It's accidentally
working due to the selection of non-puppeted hosts only being on
bionic nodes and not installing puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it's only useful to test that a synthetic test
of installing puppet on nodes we don't use works.

Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.

Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
2019-05-19 07:31:00 -05:00
Zuul
5ba6fc424d Merge "Use swift to back intermediate docker registry" 2019-04-23 00:30:21 +00:00
OpenDev Sysadmins
1ee61397a3 OpenDev Migration Patch
This commit was bulk generated and pushed by the OpenDev sysadmins
as a part of the Git hosting and code review systems migration
detailed in these mailing list posts:

http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003603.html
http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004920.html

Attempts have been made to correct repository namespaces and
hostnames based on simple pattern matching, but it's possible some
were updated incorrectly or missed entirely. Please reach out to us
via the contact information listed at https://opendev.org/ with any
questions you may have.
2019-04-19 19:26:05 +00:00
James E. Blair
f357e5cdab Use swift to back intermediate docker registry
Note, this does not have complete tests yet (we will need to update
the job to start a swift for that).

Change-Id: I2ee7a9e4fb503a3431366c16c380cf09327f6050
2019-04-18 08:14:37 -07:00
Ian Wienand
afd907c16d letsencrypt support
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.

From a top level, the process is implemented in the roles as follows:

1) letsencrypt-acme-sh-install

   This role installs the acme.sh tool on hosts in the letsencrypt
   group, along with a small custom driver script to help parse output
   that is used by later roles.

2) letsencrypt-request-certs

   This role runs on each host, and reads a host variable describing
   the certificates required.  It uses the acme.sh tool (via the
   driver) to request the certificates from letsencrypt.  It populates
   a global Ansible variable with the authentication TXT records
   required.

   If the certificate exists on the host and is not within the renewal
   period, it should do nothing.

3) letsencrypt-install-txt-record

   This role runs on the adns server.  It installs the TXT records
   generated in step 2 to the acme.opendev.org domain and then
   refreshes the server.  Hosts wanting certificates will have
   pre-provisioned CNAME records for _acme-challenge.host.opendev.org
   pointing to acme.opendev.org.

4) letsencrypt-create-certs

   This role runs on each host, reading the same variable as in step
   2.  However this time the acme.sh tool is run to authenticate and
   create the certificates, which should now work correctly via the
   TXT records from step 3.  After this, the host will have the
   full certificate material.
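
The DNS side of steps 2 and 3 can be sketched as follows. This is an illustrative Python fragment, not the roles' implementation: the function names and token values are hypothetical, and the zone-file style output format is an assumption. The record names follow the description above.

```python
def challenge_name(hostname):
    """Name that letsencrypt queries for a host's DNS challenge."""
    return f"_acme-challenge.{hostname}"


def delegation_cname(hostname, acme_zone="acme.opendev.org"):
    """Pre-provisioned CNAME pointing the challenge at the shared zone."""
    return f"{challenge_name(hostname)}. IN CNAME {acme_zone}."


def acme_txt_records(tokens_by_host, acme_zone="acme.opendev.org"):
    """Collect the TXT tokens gathered from every host (step 2) into
    records for the shared zone served by the adns server (step 3)."""
    records = []
    for host in sorted(tokens_by_host):
        for token in tokens_by_host[host]:
            records.append(f'{acme_zone}. IN TXT "{token}"')
    return records


if __name__ == "__main__":
    print(delegation_cname("host.opendev.org"))
    # Token values are placeholders, not real challenge data.
    for record in acme_txt_records({"host.opendev.org": ["example-token"]}):
        print(record)
```

Because every host's challenge name is a CNAME to the same zone, only that one zone needs to be updated dynamically.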

Testing is added via testinfra.  For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI).  We test that the DNS TXT records are created locally on
the CI adns server, however.

Related-Spec: https://review.openstack.org/587283

Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
2019-04-02 15:31:41 +11:00
James E. Blair
d8f56f827b Disable ansible cron even more
We call the bridge playbook from run-base.yaml to bootstrap bridge,
so that's really where we need to disable the cron installation.

Change-Id: I5f3d604feaca5c1d577636c2d1130eec82a35961
2019-03-08 15:44:27 -08:00
James E. Blair
9ff29b108d Test gitea project creation playbook
Add an option to run a playbook (in the fake bridge context) after
running the base playbook.  Use this to run a new playbook which
exercises gitea project creation after bootstrapping the gitea
service.

Disable ansible-lint 304 because it erroneously thinks shell and
command are the same thing.

Change-Id: I0394b614771bc62b9fe23d811defd7767b3d10db
2019-03-06 18:42:39 +00:00
James E. Blair
4b031f9f24 Run an haproxy load balancer for gitea
This runs an haproxy which is strikingly similar to the one we
currently run for git.openstack.org, but it is run in a docker
container.

Change-Id: I647ae8c02eb2cd4f3db2b203d61a181f7eb632d2
2019-02-22 12:54:04 -08:00
James E. Blair
67cda2c7df Deploy gitea with docker-compose
This deploys a shared-nothing gitea server using docker-compose.
It includes a mariadb server.

Change-Id: I58aff016c7108c69dfc5f2ebd46667c4117ba5da
2019-02-18 08:46:40 -08:00
James E. Blair
12709a1c8b Run a docker registry for CI
Change-Id: If9669bb3286e25bb16ab09373e823b914b645f26
2019-02-01 10:12:51 -08:00
James E. Blair
dae1a0351c Configure opendev nameservers using ansible
Change-Id: Ie6430053159bf5a09b2c002ad6a4f84334a5bca3
2018-11-02 13:49:38 -07:00
James E. Blair
90e6088881 Configure adns1.opendev.org server via ansible
Change-Id: Ib4d3cd7501a276bff62e3bc0998d93c41f3ab185
2018-11-02 13:49:38 -07:00
Monty Taylor
e998db36f2 Add yamlgroup inventory plugin
The constructed inventory plugin allows expressing additional groups,
but it's too heavyweight for our needs. Additionally, it is a full
inventory plugin that will add hosts to the inventory if they don't
exist.

What we want instead is something that will associate existing hosts
(that would have come from another source) with groups.
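
A minimal sketch of that behaviour, in plain Python instead of the Ansible plugin API. The function name, the group and host names, and the use of shell-style glob matching are assumptions for illustration; the plugin's actual matching rules are not described here.

```python
from fnmatch import fnmatch


def assign_groups(existing_hosts, group_patterns):
    """Associate already-known hosts with groups by pattern.

    Patterns that match nothing yield an empty group; no host is ever
    added to the inventory by this step.
    """
    return {
        group: [
            host for host in existing_hosts
            if any(fnmatch(host, pattern) for pattern in patterns)
        ]
        for group, patterns in group_patterns.items()
    }


if __name__ == "__main__":
    # Example host and group names only.
    hosts = ["nb01.openstack.org", "bridge.openstack.org"]
    groups = {
        "nodepool-builder": ["nb*.openstack.org"],
        "unused": ["ghost.example.org"],
    }
    print(assign_groups(hosts, groups))
```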

This also switches to using emergency.yaml instead of emergency, which
uses the same format.

We add an extra groups file for gate testing to ensure the CI nodes
get puppet installed.

Change-Id: Iea8b2eb2e9c723aca06f75d3d3307893e320cced
2018-11-02 08:19:53 +11:00
James E. Blair
4477291111 Add testinfra tests for bridge
Change-Id: I4df79669c9daa3eb998ee666be6c53c957467748
2018-09-05 14:24:00 +10:00