In order to have nodepool build images and upload them to control
plane clouds, add them to the clouds.yaml on the nodepool-builder
hosts. Keep them out of the launcher configs by splitting the config
templates. So that we can keep our copies of things to a minimum,
create a group called "control-plane-clouds" and put bridge and nb0*
in it.
There are clouds mentions in here that we no longer use, a followup
patch will clean those up.
NOTE: Requires shifting the clouds config dict from
host_vars/bridge.openstack.org.yaml to group_vars/control-plane-clouds.yaml
in the secrets on bridge.
Needed-By: https://review.opendev.org/640044
Change-Id: Id1161bca8f23129202599dba299c288a6aa29212
This impelements mirrors to live in the opendev.org namespace. The
implementation is Ansible native for deployment on a Bionic node.
The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so to not run puppet
on the hosts.
The kerberos and openafs client parts do not need any updating and
works on the Bionic host.
The hosts are setup to provision certificates for themselves from
letsencrypt. Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.
The new "mirror" role is a port of the existing puppet mirror.pp. It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.
The vhost configuration is also ported from the extant puppet. It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support. The other ports
are left alone for now, but can be updated in due course.
Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue. We can update our mirror setup
scripts to point to https resources as appropriate.
Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.
Zuul should be able to handle missing projects now, so remove it
from the puppet_git playbook and into puppet.
Make the base playbook be merely the base roles.
Make service playbooks for each service.
Remove the run-docker job because it's covered by service jobs.
Stop testing that puppet is installed in testinfra. It's accidentally
working due to the selection of non-puppeted hosts only being on
bionic nodes and not installing puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it's only useful to test that a synthetic test
of installing puppet on nodes we don't use works.
Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.
Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
This change proposes calling a handler each time a certificate is
created/updated. The handler name is based on the name of the
certificate given in the letsencrypt_certs variable, as described in
the role documentation.
Because Ansible considers calling a handler with no listeners an error
this means each letsencrypt user will need to provide a handler.
One simple option illustrated here is just to produce a stamp file.
This can facilitate cross-playbook and even cross-orchestration-tool
communication. For example, puppet or other ansible playbooks can
detect this stamp file and schedule their reloads, etc. then remove
the stamp file. It is conceivable more complex listeners could be
setup via other roles, etc. should the need arise.
A test is added to make sure the stamp file is created for the
letsencrypt test hosts, which are always generating a new certificate
in the gate test.
Change-Id: I4e0609c4751643d6e0c8d9eaa38f184e0ce5452e
Note, this does not have complete tests yet (we will need to update
the job to start a swift for that).
Change-Id: I2ee7a9e4fb503a3431366c16c380cf09327f6050
We currently only have letsencrypt_test_only as a single flag that
sets tests to use the letsencrypt staging environment and also
generates a self-signed certificate.
However, for initial testing we actually want to fully generate
certificates on hosts, but using the staging environment (i.e. *not*
generate self-signed certs). Thus we need to split this option into
two, so the gate tests still use staging+self-signed, but in-progress
production hosts can just using the staging flag.
These variables are split, and graphite01.opendev.org is made to
create staging certificates.
Also remove some debugging that is no longer necessary.
Change-Id: I08959ba904f821c9408d8f363542502cd76a30a4
We don't have python2 on bridge.o.o, force python3.
Change-Id: Ie8eb68007c0854329cf3757e577ebcbfd40ed8aa
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.
From a top level, the process is implemented in the roles as follows:
1) letsencrypt-acme-sh-install
This role installs the acme.sh tool on hosts in the letsencrypt
group, along with a small custom driver script to help parse output
that is used by later roles.
2) letsencrypt-request-certs
This role runs on each host, and reads a host variable describing
the certificates required. It uses the acme.sh tool (via the
driver) to request the certificates from letsencrypt. It populates
a global Ansible variable with the authentication TXT records
required.
If the certificate exists on the host and is not within the renewal
period, it should do nothing.
3) letsencrypt-install-txt-record
This role runs on the adns server. It installs the TXT records
generated in step 2 to the acme.opendev.org domain and then
refreshes the server. Hosts wanting certificates will have
pre-provisioned CNAME records for _acme-challenge.host.opendev.org
pointing to acme.opendev.org.
4) letsencrypt-create-certs
This role runs on each host, reading the same variable as in step
2. However this time the acme.sh tool is run to authenticate and
create the certificates, which should now work correctly via the
TXT records from step 3. After this, the host will have the
full certificate material.
Testing is added via testinfra. For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI). We test that the DNS TXT records are created locally on
the CI adns server, however.
Related-Spec: https://review.openstack.org/587283
Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
Change I754637115f8c7469efbc1856e88bbcb6fb83b4ce moved a bunch of log
collection to use "stage-output". This uses "fetch-output" which
automatically puts these logs in hostname subdirectories; but it does
not have an option to put it in hosts/hostname as we were doing with
the other logs.
Although we could add such support, it probably doesn't make sense as
most other multinode jobs will have the same layout with the host logs
at the top level. Remove the intermediate "/hosts/" directory on
system-config jobs so all logs remain at the top level, and we don't
have this confusing split as to where logs are for each host.
Change-Id: I56bd67c659ffb26a460d9406f6f090d431c8aa79
This adds the concept of an unmanaged domain; for unmanaged domains we
will write out the zone file only if it doesn't already exist.
acme.opendev.org is added as an unmanaged domain. It will be managed
by other ansible roles which add TXT records for ACME authentication.
The initial template comes from the dependent change, and this ensures
the bind configuration is always valid.
For flexibility and testing purposes, we allow passing an extra
refspec and version to the git checkout. This is one way to pull in
changes for speculative CI runs (I looked into having the hosts under
test checkout from Zuul; but by the time we're 3-ansible call's deep
on the DNS hosts-under-test it's a real pain. For the amount of times
we update this, it's easier to just allow a speculative change that
can take a gerrit URL; for an example see [1])
[1] https://review.openstack.org/#/c/641155/10/playbooks/group_vars/dns.yaml
Testing is enhanced to check for zone files and correct configuration
stanzas.
Depends-On: https://review.openstack.org/641154
Depends-On: https://review.openstack.org/641168
Change-Id: I9ef5cfc850c3458c63aff46cfaa0d49a5d194e87
This allows the zones to load, which is useful in follow-on changes
where we can query them on the host from testinfra to make sure it's
all working.
Change-Id: I9d22c07ce2d1ebad67b0f1ca222c1b457779ce47
We call the bridge playbook from run-base.yaml to bootstrap bridge,
so that's really where we need to disable the cron installation.
Change-Id: I5f3d604feaca5c1d577636c2d1130eec82a35961
The run_all cron running in test jobs is unawesome because it can
cause the inventory overrides we put in for the testing to get
overwritten with the real inventory. We don't want test jobs
attempting to run against real hosts.
Change-Id: I733f66ff24b329d193799e6063953e88dd6a35b1
Add an option to run a playbook (in the fake bridge context) after
running the base playbook. Use this to run a new playbook which
exercises gitea project creation after bootstrapping the gitea
service.
Disable ansible-lint 304 because it erroneously thinks shell and
command are the same thing.
Change-Id: I0394b614771bc62b9fe23d811defd7767b3d10db
We want to trigger ansible runs on bridge.o.o from zuul jobs. First
iteration of this tried to login as root but this is not allowed by our
ssh config. That config seems reasonable so we add a zuul user instead
which we can ssh in as then run things as root from zuul jobs. This
makes use of our existing user management system.
Change-Id: I257ebb6ffbade4eb645a08d3602a7024069e60b3
This runs an haproxy which is strikingly similar to the one we
currently run for git.openstack.org, but it is run in a docker
container.
Change-Id: I647ae8c02eb2cd4f3db2b203d61a181f7eb632d2
When setting up hosts for testing in CI, configure the docker
mirrors before running the base playbook.
Change-Id: I172ae87156238fa6a07414c74e1ca17df1a30257
Add the gitea k8s cluster to root's .kube/config file on bridge.
The default context does not exist in order to force us to explicitly
specify a context for all commands (so that we do not inadvertently
deploy something on the wrong k8s cluster).
Change-Id: I53368c76e6f5b3ab45b1982e9a977f9ce9f08581
There are upstream jobs in zuul-jobs with the docker build playbooks,
so use them. The system-config jobs are kept so that we don't have
to duplicate the secret stanza.
Change-Id: Iceee55a3d0e8b243549fa988f134b1ea9bb6dac5
This adds the infrastructure for building docker images: the
credential used to upload to Docker Hub as well as the parent jobs
and playbooks to perform the builds.
Change-Id: I7cbbcdd184c4934f1b0ce5905d9760c732b06aa9
Depends-On: https://review.openstack.org/631078
The gerrit source dir needs three plugins cloned into
the plugins dir and also a few files updated.
Depends-On: https://review.openstack.org/631007
Change-Id: I56037137d43ee1cea0a4c17e48d09102e1599ddc
Whenever we promote an image, delete the change tag for that image
in Docker Hub, and also delete any change tags older than 24 hours
in order to keep the Docker Hub image registry tidy.
Change-Id: Id4654c893963bdb0a364b1132793fe4fb152bf27
If we clone gerrit to ~/src/gerrit.googlesource.com/gerrit but
want to keep the Dockerfile in system-config, then we need to be
able to run:
docker build ~/src/gerrit.googlesource.com/gerrit -f Dockerfile
Most of the time the dir will just be '.', so put in a sensible
default.
Change-Id: I235080c05e679d2ac270cd5401b85c655fab3112
This job has no nodes; the playbook needs to run on localhost.
The only tasks use the uri module without local files, so should
be safe.
Change-Id: Ic012426a66be3b85efe9af35089addf1316dfa63
Upload an image to dockerhub with a change-specific tag in every
gate job, and then, if the change lands, re-tag the image in
dockerhub.
Change-Id: Ie57fc342cbe29d261d33845829b77a0c1bae5ff4
This is a role for installing docker on our control-plane servers.
It is based on install-docker from zuul-jobs.
Basic testinfra tests are added; because docker fiddles the iptables
rules in magic ways, the firewall testing is moved out of the base
tests and modified to partially match our base firewall configuration.
Change-Id: Ia4de5032789ff0f2b07d4f93c0c52cf94aa9c25c
This collects syslogs from nodes running in our ansible gate tests.
The node's logs are grouped under a "hosts" directory (the bridge.o.o
logs are moved there for consistentcy too).
Change-Id: I3869946888f09e189c61be4afb280673aa3a3f2e
This change takes the ARA report from the "inner" run of the base
playbooks on our bridge.o.o node and publishes it into the final log
output. This is then displayed by the middleware.
Create a new log hierarchy with a "bridge.o.o" to make it clear the
logs here are related to the test running on that node. Move the
ansible config under there too.
Change-Id: I74122db09f0f712836a0ee820c6fac87c3c9c734
This adds connection information for an experimental kubernetes
cluster hosted in vexxhost-sjc1 to the nodepool servers.
Change-Id: Ie7aad841df1779ddba69315ddd9e0ae96a1c8c53