237 Commits

Author SHA1 Message Date
Dirk Mueller
b3ce1c52dc Move openSUSE Tumbleweed into a caching mirror instead
Tumbleweed is only rarely used in the OpenStack CI, so mirroring it
fully is not worth the time/space overhead. A caching proxy
should be good enough. Add it to the directories to clean up
and remove the older entries because they will no longer be
matching.

Change-Id: I987da098cf4a7330cdec8da9ae3cfbff2f330bf8
2019-05-24 16:19:40 +10:00
Monty Taylor
69f618d36c Disable openid login and signup
This is not a feature we're intending to support at the current
time.

Change-Id: Ie33c266c8ebcaeb471066b52ce37c56c04f93e5d
2019-05-23 10:32:01 -05:00
Zuul
0ae85ed7bb Merge "Use local fork of gitea and upgrade to 1.8.0" 2019-05-22 16:58:09 +00:00
Ian Wienand
93bb1d549e letsencrypt : use date call for serial number
Per [1] ansible_date_time is NOT actually the date/time -- it is the
time cached from the facts.  It seems this can not be changed because,
of course, things have started depending on this behaviour.

This is particularly incorrect if you're using this as a serial number
for DNS and it is not incrementing across runs, and thus bind is
refusing to load the new entries in the acme.opendev.org zone during
letsencrypt runs, and the TXT authentication fails.

Use the suggested work-around in the issue which is an external call
to date.
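
As a rough illustration of the problem described above (not the role's actual
implementation), the sketch below derives a zone serial from the clock at call
time rather than from a value captured earlier. The function name
`zone_serial` and the use of epoch seconds as the serial are assumptions made
for this example.

```python
import time


def zone_serial(now=None):
    """Return a DNS zone serial derived from the current Unix time.

    Assumption: epoch seconds are used as the serial. The SOA serial is an
    unsigned 32-bit field, so the value is reduced modulo 2**32.
    """
    seconds = int(time.time() if now is None else now)
    return seconds % 2**32


# A value captured once (like a cached fact) never advances between runs,
# whereas a value read at call time does.
cached = zone_serial(now=1558507311)
fresh = zone_serial(now=1558507911)
assert fresh > cached
```

The point is only that the clock must be read on every run; any monotonically
increasing source would serve the same purpose.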

[1] https://github.com/ansible/ansible/issues/22561

Change-Id: Ic3f12f52e8fbb87a7cd673c37c6c4280c56c2b0f
2019-05-22 16:41:51 +10:00
James E. Blair
70b8118ab0 Use local fork of gitea and upgrade to 1.8.0
This has a few emergency local patches while we wait for them to
appear in an upstream release.

This updates the modified templates to match the changes in 1.8.0
upstream.

This also disables the oauth2 service, which is new in 1.8.0.
Without disabling this, gitea tries to generate a JWT secret and
write it to the file, which in our case is read only. If we want
to enable it, we need to add a new JWT_SECRET setting.

Change-Id: I969682bce6ff25b7614ce9265097307ee9cbc6cb
Co-Authored-By: Monty Taylor <mordred@inaugust.com>
2019-05-21 12:16:21 -05:00
Ian Wienand
73bbc6787f Bringup mirror01.dfw.rax.opendev.org
This is an initial host for testing opendev.org mirrors

Change-Id: I26b9ed1e21e2111f48bc7ecc384880c274eed213
Depends-On: https://review.opendev.org/660235
2019-05-21 11:08:30 +10:00
Ian Wienand
670107045a Create opendev mirrors
This implements mirrors to live in the opendev.org namespace.  The
implementation is Ansible native for deployment on a Bionic node.

The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so as not to run puppet
on the hosts.

The kerberos and openafs client parts do not need any updating and
work on the Bionic host.

The hosts are set up to provision certificates for themselves from
letsencrypt.  Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.

The new "mirror" role is a port of the existing puppet mirror.pp.  It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.

The vhost configuration is also ported from the extant puppet.  It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support.  The other ports
are left alone for now, but can be updated in due course.

Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue.  We can update our mirror setup
scripts to point to https resources as appropriate.

Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e
2019-05-21 11:08:25 +10:00
Zuul
2c5847dad9 Merge "Split the base playbook into services" 2019-05-20 10:04:40 +00:00
James E. Blair
8ad300927e Split the base playbook into services
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.

Zuul should be able to handle missing projects now, so move it
from the puppet_git playbook into puppet.

Make the base playbook be merely the base roles.

Make service playbooks for each service.

Remove the run-docker job because it's covered by service jobs.

Stop testing that puppet is installed in testinfra. It's accidentally
working due to the selection of non-puppeted hosts only being on
bionic nodes and not installing puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it only serves as a synthetic test that
installing puppet works on nodes we don't use.

Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.

Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
2019-05-19 07:31:00 -05:00
Zuul
8ff026ee33 Merge "letsencrypt: use a fake CA for self-signed testing certs" 2019-05-16 23:51:19 +00:00
Zuul
33e09b7ef5 Merge "Use handlers for letsencrypt cert updates" 2019-05-16 23:51:18 +00:00
Zuul
157ad6d521 Merge "Prune docker images after docker-compose up" 2019-05-16 22:55:04 +00:00
Zuul
91a3ce7e4d Merge "Update zuul servers to puppet 4" 2019-05-14 20:21:03 +00:00
Ian Wienand
1992a9c1ec letsencrypt: use a fake CA for self-signed testing certs
Production letsencrypt certificate generation creates an intermediate
chain file (ca.cer); to simulate this during the self-signed tests
generate a fake CA certificate, and use that to sign the generated
server certificate.

Tests updated to look for all these files

Change-Id: I3990529bca7ff3c6413ed0066f9c4feaf5464b1c
2019-05-14 10:24:28 +10:00
Ian Wienand
733122f0df Use handlers for letsencrypt cert updates
This change proposes calling a handler each time a certificate is
created/updated.  The handler name is based on the name of the
certificate given in the letsencrypt_certs variable, as described in
the role documentation.

Because Ansible considers calling a handler with no listeners an error
this means each letsencrypt user will need to provide a handler.

One simple option illustrated here is just to produce a stamp file.
This can facilitate cross-playbook and even cross-orchestration-tool
communication.  For example, puppet or other ansible playbooks can
detect this stamp file and schedule their reloads, etc. then remove
the stamp file.  It is conceivable more complex listeners could be
setup via other roles, etc. should the need arise.
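
A minimal sketch of how a consumer might act on such a stamp file follows. The
function name `consume_stamp`, the path, and the reload callback are all
hypothetical; the commit does not specify how listeners are written.

```python
import os


def consume_stamp(stamp_path, reload_service):
    """Run reload_service once if the stamp file exists, then remove it.

    Returns True when a reload was triggered, False when there was no stamp.
    """
    if not os.path.exists(stamp_path):
        return False
    reload_service()
    # Remove the stamp only after the reload, so a failed reload is retried
    # the next time this function runs.
    os.remove(stamp_path)
    return True
```

A cron job or another playbook could call this periodically; the stamp file is
the only state shared between the certificate role and the consumer.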

A test is added to make sure the stamp file is created for the
letsencrypt test hosts, which are always generating a new certificate
in the gate test.

Change-Id: I4e0609c4751643d6e0c8d9eaa38f184e0ce5452e
2019-05-14 08:14:51 +10:00
Ian Wienand
2acfc176b0 Remove graphite.openstack.org
The server has been removed, remove it from inventory.

While we're here, s/graphite.openstack.org/graphite.opendev.org/
... it's a CNAME redirect but we might as well clean up.

Change-Id: I36c951c85316cd65dde748b1e50ffa2e058c9a88
2019-05-08 05:55:33 +10:00
Clark Boylan
625d45567f Install socat where we install haproxy
Socat is useful for managing haproxy through the haproxy management
socket. Install it when we install haproxy.

Change-Id: Ie2b16cef62f661669756d24d4a69ac1683401268
2019-05-03 08:18:05 -07:00
Clark Boylan
f4bf952f34 Prune docker images after docker-compose up
This ensures that we clean up images that are superseded and no longer
necessary. We do this to avoid filling the disk with docker images.

Note that we use the -f flag to avoid being prompted by docker image
prune for confirmation.

Change-Id: I8eb5bb97d8c66755e695498707220c9e6e7b2de0
2019-05-02 15:09:37 -07:00
Zuul
2a8783dbab Merge "Double stack size on gitea" 2019-04-23 20:32:59 +00:00
Zuul
5ba6fc424d Merge "Use swift to back intermediate docker registry" 2019-04-23 00:30:21 +00:00
James E. Blair
a845815520 Double stack size on gitea
Git can segfault and cause a gitea error due to the size of the
openstack/openstack repo.  Give it more stack space.

The hard limit is a workaround for
https://github.com/moby/moby/issues/39125

Change-Id: Ibce79d8ab27af3070bf9c5f584d0d78f2b266388
2019-04-22 17:00:00 -07:00
James E. Blair
65563f226e Bind to v4 and v6 in haproxy
Also, add a newline between listener stanzas in the config for
readability.

Change-Id: I599ca06f933e746fae3769e7872ae9911c4b00ed
2019-04-18 15:38:15 -07:00
James E. Blair
f357e5cdab Use swift to back intermediate docker registry
Note, this does not have complete tests yet (we will need to update
the job to start a swift for that).

Change-Id: I2ee7a9e4fb503a3431366c16c380cf09327f6050
2019-04-18 08:14:37 -07:00
Colleen Murphy
180897e49a Update zuul servers to puppet 4
This leaves ask.o.o and lists.o.o, which are still running Trusty, and
the cgit servers, which are likely to be decommissioned soon.

Change-Id: I78e7fd9e3079cc760da0aad955f6eeb32d442fc3
2019-04-17 16:53:56 +00:00
Zuul
6747cf236b Merge "Update nodepool servers to puppet 4" 2019-04-17 16:48:28 +00:00
Clark Boylan
671250095d Install a docker registry GC cron
This installs a daily cron job for garbage collecting the docker
registry. Note that we need to orphan blobs by deleting their tags for
this to result in any cleaned up blobs. This will be done in a separate
change.

Change-Id: I85c87ee3b3a375e0141ef9b15a0b9e56c0938bd8
2019-04-15 12:08:17 -07:00
Zuul
73fc6dde7c Merge "yamlgroup: add regex match; exclude puppet4 for arm64 mirrors" 2019-04-11 22:54:38 +00:00
Colleen Murphy
c7f8b298ef Update nodepool servers to puppet 4
Except nb03.openstack.org, which runs on arm64 for which there are no
puppet 4 packages.

Change-Id: Ia85d20700309a9cd886886c4d4da52fb80ac595f
2019-04-11 21:35:51 +00:00
Ian Wienand
4abd0a3184 yamlgroup: add regex match; exclude puppet4 for arm64 mirrors
Two related changes that need to go together because we test with the
production groups.yaml.

Confusingly, there are arm64 PC1 puppet repos, and it contains a bunch
of things that it turns out are the common java parts only.  The
puppet-agent package is not available, and it doesn't seem like it
will be [1].  I think this means we can not run puppet4 on our arm64
xenial ci hosts.

The problem is the mirrors have been updated to puppet4 -- runs are
now breaking on the arm mirrors because they don't have puppet-agent
packages.  It seems all we can really do at this point is continue to
run them on puppet3.

This is hard (impossible?) to express with a fnmatch in the existing
yamlgroups syntax.  We could do something like list all the mirror
hosts and use anchors etc, but we have to keep that maintained.  Add
a feature to the inventory plugin so that if the list entry starts with
a ^ it is considered a full regex and passed to re.match.  This
allows us to write more complex matchers where required -- in this
case the arm64 ci mirror hosts are excluded from the puppet4 group.
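
The matching rule described above can be sketched roughly as follows. This is
an illustration only, not the plugin's code: the function name
`entry_matches`, the example hostnames, and the example pattern are
assumptions.

```python
import fnmatch
import re


def entry_matches(entry, hostname):
    """Match a hostname against one group list entry.

    Entries starting with '^' are treated as a regular expression and passed
    to re.match; everything else is treated as an fnmatch-style glob.
    """
    if entry.startswith("^"):
        return re.match(entry, hostname) is not None
    return fnmatch.fnmatch(hostname, entry)


# Hypothetical hostnames, used only for illustration.
x86_mirror = "mirror01.dfw.rax.example.org"
arm_mirror = "mirror01.cn1.arm64ci.example.org"

# A glob cannot easily exclude the arm64 host.
assert entry_matches("mirror*.example.org", x86_mirror)
assert entry_matches("mirror*.example.org", arm_mirror)

# A regex with a negative lookahead can.
no_arm64 = r"^mirror\d+\.(?!.*arm64).*$"
assert entry_matches(no_arm64, x86_mirror)
assert not entry_matches(no_arm64, arm_mirror)
```

Keeping glob matching as the default means existing entries behave as before;
only entries that opt in with a leading `^` get regex semantics.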

Testing is updated.

[1] https://groups.google.com/forum/#!msg/puppet-dev/iBMYJpvhaWM/WTGmJvXxAgAJ

Change-Id: I828e0c524f8d5ca866786978486bc04829464b47
2019-04-11 21:34:57 +00:00
Ian Wienand
dedd3a409f letsencrypt: tighten certificate permissions
Ensure the certificate material is not world-readable.  Create a
letsencrypt group, and have things owned by root but group readable.

Change-Id: I49a6a8520aca27e70b3e48d0fcc874daf1c4ff24
2019-04-11 10:32:28 +10:00
Zuul
f028966fd3 Merge "Update AFS servers to puppet 4" 2019-04-10 23:27:10 +00:00
Zuul
8f9c2aada5 Merge "Update review.openstack.org to puppet 4" 2019-04-10 22:02:31 +00:00
Ian Wienand
86c5bc2b45 letsencrypt: split staging and self-signed generation
We currently only have letsencrypt_test_only as a single flag that
sets tests to use the letsencrypt staging environment and also
generates a self-signed certificate.

However, for initial testing we actually want to fully generate
certificates on hosts, but using the staging environment (i.e. *not*
generate self-signed certs).  Thus we need to split this option into
two, so the gate tests still use staging+self-signed, but in-progress
production hosts can just use the staging flag.

These variables are split, and graphite01.opendev.org is made to
create staging certificates.

Also remove some debugging that is no longer necessary.

Change-Id: I08959ba904f821c9408d8f363542502cd76a30a4
2019-04-10 08:47:32 +10:00
Zuul
693fe27610 Merge "letsencrypt : minor updates" 2019-04-08 23:02:16 +00:00
Zuul
f139a81994 Merge "letsencrypt support" 2019-04-08 22:43:54 +00:00
Zuul
2226ab5c98 Merge "Remove zonefile from nsd config" 2019-04-07 23:22:12 +00:00
Zuul
029c81a84a Merge "Add a stop timeout to gitea docker-compose up" 2019-04-05 21:58:42 +00:00
Colleen Murphy
a988c9253e Update AFS servers to puppet 4
Change-Id: I02d63fe1198a8d023814820602d425f891efdb73
2019-04-05 09:31:29 -07:00
Ian Wienand
6088c788f1 letsencrypt : minor updates
Minor updates from review comments for
I1f66da614751a29cc565b37cdc9ff34d70fdfd3f

Change-Id: Ie011f768345ca3d8fdcc0b833f5645a635983d64
2019-04-05 16:50:34 +11:00
Zuul
c3b25fa22c Merge "Upgrade lists.katacontainers.io to puppet 4" 2019-04-04 16:17:38 +00:00
Ian Wienand
afd907c16d letsencrypt support
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.

From a top level, the process is implemented in the roles as follows:

1) letsencrypt-acme-sh-install

   This role installs the acme.sh tool on hosts in the letsencrypt
   group, along with a small custom driver script to help parse output
   that is used by later roles.

2) letsencrypt-request-certs

   This role runs on each host, and reads a host variable describing
   the certificates required.  It uses the acme.sh tool (via the
   driver) to request the certificates from letsencrypt.  It populates
   a global Ansible variable with the authentication TXT records
   required.

   If the certificate exists on the host and is not within the renewal
   period, it should do nothing.

3) letsencrypt-install-txt-record

   This role runs on the adns server.  It installs the TXT records
   generated in step 2 to the acme.opendev.org domain and then
   refreshes the server.  Hosts wanting certificates will have
   pre-provisioned CNAME records for _acme-challenge.host.opendev.org
   pointing to acme.opendev.org.

4) letsencrypt-create-certs

   This role runs on each host, reading the same variable as in step
   2.  However this time the acme.sh tool is run to authenticate and
   create the certificates, which should now work correctly via the
   TXT records from step 3.  After this, the host will have the
   full certificate material.

Testing is added via testinfra.  For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI).  We test that the DNS TXT records are created locally on
the CI adns server, however.

Related-Spec: https://review.openstack.org/587283

Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
2019-04-02 15:31:41 +11:00
Ian Wienand
6256732c10 Remove zonefile from nsd config
The zonefile isn't required in the config file as we are just
transferring from adns1.  Since we don't create the directory for the
files, it results in warnings in the nsd logs -- this can be a
confusing red herring in a debugging situation.

Change-Id: I3e16a359549707a4a3967f580161dec9e71ab689
Related-Bug: https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=4244
2019-04-02 13:20:01 +11:00
Colleen Murphy
db0cf87ddb Update review.openstack.org to puppet 4
Change-Id: I841bae26862d4da41849835bb9f9548a2011cc95
2019-04-01 14:54:04 -07:00
Colleen Murphy
9a7172ab8a Upgrade lists.katacontainers.io to puppet 4
Change-Id: Ic0235ffec7d65a30a44fb518414e872a44b99f37
2019-04-01 14:53:42 -07:00
Ian Wienand
66ceb321a6 master-nameserver: Add unmanaged domains; add acme.opendev.org
This adds the concept of an unmanaged domain; for unmanaged domains we
will write out the zone file only if it doesn't already exist.
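
The "write only if it doesn't already exist" behaviour can be sketched as
below. The function name `write_zone_if_absent` is hypothetical, and the real
role is Ansible rather than Python; this only illustrates the semantics.

```python
def write_zone_if_absent(path, content):
    """Create the zone file with the initial content unless it already exists.

    Returns True if the file was created, False if an existing file was left
    untouched (for example because another role has since added records).
    """
    try:
        # Mode "x" fails with FileExistsError instead of overwriting.
        with open(path, "x") as handle:
            handle.write(content)
    except FileExistsError:
        return False
    return True
```

Using exclusive creation avoids a separate existence check, so records added
to the file later are never clobbered by a re-run.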

acme.opendev.org is added as an unmanaged domain.  It will be managed
by other ansible roles which add TXT records for ACME authentication.
The initial template comes from the dependent change, and this ensures
the bind configuration is always valid.

For flexibility and testing purposes, we allow passing an extra
refspec and version to the git checkout.  This is one way to pull in
changes for speculative CI runs (I looked into having the hosts under
test checkout from Zuul; but by the time we're three ansible calls deep
on the DNS hosts-under-test it's a real pain.  For the amount of times
we update this, it's easier to just allow a speculative change that
can take a gerrit URL; for an example see [1])

[1] https://review.openstack.org/#/c/641155/10/playbooks/group_vars/dns.yaml

Testing is enhanced to check for zone files and correct configuration
stanzas.

Depends-On: https://review.openstack.org/641154
Depends-On: https://review.openstack.org/641168
Change-Id: I9ef5cfc850c3458c63aff46cfaa0d49a5d194e87
2019-03-27 14:22:59 +11:00
Clark Boylan
fa0d4f949e Update even more servers to puppet4
Change-Id: Ice2a07e0f1914b45690455b6b7199fc8441f21be
2019-03-22 09:51:25 -07:00
Zuul
5538493af2 Merge "Ensure lockfile dir is created for bridge.o.o" 2019-03-19 10:18:59 +00:00
Zuul
ffd56ffc75 Merge "Set the gitea theme color to match the opendev pink" 2019-03-18 23:48:58 +00:00
Clark Boylan
05b56342bc Set the gitea theme color to match the opendev pink
Currently the default is to use gitea green which looks a little weird
on our site when using mobile browsers. I don't see an easy way to
revert to the browser's default color but there may be some magic like
using a 'default' string? That may be an alternative option we can
consider.

Change-Id: Ia4e3d25b75bba169c3b7cc60c52c0de791e6be21
2019-03-18 14:08:51 -07:00
Clark Boylan
177edc0abb Retry gitea repo setting HTTP POSTs
I ran our global gitea project sync playbook across all eight gitea
hosts and one failed with a 404 against a specific project. Rerunning
the playbook against that one gitea server worked fine.

Until we sort out why this might happen, let's retry our HTTP POSTs up to
3 times until they succeed.

Some numbers: We have ~2k repos and 8 servers and make two http requests
per repo for a total of 32k requests. If one fails out of that the
success rate is very high so retrying a few times should be fine.
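
For illustration, the retry idea and the request count above can be sketched
like this. The helper name `post_with_retries` and its callable argument are
assumptions; the actual change lives in the Ansible playbook and is not
reproduced here.

```python
def post_with_retries(do_post, attempts=3):
    """Call do_post up to `attempts` times, returning the first success.

    do_post is any zero-argument callable that raises on failure. The last
    error is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return do_post()
        except Exception as err:  # broad on purpose: this is only a sketch
            last_error = err
    raise last_error


# The request count quoted above: ~2k repos, 8 servers, 2 requests per repo.
total_requests = 2000 * 8 * 2
assert total_requests == 32000
```

With one failure observed in roughly 32k requests, a small fixed number of
attempts is very likely to be enough.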

Change-Id: I937a4f852f6713a419c03a17c3b4984a97eae0d8
2019-03-15 13:01:39 -07:00