Currently we don't set a contact email with our accounts. This is an
optional feature, but would be helpful for things like [1] where we
would be notified of certificates affected by bugs, etc.
Set up the email address in the acme.sh config, which will apply to
any new accounts created. To update all the existing hosts, we check
whether the account email has been added or modified in the config
*and* whether we have existing account details; if so, a manual
update call is needed.
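A minimal sketch of the idea, assuming illustrative paths and
variable names rather than the actual role contents:
  - name: Set contact email in the acme.sh account config
    lineinfile:
      path: /root/.acme.sh/account.conf
      regexp: '^ACCOUNT_EMAIL='
      line: "ACCOUNT_EMAIL='{{ letsencrypt_account_email }}'"
    register: _account_email
  - name: Update the email on an already-registered account
    # _account_exists would come from a prior check for existing
    # account details (hypothetical variable)
    command: >-
      acme.sh --update-account
      --accountemail "{{ letsencrypt_account_email }}"
    when: _account_email is changed and _account_exists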
For anyone who might be poking around here, we also add a note on
sharing an account, based on some broadly agreed-upon discussion in IRC.
[1] https://community.letsencrypt.org/t/revoking-certain-certificates-on-march-4/114864
Change-Id: Ib4dc3e179010419a1b18f355d13b62c6cc4bc7e8
This site was never used nor published; it can be killed according to
the QA PTL.
codesearch returns no matches for it in any docs.
Keep the occurrence in manifests/static.pp; it will get deleted
as part of https://review.opendev.org/710388.
Change-Id: I3c0d3b567a3eccb959dc903f169197e4581f1e13
The content for many projects has moved but the legacy redirects were
not updated; update them to the current locations.
Change-Id: I7030ad35378085b0c45429c272dc24f00d33b2d2
This is a slight divergence from the accepted spec, where we were
going to implement these redirects via a new haproxy instance
(I961456d44a56f2334d3c94ef27e408f27409cd65). We've decided it's
easier to keep them on static.opendev.org.
The following sites are configured on static.opendev.org to redirect
to the same locations they redirect to today:
* devstack.org
* www.devstack.org
* ci.openstack.org
* cinder.openstack.org
* glance.openstack.org
* horizon.openstack.org
* keystone.openstack.org
* nova.openstack.org
* qa.openstack.org
* summit.openstack.org
* swift.openstack.org
As a bonus, they all get an https instance too, which they didn't have
before.
testinfra coverage should be complete for this change. I have created
the _acme-challenge CNAME records for all of the above.
Story: #2006598
Task: #38881
Change-Id: I3f1fc108e7bb1c9500ad4d1a51df13bb4ae00cb9
When converting this from an htaccess file to run in the virtualhost
context, one instance of '^cgit' -> '^/cgit' was missed. Fix it, and
add a coverage test for it to testinfra.
Change-Id: Icc1dae6dce232e69c5cd1cf98b594f562c60d3f2
This creates the redirect sites:
git.airshipit.org
git.openstack.org
git.starlingx.io
git.zuul-ci.org
The htaccess rules are put into the main configuration file to avoid
having to create a directory and manage another file. We use a macro
to duplicate the rules and retain the old semantics of the http site
redirecting directly (as opposed to doing an extra 301 to
https://git.openstack.org first). This required adding "/" to the "^"
matches as it now runs in VirtualHost context; no functional change is
intended over the old sites.
This will require _acme-challenge CNAMEs to acme.opendev.org before
being merged.
testinfra is updated to exercise some redirects matching against the
results of the extant sites.
Change-Id: Iaa9d5dc2af3f5f8abc11c2312e4308b50f5fcd2b
files.openstack.org serves a view of /afs/openstack.org/, which is the
same as static.opendev.org. Add a ServerAlias and a certificate for it.
Make static.openstack.org consistent with opendev by showing the
same thing.
Change-Id: I4c492e3b02554a7c736c015790bd4cd5bb435a43
This is an alternative to Iccf24a72cf82592bae8c699f9f857aa54fc74f10
which removes the 404 scraping tool. It creates a zuul user and
enables login via the system-config per-project ssh key, and then runs
the 404 scraping script against it periodically.
Change-Id: I30467d791a7877b5469b173926216615eb57d035
This creates sites to serve:
developer.openstack.org
docs.openstack.org
docs.opendev.org
docs.starlingx.io
which are all just static directories underneath /afs/openstack.org/.
This is currently done by files02.openstack.org, but will be better
served in the future by consolidating in ansible configuration on
static.opendev.org.
The following DNS entries need to be created before merging to ensure
the certificates are provisioned:
_acme-challenge.developer.openstack.org
_acme-challenge.docs.openstack.org
_acme-challenge.docs.opendev.org
_acme-challenge.docs.starlingx.io
Once done, we can merge and then cut-over the main DNS entries as we
like.
Since there are some follow-ons, I have not removed the puppet
configuration from files02.openstack.org. I think it's best we
migrate everything away from that and remove it in one lot.
Change-Id: I459a36f823a8868e6cc09e2b0d85f2fe05d69002
This adds the site to publish from
/afs/openstack.org/project/releases.openstack.org
Change-Id: Ia91deb9a51441ac9974137ed39fc5a185689a11c
Task: #37724
Story: #2006598
If you currently hit https://static.opendev.org you get redirected to
the default site, which is just the first site in alphabetical order,
which happens to be governance.openstack.org.
Add a 00-static.opendev.org.conf file so this is the default site. It
will just serve up the top-level afs directory.
Change-Id: Icdcee962b76545c12e84d4cadb0b60a68cabe38b
This migrates the afsmon script from puppet deploying on
mirror-update.openstack.org to ansible deploying on
mirror-update.opendev.org.
There is nothing particularly special and this is just a straight install
with some minor dependencies. Since we have log publishing running on
the opendev.org server, we publish the update logs alongside the
others.
Change-Id: Ifa3b4d59f8d0fc23a4492e50348bab30766d5779
This is a migration of the current periodic "vos release" script to
mirror-update.opendev.org.
The current script is deployed by puppet and run by a cron job on
afsdb01.dfw.openstack.org.
My initial motivation for this was wanting to better track our release
of these various volumes. With tarballs and releases moving to AFS
publishing, we are going to want to track the release process more
carefully.
Initially, I wanted to send timing statistics to graphite so we could
build a dashboard and track the release times of all volumes. Because
this requires additional libraries and since we are deprecating
puppet, further development there is unappealing and it would better
live in ansible.
Since I6c96f89c6f113362e6085febca70d58176f678e7 we have the ability to
call "vos release" with "-localauth" permissions via ssh on
mirror-update; this avoids various timeout issues (see the changelog
comment there for more details). So we do not need to run this script
directly on the afsdb server.
We are already publishing mirror update logs from mirror-update,
and it would be good to also publish these release logs so anyone can
see if there are problems.
All this points to mirror-update.opendev.org being a good future home
for this script.
The script has been refactored somewhat to:
- have a no-op mode
- send timing stats for each volume release
- call "vos release" via the ssh mechanism we created
- use an advisory lock to avoid running over itself
It runs from a virtualenv and its logs are published via the same
mechanism as the mirror logs (slightly misnamed now).
Note this script is currently a no-op to test the deployment, running
and log publishing. A follow-up will disable the old job and make
this active.
Change-Id: I62ae941e70c7d58e00bc663a50d52e79dfa5a684
Add these hosts to static.opendev.org, serving from AFS. Note that
tarballs.openstack.org just redirects to static.opendev.org/openstack.
This should have no effect currently; it will only become live when we
switch DNS.
For more details see the thread at:
http://lists.openstack.org/pipermail/openstack-infra/2020-January/006584.html
Change-Id: Ie56fac17ffaa91ee55be986de636485a58125a02
Add a new review-dev server on the opendev domain with LE support
enabled.
Depends-On: https://review.opendev.org/705661
Change-Id: Ie32124cd617e9986602301f230e83bb138524fdf
This runs gerrit in a container on review-dev01 using podman.
Remove an unused web_server.py file that was carried over when copying
from puppet to ansible.
Change-Id: I399d3cf8471bc8063022b0db0ff81718b2ee2941
The ssh config file is /.ssh/config (not ssh_config)
We are accepting the ed25519 key, not the ecdsa key, so fix that in
the known_hosts stanza.
Change-Id: If3a42a7872f5d5e7a2bf9c3b5184fb14d43e6a1a
Currently we don't have any logs from our gitea sshd processes because
sshd logs to syslog by default and /dev/log isn't in our containers. You
can ask sshd nicely to log to stderr instead with the -e flag which
docker will pick up and store for us.
Update the sshd command to include -e, then use testinfra to check
that we collect logs and that they are accessible from docker.
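In docker-compose terms this is roughly the following; the service
and image names are illustrative:
  services:
    gitea-ssh:
      image: gitea-openssh
      # -D keeps sshd in the foreground; -e sends its log output to
      # stderr, which docker captures for us
      command: /usr/sbin/sshd -D -e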
Change-Id: Ib7d6d405554c3c30be410bc08c6fee7d4363b096
This introduces two new roles for managing the backup-server and hosts
that we wish to back up.
Firstly the "backup" role runs on hosts we wish to backup. This
generates and configures a separate ssh key for running bup and
installs the appropriate cron job to run the backup daily.
The "backup-server" job runs on the backup server (or, indeed
servers). It creates users for each backup host, accepts the remote
keys mentioned above and initalises bup. It is then ready to receive
backups from the remote hosts.
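For the host-side "backup" role, the shape is roughly as follows; the
key path, cron timing and wrapper script name are illustrative:
  - name: Generate a dedicated ssh key for bup
    openssh_keypair:
      path: /root/.ssh/id_bup_ed25519
      type: ed25519
  - name: Install daily backup cron job
    cron:
      name: Run bup backup
      hour: '5'
      minute: '23'
      # hypothetical wrapper around "bup index" and "bup save" pointing
      # at the backup server
      job: /usr/local/bin/run-bup-backup.sh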
This eliminates a fairly long-standing requirement for manual setup of
the backup server users and keys; this section is removed from the
documentation.
testinfra coverage is added.
Change-Id: I9bf74df351e056791ed817180436617048224d2c
The option to disable installing suggested and recommended packages
has been in diskimage-builder based images for a long time [1].
However, we have no setting for it in our base-server role, meaning
that when launching nodes from cloud-provider images we can be out of
sync on this option.
I6d69ac0bd2ade95fede33c5f82e7df218da9458b is an example where packages
pulled in by suggestions can fail (arguably a packaging issue, but
anyway...)
By enabling this here, we make our control plane servers homogeneous
with our diskimage-builder based testing nodes, which is better for
general sanity. Overall it gives us more control over what's
installed.
[1] https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/dpkg/pre-install.d/00-disable-apt-recommends
As I6d69ac0bd2ade95fede33c5f82e7df218da9458b showed, installing
suggested or recommended packages might result in failures.
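The setting itself is small; a hedged sketch of what the role drops
into apt's configuration (the file name is illustrative):
  - name: Disable installation of recommended and suggested packages
    copy:
      dest: /etc/apt/apt.conf.d/95disable-recommends
      content: |
        APT::Install-Recommends "false";
        APT::Install-Suggests "false";
      owner: root
      group: root
      mode: '0644'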
Change-Id: Id6dcc158944a46fc0ae03b6f1ff372dacd67c2e6
Add new IP addresses to inventory for the rebuild, but don't
reactivate it in the haproxy pools yet.
Note this switches the gitea testing to use a host called gitea99 so
that it doesn't conflict with our changes of the production hosts.
Change-Id: I9779e16cca423bcf514dd3a8d9f14e91d43f1ca3
This takes a similar approach to the extant ansible_cron_install_cron
variable to disable the cron job for the cloud launcher when running
under CI.
If your CI job happens to be running when the cron job decides to
fire, you end up with a harmless but confusing failed run of the
cloud launcher (which has tried to contact real clouds) in the ARA
results. Use the "disabled" flag to ensure the cron job doesn't run.
Using
"disabled" means we can still check that the job was installed via
testinfra however.
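A rough sketch of the pattern; the variable and script names are
illustrative:
  - name: Install cloud launcher cron job
    cron:
      name: run cloud launcher
      hour: '*/6'
      minute: '0'
      job: /opt/system-config/run_cloud_launcher.sh
      # "disabled" writes the crontab entry commented out, so testinfra
      # can still assert the job is installed without it firing in CI
      disabled: "{{ cloud_launcher_disable_cron | default(false) }}"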
Convert ansible_cron_install_cron to a similar method using disabled,
document the variable in the README and add a test for the run_all.sh
script in crontab too.
Change-Id: If4911a5fa4116130c39b5a9717d610867ada7eb1
This adds a periodic job to copy logs to a mirror volume, and export
it via the usual mirror http.
I have precreated the log volume; just as a R/W volume because this is
expected to be very low volume access.
Change-Id: I67870f6d439af2d2a63a5048ef52cecff3e75275
Keytabs are slightly longer than what is being tested; up to 100 bytes
or so. This means the encoded data breaks over lines, which means you
need to be more careful about quoting.
Update the testing to a longer keytab (100 bytes of random data) and
fix up the quoting. Also enable no_log to avoid putting key
material into the logs.
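A hedged sketch of the pattern under test; the variable and path
names are illustrative:
  - name: Write keytab from base64-encoded secret
    # the encoded value can span multiple lines, so the quoting here
    # matters; no_log keeps the key material out of the ansible logs
    shell: echo "{{ afs_keytab_b64 }}" | base64 -d > /etc/afs.keytab
    no_log: true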
Change-Id: I73c391a2ebd2c962dc9a422f9d44265160210852
This move was prompted by wishing to expose the mirror update logs for
the rsync updates so that debugging problems does not require a root
user (note: not actually done in this change; will be a follow-on).
Rather than start hacking at puppet, the rsync mirror scripts make a
nice delineation point for starting an Ansible-first/Bionic update.
Most magic is included in the scripts, so there is not much more to do
than copy them. The host uses the existing kerberos and openafs roles
and copies the key material into place (to be added before merge).
Note the scripts are removed from the extant puppet so we don't have
two updates happening simultaneously. This will also require a manual
clean to remove the cron jobs as a once-off when merging.
The other part of mirror-update is the reprepro based scripts for the
various debuntu repositories. They are left as future work for now.
Testing is added to ensure dependencies and scripts are all in place.
Change-Id: I525ac18b55f0e11b0a541b51fa97ee5d6512bf70
There are long-standing issues with ntp start ordering w.r.t. unbound
and being able to resolve DNS names. Things have moved on to
systemd-timesyncd anyway. Move the ntp start from the generic
locations to only apply to older distros, and use systemd-timesyncd on
Bionic. Update testing.
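A hedged sketch of the split; the conditions are simplified:
  - name: Use systemd-timesyncd on Bionic
    service:
      name: systemd-timesyncd
      state: started
      enabled: yes
    when: ansible_distribution_release == 'bionic'
  - name: Keep ntp on older distros
    package:
      name: ntp
      state: present
    when: ansible_distribution_release in ['trusty', 'xenial']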
Change-Id: I664539f93242e2c68d0cb1cf95c260f3bc03550d
Build a container image with the haproxy-statsd script, and run that
along with the haproxy container.
Change-Id: I18be70d339df613bf9a72e115e80a6da876111e0
This implements mirrors living in the opendev.org namespace. The
implementation is Ansible native for deployment on a Bionic node.
The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so as not to run
puppet on the hosts.
The kerberos and openafs client parts do not need any updating and
work on the Bionic host.
The hosts are setup to provision certificates for themselves from
letsencrypt. Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.
The new "mirror" role is a port of the existing puppet mirror.pp. It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.
The vhost configuration is also ported from the extant puppet. It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support. The other ports
are left alone for now, but can be updated in due course.
Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue. We can update our mirror setup
scripts to point to https resources as appropriate.
Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.
Zuul should be able to handle missing projects now, so remove it
from the puppet_git playbook and into puppet.
Make the base playbook be merely the base roles.
Make service playbooks for each service.
Remove the run-docker job because it's covered by service jobs.
Stop testing in testinfra that puppet is installed. That test only
accidentally works because the non-puppeted hosts are all bionic
nodes and we do not install puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it only serves to test a synthetic scenario of
installing puppet on nodes we don't use.
Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.
Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
Production letsencrypt certificate generation creates an intermediate
chain file (ca.cer); to simulate this during the self-signed tests
generate a fake CA certificate, and use that to sign the generated
server certificate.
Tests are updated to look for all of these files.
Change-Id: I3990529bca7ff3c6413ed0066f9c4feaf5464b1c
This change proposes calling a handler each time a certificate is
created/updated. The handler name is based on the name of the
certificate given in the letsencrypt_certs variable, as described in
the role documentation.
Because Ansible considers calling a handler with no listeners an
error, each letsencrypt user will need to provide a handler.
One simple option illustrated here is just to produce a stamp file.
This can facilitate cross-playbook and even cross-orchestration-tool
communication. For example, puppet or other ansible playbooks can
detect this stamp file and schedule their reloads, etc., then remove
the stamp file. It is conceivable that more complex listeners could
be set up via other roles should the need arise.
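A minimal example of such a handler producing a stamp file; the
certificate name and stamp path are illustrative, and the exact
handler naming scheme is described in the role documentation:
  handlers:
    - name: letsencrypt updated gitea01-main
      file:
        path: /var/run/certificate-updates/gitea01-main
        state: touch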
A test is added to make sure the stamp file is created for the
letsencrypt test hosts, which are always generating a new certificate
in the gate test.
Change-Id: I4e0609c4751643d6e0c8d9eaa38f184e0ce5452e
Git can segfault and cause a gitea error due to the size of the
openstack/openstack repo. Give it more stack space.
The hard limit is a workaround for
https://github.com/moby/moby/issues/39125
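In docker-compose terms this looks roughly like the following; the
service name and sizes are illustrative:
  services:
    gitea-web:
      ulimits:
        stack:
          # bytes; the hard limit is the workaround for the moby issue
          # referenced above
          soft: 16777216
          hard: 16777216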
Change-Id: Ibce79d8ab27af3070bf9c5f584d0d78f2b266388
There's a bunch in here; mostly big-ticket things and test fixes.
Also, change the README to rst - because why is it markdown?
Depends-On: https://review.opendev.org/654005
Change-Id: I21e5017011e1111b4d7a9e4bf0ea6b10f5dd8c1b
Ensure the certificate material is not world-readable. Create a
letsencrypt group, and have things owned by root but group readable.
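Roughly, with an illustrative certificate path:
  - name: Create letsencrypt group
    group:
      name: letsencrypt
  - name: Restrict certificate material to root and the letsencrypt group
    file:
      path: /etc/letsencrypt-certs
      state: directory
      owner: root
      group: letsencrypt
      mode: '0750'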
Change-Id: I49a6a8520aca27e70b3e48d0fcc874daf1c4ff24
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.
From a top level, the process is implemented in the roles as follows:
1) letsencrypt-acme-sh-install
This role installs the acme.sh tool on hosts in the letsencrypt
group, along with a small custom driver script to help parse output
that is used by later roles.
2) letsencrypt-request-certs
This role runs on each host, and reads a host variable describing
the certificates required (see the sketch after this list). It uses
the acme.sh tool (via the
driver) to request the certificates from letsencrypt. It populates
a global Ansible variable with the authentication TXT records
required.
If the certificate exists on the host and is not within the renewal
period, it should do nothing.
3) letsencrypt-install-txt-record
This role runs on the adns server. It installs the TXT records
generated in step 2 to the acme.opendev.org domain and then
refreshes the server. Hosts wanting certificates will have
pre-provisioned CNAME records for _acme-challenge.host.opendev.org
pointing to acme.opendev.org.
4) letsencrypt-create-certs
This role runs on each host, reading the same variable as in step
2. However this time the acme.sh tool is run to authenticate and
create the certificates, which should now work correctly via the
TXT records from step 3. After this, the host will have the
full certificate material.
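The per-host variable read in step 2 looks roughly like this; the
certificate name and domains are illustrative:
  letsencrypt_certs:
    mirror01-main:
      - mirror01.region.provider.opendev.org
      - mirror.region.provider.opendev.org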
Testing is added via testinfra. For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI). We test that the DNS TXT records are created locally on
the CI adns server, however.
Related-Spec: https://review.openstack.org/587283
Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
This adds the concept of an unmanaged domain; for unmanaged domains we
will write out the zone file only if it doesn't already exist.
acme.opendev.org is added as an unmanaged domain. It will be managed
by other ansible roles which add TXT records for ACME authentication.
The initial template comes from the dependent change, and this ensures
the bind configuration is always valid.
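The "write the zone file only if it doesn't already exist" behaviour
maps to something like the following; the paths are illustrative:
  - name: Seed initial zone file for unmanaged domain
    copy:
      src: acme.opendev.org.zone
      dest: /var/lib/bind/zones/acme.opendev.org/zone.db
      # force: no means an existing zone file (with TXT records added
      # later by other roles) is never overwritten
      force: no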
For flexibility and testing purposes, we allow passing an extra
refspec and version to the git checkout. This is one way to pull in
changes for speculative CI runs (I looked into having the hosts under
test check out from Zuul, but by the time we're three ansible calls
deep on the DNS hosts-under-test it's a real pain. For the number of
times we update this, it's easier to just allow a speculative change
that can take a gerrit URL; for an example see [1].)
[1] https://review.openstack.org/#/c/641155/10/playbooks/group_vars/dns.yaml
Testing is enhanced to check for zone files and correct configuration
stanzas.
Depends-On: https://review.openstack.org/641154
Depends-On: https://review.openstack.org/641168
Change-Id: I9ef5cfc850c3458c63aff46cfaa0d49a5d194e87
We want to trigger ansible runs on bridge.o.o from zuul jobs. The
first iteration of this tried to log in as root, but this is not
allowed by our ssh config. That config seems reasonable, so we add a
zuul user instead which we can ssh in as and then run things as root
from zuul jobs. This makes use of our existing user management
system.
Change-Id: I257ebb6ffbade4eb645a08d3602a7024069e60b3
This runs an haproxy which is strikingly similar to the one we
currently run for git.openstack.org, but it is run in a docker
container.
Change-Id: I647ae8c02eb2cd4f3db2b203d61a181f7eb632d2