Rather than running a local zookeeper, just run a real zookeeper.
Also, get rid of nb01-test and just use nb04 - what could possibly
go wrong?
Dynamically write zookeeper host information to nodepool.yaml
So that we can run an actual zk using the new zk role on hosts in
ansible inventory, we need to write out the ip addresses of the
hosts that we build in zuul. This means having the info baked in
to the file in project-config isn't going to work.
We can do this in prod too, it shouldn't hurt anything.
Increase timeout for run-service-nodepool
We need to fix the playbook, but we'll do that after we get the
puppet gone.
Change-Id: Ib01d461ae2c5cec3c31ec5105a41b1a99ff9d84a
Zuul is publishing lovely container images, so we should
go ahead and start using them.
We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.
Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.
Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
Make a service playbook, manifest and jobs for codesearch.
Remove openstack_project::server - it doesn't do anything.
Change-Id: I44c140de4ae0b283940f8e23e8c47af983934471
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
* stop zk
* split data and log files and move them to new locations
* remove zk packages
* start zk containers
* remove from emergency; land this change.
Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
Upstream likes building the settings file into the image, but that's
less exciting, let's bind-mount ours in.
Depends-On: https://review.opendev.org/717491/
Change-Id: Ia1894d884ef2a84e1282345b77fe07bf8898f367
These hosts have been removed; remove the old references and
unnecessary groups, add the new host to cacti.
Change-Id: Ibcfd78a37e20e514c190ef801c2d44320c8b3f74
Story: #2006598
This should be mostly a no-op - but we will need to do a shutdown
in emergency mode.
Tell the gerrit role to not run compose up when run as part of
remote_puppet_git.
Change-Id: Id45376c2697656a12afeacf317b6f26c85c08dad
This is a start at ansible-deployed nodepool environments.
We rename the minimal-nodepool element to nodepool-base-legacy, and
keep running that for the old nodes.
The groups are updated so that only the .openstack.org hosts will run
puppet. Essentially they should remain unchanged.
We start a nodepool-base element that will replace the current
puppet-<openstackci|nodepool> deployment parts. For step one, this
grabs project-config and links in the elements and config file.
A testing host is added for gate testing which should trigger these
roles. This will build into a full deployment test of the builder
container.
Change-Id: If0eb9f02763535bf200062c51a8a0f8793b1e1aa
Depends-On: https://review.opendev.org/#/c/710700/
We have LE dns entries for review.o.o, but we're not actually
requesting the cert. Go ahead and request it - it'll make the
apache config easier to sort out.
Get the openstack.org certs for review-dev while we're at it.
Change-Id: I91d06c97993ba37204bd1fc326ae823e1b9c0c1a
Depends-On: https://review.opendev.org/707267
Depends-On: https://review.opendev.org/707255
Add a new review-dev server on the opendev domain with LE support
enabled.
Depends-On: https://review.opendev.org/705661
Change-Id: Ie32124cd617e9986602301f230e83bb138524fdf
The review-dev service playbook should do everything now that
the puppet did. Update how we're running things.
Change-Id: I70303c48328ea6713c24bf9c6f63d4808d30b95c
This adds a new handler to restart the zuul registry to pick up the new
cert. We may want to consider updating zuul registry to accept a reload
of ssl config without restarting the service.
Depends-On: https://review.opendev.org/702050
Change-Id: I23f6bea68285bc7cb0d12224235eaa16f0d07986
This provisions the cert but does not use it yet. We will do the
switchover once the cert is confirmed to be in place.
Depends-On: https://review.opendev.org/701819
Change-Id: I04fee48b9a79758527d8f9e8128c0fa915cd133e
This is the first step in managing the opendev.org cert with LE. We
modify gitea01.opendev.org only to request the cert so that if this
breaks the other 7 giteas can continue to serve opendev.org. When we are
happy with the results we can merge the followup change to update the
other 7 giteas.
Depends-On: https://review.opendev.org/694182
Change-Id: I9587b8c2896975aa0148cc3d9b37f325a0be8970
The backup roles have been debugged and are ready to run.
A note is added about having the backup server in a default disabled
state. This was discussed at an infra meeting where consensus was to
keep it disabled [1].
[1] http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-06-11-19.01.log.html#l-184
Change-Id: I2a3d2d08a9d1514bf6bdcf15bc5bc95689f3020f
This is a new backup server for use with the roles in
I9bf74df351e056791ed817180436617048224d2c
Restrict the puppet group to only the openstack.org servers as this
new server doesn't need puppet.
Depends-On: https://review.opendev.org/674549
Change-Id: Ia8e2e01f579ed9475830c159bf266b63bed52c36
This can be used in an apache vhost later, but should be fine to
merge now.
Depends-On: https://review.opendev.org/673902
Change-Id: Ic2cb7585433351ec1bdabd88915fa1ca07da44e7
We ended up running into a problem with nodepool built control plane
images (has to do with boot from volume not allowing us to delete images
that are in use by a nova instance). We have decided to clean this up
and go back to not doing this until we can do it more properly.
Note this isn't a revert because having a group for access to control
plane clouds does seem like a good idea in general and I believe there
have been changes we'd have to resolve in the clouds.yaml files anyway.
Depends-On: https://review.opendev.org/#/c/665012/
Change-Id: I5e72928ec2dec37afa9c8567eff30eb6e9c04f1d
Add the new mirror-update server as a follow-on to
I525ac18b55f0e11b0a541b51fa97ee5d6512bf70.
Also ensure that the new mirror server isn't in the puppet groups by
only matching the openstack.org one.
Also remove from the afsadmin group. This group is only used for
keytabs stored on bridge.o.o. I don't think that we need group for
the keytabs -- a keytab should only ever be in use on one host at a
time, so we are better off keeping the keytabs in a specific host_var
for the host they are used on, rather than being in a group and
possibly deployed on servers where they are not used.
Depends-On: https://review.opendev.org/668610
Change-Id: Icda92bb234adc00f6718c1c656e8f069ce2704c4
This move was prompted by wishing to expose the mirror update logs for
the rsync updates so that debugging problems does not require a root
user (note: not actually done in this change; will be a follow-on).
Rather than start hacking at puppet, the rsync mirror scripts make a
nice delination point for starting an Ansible-first/Bionic update.
Most magic is included in the scripts, so there is not much more to do
than copy them. The host uses the existing kerberos and openafs roles
and copies the key material into place (to be added before merge).
Note the scripts are removed from the extant puppet so we don't have
two updates happening simultaneously. This will also require a manual
clean to remove the cron jobs as a once-off when merging.
The other part of mirror-update is the reprepro based scripts for the
various debuntu repositories. They are left as future work for now.
Testing is added to ensure dependencies and scripts are all in place.
Change-Id: I525ac18b55f0e11b0a541b51fa97ee5d6512bf70
In order to have nodepool build images and upload them to control
plane clouds, add them to the clouds.yaml on the nodepool-builder
hosts. Keep them out of the launcher configs by splitting the config
templates. So that we can keep our copies of things to a minimum,
create a group called "control-plane-clouds" and put bridge and nb0*
in it.
There are clouds mentions in here that we no longer use, a followup
patch will clean those up.
NOTE: Requires shifting the clouds config dict from
host_vars/bridge.openstack.org.yaml to group_vars/control-plane-clouds.yaml
in the secrets on bridge.
Needed-By: https://review.opendev.org/640044
Change-Id: Id1161bca8f23129202599dba299c288a6aa29212
This removes the groups servers from our inventory as well as our
manifests/modules. We don't run the groups service anymore as many
groups migrated to meetup.com independent of us and the others have
transitioned there.
Change-Id: I7cb76611e6d30e7189821923f36a38dec9ea7241
This impelements mirrors to live in the opendev.org namespace. The
implementation is Ansible native for deployment on a Bionic node.
The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so to not run puppet
on the hosts.
The kerberos and openafs client parts do not need any updating and
works on the Bionic host.
The hosts are setup to provision certificates for themselves from
letsencrypt. Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.
The new "mirror" role is a port of the existing puppet mirror.pp. It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.
The vhost configuration is also ported from the extant puppet. It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support. The other ports
are left alone for now, but can be updated in due course.
Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue. We can update our mirror setup
scripts to point to https resources as appropriate.
Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e