With our increased ability to test in the gate, there's not much use
for review-dev any more. Remove references.
Change-Id: I97e9865e0b655cd157acf9ffa7d067b150e6fc72
This should ensure that when a parent change updates the gitea version
and a do-not-merge child change induces an artificial failure for zuul
hold purposes, we test the correct image in the child change's builds.
Prior to this we were testing the existing published images, but
provides + requires will give the correct signaling to make the desired
"test new proposed image" behavior happen in the child change builds.
Change-Id: Ie6b827b650e0f32606dc5ec7f4aa0adfeebdeb5e
When we cleaned up the puppet in
I6b6dfd0f8ef89a5362f64cfbc8016ba5b1a346b3 we renamed the group
s/refstack-docker/refstack/ but didn't also move the variables and
some other references.
Change-Id: Ib07d1e9ede628c43b4d5d94b64ec35c101e11be8
This adds a program, zookeeper-statsd, which monitors zookeeper
metrics and reports them to statsd. It also adds a container to
run that program. And it runs the container on each of the
ZooKeeper quorum members. And it updates the graphite host to
allow statsd traffic from quorum members. And it updates the
4-letter-word whitelist to allow the mntr command (which is used
to gather metrics) to be issued.
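
As a rough illustration of the mntr-to-statsd flow described above (not
the actual zookeeper-statsd program), the sketch below parses
tab-separated mntr output and formats numeric values as statsd gauges.
The function names, the sample output and the "zk.example" prefix are
assumptions made for this example.

```python
from typing import Dict, List


def parse_mntr(output: str) -> Dict[str, float]:
    """Parse 'mntr' output (one 'key<TAB>value' per line) into numbers."""
    metrics = {}
    for line in output.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # not a key/value line
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # skip non-numeric values such as the version string
    return metrics


def to_statsd_gauges(metrics: Dict[str, float], prefix: str) -> List[str]:
    """Format each metric as a statsd gauge line: '<name>:<value>|g'."""
    return [f"{prefix}.{name}:{value}|g"
            for name, value in sorted(metrics.items())]


# Hypothetical sample; real mntr output has many more keys.
SAMPLE = (
    "zk_version\t3.5.8\n"
    "zk_avg_latency\t0\n"
    "zk_znode_count\t1234\n"
    "zk_server_state\tfollower\n"
)

lines = to_statsd_gauges(parse_mntr(SAMPLE), "zk.example")
```

Each resulting line could then be sent to the statsd host as a UDP
datagram.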
Change-Id: I298f0b13a05cc615d8496edd4622438507fc5423
This adds the new focal nodepool launcher replacements for nl02-04 to
our inventory. This will configure them with an idle configuration. We
can then confirm they are happy running in an idle state before
switching the config over from the old servers to the new ones.
Depends-On: https://review.opendev.org/c/openstack/project-config/+/780982
Change-Id: Iea645925caaeee6f498aa690c4f2c848f6899317
This adds a role and related testing to manage our Kerberos KDC
servers, intended to replace the puppet modules currently performing
this task.
This role automates realm creation, initial setup, key material
distribution and replica host configuration. None of this is intended
to run on the production servers which are already set up with an
active database, and the role should be effectively idempotent in
production.
Note that this does not yet switch the production servers into the new
groups; this can be done in a separate step under controlled
conditions and with related upgrades of the host OS to Focal.
Change-Id: I60b40897486b29beafc76025790c501b5055313d
There is some correlation between running the manage-projects playbook
and our gitea having fits. The bulk of the work done here is in trying to
update the descriptions of all projects. There isn't a good way to see
if the description is already set first so we just try and ignore
errors. This creates potentially thousands of operations all at once and
could be why things are sad.
We move these operations under the always update flag which is not set
on normal runs. If we really need to converge to a good updated state we
can manually run the playbook/role with always update set.
We also don't set a limit on the number of ThreadPoolExecutor workers
which will default to 5 * NumProcs. Could be that tuning this down would
make gitea happier.
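
A minimal sketch of what capping the worker count could look like. This
is not the actual manage-projects code: update_description is a
stand-in for the real HTTP call and MAX_WORKERS is an untested guess
that would need tuning against a real gitea.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed value for illustration only.
MAX_WORKERS = 4


def update_description(project):
    """Hypothetical stand-in for the per-project description update."""
    return (project, "ok")


def update_all(projects, max_workers=MAX_WORKERS):
    """Run the updates with an explicit cap on concurrent workers."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and never runs more than
        # max_workers calls at the same time.
        return list(pool.map(update_description, projects))


results = update_all([f"org/project-{i}" for i in range(10)])
```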
One other thought is that we may not be using request sessions properly
for connection reuse. In particular requests notes that you need to set
stream to False or read request content to return a connection back to
the pool for reuse. We might look into this for further improvements.
Change-Id: I6e6fb1eb08303e9da7e38cf493d1871364340000
This is a new focal replacement for nl01.openstack.org. We keep
nl01.openstack.org in our inventory for now because we want ansible to
update the nodepool.yaml configs for these two hosts to coordinate a
hand off of responsibilities once we are happy with the new deployment.
We also switch the testing hostname to nl04.openstack.org as this will
be the last nodepool launcher to be removed. When we swap it out the
testing will be updated to use focal hosts.
Depends-On: https://review.opendev.org/c/openstack/project-config/+/779863
Change-Id: Ib3ea6586fe0567c1edf6255ee9be50164d35db62
The production server is trying to send itself to
refstack01.openstack.org, causing cross-site scripting issues. In
production, use the CNAME, but use the FQDN for testing.
Fix up job file matchers while here.
Change-Id: I18a5067ee25c59c5eaa17b7c2d9bd5a942a9173d
The previous refstack server had 'api' in the endpoint
addresses of API calls. Let's try to set it in the new
instance as well to keep the same interface.
Also, fix the typo in the testinfra host match and in
the test name.
Change-Id: I7319990144396b3a753678975a09b0add3ac4465
This has our change to open etherpad on join, so we should no longer need
to run a fork of the web server. Switch to the upstream container image
and stop building our own.
Change-Id: I3e8da211c78b6486a3dcbd362ae7eb03cc9f5a48
These are new focal replacement servers. Because this is the last set of
replacements for the executors we also clean up the testing of the old
servers in the system-config-run-zuul job and the inventory group
checker job.
Change-Id: I111d42c9dfd6488ef69ff1a7f76062a73d1f37bf
The jobs should have file matchers for "roles/openafs-client" (not
playbooks/). Fix this.
Add the openafs/kerberos role matchers to Zuul as well, as it uses
them on the executors.
Change-Id: I66fd7792d6b533362606291e1bfc01dfa2a2e05b
This server has been replaced by ze01.opendev.org running Focal. Let's
remove the old ze01.openstack.org from inventory so that we can delete
the server. We will follow this up with a rotation of new focal servers
being put in place.
This also renames the xenial executor in testing to ze12.openstack.org
as that will be the last one to be rotated out in production. We will
remove it from testing at that point as well.
We also remove a completely unused zuul-executor-opendev.yaml group_vars
file to avoid confusion.
Change-Id: Ida9c9a5a11578d32a6de2434a41b5d3c54fb7e0c
Zuul reports SKIPPED jobs with a slightly different format, with a
fake URL and no run-time.
ERROR results include a message before the timestamp.
Update our test comments to reflect this. The dependent change has the
corresponding updates for the zuul-results-summary plugin.
Depends-On: https://gerrit-review.googlesource.com/c/plugins/zuul-results-summary/+/298422
Change-Id: Idf2f12a8105c08f34a1600c66a3d7a26671728d2
These servers are all up and running and should be ready to go after the
zm01 canary. Note there will be a followup change to remove
zm02-zm08.openstack.org from the inventory. We split this up so that we
can keep those servers around until we're happy with the replacements.
Change-Id: Ic2671da104df2b01986d1b65c8d13507d6792c40
We have seen some poor performance from gitea which may be related to
manage project updates. Start a dstat service which logs to a csv file
on our system-config-run job hosts in order to collect performance info
from our services in pre merge testing. This will include gitea and
should help us evaluate service upgrades and other changes from a
performance perspective before they hit production.
Change-Id: I7bdaab0a0aeb9e1c00fcfcca3d114ae13a76ccc9
All hosts are now running their backups via borg to servers in
vexxhost and rax.ord.
For reference, the servers being backed up at this time are:
borg-ask01
borg-ethercalc02
borg-etherpad01
borg-gitea01
borg-lists
borg-review-dev01
borg-review01
borg-storyboard01
borg-translate01
borg-wiki-update-test
borg-zuul01
This removes the old bup backup hosts, the no-longer-used ansible
roles for the bup backup server and client, and any remaining
bup related configuration.
For simplicity, we will remove any remaining bup cron jobs on the
above servers manually after this merges.
Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c
This checks the backup archives and alerts us if anything seems wrong.
This will take a few hours, so we run it once a week.
Change-Id: I832c0d29a37df94d4bf2704c59bb3f8d855c3cc8
We need to depend on the buildset registry as we are building this image
in a separate job. We also don't need to depend on the build job in
gate; we only need the upload job.
Change-Id: Ie7c2ed29c028f8c23d67ad38edbe04b12e22d026
This change splits our existing system-config-run-review job into two
jobs, one for gerrit 3.2 and another for 3.3. The biggest change is that
we use a var called zuul_test_gerrit_version to select which version we
want and that ends up in the fake group file written out by Zuul for the
nested ansible run. The nested ansible run will then populate the
docker-compose file with the appropriate version for us.
Change-Id: I00b52c0f4aa8df3ecface964007fcf5724887e5e
Gerrit 3.3 has released. Let's start building images for it so that we
can do testing when ready to start that.
We also add testing files to the list of things that trigger the 3.3
builds. Strictly this isn't necessary since the test will continue to
use 3.2 images until we upgrade to 3.3, but this helps us avoid
forgetting to do this when we do upgrade. Running a few extra jobs today
ensures we continue to run the right jobs tomorrow.
Change-Id: Ib7e7d7313e0827a40009df840119444611d74ca2
This adds a Dockerfile to build an opendevorg/refstack image as well as
the jobs to build and publish it.
Change-Id: Icade6c713fa9bf6ab508fd4d8d65debada2ddb30
Add a facility to the borg-backup role to run a command and save the output
of it to a separate archive file during the backup process.
This is mostly useful for database backups. Compressed on-disk logs
are terrible for differential backups because revisions have
essentially no common data. By saving the uncompressed stream
directly from mysqldump, we allow borg the chance to de-duplicate,
saving considerable space on the backup servers.
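
To illustrate the reasoning, here is a toy comparison of how many
chunks two revisions of a dump share, raw versus compressed. It is a
simplification under stated assumptions: it uses fixed-size chunks and
zlib, whereas borg uses content-defined chunking, and the fake dump
rows are made up for the example.

```python
import hashlib
import zlib

CHUNK = 1024  # fixed-size chunks; a simplification of borg's chunker


def chunk_hashes(data):
    """Return the set of hashes of each fixed-size chunk of data."""
    return {hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)}


def shared_fraction(old, new):
    """Fraction of chunks in 'new' that already exist in 'old'."""
    old_chunks, new_chunks = chunk_hashes(old), chunk_hashes(new)
    return len(old_chunks & new_chunks) / len(new_chunks)


def make_dump(changed_row=None):
    """Build a fake SQL dump; optionally flip one value in one row."""
    rows = []
    for i in range(5000):
        status = "B" if i == changed_row else "A"
        rows.append(f"INSERT INTO t VALUES ({i}, 'name-{i}', '{status}');\n")
    return "".join(rows).encode()


old, new = make_dump(), make_dump(changed_row=5)

# Raw streams: only the chunk containing the changed row differs.
raw_shared = shared_fraction(old, new)

# Compressed streams: far fewer chunks, and a change can ripple
# through the rest of the output.
compressed_shared = shared_fraction(zlib.compress(old), zlib.compress(new))
```

Comparing raw_shared with compressed_shared shows how much more of the
uncompressed stream a deduplicating store could reuse.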
This is implemented for our ansible-managed servers currently doing
dumps. We also add it to the testinfra.
This also separates the archive names for the filesystem and stream
backup with unique prefixes so they can be pruned separately.
Otherwise we end up keeping only one of the stream or filesystem
backups which isn't the intention. However, due to issues with
--append-only mode we are not issuing prune commands at this time.
Note the dump commands are updated slightly, particularly with
"--skip-extended-insert" which was suggested by mordred and
significantly improves incremental diff-ability by being slightly more
verbose but keeping much more of the output stable across dumps.
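
A small sketch of why one INSERT per row diffs better than a single
extended INSERT. The dump() helper only mimics the shape of the two
output styles; it is not real mysqldump output, and the table and row
data are invented for the example.

```python
def dump(rows, extended):
    """Mimic the line layout of a dump with or without extended inserts."""
    values = [f"({i},'{name}')" for i, name in rows]
    if extended:
        # All rows packed into one statement on a single line.
        return [f"INSERT INTO t VALUES {','.join(values)};"]
    # One statement (and one line) per row.
    return [f"INSERT INTO t VALUES {v};" for v in values]


def changed_bytes(old_lines, new_lines):
    """Total size of lines in the new dump that were not in the old one."""
    old = set(old_lines)
    return sum(len(line) for line in new_lines if line not in old)


before = [(i, f"user{i}") for i in range(1000)]
after = list(before)
after[500] = (500, "renamed")  # a single row changes between dumps

extended_changed = changed_bytes(dump(before, True), dump(after, True))
per_row_changed = changed_bytes(dump(before, False), dump(after, False))
```

With the extended layout the whole statement counts as changed, while
the per-row layout only changes the one affected line.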
Change-Id: I500062c1c52c74a567621df9aaa716de804ffae7
This reworks the gerrit testing slightly to give some broader
coverage.
It sets up ssh keys for the user; not really necessary but can be
helpful when interacting on a held host.
It sets up groups and verification labels just so Zuul can comment
with -2/+2; again this is not really necessary, but makes things a
little closer to production reality.
We make multiple changes, so we can better test navigating between
them. The change comments are updated to have some randomness in them
so they don't all look the same. We take screenshots of two change
pages to validate the navigation between them.
Change-Id: I60b869e4fdcf8849de836e33db643743128f8a70