Zuul change I6d7e7e7a9e19d46a744f9ffac8d532fc6b4bba01 introduced a
multi-line formatter that makes exceptions and other multi-line output
much easier to follow in the logs. Use it here for the simple
formatter in the production Zuul deployment.
Change-Id: I9a8aad8a90f5f4080cdb872d0ed65697a180f57c
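The idea behind such a multi-line formatter can be sketched as follows. This is an illustrative approximation, not Zuul's actual implementation; the class name and format string are hypothetical:

```python
import logging

class MultiLineFormatter(logging.Formatter):
    """Sketch of a formatter that repeats the log header (timestamp,
    level, logger name) on every continuation line, so tracebacks and
    other multi-line output stay aligned and greppable in the logs."""

    def format(self, record):
        formatted = super().format(record)
        if '\n' not in formatted:
            return formatted
        first, rest = formatted.split('\n', 1)
        # The header is everything on the first line before the
        # record's own first message line.
        msg_first_line = record.getMessage().split('\n', 1)[0]
        idx = first.rfind(msg_first_line)
        header = first[:idx] if idx >= 0 else ''
        return '\n'.join([first] + [header + line
                                    for line in rest.split('\n')])
```

With this in place, each line of a traceback carries the same prefix as the first line, so filtering the log for one logger still shows whole exceptions.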
This adds a zuul-client config file, as well as a convenience script
to execute the docker container, to the schedulers.
Change-Id: Ief167c6b7f0407f5eaebecde552e8d91eb3d4ab9
Zuul is changing the way its key management system works, from implicit
"backups" to explicit exports that can be used for backups. Additionally,
to rename projects we will need to update those keys in zk, which can be
done with the copy and delete commands. We update the rename playbook to
use these.
Depends-On: https://review.opendev.org/c/zuul/zuul/+/803973
Change-Id: I2ba8015392f22ea615bcba7fb0d73a138dc77034
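The rename flow described above can be sketched as Ansible tasks along these lines; the task names, compose file path, and exact CLI invocations are assumptions, not the playbook's actual contents:

```yaml
# Sketch: move a project's keys in zk as part of a rename.
- name: Copy project keys to the new project name
  shell: >-
    docker-compose -f /etc/zuul-scheduler/docker-compose.yaml
    exec -T scheduler
    zuul copy-keys {{ tenant }} {{ old_project }} {{ tenant }} {{ new_project }}

- name: Delete the keys stored under the old project name
  shell: >-
    docker-compose -f /etc/zuul-scheduler/docker-compose.yaml
    exec -T scheduler
    zuul delete-keys {{ tenant }} {{ old_project }}
```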
This removes the kata-containers tenant backup entry as that tenant no
longer exists. We also add status json backups for the opendev,
vexxhost, zuul, pyca, and pypa tenants. This gets us in sync with the
current tenant list.
Change-Id: I8527676dda67915e6ebe0d1c5fde7a57a7ac2e5b
This fixes the zuul debug log's logrotate filename. We also increase the
rotation count to 30 daily logs for all zuul processes on the scheduler
(this matches the old server).
We also create a /var/lib/zuul/backup dir so that status.json backups
have a location they can write to. We do this in the base zuul role
which means all zuul servers will get this dir. It doesn't currently
conflict with any of the cluster members' /var/lib/zuul contents so
should be fine.
Change-Id: I4709e3c7e542781a65ae24c1f05a32444026fd26
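The rotation policy above looks roughly like this in logrotate terms; the log path here is illustrative, not the exact filename from the fix:

```
/var/log/zuul/debug.log {
    daily
    rotate 30
    compress
    missingok
    notifempty
}
```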
We found a bug in master which prevents us from merging a fix;
downgrade the scheduler to 4.1.0 so we can get that fix in.
Change-Id: Ie9ad75177ab58b34e20cafab496ba7af6f082551
So we can stop/pull/start, move the pull tasks to their own files
and add a playbook that invokes them.
Change-Id: I4f351c1d28e5e4606e0a778e545a3a805525ac71
This avoids the conflict with the zuul user (1000) on the test
nodes. The executor will continue to use the default username
of 'zuul' as the ansible_user in the inventory.
This change also touches the zk and nodepool deployment to use
variables for the usernames and uids to make changes like this
easier. No changes are intended there.
Change-Id: Ib8cef6b7889b23ddc65a07bcba29c21a36e3dcb5
We noticed that our zuul scheduler was running out of disk, and one of
the causes is that we pull all of the wonderful new zuul images without
pruning them. This happens because we were only pruning when
(re)starting services, and we don't restart automatically with Zuul.
Address this by always pruning after pulling even if we don't restart
services. This should be safe because prune will leave the latest tagged
images as well as the running images.
This should keep our disk consumption down.
Change-Id: Ibdd22ac42d86781f1e87c3d11e05fd8f99677167
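A minimal sketch of the prune-after-pull step; the task names and compose file path are illustrative:

```yaml
- name: Pull new zuul images
  shell: docker-compose -f /etc/zuul-scheduler/docker-compose.yaml pull

# Always prune, even when we are not restarting services. This is safe
# because prune keeps the latest tagged images and any running images.
- name: Prune unused docker images
  command: docker image prune -f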
We're running these in containers now, so do not try to start them the
old way.
The failed_when: false is needed because we can't disable the old
service in the gate when there is no service file installed.
Change-Id: Ia4560f385fc98e23f987a67a1dfa60c3188816b6
We use the zuul_scheduler_start flag to determine if we want to start
the zuul-scheduler when new containers show up. Unfortunately we weren't
setting zuul_scheduler_start in prod so we failed with this error:
error while evaluating conditional (zuul_scheduler_start | bool): 'zuul_scheduler_start' is undefined
Fix this by treating an unset var as equivalent to a truthy value. We
do this instead of always setting the var to false in prod because it
simplifies testing.
Change-Id: I1f1a86e80199601646c7f2dec2a91c5d65d77231
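The fix amounts to defaulting the flag to true inside the conditional, roughly like this (the task itself is illustrative):

```yaml
- name: Start zuul-scheduler
  shell: docker-compose -f /etc/zuul-scheduler/docker-compose.yaml up -d
  # An unset zuul_scheduler_start now behaves as if it were set true.
  when: zuul_scheduler_start | default(true) | bool
```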
If we need to start and stop, it's best to use playbooks.
We already have tasks files with start commands in each role,
so put the stop commands into similar task files.
Make the restart playbook import_playbook the stop and start
playbooks to reduce divergence.
Use the graceful shutdown pattern from the gerrit docker-compose
to stop the zuul scheduler.
Change-Id: Ia20124553821f4b41186bce6ba2bff6ca2333a99
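Composing the restart playbook from the stop and start playbooks can be sketched as follows; the file names are assumptions:

```yaml
# zuul_scheduler_restart.yaml: reuse the existing stop and start
# playbooks instead of duplicating their tasks, so the three
# playbooks cannot diverge.
- import_playbook: zuul_scheduler_stop.yaml
- import_playbook: zuul_scheduler_start.yaml
```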
We don't want to HUP all the processes in the container, we just
want zuul to reconfigure. Use the smart-reconfigure command.
Also - start the scheduler in the gate job.
Change-Id: I66754ed168165d2444930ab1110e95316f7307a7
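The reload then becomes an exec of the scheduler's own smart-reconfigure subcommand rather than a HUP signal to the whole container. A sketch, with an assumed compose file path and service name:

```yaml
- name: Reconfigure zuul
  shell: >-
    docker-compose -f /etc/zuul-scheduler/docker-compose.yaml
    exec -T scheduler zuul-scheduler smart-reconfigure
```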
Zuul is publishing lovely container images, so we should
go ahead and start using them.
We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.
Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.
Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40