We install docker-compose from pypi in order to get newer features
(particularly useful for gerrit). On x86 all the deps for this have
wheels and we don't need build deps but on arm64 wheels don't exist for
things like cffi. Add build-essential, python3-dev, libffi-dev, and
libssl-dev to ensure we can build the necessary deps to install
docker-compose on arm64.
Change-Id: Id9c61dc904d34d2f7cbe17c70ad736a9562bb923
This server is going to be our new arm64 nodepool-builder running on the
new arm64 docker images for nodepool.
Depends-On: https://review.opendev.org/750037
Change-Id: I3b46ff901eb92c7f09b79c22441c3f80bc6f9d15
Modules are collected on bridge and then synchronized to remote hosts
where puppet is run. This is done to ensure an atomic run of puppet
across affected hosts.
These modules are described in modules.env and cloned by
install_modules.sh. Currently this is done in install-ansible, but
after some recent refactoring
(I3b1cea5a25974f56ea9202e252af7b8420f4adc9) the best home for it
appears to now be in puppet-setup-ansible, just before the script is
run.
Change-Id: I4b1d709d7037e2851d73be4bc7a202f52858ad4f
It turns out you can't use "run_once" with the "free" strategy in
Ansible. It actually warns you about this, if you're looking in the
right place.
The existing run-puppet role calls two things with "run_once:", both
delegated to localhost -- cloning the ansible-role-puppet repo (so we
can include_role: puppet) and installing the puppet modules (via
install-ansible-roles role), which are copied from bridge to the
remote side and run by ansible-role-puppet.
With remote_puppet_else.yaml we are running all the puppet hosts at
once with the "free" strategy. This means that these two tasks, both
delegated to localhost (bridge) are actually running for every host.
install-ansible-roles does a git clone, and thus we often see one of
the clones bailing out with a git locking error, because the other
host is running simultaneously.
I8585a1af2dcc294c0e61fc45d9febb044e42151d tried to stop this with
"run_once:" -- but as noted because it's running under the "free"
strategy this is silently ignored.
To get around this, split out the two copying steps into a new role
"puppet-setup". To maintain the namespace, the "run-puppet" role is
renamed to "puppet-run". Before each call of (now) "puppet-run", make
sure we run "puppet-setup" just on localhost.
Remove the run_once and delegation on "install-ansible-roles", because
this is now called from the playbook with localhost context.
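
The ordering described above can be sketched in Python, purely as an
illustration: the shared setup step runs exactly once before the per-host
work is fanned out in parallel, so concurrent hosts never race on the same
git clone. The function names (`puppet_setup`, `puppet_run`, `run_all`) and
the host names are hypothetical stand-ins, not the actual roles or playbooks.

```python
from concurrent.futures import ThreadPoolExecutor

setup_calls = []  # records every time the shared setup step runs


def puppet_setup():
    """Hypothetical stand-in for the localhost-only "puppet-setup" step."""
    setup_calls.append("setup")


def puppet_run(host):
    """Hypothetical stand-in for the per-host "puppet-run" step."""
    return f"{host}: puppet applied"


def run_all(hosts):
    # Run the shared setup once, up front, instead of once per host.
    puppet_setup()
    # Then fan out to all hosts in parallel, loosely like the "free" strategy.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(puppet_run, hosts))


results = run_all(["host-a.example.org", "host-b.example.org"])
```

Because the setup call sits outside the parallel section, it does not rely
on any per-task "run once" guarantee from the scheduler.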
Change-Id: I3b1cea5a25974f56ea9202e252af7b8420f4adc9
Limestone has updated their self signed cert and in order to properly
verify it we need to update the cert material to check against itself.
Maybe we should confirm with logan- that the new cert material looks
correct before landing this just to be sure we're trusting the correct
thing.
Change-Id: Id528716aecb45ffb263850f697c5fb22db3b7969
I forgot in I5b7106e2263010ff353e8a1de43e73b0c0ec57e1 this is a new
mirror, which needs the LE bits setup.
Change-Id: I3109573b2b03453049a265a829445f88f8a87557
We are having constant issues with the bionic arm64 mirror shutting
itself off. Before we go too far down the path of debugging what
appears to be a kernel oops issue, let's rebuild it as focal.
Update the sources list in the base. Update the testing to use a
focal node.
Change-Id: I5b7106e2263010ff353e8a1de43e73b0c0ec57e1
In our beaker rspec testing we ssh into localhost pretending it is a
managed VM because that is how all the config management testing tools
want to work... This has run into problems with the new-format ssh keys
which zuul provides. If such a key is present we convert it to PEM;
otherwise we generate our own.
Also add ensure-virtualenv to the job as we appear to need it to run
these tests properly.
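
A minimal sketch of the convert-or-generate decision, assuming the key
handling is done with ssh-keygen. The helper name `ssh_key_command` and the
key paths are hypothetical; the function only builds the command line and
does not run it.

```python
import os


def ssh_key_command(key_path):
    """Return the ssh-keygen argv for preparing a PEM key at key_path."""
    if os.path.exists(key_path):
        # A key is already present: rewrite it in place in PEM format,
        # keeping an empty passphrase.
        return ["ssh-keygen", "-p", "-P", "", "-N", "", "-m", "PEM",
                "-f", key_path]
    # No key present: generate a fresh RSA key directly in PEM format.
    return ["ssh-keygen", "-t", "rsa", "-m", "PEM", "-N", "", "-f", key_path]


# Hypothetical path, used here only to show the "generate" branch.
generate_cmd = ssh_key_command("/nonexistent-dir/id_rsa")
```

The returned list could be handed to `subprocess.run(cmd, check=True)` by a
caller that actually wants to execute it.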
Change-Id: Ibb6080b5a321a6955866ef9b847c4d00da17f427
The pubmirror[12].math.uh.edu mirrors of Fedora 31 updates for
x86_64 have been sitting stale for several days with a corrupt
index, causing jobs which access our copy of this from our mirror
network to fail. Instead mirror Fedora releases/updates from
mirrors.mit.edu which seems to be updating just fine currently. We
can switch this back if/when the situation with the uh.edu mirrors
is resolved.
We're continuing to mirror EPEL and Fedora Atomic from
pubmirror[12].math.uh.edu for now, as we've had no reports of EPEL
problems on our mirrors (yet anyway), and it's hard to find any
other rsync mirrors of Atomic.
Change-Id: Iefd02602e2f2b39c4b72dc4d95ac62993ca65cdd
Change restart mode to always instead of 'no' as testing shows we won't
restart in a loop in CI and we want production to restart automatically.
Also add ssh pubkey contents for completeness and simplicity if we need
to find those in the future.
Change-Id: I81573a1ad1574419194eb3088070dda95fb81fff
This new ansible role deploys gerritbot with docker-compose on
eavesdrop.openstack.org. This way we can run it where the other bots
live.
Testing is rudimentary for now as we don't really want to connect to a
production gerrit and freenode. We check things the best we can.
We will want to coordinate deployment of this change with disabling the
running service on the gerrit server.
Depends-On: https://review.opendev.org/745240
Change-Id: I008992978791ff0a38f92fb4bc529ff643f01dd6
We are running with the default upload workers count of 4 which is half
of our previous ansible/puppet value of 8 (we have 8 vcpus on these
servers). Increase the worker count to 8 to improve upload rate.
Change-Id: I3c051968acc8c32711cd7063469d4a80077ba587
The default indexer timeout is 30 seconds. During a recent gitea restart
gitea01 hit this timeout five times: 150 seconds. Increase the timeout
to double that value: 300 seconds.
This is important to ensure that our graceful restarts are in fact
graceful. We don't want the sshd container running while web is being
restarted multiple times. Doing so can lead to lost replication events
from gerrit.
Change-Id: I1f9253ccd6fbb055f848e186f478651454fee7e0
We remove old git web server env vars from the apache config and add
comments to our /p/ handling to describe the need for further cleanup
when Gerrit is upgraded.
Change-Id: I79fc130dec0a8b00706c0ec0f8fcab4d867e34d1
Gerrit is repurposing the /p/ path for project dashboard under
polygerrit. We use this path for Git mirrors. To resolve this, let's
disable the /p/ path now so that when it is used for project dashboards
users won't be as confused.
This has the added benefit of reducing the number of mirrors we need to
manage which makes managing branches in the mirrors simpler.
Change-Id: I9ebca2049a4a0707ecfbaecd92e42ebc1e6c3f87
Add an override to the systemd configuration for the docker service
unit so that it won't start until after openafs-client is started
and /afs is mounted (the latter because we don't know if the
initscript will possibly return early). Without this, it's a race to
see whether the container will have a working /afs mount, so can
lead to jobs failing to write into AFS with cryptic permissions
errors.
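
As a rough illustration of such an override, the snippet below renders a
drop-in with an `After=` ordering. The unit names `openafs-client.service`
and `afs.mount`, and the helper `docker_override`, are assumptions made for
this sketch; whether the real override also needs `Wants=` or `Requires=`
is not stated here.

```python
def docker_override(after_units):
    """Render a systemd drop-in ordering docker after the given units."""
    units = " ".join(after_units)
    return f"[Unit]\nAfter={units}\n"


# Assumed unit names: the openafs client service and the /afs mount unit.
override = docker_override(["openafs-client.service", "afs.mount"])
print(override)
```

The rendered text would typically live in a drop-in file under the docker
service's override directory, followed by a systemd daemon reload.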
Change-Id: Ie00b1c1bc9c330e2af28c59b3b07a7c244c912dc
We need to add the host (and possibly the ssh host key, so it's here too) in
this playbook because the add_host from the base-jobs side is only
applicable to the playbook running in base-jobs. When we start our
playbook here that state is lost. Simple fix, just add_host it again.
Change-Id: Iee60d04f0232500be745a7a8ca0eac4a6202063d
We can't run ARA on the executor because that involves running
arbitrary commands. Instead, generate reports on the executor and put
them where the normal fetch-output will find them later.
Change-Id: I20d88a7f03872d19f6bd014bc687a1bf16e4e80e
This uses a new base job which handles pushing the git repos on to
bridge since that must now happen in a trusted playbook.
Depends-On: https://review.opendev.org/742934
Change-Id: Ie6d0668f83af801c0c0e920b676f2f49e19c59f6
This reverts commit 05021f11a29a0213c5aecddf8e7b907b7834214a.
This switches Zuul and Nodepool to use Zookeeper TLS. The ZK
cluster is already listening on both ports.
Change-Id: I03d28fb75610fbf5221eeee28699e4bd6f1157ea
Fedora 33 is not released yet and the TripleO team would
like to perform some tests on that image.
Change-Id: I39f6bedadc12277739292cf31cc601bc3b6e30ec
Note this shouldn't be used until we can configure Gerrit to do similar
with jeepyb. Otherwise we'll end up with mismatched branches between our
canonical source (Gerrit) and our mirrors (Gitea).
Change-Id: I8d353cbc90c2d354e7cdebfc4e247f3f73d97d86
Specifying the family stops a deprecation warning being output.
Add an HTML report and report it as an artifact as well; this is easier
to read.
Change-Id: I2bd6505c19cee2d51e9af27e9344cfe2e1110572
Builds running on the new container-based executors started failing to
connect to remote hosts with
Load key "/root/.ssh/id_rsa": invalid format
It turns out the new executor is writing keys in OpenSSH format,
rather than the older PEM format. And it seems that the OpenSSH
format is more picky about having a trailing space after the
-----END OPENSSH PRIVATE KEY-----
bit of the id_rsa file. By default, the file lookup runs an rstrip on
the incoming file to remove the trailing space. Turn that off so we
generate a valid key.
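
The effect can be reproduced in plain Python. This is only a sketch,
assuming the trailing whitespace in question is the final newline; the
`load_key` helper is hypothetical and merely mimics a lookup with an
optional rstrip, and the key body is a placeholder.

```python
def load_key(text, rstrip=True):
    """Mimic a file lookup that optionally strips trailing whitespace."""
    return text.rstrip() if rstrip else text


# Placeholder key material; the base64 body is deliberately elided.
key = (
    "-----BEGIN OPENSSH PRIVATE KEY-----\n"
    "(base64 body elided)\n"
    "-----END OPENSSH PRIVATE KEY-----\n"
)

stripped = load_key(key)               # default: trailing newline is lost
intact = load_key(key, rstrip=False)   # rstrip off: content is unchanged
```

With the rstrip disabled, the written file ends exactly as the source key
does, including the whitespace after the END marker.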
Change-Id: I49bb255f359bd595e1b88eda890d04cb18205b6e
The host is review-test.opendev.org, so hostvars for
review-test.openstack.org are not so much going to do anything.
It's easier if we just ssh as root from review to gerrit2
on review-test.
review-test needs to be in letsencrypt group and have a
handler.
We need to install mysql - it's on the existing review
servers but not in ansible, it's just left over from
puppet.
The db credentials are in /root/.gerrit_db.cnf
Change-Id: I90e3c9d1b398cc16fea9f7056cfb059c7140160e
I476674036748d284b9f51e30cc2ffc9650a50541 did not open port 3081 so
the proxy isn't visible. Also this group variable is a better place
to update the setting.
Change-Id: Iad0696221bb9a19852e4ce7cbe06b06ab360cf11
We have decided to go with the layer 7 reject rules; enable the
reverse proxy for production hosts.
Change-Id: I476674036748d284b9f51e30cc2ffc9650a50541