2441 Commits

Author SHA1 Message Date
Ian Wienand
66e510f0ee
run-selenium: Use latest tag on firefox image
I'm not sure why I used this tag; I probably copied it from [1] at the
time?  Let's just try latest.

Update matchers so the screenshot jobs run

[1] https://github.com/SeleniumHQ/docker-selenium

Change-Id: I8ea7981dac54883822f3b6076b6f0f564571f018
2022-10-11 10:53:00 +11:00
Zuul
04cfaddece Merge "install-ansible: remove stub install for ARA" 2022-10-10 23:46:52 +00:00
Zuul
fab8810fd6 Merge "install-ansible: remove stevedore workaround" 2022-10-10 23:46:50 +00:00
Zuul
778f9fb523 Merge "install-ansible: remove testinfra version install workaround" 2022-10-10 23:36:56 +00:00
Zuul
d243c2b4ff Merge "Add zuul_console_disable flag to added hosts" 2022-10-10 23:36:40 +00:00
Ian Wienand
1226017574
install-ansible: remove stub install for ARA
The current version of ARA doesn't depend on the "ansible" package any
more, so we don't need this fake stub package installed to fool it.
Remove this workaround.

Change-Id: I9330a22650204464b0b677fca06deec494731641
2022-10-10 16:08:14 +11:00
Ian Wienand
defbc978b9
install-ansible: remove stevedore workaround
stevedore is several releases on from this buggy behaviour, so we can
remove this workaround now.

Change-Id: I466dce31fc3b7370b55f0b03c611460debe92558
2022-10-10 16:08:14 +11:00
Ian Wienand
122644b5c6
install-ansible: remove testinfra version install workaround
This was added with 3cd8cd0765b447049fe57e75c6aa5d0d5c980873 but
ansible-core is a package now.  We can remove this workaround.

Change-Id: Ide1f3bbfe8887315a9f574bb1c19bf3234f58686
2022-10-10 16:08:14 +11:00
Clark Boylan
9a9af41e48 Disable distro cloud image users more forcefully
This updates our user management system to use the userdel --force flag
when disabling and removing distro cloud image users like 'ubuntu',
'centos' and 'admin'. The reason for this is that when we switch from
using the distro user to bootstrap launchnode over to root, the distro
user may still have running processes that prevent userdel from succeeding.
This should address that problem and delete the user anyway.

The last step in the launch node process is to reboot which should clear
out any stale processes.

We don't do this for normal users as they aren't removed at node launch
time and this may be too forceful for them. It would be better for us to
error in that case and clean up any stale processes.
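
A minimal sketch of the distinction described above, written in Python rather than the Ansible this repository uses. The helper name and the hard-coded user tuple are hypothetical; only the `userdel --force` flag and the three user names come from the commit message.

```python
# Distro cloud image users named in the commit message.
DISTRO_USERS = ("ubuntu", "centos", "admin")


def userdel_command(username):
    """Return the userdel argv for a user (hypothetical helper).

    Distro image users get --force so that lingering processes do not
    block removal. Every other user gets a plain userdel, which errors
    out if the account is still in use.
    """
    cmd = ["userdel"]
    if username in DISTRO_USERS:
        cmd.append("--force")
    cmd.append(username)
    return cmd
```

Keeping the forceful path limited to a fixed set of names mirrors the stated intent: normal users should produce an error that an operator can investigate.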

Change-Id: I79caf2a996566ecaec4cb4a70941bb3f03a5fb73
2022-10-03 09:21:42 -07:00
Clark Boylan
2fb310972c Update gitea logs for better request tracing
In gitea 1.14.0 they dropped the macaron http router for go-chi. This
seems to have changed how the request context's RemoteAddr is parsed in
logging. Importantly instead of a valid source port we get :0 which
makes it difficult to trace a connection from apache to gitea.

The origin of this behavior seems to be handling of X-Forwarded-For
headers that apache is setting. To address this we drop those headers
in hopes that gitea will log raw details for the apache -> gitea
connection in that case. Due to not using x-forwarded-for anymore we
need to log the source port that apache is using for the proxy pass
connection which is done by modifying the apache log format.

Change-Id: I1e69431bf703947dc5c223df2a9e1b55bd0d841c
2022-09-30 11:29:58 -07:00
Clark Boylan
fbb55cd9ec Switch back to default gitea access log format
Switch because our overridden format does not seem to do what we expect
anymore. It always shows the port as :0 which is problematic as it
doesn't include necessary info. The update was made to accommodate
macaron, which gitea has apparently moved away from (in favor of go-chi);
it is possible that the default format does what we want now.

Change-Id: If9bcf5cdb11911a46a8fd728346f2f35ffa5e8ba
2022-09-30 11:28:34 -07:00
James E. Blair
bb0dd71c59 Fix jaeger badger config and uid
Apparently "30d" as a ttl value breaks badger, but "720h" is okay.

Also, run the process as the jaeger user.
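
The commit does not say why "30d" fails. One plausible explanation, offered here only as an assumption, is that the value is parsed as a Go-style duration string, where the largest unit is the hour. Under that assumption a day count has to be expressed in hours, as in this small Python sketch (the helper name is hypothetical):

```python
def days_to_hours_ttl(days):
    """Express a TTL given in whole days using an hour suffix.

    Hypothetical helper: 30 days becomes "720h" because 30 * 24 = 720.
    """
    return f"{days * 24}h"
```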

Change-Id: I00ad6b529aa86b0051343f828e05df8212b34656
2022-09-28 14:22:58 -07:00
James E. Blair
8492420407 Correct internal tracing server cert name
We have instructed zuul to connect to tracing.opendev.org, but
we are generating a certificate using opendev-ca with
S=tracing01.opendev.org.  Update the certificate with the correct
subject.

This also corrects the opendev-ca role which assumed that the cert
filename would always be inventory_hostname.

Change-Id: I9b6b0534f058d386e01910bb7efc30312f3d72ad
2022-09-28 10:38:41 -07:00
James E. Blair
7689c561f2 Correct OTLP TLS configuration in jaeger
The env variables we set for the TLS configuration were for the
jaeger native gRPC server rather than the otlp server.

This corrects the settings.

Change-Id: Id8b9157a104af243ee62a9d0aadbe56d09ea4ae5
2022-09-28 09:39:58 -07:00
Dmitriy Rabotyagov
e69e25dc5d Add Ceph Quincy mirror
The latest Ceph release is currently Quincy. To test it in CI without
fetching packages over an external network, this adds the relevant
mirror. At the moment OpenStack-Ansible and Loki use these mirrors.

Change-Id: Icb836b9046571c5f824a3b57dafca05d37f94372
2022-09-27 20:33:33 +00:00
Zuul
84b3eef0bf Merge "Re-expose our Mailman archives.yaml and robots.txt" 2022-09-23 13:19:45 +00:00
James E. Blair
129083b840 Export Zuul traces to Jaeger
This generates TLS certs for Zuul using the jaeger CA and enables
tracing on all Zuul components, exporting to tracing.opendev.org.

Change-Id: I821e5ce4738ea0c93e116684033fa7b78e2da8c6
2022-09-22 15:06:46 -07:00
James E. Blair
11516e0e4b Make zk-ca role more generic
This renames zk-ca to opendev-ca and allows us to operate more than
one ca on bridge.  This way we can keep the CAs for ZooKeeper and
Jaeger distinct (so that a compromise of the jaeger server could not
be used to access the ZooKeeper cluster).

This also starts a new jaeger-ca and uses it on the Jaeger server.

Change-Id: I4e5bc4e3ccd78284ce785c971f7e6ad6e721f887
2022-09-22 15:05:32 -07:00
Jeremy Stanley
087fbd7dd7 Re-expose our Mailman archives.yaml and robots.txt
In switching to all-HTTPS for Mailman sites, it was missed that only
the plain HTTP vhosts set a DocumentRoot of /var/www. This was only
used for publishing metadata so went unnoticed until now. Rather
than add a DocumentRoot to the new HTTPS vhosts, simply use Aliases
to map the specific files we want to expose, for improved clarity
and to make it less likely they'll be overlooked in configuration in
the future.

In order to make sure the archives.yaml file exists at server
creation, before its cronjob fires for the first time, add a direct
invocation of the script which builds it. Move all tasks related to
this after the tasks which create the mailing lists, so that the
generated file will include them. This also simplifies testing.

For the non-multihost configuration, only robots.txt is expected to
be present, so don't add an alias for archives.yaml there.

Also add regression tests to ensure we keep these working.

Change-Id: I6b54b0386f0ea9f888c1f23580ad8698314474b9
2022-09-22 20:10:20 +00:00
Zuul
5e150b7e74 Merge "Add Jaeger tracing server" 2022-09-19 20:51:06 +00:00
Clark Boylan
801d8c2843 Fix jitsi meet jvb connection info and cert CN
This fixes the JVB connection info to use IP addrs instead of names
since nginx can't seem to do name lookups. Additionally, we modify the
cert CN to match the IP address used.

Change-Id: I6bbca44b60559d9586741c6540cb390371e3c120
2022-09-16 15:43:48 -07:00
Zuul
d442287c06 Merge "Update colibri for all the JVBs" 2022-09-16 20:54:02 +00:00
Clark Boylan
fa9aca784d Update colibri for all the JVBs
We are currently running an all in one jitsi meet service at
meetpad.opendev.org due to connectivity issues for colibri websockets to
the jvb servers. Before we open these up we need to configure the http
server for websockets on the jvbs to do tls as they are on different
hosts.

Note it isn't entirely clear yet if a randomly generated keystore is
sufficient for the needs of the jvb colibri websocket system. If not we
may need to convert an LE provisioned cert and key pair into a keystore.

Change-Id: Ifbca19f1c112e30ee45975112863fc808db39fc9
2022-09-16 12:10:00 -07:00
James E. Blair
c661fb0972 Add Jaeger tracing server
Change-Id: I1aa68b1d5f99364fa09776301894b922ed169a3a
2022-09-15 19:21:33 -07:00
Clark Boylan
183e186fe0 Fix error checking with zuul graceful stops
The previous code had attempted to handle the case where the container
isn't running and we exec a zuul graceful stop in the container.
Unfortunately I got the string to check for wrong. I think I must've
checked the `docker exec` output and not the `docker-compose exec`
output.

This change updates the string to match exactly what ansible complained
about:

  TASK [zuul-merger : Gracefully stop Zuul Merger] *******************************
  fatal: [zm05.opendev.org]: FAILED! => {
      "changed": true,
      "cmd": "docker-compose exec merger zuul-merger stop",
      "delta": "0:00:00.573561",
      "end": "2022-09-14 01:59:41.721044",
      "failed_when_result": true,
      "rc": 1,
      "start": "2022-09-14 01:59:41.147483"
  }

  STDERR:

  No container found for merger_1

  MSG:

  non-zero return code

Specifically we check for 'No container found' in stderr.

Change-Id: I737b9da14c210215926804816d1e032540d694dc
2022-09-14 08:12:59 -07:00
Clark Boylan
741f5b333d Fixup zuul merger and executor graceful shutdowns
There are two issues in the zuul merger and executor shutdowns. The
first is that `docker-compose ps -q` will report exited containers
unlike `docker ps -q`. This means we may try to exec into a non running
container which is an error. Handle this by checking the error message
and proceeding if the 'is not running' string is present.

The second issue is a race between stopping a container and running an
exec in the container. If a container stops while an exec is running in
it that exec appears to be treated with some equivalent of kill -9. The
result is the exec returns 137. While theoretically possible for both
executor and merger graceful stop commands, we seem to only hit this with
the merger so we handle exit code 137 for the merger only. This way
we'll get info if the executors start running into this too.
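
The two conditions above can be sketched as a single decision function. This is an illustration in Python, not the Ansible `failed_when` expression the change actually uses; the function and constant names are hypothetical. The value 137 follows the common convention of reporting "killed by signal N" as 128 + N, with N = 9 for SIGKILL.

```python
NOT_RUNNING_MARKER = "is not running"
# "Killed by signal N" is conventionally reported as 128 + N; SIGKILL is 9.
KILLED_RC = 128 + 9


def stop_failed(service, rc, stderr):
    """Decide whether a graceful-stop exec should count as a failure."""
    if rc == 0:
        return False
    # Exec into an exited container: nothing left to stop.
    if NOT_RUNNING_MARKER in stderr:
        return False
    # Container went away mid-exec; tolerated for the merger only.
    if service == "merger" and rc == KILLED_RC:
        return False
    return True
```

Leaving the executor out of the 137 exemption keeps that case visible as a failure, which is the signal the commit message says it wants to preserve.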

Change-Id: Ia6dc2d7e397631d72968ffa89c4492b803c89c47
2022-09-12 09:23:21 -07:00
Clark Boylan
9313c8e879 Fix docker wait requires at least one argument
In the graceful shutdown for mergers and executors if we skip the docker
exec to stop the container we also need to skip the docker wait. The
reason for this is docker wait exits with an error code if not provided
with any arguments to wait for.
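
A hedged sketch of the guard described above, in Python for illustration only. The helper name is hypothetical; the idea is simply to build a `docker wait` command only when there is at least one container ID to pass to it.

```python
def docker_wait_command(container_ids):
    """Return the `docker wait` argv, or None when there is nothing to wait for."""
    # Drop empty strings, e.g. from splitting empty command output.
    ids = [cid for cid in container_ids if cid.strip()]
    if not ids:
        return None
    return ["docker", "wait", *ids]
```

A caller would skip the wait step entirely when this returns None.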

Change-Id: Id09666ee23e1a9599d477b63a89559e4ab1d21bf
2022-09-07 15:17:10 -07:00
Clark Boylan
bc833f1dfd Fix overindented ansible in zuul_reboot.yaml playbook
Ansible syntax got me. When I updated the apt tasks to retry on apt/dpkg
locks I overindented the register, delay, retries, and until parameters.
These belong to the task, not the module.

Change-Id: I955d96b5467597503e0e5563e37ffa736ef2fcdc
2022-09-07 14:03:11 -07:00
Zuul
87f96d3356 Merge "Append zuul reboot logs instead of truncating them" 2022-09-07 20:37:04 +00:00
Clark Boylan
d71b3a798d Handle no running containers during zuul graceful stop
The way the graceful stop tasks are currently written for zuul expects
there to always be a running zuul container to exec into. There are
situations where there may not be a running container in which case
there is nothing to stop. Avoid this being an error by checking if the
containers are running before execing into them. If no containers are
running then we'll noop the docker exec step allowing the rest of the
ansible tasks to continue.

Change-Id: I6c47147a589ae12cc33e37e40e49673396d120f7
2022-09-06 13:10:52 -07:00
Clark Boylan
0c59eff0e8 Retry apt tasks in zuul_reboot if apt lock is held
The zuul_reboot playbook runs on each zuul server at what essentially
become random times based on how long the previous servers took to be
updated. We have seen this result in our apt tasks colliding with
unattended upgrades on the server.

Latest ansible would let us workaround this using the lock_timeout
parameter to the apt module, but the version we use on bridge does not
support this parameter. Instead we check the failure message for
'Failed to lock apt for exclusive operation' and if present we retry. We
wait 30 seconds between retries and will perform up to 40 attempts for a
total of 20 minutes of waiting. This method should also be forward
compatible with new Ansible.

If the lock is held for longer than 20 minutes it likely implies
something has gone wrong and we will need to perform manual intervention
anyway.
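
The retry policy above can be sketched as follows. This is a Python illustration of the logic, not the playbook itself: `task` is a hypothetical callable returning an `(ok, message)` pair, and the `sleep` parameter exists only so the loop can be exercised without waiting. The budget is 40 attempts x 30 seconds = 1200 seconds, i.e. 20 minutes.

```python
import time

APT_LOCK_MESSAGE = "Failed to lock apt for exclusive operation"
RETRIES = 40
DELAY_SECONDS = 30


def run_with_apt_lock_retry(task, retries=RETRIES, delay=DELAY_SECONDS,
                            sleep=time.sleep):
    """Call task() until it stops failing on the apt lock.

    Returns (ok, message, attempts). Failures that do not mention the
    apt lock are returned immediately instead of being retried.
    """
    for attempt in range(1, retries + 1):
        ok, message = task()
        if ok or APT_LOCK_MESSAGE not in message:
            return ok, message, attempt
        if attempt < retries:
            sleep(delay)
    return ok, message, attempt
```

Matching on the message text rather than a module parameter is what keeps this approach independent of the Ansible version in use.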

Change-Id: I3171838a30e3ea496bb08f8b6ab1c95755b2ff3c
2022-09-06 11:51:33 -07:00
Clark Boylan
895dfbe7a4 Append zuul reboot logs instead of truncating them
Normally this isn't an issue because we run logrotate more frequently
than our weekly cron to upgrade and reboot zuul. But if you need to
manually run the playbook and are referring to the crontab entry to
determine how to run the playbook then the resulting command could
truncate a recent run. Simply append to the file in all cases to avoid
this.
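
The difference between truncating and appending is the same as between `>` and `>>` in a shell redirect. The commit does not show the crontab entry, so the following Python sketch is only an analogy; the file name and helper names are hypothetical.

```python
import os
import tempfile


def write_log(path, text, append=True):
    """Write a log entry; mode "a" keeps prior content, mode "w" truncates it."""
    mode = "a" if append else "w"
    with open(path, mode) as handle:
        handle.write(text)


def demo():
    """Return file contents after two appended runs, then after a truncating run."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "zuul_reboot.log")  # hypothetical file name
        write_log(path, "run 1\n")
        write_log(path, "run 2\n")
        with open(path) as handle:
            appended = handle.read()
        write_log(path, "run 3\n", append=False)
        with open(path) as handle:
            truncated = handle.read()
    return appended, truncated
```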

Change-Id: I393741317cccaf447912b1f1517e846c32ee7677
2022-09-04 08:32:46 -07:00
Zuul
17eba9db32 Merge "Pass PUBLIC_URL to jitsi-meet jvb containers" 2022-09-02 23:31:13 +00:00
Jeremy Stanley
080ff3954f Pass PUBLIC_URL to jitsi-meet jvb containers
For some reason, the JVB servers now seem to tell clients to connect
to 8443/tcp on localhost rather than the actual server. It seems it
wants to build the URL based on the PUBLIC_URL envvar, but we
previously did not pass that through to the JVB containers. Add it
to their configuration so they'll have it available.

Change-Id: I10c761105490a72c4eb9ac0b08a304b7d5d1e18c
2022-09-02 21:32:39 +00:00
Zuul
2e441e771f Merge "Move our jitsi-meet interface config to defaults" 2022-09-02 19:16:54 +00:00
Zuul
22eda4e176 Merge "Disable Gravatar in Gitea" 2022-09-02 19:16:53 +00:00
Jeremy Stanley
aeced375fa Move our jitsi-meet interface config to defaults
It appears upstream container init now copies this from defaults,
overwriting our modifications. Shadow the one in the container with
ours so it gets copied into the eventual destination.

Also switch back to the old muting variables we were using before,
since the new "with" bools seem not to work (still worth looking
into later).

Change-Id: I7e91e82e6f91b44c5c7eb1406ba0c64d30e6b8ff
2022-09-02 17:31:57 +00:00
Zuul
c05566557f Merge "Update Jitsi configs to latest upstream samples" 2022-09-02 16:42:35 +00:00
Jeremy Stanley
a2eda2203a Update Jitsi configs to latest upstream samples
Bring our 5 configs into line with current upstream versions
(jitsi-meet_7648 and stable-7648-4 tags from the jitsi-meet and
docker-jitsi-meet repositories respectively). Attempt to preserve
most of our earlier overrides:

 * configure Etherpad integration
 * disable background blurring
 * disable watermarks
 * open shared document on join
 * start with audio and video muted
 * redirect HTTP to HTTPS
 * disable XMPP WebSockets
 * disable P2P connections
 * templated credentials
 * templated unique JVB server identifiers

Drop any options we previously set which later became defaults (like
useRoomAsSharedDocumentName or UTC as the TZ). Identify the upstream
repo and tag on which each file is based. Stop claiming Firefox is
not recommended, now that the default configuration adds a pre-join
page which helps browsers realize they should not treat the audio
stream as unsolicited. Switch to newer vars for muting audio and
video as a boolean rather than at a participant threshold.

Update the docker-compose files to use the stable tag instead of
latest, since upstream seems to just stop refreshing the latest tag
far too often. Clean up extra envvars we were setting for JVB which
we didn't pass through to the containers.

Change-Id: I1e5a3836917f3d90ad7dd1c0771871740fda3cda
2022-09-01 17:41:25 +00:00
Zuul
dcabcd45fa Merge "Revert "Use rackspace mirror to sync centos stream repos"" 2022-09-01 13:17:51 +00:00
e76e0089d1 Revert "Use rackspace mirror to sync centos stream repos"
This reverts commit cc2dd16d3a7194a4185ad6e1da854cb4fde01b1c.

Reason for revert: rax mirrors not synched for 15 hours and causing
issues, facebook mirror is up to date so let's switch to it.

Change-Id: Iaf94540f22e2b49c74ab0704ac94fd1554ce5bbc
Related-Bug: #1988397
2022-09-01 12:09:24 +00:00
Ian Wienand
7a98663678 Add zuul_console_disable flag to added hosts
By setting this variable (added in the dependent change) Zuul's
shell/command override will not write out streaming spool files in
/tmp.  In our case, port 19885 is firewalled off to these hosts, so
they will never be used for streaming results.

Change-Id: Ifbb5b8acb1f231812905cf9643bfec6fbbd08324
Depends-On: https://review.opendev.org/855309
2022-09-01 16:12:58 +10:00
Takashi Kajinami
25ba188137 Update gpg key of puppetlabs repository
The previous GPG key of the puppetlabs repository expired in August
2021[1]. This change updates the key to fix the content sync.

[1] https://puppet.com/blog/updated-puppet-gpg-signing-key-2020-edition/

Change-Id: I1f1c8f1595ee2cc78f85cdbb82b3d90ea3fa762a
2022-08-29 09:38:30 +09:00
Clark Boylan
c01c5c41ce Disable Gravatar in Gitea
We do this to prevent lookups to third parties for information, in this
case avatar info. We should verify this doesn't break local avatar
storage usage, which we manage directly.

Change-Id: I612bf1629bd211ed14203cc9e39f34ba0be041bf
2022-08-25 13:53:55 -07:00
Zuul
bebbe406fd Merge "Update to Gitea 1.17" 2022-08-25 19:24:37 +00:00
Clark Boylan
7f06a0ce2e Update to Gitea 1.17
Please carefully review the changelog:

  https://github.com/go-gitea/gitea/blob/v1.17.1/CHANGELOG.md

and ensure that we've properly addressed the items listed there.

I have listed the breaking changes list here and any actions we've taken
or justification for why they don't affect us:

* Require go1.18 for Gitea 1.17 (#19918)
  We were already using go 1.18.
* Make AppDataPath absolute against the AppWorkPath if it is not (#19815)
  Path is already absolute:
  playbooks/roles/gitea/templates/app.ini.j2:APP_DATA_PATH    = /data/gitea
* Nuke the incorrect permission report on /api/v1/notifications (#19761)
  This has to do with how that api endpoint returns permissions. We
  don't use this anywhere as far as I can tell.
* Refactor git module, make Gitea use internal git config (#19732)
  In the gitea container /data/git/.gitconfig is present but we don't
  appear to manage this in system-config. I think that means this
  change is a noop for us as gitea will move its managed .gitconfig
  from /data/git/.gitconfig to /data/git/repositories/.gitconfig.
  I expect the contents to be the same since gitea must be managing
  the file's content today.
* Remove RequireHighlightJS field, update plantuml example. (#19615)
  This was a flag that toggled syntax highlighting on and off as best
  as I can tell. The default is to just have it turned on and we don't
  check the flag in any of our templates.
* Increase minimal required git version to 2.0 (#19577)
  Debian Bullseye ships with 2.30.2-1.
* Add a directory prefix gitea-src-VERSION to release-tar-file (#19396)
  They were tarbombing people and their tarballs extracted into the
  current dir. They now no longer do that. We build from git so this
  doesn't affect us.
* Use "main" as default branch name (#19354)
  We explicitly set the default branch name to master for both gitea and
  gerrit. This should be a noop for us. Testing has been added to check
  this.
  https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gitea-git-repos/library/gitea_create_repos.py#L129-L132
  https://opendev.org/opendev/jeepyb/src/branch/master/jeepyb/cmd/manage_projects.py#L488
* Make cron task no notice on success (#19221)
  I'm not aware of us relying on any cron tasks or any cron task
  notifications.
* Add pam account authorization check (#19040)
  We don't integrate with pam so the change in behavior to check
  authorization does not affect us.
* Show messages for users if the ROOT_URL is wrong, show JavaScript errors (#18971)
  This message shows up in CI because ROOT_URL is https://opendev.org
  but we access gitea in testing via localhost. I don't think this
  is worth fixing. It's a good reminder that the instance is a test
  instance.
* Refactor mirror code & fix StartToMirror (#18904)
  We don't mirror repos with gitea. Should be a noop for us.
* Remove deprecated SSH ciphers from default (#18697)
  hmac-sha1-96, diffie-hellman-group1-sha1, and arcfour{128,256} are
  removed. The only ssh user is gerrit's replication. MINA should
  be able to support more modern ciphers and be fine.
* Add the possibility to allow the user to have a favicon which differs from the main logo (#18542)
  Previously, logo.svg was used as the favicon.svg and gitea only fell
  back to favicon.png if the browser couldn't do the .svg. But now they
  want to support users having different logo.svg and favicon.svg. This
  necessitates explicitly adding a favicon.svg. Something we already do.
  Details at https://github.com/go-gitea/gitea/pull/18542
* Update reserved usernames list (#18438)
  This shouldn't be a problem for us as we don't have regular users and
  gerrit is not a reserved name.
* Support custom ACME provider (#18340)
  We run ACME with LE out of band. This doesn't affect us.
* Change initial TrustModel to committer (#18335)
  This changes the signed commits trust model from collaborator
  to committer. This won't affect us as we aren't maintaining trusted
  keys. But basically this now shows if the signed commit by the
  committer matches the committer's key.
* Update HTTP status codes (#18063)
  This changed redirect HTTP codes from 302 to 307. Shouldn't
  affect us.
* Upgrade Alpine from 3.13 to 3.15 (#18050)
  We build on Debian and not alpine. The alpine nodejs version did
  change from 14 to 16 in this change and we've updated to match.
* Restrict email address validation (#17688)
  If we had real users this may pose a problem as they are limiting
  the set of emails gitea would accept to a smaller set than they
  accepted before. Also fewer than actually allowed by email. But
  we don't have real users so this should be fine.
* Refactor Router Logger (#17308)
  This streamlines and improves the log format of some of the gitea
  logs. We aren't automatically processing these logs today so this
  shouldn't have a major impact on us.

Additionally this release adds a new git.HOME_PATH setting to set the
location for writing out git configs and potential gnupg configs. We
should be fine to let gitea write this content out to the default path,
but there is potential for this to impact our ssh daemon.

Changes made include:

 * Minimal updates to web templates to match 1.17
 * Updating nodejs to v16 as v14 failed to build gitea
 * Disabling the new enabled by default "packages" feature
 * New test to check repos have a master branch by default instead of
   Gitea's new default of main.

Change-Id: I88105eccd118e3daca72f0b86a6b351c35e37413
2022-08-18 14:12:30 -07:00
Clark Boylan
5f0718b3b5 Increase the number of Gerrit threads for http requests
We've seen CI systems consume all of our threads which causes the web UI
to become non responsive. To address this increase the number of httpd
threads from 100 to 150. Note that we do not modify sshd.threads because
sshd.threads determines the max number of git requests across both ssh
and http.

In theory what this means is that httpd has an additional 50 threads to
process non git requests (for example web UI requests) which will
hopefully keep that responsive even if git requests are max'd out.

It is possible that we also need to increase the sshd.threads value to
handle those git requests, but we will start by modifying one config
value at a time. If we do bump sshd.threads we should increase
httpd.maxThreads to give it that additional headroom.

Finally, I believe this is likely to be safe as we doubled the size of
our Gerrit server when we moved it to vexxhost. The old server was
pretty well maxed out though so increase these values on the new server
slowly and monitor the results.

Details on the configuration can be found at:

  https://gerrit-review.googlesource.com/Documentation/config-gerrit.html#httpd

Change-Id: I57a1e248c3c01597bb29c7afc304688e834a64cc
2022-08-17 10:34:47 -07:00
Dr. Jens Harbott
08cd3e50e0 reprepro: mirror Ubuntu UCA Zed for Jammy
Change-Id: I542930c349992bd54dc103a92c7366ae060335aa
2022-08-15 19:27:44 +02:00
Zuul
593d9f204e Merge "Use rackspace mirror to sync centos stream repos" 2022-08-12 00:17:44 +00:00
Zuul
74389454ce Merge "system-config-run-borg-backup: rename hosts to distro" 2022-08-11 23:57:30 +00:00