1683 Commits

Author SHA1 Message Date
Zuul
bf8e8916aa Merge "Enable jitsi-meet xmpp websockets" 2021-03-18 22:09:12 +00:00
Clark Boylan
0aa838ce16 Fix jitsi config.js
There were : used when we should have used =. Fix this.

Change-Id: Icb1e04d6e6e27726a12a1e49d53d9eb7c88d1a01
2021-03-18 13:43:39 -07:00
Clark Boylan
f64b935778 Enable jitsi-meet xmpp websockets
This switches desktop clients to using xmpp over websockets instead of
BOSH. The mobile clients continue to use BOSH. Apparently this provides
better performance and is the default expectation of the upstream docker
images. We had previously disabled it to get back to a working state when we
weren't setting this variable at all.

After looking at configs on the docker images I expect that enabling
this explicitly will work (the problem before was we neither enabled nor
disabled it and the images can't handle that case). If that isn't the
case we can always revert.

Change-Id: I59c9fe75a0860782beb8864ff3bd9622b35381c1
2021-03-18 11:59:14 -07:00
Clark Boylan
55f38141c5 More jitsi meet config cleanups
This removes an unused letsencrypt dir bind mount for jitsi-meet web
that was causing confusion (we run letsencrypt out of band and put the
certs in the correct dir so we don't need this specific bind mount).

We also remove the now unused config.js config file from the role.

We stop managing the default nginx config and instead rely on the
container provided template. To properly configure http redirects we
set the ENABLE_HTTP_REDIRECT flag in the env var file.

Finally we update the README file with a bit more info on how this all
works.

Change-Id: Iecb68c9855b5627d25f8bb586b0e6f366f1c80ab
2021-03-18 11:55:02 -07:00
Zuul
bb1b98623d Merge "Restore meetpad etherpad settings." 2021-03-18 18:04:35 +00:00
Zuul
c600c4a2cc Merge "Restore some meetpad settings we had previously set" 2021-03-18 16:59:09 +00:00
Zuul
4302bf3585 Merge "Manage jitsi-meet meet.conf as a template input for the container" 2021-03-18 15:54:14 +00:00
Zuul
99a05bdf75 Merge "Add kerberos-client group" 2021-03-18 02:43:59 +00:00
Ian Wienand
dc827de23d Add kerberos-client group
We duplicate the KDC settings over all our kerberos clients.  Add
clients to a "kerberos-client" group and set the variables in a group
file.

Change-Id: I25ed5f8c68065060205dfbb634c6558488003a38
2021-03-18 11:59:30 +11:00
Zuul
47fa6e0382 Merge "Add zookeeper-statsd" 2021-03-18 00:08:12 +00:00
Clark Boylan
c1bb5b52cf Restore meetpad etherpad settings.
This restores useRoomAsSharedDocumentName and openSharedDocumentOnJoin
config settings in our jitsi meet config.js. We had lost these settings
in the recent jitsi meet web container update. To restore them we
provide an alternative settings-config.js template to the container so
that when it generates its configs we get these vars included.

We stop managing the config.js file in /var/jitsi-meet/web to avoid
confusion with ansible replacing configs that may be used then.

Change-Id: I4d2bd77e03812695792cda2abb7f401288186f2c
2021-03-17 15:04:16 -07:00
James E. Blair
96bac7b486 Add zookeeper-statsd
This adds a program, zookeeper-statsd, which monitors zookeeper
metrics and reports them to statsd.  It also adds a container to
run that program.  And it runs the container on each of the
ZooKeeper quorum members.  And it updates the graphite host to
allow statsd traffic from quorum members.  And it updates the
4-letter-word whitelist to allow the mntr command (which is used
to gather metrics) to be issued.
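
As an illustration of the mntr-to-statsd flow described above, here is a
minimal sketch. It is not the zookeeper-statsd program itself. It assumes
mntr returns tab-separated key/value lines and that metrics are reported as
statsd gauges; the function names and the "zk.host1" prefix are hypothetical.

```python
def _number(value):
    """Return value as int or float, or None if it is not numeric."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return None


def parse_mntr(text):
    """Parse mntr output (assumed 'key<TAB>value' per line) into a dict."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # not a key/value line
        number = _number(value.strip())
        if number is not None:
            # Non-numeric entries such as the version string are skipped.
            metrics[key] = number
    return metrics


def to_statsd_gauges(metrics, prefix):
    """Format metrics as statsd gauge lines: '<prefix>.<key>:<value>|g'."""
    return [f"{prefix}.{key}:{value}|g" for key, value in sorted(metrics.items())]


# Hypothetical sample input, shaped like mntr output.
sample = "zk_version\t3.6.2\nzk_avg_latency\t0\nzk_server_state\tfollower\nzk_znode_count\t1234567\n"
lines = to_statsd_gauges(parse_mntr(sample), "zk.host1")
```

Each resulting line could then be sent to the statsd port over UDP.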

Change-Id: I298f0b13a05cc615d8496edd4622438507fc5423
2021-03-17 14:52:31 -07:00
Clark Boylan
d410b43b59 Restore some meetpad settings we had previously set
This starts conferences with participant video muted and disables p2p
connections for calls between two participants. We had these settings
before but the recent container image update undid them.

Change-Id: I4152ace083f79116758020fbbcbbb96e39eef9ed
2021-03-17 13:24:38 -07:00
Clark Boylan
2ac695f628 Manage jitsi-meet meet.conf as a template input for the container
The jitsi meet containers want to generate configuration from the
templates found in /defaults on the container to config files in the
bind mounted /config (/var/jitsi-meet/ on the host side). This means
that the configs ansible is writing to /var/jitsi-meet are completely
ignored and overwritten by the container using its templating system and
env vars.

This is causing us problems because we would like to use a different
etherpad proxy config in nginx to ensure the Host header is set
properly. To make this happen we bind mount in our own template file so
that the container can template what we want rather than what is found
in the image.

Change-Id: Ifdde66a01bb7e632fc19ca0a512216584f1ea9f0
2021-03-17 13:03:09 -07:00
Clark Boylan
75a64427a1 Improve meetpad env options for templating
The PUBLIC_URL is quoted which results in quotes ending up in our config
breaking etherpad base url setting in config.js. We remove the quotes as
they are not necessary.

We also remove the /p/ suffix from ETHERPAD_URL_BASE as this causes the
proxying to send extra /p/s to etherpad which results in problems.

Note these fixes appear to be necessary but are not sufficient to have
working meetpad proxying of etherpad. We also need to fix the nginx
meet.conf proxy settings to send valid Host headers. A followup change
will attempt to address that.

Change-Id: I0f59339a33267468ad5481858507a43cefa0021d
2021-03-17 12:47:43 -07:00
Clark Boylan
7b87c7c305 Disable xmpp websocket in jitsi meet config
We unforked our jitsi web container and discovered that etherpad doc
embedding was broken. In the process of debugging this the jitsi meet
services on meetpad were restarted which pulled in newer configs which
expect ENABLE_XMPP_WEBSOCKET to be enabled by default. Unfortunately
this wasn't quite working for us. Explicitly disabling this seems to
make audio and video calling work again. But doc sharing isn't even
attempted now.

Let's get this fix in as audio and video are important then we'll keep
debugging the etherpad doc sharing problem.

https://github.com/jitsi/docker-jitsi-meet/issues/902 has details from
others that hit this problem.

Note that part of the issue here seems to be that nginx is using the
default configs in the container found at /defaults and not the configs
we bind mount at /config. This at least seems to be why the proxying for
etherpad documents is broken.

Change-Id: I03fa9d331e6825b3b953a3573c0dd43c7be478a4
2021-03-17 11:38:56 -07:00
Zuul
77b1c14a9a Merge "Use upstream jitsi-meet web image" 2021-03-17 00:22:50 +00:00
Zuul
4524a92caf Merge "kerberos-kdc: role to manage Kerberos KDC servers" 2021-03-16 22:28:46 +00:00
Zuul
b133afedfd Merge "refstack: cleanup old puppet" 2021-03-16 22:21:03 +00:00
Ian Wienand
c1aff2ed38 kerberos-kdc: role to manage Kerberos KDC servers
This adds a role and related testing to manage our Kerberos KDC
servers, intended to replace the puppet modules currently performing
this task.

This role automates realm creation, initial setup, key material
distribution and replica host configuration.  None of this is intended
to run on the production servers which are already setup with an
active database, and the role should be effectively idempotent in
production.

Note that this does not yet switch the production servers into the new
groups; this can be done in a separate step under controlled
conditions and with related upgrades of the host OS to Focal.

Change-Id: I60b40897486b29beafc76025790c501b5055313d
2021-03-17 08:30:52 +11:00
Ian Wienand
018a14e34f refstack: cleanup old puppet
Remove old puppet configuration for the refstack service, which is now
managed by Ansible.

Change-Id: I6b6dfd0f8ef89a5362f64cfbc8016ba5b1a346b3
2021-03-17 07:06:53 +11:00
Clark Boylan
16a4bdce02 Don't always update gitea project descriptions
There is some correlation that running the manage-projects playbook
gives our gitea fits. The bulk of the work done here is in trying to
update the descriptions of all projects. There isn't a good way to see
if the description is already set first so we just try and ignore
errors. This creates potentially thousands of operations all at once and
could be why things are sad.

We move these operations under the always update flag which is not set
on normal runs. If we really need to converge to a good updated state we
can manually run the playbook/role with always update set.

We also don't set a limit on the number of ThreadPoolExecutor workers
which will default to 5 * NumProcs. Could be that tuning this down would
make gitea happier.

One other thought is that we may not be using request sessions properly
for connection reuse. In particular requests notes that you need to set
stream to False or read request content to return a connection back to
the pool for reuse. We might look into this for further improvements.
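
A minimal sketch of capping the worker count, as suggested above. This is
not the code from the role: update_descriptions, update_one and the value
of max_workers are hypothetical names chosen for illustration. Note that
the default pool size depends on the Python version.

```python
from concurrent.futures import ThreadPoolExecutor


def update_descriptions(projects, update_one, max_workers=4):
    """Call update_one(project) for each project using a bounded pool.

    Errors are collected rather than raised, mirroring the
    "try and ignore errors" approach described above.
    """
    failures = []

    def safe(project):
        try:
            update_one(project)
        except Exception as exc:
            failures.append((project, exc))

    # max_workers bounds how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(safe, projects))
    return failures
```

A lower max_workers trades total runtime for fewer concurrent requests
against the server.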

Change-Id: I6e6fb1eb08303e9da7e38cf493d1871364340000
2021-03-16 13:06:16 -07:00
Zuul
e077281e4e Merge "refstack: fix backup script typo" 2021-03-16 05:43:21 +00:00
Ian Wienand
ea48ffc596 refstack: fix backup script typo
This got copied from another command that also had this typo.

Also, don't bother backing up the on-disk backups, as we back up
directly via the stream dumps.

Change-Id: Ie200a29eec2b1a0725a8872ab548bcb0f26980e6
2021-03-16 15:12:41 +11:00
Zuul
bc94f97de2 Merge "Enable srvr, stat and dump commands in the zk cluster" 2021-03-16 04:10:11 +00:00
Zuul
70079c5771 Merge "gitea-git-repos: update deprecated API path" 2021-03-16 04:00:30 +00:00
Clark Boylan
3f2dd0e681 Enable srvr, stat and dump commands in the zk cluster
Zookeeper supports a number of "4 letter" commands [0] which are useful
for debugging and general diagnostics. By default only srvr is enabled,
but we want to add stat and dump to see details on server and client
connection statuses.

We do this via the 4lw.commands.whitelist configuration option [1] and
not the docker image env vars because we're mounting a zoo.cfg in
already.

[0] https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_4lw
[1] https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_clusterOptions

Change-Id: I24ea9b37cd5766c9d393106e8eab34623cad1624
2021-03-15 16:57:21 -07:00
Ian Wienand
753f9520e6 refstack: add backup
We should be backing up the user-generated refstack data

Change-Id: I1bd5f0de283a4436967dcae6da9c5d9cd055697c
2021-03-12 15:18:04 +11:00
Ian Wienand
d33ce951c0 refstack: use CNAME for production server
The production server is trying to send itself to
refstack01.openstack.org, causing cross-site scripting issues.  In
production, use the CNAME, but use the FQDN for testing.

Fix up job file matchers while here.

Change-Id: I18a5067ee25c59c5eaa17b7c2d9bd5a942a9173d
2021-03-12 10:24:06 +11:00
Zuul
d8cfde1e22 Merge "refstack: Edit URL of public RefStackAPI" 2021-03-11 03:43:17 +00:00
Martin Kopec
834e39fc7e refstack: Edit URL of public RefStackAPI
The previous refstack server had 'api' in the endpoint
addresses of API calls. Let's try to set it in the new
instance as well to keep the same interface.

Also, fix the typo in the testinfra host match and in
the test name.

Change-Id: I7319990144396b3a753678975a09b0add3ac4465
2021-03-10 14:09:20 +11:00
James E. Blair
b768325480 Use upstream jitsi-meet web image
This has our change to open etherpad on join, so we should no longer need
to run a fork of the web server.  Switch to the upstream container image
and stop building our own.

Change-Id: I3e8da211c78b6486a3dcbd362ae7eb03cc9f5a48
2021-03-09 12:35:46 -08:00
Zuul
2a0ea75fb7 Merge "install-ansible: ensure stevedore" 2021-03-09 02:52:10 +00:00
Clark Boylan
a2fd912511 Replace ze09-12.openstack.org with ze09-12.opendev.org
These are new focal replacement servers. Because this is the last set of
replacements for the executors we also cleanup the testing of the old
servers in the system-config-run-zuul job and the inventory group
checker job.

Change-Id: I111d42c9dfd6488ef69ff1a7f76062a73d1f37bf
2021-03-08 10:13:29 -08:00
Daniel Pawlik
97942432c5 Change get-pip url
The path for get-pip.py script in version 3.5 has been changed
with this commit [1].

[1] 2360f025eb

Change-Id: Ie13a6597c23c0a376f9feba2aed664e1129c5b60
2021-03-08 15:03:43 +01:00
Zuul
8998ee96b2 Merge "Update zuul-executor shutdown handling" 2021-03-04 21:48:20 +00:00
Ian Wienand
a12d2fce2b install-ansible: ensure stevedore
We have identified an issue with stevedore < 3.3.0 where the
cloud-launcher, running under ansible, makes stevedore hash a /tmp
path into an entry-point cache file it makes, causing a never-ending
expansion.

This appears to be fixed by [1] which is available in 3.3.0.  Ensure
we install this on bridge.  For good measure, add a ".disable" file as
we don't really need caches here.

There's currently 491,089 leaked files, so I didn't think it wise to
delete these in an ansible loop as it will probably time out the job.
We can do this manually once we stop creating them :)

[1] d7cfadbb7d

Change-Id: If5773613f953f64941a1d8cc779e893e0b2dd516
2021-03-04 08:29:01 +11:00
Clark Boylan
a42c0b704a Remove ze01.openstack.org
This server has been replaced by ze01.opendev.org running Focal. Let's
remove the old ze01.openstack.org from inventory so that we can delete
the server. We will follow this up with a rotation of new focal servers
being put in place.

This also renames the xenial executor in testing to ze12.openstack.org
as that will be the last one to be rotated out in production. We will
remove it from testing at that point as well.

We also remove a completely unused zuul-executor-opendev.yaml group_vars
file to avoid confusion.

Change-Id: Ida9c9a5a11578d32a6de2434a41b5d3c54fb7e0c
2021-03-02 10:21:59 -08:00
Ian Wienand
3f1d67b99f Add afsdb03 openstack.org
We are in the process of upgrading the AFS servers to focal.  As
explained by auristor (extracted from IRC below) we need 3 servers to
actually perform HA with the ubik protocol:

 the ubik quorum is defined by the list of voting primary ip addresses
 as specified in the ubik service's CellServDB file.  The server with
 the lowest ip address gets 1.5 votes and the others 1 vote.  To win
 election requires greater than 50% of the votes.  In a two server
 configuration there are a total of 2.5 votes to cast.  1.5 > 2.5/2 so
 afsdb02.openstack.org always wins regardless of what
 afsdb01.openstack.org says.  And afsdb01.openstack.org can never win
 because 1 < 2.5/2.  by adding a third ubik server to the quorum, the
 total votes cast are 3.5 and it always requires the vote of two
 servers to elect a winner ...  if afsdb03 is added with the highest
 ip address, then either afsdb01 or afsdb02 can be elected
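
The vote arithmetic quoted above can be checked with a short sketch. It
only models the rules as stated in the quote (1.5 votes for the lowest IP,
1 vote for the others, more than 50% needed to win). The IP addresses are
placeholders from the documentation range, not the real server addresses,
and the function names are hypothetical.

```python
import ipaddress
from itertools import combinations


def ubik_votes(servers):
    """Map server name to vote weight; servers is a list of (name, ip)."""
    lowest = min(servers, key=lambda s: ipaddress.ip_address(s[1]))[0]
    return {name: (1.5 if name == lowest else 1.0) for name, _ in servers}


def winning_coalitions(servers):
    """Return every group of servers whose combined votes exceed 50%."""
    votes = ubik_votes(servers)
    total = sum(votes.values())
    winners = []
    for size in range(1, len(votes) + 1):
        for group in combinations(sorted(votes), size):
            if sum(votes[name] for name in group) > total / 2:
                winners.append(group)
    return winners


# Placeholder addresses; afsdb02 is given the lowest IP as in the quote.
two = [("afsdb01", "192.0.2.20"), ("afsdb02", "192.0.2.10")]
three = two + [("afsdb03", "192.0.2.30")]
```

With two servers afsdb02 wins on its own (1.5 > 1.25) and afsdb01 never
can. With three servers no single server reaches 1.75, so any winner needs
the votes of two servers.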

Add a third server which is a focal host and related configuration.

Change-Id: I59e562dd56d6cbabd2560e4205b3bd36045d48c2
2021-03-01 15:51:49 +11:00
Clark Boylan
85d923b74e Update zuul-executor shutdown handling
We update the docker-compose config for zuul-executor to better handle
shutdown. In particular we want to support zuul-executor
graceful which will pause the server then exit with rc 0 when all builds
complete. To do this we switch restart: always to restart: on-failure.
With the always setting docker simply restarts zuul-executor after a
graceful stop.
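
The difference between the two restart settings can be sketched as a
simplified model. This is an illustration only, not docker's
implementation: it covers just the "no", "always" and "on-failure"
policies and ignores manual stops and retry limits.

```python
def should_restart(policy, exit_code):
    """Simplified model of whether a container is restarted after it exits."""
    if policy == "always":
        return True  # restarted even after a clean exit
    if policy == "on-failure":
        return exit_code != 0  # restarted only on a non-zero exit code
    if policy == "no":
        return False
    raise ValueError(f"policy not modelled: {policy}")
```

Under this model a graceful stop that exits with rc 0 is restarted with
"always" but left stopped with "on-failure".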

We also remove the stop signal of SIGHUP with its long timeout. Zuul
executor does not seem to catch SIGHUP for anything anymore so this is
there for old behavior and can be cleaned up.

Change-Id: I5211b91025ce5a13648f3648db3b42d357ecd590
2021-02-26 08:12:30 -08:00
Clark Boylan
2a0508aa08 Add ze01.opendev.org
This is a focal replacement for ze01.openstack.org. Cleanup for
ze01.openstack.org will happen in a followup when we are happy with the
results of running zuul-executor on focal.

Change-Id: If1fef88e2f4778c6e6fbae6b4a5e7621694b64c5
2021-02-25 08:53:40 -08:00
Ian Wienand
f8ca888b2b install-docker: remove fix from prior change
This file is now removed (I0cbcd4694a4796573fe48383756be03597d2da0f);
get rid of this to avoid any confusion.

Change-Id: I837d1fccbfa2461eb1315eac54c2a017fcb86511
2021-02-25 09:19:02 +11:00
Ian Wienand
3303199ba6 install-docker: move rsyslog handler earlier
This syslog configuration is what sends any logs with a program-name
of "docker-<foo>" to /var/log/containers/foo.log.  However, at 98-
level the rules are after the default 50- rules, so we're seeing the
logs copied to both syslog and /var/log/containers.  Since this
contains a "stop" command, we should move this earlier before the
default rules and the docker logs will not be duplicated.

Change-Id: I0cbcd4694a4796573fe48383756be03597d2da0f
2021-02-25 09:16:16 +11:00
Zuul
d1ac0aee2d Merge "etherpad: fix robots.txt" 2021-02-24 00:02:04 +00:00
Zuul
89d73e42f7 Merge "gitea: fix db backup script" 2021-02-23 07:23:01 +00:00
Zuul
70467d8a82 Merge "Stop using mysqlclient ssl flag" 2021-02-23 05:00:42 +00:00
Zuul
6b88e37a50 Merge "service-borg-backup: preload backup server facts" 2021-02-23 03:21:07 +00:00
Ian Wienand
08dba9d026 service-borg-backup: preload backup server facts
As described inline, ensure that minimal facts for the backup servers
are loaded before running the backup roles on hosts, so they can read
the ansible_ssh_host_key_ed25519_public fact for each backup server
and ensure it is accepted.

Update the other comments slightly as well.

Change-Id: I1f207ca0770d58f61a89f9ade0bd26cebc982c62
2021-02-23 13:04:20 +11:00
Ian Wienand
029dfb55a8 gitea: fix db backup script
I introduced this typo with I500062c1c52c74a567621df9aaa716de804ffae7.
Luckily Ibb63f19817782c25a5929781b0f6342fe4c82cf0 has alerted us to
this problem.

Change-Id: I02bf2f4fa1041642a719100e9591bf5cd1a0bf49
2021-02-23 02:00:20 +00:00
Zuul
4d85fc521a Merge "Use dstat to record performance of system-config-run hosts" 2021-02-23 00:13:59 +00:00