1643 Commits

Author SHA1 Message Date
Jonathan Rosser
17a005a69b Add Debian Bullseye to the reprepro config
Change-Id: I01a0dc2087fecfab39c7e6d49b8909d5bf9442ab
2021-03-01 10:27:06 +00:00
Clark Boylan
2a0508aa08 Add ze01.opendev.org
This is a focal replacement for ze01.openstack.org. Cleanup for
ze01.openstack.org will happen in a followup when we are happy with the
results of running zuul-executor on focal.

Change-Id: If1fef88e2f4778c6e6fbae6b4a5e7621694b64c5
2021-02-25 08:53:40 -08:00
Ian Wienand
f8ca888b2b install-docker: remove fix from prior change
This file is now removed (I0cbcd4694a4796573fe48383756be03597d2da0f);
get rid of this to avoid any confusion.

Change-Id: I837d1fccbfa2461eb1315eac54c2a017fcb86511
2021-02-25 09:19:02 +11:00
Ian Wienand
3303199ba6 install-docker: move rsyslog handler earlier
This syslog configuration is what sends any logs with a program-name
of "docker-<foo>" to /var/log/containers/foo.log.  However, at 98-
level the rules are after the default 50- rules, so we're seeing the
logs copied to both syslog and /var/log/containers.  Since this
contains a "stop" command, we should move this earlier before the
default rules and the docker logs will not be duplicated.

Change-Id: I0cbcd4694a4796573fe48383756be03597d2da0f
2021-02-25 09:16:16 +11:00
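
The ordering effect described in the commit above can be sketched with a small Python model. This is a simplified illustration under stated assumptions, not rsyslog's actual engine: it assumes rules are applied in lexical order of their config file names and that a rule marked `stop` ends processing for a matching message. The `Rule` structure and the file names (`50-default.conf`, `98-docker.conf`, `10-docker.conf`) are hypothetical; the commit does not state the new prefix.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """One simplified logging rule (hypothetical model, not rsyslog's real API)."""
    filename: str                      # config file the rule lives in
    matches: Callable[[str], bool]     # predicate on the program name
    destination: Callable[[str], str]  # where a matching message is written
    stop: bool = False                 # if True, no later rule sees the message


def route(program: str, rules: List[Rule]) -> List[str]:
    """Return every destination a message from `program` is written to."""
    written = []
    # Assumption: rules are applied in lexical order of their file names.
    for rule in sorted(rules, key=lambda r: r.filename):
        if rule.matches(program):
            written.append(rule.destination(program))
            if rule.stop:
                break
    return written


def is_docker(program: str) -> bool:
    return program.startswith("docker-")


def container_log(program: str) -> str:
    return "/var/log/containers/" + program[len("docker-"):] + ".log"


default_rule = Rule("50-default.conf", lambda p: True, lambda p: "/var/log/syslog")

# Docker rule loaded after the defaults: the message has already gone to syslog.
late = route(
    "docker-foo",
    [default_rule, Rule("98-docker.conf", is_docker, container_log, stop=True)],
)

# Docker rule loaded before the defaults: "stop" prevents the duplicate copy.
early = route(
    "docker-foo",
    [default_rule, Rule("10-docker.conf", is_docker, container_log, stop=True)],
)
```

In this model, `late` contains both the syslog path and the container log path, while `early` contains only the container log path, which matches the duplication the commit message describes.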
Zuul
d1ac0aee2d Merge "etherpad: fix robots.txt" 2021-02-24 00:02:04 +00:00
Zuul
89d73e42f7 Merge "gitea: fix db backup script" 2021-02-23 07:23:01 +00:00
Zuul
70467d8a82 Merge "Stop using mysqlclient ssl flag" 2021-02-23 05:00:42 +00:00
Zuul
6b88e37a50 Merge "service-borg-backup: preload backup server facts" 2021-02-23 03:21:07 +00:00
Ian Wienand
08dba9d026 service-borg-backup: preload backup server facts
As described inline, ensure that minimal facts for the backup servers
are loaded before running the backup roles on hosts, so they can read
the ansible_ssh_host_key_ed25519_public fact for each backup server
and ensure it is accepted.

Update the other comments slightly as well.

Change-Id: I1f207ca0770d58f61a89f9ade0bd26cebc982c62
2021-02-23 13:04:20 +11:00
Ian Wienand
029dfb55a8 gitea: fix db backup script
I introduced this typo with I500062c1c52c74a567621df9aaa716de804ffae7.
Luckily Ibb63f19817782c25a5929781b0f6342fe4c82cf0 has alerted us to
this problem.

Change-Id: I02bf2f4fa1041642a719100e9591bf5cd1a0bf49
2021-02-23 02:00:20 +00:00
Zuul
4d85fc521a Merge "Use dstat to record performance of system-config-run hosts" 2021-02-23 00:13:59 +00:00
Zuul
1b2435c349 Merge "backups: remove all bup" 2021-02-21 22:41:41 +00:00
James E. Blair
b6cbb52447 Add pull tasks for nodepool/zuul
So we can stop/pull/start, move the pull tasks to their own files
and add a playbook that invokes them.

Change-Id: I4f351c1d28e5e4606e0a778e545a3a805525ac71
2021-02-19 15:42:40 -08:00
Zuul
464bb363e9 Merge "grafana: update to 7.4.2" 2021-02-19 06:03:21 +00:00
Zuul
9db55a55f3 Merge "borg-backup: send explicit email on backup failure" 2021-02-19 05:20:01 +00:00
Ian Wienand
7577439ff8 grafana: update to 7.4.2
This includes a fix for I216528a76307189d8d87bd2fcfeff95c6ceb53cc.
Now it's released we can be a bit more explicit about why we added the
workaround.

Change-Id: Ibaf1850549b5e7ec3622418b650bc5e59a289ab6
2021-02-19 09:54:31 +11:00
Ian Wienand
5a1b8ac179 grafana: take some screenshots during testing
Take some simple screenshots for basic validation of any new releases.

Change-Id: I52770032a6cc91d76da23194f58474f5ceeaed38
2021-02-17 10:43:26 +11:00
Clark Boylan
1560b01f7e Use dstat to record performance of system-config-run hosts
We have seen some poor performance from gitea which may be related to
manage project updates. Start a dstat service which logs to a csv file
on our system-config-run job hosts in order to collect performance info
from our services in pre merge testing. This will include gitea and
should help us evaluate service upgrades and other changes from a
performance perspective before they hit production.

Change-Id: I7bdaab0a0aeb9e1c00fcfcca3d114ae13a76ccc9
2021-02-16 14:31:30 -08:00
Ian Wienand
39ffc685d6 backups: remove all bup
All hosts are now running their backups via borg to servers in
vexxhost and rax.ord.

For reference, the servers being backed up at this time are:

 borg-ask01
 borg-ethercalc02
 borg-etherpad01
 borg-gitea01
 borg-lists
 borg-review-dev01
 borg-review01
 borg-storyboard01
 borg-translate01
 borg-wiki-update-test
 borg-zuul01

This removes the old bup backup hosts, the no-longer used ansible
roles for the bup backup server and client roles, and any remaining
bup related configuration.

For simplicity, we will remove any remaining bup cron jobs on the
above servers manually after this merges.

Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c
2021-02-16 16:00:28 +11:00
Zuul
8360a7ceab Merge "Run gerrit 3.2 and 3.3 functional tests" 2021-02-16 04:27:46 +00:00
Ian Wienand
5ca69113fd borg-backup: send explicit email on backup failure
This sets a global BORG_UNDER_CRON=1 environment variable for
production hosts and makes the borg-backup script send an email if any
part of the backup job appears to fail (this avoids spamming ourselves
if we're testing backups, etc).

We should ideally never get this email, but if we do it's something we
want to investigate quickly.  There's nothing worse than thinking
backups are working when they aren't.

Change-Id: Ibb63f19817782c25a5929781b0f6342fe4c82cf0
2021-02-16 14:49:38 +11:00
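
The gating described in the commit above can be sketched as follows. This is a minimal Python illustration, not the borg-backup script itself; the function name is hypothetical, and treating any non-zero exit code as a failure is an assumption. Only the `BORG_UNDER_CRON=1` variable comes from the commit message.

```python
import os
from typing import Mapping, Optional


def should_send_failure_mail(
    exit_code: int, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Decide whether a backup run should trigger a failure email.

    Mail is sent only when the run failed AND BORG_UNDER_CRON=1 is set,
    so manual test runs do not generate alerts.
    """
    if env is None:
        env = os.environ
    failed = exit_code != 0  # assumption: any non-zero status is a failure
    under_cron = env.get("BORG_UNDER_CRON") == "1"
    return failed and under_cron
```

Passing the environment as a parameter keeps the decision easy to exercise without touching the real process environment.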
Zuul
94fe3610e5 Merge "borg-backup-server: make sure to append verification logs" 2021-02-16 03:14:30 +00:00
Ian Wienand
c7de005738 grafana: ensure snapshots api returns a 403
Change-Id: I216528a76307189d8d87bd2fcfeff95c6ceb53cc
2021-02-15 17:01:15 +11:00
Ian Wienand
ece90fb7f7 borg-backup-server: make sure to append verification logs
We don't want to overwrite every run, but rather append to the log
file.

Change-Id: I304caedecbf6a9552f314636ca82a543ef16a8b6
2021-02-15 14:45:03 +11:00
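
The difference between overwriting and appending, as described in the commit above, can be shown with a short Python sketch. The helper names and the `verify.log` file name are hypothetical and used only for illustration.

```python
import os
import tempfile
from typing import List


def record_run(log_path: str, message: str) -> None:
    """Append one verification result; mode "a" keeps earlier runs intact."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(message + "\n")


def demo() -> List[str]:
    """Write two runs to a temporary log and return the lines it contains."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "verify.log")  # hypothetical log file name
        record_run(path, "run 1: ok")
        record_run(path, "run 2: ok")
        with open(path, encoding="utf-8") as log:
            return log.read().splitlines()
```

Opening the file with mode `"w"` instead would leave only the most recent run in the log.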
Zuul
036ac31060 Merge "Refactor AFS groups" 2021-02-11 22:46:00 +00:00
Zuul
a326daed61 Merge "borg-backup: fix backup script failure match" 2021-02-11 22:06:31 +00:00
Ian Wienand
ad1992955a borg-backup: fix backup script failure match
Fix a typo in the failure match, and log the error code in that case.

Change-Id: Ie17042237986d0bed58e95c271f868c735c724d2
2021-02-12 08:16:20 +11:00
Zuul
03f5e8e0de Merge "borg-backup-server: run a weekly backup verification" 2021-02-11 05:53:16 +00:00
Ian Wienand
312b9bec24 Refactor AFS groups
Both the fileservers and db servers have common key material deployed
by the openafs-server-config role.  Put both types of server in a new
group "afs-server-common" so we can define this key material in just
one group file on bridge.

Then separate out the two into afs-<file|db>-server groups for
consistent naming.

Rename afs-admin for consistent naming.

The service file is updated to reflect the new groups.

Change-Id: Ifa5f251fdfb8de737ad2ed96491d45294ce23a0c
2021-02-11 13:35:16 +11:00
Zuul
f3eb16601b Merge "refstack: capture container logs to disk" 2021-02-11 01:33:29 +00:00
Ian Wienand
0d01d941b1 borg-backup-server: run a weekly backup verification
This checks the backup archives and alerts us if anything seems wrong.
This will take a few hours, so we run once a week.

Change-Id: I832c0d29a37df94d4bf2704c59bb3f8d855c3cc8
2021-02-11 00:43:16 +00:00
Zuul
9578f393a2 Merge "openafs-<db|file>-server: fix role name" 2021-02-10 23:59:56 +00:00
Ian Wienand
56a31cb114 refstack: capture container logs to disk
We have set up rsyslogd/logrotate to handle anything with docker- tags
to be persisted to disk in /var/log/containers.  Set this up here so
we keep track of the mariadb and refstack logs.

Change-Id: I760cfeb7226f79986fbf9d7dbc5f899fc87a0cd1
2021-02-11 10:51:30 +11:00
Clark Boylan
9b90e192b1 Run gerrit 3.2 and 3.3 functional tests
This change splits our existing system-config-run-review job into two
jobs, one for gerrit 3.2 and another for 3.3. The biggest change is that
we use a var called zuul_test_gerrit_version to select which version we
want and that ends up in the fake group file written out by Zuul for the
nested ansible run. The nested ansible run will then populate the
docker-compose file with the appropriate version for us.

Change-Id: I00b52c0f4aa8df3ecface964007fcf5724887e5e
2021-02-10 15:10:46 -08:00
Ian Wienand
a246df66b4 refstack: create database storage area
The mariadb container currently doesn't persist its database
anywhere.  Map /var/lib/refstack/db to /var/lib/mysql in the
container.

We have /var/refstack and /var/lib/refstack with various things.
While we're here, move everything under /var/lib/refstack.

Also use 127.0.0.1 to ensure mysql doesn't try to connect over a
socket, but tcp (I think pymysql does anyway, but it's a little
clearer).

Change-Id: I5605eac2848a6b2222698bf20c707baa4442fcd5
2021-02-10 17:35:04 +11:00
Zuul
8d127946bc Merge "borg-backup: save PIPESTATUS before referencing" 2021-02-10 03:01:08 +00:00
Ian Wienand
3d78f88938 openafs-<db|file>-server: fix role name
This slipped in with I4e80ad8ffe1d4992e405ea516b8762109758d7eb; it
should be openafs, not openstack.

Change-Id: Iefc41f9085d86e9fdaa13c6e5b90f1c99b7a2d83
2021-02-10 13:49:12 +11:00
Zuul
0acbf39c91 Merge "borg-backup-server: volume space monitor" 2021-02-10 01:28:46 +00:00
Zuul
1d79574d82 Merge "borg-backup-server: add script for pruning borg backups" 2021-02-10 01:28:33 +00:00
Zuul
449cabeb46 Merge "refstack: move non-private variables to public" 2021-02-10 00:37:27 +00:00
James E. Blair
e58a18d8a1 Stop running ansible-lint on this repo
It is buggy (throwing exceptions for undefined variables which are
actually defined via set_fact), and we frequently run into problems
using it in this repo.  It was designed to lint roles for Galaxy,
not the way we write ansible.  As of the 5.0.0 release it's
generating >4.5K lines of complaints about files in this repository.

Change-Id: If9d8c19b5e663bdd6b6f35ffed88db3cff3d79f8
2021-02-09 22:08:38 +00:00
Ian Wienand
5a7511f6a6 refstack: move non-private variables to public
These two variables can be deployed via system-config

Change-Id: If696945d7b01ee42eb822d2391405277eb6c23d3
2021-02-10 07:10:39 +11:00
Ian Wienand
8c9ba67296 borg-backup: save PIPESTATUS before referencing
It's not obvious, but the if statements can change the PIPESTATUS
meaning we're not matching what we think we're matching.  Save the
pipestatus of the backup commands so we exit the backup script with
the right code.

Change-Id: I83c7db45d3622067eb05107e26fbdc7a8aeecf63
2021-02-09 16:22:32 +11:00
Zuul
f526060e39 Merge "Deploy refstack with ansible docker" 2021-02-09 03:58:22 +00:00
Ian Wienand
62801d8a93 borg-backup-server: volume space monitor
Due to backups running in append-only mode, we do not have a way to
safely automatically prune backups.  To reduce the likelihood we
forget about backups and end up with failing jobs, add a cron job to
send an email to infra-root if the backup partition goes over 90%
usage.  At this point a manual prune should be run
(I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e).

Change-Id: I250d84c4a9f707e63fef6f70cfdcc1fb7807d3a7
2021-02-09 11:31:02 +11:00
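
The threshold check described in the commit above can be sketched in Python. This is an illustrative sketch, not the cron job from the change: the function names are hypothetical, and only the 90% threshold comes from the commit message. The percentage here is computed as used/total, which may differ slightly from what tools such as `df` report.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def needs_prune_alert(total: int, used: int, threshold: float = 90.0) -> bool:
    """True when usage goes over the threshold (90% in the commit message)."""
    return percent_used(total, used) > threshold


def check_partition(path: str, threshold: float = 90.0) -> bool:
    """Check the filesystem holding `path`; the path is supplied by the caller."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    return needs_prune_alert(usage.total, usage.used, threshold)
```

A caller would send the alert email when `check_partition` returns true for the backup partition, then run the manual prune.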
Ian Wienand
4f0bfa6d9d borg-backup-server: add script for pruning borg backups
This adds a script that performs a manual pruning of backup
directories.

Change-Id: I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e
2021-02-09 11:29:46 +11:00
Ian Wienand
98f3d42ab0 gerrit: only backup accountPatchReviewDb
Due to [1] --all-databases is no longer working with our version of
the database.  Move to explicitly backing up the only database we care
about now, which is accountPatchReviewDb; everything else is in
notedb.

[1] https://bugs.launchpad.net/ubuntu/+source/mysql-5.7/+bug/1914695

Change-Id: Iab2a8ab612cc0a0f10c90123f2936c0abda9e76f
2021-02-09 11:29:46 +11:00
Clark Boylan
a4604ae0b3 Deploy refstack with ansible docker
This adds a dockerfile to build an opendevorg/refstack image as well as
the jobs to build and publish it.

Change-Id: Icade6c713fa9bf6ab508fd4d8d65debada2ddb30
2021-02-05 19:23:34 +00:00
Ian Wienand
0b86a6a82e borg-backup: add a few more global excludes
These were gleaned from looking at what files are taking up space in
the deltas of backups.  Nothing major, but mlocate in particular is
taking up to a couple of hundred MB on some servers.

Change-Id: I4b08c4e2491fa7138045aabcb23017ff8cef7600
2021-02-05 11:47:47 +11:00
Zuul
2ebb6adbd8 Merge "Add remote port info to gitea apache access logs" 2021-02-03 23:24:46 +00:00