system-config

Author	SHA1	Message	Date
Ian Wienand	c27915c3a7	translate: fix backup extras match This should be called "_extra" ... currently it overrides the default exclude list. This means /var/lxcfs gets incorrectly included in the backup and makes it error out as it has sockets and weird stuff that can't be backed up; this is why we are getting failure mail. Change-Id: Idea70c32b2d42f77fee2b35487d88a8ee982c856	2021-02-23 02:00:34 +00:00
Clark Boylan	1e18cd0163	Add new zm01.opendev.org server This is a Focal server that will replace zm01.openstack.org. Once this is deployed and happy we can also move forward and do the remainder of the mergers. Change-Id: I139c52e26d17ac8d9b604366a3333556d23c5536	2021-02-22 10:58:56 -08:00
Ian Wienand	39ffc685d6	backups: remove all bup All hosts are now running their backups via borg to servers in vexxhost and rax.ord. For reference, the servers being backed up at this time are: borg-ask01 borg-ethercalc02 borg-etherpad01 borg-gitea01 borg-lists borg-review-dev01 borg-review01 borg-storyboard01 borg-translate01 borg-wiki-update-test borg-zuul01 This removes the old bup backup hosts, the no-longer used ansible roles for the bup backup server and client roles, and any remaining bup related configuration. For simplicity, we will remove any remaining bup cron jobs on the above servers manually after this merges. Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c	2021-02-16 16:00:28 +11:00
Zuul	60b5f789ad	Merge "Clean up ethercalc server replacement transition"	2021-02-15 22:20:10 +00:00
Jeremy Stanley	6d0c4b0b3b	Update AFS group vars filenames Ifa5f251fdfb8de737ad2ed96491d45294ce23a0c renamed the afs and afsdb groups to afs-file-server and afs-db-server, but didn't update the group files. Previously the firewall rules were duplicated in the afs/afsdb group; but now all afs servers are in the afs-server-common group. Rename afs.yaml->afs-server-common.yaml and remove the now unnecessary afsdb.yaml. Remove one of the old group vars files and rename the other to afs-server-common so we can restore the udp ports they open in our firewall rules. Change-Id: I17dd0596660addf061ade31b4450bf040c01ffe8	2021-02-12 18:23:45 +11:00
Zuul	036ac31060	Merge "Refactor AFS groups"	2021-02-11 22:46:00 +00:00
Ian Wienand	312b9bec24	Refactor AFS groups Both the fileservers and db servers have common key material deployed by the openafs-server-config role. Put both types of server in a new group "afs-server-common" so we can define this key material in just one group file on bridge. Then separate out the two into afs-<file|db>-server groups for consistent naming. Rename afs-admin for consistent naming. The service file is updated to reflect the new groups. Change-Id: Ifa5f251fdfb8de737ad2ed96491d45294ce23a0c	2021-02-11 13:35:16 +11:00
Ian Wienand	32b48c81a2	refstack: use external https for API Currently this variable is setting several URLs used in the config to internal http links (port 8000). This bubbles through to the UI which then can't talk to the API. Empirically, changing these values in the container config and restarting it makes things work. Update this variable to make it talk to external https. Change-Id: If61ec1e0383b98d34d092c55ca0095588487902a	2021-02-11 11:44:39 +11:00
Ian Wienand	5a7511f6a6	refstack: move non-private variables to public These two variables can be deployed via system-config Change-Id: If696945d7b01ee42eb822d2391405277eb6c23d3	2021-02-10 07:10:39 +11:00
Zuul	accfb8b0fd	Merge "Add refstack01.openstack.org"	2021-02-09 04:24:10 +00:00
Zuul	f526060e39	Merge "Deploy refstack with ansible docker"	2021-02-09 03:58:22 +00:00
Ian Wienand	cf36af34c1	Add refstack01.openstack.org See Icade6c713fa9bf6ab508fd4d8d65debada2ddb30 Change-Id: I96ba37a1c872d9f5c20224bbad48bc1d17bdc438	2021-02-09 14:39:12 +11:00
Clark Boylan	a4604ae0b3	Deploy refstack with ansible docker This adds a dockerfile to build an opendevorg/refstack image as well as the jobs to build and publish it. Change-Id: Icade6c713fa9bf6ab508fd4d8d65debada2ddb30	2021-02-05 19:23:34 +00:00
Ian Wienand	56277bf70a	ask: fix backup typo and ignore live postgresql This was overriding the main list of ignores; also ignore the live db. Change-Id: Idf5ae8e88805829ee44e7f4ba003ac086f5f1206	2021-02-05 17:40:02 +11:00
Ian Wienand	01990670c9	translate: backup zanata db directly to borg As noted inline, a recent mysql client update has broken the "--all-databases" flag, at least for the client version and very old server version we use. Empirically, dumping individual databases still works with this client. Switch this to stream the db directly into borg. Ignore the old backups and remove the bup backup while we are here, since this is all borg now. Change-Id: I5fe762a003ce2c2ba4830367be87598f67f7e763	2021-02-05 14:05:24 +11:00
Ian Wienand	f9184ce323	ask: stream db backup Despite being deprecated, the ask server is our 3rd biggest backup. Even though the site is R/O we're still backing up the fresh rotations of the gzipped backups every day. To reduce the incremental space requirements, move to our plain-text streaming for the db backup. This just needs a file dropped in /etc; see the backup-borg role README documentation. We do this in puppet to avoid complexity adding this deprecated service to ansible. This then excludes the on-disk db backup dir. Drop the bup backups while we are here. Change-Id: Icfd81aca58b9a0dc3a3b74de04c1b00f03160327	2021-02-05 13:24:57 +11:00
Ian Wienand	16d26586cf	Update airship mirror address The floating IP of this host was changed during a network issue; matches I898dbf7417fb01f608eded85faaae5a417ad2e98 Change-Id: Icf1daa4a761403a3927bcadab08656cd1f42f1aa	2021-02-04 11:11:37 +11:00
Zuul	89cd6972f2	Merge "borg-backup: implement saving a stream, use for database backups"	2021-02-03 03:11:11 +00:00
Zuul	70bd9166f7	Merge "Manage afsdb servers with Ansible"	2021-02-03 02:03:28 +00:00
Ian Wienand	51733e5623	borg-backup: implement saving a stream, use for database backups Add facility to borg-backup role to run a command and save the output of it to a separate archive file during the backup process. This is mostly useful for database backups. Compressed on-disk logs are terrible for differential backups because revisions have essentially no common data. By saving the uncompressed stream directly from mysqldump, we allow borg the chance to de-duplicate, saving considerable space on the backup servers. This is implemented for our ansible-managed servers currently doing dumps. We also add it to the testinfra. This also separates the archive names for the filesystem and stream backup with unique prefixes so they can be pruned separately. Otherwise we end up keeping only one of the stream or filesystem backups which isn't the intention. However, due to issues with --append-only mode we are not issuing prune commands at this time. Note the dump commands are updated slightly, particularly with "--skip-extended-insert" which was suggested by mordred and significantly improves incremental diff-ability by being slightly more verbose but keeping much more of the output stable across dumps. Change-Id: I500062c1c52c74a567621df9aaa716de804ffae7	2021-02-03 11:43:12 +11:00
Zuul	e762fd3677	Merge "gitea backup: prune some large directories"	2021-01-21 00:22:06 +00:00
Ian Wienand	c98505c8f2	Manage afsdb servers with Ansible Move common setup steps into a openafs-server-config role, and create openafs-file-server and openafs-db-server roles to manage fileserver and db servers respectively. Modify the playbook to run these roles against the AFS servers. Change-Id: I4e80ad8ffe1d4992e405ea516b8762109758d7eb	2021-01-21 07:08:37 +11:00
Ian Wienand	92250eca82	Remove afs-1.8 group With all AFS file-servers upgraded to 1.8, we can move afs01.dfw back and rename the group to just "afs". Change-Id: Ib31bde124e01cd07d6ff7eb31679c55728b95222	2021-01-21 07:08:29 +11:00
Ian Wienand	99a36d790e	gitea backup: prune some large directories It's not necessary to capture the live db or git trees, so prune these from the backups. Change-Id: I7a27c49035eb0590d0157766eb3392a0f6331aea	2021-01-20 16:01:16 +11:00
Ian Wienand	60a7bfc5f6	Move afs02.dfw.openstack.org to afs-1.8 group This host is now running OpenAFS 1.8 and should be Ansible managed now. Change-Id: Ia0cf0672f3e924a3b6d8e337d3355f6216796e92	2021-01-19 09:34:26 +11:00
Ian Wienand	7683fa11b3	openafs-server : add ansible roles for OpenAFS servers This starts at migrating OpenAFS server setup to Ansible. Firstly we split up the groups and explicitly name hosts, as we will be migrating each one step-by-step. We split out 1.8 hosts into a new afs-1.8 group; the first host is afs01.ord.openstack.org which already has openafs 1.8 installed manually. An openafs-server role is introduced that does the same setup as the extant puppet. The AFS job is renamed to infra-prod-afs as the puppet component will eventually disappear. Otherwise it runs in the same way, but also runs the openafs-server role for the 1.8 servers. Once this is merged, we can run it against afs01.ord.openstack.org to ensure it works and is idempotent. We can then take on upgrading the other file servers, and work further on the database servers. Change-Id: I7998af43961999412f58a78214f4b5387713d30e	2021-01-19 08:08:33 +11:00
Clark Boylan	44a076998b	Cleanup openstackid02 and openstackid03 These servers were spun up to handle extra load to the openstackid service during the virtual summit. The load is no longer present and we have been asked to dial back to the normal setup for this service. Clean these servers up to stop using unneeded resources. We will start by removing them from inventory, then dns, and then shut them down. If everything continues to look happy after that we will delete them. Change-Id: I469d16f80dcc6c20891556272a94b1f7404b3620	2021-01-11 10:20:20 -08:00
Jeremy Stanley	7d48d972b5	Clean up ethercalc server replacement transition The old ethercalc01 server has been deleted as have its DNS entries. Belatedly update cacti to query the new server, and remove an old unused reference which was at one time disabling the former server. Change-Id: Ide70c7d03bfff5bd695272c696913dfb3decc525	2021-01-05 16:27:09 +00:00
Clark Boylan	613810dba1	Revert "Reduce gerrit heap limit to 44g" This reverts commit 95d9b838140e44c9547ad1fa28bc88206823198c. We've found that we run out of memory at 44g. Bump back up to 48g as that should give us a bit more headroom. Change-Id: I14a8f2b298aa1d3cb5c0829508ee137a6769675b	2020-12-09 15:26:43 -08:00
Clark Boylan	95d9b83814	Reduce gerrit heap limit to 44g We had been setting this to 48GB on java 8, but recent gerrit service issues indicate that this may be too large for our current system on java 11. In particular it appears the non heap portions of the jvm may be in the ~8GB range leaving only about 5-6GB of usable system memory for other activities like web servers, backups, and garbage collection. Reduce this to 44GB to increase headroom to see if that helps us. Java 11 is reported to be much more efficient at garbage collecting so hopefully that makes up the difference between lower memory and where we were on java 8. As a side note we could revert back to java 8 as another option. Change-Id: Ie326aad2a9895098b484924a26c9257cd009d89e	2020-12-08 07:31:53 -08:00
fungi.admin	2197f11a0f	Merge "Omnibus Gerrit 3.2 changes"	2020-11-21 17:19:58 +00:00
Zuul	ba27a1fda6	Merge "Add codesearch.opendev.org server"	2020-11-19 23:42:56 +00:00
Zuul	1b16dae681	Merge "Migrate codesearch site to container"	2020-11-19 22:26:12 +00:00
Ian Wienand	4ce223d83a	Add codesearch.opendev.org server Change-Id: I1e75ca551871999a654000f103aaf833679e804e Depends-On: https://review.opendev.org/763297	2020-11-20 07:41:43 +11:00
Ian Wienand	368466730c	Migrate codesearch site to container The hound project has undergone a small re-birth and moved to https://github.com/hound-search/hound which has broken our deployment. We've talked about leaving codesearch up to gitea, but it's not quite there yet. There seems to be no point working on the puppet now. This builds a container that runs houndd. It's an opendev specific container; the config is pulled from project-config directly. There's some custom scripts that drive things. Some points for reviewers: - update-hound-config.sh uses "create-hound-config" (which is in jeepyb for historical reasons) to generate the config file. It grabs the latest projects.yaml from project-config and exits with a return code to indicate if things changed. - when the container starts, it runs update-hound-config.sh to populate the initial config. There is a testing environment flag and small config so it doesn't have to clone the entire opendev for functional testing. - it runs under supervisord so we can restart the daemon when projects are updated. Unlike earlier versions that didn't start listening till indexing was done, this version now puts up a "Hound is not ready yet" message while it is working; so we can drop all the magic we were doing to probe if hound is listening via netstat and making Apache redirect to a status page. - resync-hound.sh is run from an external cron job daily, and does this update and restart check. Since it only reloads if changes are made, this should be relatively rare anyway. - There is a PR to monitor the config file (https://github.com/hound-search/hound/pull/357) which would mean the restart is unnecessary. This would be good in the near future and we could remove the cron job. - playbooks/roles/codesearch is unexciting and deploys the container, certificates and an apache proxy back to localhost:6080 where hound is listening. I've combined removal of the old puppet bits here as the "-codesearch" namespace was already being used. Change-Id: I8c773b5ea6b87e8f7dfd8db2556626f7b2500473	2020-11-20 07:41:12 +11:00
Clark Boylan	57f9e54ad8	Omnibus Gerrit 3.2 changes These changes are squashed together to simplify applying them to config management without zuul and ansible running one of these without the others. We essentially need them all in place at the same time to accurately reflect the post upgrade state. We stop blocking /p/ in gerrit's apache vhost. /p/ is used for dashboards. We add a few java options that new gerrit sets by default. We update the gerrit image in docker compose to 3.2. We update zuul to use basic auth instead of digest auth when talking to Gerrit. Change-Id: I6ea38313544ce1ecbc4cfd914b1f33e77d0d2d03	2020-11-17 16:04:56 -08:00
Zuul	2c7591c318	Merge "Set gerrit.serverId in gerrit.config"	2020-11-17 21:22:53 +00:00
Ian Wienand	c16501af8a	zuul backup : expand debug log match Follow-on to Ia9579c7b3204b47d453fc51388265bf1867af20c, this also matches the web-debug* log files Change-Id: Ibabbfa3b01317528a75eeec17ea28168da57123a	2020-11-13 14:34:06 +11:00
Ian Wienand	dbff6071b1	backup: skip zuul debug logs for backup This cuts out the bulk of the storage expense, but leaves us with the regular logs for enhanced audit trails. Change-Id: Ia9579c7b3204b47d453fc51388265bf1867af20c	2020-11-12 12:11:39 +11:00
Ian Wienand	6bcfe05742	review: trim backups This should help reduce the bulk of the review site backups * launchpadlib cache has ~650,000 files which we don't need to track * review_site/tmp has ~50,000 files * review_site/cache is about 9gb * review_site/index is optional to backup, but a) it's very unlikely to be useful in a full restore situation; we'd have to re-create them and b) things seem to come and go under this directory during the backup, causing it to exit with an error status. Change-Id: If7009cfcd5a3a07c07108149772cc8c1873bf277	2020-11-11 23:36:11 +00:00
Clark Boylan	b9b1cba959	Set gerrit.serverId in gerrit.config This serverId value is used by notedb to identify the gerrit cluster that notedb contents belong to. By default a random uuid is generated by gerrit for this value. In order to avoid config management and gerrit fighting over this value after we upgrade we set a value now. This should be safe to land on 2.13 as old gerrit should ignore the value. Change-Id: I57c9b436a9d0d1dfe77eee907d50fc1dcda6ab12	2020-11-10 10:30:58 -08:00
Ian Wienand	b05a98440a	Remove etherpad from bup backup bup is going crazy and filling the disk when making its backups. We have moved this into the borg backup group and run some backups, so rather than spending time debugging this, we are just going to disable bup on the server. Change-Id: I1daad4eb05f8222131dc84c12577dec924874466	2020-11-10 13:52:03 +11:00
Zuul	9ff95a5f00	Merge "etherpad: ignore live db for borg backups"	2020-11-10 00:11:22 +00:00
Zuul	d11949817d	Merge "Add all backup hosts to borg backups"	2020-11-09 23:39:51 +00:00
Ian Wienand	b26622ad12	etherpad: ignore live db for borg backups Change-Id: Ie7f7e189720e68ec0b07a727be0f5752da20566d	2020-11-10 10:11:24 +11:00
Zuul	d3a53e8ec0	Merge "Remove mirror-update server and related puppet"	2020-11-09 21:07:11 +00:00
Ian Wienand	d533e89089	Add all backup hosts to borg backups Backups have been going well on ethercalc02, so add borg backup runs to all backed-up servers. Port in some additional excludes for Zuul and slightly modify the /var/ matching. Change-Id: Ic3adfd162fa9bedd84402e3c25b5c1bebb21f3cb	2020-11-09 17:23:22 +11:00
Ian Wienand	3568b76c3c	Add * match to grafana.opendev.org This wasn't matching grafana01 Change-Id: I930a6d1428d8becd29d15fdb53d26b0c186b79fd	2020-11-05 11:35:57 +11:00
Zuul	1dc940c74f	Merge "RAX DFW/IAD : add internal mirror DNS to cert"	2020-11-04 03:28:57 +00:00
Ian Wienand	676c5dad44	Add borg backup server in RAX ORD This is our second backup server for borg, hosted in RAX/ORD. Change-Id: I2c896345e497067ce12863bdb1dda8ce467e2243	2020-10-30 16:39:25 +11:00

419 Commits