Having two groups here was confusing. We seem to use the review group
for most Ansible things, so we prefer that one. We move the contents of
the gerrit group_vars into the review group_vars and then clean up use
of the old group vars file.
Change-Id: I7fa7467f703f5cec075e8e60472868c60ac031f7
Previously we had a test-specific group vars file for the review Ansible
group. This provided junk secrets to our test installations of Gerrit;
we then relied on the review02.opendev.org production host vars file to
set values that are public.
Unfortunately, this meant we were using the production heapLimit value
which is far too large for our test instances, leading to the occasional
failure:
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 9596567552 bytes for committing reserved memory.
We cannot set the heapLimit in the group var file because the host var
file overrides those values. To fix this we need to replace the
test-specific group var contents with a test-specific host var file instead.
To avoid repeating ourselves we also create a new review.yaml group_vars
file to capture common settings between testing and prod. Note we should
look at combining this new file with the gerrit.yaml group_vars.
On the testing side of things we set the heapLimit to 6GB, we change the
serverid value to prevent any unexpected notedb confusion, and we remove
replication config.
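As a hedged sketch, the resulting test-only host vars look something
like this (the filename and variable names are hypothetical, not
necessarily what the repo uses):

  # playbooks/host_vars/review99.opendev.org.yaml (hypothetical path)
  gerrit_heap_limit: 6g    # host vars override group vars, so this wins
  gerrit_serverid: 00000000-0000-0000-0000-000000000000  # distinct from prod to avoid notedb confusion
  gerrit_replication: []   # no replication config for test nodes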
Change-Id: Id8ec5cae967cc38acf79ecf18d3a0faac3a9c4b3
This shifts our Gerrit upgrade testing ahead to cover 3.3 to 3.4
upgrades, as we have upgraded to 3.3 at this point.
Change-Id: Ibb45113dd50f294a2692c65f19f63f83c96a3c11
This bumps the gerrit image up to our 3.3 image. Followup changes will
shift upgrade testing to test 3.3 to 3.4 upgrades, clean up no longer
needed 3.2 images, and start building 3.4 images.
Change-Id: Id0f544846946d4c50737a54ceb909a0a686a594e
This uses the opendev assets bundle image created with
I3166679bde6d771276289b9d32e7e4407957b2f8.
The mount options require using BuildKit, hence the Dockerfile update.
Otherwise conceptually it's fairly simple; copy the files in from the
opendevorg/assets image rather than from the local filesystem.
Change-Id: I36bdc76471eec5380a676ebcdd885a88d3985976
Move some common assets into a top-level assets/ directory. Services
can reference these assets via
https://opendev.org/opendev/system-config/raw/branch/master/assets/<file>
in <img> tags, etc.
Some services want to embed these into their images, but we wish to
only keep one canonical copy. For this, add a Dockerfile and jobs
that create a simple bundle of assets in opendevorg/assets. This can
be referenced in other builds; the new BuildKit bind-mount is
particularly useful for this
(c.f. I36bdc76471eec5380a676ebcdd885a88d3985976).
Change-Id: I3931566eb86a0618705d276445fa0a5f659692ea
We create a (currently test-only) playbook that upgrades Gerrit. This
job then runs through project creation and renaming and testinfra
testing on the upgraded Gerrit version.
Future improvements should consider loading state on the old gerrit
install before we upgrade that can be asserted as well.
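A rough sketch of the shape of such an upgrade playbook (the paths,
image tags and task details are illustrative, not the real playbook):

  - hosts: review
    tasks:
      - name: Stop Gerrit on the old version
        shell: docker-compose down
        args:
          chdir: /etc/gerrit-compose
      - name: Point the compose file at the new image
        replace:
          path: /etc/gerrit-compose/docker-compose.yaml
          regexp: 'gerrit:3\.2'
          replace: 'gerrit:3.3'
      - name: Start Gerrit on the new version (init migrates the site)
        shell: docker-compose up -d
        args:
          chdir: /etc/gerrit-compose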
Change-Id: I364037232cf0e6f3fa150f4dbb736ef27d1be3f8
Update the file matchers to actually match the current set of puppet
things. This ensures the deploy job runs when we want it to, and we can
catch up daily instead of hourly.
Previously a number of the matchers didn't actually match the puppet
things because the path prefix was wrong or words were in a different
order in the dir names.
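To illustrate the kind of fix involved, a corrected matcher looks
roughly like this (the job name and paths are hypothetical):

  - job:
      name: infra-prod-run-puppet   # hypothetical job name
      files:
        # prefixes and dir name order must match the real tree
        - ^modules/openstack_project/.*
        - ^playbooks/roles/puppet/.*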
Change-Id: I3510da81d942cf6fb7da998b8a73b0a566ea7411
This is being done because we don't make many changes to the
zuul-preview service, but it runs in the hourly buildset, starving
deploy runs. Since this doesn't change much we can move it to the daily
run instead.
If we need to update it we can run the playbook manually or land a
change to trigger it.
Change-Id: I89d2c712fcfd18bd4f694b2c90067295253b8836
This is a job that takes quite a bit of time, but only rarely do we need
the updates encoded in this job. Move the job from our hourly deployment
to the daily deployment to make its impact less painful.
Change-Id: I724bcdd67f4c324f497a9d8239bcfd8d37528956
This runs the new matrix-eavesdrop bot on the eavesdrop server.
It will write logs out to the limnoria logs directory, which is mounted
inside the container.
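A hedged sketch of the bind mount involved (the image name and paths
are illustrative):

  services:
    matrix-eavesdrop:
      image: docker.io/opendevorg/matrix-eavesdrop:latest
      volumes:
        # the bot writes its channel logs alongside the limnoria logs
        - /var/lib/limnoria/logs:/logs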
Change-Id: I867eec692f63099b295a37a028ee096c24109a2e
The paste service needs an upgrade; since others have created a
lodgeit container it seems worth us keeping the service going if only
to maintain the historical corpus of pastes.
This adds the ansible to deploy lodgeit and a sibling mariadb
container. I have imported a dump of the old data as a test. The
dump is ~4GB and, once imported, takes up about double that; certainly
nothing we need to be too concerned about. The server will be more
than capable of running the db container alongside the lodgeit
instance.
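A hedged sketch of the compose topology this describes (image names,
versions and credentials are illustrative):

  services:
    mariadb:
      image: docker.io/library/mariadb:10.4
      environment:
        MYSQL_ROOT_PASSWORD: secret
        MYSQL_DATABASE: lodgeit
        MYSQL_USER: lodgeit
        MYSQL_PASSWORD: secret
    lodgeit:
      image: docker.io/opendevorg/lodgeit:latest
      depends_on:
        - mariadb
      ports:
        - "127.0.0.1:8080:8080"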
This should have no effect on production until we decide to switch
DNS.
Change-Id: I284864217aa49d664ddc3ebdc800383b2d7e00e3
This installs our Limnoria/meetbot container and configures it on
eavesdrop01.opendev.org. I have ported the configuration from the old
puppet as best I can (it is very verbose); my procedure was to use the
Limnoria wizard to start a new config file and then backport everything
from the old file. I felt this was best to not miss any new options.
This does channel logging (via built-in ChannelLogger plugin, along
with a cron job for logs2html) and runs our fork of meetbot.
It exports the channel logs via HTTP to /irclogs and meetings logs to
/meetings. meetings.opendev.org will proxy to these two locations
when the server is active.
Note this has not ported the channel list; so the bot will not be
listening in our channels.
Change-Id: I9f9a466c271e1a706f9f98f816de0e84047519f1
This container installs Limnoria, the supybot replacement, as the
generic ircbot container. We install the meetbot plugin as a sibling
project.
Previously we've conflated supybot with meetbot, which is a bit
confusing because meetbot is a plugin, but we also use other plugins
such as the channel logger. We also hope to convert some of our other
bots to Limnoria (ptgbot?) to consolidate everything. For this reason
I've called this the more generic "ircbot". The image installs
meetbot as a sibling project, with the idea being any other plugins
would also be installed as siblings.
The siblings install expects the work directory to be a relative
path. I'm not sure we run this from other projects, but this
will work the same if we do.
Depends-On: https://review.opendev.org/c/opendev/meetbot/+/793876
Change-Id: Icee4c6bbb5ea235ba69c10f800a14bbf5beef3d5
We're trying to phase out the ELK systems. While we have agreed not to
immediately turn anything off, we probably don't need to keep running
the system-config-legacy-logstash-filters job, as ELK should remain
fairly fixed unless someone rewrites its config management and
modernizes it. If that happens they will want new, modern testing too.
Depends-On: https://review.opendev.org/c/openstack/project-config/+/792710
Change-Id: I9ac6f12ec3245e3c1be0471d5ed17caec976334f
This job is not added in the parent so that we can manually run
playbooks after the parent lands. Once we are happy with the results
from the new service-lists.yaml playbook we can land this change and
have zuul automatically apply it when necessary.
Change-Id: I38de8b98af9fb08fa5b9b8849d65470cbd7b3fdc
Python 3.9 is released, so let's build containers.
This splits the docker-images/ files up as they are becoming a bit
crowded.
Change-Id: Id68080575a30e4a08c99df0af603fbb65a0983bd
With our increased ability to test in the gate, there's not much use
for review-dev any more. Remove references.
Change-Id: I97e9865e0b655cd157acf9ffa7d067b150e6fc72
This adds a program, zookeeper-statsd, which monitors zookeeper
metrics and reports them to statsd. It also adds a container to run
that program, runs the container on each of the ZooKeeper quorum
members, updates the graphite host to allow statsd traffic from quorum
members, and updates the 4-letter-word whitelist to allow the mntr
command (which is used to gather metrics) to be issued.
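A hedged sketch of the container side of this (the image and
environment variable names are illustrative):

  services:
    zookeeper-statsd:
      image: docker.io/opendevorg/zookeeper-statsd:latest
      network_mode: host   # reach the local quorum member directly
      environment:
        ZK_HOST: localhost:2181          # polled with the mntr command
        STATSD_HOST: graphite.opendev.org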
Change-Id: I298f0b13a05cc615d8496edd4622438507fc5423
This adds a role and related testing to manage our Kerberos KDC
servers, intended to replace the puppet modules currently performing
this task.
This role automates realm creation, initial setup, key material
distribution and replica host configuration. None of this is intended
to run on the production servers, which are already set up with an
active database, and the role should be effectively idempotent in
production.
Note that this does not yet switch the production servers into the new
groups; this can be done in a separate step under controlled
conditions and with related upgrades of the host OS to Focal.
Change-Id: I60b40897486b29beafc76025790c501b5055313d
This has our change to open etherpad on join, so we should no longer need
to run a fork of the web server. Switch to the upstream container image
and stop building our own.
Change-Id: I3e8da211c78b6486a3dcbd362ae7eb03cc9f5a48
All hosts are now running their backups via borg to servers in
vexxhost and rax.ord.
For reference, the servers being backed up at this time are:
borg-ask01
borg-ethercalc02
borg-etherpad01
borg-gitea01
borg-lists
borg-review-dev01
borg-review01
borg-storyboard01
borg-translate01
borg-wiki-update-test
borg-zuul01
This removes the old bup backup hosts, the no-longer-used ansible
roles for the bup backup server and client, and any remaining
bup-related configuration.
For simplicity, we will remove any remaining bup cron jobs on the
above servers manually after this merges.
Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c
We need to depend on the buildset registry as we are building this image
in a separate job. We also don't need to depend on the build job in
gate; we only need the upload job.
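For illustration, the wiring looks roughly like this (job names are
hypothetical apart from opendev-buildset-registry):

  check:
    jobs:
      - system-config-build-image-foo
      - system-config-run-foo:
          dependencies:
            - opendev-buildset-registry
            - system-config-build-image-foo
  gate:
    jobs:
      - system-config-upload-image-foo
      - system-config-run-foo:
          dependencies:
            - opendev-buildset-registry
            - system-config-upload-image-foo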
Change-Id: Ie7c2ed29c028f8c23d67ad38edbe04b12e22d026
This change splits our existing system-config-run-review job into two
jobs, one for gerrit 3.2 and another for 3.3. The biggest change is that
we use a var called zuul_test_gerrit_version to select which version we
want and that ends up in the fake group file written out by Zuul for the
nested ansible run. The nested ansible run will then populate the
docker-compose file with the appropriate version for us.
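A hedged sketch of how the variable flows into the compose file (the
template path and default value are illustrative):

  # playbooks/roles/gerrit/templates/docker-compose.yaml.j2 (hypothetical)
  services:
    gerrit:
      image: docker.io/opendevorg/gerrit:{{ zuul_test_gerrit_version | default('3.2') }}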
Change-Id: I00b52c0f4aa8df3ecface964007fcf5724887e5e
This adds a dockerfile to build an opendevorg/refstack image as well as
the jobs to build and publish it.
Change-Id: Icade6c713fa9bf6ab508fd4d8d65debada2ddb30
This starts migrating OpenAFS server setup to Ansible.
Firstly we split up the groups and explicitly name hosts, as we will
be migrating each one step-by-step. We split out 1.8 hosts into a new
afs-1.8 group; the first host is afs01.ord.openstack.org which already
has openafs 1.8 installed manually.
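In inventory terms the split looks something like this sketch (only
afs01.ord.openstack.org is named above; the other members are
hypothetical):

  afs:
    hosts:
      afs01.dfw.openstack.org:
      afs02.dfw.openstack.org:
  afs-1.8:
    hosts:
      afs01.ord.openstack.org:   # already running openafs 1.8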
An openafs-server role is introduced that does the same setup as the
extant puppet.
The AFS job is renamed to infra-prod-afs as the puppet component will
eventually disappear. Otherwise it runs in the same way, but also
runs the openafs-server role for the 1.8 servers.
Once this is merged, we can run it against afs01.ord.openstack.org to
ensure it works and is idempotent. We can then take on upgrading the
other file servers, and work further on the database servers.
Change-Id: I7998af43961999412f58a78214f4b5387713d30e
Having upgraded to 3.2, we don't need these versions any more.
Change-Id: Ifc37a75aa62b2498e649a4c81b589a04c794184a
Depends-On: https://review.opendev.org/763617
The hound project has undergone a small re-birth and moved to
https://github.com/hound-search/hound
which has broken our deployment. We've talked about leaving
codesearch up to gitea, but it's not quite there yet. There seems to
be no point working on the puppet now.
This builds a container that runs houndd. It's an opendev-specific
container; the config is pulled from project-config directly.
There are some custom scripts that drive things. Some points for
reviewers:
- update-hound-config.sh uses "create-hound-config" (which is in
jeepyb for historical reasons) to generate the config file. It
grabs the latest projects.yaml from project-config and exits with a
return code to indicate if things changed.
- when the container starts, it runs update-hound-config.sh to
populate the initial config. There is a testing environment flag
and small config so it doesn't have to clone the entire opendev for
functional testing.
- it runs under supervisord so we can restart the daemon when
projects are updated. Unlike earlier versions that didn't start
listening till indexing was done, this version now puts up a "Hound
is not ready yet" message while it is working; so we can drop
all the magic we were doing to probe if hound is listening via
netstat and making Apache redirect to a status page.
- resync-hound.sh is run from an external cron job daily (see the
sketch after this list), and does
this update and restart check. Since it only reloads if changes
are made, this should be relatively rare anyway.
- There is a PR to monitor the config file
(https://github.com/hound-search/hound/pull/357) which would mean
the restart is unnecessary. This would be good in the near future, and
we could then remove the cron job.
- playbooks/roles/codesearch is unexciting and deploys the container,
certificates and an apache proxy back to localhost:6080 where hound
is listening.
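As a sketch of that cron wiring (the script path is illustrative), the
Ansible side can be as simple as:

  - name: Resync hound configuration daily
    cron:
      name: resync-hound
      special_time: daily
      user: root
      job: /usr/local/bin/resync-hound.sh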
I've combined removal of the old puppet bits here as the "-codesearch"
namespace was already being used.
Change-Id: I8c773b5ea6b87e8f7dfd8db2556626f7b2500473
This will scale up our meetpad install by 50% giving us more capacity
for PTG sessions.
We also increase the tox linters job timeout as it is slow to pip
install and then slow to run ansible-lint. Do this until we can sort out
why it is slow.
Change-Id: Ieceafefa27266f0bc0f427af790f920a8c44326c
Now that gerritbot is deployed from containers on eavesdrop we want to
run the infra-prod-service-eavesdrop job hourly to ensure that we keep
the docker image up to date there.
We haven't added the service-eavesdrop job to a deploy pipeline in
gerritbot because that would require us to add gerritbot's project ssh
key to bridge.
Change-Id: I5aba91f2ae5c018ee9b2d0481a53b630fc5d1ab7
This adds roles to implement backup with borg [1].
Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal. This means it is effectively end-of-life. borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported. It also has the clarkb seal
of approval :)
As mentioned, borg works in the same manner as bup by doing an
efficient backup over ssh to a remote server. The core of these
roles is the same as the bup-based ones, in terms of creating a
separate user for each host and deploying keys and ssh config.
This chooses to install borg in a virtualenv on /opt. This was chosen
for a number of reasons: firstly, reading the history of borg, there
have been incompatible updates (although they provide a tool to update
repository formats), so it seems important that we both pin the version
we are using and keep clients and the server in sync. Since we have a
heterogeneous distribution collection we don't want to rely on the
packaged tools, which may differ. I don't feel like this is a great
application for a container; we actually don't want it that isolated
from the base system, because its goal is to read the base system and
copy it offsite with as little chance of things going wrong as possible.
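A hedged sketch of the pinned virtualenv install (the exact version and
path are illustrative):

  - name: Install pinned borg into a virtualenv under /opt
    pip:
      name: borgbackup==1.1.15   # pin client and server to the same version
      virtualenv: /opt/borg
      virtualenv_command: python3 -m venv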
Borg has a lot of support for encrypting the data at rest in various
ways. However, that introduces the possibility we could lose both the
key and the backup data. Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.
The remote end server is configured via ssh command rules to run in
append-only mode. This means a misbehaving client can't delete its
old backups. In theory we can prune backups on the server side --
something we could not do with bup. The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.
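For illustration, the append-only restriction is the standard borg
pattern of forcing the command on the client's authorized key; a sketch
with a hypothetical user, variable and path:

  - name: Restrict backup user to append-only borg serve
    authorized_key:
      user: borg-host01                  # hypothetical per-host backup user
      key: "{{ borg_client_pubkey }}"    # hypothetical variable
      key_options: 'command="borg serve --append-only --restrict-to-path /opt/backups/borg-host01",restrict'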
Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server. Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.
No hosts are currently in the borg groups, so this can be applied
without affecting production. I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this. After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.
[1] https://borgbackup.readthedocs.io/en/stable/
Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
There is a new release; update the base container. Add the promote job
that was forgotten with the original commit
Iddfafe852166fe95b3e433420e2e2a4a6380fc64.
Change-Id: Ie0d7febd2686d267903b29dfeda54e7cd6ad77a3