125 Commits

Author SHA1 Message Date
Clark Boylan
4279e20293 Remove configuration management for ELK stack
We indicated to the OpenStack TC that this service would be going away
after the Yoga cycle if no one stepped up to start maintaining it. That
help didn't arrive in the form of OpenDev assistance (there is effort
to use OpenSearch external to OpenDev) and Yoga has released. This means
we are now clear to retire and shut down this service.

This change attempts to remove our configuration management for these
services so that we can shut down the servers afterwards. It was a good
run. Sad to see it go but it wasn't sustainable anymore.

Note a follow-up will clean up elastic-recheck which runs on the status
server.

Depends-On: https://review.opendev.org/c/opendev/base-jobs/+/837619
Change-Id: I5f7f73affe7b97c74680d182e68eb4bfebbe23e1
2022-04-18 10:04:06 -07:00
Clark Boylan
67c4e0202d Finish removing git.openstack.org references
The remaining references are actually about the service itself.

Change-Id: Id40dd72370add572fec615e0b5cd8c8bd3a200f8
2022-03-18 09:22:37 -07:00
Jeremy Stanley
fa0c1b495c Generate HTTPS certs for Mailman sites
We're going to want Mailman 3 served over HTTPS for security
reasons, so start by generating certificates for each of the sites
we have in v2. Also collect the acme.sh logs for verification.

Change-Id: I261ae55c6bc0a414beb473abcb30f9a86c63db85
2021-12-17 22:25:22 +00:00
Clark Boylan
cf91bc0971 Remove the gerrit group in favor of the review group
Having two groups here was confusing. We seem to use the review group
for most ansible stuff so we prefer that one. We move contents of the
gerrit group_vars into the review group_vars and then clean up the use
of the old group vars file.

Change-Id: I7fa7467f703f5cec075e8e60472868c60ac031f7
2021-10-12 09:48:53 -07:00
Monty Taylor
52a70baa8f Replace callback_whitelist with callback_enabled
This is throwing deprecation warnings from ansible. Also, honestly,
the new name is much clearer.

Change-Id: I5b4648235f1256178d8a102d7d7a1767a9096bfd
2021-07-30 09:56:00 -05:00
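For illustration, the rename in 52a70baa8f could be applied to an ansible.cfg with a small script like the one below. This is only a sketch, not the change itself: the helper name `rename_callback_option` and the sample plugin list are hypothetical, and it assumes the option lives in the `[defaults]` section. Note that the replacement option is documented by Ansible as `callbacks_enabled` (plural).

```python
import configparser
import io

OLD_KEY = "callback_whitelist"
NEW_KEY = "callbacks_enabled"  # name Ansible documents for the replacement option


def rename_callback_option(cfg_text: str) -> str:
    """Return cfg_text with the deprecated option renamed (hypothetical helper)."""
    # interpolation=None keeps any literal '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if parser.has_option("defaults", OLD_KEY):
        value = parser.get("defaults", OLD_KEY)
        parser.remove_option("defaults", OLD_KEY)
        parser.set("defaults", NEW_KEY, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Sample input; the plugin names are placeholders, not taken from the commit.
sample = "[defaults]\ncallback_whitelist = profile_tasks, timer\n"
print(rename_callback_option(sample))
```

Because `configparser` drops comments when it writes a file back out, a real configuration would more likely be edited by hand or through its template.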
Ian Wienand
0142bc10eb backups: add review02.opendev.org
Start backing up the new review server.  Stop backing up the old
server.  Fix the group matching test for the new server.

Change-Id: I8d84b80099d5c4ff7630aca9df312eb388665b86
2021-07-19 15:29:42 +10:00
Ian Wienand
8607ff7d81 review02: move out of staging group
This moves review02 out of the review-staging group and into the main
review group.  At this point, review01.openstack.org is inactive so we
can remove all references to openstack.org from the groups.  We update
the system-config job to run against a focal production server, and
remove the unneeded rsync setup used to move data.

This additionally enables replication; this should be a no-op when
applied, as part of the transition process is to manually apply this
so that DNS setup can pull zone changes from opendev.org.

It also switches to the mysql connector; as noted inline, we found some
issues with mariadb.

Note backups follow in a separate step to avoid doing too much at
once, hence dropping the backup group from the testing list.

Change-Id: I7ee3e3051ea8f3237fd5f6bf1dcc3e5996c16d10
2021-07-18 19:45:35 -07:00
David Moreau Simard
fb8a5145df Update ARA
ARA's master branch now has static site generation, so we can move
away from the stable branch and get the new reports.

In the meantime ARA upstream has moved to github, so this updates the
references for the -devel job.

Depends-On: https://review.opendev.org/c/openstack/project-config/+/793530
Change-Id: I008b35562994f1205a4f66e53f93b9885a6b8754
2021-06-01 09:38:32 +10:00
Clark Boylan
4c4e27cb3a Ansible mailman configs
This converts our existing puppeted mailman configuration into a set of
ansible roles and a new playbook. We don't try to do anything new and
instead do our best to map from puppet to ansible as closely as
possible. This helps reduce churn and will help us find problems more
quickly if they happen.

Followups will further clean up the puppetry.

Change-Id: If8cdb1164c9000438d1977d8965a92ca8eebe4df
2021-05-11 08:40:01 -07:00
Clark Boylan
7502b87837 Add zk04.opendev.org
We will be rotating zk01-03.openstack.org out and replacing them with
zk04-06.opendev.org. This is the first change in that process which puts
zk04 into the rotation. This should only be landed when operators are
ready to manually stop zookeeper on zk03 (which is being replaced by
zk04 in this change).

Change-Id: Iea69130f6b3b2c8e54e3938c60e4a3295601c46f
2021-04-15 13:20:29 -07:00
Clark Boylan
2eebb858af Remove firehose.openstack.org
Once we are satisfied that we have disabled the inputs to firehose we
can land this change to stop managing it in config management. Once that
is complete the server can be removed.

Change-Id: I7ebd54f566f8d6f940a921b38139b54a9c4569d8
2021-04-13 13:51:48 -07:00
Ian Wienand
dc827de23d Add kerberos-client group
We duplicate the KDC settings over all our kerberos clients.  Add
clients to a "kerberos-client" group and set the variables in a group
file.

Change-Id: I25ed5f8c68065060205dfbb634c6558488003a38
2021-03-18 11:59:30 +11:00
Zuul
2a0ea75fb7 Merge "install-ansible: ensure stevedore"
2021-03-09 02:52:10 +00:00
Clark Boylan
a2fd912511 Replace ze09-12.openstack.org with ze09-12.opendev.org
These are new focal replacement servers. Because this is the last set of
replacements for the executors we also clean up the testing of the old
servers in the system-config-run-zuul job and the inventory group
checker job.

Change-Id: I111d42c9dfd6488ef69ff1a7f76062a73d1f37bf
2021-03-08 10:13:29 -08:00
Ian Wienand
a12d2fce2b install-ansible: ensure stevedore
We have identified an issue with stevedore < 3.3.0 where the
cloud-launcher, running under ansible, makes stevedore hash a /tmp
path into an entry-point cache file it makes, causing a never-ending
expansion.

This appears to be fixed by [1] which is available in 3.3.0.  Ensure
we install this on bridge.  For good measure, add a ".disable" file as
we don't really need caches here.

There's currently 491,089 leaked files, so I didn't think it wise to
delete these in an ansible loop as it will probably time out the job.
We can do this manually once we stop creating them :)

[1] d7cfadbb7d

Change-Id: If5773613f953f64941a1d8cc779e893e0b2dd516
2021-03-04 08:29:01 +11:00
Clark Boylan
a42c0b704a Remove ze01.openstack.org
This server has been replaced by ze01.opendev.org running Focal. Let's
remove the old ze01.openstack.org from inventory so that we can delete
the server. We will follow this up with a rotation of new focal servers
being put in place.

This also renames the xenial executor in testing to ze12.openstack.org
as that will be the last one to be rotated out in production. We will
remove it from testing at that point as well.

We also remove a completely unused zuul-executor-opendev.yaml group_vars
file to avoid confusion.

Change-Id: Ida9c9a5a11578d32a6de2434a41b5d3c54fb7e0c
2021-03-02 10:21:59 -08:00
Clark Boylan
2a0508aa08 Add ze01.opendev.org
This is a focal replacement for ze01.openstack.org. Cleanup for
ze01.openstack.org will happen in a followup when we are happy with the
results of running zuul-executor on focal.

Change-Id: If1fef88e2f4778c6e6fbae6b4a5e7621694b64c5
2021-02-25 08:53:40 -08:00
Ian Wienand
39ffc685d6 backups: remove all bup
All hosts are now running their backups via borg to servers in
vexxhost and rax.ord.

For reference, the servers being backed up at this time are:

 borg-ask01
 borg-ethercalc02
 borg-etherpad01
 borg-gitea01
 borg-lists
 borg-review-dev01
 borg-review01
 borg-storyboard01
 borg-translate01
 borg-wiki-update-test
 borg-zuul01

This removes the old bup backup hosts, the no-longer used ansible
roles for the bup backup server and client roles, and any remaining
bup related configuration.

For simplicity, we will remove any remaining bup cron jobs on the
above servers manually after this merges.

Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c
2021-02-16 16:00:28 +11:00
Ian Wienand
312b9bec24 Refactor AFS groups
Both the fileservers and db servers have common key material deployed
by the openafs-server-config role.  Put both types of server in a new
group "afs-server-common" so we can define this key material in just
one group file on bridge.

Then separate out the two into afs-<file|db>-server groups for
consistent naming.

Rename afs-admin for consistent naming.

The service file is updated to reflect the new groups.

Change-Id: Ifa5f251fdfb8de737ad2ed96491d45294ce23a0c
2021-02-11 13:35:16 +11:00
Ian Wienand
92250eca82 Remove afs-1.8 group
With all AFS file-servers upgraded to 1.8, we can move afs01.dfw back
and rename the group to just "afs".

Change-Id: Ib31bde124e01cd07d6ff7eb31679c55728b95222
2021-01-21 07:08:29 +11:00
Ian Wienand
3cd8cd0765 devel job: use ansible-core name
As described inline, installing ansible from source now installs the
"ansible-core" package, instead of "ansible-base".  Since they can't
live together nicely, we have to do a manual override for the devel
job.

Change-Id: I1299ea330e6de048b661fc087f016491758631c7
2020-11-18 14:49:46 +11:00
Zuul
d11949817d Merge "Add all backup hosts to borg backups"
2020-11-09 23:39:51 +00:00
Ian Wienand
d533e89089 Add all backup hosts to borg backups
Backups have been going well on ethercalc02, so add borg backup runs
to all backed-up servers.  Port in some additional excludes for Zuul
and slightly modify the /var/ matching.

Change-Id: Ic3adfd162fa9bedd84402e3c25b5c1bebb21f3cb
2020-11-09 17:23:22 +11:00
Ian Wienand
f8852b76fb Remove mirror-update server and related puppet
This has all transitioned to Ansible and the mirror-update.opendev.org
server now.

Change-Id: I5f82139c981c2716f568b15b118690e943b02d52
2020-10-28 11:39:54 +11:00
Ian Wienand
1b4006757a Cleanup graphite01
Server is replaced with graphite02.opendev.org

Change-Id: Ie6099e935a6a7e10c818d1d3003e44bca11dd13a
2020-09-30 11:55:24 +10:00
Ian Wienand
016e961890 install-ansible: fix collections install for devel job
This wasn't quite fixed right when these were moved into
project-config.  Get the projects and install them.

Change-Id: I0f854609fc9aebffc1fa2a2e14d5231cce9b71d0
2020-09-21 17:27:23 +10:00
Zuul
98370830a3 Merge "Remove mirror01.regionone.linaro-us.opendev.org"
2020-09-18 04:46:09 +00:00
Zuul
59785b464f Merge "Ansible devel testing: install ansible-collections from checkout"
2020-09-18 01:36:15 +00:00
Zuul
eca7033d58 Merge "Fix ansible-devel job for Ansible 2.10 changes"
2020-09-18 01:36:13 +00:00
Ian Wienand
b3c01b30b3 install-ansible: move install_modules.sh to puppet-setup-ansible
Modules are collected on bridge and then synchronized to remote hosts
where puppet is run.  This is done to ensure an atomic run of puppet
across affected hosts.

These modules are described in modules.env and cloned by
install_modules.sh.  Currently this is done in install-ansible, but
after some recent refactoring
(I3b1cea5a25974f56ea9202e252af7b8420f4adc9) the best home for it
appears to now be in puppet-setup-ansible; just before the script is
run.

Change-Id: I4b1d709d7037e2851d73be4bc7a202f52858ad4f
2020-09-03 09:28:16 +10:00
Ian Wienand
600c9e78d4 Remove mirror01.regionone.linaro-us.opendev.org
Replaced with 02 mirror

Change-Id: I63114be35836f5ddb204e8c0ca5a1e10b056a4b0
2020-08-25 14:43:07 +10:00
Ian Wienand
d97b114d33 Ansible devel testing: install ansible-collections from checkout
Allow speculative testing of ansible collections in the -devel test
job by linking in the git checkouts from the dependent change.

Depends-On: https://review.opendev.org/747596
Change-Id: I014701f41fb6870360004aa64990e16e278381ed
2020-08-25 08:42:34 +10:00
Ian Wienand
66e249bf95 Fix ansible-devel job for Ansible 2.10 changes
The Ansible devel branch has pulled in some major changes that have
broken our -devel testing job.

Firstly, installing from source checkout now installs the package
"ansible-base"; this means when we install ARA, which has a dependency
on just "ansible" it pulls in the old 2.9 release (which is what the
-devel test is currently testing with -- the reason for this change).

We could remove ARA, but we quite like its reports for the nested
Ansible runs.  So make a dummy "ansible" 2.9 package and install that
to satisfy the dependency.

Secondly, Ansible devel has split out a lot of things into "community
modules".  To keep testing the -devel branch into the future, we need
to pull in the community modules for testing as well [1].

After some very useful discussion with jborean93 in #ansible I believe
the best way to do this is to clone the community projects into place
in the ansible configuration directory.  Longer term, we should make
Zuul check these out and use that, then we can speculatively test
changes too -- but for now just KISS.

[1] For reference, upstream bundles all this into the "Ansible
Community Distribution" or ACD, which is what you will get when you
download "ansible" from PyPI or similar.  But this job should be
pulling the bleeding edge of ansible and the community modules we use
-- that's what it's for.

Depends-On: https://review.opendev.org/747337
Change-Id: I781e275acb6af85f816ebcaf57a9825b50ca1196
2020-08-25 08:42:30 +10:00
Monty Taylor
fca18e4776 Stop cloning k8s-on-openstack
We're not actually using this repo at the moment.

Change-Id: I765140c65e4d7b45e2258d8fc267090f982de058
2020-07-14 08:22:49 -05:00
Ian Wienand
185797a0e5 Graphite container deployment
This deploys graphite from the upstream container.

We override the statsd configuration to have it listen on ipv6.
Similarly we override the nginx config to listen on ipv6, enable ssl,
forward port 80 to 443, and block the /admin page (we don't use it).

For production we will just want to put some cinder storage in
/opt/graphite/storage on the production host and figure out how to
migrate the old stats.  There is also a bit of cleanup that will follow,
because we half-converted grafana01.opendev.org -- so everything can't
be in the same group till that is gone.

Testing has been added to push some stats and ensure they are seen.

Change-Id: Ie843b3d90a72564ef90805f820c8abc61a71017d
2020-07-03 07:17:28 +10:00
James E. Blair
ff7aa016b0 Make disable-ansible fancier
So that we don't end up in a position where we find a DISABLE-ANSIBLE
file in place and wonder what it is or how it got there, ask the user
for a comment to place in the file.  Append to the file in case it
already exists.  Cat the file at the end to show the user all of the
comments in case there was one previously.  Include the date for even
more clues.

Change-Id: I9c22f94c5ea93452b2975d4aae3bf7fbd9c736d0
2020-06-15 16:14:34 -07:00
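The behaviour described in ff7aa016b0 (ask for a comment, append it with a date, then show the whole file) might look roughly like the sketch below. It is an illustration only: the actual utility is not shown in this log, and the function name `disable_ansible`, the line format, and the temporary flag-file path are all assumptions.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def disable_ansible(flag_file: Path, comment: str, user: str = "unknown") -> str:
    """Append a dated comment to the flag file and return its full contents."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    # Append mode preserves comments left by anyone who disabled ansible earlier.
    with flag_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{stamp} {user}: {comment}\n")
    # Return everything so the caller can display previous comments as well.
    return flag_file.read_text(encoding="utf-8")


# Demo against a throwaway directory; the real file location is not stated here.
with tempfile.TemporaryDirectory() as tmp:
    flag = Path(tmp) / "DISABLE-ANSIBLE"
    disable_ansible(flag, "debugging a failed deploy", user="alice")
    print(disable_ansible(flag, "still debugging", user="bob"))
```

Appending rather than overwriting is what keeps the earlier operator's reason visible to the next person who finds the file.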
Monty Taylor
3d58c5a30a Don't install puppet modules when we don't need them
We are currently cloning all of the puppet modules in install-ansible,
but we only need them when we run run-puppet. Move the cloning there
so that we can stop wasting the time in CI jobs that don't need them.
In prod, this should not have much impact.

Change-Id: I641ffc09e9e0801e0bc2469ceec97820ba354160
2020-06-15 14:35:41 -05:00
Zuul
d97a6e2d5e Merge "Add utility script to disable ansible"
2020-06-13 00:16:38 +00:00
Monty Taylor
dea12612c5 Add utility script to disable ansible
Touching the file works, but it's easy to misspell.

Change-Id: I4980ac2c290abd6cda39846e651fb490bfafe96f
2020-06-12 18:34:29 -05:00
Clark Boylan
c3ba8fd0b5 Remove extra src/ dirs from inventory paths
These extra dirs mean we don't find valid inventory so no ansible is
able to properly run.

Change-Id: I3abf1bc9b1cb752f64369de6d31daaadce4ff847
2020-06-08 10:03:05 -07:00
Monty Taylor
83ced7f6e6 Split inventory into multiple dirs and move hostvars
Make inventory/service for service-specific things, including the
groups.yaml group definitions, and inventory/base for hostvars
related to the base system, including the list of hosts.

Move the existing host_vars into inventory/service, since most of
them are likely service-specific. Move group_vars/all.yaml into
base/group_vars as almost all of it is related to base things,
with the exception of the gerrit public key.

A followup patch will move host-specific values into equivalent
files in inventory/base.

This should let us override hostvars in gate jobs. It should also
allow us to do better file matchers - and to be able to organize
our playbooks more if we want to.

Depends-On: https://review.opendev.org/731583
Change-Id: Iddf57b5be47c2e9de16b83a1bc83bee25db995cf
2020-06-04 07:44:36 -05:00
Ian Wienand
45201f3d66 Remove puppet mirror support
Remove the separate "mirror_opendev" group and rename it to just
"mirror".  Update various parts to reflect that change.

We no longer deploy any mirror hosts with puppet, remove the various
configuration files.

Depends-On: https://review.opendev.org/728345
Change-Id: Ia982fe9cb4357447989664f033df976b528aaf84
2020-05-16 10:14:25 +10:00
Ian Wienand
da6d1cbd06 Remove linaro-london cloud
This cloud is no longer used

Change-Id: I14ab277b3877f6674ec3172c06a39f383e76a1d0
Depends-On: https://review.opendev.org/728332
2020-05-16 10:14:09 +10:00
Ian Wienand
80e4b617f1 Remove mirror02.dfw.rax.openstack.org
Replaced by opendev mirror

Change-Id: Id5fc956421948c405d5675a746b5c4258905ac74
Depends-On: https://review.opendev.org/690757
2020-05-14 10:02:42 +10:00
Zuul
99f809ccc5 Merge "Use zuul checkouts of ansible roles from other repos"
2020-05-07 18:41:21 +00:00
Ian Wienand
e400865dd0 Retire nb01/02.openstack.org
Remove references to these older builders.  We thank them for their
service.

Change-Id: I1f48f070406bee79ac0d1de61beb44eb7d58d605
2020-05-07 13:06:26 +10:00
Monty Taylor
4b9d1a88bd Use zuul checkouts of ansible roles from other repos
We have two standalone roles, puppet and cloud-launcher, but we
currently install them with galaxy so depends-on patches don't
work. We also install them every time we run anything, even if
we don't need them for the playbook in question.

Add two roles, one to install a set of ansible roles needed by
the host in question, and the other to encapsulate the sequence
of running puppet, which now includes installing the puppet
role, installing puppet, disabling the puppet agent and then
running puppet.

As a followup, we'll do the same thing with the puppet modules,
so that we aren't cloning and rsyncing ALL of the puppet modules
all the time no matter what.

Change-Id: I69a2e99e869ee39a3da573af421b18ad93056d5b
2020-04-30 12:39:12 -05:00
Monty Taylor
f0b77485ec Run Zuul using Ansible and Containers
Zuul is publishing lovely container images, so we should
go ahead and start using them.

We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.

Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.

Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
2020-04-24 09:18:44 -05:00
James E. Blair
42574b2b37 Run ZK from containers
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
  * stop zk
  * split data and log files and move them to new locations
  * remove zk packages
  * start zk containers
* remove from emergency; land this change.

Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
2020-04-17 08:43:09 -07:00
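The rolling-restart portion of the migration plan in 42574b2b37 can be sketched as an ordered list of (node, step) pairs, where every step finishes on one node before the next node is touched. This is a hypothetical illustration: the node names are placeholders, the `action` callback stands in for whatever actually performs each step, and stopping at the first failure is an assumption rather than something the plan states.

```python
from typing import Callable, Iterable, List, Tuple

# Per-node steps, copied from the rolling-restart part of the plan.
PER_NODE_STEPS = (
    "stop zk",
    "split data and log files and move them to new locations",
    "remove zk packages",
    "start zk containers",
)


def rolling_plan(nodes: Iterable[str]) -> List[Tuple[str, str]]:
    """Order the work so all steps finish on one node before the next begins."""
    return [(node, step) for node in nodes for step in PER_NODE_STEPS]


def run_plan(
    plan: List[Tuple[str, str]], action: Callable[[str, str], bool]
) -> List[Tuple[str, str]]:
    """Run each (node, step) in order; stop at the first failure (an assumption)."""
    done: List[Tuple[str, str]] = []
    for node, step in plan:
        if not action(node, step):
            break
        done.append((node, step))
    return done


nodes = ["zk-a", "zk-b", "zk-c"]  # placeholder names, not the real hosts
for node, step in run_plan(rolling_plan(nodes), lambda n, s: True):
    print(f"{node}: {step}")
```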
Monty Taylor
c117c1106d Update install-ansible away from /opt/system-config
So that we can start running things from the zuul source rather
than update-system-config and /opt/system-config, we need to
install a few things onto the host in install-ansible so that the
ansible env is standalone.

This introduces a split execution path. The ansible config is
now all installed globally onto the machine by install-ansible
and does not reference a git checkout.

For running ad-hoc commands, an ansible.cfg is introduced inside
the root of the system-config dir. So if ansible-playbook is
executed with PWD==/opt/system-config it will find that ansible.cfg,
it will take precedence, and any content from system-config
will take precedence.

As a followup we'll make /opt/system-config/ansible.cfg written
out by install-ansible from the same template, and we'll update
the split to make ansible only work when executed from one of
the two configured locations, so that it's clear where we're
operating from.

Change-Id: I097694244e95751d96e67304aaae53ad19d8b873
2020-04-14 14:54:23 -05:00
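The split execution path in c117c1106d relies on Ansible's configuration lookup order: the `ANSIBLE_CONFIG` environment variable, then `ansible.cfg` in the current directory, then `~/.ansible.cfg`, then `/etc/ansible/ansible.cfg`. The sketch below models that order in simplified form (it ignores details such as Ansible's refusal to read a config from a world-writable directory). The function name `pick_ansible_cfg` is hypothetical, and the log does not state where install-ansible writes the global config, so the default path is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def pick_ansible_cfg(
    cwd: Path,
    env: Mapping[str, str],
    home: Path,
    global_cfg: Path = Path("/etc/ansible/ansible.cfg"),  # assumed location
) -> Optional[Path]:
    """Return the first config file found, in Ansible's lookup order (simplified)."""
    candidates = []
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(Path(env["ANSIBLE_CONFIG"]))
    candidates += [cwd / "ansible.cfg", home / ".ansible.cfg", global_cfg]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Demo with throwaway directories standing in for the checkout and global config.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    checkout = root / "system-config"
    checkout.mkdir()
    (checkout / "ansible.cfg").write_text("[defaults]\n")
    global_cfg = root / "global-ansible.cfg"
    global_cfg.write_text("[defaults]\n")
    # Inside the checkout, the local ansible.cfg wins over the global one.
    print(pick_ansible_cfg(checkout, {}, root, global_cfg))
    # Anywhere else, the globally installed config is used.
    print(pick_ansible_cfg(root, {}, root, global_cfg))
```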