928 Commits

Author SHA1 Message Date
Ian Wienand
e0acf4a68d Retire Asterisk service
As announced in [1], retire the Asterisk PBX service

[1] http://lists.opendev.org/pipermail/service-discuss/2021-March/000198.html

Change-Id: I527eb3423831c6a155228b6d79428681f60a3273
2021-05-07 09:53:17 +10:00
Zuul
d3b75eaa30 Merge "Use ECC (ed25519) for artifact signing keys" 2021-05-06 00:14:21 +00:00
Zuul
fec37d6534 Merge "Deprovision Limesurvey config management and docs" 2021-05-05 00:37:39 +00:00
Jeremy Stanley
b87e938a13 Clean up Gerrit global config documentation
Recent work has concluded adding OpenStack Release Manager
permissions explicitly to all openstack/ namespace projects with the
addition of inheritance from openstack/meta-config in their
individual ACLs. This made the earlier Release Manager permissions
in our global configuration redundant, so it's being removed. The
cleanup is done by hand due to how global configuration is managed
in Gerrit's All-Projects metaproject, but we're updating our
documentation to reflect it.

While here, clean up obsolete references to API-Projects inheritance
and stable/.* branch permissions which we've not applied for some
years now.

Change-Id: Ib9314f7a1deb3d343eb2d9b476064de41186f57a
2021-05-03 16:36:48 +00:00
Jeremy Stanley
fb418c2d4a Use ECC (ed25519) for artifact signing keys
GnuPG 2.3.0 (2021-04-07) switched the default key algorithm to
ed25519/cv25519. Even though we're not currently using such a new
release, this is a good signal that we should start doing the same
for our artifact signing keys. Thankfully our current GPG version on
bridge can create them using the --expert option, so document the
slight changes to the required commands and update the example
output to more closely match its new behavior.

While we're here, the version we're using also autogenerates
revocation certificates. Take advantage of that to slightly simplify
our key generation instructions.

Change-Id: Ibb1c5ae8c540713e1c39d0000497c6b8b89b67c8
2021-05-01 19:25:39 +00:00
Jeremy Stanley
1df1001cb4 Deprovision Limesurvey config management and docs
The Limesurvey service hosted at survey.openstack.org was a beta
which saw limited use. The platform it runs on, Xenial, is now EOL
from Ubuntu/Canonical and in order to upgrade to a newer
distribution release we would need to rewrite all the configuration
management (the version of Puppet supported by newer Ubuntu is not
backward-compatible with what we've been running).

If a similar service becomes interesting to users of our
collaboratory in the future, it will need to be reintroduced with
freshly written configuration management anyway. The old configs and
documentation remain in our Git history should anyone wish to use
them as inspiration.

Change-Id: I59b419cf112d32f20084ab93eb6f2417a7f93fdb
2021-05-01 15:12:00 +00:00
Clark Boylan
8346b9ac6f Add zk06.opendev.org to the zk cluster
This new zk06 instance will replace zk01 in the cluster.

Change-Id: Idb5ef47c80d6639744be361f0763b78f83327caf
2021-04-27 12:49:08 -07:00
Clark Boylan
30c1523f4c Add zk05.opendev.org to the zk cluster
This new zk05 instance will replace zk02 in the cluster.

Change-Id: I425708d6a241ad7a90266e5ba5b6ed544bfd5ff0
2021-04-27 10:38:08 -07:00
Zuul
82435b279a Merge "Add zk04.opendev.org" 2021-04-27 16:33:02 +00:00
Piotr Parczewski
ac8fa36a1e Remove Jenkins related documentation fragments
Change-Id: I4cc5b6e11634f3705ceedc0f1f12309f43e1d8e6
2021-04-18 21:42:04 +02:00
Clark Boylan
7502b87837 Add zk04.opendev.org
We will be rotating zk01-03.openstack.org out and replacing them with
zk04-06.opendev.org. This is the first change in that process which puts
zk04 into the rotation. This should only be landed when operators are
ready to manually stop zookeeper on zk03 (which is being replaced by
zk04 in this change).

Change-Id: Iea69130f6b3b2c8e54e3938c60e4a3295601c46f
2021-04-15 13:20:29 -07:00
Zuul
cb5898ae0a Merge "Remove firehose.openstack.org" 2021-04-14 18:50:16 +00:00
Clark Boylan
2eebb858af Remove firehose.openstack.org
Once we are satisfied that we have disabled the inputs to firehose we
can land this change to stop managing it in config management. Once that
is complete the server can be removed.

Change-Id: I7ebd54f566f8d6f940a921b38139b54a9c4569d8
2021-04-13 13:51:48 -07:00
Ian Wienand
db76061c71 Stop managing planet01.openstack.org
This server has been retired.
If141aca5efbdbe60c91ceefaa4e05c98cd0ba5bb has redirected this.

Change-Id: I8d3c089e6e845d98a46ae39c0b32b1c845436add
2021-04-13 16:17:14 +10:00
Ian Wienand
ce7ef6536a openafs-server-config: install UserList
This was missed during recent updates; this UserList needs to be on
all servers to allow bos, vos and backup commands.

Update the documentation to reflect the centralised copy.

Change-Id: I8ada3d5035bb7ef77b19ce6aaffb48335974a124
2021-03-30 09:49:53 +11:00
Ian Wienand
9f11fc5c75 Remove references to review-dev
With our increased ability to test in the gate, there's not much use
for review-dev any more.  Remove references.

Change-Id: I97e9865e0b655cd157acf9ffa7d067b150e6fc72
2021-03-24 11:40:31 +11:00
Zuul
b8874e4f51 Merge "kerberos-kdc: add database backups" 2021-03-19 00:06:59 +00:00
Clark Boylan
be1325fe2c Clean up the old openstack.org nodepool launchers.
These have been replaced with new focal .opendev.org hosts. Note we
don't want to land this until we have successfully transitioned from one set
of hosts to another.

Change-Id: I385a74c8a093f5baebb0d4858127c7595be191c0
2021-03-17 15:53:42 -07:00
Zuul
b2b1a9062d Merge "Add new opendev.org nodepool launchers" 2021-03-17 18:13:07 +00:00
Zuul
941d7e7eab Merge "Cleanup nl01.openstack.org" 2021-03-17 15:28:47 +00:00
Zuul
4524a92caf Merge "kerberos-kdc: role to manage Kerberos KDC servers" 2021-03-16 22:28:46 +00:00
Clark Boylan
680ed17ecd Add new opendev.org nodepool launchers
This adds the new focal nodepool launcher replacements for nl02-04 to
our inventory. This will configure them with an idle configuration. We
then confirm they are happy running in an idle state then switch over
the config from the old to new servers.

Depends-On: https://review.opendev.org/c/openstack/project-config/+/780982
Change-Id: Iea645925caaeee6f498aa690c4f2c848f6899317
2021-03-16 15:21:58 -07:00
Zuul
b133afedfd Merge "refstack: cleanup old puppet" 2021-03-16 22:21:03 +00:00
Clark Boylan
893ec329b4 Cleanup nl01.openstack.org
This server is no longer running a nodepool launcher and can be removed
from the inventory so that we can delete it. Next up we'll replace
02-04.

Change-Id: Ia71b9b616bde1018cd4ce3b8c882fba02677165d
2021-03-16 14:36:12 -07:00
Ian Wienand
3052ff4935 kerberos-kdc: add database backups
Add a script to save a db dump to borg backups.  Add the primary KDC
to our backup list.

Change-Id: I32f4ebc1bb4c1952034aba43c75e4d2f85a1b6d3
2021-03-17 08:31:52 +11:00
Ian Wienand
c1aff2ed38 kerberos-kdc: role to manage Kerberos KDC servers
This adds a role and related testing to manage our Kerberos KDC
servers, intended to replace the puppet modules currently performing
this task.

This role automates realm creation, initial setup, key material
distribution and replica host configuration.  None of this is intended
to run on the production servers which are already set up with an
active database, and the role should be effectively idempotent in
production.

Note that this does not yet switch the production servers into the new
groups; this can be done in a separate step under controlled
conditions and with related upgrades of the host OS to Focal.

Change-Id: I60b40897486b29beafc76025790c501b5055313d
2021-03-17 08:30:52 +11:00
Ian Wienand
018a14e34f refstack: cleanup old puppet
Remove old puppet configuration for the refstack service, which is now
managed by Ansible.

Change-Id: I6b6dfd0f8ef89a5362f64cfbc8016ba5b1a346b3
2021-03-17 07:06:53 +11:00
Clark Boylan
ed61423b6b Add nl01.opendev.org to our inventory
This is a new focal replacement for nl01.openstack.org. We keep
nl01.openstack.org in our inventory for now because we want ansible to
update the nodepool.yaml configs for these two hosts to coordinate a
hand off of responsibilities once we are happy with the new deployment.

We also switch the testing hostname to nl04.openstack.org as this will
be the last nodepool launcher to be removed. When we swap it out the
testing will be updated to use focal hosts.

Depends-On: https://review.opendev.org/c/openstack/project-config/+/779863
Change-Id: Ib3ea6586fe0567c1edf6255ee9be50164d35db62
2021-03-15 09:48:22 -07:00
Ian Wienand
3f1d67b99f Add afsdb03 openstack.org
We are in the process of upgrading the AFS servers to focal.  As
explained by auristor (extracted from IRC below) we need 3 servers to
actually perform HA with the ubik protocol:

 the ubik quorum is defined by the list of voting primary ip addresses
 as specified in the ubik service's CellServDB file.  The server with
 the lowest ip address gets 1.5 votes and the others 1 vote.  To win
 election requires greater than 50% of the votes.  In a two server
 configuration there are a total of 2.5 votes to cast.  1.5 > 2.5/2 so
 afsdb02.openstack.org always wins regardless of what
 afsdb01.openstack.org says.  And afsdb01.openstack.org can never win
 because 1 < 2.5/2.  by adding a third ubik server to the quorum, the
 total votes cast are 3.5 and it always requires the vote of two
 servers to elect a winner ...  if afsdb03 is added with the highest
 ip address, then either afsdb01 or afsdb02 can be elected

Add a third server which is a focal host and related configuration.
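
The vote arithmetic quoted above can be checked with a minimal sketch. This is only an illustration of the rule as described in the quote (lowest address gets 1.5 votes, the others 1 vote, winning needs more than half of all votes); it is not how ubik itself is implemented. The function names and the IP addresses are hypothetical, chosen so that afsdb02 has the lowest address and afsdb03 the highest, matching the description.

```python
from ipaddress import ip_address


def ubik_votes(servers):
    """Return each server's vote weight: 1.5 for the lowest IP, 1.0 otherwise.

    `servers` maps a host name to its (hypothetical) voting IP address.
    """
    lowest = min(servers, key=lambda name: ip_address(servers[name]))
    return {name: (1.5 if name == lowest else 1.0) for name in servers}


def can_win(supporters, servers):
    """True if the given supporters hold more than 50% of all votes."""
    votes = ubik_votes(servers)
    total = sum(votes.values())
    cast = sum(votes[name] for name in supporters)
    return cast > total / 2


# Hypothetical addresses from the documentation range 192.0.2.0/24.
two_servers = {"afsdb01": "192.0.2.20", "afsdb02": "192.0.2.10"}
three_servers = dict(two_servers, afsdb03="192.0.2.30")

# Two servers: 2.5 votes in total, so afsdb02 (1.5) wins alone and
# afsdb01 (1.0) can never win.
print(can_win(["afsdb02"], two_servers))
print(can_win(["afsdb01"], two_servers))

# Three servers: 3.5 votes in total, so no single server wins and any
# two servers together can elect a winner.
print(can_win(["afsdb02"], three_servers))
print(can_win(["afsdb01", "afsdb03"], three_servers))
```

With two servers the lowest-address host always holds a majority on its own; with three, a winner needs the votes of at least two hosts.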

Change-Id: I59e562dd56d6cbabd2560e4205b3bd36045d48c2
2021-03-01 15:51:49 +11:00
Zuul
4a6afe927f Merge "Adjust the example Etherpad API delete command" 2021-02-15 22:12:44 +00:00
Ian Wienand
f4209757c2 docs: add note on service-incident list
This is a private list to contact administrators that is suitable for
raising security concerns.

Change-Id: I886f67d875abd09753511f6c33312cfc5eb62933
2021-02-15 06:26:18 +00:00
Ian Wienand
116a2ca4a4 doc: update backup instructions
Update the backup instructions for some recent changes.  Make a note
of the streaming backup method, discuss some caveats with append-only
mode and discuss the pruning scripts and when to run
(c.f. I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e,
I250d84c4a9f707e63fef6f70cfdcc1fb7807d3a7).

Change-Id: Idb04ebfa5666cd3c20bc0132683d187e705da3f1
2021-02-09 12:15:24 +11:00
Ian Wienand
61e9d0948a Remove AFS puppet
This has all been replaced by Ansible roles and is no longer used

Change-Id: Ic807498ad3ca4f305b168464b86fe197a61b4d13
2021-01-21 07:08:37 +11:00
Jeremy Stanley
501de530d1 Adjust the example Etherpad API delete command
Because our docker images include few CLI utilities, make the
example so that we rely on outside utilities on the host system for
making http connections to the API socket for simplicity.

Change-Id: I6a8abdbb55120db7d0f0b97255824f5a8fac76cb
2021-01-13 17:05:00 +00:00
Zuul
1b16dae681 Merge "Migrate codesearch site to container" 2020-11-19 22:26:12 +00:00
Ian Wienand
368466730c Migrate codesearch site to container
The hound project has undergone a small re-birth and moved to

 https://github.com/hound-search/hound

which has broken our deployment.  We've talked about leaving
codesearch up to gitea, but it's not quite there yet.  There seems to
be no point working on the puppet now.

This builds a container that runs houndd.  It's an opendev specific
container; the config is pulled from project-config directly.

There are some custom scripts that drive things.  Some points for
reviewers:

 - update-hound-config.sh uses "create-hound-config" (which is in
   jeepyb for historical reasons) to generate the config file.  It
   grabs the latest projects.yaml from project-config and exits with a
   return code to indicate if things changed.

 - when the container starts, it runs update-hound-config.sh to
   populate the initial config.  There is a testing environment flag
   and small config so it doesn't have to clone the entire opendev for
   functional testing.

 - it runs under supervisord so we can restart the daemon when
   projects are updated.  Unlike earlier versions that didn't start
   listening till indexing was done, this version now puts up a "Hound
   is not ready yet" message while it is working; so we can drop
   all the magic we were doing to probe if hound is listening via
   netstat and making Apache redirect to a status page.

 - resync-hound.sh is run from an external cron job daily, and does
   this update and restart check.  Since it only reloads if changes
   are made, this should be relatively rare anyway.

 - There is a PR to monitor the config file
   (https://github.com/hound-search/hound/pull/357) which would mean
   the restart is unnecessary.  This would be good in the near future,
   and we could remove the cron job.

 - playbooks/roles/codesearch is unexciting and deploys the container,
   certificates and an apache proxy back to localhost:6080 where hound
   is listening.
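
The "update, then restart only if something changed" pattern described in the points above can be sketched as follows. This is a hedged illustration in Python, not the contents of update-hound-config.sh or resync-hound.sh; the function name `update_config` and the file name in the usage lines are hypothetical.

```python
import os
import tempfile


def update_config(path, new_content):
    """Write new_content to path only if it differs from what is there.

    Returns True when the file was changed, so a caller can decide
    whether a daemon restart is needed at all.
    """
    try:
        with open(path, encoding="utf-8") as current:
            if current.read() == new_content:
                return False  # nothing changed, no restart needed
    except FileNotFoundError:
        pass  # first run: there is no existing config to compare against

    # Write to a temporary file in the same directory and rename it into
    # place, so a reader never sees a half-written config.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(new_content)
    os.replace(tmp_path, path)
    return True


# Usage with a throwaway directory and a hypothetical file name.
workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "config.json")
print(update_config(config_path, '{"repos": {}}'))
print(update_config(config_path, '{"repos": {}}'))
```

A wrapper script could map the boolean to an exit code and restart the daemon only when a change is reported.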

I've combined removal of the old puppet bits here as the "-codesearch"
namespace was already being used.

Change-Id: I8c773b5ea6b87e8f7dfd8db2556626f7b2500473
2020-11-20 07:41:12 +11:00
Zuul
c6a835ecc4 Merge "Stop managing gerrit's local git mirror dir" 2020-11-17 22:13:24 +00:00
Zuul
d3a53e8ec0 Merge "Remove mirror-update server and related puppet" 2020-11-09 21:07:11 +00:00
Zuul
15d579cf31 Merge "Document dual account split for Gerrit admins" 2020-11-05 17:19:50 +00:00
Ian Wienand
eb07ab3613 borg-backup: add fuse
Add the FUSE dependencies for our hosts backed up with borg, along
with a small script to make mounting the backups easier.  This is the
best way to recover something quickly in what is sure to be a
stressful situation.

Documentation and testing are updated.

Change-Id: I1f409b2df952281deedff2ff8f09e3132a2aff08
2020-11-05 11:56:46 +11:00
Jeremy Stanley
427ae2a2aa Document dual account split for Gerrit admins
Our Gerrit admins follow this model of access management now, in
order to shield Administrators permission from external identity
provider risks.

Change-Id: I3070c28c26548d364da38d366bfa2ac8b2fb4668
2020-10-28 21:03:20 +00:00
Ian Wienand
c49ece9204 Cleanup grafana.openstack.org
The opendev.org server is in production, cleanup the old puppet-based
host.

Change-Id: I6db3ce929226a23b96234b52ece8b17f4c6a326a
2020-10-29 07:59:42 +11:00
Ian Wienand
f8852b76fb Remove mirror-update server and related puppet
This has all transitioned to Ansible and the mirror-update.opendev.org
server now.

Change-Id: I5f82139c981c2716f568b15b118690e943b02d52
2020-10-28 11:39:54 +11:00
Clark Boylan
9011096d49 Stop managing gerrit's local git mirror dir
We stopped serving this content and the next step is to stop managing it
internally. This depends on a change to jeepyb that makes the local git
dir management on the jeepyb side optional. Once that lands we can
update our configs to tell jeepyb to stop managing it.

We also stop doing garbage collection, mounting it into containers that
don't need it, etc.

Depends-On: https://review.opendev.org/758597
Change-Id: I2185e90edfcac71941bc29a4e11b7b2d4c7c2e13
2020-10-16 09:41:07 -07:00
Zuul
083e8b43ea Merge "Add borg-backup roles" 2020-10-01 07:36:47 +00:00
Ian Wienand
e3fb7d2be0 docs: Update some of sysadmin details
Give a little more details on the current ci/cd setup; remove puppet
cruft.

Change-Id: I684df4459cf5940d70b89e4c05103f8a8352af87
2020-09-07 17:14:21 +10:00
Ian Wienand
028d655375 Add borg-backup roles
This adds roles to implement backup with borg [1].

Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal.  This means it is effectively end-of-life.  borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported.  It also has the clarkb seal
of approval :)

As mentioned, borg works in the same manner as bup by doing an
efficient backup over ssh to a remote server.  The core of these
roles is the same as the bup based ones; in terms of creating a
separate user for each host and deploying keys and ssh config.

This chooses to install borg in a virtualenv on /opt.  This was chosen
for a number of reasons; firstly reading the history of borg there
have been incompatible updates (although they provide a tool to update
repository formats); it seems important that we both pin the version
we are using and keep clients and server in sync.  Since we have a
heterogeneous distribution collection we don't want to rely on the
packaged tools which may differ.  I don't feel like this is a great
application for a container; we actually don't want it that isolated
from the base system because its goal is to read and copy it offsite
with as little chance of things going wrong as possible.

Borg has a lot of support for encrypting the data at rest in various
ways.  However, that introduces the possibility we could lose both the
key and the backup data.  Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.

The remote end server is configured via ssh command rules to run in
append-only mode.  This means a misbehaving client can't delete its
old backups.  In theory we can prune backups on the server side --
something we could not do with bup.  The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.

Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server.  Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.

No hosts are currently in the borg groups, so this can be applied
without affecting production.  I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this.  After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.

[1] https://borgbackup.readthedocs.io/en/stable/

Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
2020-07-21 17:36:50 +10:00
Ian Wienand
b146181174 Grafana container deployment
This uses the Grafana container created with
Iddfafe852166fe95b3e433420e2e2a4a6380fc64 to run the
grafana.opendev.org service.

We retain the old model of an Apache reverse-proxy; it's well tested
and understood, it's much easier than trying to map all the SSL
termination/renewal/etc. into the Grafana container and we don't have
to convince ourselves the container is safe to be directly web-facing.

Otherwise this is a fairly straightforward deployment of the
container.  As before, it uses the graph configuration kept in
project-config which is loaded in with grafyaml, which is included in
the container.

One nice advantage is that it makes it quite easy to develop graphs
locally, using the container which can talk to the public graphite
instance.  The documentation has been updated with a reference on how
to do this.

Change-Id: I0cc76d29b6911aecfebc71e5fdfe7cf4fcd071a4
2020-07-03 07:17:22 +10:00
Zuul
c3f5a87a5e Merge "Update refstack reference after rename" 2020-06-19 16:15:34 +00:00
Ian Wienand
ceb711e6d9 Swap mirror-update01 for mirror-update02
This is a new Focal based host, which we want for its more recent
rsync which hopefully causes fewer issues resyncing things to AFS
volumes.

See 4918594aa472010a8a112f5f4ed0a471a3351a91 for discussion of the
original issues; we have found that without "-t" all new data seems to
be copied continuously.  Empirical testing shows later rsync doesn't
have this issue.

Depends-On: https://review.opendev.org/736859
Change-Id: Iebfffdf8aea6f123e36f264c87d6775771ce2dd8
2020-06-19 08:41:44 +10:00