168 Commits

Author SHA1 Message Date
James E. Blair
14f4a20628 Remove gearman from Zuul
Zuul no longer uses gearman, so we can remove the infrastructure
around it.

Change-Id: I3613d812971add4733d3fe509ee22835e5814ec6
2022-02-01 13:52:47 -08:00
James E. Blair
535b7162a1 Move Zuul SQL connection to "database"
The sql connection is no longer supported, we need to use "database"
instead.  The corresponding hostvars change has already been made
on bridge.

Change-Id: Ibcac56568f263bd50b2be43baa26c8c514c5272b
2022-01-27 16:46:32 -08:00
Clark Boylan
9bfacda1ac Upgrade Gerrit to 3.4
The actually upgrade will be performed manually, but this change will be
used to update the docker-compose.yaml file.

If we land this change prior to the upgrade then note the
manage-projects commands will be updated to use the 3.4 image possibly
while gerrit 3.3 is still running. I don't expect this to be a problem
as manage-projects operates via network protocols.

Change-Id: I5775f4518ec48ac984b70820ebd2e645213e702a
2022-01-24 10:54:54 -08:00
Zuul
2863b5a509 Merge "Use newlist's automate option" 2021-12-15 19:09:57 +00:00
Jeremy Stanley
759e285184 Use newlist's automate option
It appears that simply setting stdin to an empty string is
insufficient to make newlist calls from Ansible correctly look like
they're coming from a non-interactive shell. As it turns out, newer
versions of the command include a -a (--automate) option which does
exactly what we want: sends list admin notifications on creation
without prompting for manual confirmation.

Drop the test-time addition of -q to quell listadmin notifications,
as we now block outbound 25/tcp from nodes in our deploy tests. This
has repeatedly exposed a testing gap, where the behavior in
production was broken because of newlist processes hanging awaiting
user input even though we never experienced it in testing due to the
-q addition there.

Change-Id: I550ea802929235d55750c4d99c7d9beec28260f0
2021-12-15 17:42:58 +00:00
Zuul
b044cba65a Merge "Block outbound SMTP connections from test jobs" 2021-12-14 20:46:12 +00:00
Jeremy Stanley
e2dbda1bec Block outbound SMTP connections from test jobs
Our deployment tests don't need to send E-mail messages. More to the
point, they may perform actions which would like to send E-mail
messages. Make sure, at the network level, they'll be prevented from
doing so. Also allow all connections to egress from the loopback
interface, so that services like mailman can connect to the Exim MTA
on localhost.

Add new rolevars for egress rules to support this, and also fix up
some missing related vars in the iptables role's documentation.

Change-Id: If4acd2d3d543933ed1e00156cc83fe3a270612bd
2021-12-09 18:46:38 +00:00
James E. Blair
dbc69021e2 Add zuul-client config to schedulers
This adds a zuul-client config file as well as a convenience script
to execute the docker container to the schedulers.

Change-Id: Ief167c6b7f0407f5eaebecde552e8d91eb3d4ab9
2021-12-07 14:26:29 -08:00
Zuul
5a2f1c7037 Merge "Add local auth provider to zuul" 2021-12-07 17:54:57 +00:00
Zuul
94bc7c1455 Merge "Add a keycloak server" 2021-12-04 16:50:26 +00:00
James E. Blair
e79dbbe6bb Add a keycloak server
This adds a keycloak server so we can start experimenting with it.

It's based on the docker-compose file Matthieu made for Zuul
(see https://review.opendev.org/819745 )

We should be able to configure a realm and federate with openstackid
and other providers as described in the opendev auth spec.  However,
I am unable to test federation with openstackid due its inability to
configure an oauth app at "localhost".  Therefore, we will need an
actual deployed system to test it.  This should allow us to do so.

It will also allow use to connect realms to the newly available
Zuul admin api on opendev.

It should be possible to configure the realm the way we want, then
export its configuration into a JSON file and then have our playbooks
or the docker-compose file import it.  That would allow us to drive
change to the configuration of the system through code review.  Because
of the above limitation with openstackid, I think we should regard the
current implementation as experimental.  Once we have a realm
configuration that we like (which we will create using the GUI), we
can chose to either continue to maintain the config with the GUI and
appropriate file backups, or switch to a gitops model based on an
export.

My understanding is that all the data (realms configuration and session)
are kept in an H2 database.  This is probably sufficient for now and even
production use with Zuul, but we should probably switch to mariadb before
any heavy (eg gerrit, etc) production use.

This is a partial implementation of https://docs.opendev.org/opendev/infra-specs/latest/specs/central-auth.html

We can re-deploy with a new domain when it exists.

Change-Id: I2e069b1b220dbd3e0a5754ac094c2b296c141753
Co-Authored-By: Matthieu Huin <mhuin@redhat.com>
2021-12-03 14:17:23 -08:00
James E. Blair
737717585d Add local auth provider to zuul
This will allow us to issue internally generated auth tokens so
that we can use the zuul CLI to perform actions against the REST
API.

Change-Id: I09cafa2e820f5d0e7fa9ada00b9622de093242c7
2021-12-02 15:38:50 -08:00
Ian Wienand
f29aa2da16 Make haproxy role more generic
This makes the haproxy role more generic so we can run another (or
potentially even more) haproxy instance(s) to manage other services.

The config file is moved to a variable for the haproxy role.  The
gitea specific config is then installed for the gitea-lb service by a
new gitea-lb role.

statsd reporting is made optional with an argument.  This
enables/disables the service in the docker compose.

Role documenation is updated.

Needed-By: https://review.opendev.org/678159
Change-Id: I3506ebbed9dda17d910001e71b17a865eba4225d
2021-12-01 09:55:45 +11:00
Ian Wienand
855efc9010 Enable mirroring of 9-stream
This is a re-implementation of
I195ebee548071b0b89bd5bf64b251595271178ca that puts 9-stream in a
separate AFS volume

(Note the automated volume name "mirror.centos-stream" comes just
short of the limit)

Change-Id: I483c2982a6931e7d6fc97ab82f7750b72d2ef265
2021-11-15 17:54:54 +11:00
Clark Boylan
63f5674e6f Switch test gerrit hostname to review99.opendev.org
Previously we had set up the test gerrit instance to use the same
hostname as production: review02.opendev.org. This causes some confusion
as we have to override settings specifically for testing like a reduced
heap size, but then also copy settings from the prod host vars as we
override the host vars entirely. Using a new hostname allows us to use a
different set of host vars with unique values reducing confusion.

Change-Id: I4b95bbe1bde29228164a66f2d3b648062423e294
2021-10-12 09:48:53 -07:00
Clark Boylan
76baae4e3f Replace testing group vars with host vars for review02
Previously we had a test specific group vars file for the review Ansible
group. This provided junk secrets to our test installations of Gerrit
then we relied on the review02.opendev.org production host vars file to
set values that are public.

Unfortunately, this meant we were using the production heapLimit value
which is far too large for our test instances leading to the occasionaly
failure:

  There is insufficient memory for the Java Runtime Environment to continue.
  Native memory allocation (mmap) failed to map 9596567552 bytes for committing reserved memory.

We cannot set the heapLimit in the group var file because the hostvar
file overrides those values. To fix this we need to replace the test
specific group var contents with a test specific host var file instead.
To avoid repeating ourselves we also create a new review.yaml group_vars
file to capture common settings between testing and prod. Note we should
look at combining this new file with the gerrit.yaml group_vars.

On the testing side of things we set the heapLimit to 6GB, we change the
serverid value to prevent any unexpected notedb confusion, and we remove
replication config.

Change-Id: Id8ec5cae967cc38acf79ecf18d3a0faac3a9c4b3
2021-10-12 09:48:45 -07:00
Zuul
fed8ec476b Merge "Upgrade Gerrit to 3.3" 2021-10-10 20:45:48 +00:00
Jeremy Stanley
6df026852e Update ptgbot jobs to use #opendev-sandbox channel
The default channel name in the ptgbot role defaults did not
correctly specify a starting hash which it requires, but also the
test jobs seem to need it set in the eavesdrop group vars specific
to testing.

Change-Id: I16cdeac4f7af50e2cac36c80d78f3a87f482e4aa
2021-10-07 19:34:15 +00:00
Clark Boylan
e47dccdc34 Upgrade Gerrit to 3.3
This bumps the gerrit image up to our 3.3 image. Followup changes will
shift upgrade testing to test 3.3 to 3.4 upgrades, clean up no longer
needed 3.2 images, and start building 3.4 images.

Change-Id: Id0f544846946d4c50737a54ceb909a0a686a594e
2021-10-07 11:54:46 -07:00
Ian Wienand
547a4578bd letsencrypt : don't use staging in the gate
Currently we connect to the LE staging environment with acme.sh during
CI to get the DNS-01 tokens (but we never follow-through and actually
generate the certificate, as we have nowhere to publish the tokens).
We've known for a while that LE staging isn't really meant to be used
by CI like this, and recent instability has made the issue pronounced.

This modifies the driver script to generate fake tokens which work to
ensure all the DNS processing, etc. is happening correctly.

I have put this behind a flag so the letsencrypt job still does this
however.  I think it is worth this job actually calling acme.sh to
validate this path; this shouldn't be required too often.

Change-Id: I7c0b471a0661aa311aaa861fd2a0d47b07e45a72
2021-10-06 15:34:21 +11:00
James E. Blair
ac1dd4eedd Assume gitea reverse proxy
We now depend on the reverse proxy not only for abuse mitigation but
also for serving .well-known files with specific CORS headers.  To
reduce complexity and avoid traps in the future, make it non-optional.

Change-Id: I54760cb0907483eee6dd9707bfda88b205fa0fed
2021-08-20 22:06:03 -07:00
Zuul
92ead4baa1 Merge "Remove the mysql support from our gerrit role and image" 2021-08-10 23:32:37 +00:00
Clark Boylan
75e0cf106a Remove the mysql support from our gerrit role and image
We are now using the mariadb jdbc connector in production and no longer
need to include the mysql legacy connector in our images. We also don't
need support for h2 or mysql as testing and prod are all using the
mariadb connector and local database.

Note this is a separate change to ensure everything is happy with the
mariadb connector before we remove the fallback mysql connector from our
images.

Change-Id: I982d3c3c026a5351bff567ce7fbb32798718ec1b
2021-08-10 13:06:54 -07:00
Zuul
84091f5de4 Merge "Improve gerrit known_hosts management" 2021-08-06 17:10:19 +00:00
Zuul
af5fcdcb13 Merge "Run matrix-eavesdrop on eavesdrop" 2021-08-02 17:00:09 +00:00
Clark Boylan
f6a0bf7be5 Improve gerrit known_hosts management
Previously we were only managing root's known_hosts via ansible but even
then this wasn't happening because the gerrit_self_hostkey var wasn't
set anywhere. On top of that we need to manage multiple known_hosts
because gerrit must recognize itself and all of the gitea servers.
Update the code to take a dict of host key values and add each entry to
known_hosts for both the root and gerrit2 user.

We remove keyscans from tests to ensure that this update is actually
working.

Change-Id: If64c34322f64c1fb63bf2ebdcc04355fff6ebba2
2021-08-02 09:53:27 -07:00
James E. Blair
82c966e6da Run matrix-eavesdrop on eavesdrop
Thin runs the new matrix-eavesdrop bot on the eavesdrop server.

It will write logs out to the limnoria logs directory, which is mounted
inside the container.

Change-Id: I867eec692f63099b295a37a028ee096c24109a2e
2021-07-28 18:34:58 -05:00
James E. Blair
efd6ed5be8 Add DNSSEC configuration for gating.dev
Change-Id: I4d62968456ac72d4f84a63104932cc28d27feccb
2021-07-22 09:36:17 -07:00
Zuul
d68f8ce7bb Merge "Remove review01 references" 2021-07-21 03:08:24 +00:00
Ian Wienand
e79e3a2f04 Remove review01 references
This server is no longer in production, so remove the various
references to it.

Change-Id: I2cdd8052c48713e9ba648be20ccad5069d5fe40e
2021-07-20 11:57:10 +10:00
Ian Wienand
21e25cb4f6 gerrit: fix Launchpad credentials write
The extant variable name is never set so this never writes anything
out.  Move it to a dictionary value.  Use stub values for testing,
this way we don't need the "when:".

Additionally remove an unused old template file.

Change-Id: Id96fde79e28f309aa13e16bdda29f004c3c69c4b
2021-07-20 10:54:22 +10:00
Zuul
f1b559bb7a Merge "review02: move out of staging group" 2021-07-19 04:49:37 +00:00
Ian Wienand
8607ff7d81 review02: move out of staging group
This moves review02 out of the review-staging group and into the main
review group.  At this point, review01.openstack.org is inactive so we
can remove all references to openstack.org from the groups.  We update
the system-config job to run against a focal production server, and
remove the unneeded rsync setup used to move data.

This additionally enables replication; this should be a no-op when
applied as part of the transition process is to manually apply this,
so that DNS setup can pull zone changes from opendev.org.

It also switches to the mysql connector, as noted inline we found some
issues with mariadb.

Note backups follow in a separate step to avoid doing too much at
once, hence dropping the backup group from the testing list.

Change-Id: I7ee3e3051ea8f3237fd5f6bf1dcc3e5996c16d10
2021-07-18 19:45:35 -07:00
Zuul
dea42eb61f Merge "Enable openEuler mirroring" 2021-07-16 04:26:02 +00:00
Xinliang Liu
e54cc45bb8 Enable openEuler mirroring
Mirror latest LTS release openEuler-20.03-LTS-SP2.

Change-Id: I134b0c8b119d4662fc56f139a7ff4b0c7d6a4980
2021-07-15 07:12:22 +00:00
Ian Wienand
916c1d3dc8 Add paste service
The paste service needs an upgrade; since others have created a
lodgeit container it seems worth us keeping the service going if only
to maintain the historical corpus of pastes.

This adds the ansible to deploy lodgeit and a sibling mariadb
container.  I have imported a dump of the old data as a test.  The
dump is ~4gb and imported it takes up about double that; certainly
nothing we need to be too concerned over.  The server will be more
than capable of running the db container alongside the lodgeit
instance.

This should have no effect on production until we decide to switch
DNS.

Change-Id: I284864217aa49d664ddc3ebdc800383b2d7e00e3
2021-07-07 15:12:04 +10:00
Zuul
9181d5198d Merge "gerrit: add mariadb_container option" 2021-06-16 23:14:48 +00:00
Ian Wienand
570ca85cd8 gerrit: add mariadb_container option
This adds a local mariadb container to the gerrit host to hold the
accountPatchReviewDb database.  This is inspired by a few things

 - since migration to NoteDB, there is only one table left where
   Gerrit records what files have been reviewed for a change.  This
   logically scales with the number of reviews users are doing.
   Pulling the stats on this, we can see since the NoteDB upgrade this
   went from a very busy database (~300 queries/70 commits per second)
   to barely registering one hit per second :
   https://imgur.com/a/QGJV7Fw

   Thus separating the db to an external host for performance reasons
   is not a large concern any more.

 - emperically we've done a bad job in keeping the existing hosted db
   up-to-date; it's still running mysql 5.1 and we have been hit by
   bugs such as the one referenced in-line which silently drops
   backups.

 - The other gerrit option is to use an on-disk H2 database.  This is
   certainly an option, however you need special tools to interact
   with it for migration, etc. and it's not safe to backup from files
   on disk (as opposed to mysqldump).  Upstream advice is unclear, and
   varies between H2 being a performance bottleneck to this being
   ephemeral data that users don't care about.  We know how to admin
   mariadb/mysql and this allows us to migrate and backup data, so
   seems like the best choice.

 - we have a pressing need to update the server to a new operating
   system.  Running the db alongside the gerrit instance minimises
   fiddling we have to do manging connections to and migrating the
   hosted db systems.

 - related to that, we are tending towards more provider independence
   for control-plane servers.  A hosted database product is not always
   provided, so this gives us more flexibility in moving things
   around.

 - the main concern here is memory usage.  "docker stats" reports a
   quiescent container, freshly started on a 8GB host:

    gerrit-compose_mariadb_1  67.32MiB

   After loading a copy of the production table, and then dumping it
   back to a file the same container reports:

    gerrit-compose_mariadb_1  462.6MiB

The existing remote mysql configuration path remains mostly the same.
We move the gerrit startup into a script rather than a CMD so we can
call it after a "wait for db" script in the mariadb_container case
(this is the reccommeded way to enforce ordering [1]).

Backups of the local container need different dump commands; backups
are relocated to a new file and updated.

Testing is converted to use this rather than a local H2 database.

[1] https://docs.docker.com/compose/startup-order/

Change-Id: Iec981ef3c2e38889f91e9759e66295dbfb499c2e
2021-06-16 13:57:13 +10:00
Ian Wienand
ef14d11eae statusbot: don't prefix with extra # for testing
statusbot doesn't need a prefix, let's not pollute another channel.

Change-Id: Ifcacad64286c281bf668870688af8dca35622551
2021-06-11 23:30:46 +10:00
Ian Wienand
4ffcc89c8a statusbot: don't use opendevstatus name in testing
Currently when we run tests, this connects to OFTC and tries to use
the opendevstatus nick as it is the default.  Replace this with a
random username.  Also override the channels list, so it only joins

Limnoria was already using a non-conflicting name, but switch it to a
random one for consistency and possible parallel running.  This also
already only joins #opendev-sandbox.

Change-Id: I860b0f1ed4f99140dda0f4d41025f0b5fb844115
2021-06-11 22:59:24 +10:00
Ian Wienand
23fac31c92 Run statusbot from eavesdrop01.opendev.org
This installs statusbot on eavesdrop01.opendev.org.

Otherwise it's just config translation and bringing up the daemon.

Change-Id: I246b2723372594e65bcd1ba90215d6831d4c0c72
2021-06-11 07:52:51 +10:00
Ian Wienand
ccda6d08a1 Move meetbot config to eavesdrop01.opendev.org
This enables the new eavesdrop01.opendev.org server in all current
channels.  Puppet has been disabled on the old server and we will
manually stop supybot/meetbot and mirgrate logs before this applies.

Change-Id: I4a422bb9589c8a8761191313a656f8377e93422f
2021-06-10 09:02:23 +10:00
Clark Boylan
4c4e27cb3a Ansible mailman configs
This converts our existing puppeted mailman configuration into a set of
ansible roles and a new playbook. We don't try to do anything new and
instead do our best to map from puppet to ansible as closely as
possible. This helps reduce churn and will help us find problems more
quickly if they happen.

Followups will further cleanup the puppetry.

Change-Id: If8cdb1164c9000438d1977d8965a92ca8eebe4df
2021-05-11 08:40:01 -07:00
Clark Boylan
f1df36145d Add inmotion cloud to cloud launcher
This adds the new inmotion cloud to clouds.yaml files and the cloud
launcher config. This cloud is running on an openstack as a service
platform so we have quite a bit of freedom to make changes here within
the resource limitations if necessary.

Change-Id: I2aed6dffde4a1d6e3044c4bd8df4ca60065ae1ea
2021-04-21 11:18:40 -07:00
Ian Wienand
efdaa9a12a Add OSU OSL to nodepool configuration
Change-Id: Id97345595a4463617bc1a93675d35e32cfff7d08
2021-04-14 12:34:06 +10:00
Zuul
c2ba9ae565 Merge "Add zuul keystore password" 2021-04-13 17:15:09 +00:00
Ian Wienand
81db207d33 all-clouds: add OSU OSL project_id as well
Otherwise you get

 BadRequest: Expecting to find domain in project - the server could
 not comply with the request since it is either malformed or otherwise
 incorrect. The client is assumed to be in error.

Change-Id: If8869fe888c9f1e9c0a487405574d59dd3001b65
2021-04-13 13:31:49 +10:00
James E. Blair
4505baf9f9 Add zuul keystore password
This matches the proposal in https://review.opendev.org/785972

It's safe to merge now (secret storage on bridge is updated) and get
ahead of the curve.  It's harmless to add unused items.

Change-Id: I942ef5f95f9f1afe39b7d9a044276bfb338d6760
2021-04-12 14:58:07 -07:00
Ian Wienand
28ffbfb12c Add OSUOSL cloud
The Oregon State University Open Source Lab (OSUOSL;
https://osuosl.org/) has kindly donated some ARM64 resources.  Add
initial cloud config.

Change-Id: I43ed7f0cb0b193db52d9908e39c04e351b3887e3
2021-04-12 09:31:51 +10:00
Jeremy Stanley
fd98a1750d Clean up OpenEdge configuration
The OpenEdge cloud has been offline for five months, initially
disabled in I4e46c782a63279d9c18ff4ba2944c15b3027114b, so go ahead
and clean up lingering references. If it is restored later, this can
be reverted fairly easily.

Depends-On: https://review.opendev.org/783989
Depends-On: https://review.opendev.org/783990
Change-Id: I544895003344bc8202363993b52f978e1c07d061
2021-03-31 01:42:36 +00:00