We are now using the mariadb jdbc connector in production and no longer
need to include the mysql legacy connector in our images. We also don't
need support for h2 or mysql as testing and prod are all using the
mariadb connector and local database.
Note this is a separate change to ensure everything is happy with the
mariadb connector before we remove the fallback mysql connector from our
images.
Change-Id: I982d3c3c026a5351bff567ce7fbb32798718ec1b
Upstream stable-3.2 and stable-3.3 branches have been fixed to allow us
to use the mariadb jdbc connector. The previous change has updated our
images to ensure they include this fix. We can now update the config to
use the mariadb connector.
Change-Id: I43ac20d601ff88e42f0d20387fc6ad8842ab8244
After I replaced the docker packages, some services I thought would come
back did not.
Lodgeit seems to be an oversight; add restart always.
Also make sure the ZK containers start themselves.
I believe with Gerrit we've made the choice to not start automatically
due to the general high-touch nature of restarts. Keep the database
consistent and remove the auto restart there.
Change-Id: I98fa3055ac269564ed96570df0700b2aad24e4d2
The extant variable name is never set so this never writes anything
out. Move it to a dictionary value. Use stub values for testing,
this way we don't need the "when:".
Additionally remove an unused old template file.
Change-Id: Id96fde79e28f309aa13e16bdda29f004c3c69c4b
This moves review02 out of the review-staging group and into the main
review group. At this point, review01.openstack.org is inactive so we
can remove all references to openstack.org from the groups. We update
the system-config job to run against a focal production server, and
remove the unneeded rsync setup used to move data.
This additionally enables replication; this should be a no-op when
applied, as part of the transition process is to manually apply this
so that DNS setup can pull zone changes from opendev.org.
It also switches to the mysql connector, as noted inline we found some
issues with mariadb.
Note backups follow in a separate step to avoid doing too much at
once, hence dropping the backup group from the testing list.
Change-Id: I7ee3e3051ea8f3237fd5f6bf1dcc3e5996c16d10
The services: tag was accidentally put inside the mariadb section with
Iec981ef3c2e38889f91e9759e66295dbfb499c2e. This works in the gate
because it uses this path, but fails on current production. Move it
outside.
Change-Id: I8b6009da6271f451f123831a16801a9f0bd5374f
We've stopped relying on jeepyb's track-upstream feature, so stop
installing the entrypoint script and cease running its cronjob.
Depends-On: https://review.opendev.org/799123
Change-Id: I0d6edcc34f25e6bfe2bc41d328ac76618b59f62d
This adds a local mariadb container to the gerrit host to hold the
accountPatchReviewDb database. This is inspired by a few things:
- since migration to NoteDB, there is only one table left where
Gerrit records what files have been reviewed for a change. This
logically scales with the number of reviews users are doing.
Pulling the stats on this, we can see since the NoteDB upgrade this
went from a very busy database (~300 queries/70 commits per second)
to barely registering one hit per second:
https://imgur.com/a/QGJV7Fw
Thus separating the db to an external host for performance reasons
is not a large concern any more.
- empirically we've done a bad job in keeping the existing hosted db
up-to-date; it's still running mysql 5.1 and we have been hit by
bugs such as the one referenced in-line which silently drops
backups.
- The other gerrit option is to use an on-disk H2 database. This is
certainly an option, however you need special tools to interact
with it for migration, etc. and it's not safe to backup from files
on disk (as opposed to mysqldump). Upstream advice is unclear, and
varies between H2 being a performance bottleneck to this being
ephemeral data that users don't care about. We know how to admin
mariadb/mysql and this allows us to migrate and backup data, so
seems like the best choice.
- we have a pressing need to update the server to a new operating
system. Running the db alongside the gerrit instance minimises
fiddling we have to do managing connections to and migrating the
hosted db systems.
- related to that, we are tending towards more provider independence
for control-plane servers. A hosted database product is not always
provided, so this gives us more flexibility in moving things
around.
- the main concern here is memory usage. "docker stats" reports a
quiescent container, freshly started on an 8GB host:
gerrit-compose_mariadb_1 67.32MiB
After loading a copy of the production table, and then dumping it
back to a file the same container reports:
gerrit-compose_mariadb_1 462.6MiB
The existing remote mysql configuration path remains mostly the same.
We move the gerrit startup into a script rather than a CMD so we can
call it after a "wait for db" script in the mariadb_container case
(this is the recommended way to enforce ordering [1]).
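The ordering idea can be sketched as follows. This is a minimal
illustration in Python, not the script added by this change; the
function name, host, port and timeouts are all assumptions chosen for
the example.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening; close it again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up once the deadline has passed,
            # otherwise pause briefly and retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A startup wrapper would call something like wait_for_port("127.0.0.1",
3306) (address assumed) and only launch Gerrit when it returns True.
A listening port does not guarantee the database is ready for queries,
so a real check may need to go further than this.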
Backups of the local container need different dump commands; backups
are relocated to a new file and updated.
Testing is converted to use this rather than a local H2 database.
[1] https://docs.docker.com/compose/startup-order/
Change-Id: Iec981ef3c2e38889f91e9759e66295dbfb499c2e
With a pure javascript plugin, dropping a new file in the plugins/
directory and reloading the page is sufficient to see changes.
However, with .jar plugins (as zuul-summary-plugin now is) you need to
actually issue a reload, which requires the included permissions.
Enable it in dev mode, which is where you'll very likely be trying to
iterate development with a change to a plugin. I don't think it's
really that dangerous for production, but traditionally it's been off
there so let's leave it like that.
While we're here, write out a little script to help you quickly deploy
a new .jar of the plugin when we're testing.
Change-Id: I57fa18755f8a8168da12c48f1f38d272da1c6599
By setting the auth type to DEVELOPMENT_BECOME_ANY_ACCOUNT and passing
--dev to the init process, gerrit will create an initial admin user
for us. We leverage this user to create a sample project, change,
Zuul user and sample CI result comment.
We also update testinfra to take some screenshots of gerrit and report
them back.
Change-Id: I56cda99790d3c172e10b664e57abeca10efc5566
This is enabled on review-test if you want to test it out. It is
intended to speed up fetches and clones and such as you don't have to
list and interact with all of the gerrit change refs to do those common
operations.
Change-Id: I65b430548a2805cd05dc4cdbcf9354a9c18faadc
We are seeing java gc go crazy at times and aren't quite sure what is
causing it. Add jvm gc logging to the gerrit process to help us identify
what is happening.
Additionally we add SYS_PTRACE to the container capabilities so that you
can get heap dumps from the jvm. To get a heap dump you need to do
roughly:
docker exec -u root -it gerritcontainerid jhsdb jmap --heap --pid $pid
Change-Id: Ib4a5b84fda4eca73c7971c31ee74c3232eb733e4
We were setting these values in gerrit.config but it isn't clear if
these now need to go in jgit.config instead. I've tried to clarify with
upstream maintainers as the documentation is quite confusing. While we
wait for clarification why don't we just set the values in both files to
ensure we are covered.
This converts jgit.config to a jinja2 template so that we can use the
variable number of pack files setting.
Change-Id: I70c1e6b738ed6e9fdb72d86e7cf3fb8cfecf1323
Gerrit 3.2 supports java 11 now and Gerrit 3.3 will be the last to
support java 8. Let's get ahead of things and switch to java 11.
Change-Id: I1b2f6b1bdadad10917ef5c56ce77f7d7cfc8625d
The receive.autogc configuration apparently needs to be in its own
file, not in the general gerrit.config. Move it to the correct
location.
https://review.opendev.org/Documentation/config-gerrit.html#jgit-receive
While we're here, correct the filename on the gerrit.config and
secure.config templates to make it clear they're jinja2 files, and
add a file mode to the replication.config where it was missing.
Change-Id: I9243bccac103c51ee435725aae482731642a37cc
For unfathomable reasons, Gerrit implements automatic GC on every
push and enables this by default but recommends in the documentation
that it be turned off. Follow their recommendation on this, since it
seems to result in additional load and we already periodically GC
all repos anyway.
Change-Id: I9a46c69b26e0a746f2aed308a28e5408e5c34ef1
We're seeing high system load and decreased performance on our
production Gerrit instance. Some research suggests this may be I/O
contention which can be relieved through better caching:
https://groups.google.com/g/repo-discuss/c/7CemrH4lVJE
According to `gerrit show-caches --show-jvm --show-threads` some of
our memory-only caches are already at their default maximums after
only a few days of operation, and one in particular
(changeid_project) has a particularly poor cache hit ratio of 24% at
the moment. Increase changeid_project from the 1024 entry default by
32x (manual tests at 8x approached 50% cache hit), increase projects
by 4x (greater than the number of repos we host for now), and double
the others (groups_bysubgroup, permission_sort) since they still had
reasonable cache hit ratios while full. Also alpha-order the
existing cache overrides in our config for improved maintainability.
This will require a Gerrit service restart to take effect, once the
file update has been deployed.
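The arithmetic behind the new limits can be sketched like this. It is
only an illustration: the 1024 default is stated above for
changeid_project and is assumed here to apply to the other caches as
well, and the rendered layout is not copied from our actual config.

```python
# Default in-memory cache limit stated above for changeid_project;
# assumed (not verified here) to apply to the other caches too.
DEFAULT_MEMORY_LIMIT = 1024

# Multipliers taken from the description above.
MULTIPLIERS = {
    "changeid_project": 32,
    "groups_bysubgroup": 2,
    "permission_sort": 2,
    "projects": 4,
}


def cache_overrides(default=DEFAULT_MEMORY_LIMIT, multipliers=MULTIPLIERS):
    """Return {cache name: new memoryLimit}, ordered alphabetically by name."""
    return {name: default * factor for name, factor in sorted(multipliers.items())}


def render(overrides):
    """Render the overrides in gerrit.config style (layout is illustrative)."""
    lines = []
    for name, limit in overrides.items():
        lines.append(f'[cache "{name}"]')
        lines.append(f"\tmemoryLimit = {limit}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(cache_overrides()))
```

Under that assumption, changeid_project comes out at 32768 entries,
projects at 4096, and the two doubled caches at 2048 each.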
Change-Id: Ieecd1802ce53cc0d37c68476b94b44cbe36fbd6e
We're seeing a growing backlog of email events that all must funnel
through the single thread allocated to sending email. We think this may
be related to server slowness that we are observing. Bump the number of
threads to 4 to see if that flushes the queues quicker and gives us a
more responsive server.
Change-Id: I730c8f657191cedb46f81f4abc0e1796ef847b27
This should only land once we are on Gerrit 3.x and happy with it. But
at this point the mysql reviewdb will not be used anymore and config for
it can be removed. We keep general mysql things like tools and backups
in place as the accountPatchReviewDb continues to live in MySQL.
This also comments out calls to jeepyb's welcome-message,
update-blueprint and update-bug entrypoints from the patchset-created
event hook, since they rely on database connections for the moment.
Calls to update-bug in change-abandoned and change-merged event
hooks are retained as those code paths don't rely on database
interaction nor attempt to load the removed configuration.
Change-Id: I6e24dbb223fd3f76954db3dd74a03887cf2e2a8b
More recent Gerrit versions have replaced the old drafts feature
with a combination of private changes and work in progress state.
The latter might be useful eventually and could be used to augment
or replace our workflow -1 votes, but the not-so-private nature of
drafts is what caused us to disable them and we should do the same
with private changes as they'll become an attractive nuisance
otherwise.
Change-Id: I213a73b0ba6a3dd2a8ce402d6a396e6c494529c8
When we get to Gerrit 3.x the old html theming, hideci.js and
commentlinks that parse html and rely on urls no longer work. Let's clean
that up when we get there.
We can add back in similar things that polygerrit does support later
but we don't want that to make the upgrade even more difficult to do.
This should only be merged once we are running Gerrit 3.x.
Change-Id: I838840e6cbf09ca28faeb2cf06290e298a4a1f74
These changes are squashed together to simplify applying them to config
management without zuul and ansible running one of these without the
others. We essentially need them all in place at the same time to
accurately reflect the post upgrade state.
We stop blocking /p/ in gerrit's apache vhost. /p/ is used for
dashboards.
We add a few java options that new gerrit sets by default.
We update the gerrit image in docker compose to 3.2.
We update zuul to use basic auth instead of digest auth when talking to
Gerrit.
Change-Id: I6ea38313544ce1ecbc4cfd914b1f33e77d0d2d03
Include comments in the Gerrit vhost config template to make setting
a temporary site-wide maintenance message easier.
Change-Id: I81f69185e081b2a6506d5355bb07a90cb3e03fea
This serverId value is used by notedb to identify the gerrit cluster
that notedb contents belong to. By default a random uuid is generated by
gerrit for this value. In order to avoid config management and gerrit
fighting over this value after we upgrade we set a value now.
This should be safe to land on 2.13 as old gerrit should ignore the
value.
Change-Id: I57c9b436a9d0d1dfe77eee907d50fc1dcda6ab12
We stopped serving this content and the next step is to stop managing it
internally. This depends on a change to jeepyb that makes the local git
dir management on the jeepyb side optional. Once that lands we can
update our configs to tell jeepyb to stop managing it.
We also stop doing garbage collection, mounting it into containers that
don't need it, etc.
Depends-On: https://review.opendev.org/758597
Change-Id: I2185e90edfcac71941bc29a4e11b7b2d4c7c2e13
change.move is a new option in gerrit 3.0 that toggles whether or not
the change move api is enabled. We disable it because there are
potentially confusing side effects of moving a change with parent changes
then merging the moved change. Details can be found in
https://bugs.chromium.org/p/gerrit/issues/detail?id=9877
We've not needed to move changes previously and users can abandon and
push to a different branch instead.
With enableSignedPush we set that to false even though it is an existing
default because newer gerrit seems to write it out to its config file.
We write it out to avoid unnecessary file updates after the upgrade.
Note I believe it is safe to land this on 2.13 or 2.16 as gerrit should
just ignore change.move until we get to 3.x and enableSignedPush already
defaults to false.
Change-Id: I9db2026b1e5cafefd448f33f74d6b7b60efafdb4
We remove old git web server env vars from the apache config and add
comments to our /p/ handling to describe the need for further cleanup
when Gerrit is upgraded.
Change-Id: I79fc130dec0a8b00706c0ec0f8fcab4d867e34d1
Gerrit is repurposing the /p/ path for project dashboard under
polygerrit. We use this path for Git mirrors. To resolve this let's
disable the /p/ path now, so that when it is used for project dashboards
users won't be as confused.
This has the added benefit of reducing the number of mirrors we need to
manage which makes managing branches in the mirrors simpler.
Change-Id: I9ebca2049a4a0707ecfbaecd92e42ebc1e6c3f87
The ssl flag is deprecated and we get cronspam [0] warning us about
this. The docs [1] say we should use ssl-mode instead.
[0] WARNING: --ssl is deprecated and will be removed in a future version. Use --ssl-mode instead.
[1] https://dev.mysql.com/doc/refman/5.7/en/connection-options.html#option_general_ssl
Change-Id: I060bbfeaf1171dac50dcfcd2c62fcaa8956fb4e2
By default gerrit replication pushes +refs/*:refs/*, which includes
refs/changes. For large repositories that potentially means hundreds
of thousands of references.
Per-repo git mirroring does not push refs/changes, so when it runs it
ends up deleting those references, which can take a long time, blocking
the executor.
To fix that, we should:
- stop pushing refs/changes to GitHub (this change)
- delete refs/changes on GitHub repositories, asynchronously
- enable per-repo replication
- disable Gerrit-wide replication
NB: it is unclear if Gerrit replication would start deleting the
extraneous references on remote GitHub repositories once this
merges. If this is the case, since replication is limited to a
single thread (default value for 'threads') and is not happening
in an executor, this should not have negative impact, beyond
potentially delaying GitHub mirroring.
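To illustrate the difference in what gets pushed, here is a small
Python sketch that filters a list of refs by source-side patterns. The
restricted patterns (refs/heads/* and refs/tags/*) and the sample refs
are assumptions for the example, not the exact refspecs set by this
change; the leading "+" and the ":destination" half of a refspec are
left out for brevity.

```python
from fnmatch import fnmatchcase


def replicated_refs(refs, patterns):
    """Return the refs matching at least one source-side pattern.

    fnmatchcase lets "*" match "/" as well, which mirrors how a
    trailing "*" in a refspec matches nested ref names.
    """
    return [ref for ref in refs if any(fnmatchcase(ref, p) for p in patterns)]


# Hypothetical sample refs for one repository.
SAMPLE_REFS = [
    "refs/heads/master",
    "refs/tags/1.0.0",
    "refs/changes/45/12345/1",
]

DEFAULT_PATTERNS = ["refs/*"]                         # source side of +refs/*:refs/*
RESTRICTED_PATTERNS = ["refs/heads/*", "refs/tags/*"]  # assumed replacement

if __name__ == "__main__":
    print(replicated_refs(SAMPLE_REFS, DEFAULT_PATTERNS))
    print(replicated_refs(SAMPLE_REFS, RESTRICTED_PATTERNS))
```

With the default pattern every ref is selected, including the one under
refs/changes; with the restricted patterns only the branch and the tag
remain.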
Change-Id: I94f69c889c9b4418ef81b3b2ca436ba99696ba72
Also add a 5 minute stop grace period. This lines up with the init
script we were using when this was a systemd managed service.
Change-Id: I5a92bb214b96447008ad570e176adda13c4ca0cb
We use project-config for gerrit, gitea and nodepool config. That's
cool, because we can clone that from zuul too and make sure that each
prod run we're doing runs with the contents of the patch in question.
Introduce a flag file that can be touched in /home/zuulcd that will
block zuul from running prod playbooks. By default, if the file is
there, zuul will wait for an hour before giving up.
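The waiting behaviour can be sketched as below. This is a hedged
Python illustration rather than the playbook logic itself; the function
name, the polling interval and the flag file name in the usage note are
assumptions, and only the one hour default comes from the description
above.

```python
import os
import time


def wait_for_flag_cleared(flag_path, timeout=3600.0, interval=10.0):
    """Return True once flag_path is absent, False if it persists past timeout."""
    deadline = time.monotonic() + timeout
    while os.path.exists(flag_path):
        # The flag is still present: give up after the deadline,
        # otherwise sleep and check again.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A production run would proceed only when a call such as
wait_for_flag_cleared("/home/zuulcd/DISABLE-ANSIBLE") returns True;
the file name here is purely hypothetical.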
Rename zuulcd to zuul
To better align prod and test, name the zuul user zuul.
Change-Id: I83c38c9c430218059579f3763e02d6b9f40c7b89
We use this in some utility scripts, but we currently don't
write it out anywhere. It was an old puppet artifact.
Change-Id: Ib6fdfc4f4a9c5d1befdb6d256989450996dd2a3d
We run some utility scripts which ssh to ourselves, but we aren't
setting host keys for them. We should fix that.
Change-Id: I2aa5d5e65b15c5c151767377dbc5ead1e442b3ce
Files are bind-mounted into the container in different locations.
Set envvars pointing to the right places.
Also - we need to bind-mount the projects.yaml and projects.ini
files into the container.
While we're at it, move patchset-created to be a regular file.
Change-Id: Iacd3e921464b24479db13bbf7ae998b8d8e2103d
Turns out our config has a bunch of hardcoded /home/gerrit2/acls
entries in it. That doesn't work if we're just pointing the
config file at /opt/project-config/gerrit/acls.
Change-Id: I387e446501e17a3bdd807807d5ef6b69b53abde5
We use this to make the .gitreview file too, so our thought that
we could just use localhost was a little misguided.
Change-Id: I501b10b2003c7e04ca1ac345d14fa33916b3e60b