We previously set the limit to 70200M on a ~98GB filesystem.
Unfortunately we are able to jump from the ~70GB limit to a full
filesystem before htcacheclean happens to run again. Reduce the limit to
60000M to give us more headroom and hopefully avoid filling the fs
between cache clean runs.
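As a rough check of the headroom this buys, the arithmetic can be sketched
as follows. This assumes the ~98GB filesystem is about 98000M and ignores
the difference between decimal and binary units; both are simplifications,
not measured values.

```python
# Rough headroom arithmetic; FS_SIZE_M is an assumed approximation of "~98GB".
FS_SIZE_M = 98000
OLD_LIMIT_M = 70200
NEW_LIMIT_M = 60000


def headroom(limit_m, fs_size_m=FS_SIZE_M):
    """Space (in M) the cache can grow past its limit before the fs is full."""
    return fs_size_m - limit_m


old_headroom = headroom(OLD_LIMIT_M)
new_headroom = headroom(NEW_LIMIT_M)
print(f"old headroom: {old_headroom}M, new headroom: {new_headroom}M")
```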
Change-Id: I8aa45eb0c396b54dbb3ec84e5ba8fd4ec7da9e27
Rather than restarting the whole scheduler group, just restart
zuul02, which is our only production scheduler. That will allow us
to boot zuul01 as a secondary scheduler and manually add/remove it
for testing.
Once we can reliably run two schedulers, we can revert this change.
Change-Id: I5518ea1d3a6a1d48460b0436d4d1eaf9d52b7ddb
Now that the SKS keyserver network is no more, and there's no
convenient way to share third-party key signatures, we need to
adjust our key management and rollover process accordingly.
Change-Id: I7008706aae06b6e4a16db2dd85a8c7f91530cd50
Mostly just formatting and punctuation, plus some outdated bits.
Signed-off-by: Dr. Jens Harbott <harbott@osism.tech>
Change-Id: I641beb5d65f87173d50c74a4e1f0dba48d006231
This is a follow-on to feedback on earlier docs updates. We should
always log these restarts, so make it clearer that doing so isn't
optional.
Change-Id: Ib0fa05b2075d6c82199e6e043724aeedaf04e49c
Zuul has changed how it stores secret keys and they are in zookeeper
now. This means our old docs on decrypting things are no longer correct.
Update them with a new set of instructions that matches the modern
setup.
Change-Id: I7484a8c02e005fadc41e22a4158b3dcb8434ec5d
It was recently pointed out that our restart process for zuul is a bit
stale. Document the current process that deals with ansible playbooks
and docker containers.
Change-Id: I52812e87ed73e6ed538f94a86c1b62ce3de57c37
Last week when we were attempting to only update the subset of projects
that were renamed in gitea we accidentally updated all projects. The
good news is this didn't take significant amounts of time (just a few
minutes).
We should be able to enforce the metadata for all projects given the
cost is now much lower than it was in the past. This will keep things up
to date after renames but also generally if projects update descriptions
or bug tracking locations.
Change-Id: Ief2bb1eb2b11a13fafbe52650317d54d6a0fc824
This reverts commit a39a939e0352741d0b2c43e96e660f52eac22245.
Turns out that ansible module args don't get typed the way we expect
them. This means having a Boolean or List type argument just ends up in
confusion and always_update being truthy every which way. Revert until
we can fix this properly.
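A minimal sketch of the failure mode, assuming the value reaches the module
as a string rather than as a Boolean or list. The literal values below are
illustrative, not captured from a real run.

```python
# Illustrative only: if a module argument arrives as a string, every
# non-empty value is truthy, including the text "False" and "[]".
candidates = ["True", "False", "[]", "['example/project-a']"]

results = {value: bool(value) for value in candidates}
for value, truthy in results.items():
    print(f"{value!r} -> {truthy}")
```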
Change-Id: I596fe6883098ba636b1cad5196d1fdd76ff19076
Setting the gitea_always_update var for the gitea-git-repos role to
a list will filter metadata updates to only the project names
included in the supplied list. False and True still have their prior
meanings of do no metadata updates or force metadata updates for
every project we host.
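The intended semantics can be sketched as below. `should_update` and the
project names are hypothetical, written for illustration only; this is not
the code in the gitea-git-repos role.

```python
def should_update(project, always_update):
    """Decide whether to push metadata for one project (hypothetical helper).

    always_update may be False (no metadata updates), True (update every
    project) or a list of project names (update only those projects).
    """
    # Check for bool first so True/False keep their prior meanings.
    if isinstance(always_update, bool):
        return always_update
    return project in always_update


renamed = ["example/project-a"]
print(should_update("example/project-a", renamed))
print(should_update("example/project-b", renamed))
```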
Add testing for this, and also actually test that the rename
playbook renamed something.
Get rid of the git clone in the playbook since it's no longer
relevant to how we run things anyway; we'll instead want to rely on
the Zuul supplied projects.yaml path.
Change-Id: Id8238b232caffc242c6bda9fe39eb7e65fe5e059
Previous change If91f79a4648920999de8e6bf6e0c9fec82fde233 replaced
one of the instances of yaml.load() in this file with safe_load() in
order to silence what were then warnings. Now they're errors with
current PyYAML, so go ahead and update the other one.
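For reference, a minimal sketch of the replacement, assuming PyYAML is
installed. The YAML document below is a made-up example, not content from
this repository.

```python
import yaml

# Made-up input for illustration.
document = """
- project: example/project
  description: An example entry
"""

# safe_load() only constructs plain Python types (dict, list, str, ...),
# and it does not take a Loader argument.
projects = yaml.safe_load(document)
print(projects[0]["project"])
```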
Change-Id: If9f839f60cd71be8be141423ef2b93884d8aeba7
This removes the old config to choose the old change screen by default
as everything is polygerrit now.
We remove the pre-plugin melody config as melody now ships as a plugin
and has separate configuration.
We remove old theming information as that is supplied via external files
now.
We remove anonymous git download config because we don't set
gerrit.canonicalGitUrl which is required for this to work. We don't set
that because we don't have a git:// server anymore.
Bump the lucene thread count from 4 to 8 as we have more cores on the
system we run on.
Finally add some comments to help make sense of config that is left in
place.
Change-Id: Ie0b48e544191839067e66647d2ea32f74ce19ed3
Having two groups here was confusing. We seem to use the review group
for most ansible stuff so we prefer that one. We move contents of the
gerrit group_vars into the review group_vars and then clean up the use
of the old group vars file.
Change-Id: I7fa7467f703f5cec075e8e60472868c60ac031f7
Previously we had set up the test gerrit instance to use the same
hostname as production: review02.opendev.org. This causes some confusion
as we have to override settings specifically for testing like a reduced
heap size, but then also copy settings from the prod host vars as we
override the host vars entirely. Using a new hostname allows us to use a
different set of host vars with unique values reducing confusion.
Change-Id: I4b95bbe1bde29228164a66f2d3b648062423e294
Previously we had a test specific group vars file for the review Ansible
group. This provided junk secrets to our test installations of Gerrit,
and we then relied on the review02.opendev.org production host vars file
to set values that are public.
Unfortunately, this meant we were using the production heapLimit value
which is far too large for our test instances, leading to the occasional
failure:
    There is insufficient memory for the Java Runtime Environment to continue.
    Native memory allocation (mmap) failed to map 9596567552 bytes for committing reserved memory.
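For scale, the failed mapping in the message above works out to just under
9 GiB. The conversion is plain arithmetic:

```python
# Size of the failed mmap, taken from the JVM error message.
failed_bytes = 9596567552

gib = failed_bytes / 2**30
print(f"{gib} GiB")
```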
We cannot set the heapLimit in the group var file because the hostvar
file overrides those values. To fix this we need to replace the test
specific group var contents with a test specific host var file instead.
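The precedence problem can be modelled with a plain dictionary merge. This
is a simplified sketch, not how Ansible resolves variables internally, and
the variable name and values shown are placeholders.

```python
# Simplified model: host vars are applied after group vars, so a key set in
# both places ends up with the host value. Name and values are placeholders.
group_vars = {"gerrit_heap_limit": "test-value"}
host_vars = {"gerrit_heap_limit": "production-value"}

effective = {**group_vars, **host_vars}
print(effective["gerrit_heap_limit"])
```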
To avoid repeating ourselves we also create a new review.yaml group_vars
file to capture common settings between testing and prod. Note we should
look at combining this new file with the gerrit.yaml group_vars.
On the testing side of things we set the heapLimit to 6GB, we change the
serverid value to prevent any unexpected notedb confusion, and we remove
replication config.
Change-Id: Id8ec5cae967cc38acf79ecf18d3a0faac3a9c4b3
This is just a documentation update, but it reflects the change upstream
Gerrit made in version 3.3 renaming this group.
Change-Id: I5458afd2683c2a7c4616f4894884e3d3ce03bbaf
We added 3.4 jobs but they aren't running because we haven't tagged 3.4
images on dockerhub successfully.
Change-Id: I1fce44fe562a994c5513ceeb96270a4d5f7c40c3