This reverts commit 05021f11a29a0213c5aecddf8e7b907b7834214a.
This switches Zuul and Nodepool to use Zookeeper TLS. The ZK
cluster is already listening on both ports.
Change-Id: I03d28fb75610fbf5221eeee28699e4bd6f1157ea
Fedora 33 is not released yet and the TripleO team would
like to perform some tests on that image.
Change-Id: I39f6bedadc12277739292cf31cc601bc3b6e30ec
Note this shouldn't be used until we can configure Gerrit to do the same
with jeepyb. Otherwise we'll end up with mismatched branches between our
canonical source (Gerrit) and our mirrors (Gitea).
Change-Id: I8d353cbc90c2d354e7cdebfc4e247f3f73d97d86
Specifying the family stops a deprecation warning from being output.
Add an HTML report and report it as an artifact as well; this is easier
to read.
Change-Id: I2bd6505c19cee2d51e9af27e9344cfe2e1110572
Builds running on the new container-based executors started failing to
connect to remote hosts with:
  Load key "/root/.ssh/id_rsa": invalid format
It turns out the new executor is writing keys in OpenSSH format,
rather than the older PEM format. And it seems that the OpenSSH
format is more picky about having a trailing space after the
  -----END OPENSSH PRIVATE KEY-----
bit of the id_rsa file. By default, the file lookup runs an rstrip on
the incoming file to remove the trailing space. Turn that off so we
generate a valid key.
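The fix amounts to something like this (an illustrative sketch, not
the exact playbook; zuul_ssh_key_src is a hypothetical variable):

  - name: Write the ssh key without stripping trailing whitespace
    copy:
      # rstrip=False keeps the trailing whitespace after the END
      # marker, which the OpenSSH format needs; the file lookup
      # strips it by default
      content: "{{ lookup('file', zuul_ssh_key_src, rstrip=False) }}"
      dest: /root/.ssh/id_rsa
      mode: "0600"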
Change-Id: I49bb255f359bd595e1b88eda890d04cb18205b6e
I476674036748d284b9f51e30cc2ffc9650a50541 did not open port 3081, so
the proxy isn't visible. Also, this group variable is a better place
to update the setting.
Change-Id: Iad0696221bb9a19852e4ce7cbe06b06ab360cf11
We have decided to go with the layer 7 reject rules; enable the
reverse proxy for production hosts.
Change-Id: I476674036748d284b9f51e30cc2ffc9650a50541
This brings in the settings added with
I87c85f82f6d38506977bc9bf26d34f6e66746b01 to the container deployment.
As noted there, this stops statsd writing null values for sparsely
updated timers and counters.
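For reference, the relevant statsd option looks like this (a sketch
of statsd's deleteIdleStats setting; the exact keys come from the
change referenced above):

  {
    // skip flushing metrics that saw no updates, rather than
    // emitting zeros/nulls for them every interval
    deleteIdleStats: true
  }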
Change-Id: I14b5ee40fc8efddfb7bad4fad8a8ae66746131d9
This deploys graphite from the upstream container.
We override the statsd configuration to have it listen on ipv6.
Similarly we override the nginx config to listen on ipv6, enable ssl,
forward port 80 to 443, and block the /admin page (we don't use it).
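Roughly the shape of the nginx override (certificate paths and the
upstream port are illustrative):

  server {
      listen 80;
      listen [::]:80;
      return 301 https://$host$request_uri;
  }
  server {
      listen 443 ssl;
      listen [::]:443 ssl;
      ssl_certificate /etc/nginx/ssl/graphite.crt;
      ssl_certificate_key /etc/nginx/ssl/graphite.key;
      location /admin {
          return 403;
      }
      location / {
          proxy_pass http://127.0.0.1:8080;
      }
  }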
For production we will just want to put some cinder storage in
/opt/graphite/storage on the production host and figure out how to
migrate the old stats. There is also a bit of cleanup that will follow,
because we half-converted grafana01.opendev.org -- so everything can't
be in the same group till that is gone.
Testing has been added to push some stats and ensure they are seen.
Change-Id: Ie843b3d90a72564ef90805f820c8abc61a71017d
This uses the Grafana container created with
Iddfafe852166fe95b3e433420e2e2a4a6380fc64 to run the
grafana.opendev.org service.
We retain the old model of an Apache reverse-proxy; it's well tested
and understood, it's much easier than trying to map all the SSL
termination/renewal/etc. into the Grafana container, and we don't have
to convince ourselves the container is safe to be directly web-facing.
Otherwise this is a fairly straightforward deployment of the
container. As before, it uses the graph configuration kept in
project-config which is loaded in with grafyaml, which is included in
the container.
One nice advantage is that it makes it quite easy to develop graphs
locally, using the container which can talk to the public graphite
instance. The documentation has been updated with a reference on how
to do this.
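For example, something along these lines runs the image locally
against production graphite (image name and tag illustrative; see the
documentation for the exact invocation):

  docker run --rm -it -p 3000:3000 opendevorg/grafana:latest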
Change-Id: I0cc76d29b6911aecfebc71e5fdfe7cf4fcd071a4
LXC3 is usable with CentOS 8, while lxc2 is no longer available for it.
So it's worth adding it to reduce network-related issues in CI.
Change-Id: I562a7d8000ecda8790da88f08128c35b1ec4a2c9
As described inline, this crawler is causing us problems as it hits
the backends indiscriminately. Block it via its known UA strings,
which are luckily old and so should not match real clients.
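The rules are along these lines (the UA pattern here is a placeholder
for the crawler's actual strings):

  SetEnvIfNoCase User-Agent "ExampleBot/0\.9" blocked-ua
  <Location "/">
      <RequireAll>
          Require all granted
          Require not env blocked-ua
      </RequireAll>
  </Location>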
Change-Id: I0d78a8b625b69f600e00e8b3ea64576e0fdb84d9
This adds an option to have an Apache based reverse proxy on port 3081
forwarding to 3000. The idea is that we can use some of the Apache
filtering rules to reject certain traffic if/when required.
It is off by default, but tested in the gate.
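The proxy itself is minimal, roughly (assuming mod_proxy and
mod_proxy_http are enabled):

  Listen 3081
  <VirtualHost *:3081>
      # filtering rules (e.g. UA-based rejects) can be added here
      ProxyPass "/" "http://127.0.0.1:3000/"
      ProxyPassReverse "/" "http://127.0.0.1:3000/"
  </VirtualHost>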
Change-Id: Ie34772878d9fb239a5f69f2d7b993cc1f2142930
We use the Ctx.Req object's RemoteAddr value as it should include the
IP:port combo according to https://golang.org/pkg/net/http/#Request. The
default template uses Ctx.RemoteAddr, which Macaron attempts to parse
for X-Forwarded-For values, but this has the problem of stripping out
any port info.
The port info is important for us because we are doing layer 4 load
balancing and not http l7 load balancing. That means the ip:port
mappings are necessary to map between haproxy and gitea logs.
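In app.ini terms the change is roughly this (a trimmed sketch of the
access log template, not the exact production value):

  [log]
  ; use the raw socket address (keeps ip:port) instead of the default
  ; {{.Ctx.RemoteAddr}}, which Macaron parses and strips the port from
  ACCESS_LOG_TEMPLATE = {{.Ctx.Req.RemoteAddr}} - {{.Identity}} "{{.Ctx.Req.Method}} {{.Ctx.Req.URL.RequestURI}}" {{.ResponseWriter.Status}}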
Change-Id: Icea0d3d815c9d8dd2afe2b1bae627510c1d76f99
Adding the tcplog option to an haproxy backend definition overrides
the default log format. Remove it so the supplied default (which we
based on the tcplog built-in default with some additions) will be
used instead.
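In other words, backends should look like this (names and addresses
illustrative), with no per-backend tcplog option:

  backend balance_git_https
      # no "option tcplog" here: with it present, haproxy falls back
      # to the built-in TCP format and ignores our log-format
      balance source
      server gitea01 198.51.100.11:3081 check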
Change-Id: Id302dede950c1c2ab8e74a662cc3cb1186a6593d
When forwarding TCP sockets at OSI layer 4 with haproxy, it helps to
know the ephemeral port from which it sources each connection to the
backend. In this way, backend connections can be mapped to actual
client IP addresses by correlating backend service access logs with
haproxy logs.
Add "[%bi]:%bp" between the frontend name and backend name values
for the default log-format documented here:
https://www.haproxy.com/blog/haproxy-log-customization/
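The resulting log-format line, based on the built-in tcplog format:

  log-format "%ci:%cp [%t] %ft [%bi]:%bp %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"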
Change-Id: Ic2623d483d98cd686a85d40bc4f2e8577fb9087f
This will write an NCSA-style access.log file to the logs volume.
This will let us see user agents, etc, to aid in troubleshooting.
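In gitea's app.ini this is roughly:

  [log]
  ; write an NCSA-style access log in addition to the normal logs
  ENABLE_ACCESS_LOG = true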
Change-Id: I64457f631861768928038676545067b80ef7a122
The increase in connection volume is not sustainable for the available
memory on the backend servers. We'll likely need to scale the cluster
before reattempting this.
This reverts commit 79f363164ed0c81e4c7603885f8e9815164b2df2.
Change-Id: Ibe64f472633a62df659c6183aa96e095dda7fdbc
We've set maxconn to 4k concurrent connections on the front side of our
haproxy load balancer. Currently that seems to be creating a large
backlog of requests. Looking at cacti, it appears that we have maybe up
to ~6-8 times this amount of headroom in resources on the gitea
backends. Be a little conservative and bump this value up by 4x and tune
from there.
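A sketch of the bump (the section placement and surrounding settings
follow the existing config):

  global
      # was 4000; the backends appear to have ~6-8x headroom, so go
      # 4x and tune from there
      maxconn 16000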
Change-Id: I56d43b52c23f251cc632315c3b57e45541722970
According to gitea swagger definitions, all of these GET requests for
lists of items are paginated with a max limit of 50 items per request.
Update our ansible machinery to properly page these items to avoid
problems in the future.
Note we should try to confirm that this is how it works for production
gitea.
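A sketch of the paging loop (Python with requests; the endpoint and
response shape follow gitea's v1 API, though the function itself is
illustrative rather than our actual ansible code):

  import requests

  def list_all_repos(base_url, token):
      """Collect every repo by walking gitea's paginated listing."""
      repos = []
      page = 1
      while True:
          resp = requests.get(
              f"{base_url}/api/v1/repos/search",
              params={"page": page, "limit": 50},
              headers={"Authorization": f"token {token}"},
          )
          resp.raise_for_status()
          batch = resp.json().get("data", [])
          if not batch:
              break
          repos.extend(batch)
          page += 1
      return repos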
Change-Id: I5df13288b497fb4fb716b4223b3dd61c698a7739
This was originally left here because it seemed like a good thing
to test, but it's currently causing things to hang. Exclude it
until we know why.
Change-Id: Ibc6f001e1235e9f0d856cc350ed8099e52c706e9
We list gitea repos to determine if we need to create a repo. If the
repo isn't listed by gitea we create it. New gitea paginates these
listings so we were only getting 30 repos listed when we had far more.
This resulted in us trying to create repos which already exist, which
produces a gitea HTTP 409 error.
Fix this by paging through the listings until we've seen all the
repos. This should give us a complete listing.
To test this we run our manage-projects playbook twice in the
system-config-run-gitea job. The first pass creates all the new
projects. Then the second pass should noop cleanly.
Change-Id: I73b77b9ddaa0106d4dc0a49c4d4b7751a39a16f9
Co-Authored-By: Jeremy Stanley <fungi@yuggoth.org>
We've noticed that our static sites will semi-regularly have
problems due to stale SSL certs served by Apache workers which
predate the latest certificate replacement and haven't terminated
(graceful restart only ends the running workers once they have no
remaining connections). Limit the impact of this by recycling
workers automatically after a reasonable (large) number of
connections.
This implementation is shamelessly stolen from that used in
Ic377f48d1a5a3eecbcb183327c9255134c4364ab for our mirror sites.
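The knob involved (the value here is illustrative; the real number
comes from the change referenced above):

  # recycle each worker after this many connections so no worker
  # lives forever with a stale certificate (the default, 0, never
  # recycles)
  MaxConnectionsPerChild 8192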
Change-Id: I2e5c0bdf012184ebbfccb086b967008bf12582ab
Co-Authored-By: Clark Boylan <clark.boylan@gmail.com>
We found that new data (since we removed -t) was not correctly being
skipped for re-download. We have found that this doesn't happen with
-t on later rsyncs, which have included fixes for -t to not touch the
timestamps if things are not updated. We have updated mirror-update
to Focal, which has this rsync, so restore the flag.
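That is, the mirror rsync invocations regain -t, along the lines of
(flags and paths illustrative):

  rsync -rlt --delete rsync://mirror.example.org/module/ /var/mirror/module/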
Change-Id: I3fa16dbf6487a442549c540796807ef4916d4e6e
We use ansible's to_nice_yaml output filter when writing ansible
datastructures to yaml. This has a default indent of 4, but we humans
usually write yaml with an indent of 2. Make the generated yaml more
similar to what we humans write and set the indent to 2.
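For example (the variable is illustrative; to_nice_yaml accepts an
indent argument):

  # to_nice_yaml defaults to indent=4; indent=2 matches hand-written yaml
  content: "{{ projects | to_nice_yaml(indent=2) }}"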
Change-Id: I3dc41b54e1b6480d7085261bc37c419009ef5ba7