26 Commits

Author SHA1 Message Date
Clark Boylan
a1cf5b3f6f Run daily backups of nodepool zk image data
This does local backups of the nodepool zk image data to
/var/log/nodepool on the nodepool-builders. These hosts don't get
offsite backups but we run multiple redundant servers. This data isn't
critical as we can start from scratch, but may be useful if we don't
want to go through all that trouble.

Change-Id: I7d150df9c0d9566ef2d32167cea535e29822cfa2
2021-09-16 14:12:08 -07:00
Ian Wienand
4e559edbf5 nodepool-builder: add volume for /var/lib/containers
podman, used by the new containerfile element, requires a
non-overlayfs volume at /var/lib/containers to be able to start and
extract the container images for us to build from.  Add a separate
volume for this.

Change-Id: I6629034ad0b300d392d3d989dbbf17a1343c06e1
2021-06-15 09:24:08 +10:00
Ian Wienand
28fed0bcd5 nodepool-builder: configure upload workers, reduce nb03
Add a variable to configure upload-workers for nodepool-builder
daemons.

Reduce our defaults for nb03 to see if we can get more reliable
uploads.

Change-Id: I819bdd262c7118cbde4e6ffdc12aa3ac64569a96
2021-04-15 09:10:37 +10:00
James E. Blair
b6cbb52447 Add pull tasks for nodepool/zuul
So we can stop/pull/start, move the pull tasks to their own files
and add a playbook that invokes them.

Change-Id: I4f351c1d28e5e4606e0a778e545a3a805525ac71
2021-02-19 15:42:40 -08:00
Zuul
30c05ebeb1 Merge "Remove old service cleanups from zuul"
2021-01-27 19:08:43 +00:00
James E. Blair
bfa60880a1 Remove old service cleanups from zuul
These cleanup tasks have all run and we no longer need to carry them.

Change-Id: I6130d1c2fbfe39ea339f1e18f3306b221b0e12e1
2021-01-22 15:51:06 -08:00
Clark Boylan
326e0dc4d7 Set stop_grace_period on nodepool-builder containers
By default docker-compose sends a sigterm to containers when they are
stopping. It then waits for 10 seconds before sending a sigkill. Our
nodepool-builder containers are restarted when new images are available
and if that happens during a dib build we leak contents into dib_tmp.
One theory is that we aren't giving dib enough time to clean up after
itself so increase the 10 second period before sending sigkill to 90
seconds.

I'm not sure if this will actually help, but it can't hurt much. If the
processes die quicker we don't go any slower and if they don't die
quicker then we're giving them more time to clean up.

Change-Id: Id12cac89cccfc14a8d262e8f8494046df777a80a
2021-01-21 15:12:15 -08:00
Clark Boylan
837628d2e9 Increase nodepool builder upload workers from 4 to 8
We are running with the default upload workers count of 4 which is half
of our previous ansible/puppet value of 8 (we have 8 vcpus on these
servers). Increase the worker count to 8 to improve upload rate.

Change-Id: I3c051968acc8c32711cd7063469d4a80077ba587
2020-08-04 12:36:01 -07:00
Monty Taylor
e0a00b4649 Add stop and start playbooks for nodepool
Organize these like our zuul rules.

Change-Id: Idf6148424c08efee9ad421b01d28d938c7058722
2020-06-16 15:48:47 -05:00
James E. Blair
09935ff328 Run Zuul as the zuuld user
This avoids the conflict with the zuul user (1000) on the test
nodes.  The executor will continue to use the default username
of 'zuul' as the ansible_user in the inventory.

This change also touches the zk and nodepool deployment to use
variables for the usernames and uids to make changes like this
easier.  No changes are intended there.

Change-Id: Ib8cef6b7889b23ddc65a07bcba29c21a36e3dcb5
2020-05-20 13:17:28 -07:00
Zuul
1ccabf5529 Merge "nodepool-builder: fix servername"
2020-05-08 03:13:58 +00:00
Clark Boylan
c0fd3e0894 Pull and prune docker images together
We noticed that our zuul scheduler was running out of disk and one of
the causes of this is we are pulling all of the wonderful new zuul
images and not pruning them. This happens because we were only pruning
when (re)starting services and we don't do that automatically with Zuul.
Address this by always pruning after pulling even if we don't restart
services. This should be safe because prune will leave the latest tagged
images as well as the running images.

This should keep our disk consumption down.

Change-Id: Ibdd22ac42d86781f1e87c3d11e05fd8f99677167
2020-05-07 12:51:09 -07:00
Ian Wienand
7a9fa2e530 nodepool-builder: fix servername
This should be set for each host's name; this looks like it was just
not templated correctly from the initial commit
I230f5291e0bd928af2e00966d76c3f385b749cb6.

Change-Id: If86ee21268c0fe6bb60c61750f551db89234ed0e
2020-05-07 13:09:04 +10:00
Monty Taylor
c836437925 Remove old init scripts and services for zuul/nodepool
We're running these in containers now. Please do not try to start
them the old way.

failed_when false is because we can't disable the old service
in the gate if there is no service file installed.

Change-Id: Ia4560f385fc98e23f987a67a1dfa60c3188816b6
2020-05-06 17:13:58 -05:00
Monty Taylor
f7ba1bd6c2 Pass -f to nodepool to run in foreground
We don't want to run it as a daemon now that we're passing
logging config to nodepool. Pass -f - since that's what
the Dockerfile does.

Change-Id: I87b4210ee7e5a622a34944c6345008082e75145b
2020-05-06 14:06:56 -05:00
Monty Taylor
8b1b70c77e Configure nodepool to use logging config
We have a logging config to log to /var/log/nodepool but we weren't
using it. Start using it.

Add logging config to nodepool-builder

We should log nodepool builder to /var/log/nodepool too.

Change-Id: I6e7196dc12e8c1bfc54274432b94cf53629bdf3d
2020-05-06 11:18:19 -05:00
Monty Taylor
e0619f17f1 Run nodepool launchers with ansible and containers
We don't run start in prod normally but we do need to run
it in the gate.

Change-Id: Iec50684280409eb978bf5638bf74ae16fad8aa26
2020-04-30 17:37:22 +00:00
Clark Boylan
8eb981b47f Install docker-compose from pypi
We want to use stop_grace_period to manage gerrit service stops. This
feature was added in docker-compose 1.10 but the distro provides 1.5.
Work around this by installing docker-compose from pypi.

This seems like a useful feature and we want to manage docker-compose
the same way globally so move docker-compose installation into the
install-docker role.

New docker-compose has slightly different output that we must check for
in the gitea start/stop machinery. We also need to check for different
container name formatting in our test cases. We should pause here and
consider if this has any upgrade implications for our existing services.

Change-Id: Ia8249a2b84a2ef167ee4ffd66d7a7e7cff8e21fb
2020-04-16 12:08:00 -07:00
Monty Taylor
06be60bc08 Drop version specifier for nodepool-builder compose
We don't actually need version 3. Mark it as version 2 to keep it
in line with everything else. In general we should only increase
past v2 if we need a specific feature.

Change-Id: Ie243da369ddec30e0eca4805434d572e12c40491
2020-03-17 13:11:25 -05:00
Zuul
87db9b6ac6 Merge "nodepool-builder: put container configs in /etc"
2020-03-17 17:50:12 +00:00
Zuul
11f7e874c1 Merge "Switch back to docker for gerrit and nodepool-builder"
2020-03-17 00:02:22 +00:00
Ian Wienand
b967495dc3 nodepool-builder: put container configs in /etc
Currently we deploy the openstacksdk config into ~nodepool/.config on
the container, and then map this directory back to /etc/openstack in
the docker-compose.  The config-file still hard-codes the
limestone.pem file to ~nodepool/.config.

Switch the nodepool-builder_opendev group to install to
/etc/openstack, and update the nodepool config file template to use
the configured directory for the .pem path.

Also update the testing paths.

Story: #2007407
Task: #39015
Change-Id: I9ca77927046e2b2e3cee9a642d0bc566e3871515
2020-03-17 07:37:00 +11:00
Monty Taylor
e5e925d715 Switch back to docker for gerrit and nodepool-builder
We rolled out review-dev with podman and it worked fine for us. It
worked less fine for nodepool-builder, although we still might be
able to solve it. Maybe right now isn't the time to do this switch.
Gitea, gitea-lb and zuul-registry all use docker instead of podman.

The only thing running with podman right now is review-dev. We can
do a manual cleanup of podman there before running this to keep
things simple:

  - stop gerrit service
  - uninstall podman and podman-compose
  - uninstall podman ppa config
  - uninstall pip3

Then let ansible install docker and docker compose up.

Story: #2007407
Task: #39062
Change-Id: I9bf99b18559d49d11ba99a96f02a4a45a4f65a86
2020-03-15 23:26:49 +00:00
Ian Wienand
e79f555bbd nodepool-builder: add /opt/dib_cache
This was missing but is part of the required runtime directories for
the container (for now, until we maybe move all this to volumes).

Change-Id: I9e173eb799026520588722caaf60a160abc6b130
2020-03-13 13:53:04 -07:00
Ian Wienand
b1bfee423b nodepool-builder: Add webserver
This adds the webserver that serves the logs and generated images.

Change-Id: I230f5291e0bd928af2e00966d76c3f385b749cb6
2020-03-11 09:16:31 +11:00
Ian Wienand
1979d6b160 nodepool-builder: deploy from container
This deploys the nodepool-builder container and verifies it has
started in testinfra.

Change-Id: I8a717d06f1291a4112b2753641ff88f074cf0b31
2020-03-11 09:16:24 +11:00