7 Commits

Author SHA1 Message Date
Clark Boylan
3f2dd0e681 Enable srvr, stat and dump commands in the zk cluster
Zookeeper supports a number of "4 letter" commands [0] which are useful
for debugging and general diagnostics. By default only srvr is enabled,
but we want to add stat and dump to see details on server and client
connection statuses.

We do this via the 4lw.commands.whitelist configuration option [1]
rather than the docker image env vars because we already mount our own
zoo.cfg into the container.

[0] https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_4lw
[1] https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_clusterOptions
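
As a minimal sketch of how these commands can be exercised once they are whitelisted, the Python below sends a four-letter command over a plain TCP connection and reads the reply until the server closes the socket. The function name, the timeout default, and the `WHITELIST_LINE` constant (a zoo.cfg line assumed from the commands named in this change, not copied from the deployed file) are illustrative only.

```python
import socket

# Assumed zoo.cfg line for this change; the real file may format it differently.
WHITELIST_LINE = "4lw.commands.whitelist=srvr, stat, dump"


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send one four-letter command to a ZooKeeper server and return its reply."""
    if len(command) != 4:
        raise ValueError("expected a four-letter command such as 'srvr'")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        # The server writes its reply and then closes the connection,
        # so read until recv() returns an empty bytes object.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, `print(four_letter_word("localhost", 2181, "stat"))` would query a server on the plaintext client port, assuming one is listening there. A command that is not whitelisted should produce an error reply from the server rather than the requested data.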

Change-Id: I24ea9b37cd5766c9d393106e8eab34623cad1624
2021-03-15 16:57:21 -07:00
James E. Blair
a514aa0f98 Zookeeper: listen on plain and TLS ports
To prepare for switching to TLS, set up TLS certs for Zookeeper and
all of Nodepool and Zuul, but do not have them connect over TLS yet.
We have observed problems with Kazoo using TLS in production.  This
will let us run the ZK quorum using TLS internally, and have Zuul
and Nodepool connect over plaintext while also exposing the TLS
client port so that we can perform some more production tests.
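
A rough sketch of what "plain and TLS ports" can look like in zoo.cfg terms, written as a small Python renderer. The option names (`clientPort`, `secureClientPort`, `serverCnxnFactory`, `sslQuorum`) are standard ZooKeeper settings, but the port numbers, the function name, and the choice of which options to emit are assumptions; the keystore and truststore options that TLS also needs are deliberately left out because their paths are not given here.

```python
def render_dual_port_settings(plain_port: int = 2181, tls_port: int = 2281) -> str:
    """Render zoo.cfg lines for a server that listens on plaintext and TLS ports.

    Port defaults are assumptions for illustration, not the deployed values.
    """
    settings = {
        # Plaintext client port, still used by Zuul and Nodepool for now.
        "clientPort": plain_port,
        # Additional TLS client port exposed for testing.
        "secureClientPort": tls_port,
        # TLS client connections require the Netty connection factory.
        "serverCnxnFactory": "org.apache.zookeeper.server.NettyServerCnxnFactory",
        # Encrypt traffic between quorum members.
        "sslQuorum": "true",
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"
```

Calling `print(render_dual_port_settings())` emits the four lines in the order shown, which could then be merged into a templated zoo.cfg.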

Change-Id: If93b27f5b55be42be1cf6ee23258127fab5ce9ea
2020-06-17 10:38:59 -07:00
James E. Blair
05021f11a2 Revert "Add Zookeeper TLS support"
This reverts commit 29825ac18b58145f007f64b2998357445b8fdd91.

We observed this issue in production:
https://github.com/python-zk/kazoo/issues/587

Revert until we find a fix.

Change-Id: Ib7b8e3b06770a83b39458d09d2b1e655bd94bd22
2020-06-16 11:15:48 -07:00
James E. Blair
29825ac18b Add Zookeeper TLS support
This creates TLS certs for Zookeeper, uses them inside the ZK
quorum, and configures Nodepool and Zuul to use them as well.

A full system restart of all ZK-related components will be required
after merging this patch.

Change-Id: I0cb96a989f3d2c7e0563ce8899f2a5945ea225b3
2020-06-15 11:19:47 -07:00
James E. Blair
09935ff328 Run Zuul as the zuuld user
This avoids the conflict with the zuul user (uid 1000) on the test
nodes.  The executor will continue to use the default username
of 'zuul' as the ansible_user in the inventory.

This change also touches the zk and nodepool deployment to use
variables for the usernames and uids to make changes like this
easier.  No changes are intended there.

Change-Id: Ib8cef6b7889b23ddc65a07bcba29c21a36e3dcb5
2020-05-20 13:17:28 -07:00
Clark Boylan
5141306c71 Clean up unneeded things post docker-compose upgrade
The zookeeper role can use the default pip-installed docker-compose now.
We can also stop ensuring the distro package is removed as this has run
on all hosts at this point.

Change-Id: Ia034ae7d2c8e38494050698e1bfac0cc273dd200
2020-04-20 09:47:12 -07:00
James E. Blair
42574b2b37 Run ZK from containers
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
  * stop zk
  * split data and log files and move them to new locations
  * remove zk packages
  * start zk containers
* remove from emergency; land this change.
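
The plan above does not say how each node was checked between restarts. One possible check, sketched here as an assumption rather than the procedure actually used, is to read the `Mode:` line from each server's `srvr` reply and only move on when every node reports a mode and exactly one of them is the leader. Both function names are hypothetical.

```python
from typing import Iterable, Optional


def parse_mode(srvr_reply: str) -> Optional[str]:
    """Return the value of the 'Mode:' line in a srvr reply, or None if absent."""
    for line in srvr_reply.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Mode":
            return value.strip()
    return None


def ensemble_looks_healthy(replies: Iterable[str]) -> bool:
    """True when every reply reports a mode and exactly one node is the leader."""
    modes = [parse_mode(reply) for reply in replies]
    return None not in modes and modes.count("leader") == 1
```

In a rolling restart, `ensemble_looks_healthy` would be called with the `srvr` replies collected from all nodes after each container start, before stopping the next node.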

Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
2020-04-17 08:43:09 -07:00