I'm not sure if something changed in dkms, but this log file is
helpful on CentOS 9-Stream and the other check doesn't match anything.
Also update the README.rst slightly to be more in line with reality.
Change-Id: Ic8cab980ef43490eb1b3ca0b7a0d0c2329bb94ce
Starting the openafs-client service is an intensive operation as it
walks the cache registering various things. We've seen on our
production ARM64 mirror this can take longer than the 1:30 default
timeout. This is a fatal issue, as the module will try to unload
while afsd is still spinning and working resulting in completely
corrupt kernel state.
This is about double the longest time we've seen, so it should give
plenty of headroom.
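If a systemd drop-in override is the mechanism used, it would look something like the following sketch (the path and the 300-second value are illustrative, not taken from this change):

```ini
# /etc/systemd/system/openafs-client.service.d/timeout.conf
# Hypothetical drop-in: raise the default 1:30 start timeout so a slow
# cache walk does not trigger a fatal unload while afsd is still working.
[Service]
TimeoutStartSec=300
```

A drop-in keeps the change separate from the packaged unit file, so package upgrades don't clobber it.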
Change-Id: I37186494b9afd72eab3a092279579f1a5fa5d22c
Now that https://bugs.debian.org/980115 has been fixed in
1.8.2-1+deb10u1 for Buster and appears in the 10.9 stable point
release (2021-03-27), we no longer need our special backport PPA of
the patched packages and are able to safely drop it from the role.
Change-Id: Id062fef9461e8f6ac66585ccf25f85a588782177
Since we have SRV DNS entries for our afsdb services, we don't need to
explicitly list their IP addresses here. From the man page:
For the client CellServDB, it may be desirable to make the client
aware of a cell (so that it's listed by default in /afs when the
-dynroot flag to afsd is in use, for instance) without specifying
the database server machines for that cell. This can be done by
including only the cell line (starting with ">") and omitting any
following database server machine lines. afsd must be configured
with the -afsdb option to use DNS SRV or AFSDB record lookups to
locate database server machines. If the cell has such records and
the client is configured to use them, this configuration won't
require updates to the client CellServDB file when the IP addresses
of the database server machines change.
Thus we just keep the openstack.org entry. We have not been keeping
the list in here up to date with the grand.central.org version (well,
not since 2014 anyway). Since we don't really need to track any of
these, just remove them.
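A trimmed client CellServDB carrying only the cell line would look roughly like this (the comment text is illustrative):

```
>openstack.org          # OpenStack Infrastructure AFS cell
```

With -afsdb enabled, afsd resolves the database servers for the cell via DNS SRV/AFSDB records instead of the omitted server lines.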
Change-Id: Id358e373c4c804ebe32b7447e5880015119926a5
We are in the process of upgrading the AFS servers to focal. As
explained by auristor (extracted from IRC below) we need 3 servers to
actually perform HA with the ubik protocol:
the ubik quorum is defined by the list of voting primary ip addresses
as specified in the ubik service's CellServDB file. The server with
the lowest ip address gets 1.5 votes and the others 1 vote. To win
election requires greater than 50% of the votes. In a two server
configuration there are a total of 2.5 votes to cast. 1.5 > 2.5/2 so
afsdb02.openstack.org always wins regardless of what
afsdb01.openstack.org says. And afsdb01.openstack.org can never win
because 1 < 2.5/2. By adding a third ubik server to the quorum, the
total votes cast are 3.5 and it always requires the vote of two
servers to elect a winner ... if afsdb03 is added with the highest
ip address, then either afsdb01 or afsdb02 can be elected
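The vote arithmetic above can be checked directly; this is just a sketch of the numbers, not project tooling:

```shell
# Ubik quorum arithmetic: the lowest-IP server casts 1.5 votes, every
# other server casts 1.0, and winning needs strictly more than half of
# the total votes cast.

# Two servers: total = 2.5. The 1.5-vote holder always wins (1.5 > 1.25)
# and the 1.0-vote holder never can (1.0 < 1.25) -- no real HA.
awk 'BEGIN { total = 1.5 + 1.0; print (1.5 > total/2), (1.0 > total/2) }'
# prints: 1 0

# Three servers: total = 3.5. No single server exceeds 1.75, so any
# winner needs the vote of a second server.
awk 'BEGIN { total = 1.5 + 1.0 + 1.0; print (1.5 > total/2) }'
# prints: 0
```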
Add a third server which is a focal host and related configuration.
Change-Id: I59e562dd56d6cbabd2560e4205b3bd36045d48c2
Over time we've had various different reasons for installing our own
OpenAFS packages in various situations and we haven't kept the when:
flags totally up to date. Currently, we need the 1.8.6-5 packages
with the January 2021 timestamp fix installed everywhere; fix this.
I think that long-running servers have the PPA installed from
prior iterations (I don't think we've ever *removed* it). So things
like executors are still running with our packages, perhaps somewhat
unintentionally.
Update the comments a little to reflect what's going on.
Change-Id: I6a58c23daf85cf8fa005e3dad84a665343a947bc
Stable Debian hasn't updated its openafs packages yet to fix the bit
masking problem. This breaks our testing for zuul jobs. Try the bionic
package from our ppa on debian instead.
Change-Id: I2ab469c984ae7d90d2a87abb2e4b29250c9bc8c2
Xenial ARM64 doesn't have openafs-client built; we have 1.8.5 built in
our PPA. Leave our production Xenial x86_64 systems with the inbuilt
1.6 client until we've thought about AFS server upgrades.
Change-Id: I7dad812a714133ffe54d4ecc1978f09abb39eb72
This tests the openafs client installation on all the arm64 types that
build wheels, where we currently need the client to copy the binary
wheel output.
Depends-On: https://review.opendev.org/733755
Change-Id: I278db0b6c8fad04ebf2f971bc7b0c007ee92ac31
The lookup() happens on the local host, not the remote host; ergo we
were never using the Debian.aarch64.yaml file in production anyway
(where bridge is x86 so includes only the x86 file).
So clearly it is not necessary, as we have production ARM64 mirrors
using the base file. This is OK because we build the packages in the
PPA for both x86 and arm64.
We can drop openafs_client_apt_repo which isn't used any more.
Follow-on will improve the testing of this.
Change-Id: I298cdfefc813006f7f4218dd37015992556c8498
We are seeing some failures that seem to add up to the yum module not
detecting a failure installing the kernel modules for openafs. See if
this works better with "dnf", which is the native package installer on
CentOS 8.
Change-Id: I82588ed5a02e5dff601b41b27b28a663611bfe89
Our control plane servers generally have large ephemeral storage
attached at /opt; for many uses this is enough space that we don't
need to add extra cinder volumes for a reasonable cache (as we usually
do on mirror nodes; but there we create large caches for both openafs
and httpd reverse proxy whose needs exceed even what we get from
ephemeral storage).
Add an option to set the cache location, and use /opt for our new
static01.opendev.org server.
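For illustration, the client cache location ends up in the cacheinfo file, whose format is mountpoint:cache-directory:cache-size-in-KB; the path and size below are assumptions, not the role's actual defaults:

```
# /etc/openafs/cacheinfo (illustrative values)
/afs:/opt/openafs/cache:50000000
```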
Change-Id: I16eed1734a0a7e855e27105931a131ce4dbd0793
The default Linux UDP buffer sizes are somewhat small when sending
much UDP traffic. OpenAFS uses UDP for all of its traffic, so we
increase the buffer size to 25MB.
Change-Id: Ie6cb7467c186d5471c71ca876ea9e29a90423bed
For whatever reason, the modules package recommends the client
package:
Package: openafs-modules-dkms
Depends: dkms (>= 2.1.0.0), perl:any, libc6-dev
Recommends: openafs-client (>= 1.8.0~pre5-1ubuntu1)
However, if that gets installed before the modules are ready, the
service tries to start and fails, but maybe fools systemd into
thinking it started correctly; so our sanity checks seem to fail on
new servers without a manual restart of the openafs client services.
By ignoring this Recommends, we should install the modules first and
then the client (which should then start OK), in that order only.
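With Ansible's apt module this can be expressed by disabling recommends for that one install; a hypothetical task sketch:

```yaml
# Hypothetical task: install the DKMS modules package without pulling in
# the Recommended openafs-client, so the client is installed separately
# once the modules are built.
- name: Install OpenAFS DKMS modules without recommends
  apt:
    name: openafs-modules-dkms
    install_recommends: no
```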
Change-Id: I6d69ac0bd2ade95fede33c5f82e7df218da9458b
We've noticed that openafs was not getting upgraded to the PPA version
on one of our opendev.org mirrors. Switch the package install state to
"latest" to make sure it upgrades (rebooting to actually apply the
change is an unresolved issue, but at least the package is there).
Also, while looking at this, reorder this to install the PPA first,
then ensure we have the kernel headers, then build the openafs kernel
modules, then install. Add a note about having to install/build the
modules first.
Change-Id: I058f5aa52359276a4013c44acfeb980efe4375a1
The role sets up a host as an OpenAFS client.
As noted in the README, OpenAFS is not available in every
distribution, or on every architecture. The goal is to provide
sensible defaults but allow for flexibility.
This is largely a port of the client parts of
openstack-infra/puppet-openafs.
This is a generic role because it will be used from Zuul jobs
(wheel-builds) and in the control plane (servers mounting AFS).
Tested-By: https://review.openstack.org/589335
Needed-By: https://review.openstack.org/590636
Change-Id: Iaaa18194baca4ebd37669ea00505416ebf6c884c