The sudoers parser really, really, *really* doesn't like it when the
last line of data in your file lacks a trailing newline. Add one so
sudo will work again on these servers.
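As a rough sketch of the fix (not part of the deployed change), a check like the following could catch this before a file is installed. The function names and the sample rule in the usage note are hypothetical.

```python
def ends_with_newline(data: bytes) -> bool:
    """Return True if data is empty or its last byte is a newline."""
    return data == b"" or data.endswith(b"\n")


def ensure_trailing_newline(data: bytes) -> bytes:
    """Append a newline only when the last line lacks one."""
    return data if ends_with_newline(data) else data + b"\n"
```

For example, ensure_trailing_newline(b"user ALL=(ALL) NOPASSWD: /bin/true") returns the same bytes with a single "\n" appended, and content that already ends in a newline is returned unchanged.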
Change-Id: I40fbb535faf5b41cc56c56f09f248eea398df4e0
If you read the man page:
# This will cause sudo to read and parse any files in the /etc/sudoers.d
# directory that do not end in '~' or contain a '.' character.
I don't know why sudo doesn't like files with a ".", but remove it.
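The rule quoted above can be written as a small predicate. This is only a sketch of the documented behaviour, not sudo's own code; the function name and the file names in the usage note are hypothetical.

```python
def sudoers_d_includes(filename: str) -> bool:
    """Apply the documented rule: skip names ending in '~' or containing '.'."""
    return not filename.endswith("~") and "." not in filename
```

Under that rule a file named "vos_release" would be read, while "vos_release.conf" or "vos_release~" would be silently skipped.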
Fix the syntax in this file, which has too many spaces.
The theory that specifying a command means you can have nologin as
the shell is debunked (see below); change the shell to /bin/bash.
root@mirror-update01:~# ssh -i ~/.ssh/id_vos_release vos_release@afs01.dfw.openstack.org vos
This account is currently not available.
Also, don't use shortcuts for positional parameters, as suggested by
jaltmann in If70b27cb974eb8c1bafec2b7ef86d4f5cba3c4c5.
After hand applying these fixes, I can log in and run the script as
expected.
Change-Id: I058aadaa5ca5c7b8e94b275c4b8d26e1e0688ce8
I was trying to simplify things by having a restricted shell script
run by root. However, our base-setup called my bluff as we also need
to set up sshd to allow remote root logins from specific addresses.
It's looking easier to create a new user, and give it sudo permissions
to run the vos release script.
Change-Id: If70b27cb974eb8c1bafec2b7ef86d4f5cba3c4c5
I wasn't correctly sourcing the key; it has to come from hostvars as
it is in a different play on different hosts. This fixes it.
We also need to not have the base roles overwrite the authorized_keys
file each time. The key we provision can only run a limited script
that wraps "vos release".
Unfortunately our gitops falls down a bit here because we don't have
full testing for the AFS servers; put this on the todo list :) I have
run this manually for testing.
Change-Id: I0995434bde7e43082c01daa331c4b8b268d9b4bc
We constantly have problems with various timeouts on the release of
our mirror volumes creating locked volumes or stuck transactions; this
then requires significant manual intervention. This has been
discussed multiple times, but this short exchange from #openafs
probably sums it up best:
Sep 11 13:32:35 <auristor> The timeout problem is due to the fact
that UV_ReleaseVolume performs multiple RPCs. vos acquires a token
from the cache manager when it starts. it has no method of acquiring
a new token if it expired during an RPC. Therefore, if the token did
expire the remaining RPCs are performed unauthenticated. Without
appropriate permissions the cleanup of the volservers, writing the
updating VL entry will fail.
Sep 11 13:33:59 <auristor> A frequent solution is to deploy a remctld
service which has access to issue vos commands as -localauth and then
use remctld ACLs to restrict the identities of the processes that are
permitted to request the volume release.
Sep 11 14:37:28 <kaduk> Yeah, the -localauth tokens are pretty key
for long-running stuff, at the moment.
Indeed, remctl [1] was written to be the Kerberos-based remote
control AFS wrapper. However, it is complex to set up, uses a lot of
Perl, and is unlikely to be familiar to very many people (making the
footprint of people who can help us admin it low). Getting it wrong
seems to be a pretty good vector for remote exploits. It does not
seem to be a good fit.
However, we can take a simpler approach. We can use Ansible to set up
our afs server to allow a particular key to run a release script that
wraps the "vos release -localauth" for us. With this in place, we can
update the scripts that run on mirror-update to ssh remotely and call
this, rather than call "vos release" directly.
This change implements basic support for the remote script. A new key
will be generated on mirror-update.opendev.org and it will be allowed
to run the vos_release.sh script remotely. The script filters the
incoming command so that it only runs "vos release -localauth".
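To illustrate the filtering idea only (the real vos_release.sh is a shell script and may differ), here is a minimal Python sketch. It assumes the requested command arrives as a single string, for example via the SSH_ORIGINAL_COMMAND variable that sshd sets when a key is restricted to a forced command. The function name, the volume-name pattern, and the sample volume are hypothetical.

```python
import re
import shlex

# Assumption: volume names use letters, digits, '.', '_' and '-', and do not
# start with '-' (so a "volume" can never be read as an option).
VOLUME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9._-]*")


def build_release_command(original_command: str) -> list:
    """Map a requested command to a fixed 'vos release' argv, or raise ValueError."""
    try:
        args = shlex.split(original_command)
    except ValueError as exc:
        raise ValueError("unparseable command") from exc
    # Accept exactly: vos release <volume>
    if len(args) != 3 or args[:2] != ["vos", "release"]:
        raise ValueError("only 'vos release <volume>' is permitted")
    volume = args[2]
    if not VOLUME_RE.fullmatch(volume):
        raise ValueError("invalid volume name")
    # The caller never supplies flags; -localauth is added on the server side.
    return ["vos", "release", "-localauth", volume]
```

With this sketch, "vos release mirror.ubuntu" becomes the argv for "vos release -localauth mirror.ubuntu", and anything else (another vos subcommand, extra arguments, shell metacharacters) is rejected before a command is run.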
After we have tested this, we can start using it in scripts. I think
time will tell if we need locking or other features; this seems like
the KISS place to start.
[1] https://www.eyrie.org/~eagle/software/remctl/remctl.html
Change-Id: I6c96f89c6f113362e6085febca70d58176f678e7