Newer apt supports phased updates, where only a subset of servers pulls
newly available packages at any given time. There are potentially good
reasons for this, like A/B testing and spreading out the load of package
updates. The problem is that it can create confusing situations when
package versions differ across servers that we expect to be consistent.
Avoid this confusion by always installing the latest available packages.
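A sketch of how we might enforce this with an apt.conf.d snippet
deployed by Ansible (the task name and file path are illustrative):

    - name: Always include phased updates
      ansible.builtin.copy:
        dest: /etc/apt/apt.conf.d/95phased-updates
        owner: root
        group: root
        mode: '0644'
        content: |
          APT::Get::Always-Include-Phased-Updates "true";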
Change-Id: I995070823bc2456547ba9d2023d3de7e5d9b6810
A long long time ago we globally removed snapd. Then we started using
it to install kubectl, so snapd was added back. Later we stopped using
the kubectl snap, relying on openshift tooling instead, and when we did
that we removed the removal of snapd. This means that new servers come
with snapd and never get it removed.
In theory this is a safe change because nothing we deploy should be
relying on snaps. However, I'm not entirely sure how to audit that to
100% certainty. We should probably double check this a bit more before
deploying it.
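A minimal sketch of the removal task this change restores (exact task
layout illustrative):

    - name: Remove snapd from all servers
      ansible.builtin.apt:
        name: snapd
        state: absent
        purge: true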
Change-Id: I0441bda5beb018eb0a85dcdefa7f54c0c2d7ade4
The two old entries in here are for puppetmaster.openstack.org (no
longer exists) and bridge.openstack.org (replaced by
bridge01.opendev.org).
Remove the old entries.
Change-Id: I2199166e7d302630792eea6255d274dc2fd1040d
Add the ipv4/ipv6 addresses of the new bridge host as an allowed login
source. We will clean up the old entries later once the migration is
finished.
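For illustration only, with documentation-range placeholder addresses
standing in for the real bridge01 IPs, the rule amounts to something
like:

    - name: Allow SSH from the new bridge host (address is a placeholder)
      ansible.builtin.iptables:
        chain: INPUT
        protocol: tcp
        destination_port: '22'
        source: 203.0.113.10
        jump: ACCEPT
      # an equivalent rule with ip_version: ipv6 covers the v6 address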
Change-Id: I80e671922210bf251ee4fbc6558029d857e47198
This updates our user management system to use the userdel --force flag
when disabling and removing distro cloud image users like 'ubuntu',
'centos' and 'admin'. The reason for this is that when we switch
launch-node bootstrapping from the distro user over to root, the distro
user may still have running processes that prevent userdel from
succeeding. This should address that problem and delete the user anyway.
The last step in the launch node process is to reboot, which should
clear out any stale processes.
We don't do this for normal users as they aren't removed at node launch
time and this may be too forceful for them. It would be better for us to
error in that case and clean up any stale processes.
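In Ansible terms this maps onto the user module's force parameter,
roughly (a sketch; the user list mirrors the examples above):

    - name: Remove distro cloud image users
      ansible.builtin.user:
        name: '{{ item }}'
        state: absent
        remove: true   # userdel --remove
        force: true    # userdel --force
      loop:
        - ubuntu
        - centos
        - admin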
Change-Id: I79caf2a996566ecaec4cb4a70941bb3f03a5fb73
This is the first step in running our servers on jammy. It will let us
boot new servers on jammy and replace existing bionic servers with
jammy.
Change-Id: If2e8a683c32eca639c35768acecf4f72ce470d7d
Noticed this randomly from cron mail and unattended-upgrades. These are
the VMware guest utilities. We don't run inside of VMware, so we do not
need this installed.
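Assuming the package in question is open-vm-tools (an assumption based
on the description above), the removal is just:

    - name: Remove VMware guest utilities  # package name assumed
      ansible.builtin.apt:
        name: open-vm-tools
        state: absent
        purge: true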
Change-Id: Ieb2c7601c59f56d78fa350af7e0484c1cb6b8e9b
We thought byobu was removed, but it is sneaky and its file eventually
changed name for some reason. Make sure both versions of the file are
absent.
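Something along these lines, where the two profile.d file names are
assumptions about what byobu has shipped over time:

    - name: Ensure byobu launcher files are absent  # file names assumed
      ansible.builtin.file:
        path: '{{ item }}'
        state: absent
      loop:
        - /etc/profile.d/Z97-byobu.sh
        - /etc/profile.d/Z98-byobu.sh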
Change-Id: I0cef293732b02228433dca5b4aa648d550ae5254
These two apt.conf.d config files are installed by different packages
but overlap in the configuration they set. Unfortunately, apt parses
apt.conf.d fragments in lexical order and later values override earlier
ones, so if the wrong one sets the flag to disable periodic updates it
wins.
To ensure that we continue to auto update and handle different packages
supplying different config files, we manage the entirety of the periodic
config in both of these files at the same time, using a common source
file.
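A sketch of the approach (the source file name is illustrative):

    - name: Manage apt periodic config in both fragments
      ansible.builtin.copy:
        src: apt-periodic.conf   # the common source file
        dest: '{{ item }}'
        owner: root
        group: root
        mode: '0644'
      loop:
        - /etc/apt/apt.conf.d/10periodic
        - /etc/apt/apt.conf.d/20auto-upgrades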
Change-Id: I5e408fd7c343adb1de9ec564fe430a6f31ecc360
This file has been seen on a few servers with the Unattended-Upgrades
flag set to 0 disabling daily unattended upgrades. Most of our servers
have this set to 1 and are fine, but let's go ahead and manage this file
directly to ensure it is always 1 and auto upgrades are enabled.
Note that previously we had been setting this via apt.conf.d/10periodic,
which seems to come from the update-notifier-common package on older
systems and is now no longer used. Since 10periodic sorts before
20auto-upgrades, the 20auto-upgrades file installed by
unattended-upgrades overrides its value. A future update would be to
coalesce both 10periodic and 20auto-upgrades together into one config
file.
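The managed file should end up matching the stock unattended-upgrades
default, along these lines:

    - name: Ensure daily unattended upgrades stay enabled
      ansible.builtin.copy:
        dest: /etc/apt/apt.conf.d/20auto-upgrades
        content: |
          APT::Periodic::Update-Package-Lists "1";
          APT::Periodic::Unattended-Upgrade "1";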
Change-Id: Ic0bdaaf881780072fda7e60ff89b60b3a07b5804
In order to avoid unfortunate collisions with statically assigned
container account UIDs and GIDs, cap normal users at 9999. That way
we can set our containers to use IDs 10000 and above.
Make sure adduser/addgroup's adduser.conf gets adjusted to match the
values we set in the login.defs referenced by the lower-level
useradd/groupadd tools too. We're not using non-Debian-derivative
servers these days, so don't bother to try making this work on other
distributions for the time being.
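Concretely, that means keeping two sets of values in sync, roughly like
this sketch:

    - name: Cap normal user ids in login.defs
      ansible.builtin.lineinfile:
        path: /etc/login.defs
        regexp: '^{{ item }}\s'
        line: '{{ item }} 9999'
      loop:
        - UID_MAX
        - GID_MAX

    - name: Match the cap in adduser.conf
      ansible.builtin.lineinfile:
        path: /etc/adduser.conf
        regexp: '^{{ item }}='
        line: '{{ item }}=9999'
      loop:
        - LAST_UID
        - LAST_GID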
Change-Id: I0068d5cea66e898c35b661cd559437dc4049e8f4
This will run the ua tool to attach a UA token and enable the
esm-infra repos. We also update unattended-upgrades to automatically
pull security updates from the ESM repos.
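On each host this boils down to the ua CLI, something like the
following (the token variable name is illustrative, and these command
invocations are not idempotent as written):

    - name: Attach the UA token
      ansible.builtin.command: ua attach {{ ua_token }}

    - name: Enable the ESM infra repos
      ansible.builtin.command: ua enable esm-infra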
Change-Id: Ifb487d12df7b799d5fd2973d56741e0757bc4d4f
We have identified an issue with stevedore < 3.3.0 where the
cloud-launcher, running under Ansible, causes stevedore to hash a /tmp
path into the entry-point cache file it creates, resulting in a
never-ending accumulation of cache files.
This appears to be fixed by [1], which is available in 3.3.0. Ensure
we install this on bridge. For good measure, add a ".disable" file as
we don't really need caches here.
There are currently 491,089 leaked files, so I didn't think it wise to
delete these in an Ansible loop as it would probably time out the job.
We can do this manually once we stop creating them :)
[1] d7cfadbb7d
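The ".disable" marker is just an empty file in stevedore's cache
directory; assuming the default cache location of
~/.cache/python-entrypoints, the task is roughly:

    - name: Disable stevedore entry-point caching  # cache path assumed
      ansible.builtin.file:
        path: ~/.cache/python-entrypoints/.disable
        state: touch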
Change-Id: If5773613f953f64941a1d8cc779e893e0b2dd516
This reverts the changes made with
e0fc90cd067647ffcf06e0bfb84fe11636d33be5 as it has been deployed.
Change-Id: If5de429d2259a151c5e4c22fab0c6588341465e1
This was added in 2013 with I68594d489ab50ef25d351162b9dcb50ca003c409
to avoid rsyslog trying to open /dev/xconsole, and is no longer
relevant.
To get back to the upstream default, remove the modified file, then
purge the package and re-install it. We can remove this shortly after
it has been applied to all servers.
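Roughly, and assuming the modified file is the xconsole stanza in
/etc/rsyslog.d/50-default.conf (an assumption), the one-time cleanup
looks like:

    - name: Remove our modified rsyslog config  # path assumed
      ansible.builtin.file:
        path: /etc/rsyslog.d/50-default.conf
        state: absent

    - name: Purge rsyslog
      ansible.builtin.apt:
        name: rsyslog
        state: absent
        purge: true

    - name: Reinstall rsyslog with upstream defaults
      ansible.builtin.apt:
        name: rsyslog
        state: present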
Change-Id: Icf47abc295a6de8d43553f0a4ebdc6ce1483284e
We have been having constant issues with the bionic arm64 mirror
shutting itself off. Before we go too far down the path of debugging
what appears to be a kernel oops issue, let's rebuild it as focal.
Update the sources list in the base. Update the testing to use a
focal node.
Change-Id: I5b7106e2263010ff353e8a1de43e73b0c0ec57e1
The iptables role is the only part of base that's important to run when
we run a service. Run it in the service playbooks and get rid of the
dependency on infra-prod-base.
Continue running it in base so that new nodes are brought up
with iptables in place.
Bump the timeout for the mirror job, because the iptables addition
seems to have just bumped it over the edge.
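For example, a service playbook would gain the role directly, something
like this (playbook and role names illustrative):

    - hosts: mirror
      roles:
        - iptables
        - mirror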
Change-Id: I4608216f7a59cfa96d3bdb191edd9bc7bb9cca39
If we move these into a subdir, it cleans up the number of things we
have to match on in job files sections.
Stop running disable-puppet-agent in base. We run it in run-puppet,
which should be fine.
Change-Id: Ia16adb96b11d25a097490882c4c59a50a0b7b23d