We have two standalone roles, puppet and cloud-launcher, but we
currently install them with galaxy, so Depends-On patches don't
work. We also install them on every run, even when the playbook
in question doesn't need them.
Add two roles: one to install the set of ansible roles needed by
the host in question, and another to encapsulate the sequence of
running puppet, which now includes installing the puppet role,
installing puppet, disabling the puppet agent and then running
puppet.
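A rough sketch of what the first role could do, assuming a
per-host variable listing the roles to clone (the variable, repo
URL pattern and paths are illustrative, not the actual
implementation):

  # roles/install-ansible-roles/tasks/main.yaml (sketch)
  - name: Clone the ansible roles this host needs
    git:
      repo: "https://opendev.org/opendev/ansible-role-{{ item }}"
      dest: "/etc/ansible/roles/{{ item }}"
    loop: "{{ host_ansible_roles | default([]) }}"

Cloning from git rather than installing from galaxy is what lets
Depends-On patches take effect in testing.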
As a followup, we'll do the same thing with the puppet modules,
so that we aren't cloning and rsyncing ALL of the puppet modules
every time regardless of what a run actually needs.
Change-Id: I69a2e99e869ee39a3da573af421b18ad93056d5b
Zuul is publishing lovely container images, so we should
go ahead and start using them.
We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.
Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.
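A hedged sketch of the don't-start flag; the variable name,
paths and compose invocation here are assumptions:

  # roles/zuul-web/defaults/main.yaml
  zuul_web_start: false

  # roles/zuul-web/tasks/main.yaml
  - name: Start zuul-web container
    command: docker-compose up -d
    args:
      chdir: /etc/zuul-web-compose
    when: zuul_web_start | bool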
Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
* stop zk
* split data and log files and move them to new locations
* remove zk packages
* start zk containers (a compose sketch follows this list)
* remove from emergency; land this change.
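A hedged docker-compose sketch for the zk containers; the image
tag and mount points are assumptions, not the exact production
config, but the two volumes show the data/log split:

  # /etc/zookeeper-compose/docker-compose.yaml
  version: '2'
  services:
    zk:
      image: zookeeper:3.5
      network_mode: host
      volumes:
        - /var/zookeeper/data:/data
        - /var/zookeeper/datalog:/datalog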
Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
So that we can start running things from the zuul source rather
than from update-system-config and /opt/system-config, we need to
install a few things onto the host in install-ansible so that the
ansible env is standalone.
This introduces a split execution path. The ansible config is
now all installed globally onto the machine by install-ansible
and does not reference a git checkout.
For running ad-hoc commands, an ansible.cfg is introduced inside
the root of the system-config dir. So if ansible-playbook is
executed with PWD==/opt/system-config it will find that
ansible.cfg, which takes precedence, and thus any content from
the system-config checkout takes precedence too.
As a followup we'll make /opt/system-config/ansible.cfg written
out by install-ansible from the same template, and we'll update
the split to make ansible only work when executed from one of
the two configured locations, so that it's clear where we're
operating from.
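That followup might be as small as one more install-ansible
task, sketched here (the template name is an assumption):

  - name: Write ansible.cfg into the system-config checkout
    template:
      src: ansible.cfg.j2
      dest: /opt/system-config/ansible.cfg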
Change-Id: I097694244e95751d96e67304aaae53ad19d8b873
The normal callback plugin makes stdout and stderr output from
tasks unreadable. Update to the debug plugin, which prints their
output nicely in the way we'd like.
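The switch itself is one config line; sketched here as a task
against the global config installed by install-ansible (the path
is an assumption):

  - name: Use the debug stdout callback
    ini_file:
      path: /etc/ansible/ansible.cfg
      section: defaults
      option: stdout_callback
      value: debug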
Change-Id: I3a6b31af7d6132a1ee31a280f7f21f3132856273
This should be mostly a no-op - but we will need to do a shutdown
in emergency mode.
Tell the gerrit role to not run compose up when run as part of
remote_puppet_git.
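A hedged sketch of the guard inside the gerrit role; the
variable name is an assumption:

  - name: Run docker-compose up
    command: docker-compose up -d
    args:
      chdir: /etc/gerrit-compose
    when: gerrit_run_compose_up | default(true) | bool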
Change-Id: Id45376c2697656a12afeacf317b6f26c85c08dad
We have LE dns entries for review.o.o, but we're not actually
requesting the cert. Go ahead and request it - it'll make the
apache config easier to sort out.
Get the openstack.org certs for review-dev while we're at it.
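A hedged sketch of the request, following the letsencrypt_certs
host-var pattern; the host file and key names are illustrative:

  # host_vars/review01.openstack.org.yaml
  letsencrypt_certs:
    review.opendev.org-main:
      - review.opendev.org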
Change-Id: I91d06c97993ba37204bd1fc326ae823e1b9c0c1a
Depends-On: https://review.opendev.org/707267
Depends-On: https://review.opendev.org/707255
Add bool to use_upstream_docker conditional
This is an ansible behavior change that's coming in 2.12 but is
currently spewing warnings. The warnings make the log really hard
to read, so just fix it.
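The fix is just an explicit cast on the conditional; the task
shown is illustrative:

  - name: Install docker from upstream
    include_tasks: upstream.yaml
    when: use_upstream_docker | bool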
Disable group name auto-renaming
If a group name contains characters that aren't valid in python
identifiers, you can't look it up in jinja attribute style like
"groups.group-name", so ansible auto-transforms the name so you
can write "groups.group_name". This confusing behavior is going
away. In the meantime, ansible warns everyone who has such group
names, since it has no idea how you might be accessing them. Add
a config setting to suppress the warning about -'s in group names.
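Sketched as a tweak to the global config written by
install-ansible; the file path is an assumption, while
force_valid_group_names is the ansible option involved:

  - name: Suppress warnings about '-' in group names
    ini_file:
      path: /etc/ansible/ansible.cfg
      section: defaults
      option: force_valid_group_names
      value: ignore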
Change-Id: Ib3262025799af7c3171ed0b079cb1dd969075931
We ended up running into a problem with nodepool-built control
plane images (boot from volume does not allow us to delete images
that are in use by a nova instance). We have decided to clean
this up and go back to not doing this until we can do it more
properly.
Note this isn't a revert because having a group for access to control
plane clouds does seem like a good idea in general and I believe there
have been changes we'd have to resolve in the clouds.yaml files anyway.
Depends-On: https://review.opendev.org/#/c/665012/
Change-Id: I5e72928ec2dec37afa9c8567eff30eb6e9c04f1d
In order to have nodepool build images and upload them to control
plane clouds, add them to the clouds.yaml on the nodepool-builder
hosts. Keep them out of the launcher configs by splitting the config
templates. So that we can keep our copies of things to a minimum,
create a group called "control-plane-clouds" and put bridge and nb0*
in it.
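In the yamlgroup inventory the addition might look like this;
hostnames and patterns are illustrative:

  # inventory/groups.yaml
  groups:
    control-plane-clouds:
      - bridge.openstack.org
      - nb0*.openstack.org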
There are mentions of clouds in here that we no longer use; a
followup patch will clean those up.
NOTE: Requires shifting the clouds config dict from
host_vars/bridge.openstack.org.yaml to group_vars/control-plane-clouds.yaml
in the secrets on bridge.
Needed-By: https://review.opendev.org/640044
Change-Id: Id1161bca8f23129202599dba299c288a6aa29212
The server has been removed, remove it from inventory.
While we're here, s/graphite.openstack.org/graphite.opendev.org/
... it's a CNAME redirect but we might as well clean up.
Change-Id: I36c951c85316cd65dde748b1e50ffa2e058c9a88
This leaves ask.o.o and lists.o.o, which are still running Trusty, and
the cgit servers, which are likely to be decommissioned soon.
Change-Id: I78e7fd9e3079cc760da0aad955f6eeb32d442fc3
Two related changes that need to go together because we test with the
production groups.yaml.
Confusingly, there are arm64 PC1 puppet repos, but it turns out
they contain only the common java parts. The puppet-agent package
is not available, and it doesn't seem like it will be [1]. I
think this means we can not run puppet4 on our arm64 xenial ci
hosts.
The problem is that the mirrors have been updated to puppet4 --
runs are now breaking on the arm mirrors because they don't have
puppet-agent packages. It seems all we can really do at this
point is continue to run them on puppet3.
This is hard (impossible?) to express with an fnmatch in the
existing yamlgroups syntax. We could do something like listing
all the mirror hosts and using anchors etc, but we would have to
keep that maintained. Add a feature to the inventory plugin: if a
list entry starts with a ^ it is considered a full regex and
passed to re.match. This allows us to write more complex matchers
where required -- in this case the arm64 ci mirror hosts are
excluded from the puppet4 group.
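An illustrative entry mixing the two styles; the patterns
themselves are made up:

  groups:
    puppet4:
      # plain entries remain fnmatch globs
      - 'bridge*.openstack.org'
      # entries starting with ^ are full regexes for re.match
      - '^(?!mirror\d+\..*arm64).*\.openstack\.org$'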
Testing is updated.
[1] https://groups.google.com/forum/#!msg/puppet-dev/iBMYJpvhaWM/WTGmJvXxAgAJ
Change-Id: I828e0c524f8d5ca866786978486bc04829464b47
In roughly lexicographical order, upgrade a batch of servers to puppet
4. We skip ask-staging because although it is in the futureparser group
it was temporarily disabled in puppet and so hasn't actually gone
through the futureparser validation stage yet.
Depends-On: https://review.openstack.org/643465
Change-Id: I3971ffb9800e95aaaba0076ec3bd6a05cd92a750
Remove the puppetry for managing nameservers, as we now use
ansible-configured nameservers without puppet.
We will need to follow this up with deletion of the existing
ns*.openstack.org and adns1.openstack.org servers.
Change-Id: Id7ec8fa58c9e37ce94ec71e4562607914e5c3ea4
An upcoming change will remove review.openstack.org and
puppetmaster.openstack.org from our hostgroups, since these servers
have been deleted from the provider already. We were explicitly
testing the hostgroup membership for the former, so replace that
with a couple of new ones which should provide more stable coverage
going forward.
Change-Id: Ida28b65e9f1dc01f233cc9bff4ce32aef70e347a
This change enables the installation of the ARA callback plugin in
the install-ansible role. It does not take care of any web reporting
capabilities.
ARA will not be installed and set up by default.
It can be installed and configured by setting
"install_ansible_enable_ara" to "true".
Co-Authored-By: David Moreau-Simard <dmsimard@redhat.com>
Co-Authored-By: Ian Wienand <iwienand@redhat.com>
Change-Id: Iea84ec8e23ca2e3f021aafae4e89c764f2e05bd2
Rename install_openstacksdk to install_ansible_openstacksdk to
make it clear this is part of the install-ansible role, and that
it's the openstacksdk version used with ansible (which might be
important if we switch to virtualenvs). This also clears up
inconsistency when we add the ARA install options too.
Change-Id: Ie8cb3d5651322b3f6d2de9d6d80964b0d2822dce
Similar to the pinning introduced in
Ic465efb637c0a1eb475f04b0b0e356d8797ecdeb, use the "latest"
openstacksdk package and allow for passing of pinned versions if
required.
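One plausible shape for the pin, by analogy with the ansible
pinning; the variable name and version are assumptions:

  # default: latest release
  install_ansible_openstacksdk_version: ''

  # or pin if required, e.g.:
  # install_ansible_openstacksdk_version: '0.39.0'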
Update the devel test to also use the master branch of
openstacksdk.
Change-Id: I4b437ca9024c87903bdd3569c8309cde725ce28e
This adds arguments to "install-ansible" to allow us to specify the
package name and version.
This is used to pin bridge.o.o to 2.7.0 (see
I9cf4baf1b15893f0c677567f5afede0d0234f0b2).
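The pin might then look like this; the variable names are
assumptions based on the role name:

  # host_vars/bridge.openstack.org.yaml
  install_ansible_name: ansible
  install_ansible_version: 2.7.0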
A new job is added to test against the ansible-devel branch. Added as
voting for now, until it proves to be a concern.
Change-Id: Ic465efb637c0a1eb475f04b0b0e356d8797ecdeb
It's designed to always be used from the latest version.
This trips an ansible lint rule (ANSIBLE0010) which we can ignore, as
we often have pip things that we want to install the latest release
of automatically.
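A hedged example of opting a task out of the rule; the package
name is illustrative:

  - name: Install the latest release from pip
    pip:
      name: sometool        # illustrative package name
      state: latest         # this is what trips ANSIBLE0010
    tags:
      - skip_ansible_lint   # latest is deliberate here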
Change-Id: Ieac93ab3a555f2423d4fbcf101d6d9681ae0e497
We removed the mirror nodes from the webservers group to fix iptables
rule application on the nodes. Unfortunately we didn't update our test
that tries to assert mirrors should be in the webservers group. Update
the test results fixture to remove webservers as a valid group for a
mirror node.
Change-Id: Iba18e54f4df4a36c0247f65642faacca9d195769
This mocks out enough of the Ansible inventory framework so we can
test the group matching against a range of corner cases as present in
the results.yaml file.
Change-Id: I05114d9aae6f149122da20f239c8b3546bc140bc
The constructed inventory plugin allows expressing additional
groups, but it's too heavyweight for our needs. Additionally, it
is a full inventory plugin that will add hosts to the inventory
if they don't exist.
What we want instead is something that will associate existing hosts
(that would have come from another source) with groups.
This also switches to using emergency.yaml, which uses the same
format, instead of emergency.
We add an extra groups file for gate testing to ensure the CI nodes
get puppet installed.
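The file format the plugin consumes is a simple mapping of group
name to match patterns; a hedged sketch with illustrative names
(emergency.yaml uses the same format):

  # inventory/groups.yaml
  groups:
    puppet:
      - '*.openstack.org'
    disabled:
      - 'review-dev*'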
Change-Id: Iea8b2eb2e9c723aca06f75d3d3307893e320cced