595 Commits

Author SHA1 Message Date
likui
42035e211f The deprecated iscsi deploy interface has been removed since xena
[1] https://docs.openstack.org/releasenotes/ironic/xena.html

Change-Id: Ic0dd9fa7ef76b647682e124b1bae52e931a38225
2021-11-15 18:30:59 +08:00
Zuul
948088abe2 Merge "Update Manila deploy steps for Wallaby" 2021-10-20 09:36:35 +00:00
Maksim Malchuk
37e4dba879 Add support for Ironic inspection through DHCP-relay
This change updates documentation, examples and tests to support
Ironic inspection through DHCP-relay. The dnsmasq service should be
configured with a more specific format set in the variable
``ironic_dnsmasq_dhcp_range``. See the dnsmasq manual page [1].

[1] https://thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html

Change-Id: I9488a72db588e31289907668f1997596a8ccdec6
Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>
2021-10-12 22:16:04 +03:00
wu.chunyang
1f71df1a8b Remove chrony role from kolla
chrony is not supported in the Xena cycle, so remove it from kolla.

Moved tasks from chrony role to chrony-cleanup.yml playbook to avoid a
vestigial chrony role.

Co-Authored-By: Mark Goddard <mark@stackhpc.com>

Change-Id: I5a730d55afb49d517c85aeb9208188c81e2c84cf
2021-09-30 18:56:14 +02:00
Zuul
bfba65f286 Merge "Add support for Ceph RadosGW integration" 2021-09-30 16:06:48 +00:00
Mark Goddard
8c5012e940 Add support for Ceph RadosGW integration
* Register Swift-compatible endpoints in Keystone
* Load balance across RadosGW API servers using HAProxy

The support is exercised in the cephadm CI jobs, but since RGW is
not currently enabled via cephadm, it is not yet tested.

https://docs.ceph.com/en/latest/radosgw/keystone/

Implements: blueprint ceph-rgw

Change-Id: I891c3ed4ed93512607afe65a42dd99596fd4dbf9
2021-09-30 13:08:13 +00:00
Mark Goddard
66c84843e4 Deploy source type images by default
Source images get the most test coverage, so it makes sense to deploy
these by default.

Change-Id: I8d0c8750e2c1600e84cc2e677a4eae0e9f502dac
2021-09-30 08:07:48 +00:00
Zuul
f99bf8325f Merge "Never make Docker registry insecure by default" 2021-09-09 10:49:03 +00:00
Zuul
83c5d95b47 Merge "Support monitoring Fluentd with Prometheus" 2021-08-27 09:34:12 +00:00
Radosław Piliszek
802f7c6218 Never make Docker registry insecure by default
To follow best security practices and help fellow operators.

More details inline and in the linked bug report.

Closes-Bug: #1940547
Change-Id: Ide9e9009a6e272f20a43319f27d257efdf315f68
2021-08-20 18:23:56 +00:00
Zuul
a98076f11c Merge "Use more RMQ flags for less busy wait" 2021-08-19 18:20:13 +00:00
Skylar Kelty
8d5dde3723 Update Manila deploy steps for Wallaby
Manila has changed from using subfolders to subvolumes.
We need a bit of a tidy up to prevent deploy errors.
This change also adds the ability to specify the ceph FS
Manila uses instead of relying on the default "first found".

Closes-Bug: #1938285
Closes-Bug: #1935784
Change-Id: I1d0d34919fbbe74a4022cd496bf84b8b764b5e0f
2021-08-17 10:01:58 +01:00
Doug Szumski
b692ce7af1 Support monitoring Fluentd with Prometheus
This patch adds support for integrating Prometheus with Fluentd.
This can be used to extract useful information about the status
of Fluentd, such as output buffer capacity and logging rate,
and also to extract metrics from logs via custom Fluentd
configuration. More information can be found in [1].

[1] https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus

Change-Id: I233d6dd744848ef1f1589a462dbf272ed0f3aaae
2021-08-09 10:12:20 +01:00
Zuul
1a4a8c1615 Merge "Reduce container metrics cardinality" 2021-08-06 14:47:38 +00:00
Zuul
bb05cf1150 Merge "Remove support for Prometheus v1" 2021-08-06 14:12:18 +00:00
Zuul
295c69b5ee Merge "Remove tempest role" 2021-08-06 14:04:55 +00:00
Piotr Parczewski
0d79d25fe9 Remove support for Prometheus v1
Change-Id: I0d7c7f47e6653cf2903589a9c86798a8c6404af5
2021-08-05 21:07:22 +02:00
Radosław Piliszek
d7cdad5325 Use more RMQ flags for less busy wait
As mentioned in the Iced014acee7e590c10848e73feca166f48b622dc
commit message, in Ussuri+ we can use ``+sbwtdcpu none
+sbwtdio none`` as well. This is due to relying on RMQ-provided
erlang in version 23.x.

This change adds the extra arguments by default.
It should be backported down to Ussuri before we do a release with
Iced014acee7e590c10848e73feca166f48b622dc.

Change-Id: I32e247a6cb34d7f6763b544f247fd408dce2b3a2
2021-07-28 19:14:43 +00:00
Piotr Parczewski
c2ae21fd97 Reduce container metrics cardinality
Adds support for passing extra runtime options to cAdvisor.
By default new options disable exporting rarely useful metrics
and labels by cAdvisor. This helps reduce the load on Prometheus
and cAdvisor itself.

Change-Id: I81f3845d6cd03a70a0c8569f8d0ea421027df083
2021-07-08 16:31:44 +02:00
wu.chunyang
5261998467 Remove tempest role
Remove tempest role as planned

Change-Id: If3cf073e88c83f670c867a49afe48845f9e81008
2021-07-07 21:58:39 +08:00
Rafael Weingärtner
15f2fdcd5d Make setup module arguments configurable
Ansible facts can have a large impact on the performance of the Ansible
control host. This patch introduces some control over which facts are
gathered (kolla_ansible_setup_gather_subset) and which facts are stored
(kolla_ansible_setup_filter). By default we do not change the default
values of these arguments to the setup module. The flexibility of these
arguments is limited, but they do provide enough for a large performance
improvement in a typical moderate to large OpenStack cloud.

In particular, the large complex dict fact for each interface has a
large effect, and on an OpenStack controller or hypervisor there may be
many virtual interfaces. We can use the kolla_ansible_setup_filter
variable to help:

    kolla_ansible_setup_filter: 'ansible_[!qt]*'

This causes Ansible to collect but not store facts whose names begin
with ansible_q or ansible_t, which includes the virtual interface facts.
Currently we are not referencing other facts excluded by the pattern
within Kolla Ansible.
Note that including the 'ansible_' prefix causes meta facts module_setup
and gather_subset to be filtered, but this seems to be the only way to
get a good match on the interface facts. To work around this, we use
ansible_facts rather than module_setup to detect whether facts exist in
the cache.
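
To illustrate which fact names the filter above keeps, here is a minimal Python sketch. It assumes the setup module applies ``filter`` as a shell-style (fnmatch) pattern. The sample fact names are hypothetical, and ``kept_facts`` is an illustrative helper, not part of Kolla Ansible.

```python
from fnmatch import fnmatchcase

# The filter value quoted in the commit message.
SETUP_FILTER = "ansible_[!qt]*"


def kept_facts(names, pattern=SETUP_FILTER):
    """Return the fact names that match the filter and would be stored."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical fact names, as they might appear on a hypervisor.
sample = [
    "ansible_eth0",
    "ansible_hostname",
    "ansible_qbr0a1b2c3d",  # bridge created for an instance port
    "ansible_tap0a1b2c3d",  # tap device for an instance port
    "module_setup",         # meta fact without the ansible_ prefix
]

print(kept_facts(sample))
```

In this sketch the names starting with ``ansible_q`` or ``ansible_t`` are dropped, and so is ``module_setup``, which matches the note about meta facts being filtered.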

The exact improvement will vary, but has been reported to be as large as
18x on systems with many virtual interfaces.

For reference, here are some other tunings tried:

* Increased the number of forks (great speedup depending on the size of
  the deployment)
* Use `strategy = mitogen_linear` (cut processing time in half)
* Ansible caching (little speed up)
* SSH tuning (little speed up)

Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Closes-Bug: #1921538
Change-Id: Iae8ca4aae945892f1dc65e1b10381d2e26e88805
2021-07-02 10:30:35 -03:00
Zuul
3d7bcca990 Merge "Drop support for Cinder ZFSSA backend" 2021-06-22 02:43:58 +00:00
Zuul
2237e45db3 Merge "Revert "Reduce container metrics cardinality"" 2021-06-21 12:47:19 +00:00
Radosław Piliszek
0158221fd2 Drop support for Cinder ZFSSA backend
Following upstream which removed ZFSSA support in Ussuri [1].

[1] https://review.opendev.org/c/openstack/cinder/+/690137

Change-Id: Idb311e18b437fba696759ecb1cf2a6b4803aa5c5
2021-06-21 09:53:01 +00:00
Radosław Piliszek
640dbb03fa Revert "Reduce container metrics cardinality"
This reverts commit c6259158e3eff4aff9770b7044b0179a7de533aa.

Reason for revert: cAdvisor fails with:

invalid value "percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process" for flag -disable_metrics: unsupported metric "referenced_memory" specified in disable_metrics

Change-Id: I1a0eea5c20f95f38c707401b56b7d2454484377d
2021-06-20 13:58:32 +00:00
Zuul
663be549e0 Merge "Reduce container metrics cardinality" 2021-06-20 11:10:48 +00:00
Piotr Parczewski
c6259158e3 Reduce container metrics cardinality
Adds support for passing extra runtime options to cAdvisor.
By default new options disable exporting rarely useful metrics
and labels by cAdvisor. This helps reduce the load on Prometheus
and cAdvisor itself.

Change-Id: Id0144e8fa518e3236cb94ba2e3961fb455d36443
2021-06-16 08:10:51 +02:00
wu.chunyang
3009109616 Remove rally deployment
Remove rally role as planned

Change-Id: Ic898efe42b21b01c45d4621af2cf90ecd7afc398
2021-06-16 09:12:34 +08:00
Zuul
f5fa171983 Merge "Add ability to use the Neutron packet logging framework" 2021-06-14 14:44:53 +00:00
Zuul
4dcea739d5 Merge "Remove support for panko" 2021-06-11 20:56:40 +00:00
Matthias Runge
ccf8cc5dca Remove support for panko
The project is deprecated and in the process of being removed
from OpenStack upstream.

Change-Id: I9d5ebed293a5fb25f4cd7daa473df152440e8b50
2021-06-11 18:00:05 +02:00
John Garbutt
70f6f8e4c0 Reduce RabbitMQ busy waiting, lowering CPU load
On machines with many cores, we were seeing excessive CPU load on systems
that were not very busy. With the following Erlang VM argument we saw
RabbitMQ CPU usage drop from about 150% to around 20%, on a system with
40 hyperthreads.

    +S 2:2

By default RabbitMQ starts N schedulers where N is the number of CPU
cores, including hyper-threaded cores. This is fine when you assume all
your CPUs are dedicated to RabbitMQ. It's not a good idea in a typical
Kolla Ansible setup. Here we go for two scheduler threads.
More details can be found here:
https://www.rabbitmq.com/runtime.html#scheduling
and here:
https://erlang.org/doc/man/erl.html#emulator-flags

    +sbwt none

This stops busy waiting of the scheduler, for more details see:
https://www.rabbitmq.com/runtime.html#busy-waiting
Newer versions of rabbit may need additional flags:
"+sbwt none +sbwtdcpu none +sbwtdio none"
But this patch should be back portable to older versions of RabbitMQ
used in Train and Stein.
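
As a rough illustration of how the flags above combine, here is a small Python sketch. The helper name ``rabbitmq_erl_args`` and its parameters are hypothetical. It only builds the argument string and does not show how Kolla Ansible passes it to the Erlang VM.

```python
def rabbitmq_erl_args(schedulers=2, extra_busy_wait_flags=False):
    """Build an Erlang VM argument string from the flags discussed above.

    schedulers: number of scheduler threads, used for both values of +S.
    extra_busy_wait_flags: add the flags that newer RabbitMQ versions may need.
    """
    args = [f"+S {schedulers}:{schedulers}", "+sbwt none"]
    if extra_busy_wait_flags:
        args += ["+sbwtdcpu none", "+sbwtdio none"]
    return " ".join(args)


# Flags intended to be back portable to older RabbitMQ versions.
print(rabbitmq_erl_args())
# Flags including the additional busy-wait settings.
print(rabbitmq_erl_args(extra_busy_wait_flags=True))
```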

Note that information on this tuning was found by looking at data from:
rabbitmq-diagnostics runtime_thread_stats
More details on that can be found here:
https://www.rabbitmq.com/runtime.html#thread-stats

Related-Bug: #1846467

Change-Id: Iced014acee7e590c10848e73feca166f48b622dc
2021-06-07 13:18:39 +01:00
Florian LEDUC
e923236001 Add ability to use the Neutron packet logging framework
* Enables the Neutron packet logging framework for OVS
(https://docs.openstack.org/neutron/latest/admin/config-logging.html).
* Adds a toggle variable "enable_neutron_packet_logging"

Change-Id: Ica3594cdac634b496949a06ed813dccd18090af4
Implements: blueprint neutron-log-service-plugin
2021-05-11 13:50:49 +02:00
Doug Szumski
82cf40edf2 Remove Monasca Grafana service
In the Xena cycle it was decided to remove the Monasca
Grafana fork due to lack of maintenance. This commit removes
the service and provides a limited workaround using the
Monasca Grafana datasource with vanilla Grafana.

Depends-On: I9db7ec2df050fa20317d84f6cea40d1f5fd42e60
Change-Id: I4917ece1951084f6665722ba9a91d47764d3709a
2021-04-27 11:06:25 +00:00
Mark Goddard
db517a44e4 masakari: support host monitor
Change-Id: I3f43df7766c57622ab8d01a759fbeeef0a0c2b93
Implements: blueprint masakari-hostmonitor
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2021-04-08 16:39:47 +00:00
Gaëtan Trellu
9f578c85e0 Add HAcluster Ansible role
Adds HAcluster Ansible role. This role contains High Availability
clustering solution composed of Corosync, Pacemaker and Pacemaker Remote.

HAcluster is added as a helper role for Masakari which requires it for
its host monitoring, allowing it to provide HA to instances on a failed
compute host.

Kolla hacluster images merged in [1].

[1] https://review.opendev.org/#/c/668765/

Change-Id: I91e5c1840ace8f567daf462c4eb3ec1f0c503823
Implements: blueprint ansible-pacemaker-support
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
2021-04-08 06:39:19 +00:00
Radosław Piliszek
b647cb4128 Deprecate and disable chrony by default
Per [1].

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020707.html

Change-Id: Id6f3cd158bf5d01750971249b11364b6a8631789
Closes-Bug: #1885689
2021-04-06 09:17:51 +00:00
Michal Nasiadka
7a066f7154 Add missing octavia-driver-agent
For using 3rd party Octavia providers (such as OVN provider) an
octavia-driver-agent container must be running to expose those providers for
use.

OVN CI job has been extended with deploying Octavia and testing OVN Load
Balancer.

Closes-Bug: #1903506
Depends-On: https://review.opendev.org/c/openstack/kolla/+/771191

Change-Id: Ibafa8b7307981f2a51e630cc113d18af6162171c
2021-03-24 16:36:44 +00:00
Zuul
0bd235dffc Merge "don't use the same CIDR in octavia_amp_network_cidr and init-run-once" 2021-03-17 16:31:28 +00:00
Zuul
261cce4f45 Merge "Add missing elasticsearch cloudkitty storage and prometheus collector backend support." 2021-03-09 20:18:28 +00:00
Zuul
cc1dda3035 Merge "Add Neutron DHCP agent to OVN networking setup" 2021-03-09 20:15:28 +00:00
Doug Szumski
647ff667e6 Add variable for changing Apache HTTP timeout
In services which use the Apache HTTP server to service HTTP requests,
there exists a TimeOut directive [1] which defaults to 60 seconds. APIs
which come under heavy load, such as Cinder, can sometimes exceed this
which results in an HTTP 504 Gateway timeout, or similar. However, the
request can still be serviced without error. For example, if Nova calls
the Cinder API to detach a volume, and this operation takes longer
than the shortest of the two timeouts, Nova will emit a stack trace
with a 504 Gateway timeout. At some time later, the request to detach
the volume will succeed. The Nova and Cinder DBs then become
out-of-sync with each other, and frequently DB surgery is required.
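
The failure mode described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not how Nova or Cinder behave internally: the function name, the ``client_timeout`` parameter and its default are hypothetical, and the backend is assumed to always finish the operation eventually.

```python
def detach_outcome(operation_seconds, apache_timeout=60, client_timeout=60):
    """Model a slow API call that sits behind two timeouts.

    The caller gives up once the shorter of the two timeouts expires,
    while the backend is assumed to finish the operation regardless.
    """
    effective_timeout = min(apache_timeout, client_timeout)
    caller_sees_timeout = operation_seconds > effective_timeout
    return {
        "effective_timeout": effective_timeout,
        "caller_sees_timeout": caller_sees_timeout,
        # The caller recorded a failure but the backend succeeded.
        "state_out_of_sync": caller_sees_timeout,
    }


# A 90 second detach with the default 60 second Apache timeout.
print(detach_outcome(90))
# The same detach after raising both timeouts to 120 seconds.
print(detach_outcome(90, apache_timeout=120, client_timeout=120))
```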

Although strictly this category of bugs should be fixed in OpenStack
services, it is not realistic to expect this to happen in the short
term. Therefore, this change makes it easier to set the Apache HTTP
timeout via a new variable.

An example of a related bug is here:

https://bugs.launchpad.net/nova/+bug/1888665

Whilst this timeout can currently be set by overriding the WSGI
config for individual services, this change makes it much easier.

Change-Id: Ie452516655cbd40d63bdad3635fd66693e40ce34
Closes-Bug: #1917648
2021-03-04 11:25:06 +00:00
Bartosz Bezak
44cf00ab04 don't use the same CIDR in octavia_amp_network_cidr and init-run-once
Currently kolla-ansible uses the same CIDR in init-run-once script
and for octavia_amp_network_cidr.
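
A quick way to check whether two ranges collide is sketched below in Python, using the standard ``ipaddress`` module. The CIDR values are hypothetical examples and are not the values used by kolla-ansible or the init-run-once script.

```python
import ipaddress


def cidrs_overlap(first, second):
    """Return True if the two CIDR ranges share any addresses."""
    return ipaddress.ip_network(first).overlaps(ipaddress.ip_network(second))


# Hypothetical values: an identical range, then a distinct one.
print(cidrs_overlap("10.1.0.0/24", "10.1.0.0/24"))
print(cidrs_overlap("10.1.0.0/24", "10.1.2.0/24"))
```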

Change-Id: I5ab24fbf9be4acbd691f13d33908aa44d2b4d496
2021-02-26 09:15:23 +01:00
Piotr Parczewski
fc72887d31 Add Neutron DHCP agent to OVN networking setup
This commit adds the possibility to deploy Neutron's DHCP agents in an OVN
networking scenario.

Co-Authored-By: Michał Nasiadka <mnasiadka@gmail.com>

Change-Id: I073d04319b510182f5c1478e728c0c5bcc8799f1
2021-02-23 13:57:48 +01:00
Zuul
87d8bd414d Merge "Add support to OpenID Connect Authentication flow" 2021-02-19 23:15:07 +00:00
Pedro Henrique
f3fbe83708 Add support to OpenID Connect Authentication flow
This pull request adds support for the OpenID Connect authentication
flow in Keystone and enables both ID and access token authentication
flows. The ID token configuration is designed to allow users to
authenticate via Horizon using an identity federation; whereas the
Access token is used to allow users to authenticate in the OpenStack CLI
using a federated user.

Without this PR, an operator who wants to configure OpenStack to use
identity federation needs to do a lot of configuration in Keystone and
Horizon, and register a good number of different parameters using
the CLI such as mappings, identity providers, federated protocols, and
so on. Therefore, with this PR, we propose a method for operators to
introduce/present the IdP's metadata to Kolla-ansible, and based on the
presented metadata, Kolla-ansible takes care of all of the
configurations to prepare OpenStack to work in a federated environment.

Implements: blueprint add-openid-support
Co-Authored-By: Jason Anderson <jasonanderson@uchicago.edu>
Change-Id: I0203a3470d7f8f2a54d5e126d947f540d93b8210
2021-02-15 16:57:47 -03:00
Gaël THEROND (Fl1nt)
9e72c0cb4e Add missing elasticsearch cloudkitty storage and prometheus collector backend support.

* Fix various remaining typos.
* Fix trailing character on reno.
* Enable Elasticsearch when selected as cloudkitty backend.
* Add a check for ES index creation when ES required.
* Add a release note
* Fix release note line length issue.

Change-Id: I18f3d8f2e10a2996b2ebf92733a1770bef548bda
Closes-bug: #1895945
2021-02-08 09:29:08 +01:00
Carsten Koester
bf6d9308aa Add IPv6 configuration options to Octavia management network
If the Octavia/Amphora management network is created by Kolla, support
setting the IP address family and IPv6 address/RA mode.

Closes-Bug: #1913409

Change-Id: I9f2ef2196654c91596cb5c4b3c157bcee267226a
2021-02-03 08:24:04 -08:00
Piotr Parczewski
5db72659a0 [docs] Unify project's naming convention
There are inconsistencies across the documentation and the source code files
when it comes to project's name (Kolla Ansible vs. Kolla-Ansible). This
commit aims at unifying it so that the naming becomes consistent everywhere.

Change-Id: I903b2e08f5458b1a1abc4af3abefe20b66c23a54
2021-01-27 20:08:41 +01:00
Zuul
031e337898 Merge "Add Prometheus 2.x deployment" 2021-01-15 11:57:52 +00:00