328 Commits

Author SHA1 Message Date
Zuul
b7bbbae981 Merge "Adding Prometheus blackbox exporter" 2019-09-20 17:25:04 +00:00
Zuul
94ece4a702 Merge "CI: collect more system configs (name resolution)" 2019-09-20 17:20:54 +00:00
Radosław Piliszek
e7d5c58415 CI: collect more system configs (name resolution)
This patch adds configs relevant to name resolution.

Change-Id: I7ebc2409e9ec0bd875abf0bf4e452bc89efe940d
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-09-20 10:30:49 +02:00
Mark Goddard
8e40629161 CI: Use VXLAN overlay network
VXLAN is necessary to run HA in CI (due to floating VIP
address handled by keepalived).
It also turned out to be required to have private
IPv6 address assignments.
This patch is based on linux bridge rather than OVS
to avoid problems with OVS deployed in containers.

This patch enables haproxy in multinode jobs.

Includes saving of linux networking details.

Makes DASHBOARD_URL agree with OS_AUTH_URL - properly uses the
pre-upgrade value for testing.

Co-authored-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Depends-on: https://review.opendev.org/683068
Depends-on: https://review.opendev.org/682957
Change-Id: I66888712da80c3d6f84ee4949762961664d3adea
2019-09-19 11:07:02 +02:00
Radosław Piliszek
e2f511b7d9 CI: Configure the upgrade jobs from the current branch
This lets us control the upgrade process entirely from the
current branch.

Change-Id: Ic8c39e415846596c23dae93c2839375a24e8b888
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-09-18 13:11:42 +00:00
Scott Solkhon
b22375ebfd Adding Prometheus blackbox exporter
This commit follows up the work in Kolla to provide deploy and configure the
Prometheus blackbox exporter.

An example blackbox-exporter module has been added (disabled by default)
called os_endpoint. This allows for the probing of endpoints over HTTP
and HTTPS. This can be used to monitor that OpenStack endpoints return a status
code of either 200 or 300, and the word 'versions' in the payload.

This change introduces a new variable `prometheus_blackbox_exporter_endpoints`.
Currently no defaults are specified because the configuration is heavily
dependent on the deployment.

Co-authored-by: Jack Heskett <Jack.Heskett@gresearch.co.uk>
Change-Id: I36ad4961078d90e2fd70c9a3368f5157d6fd89cd
2019-09-18 11:06:19 +01:00
Mark Goddard
8722c78763 CI: Test accessing dashboard
Also slightly refactor test-openstack.sh script.

Change-Id: I7f10f073e89d2b66367bbb700201b3cd412fc433
Depends-On: https://review.opendev.org/#/c/674241
Depends-On: https://review.opendev.org/#/c/668410
Depends-On: https://review.opendev.org/#/c/668409
2019-09-14 20:06:35 +00:00
Hongbin Lu
0f5e065855 Configure Zun for Placement (Train+)
After the integration with placement [1], we need to configure how
zun-compute is going to work with nova-compute.

* If zun-compute and nova-compute run on the same compute node,
  we need to set 'host_shared_with_nova' as true so that Zun
  will use the resource provider (compute node) created by nova.
  In this mode, containers and VMs could claim allocations against
  the same resource provider.
* If zun-compute runs on a node without nova-compute, no extra
  configuration is needed. By default, each zun-compute will create
  a resource provider in placement to represent the compute node
  it manages.

[1] https://blueprints.launchpad.net/zun/+spec/use-placement-resource-management

Change-Id: I2d85911c4504e541d2994ce3d48e2fbb1090b813
2019-09-10 01:47:15 +00:00
Marcin Juszkiewicz
a5808ad8ba Modernize the way of configuring Docker daemon
Instead of changing Docker daemon command line let's change config
for Docker instead. In /etc/docker/daemon.json file as it should be.

Custom Docker options can be set with 'docker_custom_config' variable.

Old 'docker_custom_option' is still present but should be avoided.

Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Change-Id: I1215e04ec15b01c0b43bac8c0e81293f6724f278
2019-09-05 08:19:26 +00:00
Zuul
d6e8394320 Merge "Check for CRITICAL, WARNING and ERROR log messages in CI" 2019-08-27 12:42:44 +00:00
Michal Nasiadka
4180bee020 Use fluentd image labels
In order to orchestrate smooth transition to fluentd 0.14.x
aka 1.0 stable branch aka td-agent 3
from td-agent repository - use image labels (fluentd_version
and fluentd_binary).

Depends-On: https://review.opendev.org/676411
Change-Id: Iab8518c34ef876056c6abcdb5f2e9fc9f1f7dbdd
2019-08-22 12:36:51 +00:00
Zuul
d9dd536cf8 Merge "CI: Zun jobs" 2019-08-17 11:58:46 +00:00
Zuul
83d8b1053c Merge "CI: Add docker inspect output to docker_info logs" 2019-08-16 15:40:38 +00:00
Mark Goddard
a14eee24d1 Check for CRITICAL, WARNING and ERROR log messages in CI
At the end of a CI run, check all log files.

Change-Id: I99afc1c5207757e35beabf7daebd86c56151c96d
2019-08-16 15:33:54 +00:00
Radosław Piliszek
d4de1d7520 CI: Zun jobs
- Test Zun on CentOS too
- Make etcd change also trigger Zun jobs (like kuryr and zun)
- Test multinode Zun deployments instead of AIO
  (more likely to break)
- In Zun scenario, stop configuring docker for legacy swarm mode
  (Zun is no swarm)
- Separate test-zun.sh testing script
- Show appcontainer to see which node it has been started on

Change-Id: I289b1009fe00aedb9b78cbd83298b14da5fd9670
Depends-On: https://review.opendev.org/676736
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-16 17:11:00 +02:00
Michal Nasiadka
8cf24bcc81 CI: Add docker inspect output to docker_info logs
Change-Id: I081f2f4762651bca935f08a67b20f21946aaf051
2019-08-16 09:30:16 +00:00
Zuul
fac646406f Merge "Testing Masakari role in gate" 2019-08-15 17:26:56 +00:00
Zuul
9401aab752 Merge "CI: Sanity check that nodepool.private_ipv4 is assigned" 2019-08-14 17:21:29 +00:00
Kien Nguyen
fbac54c5f5 Testing Masakari role in gate
Add Masakari testing into the Gate.

Change-Id: I52df33f963e7d2ae4059887df3d24d9e6642134e
Depends-On: https://review.opendev.org/#/c/615469/
Depends-On: https://review.opendev.org/#/c/615715
Implements: blueprint ansible-masakari
Co-Authored-By: Gaëtan Trellu <gaetan.trellu@incloudus.com>
2019-08-14 12:32:51 -04:00
Zuul
571c89712d Merge "CI: Collect docker and systemd configs" 2019-08-12 17:19:36 +00:00
Mark Goddard
eac1e479b7 CI: Sanity check that nodepool.private_ipv4 is assigned
During the MariaDB testing we saw a number of cases where this IP
address was not assigned to one or more hosts, which caused various
issues later on.

Change-Id: I61b54483e4553b926e9ddc0a8848b2daa6bc49f1
2019-08-06 19:03:05 +01:00
Zuul
418e9cccc7 Merge "ceph: fixes to deployment and upgrade" 2019-08-06 13:59:06 +00:00
Zuul
46f0b691dc Merge "CI: Fix multinode job glance issues" 2019-08-05 08:13:58 +00:00
Radosław Piliszek
826f6850d0 ceph: fixes to deployment and upgrade
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.

2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.

3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).

4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide real issue.

5) Retry ceph admin keyring update until cluster works
Reordering deployment caused issue with ceph cluster not being
fully operational before taking actions on it.

6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.

7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.

[1] https://review.opendev.org/669315

Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-05 06:26:25 +00:00
Zuul
b59791ca92 Merge "Fix handling of docker restart policy" 2019-08-03 16:27:46 +00:00
Radosław Piliszek
d03172602d CI: Fix multinode job glance issues
This actually replaces two ad-hoc fixes with a more unified
solution (with comment for posterity).

Change-Id: I62f57cb489c900f68a0c7aeb3e20e4715c0e2661
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-26 19:30:36 +02:00
Radosław Piliszek
93ac16ae6b CI: fix checks for upgrade and multinode jobs
Multinode jobs did not run sanity checks for all the hosts,
only primary. Now they check all.

Additionally upgrades are now checked using the proper
(pre-upgrade) scripts (not that it matters too much as they
are the same atm) and both checks are done, not only failures,
but also config.

Change-Id: I10552e256edbddd5b1f8a8a7f8805262e72ce8d8
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-26 17:06:11 +02:00
Zuul
5a9ec1a773 Merge "CI: clean up requirements installation" 2019-07-25 11:59:41 +00:00
Radosław Piliszek
6a737b1968 Fix handling of docker restart policy
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.

This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.

I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.

[1] https://review.opendev.org/667363

Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-18 13:39:06 +00:00
Zuul
efa9819b18 Merge "Add ceph-mds/rgw/nfs to gate" 2019-07-17 16:02:44 +00:00
Radosław Piliszek
8a543098d6 CI: clean up requirements installation
We install kolla-ansible requirements in Zuul's Ansible playbooks.
This patch cleans up the installation in scripts so that they are
only concerned with auxiliary requirements:
- ansible (since we do not track it in requirements)
- ara (for log summaries)
- openstack clients (for first init and tests after deployment)

Additionally this patch installs openstack clients in a separate
virtualenv.
Note that all kolla-ansible requirements, ansible and ara are still
installed system-wide.

Change-Id: Iac04082ad39a9d823c515ba11c5db9af50ed225f
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-16 20:43:22 +02:00
Michal Nasiadka
a77b0f624e Add ceph-mds/rgw/nfs to gate
Depends-On: https://review.opendev.org/669315
Change-Id: I6946290cd890f74c59ed5394e8382a8b75c0c4cd
2019-07-16 17:44:22 +00:00
Radosław Piliszek
53ea3fe4af Trivial fix: log stderr of init-runonce as well
Missed by me in a recent merge.

TrivialFix
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>

Change-Id: I83b1e84a43f014ce20be8677868be3f66017e3c2
2019-07-09 15:38:47 +02:00
Zuul
887938bbcb Merge "Exit on failure in init-runonce" 2019-07-09 07:33:46 +00:00
Zuul
65783c90dd Merge "CI: Pull images before upgrade" 2019-07-08 09:21:57 +00:00
Zuul
db55408620 Merge "Fix conditionals in CI playbook" 2019-07-07 10:52:01 +00:00
Mark Goddard
f11d3c694a CI: Pull images before upgrade
This is the documented procedure.

Change-Id: I09ca99e92b112621d66b564a88b13658632242f5
2019-07-04 18:11:16 +00:00
Zuul
56c3603586 Merge "CI: Keep stderr in ansible logs" 2019-07-04 07:45:54 +00:00
Radosław Piliszek
2430c29033 CI: Collect docker and systemd configs
Change-Id: I59a05e8a0a2656596d2cced61bd98f2aa790d60b
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 19:45:00 +00:00
Zuul
6aba50e66a Merge "CI: Use template-overrides.j2 from kolla" 2019-07-03 19:22:21 +00:00
Radosław Piliszek
b9aa8b38f4 CI: Keep stderr in ansible logs
Otherwise ara had only the stderr part and logs only the
stdout part which made ordered analysis harder.

Additionally add -vvv for the bootstrap-servers run.

Change-Id: Ia42ac9b90a17245e9df277c40bda24308ebcd11d
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-02 20:44:33 +02:00
Radosław Piliszek
20ab480ca5 CI: Use template-overrides.j2 from kolla
Some kolla-ansible jobs failed due to using external mirrors
instead of local ones.
This was due to not using the template override provided by kolla.
This patch fixes that.

Depends-On: https://review.opendev.org/668226
Change-Id: I27f714fdf05e521aa8ce25c5683a452ceb35eeb8
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 17:00:53 +00:00
Radosław Piliszek
a0bdc3669a Add note to CI config regarding registry during upgrade
Change-Id: Ifc898015b9b523ef4c50fc969e464f05762f2151
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 18:45:30 +02:00
Zuul
470108a1c0 Merge "Revert "CI - remove unnecessary logic when building images for upgrade"" 2019-07-01 15:48:47 +00:00
Mark Goddard
acac12798c Revert "CI - remove unnecessary logic when building images for upgrade"
This reverts commit 8ce5ffd0c21c221d88bacca5fec03ca042dfed85.

Change-Id: I81ce7c007ff267ebbbb721bcdb7eebc0dd575bf8
2019-07-01 11:12:58 +00:00
Mark Goddard
bc08b44fd1 Exit on failure in init-runonce
Previously we sourced this script in tests/deploy.sh, but this was
recently changed. Following that change we lost the errexit setting,
meaning we ignore errors in init-runonce.

Adding errexit in the script itself means that all callers get error
handling.

Also log init-runonce output.

TrivialFix

Change-Id: I9b35bd5f0f76eec26ddd968d093a3a5fd55a7ce2
2019-06-28 14:31:24 +00:00
Mark Goddard
3b218fd0db Fix conditionals in CI playbook
These were not templated, so always evaluated to true. This shouldn't be
causing any issues.

Change-Id: I7b8e407e688ba201c4f7d1a94bbd41af0918e7df
2019-06-27 10:32:22 +01:00
Zuul
693a30275f Merge "Add CI job for ironic" 2019-06-24 15:15:41 +00:00
Radosław Piliszek
8ce5ffd0c2 CI - remove unnecessary logic when building images for upgrade
Docker registry being insecure is handled by docker_registry_insecure
which is set to true by default when docker_registry is set.
The removed code had no effect because docker_registry is not changed
anyway for base (pre-upgrade) install.

This change makes config more readable and also prevents a potential
conflict with the zun profile if ever used in upgrade mode.

Change-Id: I9b5ae8c5b534fa6cce9dbaca8af191e2ca79d19f
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-06-21 11:31:30 +00:00
Zuul
6cae4dedfe Merge "Remove nova-consoleauth" 2019-06-17 16:28:45 +00:00