
In order for the gates to get a meaningful metric to pass/fail against, write out the percent of failed attempts (that is, attempts that caused some exception) after receiving a termination/interrupt.

Along with this, change the except clause in the test loop to catch all failures and log them, rather than aborting the program. This will allow the program to keep running until explicitly stopped, providing more accurate results.

Since the exceptions are being caught at the loop level, the tests themselves can use a 'happy path,' without doing their own error handling. This makes implementation of tests more straightforward.

Change-Id: I10436d1f3e99234aa167ab7a765e59d46d54eeb8
Bowling Ball - OpenStack-Ansible Rolling Downtime Simulator
- date: 2017-03-09
- tags: rackspace, openstack, ansible
- category: *openstack, *nix
About
This project aims to test for issues with rolling downtime on OpenStack-Ansible deployments. It consists of two main components:
- The rolling_restart.py script
- The rolling_test.py script
The rolling_restart.py script will stop containers from a specified group in a rolling fashion: node 1 will stop, then start, then node 2, then node 3, and so on. This script runs from the deployment host.
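The rolling order described above can be sketched roughly as follows. This is only an illustration, not the code in rolling_restart.py: the function name, its parameters, and the container names are hypothetical, and the stop/start callables stand in for the Ansible commands the real script issues. The optional pause mirrors the wait time between stopping and starting a node.

```python
import time
from typing import Callable, Iterable, List, Tuple


def rolling_restart(
    containers: Iterable[str],
    stop: Callable[[str], None],
    start: Callable[[str], None],
    wait_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, str]]:
    """Stop and then start each container in turn, one node at a time.

    Returns the ordered list of (action, container) pairs that were performed.
    """
    actions = []
    for name in containers:
        stop(name)
        actions.append(("stop", name))
        # Optional pause between stopping and starting the same node.
        if wait_seconds > 0:
            sleep(wait_seconds)
        start(name)
        actions.append(("start", name))
    return actions


# Dry run with stand-in callables (hypothetical container names).
log = []
performed = rolling_restart(
    ["keystone_1", "keystone_2", "keystone_3"],
    stop=lambda name: log.append("stopping " + name),
    start=lambda name: log.append("starting " + name),
)
```

Passing the stop and start actions in as callables keeps the ordering logic testable without touching a real deployment.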
The tests directory contains scripts to generate traffic against the target services. They attempt to apply usage to a system that will be restarted by rolling_restart.py in order to measure the effects. These scripts run from a utility container.
The rolling_test.py script contains tests to generate traffic against the target services. These vary per service, but attempt to apply usage to a system that will be restarted by rolling_restart.py in order to measure the effects. This script runs from a utility container.
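The change description at the top of this page mentions catching every failure in the test loop and reporting the percentage of failed attempts once the program is interrupted. A minimal sketch of that idea might look like the following. All names here (StopFlag, run_until_stopped, failure_percent, install_handlers) are hypothetical and are not taken from rolling_test.py, and max_attempts exists only so the demonstration terminates on its own.

```python
import logging
import signal

logger = logging.getLogger("rolling_test_sketch")


class StopFlag:
    """Set by a signal handler so the loop can finish its current attempt and exit."""

    def __init__(self):
        self.stopped = False

    def handle(self, signum, frame):
        self.stopped = True


def failure_percent(failures, attempts):
    """Percentage of attempts that raised an exception (0.0 if nothing ran)."""
    if attempts == 0:
        return 0.0
    return 100.0 * failures / attempts


def run_until_stopped(attempt, flag, max_attempts=None):
    """Call attempt() repeatedly, logging failures instead of aborting."""
    attempts = failures = 0
    while not flag.stopped:
        if max_attempts is not None and attempts >= max_attempts:
            break
        attempts += 1
        try:
            attempt()
        except Exception:
            # The attempt itself can stay on the happy path; errors land here.
            failures += 1
            logger.exception("attempt %d failed", attempts)
    return attempts, failures


def install_handlers(flag):
    """Route SIGINT and SIGTERM to the flag (call from the main thread)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, flag.handle)


# Demonstration with a stand-in attempt that fails on every fourth call.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] % 4 == 0:
        raise RuntimeError("service unavailable")


attempts, failures = run_until_stopped(flaky, StopFlag(), max_attempts=8)
print("%.1f%% of attempts failed" % failure_percent(failures, attempts))
```

In a long-running script, install_handlers(flag) would be called before the loop and max_attempts left unset, so the loop keeps going until a termination or interrupt signal arrives.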
Usage
- Start your test from a utility container.
  ./rolling_test.py keystone runs the Keystone test.
  ./rolling_test.py list will list tests and their descriptions.
- From the deployment node, run rolling_restart.py in the playbooks directory (necessary to find the inventory script). Specify the service you're targeting with the -s parameter.
  rolling_restart.py -s keystone_container
- You can specify a wait time in seconds between stopping and starting individual nodes.
  rolling_restart.py -s keystone_container -w 60
Assumptions
These tools are currently coupled to OSA, and they assume paths to files as specified by the multi-node-aio scripts.
Container stopping and starting is done with an Ansible command, and the physical host to target is derived from the current inventory.
rolling_restart.py must currently be run from the playbooks directory. This will be fixed later.
You must source openrc before running rolling_test.py.
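Since a missing openrc is easy to overlook, a small pre-flight check along these lines could catch it early. This is a sketch under assumptions: the helper name is hypothetical, and the variable list covers names an openrc file commonly exports, which may differ in a given deployment.

```python
import os

# Variables commonly exported by an OpenStack openrc file. The exact set
# depends on the deployment, so treat this list as an assumption.
REQUIRED_VARS = ("OS_AUTH_URL", "OS_USERNAME", "OS_PASSWORD")


def missing_openrc_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A script could call missing_openrc_vars() at startup and exit with a hint to source openrc if the returned list is not empty.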
Creating New Tests
Tests should subclass from the ServiceTest class in the same file and implement the following properties and methods:
- run: The actual test to run should be placed in this method. Timings will be gathered based on when this function starts and stops.
- pre_test: Any pre-test setup that needs to happen, like creating a file for Glance, Cinder, or Swift upload.
- post_test: Any post-test teardown that might be needed.
- service_name: The primary service that is being tested.
- description: Brief description of what the test does.
Finally, add the test to the available_tests dictionary with the invocation name as the key and the class as the value.
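As a rough illustration of the steps above, a new test might be shaped like this. The ServiceTest class below is only a stand-in so the example runs on its own; the real base class lives in rolling_test.py and may differ. ExampleTest and the "example" invocation name are hypothetical.

```python
class ServiceTest(object):
    """Stand-in for the real base class in rolling_test.py (assumed shape)."""

    service_name = None
    description = None

    def pre_test(self):
        pass

    def run(self):
        raise NotImplementedError

    def post_test(self):
        pass


class ExampleTest(ServiceTest):
    service_name = "example"
    description = "Appends a marker to an in-memory list."

    def pre_test(self):
        # Setup, such as creating a file to upload, would go here.
        self.events = []

    def run(self):
        # Happy path only: exceptions are left for the calling loop to handle.
        self.events.append("ran")

    def post_test(self):
        # Teardown of anything created in pre_test.
        self.events.append("cleaned up")


# Invocation name as the key, class as the value.
available_tests = {"example": ExampleTest}
```

With an entry like this, the invocation name given on the command line is what selects the class from available_tests.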
Why the name?
It sets 'em up and knocks 'em down.