675 Commits

Author SHA1 Message Date
Kevin Carter
e0f77d531a Add systemd configs and update playbook uniformity
Systemd overrides have been added to the service unit files for all
beats and services. All of the playbooks have been updated to make them
look and feel uniform.

This also sets handlers within the playbooks so that we're improving the
idempotence.

Change-Id: I2dd3183dae4bfddc607cc74f9dfb7af115b80abc
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-18 21:29:39 +00:00
Kevin Carter
2f1bd5d2ea Collect facts by default
Facts are required to install curator, so this change ensures we gather
them.

Change-Id: I510692095cdf8ecb5806a43c714b7bbbace47022
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-16 21:09:10 -05:00
Kevin Carter
7a32b5c9a9 Add additional ES cluster tuning
The following options will reduce cluster pressure and generally
improve search performance.

Change-Id: I1619680db1fd595503f0845b182d6f6ce4c59f3c
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-16 22:52:40 +00:00
Kevin Carter
b6f3293580 Add config overrides for systemd for better auditing
The following change will ensure that the elastic-static is logging to
the journal and that systemd is able to report how well the elastic
slice is running.

Change-Id: I79a9074b5f14a41dec421d6691fd04c0e6be15b7
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-16 22:51:41 +00:00
Corey Wright
f21bc66671 mnaio: Only resize Swift & Cinder machines00 LV when using nspawn
Commit 875fa96f / change-id Ief0040f6 unintentionally tries to enlarge
the "machines00" LV when LXC is the default container technology which
fails due to the Debian automated installation having assigned all the
space within the associated "vmvg00" VG.

As the intention of the aforementioned commit was to apply when
systemd-nspawn was used, codify that explicitly in a `when:` condition
on the problematic Ansible task.
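
A rough sketch of the kind of guard described above. The variable name `default_container_tech`, the use of the `lvol` module, and the size value are assumptions for illustration and are not taken from the actual task:

```yaml
# Hypothetical task: only touch the LV when nspawn is the container tech.
- name: Resize the machines00 logical volume
  lvol:
    vg: vmvg00
    lv: machines00
    size: 8192          # illustrative; lvol treats a bare number as megabytes
    force: true         # lvol requires force for shrink operations
  when:
    - default_container_tech | default('lxc') == 'nspawn'
```

With the condition in place the task is skipped on LXC deployments, where the VG has no free space left.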

Change-Id: I56ec1290d71d0d09db447e347d7d55432d9b81c6
Signed-off-by: Corey Wright <corey.wright@rackspace.com>
Closes-Bug: #1781823
2018-07-15 15:46:56 -05:00
Zuul
93463d6efc Merge "Add dynamic retention policies to curator" 2018-07-15 04:18:41 +00:00
Zuul
cb8dce9aef Merge "Add meta groups to the elk readme and env.d file" 2018-07-15 04:16:19 +00:00
Kevin Carter
aa657644e8 Add meta groups to the elk readme and env.d file
Change-Id: I8031fd5d6736197d2fd9a36352c941c882bc062d
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-14 21:35:47 -05:00
Zuul
e13163ac54 Merge "Make space for Cinder/Swift nodes if using nspawn" 2018-07-14 10:04:46 +00:00
Kevin Carter
b6a9a6fc7a Add dynamic retention policies to curator
The curator retention policies will now query the storage nodes within
a given deployment and set a suitable index retention policy based on
the total amount of storage each index is assumed to produce every day.
To ensure we're minimizing the storage required and optimizing search
performance several actions are now being taken:

* Indexes will be shrunk after a quarter of their retention time.
* Indexes will be deleted should they exceed the retention time.
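
The two actions above could look roughly like this in a curator action file. This is a sketch only: the `logstash-` prefix and the 28-day retention (and therefore the 7-day shrink point) are made-up values, not the ones the playbooks compute:

```yaml
# Hypothetical curator action file; prefix and day counts are examples.
actions:
  1:
    action: shrink
    description: Shrink indices after a quarter of the retention time
    options:
      shrink_node: DETERMINISTIC
      ignore_empty_list: true
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: pattern      # skip indices that were already shrunk
        kind: suffix
        value: -shrink
        exclude: true
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 7            # a quarter of the 28 days used below
  2:
    action: delete_indices
    description: Delete indices that exceed the retention time
    options:
      ignore_empty_list: true
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 28
```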

Change-Id: I8bf548620b5404d25deaadba8fda93452ef64fa0
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-12 17:03:40 +00:00
Jean-Philippe Evrard
d7f8076294 Fix usage of "|" for tests
With the more recent versions of ansible, we should now use
"is" instead of the "|" sign for the tests.

This should fix it.
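
For illustration, a minimal before/after of the syntax change (the task names and the command are placeholders):

```yaml
- name: Run a command and keep its result
  command: /bin/true
  register: result

# Deprecated form, with the test invoked like a filter:
#   when: result | succeeded

- name: Report success using the "is" test syntax
  debug:
    msg: "command succeeded"
  when: result is succeeded
```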

Change-Id: I897b918785c34523688c450bec16661f0f6e496e
2018-07-12 17:05:08 +02:00
Antony Messerli
875fa96fb8 Make space for Cinder/Swift nodes if using nspawn
Currently if CONTAINER_TECH=nspawn is used, Cinder and Swift
are unable to create volumes as space is fully allocated for
the machines volume.

This shrinks the machines00 mount to 8192 to make space for Cinder
and Swift volumes when using nspawn for the container tech.

Change-Id: Ief0040f638f0d3570557ac76fd5e0a8aee80df8d
2018-07-11 15:54:09 -05:00
Zuul
f1b18d30f1 Merge "Pin get-pip.py to 3.2" 2018-07-11 19:37:46 +00:00
Victor Palma
d98fec1a54 add osquery
* install osquery
   * add filebeat integration

Change-Id: Ia93595482512460ebdd287cf091cb5fe51b00de4
2018-07-10 11:00:48 -05:00
Zuul
861f4e7030 Merge "Add role description section for Ops" 2018-07-10 14:31:31 +00:00
Jonathan Rosser
5c823d77a2 Document how to access all OSA group_vars
Some of the variables used in the templates are in the OSA group_vars,
or are composed using other variables from group_vars. These are
not accessible using the embedded ansible. In addition, a deployer
may have referenced group_vars in user_variables. Adding the whole
OSA group vars tree via a symlink covers all of these cases.

Change-Id: I7c842b0d41f24e7c192ab196eb2cfc133bb548a5
2018-07-10 11:11:25 +01:00
jacky06
c060e99539 Add role description section for Ops
Change-Id: Iab1479d98b247d6e7789d5af82306831146449fa
2018-07-10 09:31:06 +00:00
Kevin Carter
3a3430976f Improve NFS detection
Detect if NFS is present by looking into the available mounts from
ansible facts.
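
One way such a check could be written, assuming facts have been gathered. The fact name `nfs_present` is hypothetical and the actual change may test the mounts differently:

```yaml
# Sketch: flag the host when any gathered mount has an NFS filesystem type.
- name: Detect whether NFS is mounted
  set_fact:
    nfs_present: "{{ ansible_mounts | selectattr('fstype', 'match', '^nfs') | list | length > 0 }}"
```

The `match` test anchors at the start of the string, so both `nfs` and `nfs4` are caught.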

Change-Id: I0e2d90d9e706ad4f6527484d96757b8578cb61bb
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-10 00:16:30 -05:00
Zuul
316f527243 Merge "Tune vars to better support an isolated deployment" 2018-07-10 05:08:50 +00:00
Kevin Carter
91dbd09353 Tune vars to better support an isolated deployment
Change-Id: I93d33bed42976d20919f887ef8096b212a6559a2
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-09 23:47:40 -05:00
Kevin Carter
87d05d45e6 Allow for a converged elk deployment
If kibana and elastic-logstash are on the same node the deployment needs to
allow for storage nodes to co-exist with the front-end.

Change-Id: Icf9d26fefe015bd39f16387b4934e573783ed1ea
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-10 04:20:16 +00:00
Kevin Carter
553fd7d30b Correct log path and set handler
The journalbeat process was restarting on every playbook run, which was not
required. This change moves the restart process to a handler which will
ensure we're not restarting the services when it's not required.
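
The handler pattern described here can be sketched as follows. The template name and destination path are placeholders, not the ones used in the playbook:

```yaml
- hosts: all
  tasks:
    - name: Drop journalbeat config
      template:
        src: journalbeat.yml.j2                    # hypothetical template
        dest: /etc/journalbeat/journalbeat.yml     # hypothetical path
      notify:
        - Restart journalbeat

  handlers:
    # Runs once at the end of the play, and only if a task reported "changed".
    - name: Restart journalbeat
      systemd:
        name: journalbeat
        state: restarted
```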

Change-Id: I4c0082d04d99c71c902ae39ee5ad9efc5074889f
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-06 20:23:58 +00:00
Zuul
e04a8b47ca Merge "correct stat option when ceph is present" 2018-07-06 19:59:53 +00:00
Zuul
e5585d6a86 Merge "Add persisted logstash queues and tag journalbeat" 2018-07-06 19:59:52 +00:00
Zuul
0d72aa0954 Merge "Use setuptools==33.1.1 in leap upgrade-requirements.txt" 2018-07-06 19:59:52 +00:00
Kevin Carter
634bf0357e correct stat option when ceph is present
Change-Id: I5316d359c2c334c588b048d877410273782a90f1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-06 11:25:09 -05:00
Zuul
1123cf7d47 Merge "Use openstack_version to determine Journalbeat hosts" 2018-07-06 16:21:31 +00:00
Jonathan Rosser
b3fb995cd5 Only pass journal dirs that are present to Journalbeat
Change-Id: I3e65a0a06c452632847a55effd2ce4d0c3cb4ac0
2018-07-06 16:33:54 +01:00
Antony Messerli
e5d39f460e Use setuptools==33.1.1 in leap upgrade-requirements.txt
Avoids using setuptools 34.x as using it can hit this bug
when packages are being upgraded:

https://github.com/pypa/setuptools/issues/951

Pinning to setuptools==33.1.1 also aligns to the
global-requirement-pins.txt in newton-eol

Change-Id: Ib33b828751c5a36d61448d148c5941beb6827c73
2018-07-06 09:25:05 -05:00
Kevin Carter
ba551fb081 Add persisted logstash queues and tag journalbeat
The journalbeat shipper will now have a "journald" tag attached to it
which will then be used by logstash to identify items and add tags to
them based on the systemd slice.

To accommodate this load, the logstash config has been updated to better
handle blowback and scale according to the capability of the underlying
hardware.

The logstash grok files have been moved into the files directory which
was done to speed up the time it takes to ship files to a given host.
Originally the task used to copy these files was using the template
module however none of these files have anything templated within them
so there's no need to run them through the template engine.
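
A minimal sketch of the switch from `template` to `copy`, assuming the grok files live in a directory under the role's or playbook's `files/` tree. Both paths below are hypothetical:

```yaml
# Static files need no Jinja rendering, so copy is enough.
- name: Deploy static logstash grok patterns
  copy:
    src: logstash-patterns/            # hypothetical directory under files/
    dest: /opt/logstash/patterns/      # hypothetical destination
    owner: root
    group: root
    mode: "0644"
```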

Change-Id: I4f0e50ac491b595c0d276f6e7292d2c6e61baa22
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-07-05 22:09:49 -05:00
Jonathan Rosser
dd3042bcf2 Use openstack_version to determine Journalbeat hosts
Systemd journals are linked from the containers down to the hosts
starting in Rocky. Prior to this Journalbeat must also be on all
containers.

Change-Id: Iaaef19e76c40ba9c1ad58164c20da46766abeee6
2018-07-05 21:48:18 +01:00
Jonathan Rosser
b0654ee8e5 Collect journals from containers as well as the host
When journal_paths is empty the function sdJournal.NewJournal()
is used to open the journal for the host system only.

A single entry in journal_paths is opened with
sdJournal.NewJournalFrom[Dir|Files](), and multiple entries are opened
only with sdJournal.NewJournalFromFiles().

Adding a single entry of /var/log/journal in the config file causes
all journal files under that directory to be opened, rather than only
that of the host system.
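
In config terms the change amounts to something like the fragment below. The nesting under a top-level `journalbeat` key is an assumption about the config file layout, and all other settings are omitted:

```yaml
# Sketch of the relevant setting only.
journalbeat:
  journal_paths:
    - /var/log/journal
```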

Change-Id: Ib758407edebff6786bf64fcf95328fb89912e3f6
2018-07-05 16:55:22 +01:00
jacky06
3535bfcd64 Pin get-pip.py to 3.2
As get-pip.py evolves based on pip 10, while we're still using
pip 9, changes in the way it can be used are causing problems.

For example, the ability to use --download is no longer there.

As such, let's pin to a known good version and leave it at that
until we no longer need to use this script. Version 3.2 maps to
pip 7.1.2 which fulfills our needs.

Change-Id: I49f6c9b238b42d4840c17af222e3bc82dfa6167f
2018-07-05 07:56:21 -04:00
Zuul
57b618bbc2 Merge "Improve metricbeat Ceph detection" 2018-07-03 13:49:47 +00:00
Heba Naser
c9a2035972 Use tests repo for common role test requirements
Using tox for requirements management requires in-repo
requirements files for all our repositories. Rather than
do that, we make use of the tests repo to capture our
common requirements and use this to install them.

This reduces our review requirement rate and simplifies
maintenance for us for the tox config. It also makes it
usable with 'Depends-On', which is marvellous!

The tox requirements definitions for docs/releasenotes
builds are left in-place as those are standard entries
across the community. If that changes at some point, we
can re-assess those entries too.

Depends-On: https://review.openstack.org/579208
Change-Id: Ib03a2836de7271dcbccab7a0742ed98515de859b
2018-07-02 12:26:40 -04:00
Dave Wilde
482e845d92 Improve multi-node AIO robustness
In order to improve the readability and robustness of the mnaio feature
I have replaced the tasks that shell out to virsh with the virt module
where available.  I have also created a vm-status play that will
hopefully help resolve SSH failures into the VMs.  This play utilizes
the block/rescue/handler pattern to attempt to restart the VM once if
it fails the initial SSH check.  Hopefully this will reduce the SSH
failures due to a stuck VM.  This adds a new variable called
vm_ssh_timeout which allows the deployer an easy place to override the
default timeout.  The python-lxml package is needed for the virt module.
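
A sketch of the block/rescue idea under stated assumptions: `vm_name` and `vm_address` are illustrative variables, the 300 second default is made up, and the real vm-status play may be structured differently:

```yaml
# Hypothetical tasks: check SSH, and restart the VM once if the check fails.
- name: Check SSH and recover a stuck VM
  block:
    - name: Wait for SSH to respond
      wait_for:
        host: "{{ vm_address }}"
        port: 22
        search_regex: OpenSSH
        timeout: "{{ vm_ssh_timeout | default(300) }}"
  rescue:
    - name: Force the stuck VM off
      virt:
        name: "{{ vm_name }}"
        state: destroyed

    - name: Start the VM again
      virt:
        name: "{{ vm_name }}"
        state: running

    - name: Wait for SSH a second time
      wait_for:
        host: "{{ vm_address }}"
        port: 22
        search_regex: OpenSSH
        timeout: "{{ vm_ssh_timeout | default(300) }}"
```

If the second wait also fails, the play fails for that VM, which keeps the retry bounded to a single restart.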

Change-Id: I027556b71a8c26d08a56b4ffa56b2eeaf1cbabe9
2018-06-29 10:12:16 -05:00
Jesse Pretorius
d0b0668657 MNAIO: Resolve all Ansible 2.5-related deprecation warnings
To resolve the warnings, we:

1. Use import_playbook instead of include for the conditional
   playbook inclusions in site.yml. TIL that using include_playbook
   does not work with conditionals.

2. Use include_tasks instead of include for the openstack-image-setup
   task set inclusion.

3. Switch to using 'is' instead of '|' for tests.
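
The first two points might look like this in practice. The playbook name, the `deploy_vms` and `images` variables, and the task file name are illustrative only:

```yaml
# Static playbook import; the condition is applied to the imported tasks.
- import_playbook: deploy-vms.yml
  when: deploy_vms | default(true) | bool

- hosts: localhost
  tasks:
    # Dynamic task inclusion, replacing the deprecated bare "include".
    - include_tasks: openstack-image-setup.yml
      when: images is defined
```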

Change-Id: I6d68bd4fecda122a77f7934842c3479a4c0792fd
2018-06-27 07:47:08 +01:00
Zuul
09c412e8b0 Merge "Remove the unused port 35357" 2018-06-27 00:59:43 +00:00
Jesse Pretorius
7e94c86084 MNAIO: Correct Mac OS X instructions for virt-manager
After various attempts to make virt-manager work on a Mac
using brew and other methods, I found a way that's simpler
and much lighter on the client. This patch changes the
README to reflect this method.

Change-Id: I4c7594e7c0c371b4c0e417a78c8833262e479d22
2018-06-26 11:54:16 +01:00
Jesse Pretorius
0cd5c1704f MNAIO: Check capabilities only once
The capabilities check is done on the host, so it only needs
to be executed once, not once for every VM on the host. This
patch eliminates the duplicated checking.

Change-Id: I2bc7ebbe699e5ace82c1bcbdfd8e917661054fef
2018-06-26 11:54:16 +01:00
Jesse Pretorius
329aa472f2 MNAIO: Enable saving and re-using file-based backing images
Being able to save the images and re-use them on other hosts
is extremely useful to cut down deployment time. This patch
allows an MNAIO setup to be set up using a file-based backing
store, then have those saved and re-used on the same host or
on other hosts.

Change-Id: I491d04fb94352e37312891a9b9bd58093fdd00cf
2018-06-26 11:54:16 +01:00
Jesse Pretorius
250d9b29b3 MNAIO: Disable SSH key checks on host
When accessing the VMs on an MNAIO host and doing multiple
rebuilds, the SSH keys are constantly changing. This creates
a situation where keys constantly have to be deleted and
accepted which isn't very user-friendly.

Given that this tooling is used for test purposes, we can
disable the host key checks without being too concerned.

Change-Id: I3dd1221c4789b0ab8e895b22b05906456fc1fc8f
2018-06-26 11:54:16 +01:00
Jesse Pretorius
bc2ced27c2 MNAIO: Ensure a consistent and readable style
This patch implements the following style changes:

1. The 'environment' argument is placed in the same
   location for all plays, making sure it's easier
   to find.
2. The play tags are located in the same place, also
   making sure they're easier to find.
3. The line breaks between tasks and plays are set
   to be consistently 1 between tasks and 2 between
   plays.
4. Given that there are no roles being used, the use
   of pre/post tasks is converted to only using tasks.

Change-Id: I2e22c8360d65256b8e44ca1e310e0668a651196d
2018-06-26 11:54:16 +01:00
Jesse Pretorius
450a879403 MNAIO: Set tftp/pxe directory permissions using file module
Instead of using the shell module, we use the file module. It
achieves the same result idempotently.
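
For example, a task along these lines replaces a `chmod`/`chown` shell call. The path, owner and mode are placeholders, not the values from the playbook:

```yaml
# Reports "changed" only when ownership or mode actually differ.
- name: Set tftp/pxe directory permissions
  file:
    path: /var/lib/tftpboot     # hypothetical path
    state: directory
    owner: root
    group: root
    mode: "0755"
    recurse: true
```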

Change-Id: I84adb76ee5d5a9e7dd56c15f7cdf8e220b841b23
2018-06-26 11:54:16 +01:00
Jesse Pretorius
2a21711f04 MNAIO: Use package module properly
To improve the chances of success during builds, retries
are added to the package install tasks. Also, given that
we're using Ansible > 2.1.x, we forgo the with_items loop
for the package installs and just give the package module
the list so that it installs them all at once. Finally,
we ensure that the 'name' argument is used for all package
lists rather than the 'pkg' argument which is for apt only.
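
A minimal sketch of the resulting task shape. The variable `mnaio_host_packages` and the retry counts are assumptions:

```yaml
- name: Install host packages
  package:
    name: "{{ mnaio_host_packages }}"   # hypothetical list variable
    state: present
  register: install_packages
  until: install_packages is success
  retries: 3
  delay: 2
```

Passing the whole list in `name` lets the package manager resolve and install everything in one transaction instead of once per item.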

Change-Id: I5f27ea0b05c70f6c5396bd41dfe3cce54579ccb3
2018-06-26 11:54:16 +01:00
Jesse Pretorius
7bc4f939a4 MNAIO: Run each setup playbook individually
The documentation describes using each setup-*.yml playbook
individually. This helps to reduce the memory usage on the
deploy host and also makes the output more readable as you're
only having to read the output of a single meta-playbook when
looking at the task results. In this patch we make the setup
process use the same set of playbooks described in the docs.

Change-Id: I596ed599de2e4302a82f2401f8fdf57f97660060
2018-06-26 11:54:16 +01:00
Jesse Pretorius
67c0746b71 MNAIO: Update README to include using the file-based backing store
Using the file-based backing store is currently undocumented. This
patch updates the README to include instructions for using it.

Change-Id: I228171b6619512874aece46f705ee1a922610cd5
2018-06-26 11:54:16 +01:00
Jesse Pretorius
1e845e8d20 MNAIO: Correct README
The README is currently misleading and outdated. This corrects
a few things related to rebuilding the test environment.

Change-Id: I6c9b1698fb77ddfcb4c7ade6cd5a7a14a14c55e6
2018-06-26 11:54:16 +01:00
Zuul
3f27453b12 Merge "Remove the duplicated word" 2018-06-26 05:33:11 +00:00
Zuul
a9e2d93ec2 Merge "Add kibana custom dashboard" 2018-06-26 05:30:59 +00:00