openstack-ansible-ops

Author	SHA1	Message	Date
Mohammed Naser	4f03c51118	Add Instance ID to logs This will parse the logs and grab the instance ID out of it. Change-Id: I9ad0c0e8d6101cca1fc3c4a7cb5cabc3504e6e28	2018-09-27 14:56:52 -04:00
Mohammed Naser	aa647953e0	Refactor Filebeat configuration file - Avoid checking item by item, we always enable modules and prospectors, with an option to disable with opt-in - Updated MySQL and Apache modules to point to the right path - Improved and cleaned up tagging - All the prospectors are managed using a variable Change-Id: I2a091669d6a77fd2c89a073cf9071292793e2f6b	2018-09-27 14:54:51 -04:00
Mohammed Naser	db6533481a	Clean-up filtering for API requests This updates all of the pipelines for most projects API requests to provide cleaner information. Change-Id: I5cb20a6c104b25d365fe03e4086272fa2965846a	2018-09-23 18:52:35 -04:00
Mohammed Naser	17c3563e27	Create filter for contextual logs The oslo.log has a default pattern for logging all of the entries with context, so let's use that in a common place to avoid duplicating all the information. Change-Id: I7f326221c01f53710f3adbc5fc2d416bec6aef8f	2018-09-23 17:35:44 -04:00
Mohammed Naser	72acd46a31	Use correct parsed timestamp At the moment, we're adding an extra field called "logdate" rather than using the built-in timestamp. This makes things go to the right field. Change-Id: I5e56d01692b7205418e6aba89d1c7c44fa1abfef	2018-09-23 17:25:49 -04:00
Mohammed Naser	eb4e6731b5	Drop oslofmt tag from checks The filebeat does not ship anything tagged with oslofmt, the openstack tag gives us all we need to parse things correctly. Change-Id: I614e4bc5d85559540a9d616407da993ed90de87e	2018-09-23 16:48:11 -04:00
Mohammed Naser	48d7b08773	Drop extra Ceph messages Ceph has a problem where newly introduced debug messages are logged at the normal level. They cause a lot of extra useless messages and overload the ELK cluster. This was fixed in 12.2.9, which is not out yet, so let's work around it for now by not shipping them. Change-Id: I36a503b7380ce62c65570232a18d2179a98ecfa1	2018-09-23 13:48:14 -04:00
Mohammed Naser	12c9687437	Add ARM64 support This patch adds support for ARM64 beats. Unfortunately, Elastic does not publish any packages, so this points at local builds. Also, it looks like Packetbeat fails to build, so for now we just don't do anything about it on ARM. Change-Id: I1889ce51f1a4c13c311165b8b76dde7c71ecfa2d	2018-09-23 16:17:04 +00:00
Kevin Carter	814622cc6c	Improve logstash and elasticsearch performance The logstash and elasticsearch performance can be improved by using async index options, pulling back the refresh interval, and by not fingerprinting every document. * Async translog allows elasticsearch to run fsync in the background instead of blocking * The refresh interval will now be 5x the number of replicas with a cap of 30. This integer is representative of the seconds between index refresh calls, which greatly lowers the load generated across the cluster. * All documents were fingerprinted before writing to the cluster. This was a costly operation as elasticsearch will do a forward lookup on all documents with a preset ID, resulting in 100's, if not 1000's, of extra reads. The purpose of the fingerprint function is to limit repeated writes, so to keep some of this functionality the fingerprint function is now only added to documents with messages. * G1 garbage collection is now enabled by default when the heap size is > 6GiB. Early versions of elasticsearch did not recommend this setting; however, it has since stabilized in recent releases. * JVM options have been moved into the elasticsearch and logstash roles, allowing these tasks to trigger service restarts when changes are made. Change-Id: I805129b207ad4db182ae6e59b6ec78eb3e246b54 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-21 21:47:07 -05:00
Zuul	daffc177a1	Merge "Add more optionality when customizing node roles"	2018-09-21 16:31:54 +00:00
Zuul	28df8729c8	Merge "Add playbook to automatically refresh fields in kibana"	2018-09-21 16:31:53 +00:00
Zuul	f5f581aa0e	Merge "Add variable to define a beat service state"	2018-09-20 21:18:33 +00:00
Guilherme Steinmüller	7430f6c8d5	Add variable to define a beat service state This patch aims to provide the user a way to enable/disable beats by overriding the {beatname}_service_state variable according to the beats that the user wants to be receiving data. There are some use cases where users just want a subset of the beats provided, mostly to avoid unnecessary use of bandwidth with data that wouldn't be used. So the way that this patch handles this use case is to just enable/disable after install, keeping the service installed in case the user needs it. Change-Id: I2251095d7fcfc48a239fe9d4984269503cc835da	2018-09-20 16:27:20 +00:00
Kevin Carter	1f9171082e	Add more optionality when customizing node roles The node roles would apply attributes to hosts if an override was set or if a node was part of a given group as determined through auto-detection. This change will now add nodes to a given role when set manually and will ensure no extra nodes are added to the role if the count meets or exceeds what's required to run the service. Change-Id: Ied5f564f0328488d3359ec4dc8e9ad17fefe5eaf Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-20 16:04:47 +00:00
Kevin Carter	218944a4e5	Add playbook to automatically refresh fields in kibana When upgrading or updating a template, fields within kibana will not be updated until they're manually refreshed. This change uses the kibana API to gather field information from the indexes and update kibana automatically. Change-Id: Ia5de566521d79da070f4377d1d7cb4d9786447b4 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-20 10:27:19 -05:00
Kevin Carter	10ffc96ab1	Update monitoring index for replicas Ensures that any monitoring indexes are made with replicas in a cluster setup, which will ensure we're able to monitor the growth of ES indexes. The curator action plugin timer was updated to use two different timer files instead of combining them into one timer. Change-Id: I2184ac4ec0b75e442ee8ae6ca8bd2c6f04d51401 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-20 00:35:17 -05:00
Zuul	b0ead67f54	Merge "Convert the curator action file into multiple files"	2018-09-20 00:18:27 +00:00
Kevin Carter	bebab50f10	Convert the curator action file into multiple files The curator action plugin does not use a logical OR when parsing multiple filters. The only way to do this is to run curator with different action filter files. Change-Id: I97c93c87d6254f79831f2a177098ea52a3a3a49d Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-19 12:06:23 -05:00
Dave Wilde	e4bd1fdaed	MNAIO ELK Updates * We don't need to create the containers as they are created during the initial run. * Remove quoting in favor of {% raw %} blocks Change-Id: Ied696ad0882169d523a60a900788e7c2ba1d3fa3	2018-09-19 10:52:32 -05:00
Zuul	94d8f09b74	Merge "Make openstack-service-setup compatible with older ansible"	2018-09-19 06:58:35 +00:00
Zuul	8f59a6f97c	Merge "Tune-up logrotate config"	2018-09-18 22:32:49 +00:00
Zuul	cf2e5dbdc3	Merge "Add capability to set node role"	2018-09-18 22:32:49 +00:00
Kevin Carter	3c96804a87	Tune-up logrotate config The log rotate configuration was leaving too many logs in place and allowing them to grow too large. This tunes up the logrotation process to ensure we're retaining information but not excessively. Change-Id: If0f02352ee2c274f4c589b05630d28126ceba2ab Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-18 13:20:27 -05:00
Kevin Carter	0b0efcb841	Add capability to set node role Presently the node role assignment is only automatic. Auto selection makes the assumption every node is identical however in many deployments a deployer may want to assign node roles to specific hardware thereby optimizing resources and improving general performance. This change adds and documents the ability to set the node roles within an ansible inventory. Change-Id: I22a2b636cb1441f17e575439b55ca64f9c7b0336 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-18 12:35:06 -05:00
Zuul	772dad3543	Merge "Deploy ELK in MNAIO"	2018-09-17 18:00:59 +00:00
Wayne Warren	53fe850aa3	Make openstack-service-setup compatible with older ansible This change allows this playbook to be run using an older version of ansible. This change is necessary for my use case where I am running all OSA and related playbooks in a docker container locally for a Newton deploy. The use of Newton OSA's ansible bootstrap script means that the openstack-ansible my workflow uses requires Ansible 2.1, which does not support `include_tasks`. This change addresses that problem by replacing `include_tasks` in the playbook that needs to be run using openstack-ansible with `include` which produces the desired result. Change-Id: I8b2a0217e851d022ee40cbdd8bc8045e18d5a07d	2018-09-17 11:57:14 -05:00
Kevin Carter	0d4a4a92c7	Converge the logstash pipelines and enhance memory backed queues The multi-logstash pipeline setup, while amazingly fast, was crashing and causing index errors when under high load for a long period of time. Because of the crashing behavior and the fact that the folks from Elastic describe multi-pipeline queues to be "beta" at this time, the logstash pipelines have been converted back into a single pipeline. The memory backed queue options are now limited by a ram disk (tmpfs), which will ensure that a burst within the queue does not cause OOM issues and keeps the deployment highly performant while limiting memory usage at the same time. Memory backed queues will be enabled when the underlying system is using "rotational" media as detected by ansible facts. This will ensure a fast and consistent experience across all deployment types. Pipeline/ml/template/dashboard setup has been added to the beat configurations, which will ensure beats are properly configured even when running in an isolated deployment and outside of normal operations where beats are generally configured on the first data node. Change-Id: Ie3c775f98b14f71bcbed05db9cb1c5aa46d9c436 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-16 23:44:58 -05:00
Zuul	be70a2078c	Merge "Set the max user watches to 1M"	2018-09-16 03:05:37 +00:00
Zuul	1d6c01ee57	Merge "MNAIO: Cater for galera bootstrap without a master"	2018-09-14 20:58:02 +00:00
Zuul	35a59a4cb6	Merge "update ironic scripts"	2018-09-12 23:03:00 +00:00
Kevin Carter	a98035e177	Correct elasticsearch list entropy The list of elasticsearch hosts was being randomized too much, which results in a performance issue. This change reduces the entropy and ensures that the list of hosts is correctly ordered such that localhost is always used first and other nodes in the cluster will be used as a fallback. Change-Id: Ifb551a6e01b5c0e1f62c1466a3d5b344a3c5da97 Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-12 13:13:13 -05:00
jacky06	e5b8b8b13a	Replace Chinese punctuation with English punctuation Curly quotes (Chinese punctuation) are usually entered through a Chinese input method. When read in an English context, they cause some confusion. Change-Id: I42ea55f2840eed70fe731119b259a5c625071e5b Closes-Bug: #1792131	2018-09-12 13:09:10 +00:00
Dave Wilde	525887872b	Deploy ELK in MNAIO This enables the deployment of the elk_metrics_6x stack inside of an MNAIO. Change-Id: Ie611baee79c33d7cbab9f0865127ac5966475838	2018-09-11 20:17:24 +00:00
Cameron Loader	778ce9895f	update ironic scripts The scripts currently use 'ironic' commands, which are deprecated. This patch converts them to openstack commands. Change-Id: I1a16164a7b8e35a61938ec470def37fa52db9edb	2018-09-11 10:34:39 -06:00
Victor Palma	86a2402da9	change osquery defaults * do not install debugging osquery packages * log to filesystem * turn off rsyslog Change-Id: Iae91959847fc7bfd5184d157a44cd994dab397f3	2018-09-11 11:29:44 -05:00
Zuul	42f7f896b4	Merge "Enforce no_proxy when setting up ELK dashboards and rollups"	2018-09-10 22:15:48 +00:00
Jonathan Rosser	1b267c475c	Ensure logstash listens on ipv4 address Upgrading the ELK stack to 6.4.0 leaves logstash only listening on an ipv6 address and thereby unable to receive existing beats inputs. This change makes the jvm prefer binding to ipv4 addresses. Change-Id: I04a0fdbcb253a0a6a3bcc3759eb0b9d0f1962621	2018-09-10 21:14:21 +00:00
Jonathan Rosser	c2d3c44fd8	Enforce no_proxy when setting up ELK dashboards and rollups There is no guarantee that all container IP addresses will be included in an existing no_proxy environment variable. This will cause failures when an http proxy is configured, but the proxy does not allow traffic to 'hairpin' back to internal addresses. This change forces no_proxy to the specific address of the kibana and coordinator endpoints when the uri module is used to load dashboards and configure rollups. Change-Id: I669334c722cce79459b522e6e2d7e1aaec49ef24	2018-09-10 21:14:11 +00:00
Kevin Carter	bb4954b598	Set the max user watches to 1M This increases the default value on elastic hosts from 32k to 1M which improves general stability, especially on high traffic hosts. Change-Id: I18f3e7005d2798dd4008215c7aa949cc37084f5c Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>	2018-09-10 21:12:14 +00:00
Zuul	58a7e461ac	Merge "Move bootstrap-ansible and passwords to run_osa tag"	2018-09-10 17:35:06 +00:00
Zuul	31d0cff14c	Merge "Allow using custom publish host"	2018-09-08 20:53:51 +00:00
Antony Messerli	cd3b3f047a	Move bootstrap-ansible and passwords to run_osa tag Allows for deployment/bootstrap of OSA to be skipped by skipping run_osa while still allowing configuration to be added during pre_config_osa. Change-Id: I40b0c8209f03c7e9543c7c688f2ef8ba2ebdf72d	2018-09-07 17:01:29 -05:00
Jesse Pretorius	41d0e61f0c	MNAIO: Cater for galera bootstrap without a master There can be situations where a gvwstate.dat file is present in at least one galera container, but the my_uuid and view_id do not match in any of them. In this case, we should just pick any container to be the master. This patch caters for this situation, ensuring that the cluster still bootstraps whenever the VM boots. Change-Id: If87cd9399b6624418f16910e4ddc046aaa22e5c5	2018-09-07 17:51:58 +01:00
Jesse Pretorius	2958a629c7	MNAIO: Ensure that nested virt is enabled on host Nested virtualization is important to improve VM performance and enabling it is crucial to ensuring that VM images built on one host work on boot on other hosts because the environment is consistent. In this patch add a task to enable it if it is available. Change-Id: I812d8399cf45fab94f0f46976c9415591d45e463	2018-09-06 17:16:21 +01:00
Jesse Pretorius	f437430212	MNAIO: Ensure that virt-networks are properly setup Due to the rather terrible virt_net module, only one action can be done on the virt networks at any one time. This means that the current action of setting them to autostart has no effect, because the module does not do it. Also, the current action of disabling the default network and disabling it from autostarting also does not take full effect. As such, after a host reboot, the default network autostarts, and the other networks are not started and the VMs cannot start. When trying to resolve this by re-running the host setup, the play ignores any existing virt networks - so the issue cannot be fixed. This patch does the following: 1. Ensures that the default network does not autostart. This is done by splitting the disabling of the network, and the disabling of autostart into two tasks. 2. Changes the define/create action into a single action which will not change the network configuration if it is defined. 3. Implements the setting of the network as active, and the setting of it to autostart as two separate tasks. This ensures that both actions are actually implemented. Change-Id: I608f2607824fac649f4e018d89094d57047134b3	2018-09-06 13:08:46 +01:00
Zuul	7a9f3ef7f4	Merge "Force filesystem type on swift format"	2018-09-05 20:11:20 +00:00
Antony Messerli	ad1a4bc9ef	Force filesystem type on swift format It currently seems to think that /dev/vmvg00/disk1 is used for btrfs, so force this operation to ensure it's changed to xfs. Change-Id: I0bcc9723fb33b557315422c3259a7ba2b75ceff6	2018-09-05 13:13:22 -05:00
Zuul	b80cb0366b	Merge "Refactor templates to use a single macro template file"	2018-09-05 16:57:53 +00:00
Jesse Pretorius	7618619bf8	MNAIO: Implement retries for image downloads The image downloads may fail, even with aria's built-in retry mechanism. With this patch we ensure that ansible will delay and retry again. This improves the chances of success. With this we also remove the '--quiet' default parameter so that we get console output from the task if it does ultimately fail. This is useful for diagnostic purposes. Change-Id: Ieed41f06a22effb28463637184980a748791edfe	2018-09-05 14:20:50 +01:00
Jesse Pretorius	868a559840	MNAIO: Only run systemd daemon_reload when necessary When the VMs are Ubuntu Trusty, this task causes total failure. We should only try and do the daemon_reload if the system being used supports it. Change-Id: I557856045a7735c8f351df6350f777caae526b10	2018-09-04 19:23:42 +01:00

872 Commits
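
The refresh-interval rule described in commit 814622cc6c (5x the number of replicas, capped at 30 seconds) can be sketched as follows. This is only a minimal illustration: the function name `refresh_interval_seconds` and its parameters are hypothetical and do not come from the repository, and the actual Ansible role may compute the value differently (for example, when there are zero replicas).

```python
def refresh_interval_seconds(replica_count: int,
                             multiplier: int = 5,
                             cap: int = 30) -> int:
    """Return an index refresh interval in seconds.

    Hypothetical helper: mirrors the rule stated in the commit message,
    "5x the number of replicas with a cap of 30". The names and defaults
    here are assumptions made for illustration.
    """
    if replica_count < 0:
        raise ValueError("replica_count must be non-negative")
    # Scale with the replica count, but never exceed the cap.
    return min(replica_count * multiplier, cap)


if __name__ == "__main__":
    # Print the interval for a few replica counts.
    for replicas in (1, 3, 6, 10):
        print(replicas, f"{refresh_interval_seconds(replicas)}s")
```

Under this reading, the interval grows linearly with the replica count until it reaches the cap, after which adding replicas no longer changes it.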