We currently use many wildcard searches in Elasticsearch, which are
slow. Provide better field data so that we can replace those wildcard
searches with filters. In particular, add a short uuid field and make
the filename tag field the basename of the file path so that grenade
and non-grenade files all end up with the same tags.
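
A minimal sketch of the two derived fields, not the code from this
change. The helper names, the example paths, and the 7-character
prefix length are assumptions made for illustration.

```python
import os.path


def filename_tag(filepath):
    """Return the basename so nested and top-level paths share one tag."""
    return os.path.basename(filepath)


def short_uuid(build_uuid, length=7):
    """Return a short prefix of the build uuid (length is an assumption)."""
    return build_uuid[:length]


# Hypothetical grenade and non-grenade paths for the same service log.
print(filename_tag("logs/new/screen-n-api.txt"))
print(filename_tag("logs/screen-n-api.txt"))
```

Because both paths reduce to the same tag, a single term filter on the
tag can stand in for a wildcard query on the full path.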
Change-Id: If558017fceae96bcf197e611ab5cac1cfe7ae9bf
The default java heap size is too small for our logstash indexers.
Double it to 2g. Do this by adding an /etc/default/logstash-indexer file
that the upstart configs source if it is present. This required some
tweaks to the upstart configuration to load the defaults properly.
Co-Authored-By: K Jonathan Harker <k.jonathan.harker@hp.com>
Change-Id: I63447f59f3fa6d466a7d275476121fe8339479dc
Index the sublog files so that we can use them in Elasticsearch,
and so that we can start removing content from console.html.
Depends-On: Iede34b970d090f855c701b69c1f5167a08ab9c52
which brings timestamping to the sublogs.
Change-Id: I4b4484065fda168f4d5efc73e95736226fb36ed0
Add grenade new/ and old/ logs to logstash. To do this without tripling
the HTTP GETs for every finished job, add a job filter to the log
gearman client that, when present, only attempts to grab files if the
job name matches the job filter.
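
The filtering idea can be sketched roughly as follows. The
`job-filter` key, the regex matching, and the job names are
assumptions for illustration; the real client config may differ.

```python
import re


def files_to_fetch(job_name, source_files):
    """Return the file names to retrieve for a finished job.

    Each entry is a dict with a 'name' and an optional 'job-filter'
    regex (hypothetical key names). Entries without a filter always
    apply; entries with one apply only when the job name matches.
    """
    selected = []
    for source in source_files:
        job_filter = source.get("job-filter")
        if job_filter and not re.search(job_filter, job_name):
            continue
        selected.append(source["name"])
    return selected


SOURCE_FILES = [
    {"name": "console.html"},
    {"name": "logs/new/screen-n-api.txt", "job-filter": "grenade"},
    {"name": "logs/old/screen-n-api.txt", "job-filter": "grenade"},
]
```

With this shape, a non-grenade job only triggers the unfiltered GETs,
so adding new/ and old/ entries does not multiply requests for every
job.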
Change-Id: Ia33722bf71d482f2fd6b655b28090a10bf46af54
Add them to all the places it's safe to do so for now. Don't actually
spin up any nodes for them yet.
Change-Id: I59e97be7e5b094af3153bc7d5dce0cff57996f55
Initial load testing suggests we should be able to handle at least
4 processes per host.
Also re-enable crm114.
Change-Id: Ia0158ad5f7f524c4fa0a80d479e2b74d28f0d1a6
Separate the jenkins log client and worker bits into a new module
called log_processor with ::client and ::worker classes.
Instantiate two workers on each logstash worker node.
Change-Id: I7cfec410983c25633e6b555f22a85e9435884cfb
Try to identify the probability that each log line indicates an error.
Pass that information on to logstash.
Change-Id: I0b298c2e8c00d8fdf1c907215a7bbf27086bc80c
* modules/logstash/manifests/init.pp: Download and install Logstash
1.2.1.
* modules/openstack_project/files/logstash/log-gearman-client.py:
Logstash 1.2.1 comes with a new schema. Update the job data sent to log
push workers to better accommodate the new schema.
* modules/openstack_project/files/logstash/log-gearman-worker.py: Push
Logstash 1.2.1 schema compliant JSON to the Logstash TCP input.
* modules/openstack_project/templates/logstash/indexer.conf.erb:
Logstash 1.2.1 comes with a new schema and many input and filter
changes. Use the newly supported features like conditionals to keep the
config up to date.
* modules/kibana/templates/config.rb.erb: Change the default field for
kibana to 'message'. It was @message which is deprecated in the new
logstash schema.
Change-Id: Id19fc05bcce8d42c5c0cf33df3da7e95f5794107
* modules/openstack_project/files/logstash/log-gearman-worker.py:
Increase the log worker retry timeout length to 255 seconds and do an
exponential backoff instead of retrying every second when incomplete
files are received. This should reduce load on the file server and make
console.html indexing more reliable.
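
One way to get a 255 second budget with exponential backoff is to
double the wait from 1 second up to 128 seconds, since
1 + 2 + ... + 128 = 255. The sketch below assumes that schedule; the
`fetch` and `is_complete` callables are hypothetical stand-ins for the
worker's own retrieval and completeness checks.

```python
import time


def fetch_with_backoff(fetch, is_complete, max_total=255, sleep=time.sleep):
    """Call fetch() until is_complete(result) or the budget is spent.

    Waits 1, 2, 4, ... seconds between attempts and stops once the
    next wait would push the total past max_total seconds.
    """
    waited = 0
    delay = 1
    result = fetch()
    while not is_complete(result) and waited + delay <= max_total:
        sleep(delay)
        waited += delay
        delay *= 2
        result = fetch()
    return result
```

Compared with retrying every second, this makes at most nine requests
per file over the same window instead of hundreds.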
Change-Id: I42d8f99b3ba2495bb1cac94c4b44e3598f7b9cb6
* .../logstash/log-gearman-worker.py: Only GET log lines greater than
the DEBUG log level. This will cut down the amount of processing that
logstash needs to perform and should speed up logstash log processing.
This changes the order of the $file then $file.gz GETs. Now get $file.gz
first as the log filtering on the apache side only runs on the .gz
files.
Change-Id: I8d88787ddde541f5aec2aee59c27cb1b48e5e4e5
* modules/openstack_project/files/logstash/log-gearman-client.py:
Use os.path.join to join the base log path provided by Jenkins and the
log file provided in the gearman client config. This avoids needing to
worry about trailing slashes in the Jenkins provided path.
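
A small illustration of why os.path.join helps here. The paths are
made-up examples, and the behavior shown is for POSIX path rules.

```python
import os.path


def log_path(base_log_path, log_file):
    """Join the base path and file name with exactly one separator."""
    return os.path.join(base_log_path, log_file)


# Both calls yield the same path whether or not the base ends in '/'.
with_slash = log_path("logs/12/3456/1/", "console.html")
without_slash = log_path("logs/12/3456/1", "console.html")
```

Note that os.path.join discards the base entirely if the second
argument is itself absolute, so the configured log file names need to
stay relative.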
Change-Id: I973237dae6f0b7947d322489108a05a99a7cc0be
* modules/openstack_project/files/logstash/log-gearman-worker.py:
The annotated logs served by logs-dev and soon to be served to logs.o.o
return .txt files gzipped. log-gearman-worker.py needs to check the
Content-Type in the response headers to see if the txt files were gzipped
in order to properly handle this.
partial-bug: #1207047
Change-Id: I5981cde145a572a6e3d20e8369e407df151143ff
This calculates the full LOG_PATH in the Zuul config and passes it
to Jenkins. The new path is similar to the old but with the
substitution of a short ZUUL_UUID at the end instead of the Jenkins
build number in order to avoid collisions from multiple Jenkins
masters.
Periodic jobs add a node name to their log path to avoid collisions
from multiple masters. Unfortunately, that value is not accessible
to the logstash worker. This can be solved by having Zuul trigger
periodic jobs.
Add the ZUUL_REF to logstash as "build_ref".
Requires https://review.openstack.org/#/c/39130/
Change-Id: I40bad59e3ad8ed6b4706762ed8b833fd15c13b0d
Add jenkins01 and jenkins02, both of which will have unit test and
devstack slaves. Leave jenkins.o.o alone; over time it will be
reduced so that it alone has special jobs and privileged slaves
attached to it.
Note that currently all of the jobs will be defined on all nodes,
including jobs on timers. I think the long-term fix for that is to
have zuul schedule timed jobs.
Change-Id: I10bbd5555e5194b1031700975d5b3ae6b458b8b3
Logstash seems to perform better when it is not running out of buffer
space. Increase the buffer size to reduce contention and give the
gearman workers more room to work when things are queueing up. This
should increase throughput.
Change-Id: Ie4b5cc7a3ae4be517012c20559f59071b7b15dd5
* modules/openstack_project/files/logstash/log-gearman-client.py:
BASE_LOG_PATH is a relative dir and is not rooted. Provide the root '/'
when constructing the source URL for the gearman workers.
Change-Id: I8fcfa19b048019398ffd370d74c271b9656a8688
And use that when constructing log paths and URLs. Use a substr
of the change id or commit sha when constructing URLs so the log
directories are deeper.
Make copying the test results a macro (it's used in several places).
Update the gearman log client to take advantage of the new parameter.
Requires https://review.openstack.org/#/c/36304/
Change-Id: I64faa35eddc4105271efa3de4f83b608b77655c2
Change-Id: I7ba628bb5d7f160f67327310048973483b78b05a
Reviewed-on: https://review.openstack.org/34156
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Change-Id: Ie6905039d4752d578566861ffd340cf607ad270b
Reviewed-on: https://review.openstack.org/32819
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Khai Do <zaro0508@gmail.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/manifests/logstash.pp: Concat is not
available in our version of puppetlabs stdlib. Use flatten instead
which is available. Remove dependency on nonexistent logstash::indexer
class. Fix requires orders.
* modules/openstack_project/manifests/logstash_worker.pp: Fix requires
orders.
* modules/openstack_project/files/logstash/jenkins-log-client.init
* modules/openstack_project/files/logstash/jenkins-log-worker.init:
Set pidfile argument when calling scripts.
* modules/openstack_project/files/logstash/log-gearman-worker.py:
Use python2 compatible gzip.GzipFile instead of gzip.decompress. Send
work exception instead of work fail when an exception happens. Log these
exceptions locally as well.
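
For reference, gzip.decompress only exists in Python 3.2 and later.
Wrapping the bytes in an in-memory buffer and reading through
gzip.GzipFile works on both Python 2.7 and Python 3. The helper name
below is an assumption, and this is a sketch rather than the worker's
actual code.

```python
import gzip
import io


def gunzip(data):
    """Decompress gzip bytes via gzip.GzipFile over an in-memory buffer."""
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as gzipped:
        return gzipped.read()
```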
Change-Id: Idf0a873215acb72187e058a0306a21ccd928d464
Reviewed-on: https://review.openstack.org/32804
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Khai Do <zaro0508@gmail.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
This change reorgs the logstash log pushing so that there is a central
gearman server that listens to Jenkins ZMQ events which are then
converted to per log file gearman jobs which are processed by gearman
workers. The central gearman server will live on logstash.o.o and the
existing logstash-worker hosts will be converted to gearman log pusher
workers.
This commit includes relevant documentation changes.
Change-Id: I45f7185c2479c54b090d223408dff268e1e8d7db
Reviewed-on: https://review.openstack.org/32455
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/logstash-worker1/jenkins-log-pusher.yaml:
Add the swift log files to the list of files to be processed by
logstash-worker1. This does not include the swift proxy log file as it
will require extra parsing.
* modules/openstack_project/templates/logstash/indexer.conf.erb: Add
Logstash filters for apache combined log format files.
Change-Id: I7545ace8f7601bdca453e0d2ac1b2233823878ce
Reviewed-on: https://review.openstack.org/31103
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/logstash-worker1/jenkins-log-pusher.yaml:
Add the syslog log file to the list of files to be processed by
logstash-worker1.
* modules/openstack_project/templates/logstash/indexer.conf.erb: Add
Logstash filters for syslog format files.
Change-Id: I0f8f58ab484949eb0506842bdb98385767a50333
Reviewed-on: https://review.openstack.org/31097
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/logstash-worker1/jenkins-log-pusher.yaml:
Add the keystone log file to the list of files to be processed by
logstash-worker1.
* modules/openstack_project/templates/logstash/indexer.conf.erb: Add
Logstash filters for keystone format files.
Change-Id: I5a72fc17ed1f37b816581faabe44f26f8cc36db2
Reviewed-on: https://review.openstack.org/31096
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: Add a filename
field to logstash events that can be used to associate multiline events
in files to their appropriate parents in the same file.
* modules/openstack_project/templates/logstash/indexer.conf.erb: Add
stream_identities to the multiline filters that use the source host and
file name to determine relationships between multiline events.
Change-Id: Ia325c0e1257131ab1b721c4df8f70f6bea1d0b99
Reviewed-on: https://review.openstack.org/30953
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Logstash performs filtering in a single thread so it does not scale up
very well. Work around this by scaling Logstash out to multiple indexer
hosts.
Current plan is to have a small (2GB) kibana web front end host that
does nothing but talk to elasticsearch, three 4GB logstash indexers that
will run a single log-pusher.py + logstash indexer with some partition
of the logfiles assigned to each indexer, and finally the existing large
elasticsearch node.
Eventually properly load balancing log processing across the worker
nodes would be great, but the current partition method should work well
enough with little additional effort.
Change-Id: Ifc6396560934314ffd6a7c47eb2acff9e9c2a7af
Reviewed-on: https://review.openstack.org/30573
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Tested-by: Jenkins
Logstash's queues are small. During periods of high load this can cause
the log pusher to cache all of the logs it is pulling in. Set an upper
bound to its output queue size to force it to block when Logstash cannot
keep up. This should prevent OOMing and won't impact log throughput as
Logstash is blocking anyway.
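
The bounded-queue behavior can be sketched as below. The helper names
and the default bound of 4096 events are assumptions, and the module
is named `queue` in Python 3 (`Queue` in Python 2).

```python
import queue


def make_output_queue(max_events=4096):
    """Create an output queue that holds at most max_events items."""
    return queue.Queue(maxsize=max_events)


def enqueue(out_queue, event, timeout=None):
    """Block until there is room for event.

    With timeout=None this waits indefinitely, which is what applies
    backpressure to the producer. Returns False only if a timeout was
    given and the queue stayed full for that long.
    """
    try:
        out_queue.put(event, block=True, timeout=timeout)
        return True
    except queue.Full:
        return False
```

Memory use is then capped at roughly the bound times the event size,
rather than growing with however far Logstash has fallen behind.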
Change-Id: I9ca31e6dbe454e9c4878fd7ba35bc33bc9df7d83
Reviewed-on: https://review.openstack.org/30572
Reviewed-by: James E. Blair <corvus@inaugust.com>
Reviewed-by: Khai Do <zaro0508@gmail.com>
Approved: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Tested-by: Jenkins
New larger elasticsearch server is in place. Open the log floodgates and
let the nova and glance logs flow.
Change-Id: I90ecffc192f7786b9e98d94ee863ec221a5c183b
Reviewed-on: https://review.openstack.org/30371
Reviewed-by: James E. Blair <corvus@inaugust.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/jenkins-log-pusher.yaml:
Add the cinder and quantum screen log files to the log pusher. Tag them
with screen and oslofmt (screen because they are screen logs and oslofmt
because they use the oslo log message format).
* modules/openstack_project/templates/logstash/indexer.conf.erb:
Add a grep filter to remove the screen log header lines. Add a multiline
filter to handle oslo log format multi line events. Add a grok filter to
parse the oslo format logs. Handle timestamps without millisecond
precision. Remove event_message field if that message was properly
parsed.
Change-Id: Icd18e252a512416e0cce5ee0e27942b072a25e09
Reviewed-on: https://review.openstack.org/29985
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: If a list of
tags is provided under a source file configured in the yaml config
attach those tags to the log events generated from that source file.
Example yaml:
  source-files:
    - name: console.html
      retry-get: True
      tags:
        - foo
        - bar
Change-Id: Ib74abad2d06d3e52e5b21b0fb38033f9474ab4e4
Reviewed-on: https://review.openstack.org/29808
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
The Jenkins log pusher daemon was not writing its PID to its PID file.
This prevented init from properly stopping the service. Write the PID
out to the file to fix this.
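
Writing the PID can be as small as the following sketch. The function
name is an assumption, and the PID file path is left to the caller.

```python
import os


def write_pid_file(path):
    """Write the current process ID to path and return it."""
    pid = os.getpid()
    with open(path, "w") as pid_file:
        pid_file.write("%d\n" % pid)
    return pid
```

An init script can then read that file to find the process to signal
on stop.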
Change-Id: I35352b6e241b54a439f3675d02344d901249dff1
Reviewed-on: https://review.openstack.org/29578
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: Semi-properly
daemonize the log pusher process by default. Close open file descriptors,
fork into background, detach from terminal, redirect std* to /dev/null,
change the umask to 0, change working dir to '/', and lock a PID file.
* modules/openstack_project/files/logstash/jenkins-log-pusher.init:
Update start-stop-daemon commands to use the presence of a PID file and
no longer background with start-stop-daemon.
Change-Id: I4dcdd48478fa7d27745a3075a6942838e9df20ee
Reviewed-on: https://review.openstack.org/28449
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: The
log-pusher.py script does not cleanly daemonize on its own. Refactor the
script to make it possible for it to daemonize itself.
Change-Id: I974b8370f4dced357beb92ea8e74b1a60cb148b5
Reviewed-on: https://review.openstack.org/28365
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: If logstash
dies the TCP/UDP log-pusher.py outputs will not be able to reconnect.
Logstash appears to need about a minute to restart after a crash so wait
90 seconds after a socket exception before attempting to reconnect.
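
A rough sketch of the wait-then-reconnect loop for a TCP output. The
`connect` callable is a hypothetical factory that returns a connected
socket, and the `sleep` parameter exists only so the delay can be
replaced in tests.

```python
import socket
import time

RECONNECT_DELAY = 90  # seconds, the wait described in this change


def send_with_reconnect(connect, payload, delay=RECONNECT_DELAY,
                        sleep=time.sleep):
    """Send payload, waiting and reconnecting after socket errors.

    Returns the socket that finally accepted the payload so the caller
    can reuse it for later events.
    """
    sock = None
    while True:
        try:
            if sock is None:
                sock = connect()
            sock.sendall(payload)
            return sock
        except socket.error:
            # Drop the broken connection and give Logstash time to restart.
            if sock is not None:
                sock.close()
            sock = None
            sleep(delay)
```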
Change-Id: I61a31476ce5b521bcde02ac80566d693e0c89114
Reviewed-on: https://review.openstack.org/28364
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
* modules/openstack_project/files/logstash/log-pusher.py: Make the log
pusher properly configurable with a yaml configuration. As part of this
change support multiple zmq publisher inputs, multiple file retrievers,
job name filtering, and event tagging (with the filename).
* modules/openstack_project/files/logstash/jenkins-log-pusher.yaml:
Initial config for the log pusher.
* modules/openstack_project/manifests/logstash.pp: Put the new log pusher
config in place.
* modules/openstack_project/files/logstash/jenkins-log-pusher.init: Run
the log pusher service with the new config file.
Change-Id: I4c8405b1edfa16bbcc8f998627c6240bef23f302
Reviewed-on: https://review.openstack.org/28113
Reviewed-by: James E. Blair <corvus@inaugust.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: Jeremy Stanley <fungi@yuggoth.org>
Tested-by: Jenkins
* modules/openstack_project/manifests/logstash.pp: Run the Jenkins log
pusher script as a service. This is the first step in making Logstash
use the TCP inputs instead of pipe inputs.
* modules/openstack_project/files/logstash/jenkins-log-pusher.init: Add
a simple init script for the Jenkins log pusher.
* modules/openstack_project/templates/logstash/indexer.conf.erb: Switch
to TCP input instead of pipe input as the new Jenkins log pusher service
will push log events over TCP.
Change-Id: Id80c710abd5facd71d18afb2b250b2d7d92dec2d
Reviewed-on: https://review.openstack.org/28074
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins