Apache logs aren't getting indexed after changes to their location
in devstack-gate. This means the horizon logs are lost and that
keystone logs aren't getting indexed at all on the main jobs.
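As a minimal sketch of what the fixed entries might look like, assuming
the source-files format of the log client config (the paths and tags
below are illustrative assumptions, not the exact values of this change):

    - name: logs/apache/horizon_error.txt.gz
      tags:
        - apacheerror
    - name: logs/apache/keystone.txt.gz
      tags:
        - screen
        - oslofmt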
Change-Id: I1ef5084d6bf4dc9f74f4e4b51e00e97573074e38
As of devstack patch Ia3843818014f7c6c7526ef3aa9676bbddb8a85ca, if the
fake virt driver is used the n-cpu processes are named n-cpu-${i},
where i runs from 1 to NUMBER_FAKE_NOVA_COMPUTE.
Currently the name field is an exact match and we don't change
NUMBER_FAKE_NOVA_COMPUTE from its default value of 1, so in order to
start collecting n-cpu logs in large-ops jobs again, just add n-cpu-1.
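A minimal sketch of the added entry, assuming the same source-files
format (the tags here are illustrative):

    - name: logs/screen-n-cpu-1.txt.gz
      tags:
        - screen
        - oslofmt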
Change-Id: I963675436d05fe9cd8bf5cc609130fe6e2b90a79
We should index the trove logs now that we have
gate-trove-functional-dsvm-mysql, which can blow up in spectacular
ways and throw off our uncategorized bug percentages in elastic-recheck.
It doesn't look like trove is used in grenade runs, so there are no logs
to index there.
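A sketch of the kind of entries this adds, assuming the usual
source-files format; the trove service log names and tags are
assumptions:

    - name: logs/screen-tr-api.txt.gz
      tags:
        - screen
        - oslofmt
    - name: logs/screen-tr-cond.txt.gz
      tags:
        - screen
        - oslofmt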
Related-Bug: #1402227
Change-Id: I1c0d9a301d5bf40047b69ec64037df8c1ba3425f
This commit adds a new job filter to the gearman client that filters
based on the build queue. It is used for the subunit jobs, which
we don't want to run on check builds.
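A hypothetical sketch of what this could look like in the client
config; both the subunit-files section name and the build-queue-filter
key are assumed names, not necessarily what this change introduces:

    subunit-files:
      - name: logs/testrepository.subunit.gz
        build-queue-filter: gate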
Change-Id: If81fe98d8d67bb718c53a963695a7d06f5f6625d
The policy for the subunit workers expected a subunit yaml file named
jenkins-subunit-worker.yaml; however, the filename had a typo which
prevented things from working. This patch fixes the filename.
Change-Id: I0638f1dcb18561580351959cb29ef2546850450f
This adds a new gearman worker to process the subunit files from
the gate job runs. It uses subunit2sql to connect to a SQL
server and process the data from the subunit file. The
log-gearman-client is modified to allow pushing subunit jobs
to gearman, and the worker model for processing logs is borrowed
to process the subunit files.
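A hypothetical sketch of the worker side, loosely modeled on the log
worker configs; every key and value here is an assumption:

    gearman-host: logstash.openstack.org
    gearman-port: 4730
    # connection subunit2sql writes parsed test results into
    db-uri: mysql://subunit2sql:secret@dbhost/subunit2sql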
Change-Id: I83103eb6afc22d91f916583c36c0e956c23a64b3
Adding log files and fixing format tags for tripleo jobs
in the config file defining which files get indexed by logstash.
Only adding common/basic files for host_info, os-collect-config,
and mysql logs from /mnt/state/var/log.
Partial-bug: 1328645
Change-Id: I9d858895602440668fbf1f9ecd34bb4e3b9a2548
We now have consistent os-collect-config log formats for both
Fedora and Ubuntu tripleo ci jobs (from the systemd journal and syslog
respectively); see I27ea3d465670277ef1ddf3d1b3b9d52df4162807 and
Ib8c1fc39d56b2b9c6d8e9b64a868def619aa2f1f. Tagging them as syslog
will ensure they are handled by the logstash syslog grok filter;
they currently don't match anything, which results in messages
being indexed with a value of "%{logmessage}".
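A sketch of the resulting entry, assuming the source-files format; the
exact path is an assumption:

    - name: logs/os-collect-config.txt.gz
      tags:
        - syslog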
Change-Id: Iec5ffcb08e5b45fb01db25e14c943f000ca97a83
Closes-Bug: #1350121
Dependent on I787c08506d9a4d12081b5b5b16d752d5147f8e72 to collect the
javelin logs.
Since javelin2 uses tempest code to do the testing, it generates useful
logs. As javelin2 grows to cover more, we will most likely run into
issues that need debugging.
Add these logs to logstash to help us do analysis on javelin
results.
Change-Id: Ic3b2e33177bdc65fe294ddddd581a93de6c1cb60
After the successful implementation of
https://blueprints.launchpad.net/keystone/+spec/unified-logging-in-keystone
keystone uses oslo logging instead of its own. Update logstash settings
to reflect this.
We currently don't have any keystone logs in logstash; this should fix
that.
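A sketch of the updated entry, assuming the source-files format;
tagging the keystone screen log oslofmt mirrors the other
oslo-formatted screen logs, but treat the details as assumptions:

    - name: logs/screen-key.txt.gz
      tags:
        - screen
        - oslofmt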
Change-Id: I7acfa35bfd112a15d2ec3aa7338fb28b37a827cb
The gate surely runs into libvirt bugs on a regular basis, but we don't
capture those logs in logstash, so we can't write elastic-recheck
queries against them.
For example, I believe the underlying cause of this issue is libvirt:
https://bugs.launchpad.net/tempest/+bug/1276778/comments/10
Change-Id: Ie2f03b19249967e78319a4016921f91ef0498540
We need to start doing this before we change grenade to not default
to spitting out everything to the console.
Change-Id: I58d32c27e78d2d6eb791f2d29c3aa886a5218c68
Add the os-collect-config logs to logstash to be indexed. Depends on
another commit in tripleo-ci, Ie84492ab981b06421d486579fc269ef6b3ad1815,
which removes the random string from the log name.
Change-Id: I2fe3aa4ec3c469d59a6dee8dd54f08d0bbf7b792
We need to index the tempest log so we can fingerprint errors using
multiple parts of the same traceback, which doesn't work with the
console.html log.
Closes-Bug: #1323713
Change-Id: I34b2d67e12199ef0145d7a8d25d5385f944c78ed
Now that we have a grenade neutron job that we are trying to stabilize
and gate on, we should collect the neutron logs in grenade so we can use
logstash and elastic-recheck with grenade neutron jobs.
Change-Id: I54c4079b1be00d5201d4fb84dd975576551728fe
We are seeing a race failure in check-tempest-dsvm-postgres-full jobs
where the error message shows up in the screen-n-api-meta log, but we
can only get at it from the screen-n-api logs (or the console). We
should index the screen-n-api-meta log so we can better filter the
query.
Related-Bug: #921858
Change-Id: I1701ac83b2643d819245a7cdbbfb56cc4af12f5b
There are 3 logs for neutron services we are regularly running in
the gate which we aren't indexing. Ensure they are all indexed so
we can actually use them in elastic-recheck. They are all pretty
small, so they shouldn't impact ES load.
Change-Id: I082094574ceb6197e30708f9e05ba4b7fdc6f8af
This adds horizon_error to the indexed log files, which is very
useful in determining how horizon fails, as we get very specific
stack traces.
Change-Id: Ifb323e327dbc2931100a4552d029d91209c4bbba
We are currently using a lot of wildcard searches in elasticsearch,
which are slow. Provide better field data so that we can replace those
wildcard searches with filters. In particular, add a short uuid field
and make the filename tag field the basename of the filepath so that
grenade and non-grenade files all end up with the same tags.
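A hypothetical example of the resulting fields on an indexed event, to
show the intent; the field names are assumptions, though
build_short_uuid and basename-style tags match how elastic-recheck
queries are typically written:

    filename: logs/new/screen-n-api.txt.gz   # full path, unchanged
    tags: screen-n-api.txt                   # basename, shared by grenade and non-grenade runs
    build_short_uuid: 9f1c2b3                # short prefix of the full build uuid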
Change-Id: If558017fceae96bcf197e611ab5cac1cfe7ae9bf
We need a much higher level of output on libvirt logs to get to the
bottom of Bug 1254872. However, at that level of output, we crush
elastic search. So turn off indexing this log in the gate until we
get to the bottom of the bug and can return it to a more reasonable
logging level.
Change-Id: I9fec939883e50b421bc0530205e8b0bd7eab2350
Related-Bug: #1254872
The default java heap size is too small for our logstash indexers.
Double it to 2g. Do this by adding an /etc/default/logstash-indexer file
that the upstart configs source if it is present. This required some
tweaks to the upstart configuration to load the defaults properly.
Co-Authored-By: K Jonathan Harker <k.jonathan.harker@hp.com>
Change-Id: I63447f59f3fa6d466a7d275476121fe8339479dc
Index the sublog files so that we can use them in elastic search,
and so that we can start removing content from console.html.
The dependency below brings timestamping to the sublogs.
Depends-On: Iede34b970d090f855c701b69c1f5167a08ab9c52
Change-Id: I4b4484065fda168f4d5efc73e95736226fb36ed0
Add grenade new/ and old/ logs to logstash. To do this without tripling
HTTP GETs for every finished job, add a job filter to the log gearman
client that, when present, only attempts to grab files if the job name
matches the filter.
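A hypothetical sketch of per-file entries using such a filter; the
job-filter key and the pattern are assumed names:

    - name: logs/new/screen-n-api.txt.gz
      job-filter: '.*-grenade-.*'
    - name: logs/old/screen-n-api.txt.gz
      job-filter: '.*-grenade-.*'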
Change-Id: Ia33722bf71d482f2fd6b655b28090a10bf46af54
Add them to all the places it's safe to do so for now. Don't actually
spin up any nodes for them yet.
Change-Id: I59e97be7e5b094af3153bc7d5dce0cff57996f55
Initial load testing suggests we should be able to handle at least
4 processes per host.
Also re-enable crm114.
Change-Id: Ia0158ad5f7f524c4fa0a80d479e2b74d28f0d1a6
Separate the jenkins log client and worker bits into a new module
called log_processor with ::client and ::worker classes.
Instantiate two workers on each logstash worker node.
Change-Id: I7cfec410983c25633e6b555f22a85e9435884cfb
Try to identify the probability that each log line indicates an error.
Pass that information on to logstash.
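A hypothetical example of the extra field this attaches to each indexed
line; error_pr as the field name, and the scale of its score, are
assumptions:

    message: 'Unexpected error while running command.'
    error_pr: -250.3   # crm114 score for how error-like the line is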
Change-Id: I0b298c2e8c00d8fdf1c907215a7bbf27086bc80c