Retry apt tasks in zuul_reboot if apt lock is held
The zuul_reboot playbook runs on each zuul server at what essentially become random times based on how long the previous servers took to be updated. We have seen this result in our apt tasks colliding with unattended upgrades on the server. Latest ansible would let us workaround this using the lock_timeout parameter to the apt module, but the version we use on bridge does not support this parameter. Instead we check the failure message for 'Failed to lock apt for exclusive operation' and if present we retry. We wait 30 seconds between retries and will perform up to 40 attempts for a total of 20 minutes of waiting. This method should also be forward compatibile with new Ansible. If the lock is held for longer than 20 minutes it likely implies something has gone wrong and we will need to perform manual intervention anyway. Change-Id: I3171838a30e3ea496bb08f8b6ab1c95755b2ff3c
This commit is contained in:
parent
17eba9db32
commit
0c59eff0e8
@ -18,6 +18,11 @@
|
||||
apt:
|
||||
update_cache: yes
|
||||
upgrade: yes
|
||||
register: apt_action
|
||||
# 20 minute wait for unattended-upgrades to complete
|
||||
delay: 30
|
||||
retries: 40
|
||||
until: apt_action is success or 'Failed to lock apt for exclusive operation' not in apt_action.msg
|
||||
- name: Reboot the executor server
|
||||
reboot:
|
||||
- name: Start the executor
|
||||
@ -37,6 +42,11 @@
|
||||
apt:
|
||||
update_cache: yes
|
||||
upgrade: yes
|
||||
register: apt_action
|
||||
# 20 minute wait for unattended-upgrades to complete
|
||||
delay: 30
|
||||
retries: 40
|
||||
until: apt_action is success or 'Failed to lock apt for exclusive operation' not in apt_action.msg
|
||||
- name: Reboot the merger server
|
||||
reboot:
|
||||
- name: Start the merger
|
||||
@ -62,6 +72,11 @@
|
||||
apt:
|
||||
update_cache: yes
|
||||
upgrade: yes
|
||||
register: apt_action
|
||||
# 20 minute wait for unattended-upgrades to complete
|
||||
delay: 30
|
||||
retries: 40
|
||||
until: apt_action is success or 'Failed to lock apt for exclusive operation' not in apt_action.msg
|
||||
- name: Reboot the scheduler server
|
||||
reboot:
|
||||
- name: Start the scheduler process
|
||||
|
Loading…
x
Reference in New Issue
Block a user