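# Gate testing playbook: bootstrap the bridge (bastion) node, create a
# throwaway CA for the test nodes, write a job-specific inventory, and
# then run the production playbooks against the job's test nodes.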

- import_playbook: ../bootstrap-bridge.yaml
  vars:
    root_rsa_key: "{{ lookup('file', zuul.executor.work_root + '/' + zuul.build + '_id_rsa', rstrip=False) }}"
    ansible_cron_disable_job: true
    cloud_launcher_disable_job: true

# Set up the OpenDev CA: generate a short-lived key and self-signed root
# certificate on the bridge, then slurp both into hostvars so they can
# be distributed to the test nodes below.
- hosts: bridge.openstack.org
  become: true
  tasks:
    - name: Make temporary dir for CA generation
      tempfile:
        state: directory
      register: _ca_tempdir
    - name: Create CA PEM/crt
      shell: |
        set -x
        # Generate a CA key
        openssl genrsa -out ca.key 2048
        # Create fake CA root certificate
        openssl req -x509 -new -nodes -key ca.key -sha256 -days 30 -subj "/C=US/ST=CA/O=OpenDev Infra" -out ca.crt
      args:
        chdir: '{{ _ca_tempdir.path }}'
        executable: /bin/bash
    - name: Save key
      slurp:
        src: '{{ _ca_tempdir.path }}/ca.key'
      register: _opendev_ca_key
    - name: Save certificate
      slurp:
        src: '{{ _ca_tempdir.path }}/ca.crt'
      register: _opendev_ca_certificate
    - name: Cleanup tempdir
      file:
        path: '{{ _ca_tempdir.path }}'
        state: absent
      when: _ca_tempdir.path is defined
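
# Distribute the CA key and certificate to every node in the job and
# add the certificate to the system trust store, so services presenting
# certificates signed by this CA verify successfully during testing.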
- hosts: all
  become: true
  tasks:
    - name: Make CA directory
      file:
        path: '/etc/opendev-ca'
        state: directory
        owner: root
        group: root
        mode: 0700
    - name: Import files
      shell: 'echo "{{ item.content }}" | base64 -d > {{ item.file }}'
      args:
        creates: '{{ item.file }}'
      loop:
        - file: '/etc/opendev-ca/ca.key'
          content: '{{ hostvars["bridge.openstack.org"]["_opendev_ca_key"]["content"] }}'
        - file: '/etc/opendev-ca/ca.crt'
          content: '{{ hostvars["bridge.openstack.org"]["_opendev_ca_certificate"]["content"] }}'
    - name: Install and trust certificate
      shell:
        cmd: |
          cp /etc/opendev-ca/ca.crt /usr/local/share/ca-certificates/opendev-infra-ca.crt
          update-ca-certificates
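
# Configure the bridge to drive the deployment: point Ansible at the
# job-generated inventory, template out group/host variables, then run
# the production playbooks against the test nodes.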
- hosts: bridge.openstack.org
  become: true
  tasks:
    - name: Write inventory on bridge
      include_role:
        name: write-inventory
      vars:
        write_inventory_dest: /home/zuul/src/opendev.org/opendev/system-config/inventory/base/gate-hosts.yaml
        write_inventory_exclude_hostvars:
          - ansible_user
          - ansible_python_interpreter
        write_inventory_additional_hostvars:
          public_v4: nodepool.private_ipv4
          public_v6: nodepool.public_ipv6
    - name: Add groups config for test nodes
      template:
        src: "templates/gate-groups.yaml.j2"
        dest: "/etc/ansible/hosts/gate-groups.yaml"
    - name: Update ansible.cfg to use job inventory
      ini_file:
        path: /etc/ansible/ansible.cfg
        section: defaults
        option: inventory
        value: /home/zuul/src/opendev.org/opendev/system-config/inventory/base/gate-hosts.yaml,/home/zuul/src/opendev.org/opendev/system-config/inventory/service/groups.yaml,/etc/ansible/hosts/gate-groups.yaml
    - name: Make host_vars directory
      file:
        path: "/etc/ansible/hosts/host_vars"
        state: directory
    - name: Make group_vars directory
      file:
        path: "/etc/ansible/hosts/group_vars"
        state: directory
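    # Template out the group and host variables the production playbooks
    # expect; the bastion addresses and per-build public key come from
    # the job's nodepool and executor metadata.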
    - name: Write hostvars files
      vars:
        bastion_ipv4: "{{ nodepool['public_ipv4'] }}"
        bastion_ipv6: "{{ nodepool['public_ipv6'] }}"
        bastion_public_key: "{{ lookup('file', zuul.executor.work_root + '/' + zuul.build + '_id_rsa.pub') }}"
        iptables_test_public_tcp_ports:
          # Zuul web console
          - 19885
          # selenium
          - 4444
      template:
        src: "templates/{{ item }}.j2"
        dest: "/etc/ansible/hosts/{{ item }}"
      loop:
        - group_vars/all.yaml
        - group_vars/adns.yaml
        - group_vars/eavesdrop.yaml
        - group_vars/nodepool.yaml
        - group_vars/ns.yaml
        - group_vars/registry.yaml
        - group_vars/gitea.yaml
        - group_vars/gitea-lb.yaml
        - group_vars/kerberos-kdc.yaml
        - group_vars/keycloak.yaml
        - group_vars/letsencrypt.yaml
        - group_vars/meetpad.yaml
        - group_vars/jvb.yaml
        - group_vars/refstack.yaml
        - group_vars/control-plane-clouds.yaml
        - group_vars/afs-client.yaml
        - group_vars/zuul-lb.yaml
        - group_vars/zuul.yaml
        - group_vars/zuul-executor.yaml
        - group_vars/zuul-merger.yaml
        - group_vars/zuul-scheduler.yaml
        - group_vars/zuul-web.yaml
        - host_vars/bridge.openstack.org.yaml
        - host_vars/codesearch01.opendev.org.yaml
        - host_vars/etherpad01.opendev.org.yaml
        - host_vars/letsencrypt01.opendev.org.yaml
        - host_vars/letsencrypt02.opendev.org.yaml
        - host_vars/lists.openstack.org.yaml
        - host_vars/lists.katacontainers.io.yaml
        - host_vars/gitea99.opendev.org.yaml
        - host_vars/grafana01.opendev.org.yaml
        - host_vars/mirror01.openafs.provider.opendev.org.yaml
        - host_vars/mirror02.openafs.provider.opendev.org.yaml
        - host_vars/mirror-update01.opendev.org.yaml
        - host_vars/paste99.opendev.org.yaml
        - host_vars/refstack01.openstack.org.yaml
        - host_vars/review99.opendev.org.yaml
    - name: Display group membership
      command: ansible localhost -m debug -a 'var=groups'
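    # Run the production playbooks against the test nodes, teeing output
    # to /var/log/ansible so the logs can be collected as job artifacts.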
    - name: Run base.yaml
      shell: 'set -o pipefail && ansible-playbook -f 50 -v /home/zuul/src/opendev.org/opendev/system-config/playbooks/base.yaml 2>&1 | tee /var/log/ansible/base.yaml.log'
      args:
        executable: /bin/bash
    - name: Run bridge service playbook
      shell: 'set -o pipefail && ansible-playbook -v /home/zuul/src/opendev.org/opendev/system-config/playbooks/service-bridge.yaml 2>&1 | tee /var/log/ansible/service-bridge.yaml.log'
      args:
        executable: /bin/bash
    - name: Run dstat logger playbook
      shell: 'set -o pipefail && ansible-playbook -v /home/zuul/src/opendev.org/opendev/system-config/playbooks/service-dstatlogger.yaml 2>&1 | tee /var/log/ansible/service-dstatlogger.yaml.log'
      args:
        executable: /bin/bash
    - name: Run playbook
      when: run_playbooks is defined
      loop: "{{ run_playbooks }}"
      shell: "set -o pipefail && ansible-playbook -f 50 -v /home/zuul/src/opendev.org/opendev/system-config/{{ item }} 2>&1 | tee /var/log/ansible/{{ item | basename }}.log"
      args:
        executable: /bin/bash
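    # Gather the playbook logs and, when requested, encrypt them via the
    # encrypt-logs role so they can be published safely as build artifacts.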
    - name: Build list of playbook logs
      find:
        paths: '/var/log/ansible'
        patterns: '*.yaml.log'
      register: _run_playbooks_logs
    - name: Encrypt playbook logs
      when: run_playbooks is defined
      include_role:
        name: encrypt-logs
      vars:
        encrypt_logs_files: '{{ _run_playbooks_logs.files | map(attribute="path") | list }}'
        encrypt_logs_artifact_path: 'bridge.openstack.org/ansible'
        encrypt_logs_download_script_path: '/var/log/ansible'
    - name: Run test playbook
      when: run_test_playbook is defined
      shell: "set -o pipefail && ANSIBLE_ROLES_PATH=/home/zuul/src/opendev.org/opendev/system-config/playbooks/roles ansible-playbook -v /home/zuul/src/opendev.org/opendev/system-config/{{ run_test_playbook }} 2>&1 | tee /var/log/ansible/{{ run_test_playbook | basename }}.log"
      args:
        executable: /bin/bash
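    # Expose job metadata to testinfra via a YAML fixture file, which the
    # test suite reads through TESTINFRA_EXTRA_DATA below.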
    - name: Generate testinfra extra data fixture
      set_fact:
        testinfra_extra_data:
          zuul_job: '{{ zuul.job }}'
          zuul: '{{ zuul }}'
    - name: Write out testinfra extra data fixture
      copy:
        content: '{{ testinfra_extra_data | to_nice_yaml(indent=2) }}'
        dest: '/home/zuul/testinfra_extra_data_fixture.yaml'
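    # Screenshots captured during testing land in this directory and are
    # linked as a build artifact for easy viewing.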
    - name: Make screenshots directory
      file:
        path: '/var/log/screenshots'
        state: directory
    - name: Return screenshots artifact
      zuul_return:
        data:
          zuul:
            artifacts:
              - name: Screenshots
                url: "bridge.openstack.org/screenshots"
    - name: Allow PBR's git calls to operate in system-config, despite not owning it
      command: git config --global safe.directory /home/zuul/src/opendev.org/opendev/system-config
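    # Run the testinfra validation via tox; the always section ensures the
    # HTML report is returned as an artifact even when the tests fail.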
    - name: Run and collect testinfra
      block:
        - name: Run testinfra to validate configuration
          include_role:
            name: tox
          vars:
            tox_envlist: testinfra
            # This allows us to run from external projects (like testinfra
            # itself)
            tox_environment:
              TESTINFRA_EXTRA_DATA: '/home/zuul/testinfra_extra_data_fixture.yaml'
            zuul_work_dir: src/opendev.org/opendev/system-config
      always:
        - name: Return testinfra report artifact
          zuul_return:
            data:
              zuul:
                artifacts:
                  - name: testinfra results
                    url: "bridge.openstack.org/test-results.html"