When launching new nodes with launch-node.py we need to wait for ipv6
addresses to be configured before running the ping6 sanity checks. The
reason is that some clouds rely on router advertisements to configure
ipv6 addresses on VMs. Those advertisements arrive periodically, so the
VM may not have its ipv6 address configured yet when we try to ping6
otherwise.
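A minimal sketch of the retry idea (the helper name, timeout and
interval are illustrative, not the actual launch-node.py code):

    import subprocess
    import time

    def wait_for_ipv6(addr, timeout=120, interval=5):
        # Router advertisements arrive periodically, so poll until
        # ping6 succeeds or we give up.
        deadline = time.time() + timeout
        while time.time() < deadline:
            if subprocess.call(['ping6', '-c', '2', addr]) == 0:
                return True
            time.sleep(interval)
        return False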
Change-Id: I77515fec481e4146765630cd230dd3c2c296958f
These don't make any sense in the top-level these days.
Once upon a time we used to use these as node scripts to bring up
testing nodes (I think). The important thing is they're not used now.
Change-Id: Iffa6c6bee647f1a242e9e71241d829c813f2a3e7
We use project-config for gerrit, gitea and nodepool config. That's
cool, because we can clone that from zuul too and make sure that each
prod run we do runs with the contents of the patch in question.
Introduce a flag file that can be touched in /home/zuulcd that will
block zuul from running prod playbooks. By default, if the file is
there, zuul will wait for an hour before giving up.
Rename zuulcd to zuul
To better align prod and test, name the zuul user zuul.
Change-Id: I83c38c9c430218059579f3763e02d6b9f40c7b89
The "PVHVM" image appears to have disappeared from RAX, replaced with
a "Cloud" image.
Maybe I haven't looked in the right place, but I can't find any info
on if, why or when this was updated. But I started a server with the
"Cloud" image and it seems the same as the PVHVM image to me; hdparm
showed read speeds the same as an older server and dd writes to a file
were the same speed (recorded below for posterity).
ianw@nb04:~$ dd if=/dev/zero of=testfile bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.21766 s, 206 MB/s
ianw@nb04:~$ sudo hdparm -Tt /dev/xvda
/dev/xvda:
Timing cached reads: 16428 MB in 1.99 seconds = 8263.05 MB/sec
Timing buffered disk reads: 752 MB in 3.00 seconds = 250.65 MB/sec
From looking at dmesg it has
[ 0.000000] DMI: Xen HVM domU, BIOS 4.1.5 11/28/2013
[ 0.000000] Hypervisor detected: Xen HVM
[ 0.000000] Xen version 4.1.
[ 0.000000] Xen Platform PCI: I/O protocol version 1
[ 0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
[ 0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
which, if [1] is anything to go by, suggests it is running in PVHVM
mode anyway.
tl;dr seems like the image name changed.
[1] https://xen-orchestra.com/blog/debian-pvhvm-vs-pv/
Change-Id: I4ff14e7e36f59a9487c32fdc6940e8b8a93459e6
If you happen to be booting a replacement host, you don't want ansible
to pick up the current host from the current inventory. Put the new
server's inventory last in the list so it overrides anything before it.
Change-Id: I3f1edfb95924dae0256f969bc740f1141e291c25
In our launch node script we have the option to ignore ipv6 to deal with
clouds like ovh that report an ipv6 address but don't actually provide
that data to the instance so it cannot configure ipv6. When we ignore
ipv6 we should not try to use the ipv6 address at all.
Use the public_v4 address in this case when writing out an ansible
inventory to run the base.yaml playbook when launching the node.
Otherwise we could use ipv6 which doesn't work.
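Roughly, the address selection amounts to something like this
(attribute and flag names follow shade conventions and this commit
message, not necessarily the exact code):

    def inventory_ip(server, ignore_ipv6=False):
        # Prefer ipv6 unless we were told to ignore it (e.g. on ovh),
        # in which case fall back to the public ipv4 address.
        if not ignore_ipv6 and server.public_v6:
            return server.public_v6
        return server.public_v4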
Change-Id: I2ce5cc0db9852d3426828cf88965819f88b3ebd5
The launch script is referring to the wrong path for the emergency
inventory. Also correct the references in the sysadmin guide and
update the example for using it.
Change-Id: I80bdbd440ec451bcd6fb1a3eb552ffda32407c44
As noted inline, this needs to be skipped on OVH (and I always forget,
and debug this over and over when launching a mirror node there :).
Change-Id: I07780e29f5fef75cdbab3b504f278387ddc4b13f
This was introduced with Ia67e65d25a1d961b619aa445303015fd577dee57
Passing "-i file1,file2,file.." makes Ansible think that the inventory
argument is a list of hostnames. Separate out the "-i" flags so it
reads each file as desired.
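The fix amounts to building the command with one "-i" per file, along
the lines of (paths and host name here are examples only):

    jobdir_inventory = '/tmp/launch-xyz/inventory'  # per-launch inventory (example path)
    hostname = 'newserver01.example.com'            # host being launched (example)

    cmd = ['ansible-playbook']
    for inventory in ['/etc/ansible/hosts', jobdir_inventory]:
        # one -i per inventory file; a single comma-joined -i would be
        # parsed as a list of hostnames instead
        cmd.extend(['-i', inventory])
    cmd.extend(['-l', hostname, 'base.yaml'])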
Change-Id: I92c9a74de6552968da6c919074d84f2911faf4d4
I managed to leave off the "--image" flag for a Xenial host, so the
script created a Bionic host by default. I let that play out, deleted
the host and tried again with the correct image, but the fact cache
still thought the new host was Bionic; several ansible roles ran under
that assumption, and we ended up with a bad Xenial/Bionic mashup.
Clear the cache on node launch to avoid this sort of thing again.
I have launched a node with this new option, and it worked.
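Roughly, the clearing looks like this (the cache location is an
assumption, not necessarily the path the script uses):

    import os

    hostname = 'newserver01.example.com'   # host being launched (example)
    fact_cache = '/var/cache/ansible/facts'
    cached = os.path.join(fact_cache, hostname)
    if os.path.exists(cached):
        # drop stale facts (e.g. the old Bionic platform facts) so the
        # first run against the new host regathers everything
        os.unlink(cached)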
Change-Id: Ie37f562402bed3846f27fbdd4441b5f4dcec7eb2
Passing the -i to the jobdir means we're overriding the inventory.
This means variables that come from the /etc/ansible vars, like
sysadmins, are missing.
Add the global inventory to the command line for ansible-playbook.
We still have --limit specified via '-l', so we should only run
on the host in question.
Change-Id: Ia67e65d25a1d961b619aa445303015fd577dee57
When we're booting boot-from-volume servers and there are errors,
we leave the root volume around. Clean up after ourselves.
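A hedged sketch of the cleanup using shade-style calls (error handling
elided; the get_volumes/delete_volume approach is an assumption about
the shape of the change, not the exact code):

    # cloud, name, image and flavor come from the launch script's
    # normal setup
    try:
        server = cloud.create_server(name, image=image, flavor=flavor,
                                     boot_from_volume=True, wait=True)
    except Exception:
        # a failed boot-from-volume launch can leave the root volume
        # behind, so remove it along with the half-built server
        leftover = cloud.get_server(name)
        if leftover:
            volumes = cloud.get_volumes(leftover)
            cloud.delete_server(leftover.id, wait=True)
            for volume in volumes:
                cloud.delete_volume(volume.id)
        raise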
Change-Id: I6341cdbf21d659d043592f92ddf8ecf6be997802
When launching a new server we should make sure that all available
package updates are installed before we reboot the server. This way we
get available security updates applied to things like our kernel.
This change adds a new playbook that runs the unattended-upgrade command
on debuntu servers. Will need to add support for other platforms in a
followup change.
Change-Id: Idc88dc33afdd209c388452493e6a7f5731fa0974
Some clouds may be a little slower than others building images, and
overriding the create_server default timeout of 3 minutes (180 seconds)
currently requires hand editing. Add a global timeout option and use it
consistently.
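In shade terms this is roughly (the flag name and default here are
assumptions):

    # parser, cloud, name, image and flavor come from the surrounding
    # script
    parser.add_argument('--timeout', dest='timeout', type=int, default=600,
                        help='seconds to wait for server creation')
    options = parser.parse_args()

    # create_server otherwise defaults to timeout=180
    server = cloud.create_server(name, image=image, flavor=flavor,
                                 wait=True, timeout=options.timeout)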
Change-Id: I66032ef929746739d07dca3fd178b8c43bb8174c
Remove the section on launching nodes in the jenkins tenant. That
never happens.
Remove the bits about groups and sudo, as they aren't relevant
any more.
Remove the unused os_client_config import.
Change-Id: I676bb7450ec80df73b76ee7841f78eadbe179183
os.listdir returns dirents relative to the dir being listed. We need to
give the full path to these entries when unlinking them. Do this by joining
the inventory_cache_dir path to each inventory_cache file.
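The gist of the fix, using the names from this commit message:

    import os

    inventory_cache_dir = '/var/cache/ansible/inventory'  # assumed path
    for inventory_cache in os.listdir(inventory_cache_dir):
        # os.listdir yields bare file names, so rebuild the full path
        os.unlink(os.path.join(inventory_cache_dir, inventory_cache))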
Change-Id: I78376cfa3b2aa92641f2685b08616660f523dfaf
Update the launch node readme and script to use python3 on the new
bridge node. There is no python2. Also update ansible to pull in
python3 support. The version we had been using wasn't python3 happy.
Change-Id: I6122160eb70eb6b5f299a8adb6478a9046ff1725
Replace launch-node.py with launch-node-ansible.py. Update it to
delete the inventory cache correctly.
Also, update the docs to list Bionic by default rather than Trusty.
Change-Id: Iadda897b7e71dc12c8db4ced120894054169bbb8
We want to launch a new bastion host to run ansible on. Because we're
working on the transition to ansible, it seems like being able to do
that without needing puppet would be nice. This gets user management,
base repo setup and whatnot installed. It doesn't remove them from the
existing puppet, nor does it change the way we're calling anything that
currently exists.
Add bridge.openstack.org to the disabled group so that we don't try to
run puppet on it.
Change-Id: I3165423753009c639d9d2e2ed7d9adbe70360932
Change I76b1099bf0cf3bfead17f96e456cdce87d0e8a49 altered the name of
the inventory script, so reflect that in the corresponding
subprocess call in launch-node.py and a comment in the
expand-groups.sh script.
Change-Id: I4c2c762716813b5d59dcc1b623f5988c8aa7d490
The dns.py file uses openstack.connect to make the Connection but
launch_node.py was still using shade.OpenStackCloud, so when the
connection was passed to dns.py it was trying to use an SDK property but
getting a Shade object.
This is because while sdk has been updated with all of the shade
objects, we haven't yet updated shade to provide the sdk version of the
object, so shade objects coming from sdk have things that shade objects
coming from shade don't yet have.
Update launch_node.py to use the same Connection construction that
dns.py does.
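That construction is roughly the following (argument names per the
openstacksdk API; the options object stands in for whatever the script
parses):

    import openstack

    cloud = openstack.connect(cloud=options.cloud,
                              region_name=options.region)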
Change-Id: I1c6bfe54f94effe0e592280ba179f61a6d983e7a
When booting servers with --boot-from-volume (vexxhost) it is helpful
to also provide the size of the volume we want to use.
Change-Id: I478e40ba129f267c0d2d5b54e90a6f84716018f0
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
We're now launching xenial for control servers, so let's update the
defaults.
Change-Id: I14dc26673c290ae37b7a9ef016d7a343d2763efe
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
eth0 might not exist, such as on Xenial hosts with interface-based
names. Since this is a bit of a platform/provider specific hack, just
ignore failures.
Change-Id: Ie18b7f49ea2f1b72b496c61ac2576ae53f5ad3eb
This way we are able to stream the output from commands as they
are received for better debugging. We can also move some new
debug statements to inside of the new run() function so they
are more automatic.
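A minimal sketch of what run() could look like under that approach (not
necessarily the exact implementation):

    import subprocess

    def run(cmd, **kwargs):
        # echo the command, then stream its output line by line as it
        # arrives rather than collecting it all at the end
        print('Running: %s' % ' '.join(cmd))
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, **kwargs)
        for line in iter(proc.stdout.readline, b''):
            print(line.decode('utf-8', 'replace'), end='')
        proc.wait()
        return proc.returncode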
Change-Id: I484f5cf70aa15923ea4bb866f3be536b2e8ed4ed
One problem with "shell script as python" is that there's no
equivalent of shell's "-x", which makes it really hard to work out
what's being called and where output came from.
This adds a bit more verbose logging around the ssh calls to try and
help someone parsing the logs.
Change-Id: I85e2415b47e044cfa1c678fc7786b4891fa1f93e
Avoid a bunch of warnings about unwritable /var/log/ansible.log (the
default) by setting the log path environment variable where we call
ansible.
Note expand-groups.sh is moved inside the JobDir() context so we can
use the environment var there too, as it calls ansible underneath.
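In rough terms (the JobDir attribute name is an assumption):

    import os
    import subprocess

    # cmd is the ansible/ansible-playbook invocation built elsewhere
    env = dict(os.environ)
    # ANSIBLE_LOG_PATH overrides ansible's log file; point it somewhere
    # writable inside the job dir instead of /var/log/ansible.log
    env['ANSIBLE_LOG_PATH'] = os.path.join(jobdir.root, 'ansible.log')
    subprocess.check_call(cmd, env=env)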
Change-Id: I575d633a36db8cfb891c8903a7bfbea73a4cfb29
Save the key to a file in /tmp when failing early with --keep.
Although it is put into the JobDir later, if we fail before that we're
locked out of the host.
While we're here, make what just happened in an error case a little
clearer.
Change-Id: Ide601e2018302664bc4ad609c4483aa1451b3724
RAX nodes are exhibiting new behaviour of having ipv6 configured but
not active. Restart eth0 to pick up the address in
/etc/network/interfaces so the ping6 checks work.
Change-Id: I6b60bde34cc28ca60c5cbbb41de02cd89354cc32
There are potentially two related issues here which can result in
an empty generated groups file. The first is that if there are OS_
environment variables set, then os-client-config can create an 'environ'
cloud. That cloud then, in most cases here, will not be a valid cloud
since it won't be a full config, so iterating over all existing clouds
to get their servers will fail, meaning that the inventory will be
empty, which in turn means the generated groups file is generated empty.
To deal with that, we can consume the newer upstream option that allows
the inventory to not bail out if it has a bad cloud, but instead get all
of the resources from the clouds that do work.
Additionally though, we can do an explicit inventory run so that we can
look to see if the inventory run failed, and if so, avoid running the
expand-groups.sh script, since we'd be fairly assured that it would be
running on top of a bad inventory cache.
Change-Id: Ib18987b3083f6addc61934b435d7ecb14aa1d25a
For --config-drive to actually work as advertised in launch-node.py,
it needs to default to False. Otherwise this option is useless.
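The corrected default presumably amounts to something like this
(argparse spelling assumed):

    parser.add_argument('--config-drive', dest='config_drive',
                        action='store_true', default=False,
                        help='boot the server with a config drive')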
Change-Id: Ib29fa758779e89d3d25399615fd009b836dda598
When launching a server, ansible needs to know what groups the
new host is in so that it can copy the appropriate files. Figuring
that out is done based on the groups.txt file and the expand-groups
script. This change runs that script after creating a host, which
will update the global list of expanded groups. That is then
symlinked into a temporary inventory directory used by launch-node.
The JobDir concept is borrowed from Zuul as a simple way of creating
and deleting at the appropriate time a complex temporary directory.
Change-Id: Icce083ca67a3473b7d77401142f870fd28dd08f5
We can only get the volume attach device if we are attaching a volume.
Check if the volume is being attached and only determine the attachment
location in that case to avoid errors.
Story: 2000569
Change-Id: I4adc5e23abdfc0627a0850f845e2333d3bd25e63
Now that we have a shade version of the launch node script adding in
support for attaching a cinder volume is simple. Do this so that
launching mirrors which rely on cinder volumes is simpler.
This updates the mount_volume.sh script to set up the first cinder
volume with lvm and mount it under the specified path. It will also
install lvm2 packages since they may not be present on all base images.
This updates the make_swap.sh script to avoid blindly using /dev/vdb as
the location for swap as this may be a cinder volume or config drive.
We add availability zone, device specification, mount path, and
fs label support to shade-launch-node.py as these are all necessary
inputs to properly mount a cinder volume in a VM.
Change-Id: Ie95fd4bd5fca8df4f8046d43d1333935cad567e3
There is a bug in OCC that causes an envvars cloud to be created when
the only two env vars are the selectors OS_CLOUD and OS_REGION_NAME. So
exclude them from the environment when running the group creation
command.
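Something along these lines (the exact command comes from the script;
expand-groups.sh is used only as an example here):

    import os
    import subprocess

    # strip the selector variables so os-client-config does not
    # synthesize an 'envvars' cloud from them
    env = {k: v for k, v in os.environ.items()
           if k not in ('OS_CLOUD', 'OS_REGION_NAME')}
    subprocess.check_call(['expand-groups.sh'], env=env)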
Also, there is a bug in the invocation of the hostname playbook, in that
it was passing in the UUID as the target to run against, but we're
writing out a name-based inventory.
Change-Id: I0b524dc43ec96c6645ae82a090744eab463e7fb9
It looks like we solved the duplicate server problem twice in
conflicting ways. Using uuid in the inventory is not needed, because
we're making a specific inventory for the ansible commands and avoiding
the OpenStack inventory. So the ansible run has no idea of any other
servers other than the one we're making right now. With that, we can use
name as the hostname rather than UUID.
Story: 2000520
Change-Id: Idb967e10fc00471923077e4e9caa32fdb4c1cc78