@@ -11,152 +11,85 @@ How to upgrade persistent instances (Amazon AWS)
11
11
This article describes how to upgrade persistent instances (e.g. copr-fe-dev) to
12
12
a new Fedora version.
13
13
14
+ TODO: schedule outage.
14
15
15
16
Requirements
16
17
============
17
18
18
- * access to `Amazon AWS `_
19
- * ssh access to batcave01
20
- * permissions to update aws.fedoraproject.org DNS records
21
-
19
+ * access to the team's `Amazon AWS account`_, with that account properly
20
+ configured according to the `README.md <helper playbook repository_>`_
21
+ * permissions to run playbooks on `batcave01 <playbook SOP_>`_
22
22
23
23
24
24
Pre-upgrade
25
25
===========
26
26
27
- The goal is to do as much work pre-upgrade as possible while focusing
28
- only on important things and not creating a work overload with tasks,
29
- that can be done post-upgrade.
27
+ The goal is to do as much work pre-upgrade as possible while keeping the
28
+ **outage window** as short as possible, and still doing only important things
29
+ (not taking on work that can be done post-upgrade).
30
30
31
31
Don't do the pre-upgrade too long before the actual upgrade. Ideally a couple of
32
32
hours or a day before.
33
33
34
34
35
- Launch a new instance
36
- ---------------------
37
-
38
- First, login into `Amazon AWS `_, otherwise the following step will not
39
- work. Once you are logged-in, feel free to close the page.
40
-
41
-
42
- 1. Choose AMI
43
- .............
44
-
45
- Navigate to the `Cloud Base Images `_ download page and scroll down to
46
- the section with cloud base images for Amazon public cloud. Use
47
- ``Click to launch `` button to launch an instance from the x86_64
48
- AMI. Select the US East (N. Virginia) region.
49
-
50
- You will get redirected to the Amazon AWS page.
51
-
52
-
53
- 2. Name and tags
54
- ................
55
-
56
- - Set ``Name `` and add ``-new `` suffix (e.g. ``copr-distgit-dev-new ``
57
- or ``copr-distgit-prod-new ``)
58
- - Set ``CoprInstance `` to ``devel `` or ``production ``
59
- - Set ``CoprPurpose `` to ``infrastructure ``
60
- - Set ``FedoraGroup `` to ``copr ``
61
-
62
-
63
- 3. Application and OS Images (Amazon Machine Image)
64
- ...................................................
65
-
66
- Skip this section, we already chose the correct AMI from the Fedora
67
- website.
68
-
69
-
70
- 4. Instance type
71
- ................
72
-
73
- Currently, we use the following instance types:
74
-
75
- +----------------+-------------+-------------+
76
- | | Dev | Production |
77
- +================+=============+=============+
78
- | **frontend ** | t3a.medium | t3a.xlarge |
79
- +----------------+-------------+-------------+
80
- | **backend ** | t3a.medium | m5a.4xlarge |
81
- +----------------+-------------+-------------+
82
- | **keygen ** | t3a.small | t3a.xlarge |
83
- +----------------+-------------+-------------+
84
- | **distgit ** | t3a.medium | t3a.medium |
85
- +----------------+-------------+-------------+
86
- | **pulp ** | t3a.medium | TODO |
87
- +----------------+-------------+-------------+
88
-
89
- When more power is needed, please use the `ec2instances.info `_ comparator to get
90
- the cheapest available instance type according to our needs.
91
-
92
-
93
- 5. Key pair (login)
94
- ...................
95
-
96
- - Make sure to use existing key pair named ``Ansible Key ``. This allows us to
97
- run the playbooks on ``batcave01 `` box against the newly spawned VM.
98
-
99
-
100
- 6. Network settings
101
- ...................
102
-
103
- - Click the ``Edit `` button in the box heading to show more options
104
- - Select VPC ``vpc-0af***********972 ``
105
- - Select ``Subnet `` to be ``us-east-1c ``
106
- - Switch ``Auto-assign IPv6 IP `` to ``Enable ``
107
- - Switch to ``Select existing security group `` and pick one of
108
-
109
- - ``copr-frontend-sg ``
110
- - ``copr-backend-sg ``
111
- - ``copr-distgit-sg ``
112
- - ``copr-keygen-sg ``
113
- - ``copr-pulp-sg ``
114
-
115
-
116
- 7. Configure storage
117
- ....................
118
-
119
- - Click the ``Advanced `` button in the box heading to show more options
120
- - Update the ``Size (GiB) `` of the root partition
121
-
122
- +----------------+-------------+-------------+
123
- | | Dev | Production |
124
- +================+=============+=============+
125
- | **frontend ** | 50G | 50G |
126
- +----------------+-------------+-------------+
127
- | **backend ** | 20G | 100G |
128
- +----------------+-------------+-------------+
129
- | **keygen ** | 10G | 20G |
130
- +----------------+-------------+-------------+
131
- | **distgit ** | 20G | 80G |
132
- +----------------+-------------+-------------+
133
- | **pulp ** | 20G | TODO |
134
- +----------------+-------------+-------------+
135
-
136
- - Turn on the ``Encrypted `` option
137
- - Select ``KMS key `` to whatever is ``(default) ``
138
-
139
-
140
- 8. Advanced details
141
- ...................
142
-
143
- - ``Termination protection `` - ``Enable ``
144
-
35
+ Preparation
36
+ -----------
145
37
146
- 9. Launch instance
147
- ..................
38
+ Make sure you have the `helper playbook repository`_ cloned locally, and step
39
+ into the clone directory.
40
+
41
+ At this point, please review the ``dev.yml``, ``prod.yml`` and ``all.yml``
42
+ configuration in the ``./group_vars`` directory. In particular, review all the
43
+ ``old_instance_id``, ``old_network_id`` and data volume IDs; **these REALLY NEED
44
+ to match EC2 reality!**
45
+
46
+ You are going to run these playbooks on your machine::
47
+
48
+ play-vm-migration-01-new-box.yml
49
+ play-vm-migration-02-migrate-backend-box.yml
50
+ play-vm-migration-02-migrate-non-backend-box.yml
51
+ play-vm-migration-03-rename-instances.yml
52
+
53
+ While doing so, you will have to specify two Ansible variables explicitly:
54
+ ``copr_instance`` (either ``dev`` or ``prod``) and ``server_id`` (one of
55
+ ``frontend``, ``backend``, ``distgit`` or ``keygen``). An example command
56
+ looks like::
57
+
58
+ $ opts=( -e copr_instance=dev -e server_id=keygen )
59
+ $ ansible-playbook play-vm-migration-01-new-box.yml "${opts[@]}"
60
+
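As a rough illustration of how the two variables and the four playbooks fit together, the sketch below only prints the commands for one machine instead of running them. The ``plan_migration`` helper is hypothetical (it is not part of the helper playbook repository), and both the choice between the two ``02`` playbooks based on ``server_id`` and the numeric ordering are assumptions inferred from the file names.

```shell
# Hypothetical dry-run helper: print the ansible-playbook commands for one
# machine, in numeric order. Nothing is executed, the commands are only echoed.
plan_migration() {
    local copr_instance server_id step2 playbook
    copr_instance=$1
    server_id=$2

    # copr_instance must be one of the two values named in the text.
    case "$copr_instance" in
        dev|prod) ;;
        *) echo "copr_instance must be dev or prod" >&2; return 1 ;;
    esac

    # Assumption: the backend uses its own step-02 playbook, everything
    # else uses the non-backend one.
    case "$server_id" in
        backend) step2=play-vm-migration-02-migrate-backend-box.yml ;;
        frontend|distgit|keygen) step2=play-vm-migration-02-migrate-non-backend-box.yml ;;
        *) echo "unknown server_id: $server_id" >&2; return 1 ;;
    esac

    for playbook in play-vm-migration-01-new-box.yml "$step2" \
                    play-vm-migration-03-rename-instances.yml; do
        echo "ansible-playbook $playbook -e copr_instance=$copr_instance -e server_id=$server_id"
    done
}

plan_migration dev keygen
```

Printing rather than executing keeps the sketch safe to try; it does not decide when each step should run relative to the outage window.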
61
+ Decide which AMI (golden image) you want to use when starting new instances.
62
+ We typically upgrade to ``Fedora N+2``, e.g. we migrate the infrastructure from
63
+ Fedora 37 to Fedora 39. Navigate to the `Cloud Base Images`_ download page, see
64
+ the section for **Intel and AMD x86_64 systems**, and click the button next to the
65
+ **Fedora Cloud 39 AWS** column (JavaScript needs to be enabled!). Note the
66
+ ``ami-*`` ID in the **US East (N. Virginia)** region (e.g.
67
+ ``ami-0746fc234df9c1ee0``). This ``ami-*`` ID needs to be specified in
68
+ ``group_vars/all.yml``, and both ``group_vars/{dev,prod}.yml``
69
+ need to refer to it correctly.
70
+
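Before pasting the ID into ``group_vars/all.yml``, a quick shape check can catch copy-paste slips. This is a minimal sketch, assuming AMI IDs are ``ami-`` followed by 8 or 17 lowercase hexadecimal digits; the ``is_ami_id`` helper is hypothetical, and it does not verify that the image actually exists in the US East (N. Virginia) region.

```shell
# Hypothetical helper: succeed only if the argument has the shape of an
# EC2 AMI ID ("ami-" plus 8 or 17 lowercase hex digits, assumed format).
is_ami_id() {
    printf '%s\n' "$1" | grep -Eq '^ami-([0-9a-f]{8}|[0-9a-f]{17})$'
}

ami=ami-0746fc234df9c1ee0   # the example ID quoted in the text
if is_ami_id "$ami"; then
    echo "looks like an AMI ID: $ami"
else
    echo "not an AMI ID: $ami" >&2
fi
```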
71
+ You can double-check other machine parameters like instance types (when more
72
+ power is needed, please use the `ec2instances.info`_ comparator to get
73
+ the cheapest available instance type according to our needs), naming, tags, IP
74
+ addresses, root volume sizes, etc. But typically, the defaults will be good
75
+ as-is.
148
76
149
- Click ``Launch instance `` in the right panel.
77
+ .. note::
78
+ The ``group_vars/`` directory is the ultimate source of truth for the Fedora
79
+ Copr instance, so please update the configuration whenever you change
80
+ the instance parameters.
150
81
82
+ Make sure to use the existing key pair named ``Ansible Key``. This allows us to
83
+ **first** run the playbooks on the ``batcave01`` box against the newly spawned VM
84
+ (the playbook then enables the Fedora Copr team members to ssh using their own
85
+ keys, as uploaded to FAS).
151
86
152
- Add names for the root volumes
153
- ------------------------------
87
+ Launch new instances
88
+ --------------------
154
89
155
- Once the instance is created, go to its details, switch to the
156
- ``Storage `` tab, and go through all attached volumes. Set the ``Name ``
157
- tag for each of them. Use the name of the instance as a prefix, e.g.
158
- ``copr-keygen-dev-root ``, ``copr-frontend-prod-root ``, etc.
90
+ This should be as simple as::
159
91
92
+ $ ansible-playbook play-vm-migration-01-new-box.yml "${opts[@]}"
160
93
161
94
Backup the current letsencrypt certificates
162
95
-------------------------------------------
@@ -170,21 +103,26 @@ Copy the certificate files by running the playbooks **against the current (old)
170
103
copr stack** (all machines). There's the ``-t certbot`` ansible tag that allows
171
104
you to speed up the playbook runs.
172
105
173
-
174
- Pre-prepare the new VM
175
- ----------------------
106
+ Pre-prepare the new VM - backend only!
107
+ --------------------------------------
176
108
177
109
.. note::
178
110
179
- Backend - It's possible to run the playbook against the new copr-backend
180
- server before we actually shut-down the old one. But to make sure that
181
- ansible won't complain, we need
111
+ It's possible to run the playbook against the new copr-backend server before
112
+ we actually shut the old one down. But to make sure that ansible won't
113
+ complain, we need:
114
+
115
+ - A temporary volume attached to the new box providing an ext4 filesystem
116
+ with the ``copr-repo`` label.
117
+
118
+ - An existing temporary hostname (with existing DNS record) to execute the
119
+ playbook against it.
120
+
121
+ The volume, DNS record and corresponding Elastic IP for this purpose are
122
+ already prepared. The ``play-vm-migration-01-new-box.yml`` playbook should
123
+ already make them available.
124
+
182
125
183
- - A volume attached to the new box with label 'copr-repo'. Use already
184
- existing volume named ``data-copr-be-dev-initial-playbook-run``
185
- - An existing complementary DNS record (``copr-be-temp`` or
186
- ``copr-be-dev-temp``) pointing to the non-elastic IP of the new
187
- server. See the `DNS SOP`_.
188
126
189
127
190
128
Note the private IP addresses
@@ -514,7 +452,9 @@ Close the infrastructure ticket, the upgrade is done.
514
452
.. _`Fedora infrastructure issue #7966`: https://pagure.io/fedora-infrastructure/issue/7966
515
453
.. _`fedora devel`: https://lists.fedorahosted.org/archives/list/devel@lists.fedoraproject.org/
516
454
.. _`copr devel`: https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/
517
- .. _`Amazon AWS`: https://id.fedoraproject.org/saml2/SSO/Redirect?SPIdentifier=urn:amazon:webservices&RelayState=https://console.aws.amazon.com
518
- .. _`Cloud Base Images`: https://alt.fedoraproject.org/cloud/
455
+ .. _`Amazon AWS account`: https://id.fedoraproject.org/saml2/SSO/Redirect?SPIdentifier=urn:amazon:webservices&RelayState=https://console.aws.amazon.com
456
+ .. _`Cloud Base Images`: https://fedoraproject.org/cloud/download/
519
457
.. _`DNS SOP`: https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/dns/
520
458
.. _`ec2instances.info`: https://ec2instances.info/
459
+ .. _`helper playbook repository`: https://github.com/fedora-copr/ansible-fedora-copr
460
+ .. _`playbook SOP`: https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/ansible/