
1 22 Nico Schottelius
h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual
2 1 Nico Schottelius
3 3 Nico Schottelius
{{toc}}
4
5 1 Nico Schottelius
h2. Status
6
7 211 Nico Schottelius
This document is **production**.
8
This document serves both as the ungleich kubernetes infrastructure overview and as the ungleich kubernetes manual.
9 1 Nico Schottelius
10 10 Nico Schottelius
h2. k8s clusters
11
12 123 Nico Schottelius
| Cluster            | Purpose/Setup     | Maintainer | Master(s)                     | argo                                                   | v4 http proxy | last verified |
13
| c0.k8s.ooo         | Dev               | -          | UNUSED                        |                                                        |               |    2021-10-05 |
14
| c1.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
15
| c2.k8s.ooo         | Dev p7 HW         | Nico       | server47 server53 server54    | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo     |               |    2021-10-05 |
16
| c3.k8s.ooo         | retired           | -          | -                             |                                                        |               |    2021-10-05 |
17
| c4.k8s.ooo         | Dev2 p7 HW        | Jin-Guk    | server52 server53 server54    |                                                        |               |             - |
18
| c5.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
19
| c6.k8s.ooo         | Dev p6 VM Jin-Guk | Jin-Guk    |                               |                                                        |               |               |
20
| [[p5.k8s.ooo]]     | production        |            | server34 server36 server38    | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo     | -             |               |
21
| [[p5-cow.k8s.ooo]] | production        | Nico       | server47 server51 server55    | "argo":https://argocd-server.argocd.svc.p5-cow.k8s.ooo |               |    2022-08-27 |
22
| [[p6.k8s.ooo]]     | production        |            | server67 server69 server71    | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo     | 147.78.194.13 |    2021-10-05 |
23 184 Nico Schottelius
| [[p6-cow.k8s.ooo]] | production        |            | server134 server135 server136 | "argo":https://argocd-server.argocd.svc.p6in10.k8s.ooo | ?             |    2023-05-17 |
24 177 Nico Schottelius
| [[p10.k8s.ooo]]    | production        |            | server131 server132 server133 | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo    | 147.78.194.12 |    2021-10-05 |
25 123 Nico Schottelius
| [[k8s.ge.nau.so]]  | development       |            | server107 server108 server109 | "argo":https://argocd-server.argocd.svc.k8s.ge.nau.so  |               |               |
26
| [[dev.k8s.ooo]]    | development       |            | server110 server111 server112 | "argo":https://argocd-server.argocd.svc.dev.k8s.ooo    | -             |    2022-07-08 |
27 164 Nico Schottelius
| [[r1r2p15k8sooo|r1.p15.k8s.ooo]] | production | Nico | server120 | | | 2022-10-30 |
28
| [[r1r2p15k8sooo|r2.p15.k8s.ooo]] | production | Nico | server121 | | | 2022-09-06 |
29 162 Nico Schottelius
| [[r1r2p10k8sooo|r1.p10.k8s.ooo]] | production | Nico | server122 | | | 2022-10-30 |
30
| [[r1r2p10k8sooo|r2.p10.k8s.ooo]] | production | Nico | server123 | | | 2022-10-15 |
31
| [[r1r2p5k8sooo|r1.p5.k8s.ooo]] | production | Nico | server137 | | | 2022-10-30 |
32
| [[r1r2p5k8sooo|r2.p5.k8s.ooo]] | production | Nico | server138 | | | 2022-10-30 |
33
| [[r1r2p6k8sooo|r1.p6.k8s.ooo]] | production | Nico | server139 | | | 2022-10-30 |
34
| [[r1r2p6k8sooo|r2.p6.k8s.ooo]] | production | Nico | server140 | | | 2022-10-30 |
35 21 Nico Schottelius
36 1 Nico Schottelius
h2. General architecture and components overview
37
38
* All k8s clusters are IPv6 only
39
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
40
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
41 18 Nico Schottelius
** Private configurations are found in the **k8s-config** repository
42 1 Nico Schottelius
43
h3. Cluster types
44
45 28 Nico Schottelius
| **Type/Feature**            | **Development**                | **Production**         |
46
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
47
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
48
| Separation of control plane | optional                       | recommended            |
49
| Persistent storage          | required                       | required               |
50
| Number of storage monitors  | 3                              | 5                      |
51 1 Nico Schottelius
52 43 Nico Schottelius
h2. General k8s operations
53 1 Nico Schottelius
54 46 Nico Schottelius
h3. Cheat sheet / great external references
55
56
* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/
57
58 214 Nico Schottelius
Some examples:
59
60
h4. Use kubectl to print only the node names
61
62
<pre>
63
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
64
</pre>
65
66
This can easily be used in a shell loop like this:
67
68
<pre>
69
for host in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do echo $host; ssh root@${host} uptime; done
70
</pre>
71
72 117 Nico Schottelius
h3. Allowing work to be scheduled on the control plane / removing node taints
73 69 Nico Schottelius
74
* Mostly for single node / test / development clusters
75
* Just remove the master taint as follows
76
77
<pre>
78
kubectl taint nodes --all node-role.kubernetes.io/master-
79 118 Nico Schottelius
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
80 69 Nico Schottelius
</pre>
81 1 Nico Schottelius
82 117 Nico Schottelius
You can check the node taints using @kubectl describe node ...@
83 69 Nico Schottelius
84 208 Nico Schottelius
h3. Adding taints
85
86
* For instance to limit nodes to specific customers
87
88
<pre>
89
kubectl taint nodes serverXX customer=CUSTOMERNAME:NoSchedule
90
</pre>
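Pods of that customer then need a matching toleration (and usually also a node label + nodeSelector to pin them to those nodes). A minimal sketch, with CUSTOMERNAME as a placeholder:

<pre>
apiVersion: v1
kind: Pod
...
spec:
  tolerations:
    - key: "customer"
      operator: "Equal"
      value: "CUSTOMERNAME"
      effect: "NoSchedule"
</pre>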
91
92 44 Nico Schottelius
h3. Get the cluster admin.conf
93
94
* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
95
* To be able to administrate the cluster you can copy the admin.conf to your local machine
96
* Multi cluster debugging becomes very easy if you name the config ~/cX-admin.conf (see example below)
97
98
<pre>
99
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
100
% export KUBECONFIG=~/c2-admin.conf    
101
% kubectl get nodes
102
NAME       STATUS                     ROLES                  AGE   VERSION
103
server47   Ready                      control-plane,master   82d   v1.22.0
104
server48   Ready                      control-plane,master   82d   v1.22.0
105
server49   Ready                      <none>                 82d   v1.22.0
106
server50   Ready                      <none>                 82d   v1.22.0
107
server59   Ready                      control-plane,master   82d   v1.22.0
108
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
109
server61   Ready                      <none>                 82d   v1.22.0
110
server62   Ready                      <none>                 82d   v1.22.0               
111
</pre>
112
113 18 Nico Schottelius
h3. Installing a new k8s cluster
114 8 Nico Schottelius
115 9 Nico Schottelius
* Decide on the cluster name (usually *cX.k8s.ooo*), X counting upwards
116 28 Nico Schottelius
** Using pXX.k8s.ooo for production clusters of placeXX
117 9 Nico Schottelius
* Use cdist to configure the nodes with requirements like crio
118
* Decide between single or multi node control plane setups (see below)
119 28 Nico Schottelius
** A single control plane is suitable for development clusters
120 9 Nico Schottelius
121 28 Nico Schottelius
Typical init procedure:
122 9 Nico Schottelius
123 206 Nico Schottelius
h4. Single control plane:
124
125
<pre>
126
kubeadm init --config bootstrap/XXX/kubeadm.yaml
127
</pre>
128
129
h4. Multi control plane (HA):
130
131
<pre>
132
kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs
133
</pre>
134
135 10 Nico Schottelius
136 29 Nico Schottelius
h3. Deleting a pod that is hanging in terminating state
137
138
<pre>
139
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
140
</pre>
141
142
(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)
143
144 42 Nico Schottelius
h3. Listing nodes of a cluster
145
146
<pre>
147
[15:05] bridge:~% kubectl get nodes
148
NAME       STATUS   ROLES                  AGE   VERSION
149
server22   Ready    <none>                 52d   v1.22.0
150
server23   Ready    <none>                 52d   v1.22.2
151
server24   Ready    <none>                 52d   v1.22.0
152
server25   Ready    <none>                 52d   v1.22.0
153
server26   Ready    <none>                 52d   v1.22.0
154
server27   Ready    <none>                 52d   v1.22.0
155
server63   Ready    control-plane,master   52d   v1.22.0
156
server64   Ready    <none>                 52d   v1.22.0
157
server65   Ready    control-plane,master   52d   v1.22.0
158
server66   Ready    <none>                 52d   v1.22.0
159
server83   Ready    control-plane,master   52d   v1.22.0
160
server84   Ready    <none>                 52d   v1.22.0
161
server85   Ready    <none>                 52d   v1.22.0
162
server86   Ready    <none>                 52d   v1.22.0
163
</pre>
164
165 41 Nico Schottelius
h3. Removing / draining a node
166
167
Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:
168
169 1 Nico Schottelius
<pre>
170 103 Nico Schottelius
kubectl drain --delete-emptydir-data --ignore-daemonsets serverXX
171 42 Nico Schottelius
</pre>
172
173
h3. Re-adding a node after draining
174
175
<pre>
176
kubectl uncordon serverXX
177 1 Nico Schottelius
</pre>
178 43 Nico Schottelius
179 50 Nico Schottelius
h3. (Re-)joining worker nodes after creating the cluster
180 49 Nico Schottelius
181
* We need to have an up-to-date token
182
* We use different join commands for the workers and control plane nodes
183
184
Generating the join command on an existing control plane node:
185
186
<pre>
187
kubeadm token create --print-join-command
188
</pre>
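The printed command is then executed on the (re-)joining worker node, for example (token and hash are placeholders):

<pre>
kubeadm join p10-api.k8s.ooo:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH
</pre>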
189
190 50 Nico Schottelius
h3. (Re-)joining control plane nodes after creating the cluster
191 1 Nico Schottelius
192 50 Nico Schottelius
* We generate the token again
193
* We upload the certificates
194
* We need to combine/create the join command for the control plane node
195
196
Example session:
197
198
<pre>
199
% kubeadm token create --print-join-command
200
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash 
201
202
% kubeadm init phase upload-certs --upload-certs
203
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
204
[upload-certs] Using certificate key:
205
CERTKEY
206
207
# Then we use these two outputs on the joining node:
208
209
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
210
</pre>
211
212
Commands to be used on a control plane node:
213
214
<pre>
215
kubeadm token create --print-join-command
216
kubeadm init phase upload-certs --upload-certs
217
</pre>
218
219
Commands to be used on the joining node:
220
221
<pre>
222
JOINCOMMAND --control-plane --certificate-key CERTKEY
223
</pre>
224 49 Nico Schottelius
225 51 Nico Schottelius
SEE ALSO
226
227
* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
228
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/
229
230 53 Nico Schottelius
h3. How to fix etcd does not start when rejoining a kubernetes cluster as a control plane
231 52 Nico Schottelius
232
If during the above step etcd does not come up, @kubeadm join@ can hang as follows:
233
234
<pre>
235
[control-plane] Creating static Pod manifest for "kube-apiserver"                                                              
236
[control-plane] Creating static Pod manifest for "kube-controller-manager"                                                     
237
[control-plane] Creating static Pod manifest for "kube-scheduler"                                                              
238
[check-etcd] Checking that the etcd cluster is healthy                                                                         
239
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:37
240
8a]:2379 with maintenance client: context deadline exceeded                                                                    
241
To see the stack trace of this error execute with --v=5 or higher         
242
</pre>
243
244
Then the problem is likely that the node's etcd server is still registered as a member of the etcd cluster. We first need to remove it from the etcd cluster; afterwards the join works.
245
246
To fix this we do:
247
248
* Find a working etcd pod
249
* Find the etcd members / member list
250
* Remove the etcd member that we want to re-join the cluster
251
252
253
<pre>
254
# Find the etcd pods
255
kubectl -n kube-system get pods -l component=etcd,tier=control-plane
256
257
# Get the list of etcd servers with the member id 
258
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
259
260
# Remove the member
261
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
262
</pre>
263
264
Sample session:
265
266
<pre>
267
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
268
NAME            READY   STATUS    RESTARTS     AGE
269
etcd-server63   1/1     Running   0            3m11s
270
etcd-server65   1/1     Running   3            7d2h
271
etcd-server83   1/1     Running   8 (6d ago)   7d2h
272
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
273
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
274
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
275
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false
276
277
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
278
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77
279 1 Nico Schottelius
280
</pre>
281
282
SEE ALSO
283
284
* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster
285 56 Nico Schottelius
286 213 Nico Schottelius
h4. Updating the members
287
288
1) Get an alive member
289
290
<pre>
291
% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
292
NAME            READY   STATUS    RESTARTS   AGE
293
etcd-server67   1/1     Running   1          185d
294
etcd-server69   1/1     Running   1          185d
295
etcd-server71   1/1     Running   2          185d
296
[20:57] sun:~% 
297
</pre>
298
299
2) Get the member list
300
301
* in this case via crictl, as the api does not work correctly anymore
302
303
<pre>
304
305
306
</pre>
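The original notes do not include the exact command; an untested sketch via crictl on a control plane node (assuming crictl is configured for crio and etcdctl is available inside the etcd container) could look like this:

<pre>
# find the etcd container
crictl ps --name etcd -q

# run etcdctl inside it (use the container ID from above)
crictl exec -it CONTAINERID etcdctl --endpoints '[::1]:2379' \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key member list
</pre>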
307
308
309
3) Update the member's peer URL
310
311
<pre>
312
etcdctl member update MEMBERID  --peer-urls=https://[...]:2380
313
314
315
</pre>
316
317 147 Nico Schottelius
h3. Node labels (adding, showing, removing)
318
319
Listing the labels:
320
321
<pre>
322
kubectl get nodes --show-labels
323
</pre>
324
325
Adding labels:
326
327
<pre>
328
kubectl label nodes LIST-OF-NODES label1=value1 
329
330
</pre>
331
332
For instance:
333
334
<pre>
335
kubectl label nodes router2 router3 hosttype=router 
336
</pre>
337
338
Selecting nodes in pods:
339
340
<pre>
341
apiVersion: v1
342
kind: Pod
343
...
344
spec:
345
  nodeSelector:
346
    hosttype: router
347
</pre>
348
349 148 Nico Schottelius
Removing labels by adding a minus at the end of the label name:
350
351
<pre>
352
kubectl label node <nodename> <labelname>-
353
</pre>
354
355
For instance:
356
357
<pre>
358
kubectl label nodes router2 router3 hosttype- 
359
</pre>
360
361 147 Nico Schottelius
SEE ALSO
362 1 Nico Schottelius
363 148 Nico Schottelius
* https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
364
* https://stackoverflow.com/questions/34067979/how-to-delete-a-node-label-by-command-and-api
365 147 Nico Schottelius
366 199 Nico Schottelius
h3. Listing all pods on a node
367
368
<pre>
369
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=serverXX
370
</pre>
371
372
Found on https://stackoverflow.com/questions/62000559/how-to-list-all-the-pods-running-in-a-particular-worker-node-by-executing-a-comm
373
374 101 Nico Schottelius
h3. Hardware Maintenance using ungleich-hardware
375
376
Use the following manifest and replace the HOST with the actual host:
377
378
<pre>
379
apiVersion: v1
380
kind: Pod
381
metadata:
382
  name: ungleich-hardware-HOST
383
spec:
384
  containers:
385
  - name: ungleich-hardware
386
    image: ungleich/ungleich-hardware:0.0.5
387
    args:
388
    - sleep
389
    - "1000000"
390
    volumeMounts:
391
      - mountPath: /dev
392
        name: dev
393
    securityContext:
394
      privileged: true
395
  nodeSelector:
396
    kubernetes.io/hostname: "HOST"
397
398
  volumes:
399
    - name: dev
400
      hostPath:
401
        path: /dev
402
</pre>
403
404 102 Nico Schottelius
Also see: [[The_ungleich_hardware_maintenance_guide]]
405
406 105 Nico Schottelius
h3. Triggering a cronjob / creating a job from a cronjob
407 104 Nico Schottelius
408
To test a cronjob, we can create a job from a cronjob:
409
410
<pre>
411
kubectl create job --from=cronjob/volume2-daily-backup volume2-manual
412
</pre>
413
414
This creates the job volume2-manual based on the cronjob volume2-daily-backup.
415
416 112 Nico Schottelius
h3. su-ing into a user that has nologin shell set
417
418
Users often have nologin set as their shell inside the container. To be able to execute maintenance commands within the
419
container, we can use @su -s /bin/sh@ like this:
420
421
<pre>
422
su -s /bin/sh -c '/path/to/your/script' testuser
423
</pre>
424
425
Found on https://serverfault.com/questions/351046/how-to-run-command-as-user-who-has-usr-sbin-nologin-as-shell
426
427 113 Nico Schottelius
h3. How to print a secret value
428
429
Assuming you want the "password" item from a secret, use:
430
431
<pre>
432
kubectl get secret SECRETNAME -o jsonpath="{.data.password}" | base64 -d; echo "" 
433
</pre>
434
435 209 Nico Schottelius
h3. Fixing the "ImageInspectError"
436
437
If you see this problem:
438
439
<pre>
440
# kubectl get pods
441
NAME                                                       READY   STATUS                   RESTARTS   AGE
442
bird-router-server137-bird-767f65bb47-g4xsh                0/1     Init:ImageInspectError   0          77d
443
bird-router-server137-openvpn-server120-5c987b7ffb-cn9xf   0/1     ImageInspectError        1          159d
444
bird-router-server137-unbound-5c6f5d4bb6-cxbpr             0/1     ImageInspectError        1          159d
445
</pre>
446
447
Fixes so far:
448
449
* correct registries.conf (see the sketch below)
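What a "correct" registries.conf looks like depends on the setup; a minimal sketch of @/etc/containers/registries.conf@ that resolves unqualified image names (adapt to the mirrors actually in use) might be:

<pre>
# /etc/containers/registries.conf
unqualified-search-registries = ["docker.io"]
</pre>

Restart crio afterwards (e.g. @/etc/init.d/crio restart@ on Alpine).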
450
451 212 Nico Schottelius
h3. Automatic cleanup of images
452
453
* options to kubelet
454
455
<pre>
456
  --image-gc-high-threshold=90: The percent of disk usage after which image garbage collection is always run. Default: 90%
457
  --image-gc-low-threshold=80: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Default: 80%
458
</pre>
459 209 Nico Schottelius
460 173 Nico Schottelius
h3. How to upgrade a kubernetes cluster
461 172 Nico Schottelius
462
h4. General
463
464
* Should be done every X months to stay up-to-date
465
** X probably something like 3-6
466
* kubeadm based clusters
467
* Needs specific kubeadm versions for upgrade
468
* Follow instructions on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
469 190 Nico Schottelius
* Finding releases: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG
470 172 Nico Schottelius
471
h4. Getting a specific kubeadm or kubelet version
472
473
<pre>
474 190 Nico Schottelius
RELEASE=v1.22.17
475
RELEASE=v1.23.17
476 181 Nico Schottelius
RELEASE=v1.24.9
477 1 Nico Schottelius
RELEASE=v1.25.9
478
RELEASE=v1.26.6
479 190 Nico Schottelius
RELEASE=v1.27.2
480
481 187 Nico Schottelius
ARCH=amd64
482 172 Nico Schottelius
483
curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
484 182 Nico Schottelius
chmod u+x kubeadm kubelet
485 172 Nico Schottelius
</pre>
486
487
h4. Steps
488
489
* kubeadm upgrade plan
490
** On one control plane node
491
* kubeadm upgrade apply vXX.YY.ZZ
492
** On one control plane node
493 189 Nico Schottelius
* kubeadm upgrade node
494
** On all other control plane nodes
495
** On all worker nodes afterwards
496
497 172 Nico Schottelius
498 173 Nico Schottelius
Repeat for all control plane nodes. Then upgrade the kubelet on all other nodes via the package manager (see the example session below).
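A condensed example session, assuming the target version is v1.27.2 and an Alpine based node:

<pre>
# on the first control plane node
kubeadm upgrade plan
kubeadm upgrade apply -y v1.27.2

# on each remaining control plane node, then on each worker node
kubeadm upgrade node

# per node afterwards: upgrade + restart the kubelet
apk upgrade -a
/etc/init.d/kubelet restart
</pre>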
499 172 Nico Schottelius
500 193 Nico Schottelius
h4. Upgrading to 1.22.17
501 1 Nico Schottelius
502 193 Nico Schottelius
* https://v1-22.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
503 194 Nico Schottelius
* Need to create a kubeadm config map
504 198 Nico Schottelius
** f.i. using the following
505
** @/usr/local/bin/kubeadm-v1.22.17   upgrade --config kubeadm.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins,CoreDNSMigration apply -y v1.22.17@
506 193 Nico Schottelius
* Done for p6 on 2023-10-04
507
508
h4. Upgrading to 1.23.17
509
510
* https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
511
* No special notes
512
* Done for p6 on 2023-10-04
513
514
h4. Upgrading to 1.24.17
515
516
* https://v1-24.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
517
* No special notes
518
* Done for p6 on 2023-10-04
519
520
h4. Upgrading to 1.25.14
521
522
* https://v1-25.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
523
* No special notes
524
* Done for p6 on 2023-10-04
525
526
h4. Upgrading to 1.26.9
527
528 1 Nico Schottelius
* https://v1-26.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
529 193 Nico Schottelius
* No special notes
530
* Done for p6 on 2023-10-04
531 188 Nico Schottelius
532 196 Nico Schottelius
h4. Upgrading to 1.27
533 186 Nico Schottelius
534 192 Nico Schottelius
* https://v1-27.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
535 186 Nico Schottelius
* kubelet will not start anymore
536
* reason: @"command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"@
537
* /var/lib/kubelet/kubeadm-flags.env contains that parameter
538
* remove it, start kubelet
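A minimal sketch for removing the flag, assuming it appears as @--container-runtime=remote@ in the file:

<pre>
sed -i 's/--container-runtime=remote //' /var/lib/kubelet/kubeadm-flags.env
/etc/init.d/kubelet restart
</pre>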
539 192 Nico Schottelius
540 197 Nico Schottelius
h4. Upgrading to 1.28
541 192 Nico Schottelius
542
* https://v1-28.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
543 186 Nico Schottelius
544 223 Nico Schottelius
h4. Upgrading to 1.29
545
546
* Done for many clusters around 2024-01-10
547
* Unsure if it was properly released
548
* https://v1-29.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
549
550 219 Nico Schottelius
h4. Upgrading to 1.31
551
552
* The cluster needs to be updated FIRST, before kubelet/the OS
553
554
Otherwise you run into errors in the pod like this:
555
556
<pre>
557
  Warning  Failed     11s (x3 over 12s)  kubelet            Error: services have not yet been read at least once, cannot construct envvars
558
</pre>
559
560 210 Nico Schottelius
And the resulting pod state is:
561
562
<pre>
563
Init:CreateContainerConfigError
564
</pre>
565
566 224 Nico Schottelius
Fix: 
567
568
* find an old 1.30 kubelet package, downgrade kubelet, upgrade the control plane, upgrade kubelet again
569
570 225 Nico Schottelius
<pre>
571
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-1.30.0-r3.apk
572
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-openrc-1.30.0-r3.apk
573
apk add ./kubelet-1.30.0-r3.apk ./kubelet-openrc-1.30.0-r3.apk
574 226 Nico Schottelius
/etc/init.d/kubelet restart
575 225 Nico Schottelius
</pre>
576 226 Nico Schottelius
577
Then upgrade:
578
579
<pre>
580
/usr/local/bin/kubeadm-v1.31.3   upgrade apply -y v1.31.3
581
</pre>
582
583
Then re-upgrade the kubelet:
584
585
<pre>
586
apk upgrade -a
587
</pre>
588
589 225 Nico Schottelius
590 186 Nico Schottelius
h4. Upgrade to crio 1.27: missing crun
591
592
Error message
593
594
<pre>
595
level=fatal msg="validating runtime config: runtime validation: \"crun\" not found in $PATH: exec: \"crun\": executable file not found in $PATH"
596
</pre>
597 1 Nico Schottelius
598 186 Nico Schottelius
Fix:
599
600
<pre>
601
apk add crun
602
</pre>
603 223 Nico Schottelius
604 186 Nico Schottelius
605 157 Nico Schottelius
h2. Reference CNI
606
607
* Mainly "stupid", but effective plugins
608
* Main documentation on https://www.cni.dev/plugins/current/
609 158 Nico Schottelius
* Plugins (a sample config for the bridge plugin follows this list)
610
** bridge
611
*** Can create the bridge on the host
612
*** But seems not to be able to add host interfaces to it as well
613
*** Has support for vlan tags
614
** vlan
615
*** creates vlan tagged sub interface on the host
616 160 Nico Schottelius
*** "It's a 1:1 mapping (i.e. no bridge in between)":https://github.com/k8snetworkplumbingwg/multus-cni/issues/569
617 158 Nico Schottelius
** host-device
618
*** moves the interface from the host into the container
619
*** very easy for physical connections to containers
620 159 Nico Schottelius
** ipvlan
621
*** "virtualisation" of a host device
622
*** routing based on IP
623
*** Same MAC for everyone
624
*** Cannot reach the master interface
625
** macvlan
626
*** With mac addresses
627
*** Supports various modes (to be checked)
628
** ptp ("point to point")
629
*** Creates a host device and connects it to the container
630
** win*
631 158 Nico Schottelius
*** Windows implementations
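For illustration, a minimal configuration for the bridge plugin (all names, the VLAN tag and the prefix are placeholders) could look like this:

<pre>
{
  "cniVersion": "0.4.0",
  "name": "examplenet",
  "type": "bridge",
  "bridge": "br-example",
  "vlan": 100,
  "ipam": {
    "type": "host-local",
    "ranges": [
      [ { "subnet": "2001:db8:42::/64" } ]
    ]
  }
}
</pre>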
632 157 Nico Schottelius
633 62 Nico Schottelius
h2. Calico CNI
634
635
h3. Calico Installation
636
637
* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
638 228 Nico Schottelius
* Check the tags on https://github.com/projectcalico/calico/tags for the latest release
639 62 Nico Schottelius
* This has the following advantages:
640
** Easy to upgrade
641
** Does not require os to configure IPv6/dual stack settings as the tigera operator figures out things on its own
642
643
Usually plain calico can be installed directly using:
644
645 1 Nico Schottelius
<pre>
646 228 Nico Schottelius
VERSION=v3.30.0
647 149 Nico Schottelius
648 1 Nico Schottelius
helm repo add projectcalico https://docs.projectcalico.org/charts
649 167 Nico Schottelius
helm repo update
650 124 Nico Schottelius
helm upgrade --install --namespace tigera calico projectcalico/tigera-operator --version $VERSION --create-namespace
651 92 Nico Schottelius
</pre>
652 1 Nico Schottelius
653 229 Nico Schottelius
h3. Calico upgrade
654 92 Nico Schottelius
655 229 Nico Schottelius
* As of 3.30 or so, CRDs need to be applied manually beforehand
656
657
<pre>
658
VERSION=v3.30.0
659
660
kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/projectcalico/calico/${VERSION}/manifests/operator-crds.yaml
661
helm upgrade --install --namespace tigera calico projectcalico/tigera-operator --version $VERSION --create-namespace
662
</pre>
663 228 Nico Schottelius
664 62 Nico Schottelius
h3. Installing calicoctl
665
666 115 Nico Schottelius
* General installation instructions, including binary download: https://projectcalico.docs.tigera.io/maintenance/clis/calicoctl/install
667
668 62 Nico Schottelius
To be able to manage and configure calico, we need to 
669
"install calicoctl (we choose the version as a pod)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod
670
671
<pre>
672
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
673
</pre>
674
675 93 Nico Schottelius
Or version specific:
676
677
<pre>
678
kubectl apply -f https://github.com/projectcalico/calico/blob/v3.20.4/manifests/calicoctl.yaml
679 97 Nico Schottelius
680
# For 3.22
681
kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.22/manifests/calicoctl.yaml
682 93 Nico Schottelius
</pre>
683
684 70 Nico Schottelius
And making it easier accessible by alias:
685
686
<pre>
687
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
688
</pre>
689
690 62 Nico Schottelius
h3. Calico configuration
691
692 63 Nico Schottelius
By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp
693
with an upstream router to propagate podcidr and servicecidr.
694 62 Nico Schottelius
695
Default settings in our infrastructure:
696
697
* We use a full-mesh using the @nodeToNodeMeshEnabled: true@ option
698
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ecmp)
699 1 Nico Schottelius
* We use private ASNs for k8s clusters
700 63 Nico Schottelius
* We do *not* use any overlay
701 62 Nico Schottelius
702
After installing calico and calicoctl the last step of the installation is usually:
703
704 1 Nico Schottelius
<pre>
705 79 Nico Schottelius
calicoctl create -f - < calico-bgp.yaml
706 62 Nico Schottelius
</pre>
707
708
709
A sample BGP configuration:
710
711
<pre>
712
---
713
apiVersion: projectcalico.org/v3
714
kind: BGPConfiguration
715
metadata:
716
  name: default
717
spec:
718
  logSeverityScreen: Info
719
  nodeToNodeMeshEnabled: true
720
  asNumber: 65534
721
  serviceClusterIPs:
722
  - cidr: 2a0a:e5c0:10:3::/108
723
  serviceExternalIPs:
724
  - cidr: 2a0a:e5c0:10:3::/108
725
---
726
apiVersion: projectcalico.org/v3
727
kind: BGPPeer
728
metadata:
729
  name: router1-place10
730
spec:
731
  peerIP: 2a0a:e5c0:10:1::50
732
  asNumber: 213081
733
  keepOriginalNextHop: true
734
</pre>
735
736 227 Nico Schottelius
h3. Get installed calico version
737
738
* might be in calico or tigera namespace
739
740
<pre>
741
helm ls -A | grep calico
742
</pre>
743
744 126 Nico Schottelius
h2. Cilium CNI (experimental)
745
746 137 Nico Schottelius
h3. Status
747
748 138 Nico Schottelius
*NO WORKING CILIUM CONFIGURATION FOR IPV6 only modes*
749 137 Nico Schottelius
750 146 Nico Schottelius
h3. Latest error
751
752
It seems cilium does not run on IPv6 only hosts:
753
754
<pre>
755
level=info msg="Validating configured node address ranges" subsys=daemon
756
level=fatal msg="postinit failed" error="external IPv4 node address could not be derived, please configure via --ipv4-node" subsys=daemon
757
level=info msg="Starting IP identity watcher" subsys=ipcache
758
</pre>
759
760
It crashes after that log entry
761
762 128 Nico Schottelius
h3. BGP configuration
763
764
* The cilium-operator will not start without a correct configmap being present beforehand (see error message below)
765
* Creating the bgp config beforehand as a configmap is thus required.
766
767
The error one gets without the configmap present:
768
769
Pods are hanging with:
770
771
<pre>
772
cilium-bpqm6                       0/1     Init:0/4            0             9s
773
cilium-operator-5947d94f7f-5bmh2   0/1     ContainerCreating   0             9s
774
</pre>
775
776
The error message in the cilium-operator is:
777
778
<pre>
779
Events:
780
  Type     Reason       Age                From               Message
781
  ----     ------       ----               ----               -------
782
  Normal   Scheduled    80s                default-scheduler  Successfully assigned kube-system/cilium-operator-5947d94f7f-lqcsp to server56
783
  Warning  FailedMount  16s (x8 over 80s)  kubelet            MountVolume.SetUp failed for volume "bgp-config-path" : configmap "bgp-config" not found
784
</pre>
785
786
A correct bgp config looks like this:
787
788
<pre>
789
apiVersion: v1
790
kind: ConfigMap
791
metadata:
792
  name: bgp-config
793
  namespace: kube-system
794
data:
795
  config.yaml: |
796
    peers:
797
      - peer-address: 2a0a:e5c0::46
798
        peer-asn: 209898
799
        my-asn: 65533
800
      - peer-address: 2a0a:e5c0::47
801
        peer-asn: 209898
802
        my-asn: 65533
803
    address-pools:
804
      - name: default
805
        protocol: bgp
806
        addresses:
807
          - 2a0a:e5c0:0:14::/64
808
</pre>
809 127 Nico Schottelius
810
h3. Installation
811 130 Nico Schottelius
812 127 Nico Schottelius
Adding the repo
813 1 Nico Schottelius
<pre>
814 127 Nico Schottelius
815 129 Nico Schottelius
helm repo add cilium https://helm.cilium.io/
816 130 Nico Schottelius
helm repo update
817
</pre>
818 129 Nico Schottelius
819 135 Nico Schottelius
Installing + configuring cilium
820 129 Nico Schottelius
<pre>
821 130 Nico Schottelius
ipv6pool=2a0a:e5c0:0:14::/112
822 1 Nico Schottelius
823 146 Nico Schottelius
version=1.12.2
824 129 Nico Schottelius
825
helm upgrade --install cilium cilium/cilium --version $version \
826 1 Nico Schottelius
  --namespace kube-system \
827
  --set ipv4.enabled=false \
828
  --set ipv6.enabled=true \
829 146 Nico Schottelius
  --set enableIPv6Masquerade=false \
830
  --set bgpControlPlane.enabled=true 
831 1 Nico Schottelius
832 146 Nico Schottelius
#  --set ipam.operator.clusterPoolIPv6PodCIDRList=$ipv6pool
833
834
# Old style bgp?
835 136 Nico Schottelius
#   --set bgp.enabled=true --set bgp.announce.podCIDR=true \
836 127 Nico Schottelius
837
# Show possible configuration options
838
helm show values cilium/cilium
839
840 1 Nico Schottelius
</pre>
841 132 Nico Schottelius
842
Using a /64 for ipam.operator.clusterPoolIPv6PodCIDRList fails with:
843
844
<pre>
845
level=fatal msg="Unable to init cluster-pool allocator" error="unable to initialize IPv6 allocator New CIDR set failed; the node CIDR size is too big" subsys=cilium-operator-generic
846
</pre>
847
848 126 Nico Schottelius
849 1 Nico Schottelius
See also https://github.com/cilium/cilium/issues/20756
850 135 Nico Schottelius
851
A /112, however, seems to work.
852
853
h3. Kernel modules
854
855
Cilium requires the following modules to be loaded on the host (not loaded by default):
856
857
<pre>
858 1 Nico Schottelius
modprobe  ip6table_raw
859
modprobe  ip6table_filter
860
</pre>
861 146 Nico Schottelius
862
h3. Interesting helm flags
863
864
* autoDirectNodeRoutes
865
* bgpControlPlane.enabled = true
866
867
h3. SEE ALSO
868
869
* https://docs.cilium.io/en/v1.12/helm-reference/
870 133 Nico Schottelius
871 179 Nico Schottelius
h2. Multus
872 168 Nico Schottelius
873
* https://github.com/k8snetworkplumbingwg/multus-cni
874
* Installing a deployment w/ CRDs
875 150 Nico Schottelius
876 169 Nico Schottelius
<pre>
877 176 Nico Schottelius
VERSION=v4.0.1
878 169 Nico Schottelius
879 170 Nico Schottelius
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-crio.yml
880
</pre>
881 169 Nico Schottelius
882 191 Nico Schottelius
h2. ArgoCD
883 56 Nico Schottelius
884 60 Nico Schottelius
h3. Argocd Installation
885 1 Nico Schottelius
886 116 Nico Schottelius
* See https://argo-cd.readthedocs.io/en/stable/
887
888 60 Nico Schottelius
As there is no configuration management present yet, argocd is installed using
889
890 1 Nico Schottelius
<pre>
891 60 Nico Schottelius
kubectl create namespace argocd
892 1 Nico Schottelius
893
# OR: latest stable
894
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
895
896 191 Nico Schottelius
# OR Specific Version
897
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.3.2/manifests/install.yaml
898 56 Nico Schottelius
899 191 Nico Schottelius
900
</pre>
901 1 Nico Schottelius
902 60 Nico Schottelius
h3. Get the argocd credentials
903
904
<pre>
905
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
906
</pre>
907 52 Nico Schottelius
908 87 Nico Schottelius
h3. Accessing argocd
909
910
In regular IPv6 clusters:
911
912
* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN
913
914
In legacy IPv4 clusters
915
916
<pre>
917
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
918
</pre>
919
920 88 Nico Schottelius
* Navigate to https://localhost:8080
921
922 68 Nico Schottelius
h3. Using the argocd webhook to trigger changes
923 67 Nico Schottelius
924
* To trigger changes, POST JSON to https://argocd.example.com/api/webhook (see the sketch below)
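A rough, untested sketch of triggering it manually with curl; the exact headers and payload fields depend on the git provider (this assumes a Gitea-style push event):

<pre>
curl -X POST https://argocd.example.com/api/webhook \
  -H 'Content-Type: application/json' \
  -H 'X-Gitea-Event: push' \
  -d '{"ref": "refs/heads/master", "repository": {"html_url": "https://code.ungleich.ch/ungleich-intern/k8s-config"}}'
</pre>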
925
926 72 Nico Schottelius
h3. Deploying an application
927
928
* Applications are deployed via git towards gitea (code.ungleich.ch) and then pulled by argo
929 73 Nico Schottelius
* Always include the *redmine-url* pointing to the (customer) ticket
930
** Also add the support-url if it exists
931 72 Nico Schottelius
932
Application sample
933
934
<pre>
935
apiVersion: argoproj.io/v1alpha1
936
kind: Application
937
metadata:
938
  name: gitea-CUSTOMER
939
  namespace: argocd
940
spec:
941
  destination:
942
    namespace: default
943
    server: 'https://kubernetes.default.svc'
944
  source:
945
    path: apps/prod/gitea
946
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
947
    targetRevision: HEAD
948
    helm:
949
      parameters:
950
        - name: storage.data.storageClass
951
          value: rook-ceph-block-hdd
952
        - name: storage.data.size
953
          value: 200Gi
954
        - name: storage.db.storageClass
955
          value: rook-ceph-block-ssd
956
        - name: storage.db.size
957
          value: 10Gi
958
        - name: storage.letsencrypt.storageClass
959
          value: rook-ceph-block-hdd
960
        - name: storage.letsencrypt.size
961
          value: 50Mi
962
        - name: letsencryptStaging
963
          value: 'no'
964
        - name: fqdn
965
          value: 'code.verua.online'
966
  project: default
967
  syncPolicy:
968
    automated:
969
      prune: true
970
      selfHeal: true
971
  info:
972
    - name: 'redmine-url'
973
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
974
    - name: 'support-url'
975
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
976
</pre>
977
978 80 Nico Schottelius
h2. Helm related operations and conventions
979 55 Nico Schottelius
980 61 Nico Schottelius
We use helm charts extensively.
981
982
* In production, they are managed via argocd
983
* In development, helm charts can be developed and deployed manually using the helm utility.
984
985 55 Nico Schottelius
h3. Installing a helm chart
986
987
One can use the usual pattern of
988
989
<pre>
990
helm install <releasename> <chartdirectory>
991
</pre>
992
993
However, when testing helm charts you often want to reinstall/update. The following pattern is "better", because it also works if the release is already installed:
994
995
<pre>
996
helm upgrade --install <releasename> <chartdirectory>
997 1 Nico Schottelius
</pre>
998 80 Nico Schottelius
999
h3. Naming services and deployments in helm charts [Application labels]
1000
1001
* We always have {{ .Release.Name }} to identify the current "instance"
1002
* Deployments (see the sketch after this list):
1003
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ...
1004 81 Nico Schottelius
* See more about standard labels on
1005
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
1006
** https://helm.sh/docs/chart_best_practices/labels/
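A minimal sketch of how these conventions look in a deployment template (names are examples):

<pre>
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
</pre>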
1007 55 Nico Schottelius
1008 151 Nico Schottelius
h3. Show all versions of a helm chart
1009
1010
<pre>
1011
helm search repo -l repo/chart
1012
</pre>
1013
1014
For example:
1015
1016
<pre>
1017
% helm search repo -l projectcalico/tigera-operator 
1018
NAME                         	CHART VERSION	APP VERSION	DESCRIPTION                            
1019
projectcalico/tigera-operator	v3.23.3      	v3.23.3    	Installs the Tigera operator for Calico
1020
projectcalico/tigera-operator	v3.23.2      	v3.23.2    	Installs the Tigera operator for Calico
1021
....
1022
</pre>
1023
1024 152 Nico Schottelius
h3. Show possible values of a chart
1025
1026
<pre>
1027
helm show values <repo/chart>
1028
</pre>
1029
1030
Example:
1031
1032
<pre>
1033
helm show values ingress-nginx/ingress-nginx
1034
</pre>
1035
1036 207 Nico Schottelius
h3. Show all possible charts in a repo
1037
1038
<pre>
1039
helm search repo REPO
1040
</pre>
1041
1042 178 Nico Schottelius
h3. Download a chart
1043
1044
For instance, to check a chart out locally, use:
1045
1046
<pre>
1047
helm pull <repo/chart>
1048
</pre>
1049 152 Nico Schottelius
1050 139 Nico Schottelius
h2. Rook + Ceph
1051
1052
h3. Installation
1053
1054
* Usually directly via argocd
1055
1056 71 Nico Schottelius
h3. Executing ceph commands
1057
1058
Using the ceph-tools pod as follows:
1059
1060
<pre>
1061
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
1062
</pre>
1063
1064 43 Nico Schottelius
h3. Inspecting the logs of a specific server
1065
1066
<pre>
1067
# Get the related pods
1068
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare 
1069
...
1070
1071
# Inspect the logs of a specific pod
1072
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx
1073
1074 71 Nico Schottelius
</pre>
1075
1076
h3. Inspecting the logs of the rook-ceph-operator
1077
1078
<pre>
1079
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
1080 43 Nico Schottelius
</pre>
1081
1082 200 Nico Schottelius
h3. (Temporarily) Disabling the rook-operator
1083
1084
* first disabling the sync in argocd
1085
* then scale it down
1086
1087
<pre>
1088
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
1089
</pre>
1090
1091
When done with the work/maintenance, re-enable sync in argocd.
1092
The following command is thus strictly speaking not required, as argocd will fix it on its own:
1093
1094
<pre>
1095
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
1096
</pre>
1097
1098 121 Nico Schottelius
h3. Restarting the rook operator
1099
1100
<pre>
1101
kubectl -n rook-ceph delete pods  -l app=rook-ceph-operator
1102
</pre>
1103
1104 43 Nico Schottelius
h3. Triggering server prepare / adding new osds
1105
1106
The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re-scan", simply delete that pod:
1107
1108
<pre>
1109
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
1110
</pre>
1111
1112
This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.
1113
1114
h3. Removing an OSD
1115
1116
* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html
1117 77 Nico Schottelius
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
1118 99 Nico Schottelius
* Then delete the related deployment
1119 41 Nico Schottelius
1120 98 Nico Schottelius
Set the OSD id in osd-purge.yaml and apply it. The OSD should be down before doing so.
1121
1122
<pre>
1123
apiVersion: batch/v1
1124
kind: Job
1125
metadata:
1126
  name: rook-ceph-purge-osd
1127
  namespace: rook-ceph # namespace:cluster
1128
  labels:
1129
    app: rook-ceph-purge-osd
1130
spec:
1131
  template:
1132
    metadata:
1133
      labels:
1134
        app: rook-ceph-purge-osd
1135
    spec:
1136
      serviceAccountName: rook-ceph-purge-osd
1137
      containers:
1138
        - name: osd-removal
1139
          image: rook/ceph:master
1140
          # TODO: Insert the OSD ID in the last parameter that is to be removed
1141
          # The OSD IDs are a comma-separated list. For example: "0" or "0,2".
1142
          # If you want to preserve the OSD PVCs, set `--preserve-pvc true`.
1143
          #
1144
          # A --force-osd-removal option is available if the OSD should be destroyed even though the
1145
          # removal could lead to data loss.
1146
          args:
1147
            - "ceph"
1148
            - "osd"
1149
            - "remove"
1150
            - "--preserve-pvc"
1151
            - "false"
1152
            - "--force-osd-removal"
1153
            - "false"
1154
            - "--osd-ids"
1155
            - "SETTHEOSDIDHERE"
1156
          env:
1157
            - name: POD_NAMESPACE
1158
              valueFrom:
1159
                fieldRef:
1160
                  fieldPath: metadata.namespace
1161
            - name: ROOK_MON_ENDPOINTS
1162
              valueFrom:
1163
                configMapKeyRef:
1164
                  key: data
1165
                  name: rook-ceph-mon-endpoints
1166
            - name: ROOK_CEPH_USERNAME
1167
              valueFrom:
1168
                secretKeyRef:
1169
                  key: ceph-username
1170
                  name: rook-ceph-mon
1171
            - name: ROOK_CEPH_SECRET
1172
              valueFrom:
1173
                secretKeyRef:
1174
                  key: ceph-secret
1175
                  name: rook-ceph-mon
1176
            - name: ROOK_CONFIG_DIR
1177
              value: /var/lib/rook
1178
            - name: ROOK_CEPH_CONFIG_OVERRIDE
1179
              value: /etc/rook/config/override.conf
1180
            - name: ROOK_FSID
1181
              valueFrom:
1182
                secretKeyRef:
1183
                  key: fsid
1184
                  name: rook-ceph-mon
1185
            - name: ROOK_LOG_LEVEL
1186
              value: DEBUG
1187
          volumeMounts:
1188
            - mountPath: /etc/ceph
1189
              name: ceph-conf-emptydir
1190
            - mountPath: /var/lib/rook
1191
              name: rook-config
1192
      volumes:
1193
        - emptyDir: {}
1194
          name: ceph-conf-emptydir
1195
        - emptyDir: {}
1196
          name: rook-config
1197
      restartPolicy: Never
1198
1199
1200 99 Nico Schottelius
</pre>
1201
1202 1 Nico Schottelius
Deleting the deployment:
1203
1204
<pre>
1205
[18:05] bridge:~% kubectl -n rook-ceph delete deployment rook-ceph-osd-6
1206 99 Nico Schottelius
deployment.apps "rook-ceph-osd-6" deleted
1207
</pre>
1208 185 Nico Schottelius
1209
h3. Placement of mons/osds/etc.
1210
1211
See https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#placement-configuration-settings
1212 98 Nico Schottelius
1213 215 Nico Schottelius
h3. Setting up and managing S3 object storage
1214
1215 217 Nico Schottelius
h4. Endpoints
1216
1217
| Location | Endpoint |
1218
| p5 | https://s3.k8s.place5.ungleich.ch |
1219
1220
1221 215 Nico Schottelius
h4. Setting up a storage class
1222
1223
* This will store the buckets of a specific customer
1224
1225
Similar to this:
1226
1227
<pre>
1228
apiVersion: storage.k8s.io/v1
1229
kind: StorageClass
1230
metadata:
1231
  name: ungleich-archive-bucket-sc
1232
  namespace: rook-ceph
1233
provisioner: rook-ceph.ceph.rook.io/bucket
1234
reclaimPolicy: Delete
1235
parameters:
1236
  objectStoreName: place5
1237
  objectStoreNamespace: rook-ceph
1238
</pre>
1239
1240
h4. Setting up the Bucket
1241
1242
Similar to this:
1243
1244
<pre>
1245
apiVersion: objectbucket.io/v1alpha1
1246
kind: ObjectBucketClaim
1247
metadata:
1248
  name: ungleich-archive-bucket-claim
1249
  namespace: rook-ceph
1250
spec:
1251
  generateBucketName: ungleich-archive-ceph-bkt
1252
  storageClassName: ungleich-archive-bucket-sc
1253
  additionalConfig:
1254
    # To set for quota for OBC
1255
    #maxObjects: "1000"
1256
    maxSize: "100G"
1257
</pre>
1258
1259
* See also: https://rook.io/docs/rook/latest-release/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/#obc-custom-resource
1260
1261
h4. Getting the credentials for the bucket
1262
1263
* Get "public" information from the configmap
1264
* Get secret from the secret
1265
1266 216 Nico Schottelius
<pre>
1267 1 Nico Schottelius
name=BUCKETNAME
1268 221 Nico Schottelius
s3host=s3.k8s.place5.ungleich.ch
1269
endpoint=https://${s3host}
1270 1 Nico Schottelius
1271
cm=$(kubectl -n rook-ceph get configmap -o yaml ${name}-bucket-claim)
1272 217 Nico Schottelius
1273 1 Nico Schottelius
sec=$(kubectl -n rook-ceph get secrets -o yaml ${name}-bucket-claim)
1274 222 Nico Schottelius
export AWS_ACCESS_KEY_ID=$(echo $sec | yq .data.AWS_ACCESS_KEY_ID | base64 -d ; echo "")
1275
export AWS_SECRET_ACCESS_KEY=$(echo $sec | yq .data.AWS_SECRET_ACCESS_KEY | base64 -d ; echo "")
1276 1 Nico Schottelius
1277 217 Nico Schottelius
1278 216 Nico Schottelius
bucket_name=$(echo $cm | yq .data.BUCKET_NAME)
1279 1 Nico Schottelius
</pre>
1280 217 Nico Schottelius
1281 220 Nico Schottelius
h5. Access via s3cmd
1282 1 Nico Schottelius
1283 221 Nico Schottelius
Note: the following does *NOT* work:
1284
1285 220 Nico Schottelius
<pre>
1286 221 Nico Schottelius
s3cmd --host ${s3host}:443 --access_key=${AWS_ACCESS_KEY_ID} --secret_key=${AWS_SECRET_ACCESS_KEY} ls s3://${name}
1287 220 Nico Schottelius
</pre>
1288
1289 217 Nico Schottelius
h5. Access via s4cmd
1290
1291
<pre>
1292 1 Nico Schottelius
s4cmd --endpoint-url ${endpoint} --access-key=${AWS_ACCESS_KEY_ID} --secret-key=${AWS_SECRET_ACCESS_KEY} ls
1293
</pre>
1294 221 Nico Schottelius
1295
h5. Access via s5cmd
1296
1297
* Uses environment variables
1298
1299
<pre>
1300
s5cmd --endpoint-url ${endpoint} ls
1301
</pre>
1302 215 Nico Schottelius
1303 145 Nico Schottelius
h2. Ingress + Cert Manager
1304
1305
* We deploy "nginx-ingress":https://docs.nginx.com/nginx-ingress-controller/ to get an ingress
1306
* we deploy "cert-manager":https://cert-manager.io/ to handle certificates
1307
* We independently deploy @ClusterIssuer@ to allow the cert-manager app to deploy and the issuer to be created once the CRDs from cert manager are in place
1308
1309
h3. IPv4 reachability 
1310
1311
The ingress is by default IPv6 only. To make it reachable from the IPv4 world, get its IPv6 address and configure a NAT64 mapping in Jool.
1312
1313
Steps:
1314
1315
h4. Get the ingress IPv6 address
1316
1317
Use @kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''@
1318
1319
Example:
1320
1321
<pre>
1322
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''
1323
2a0a:e5c0:10:1b::ce11
1324
</pre>
1325
1326
h4. Add NAT64 mapping
1327
1328
* Update the __dcl_jool_siit cdist type
1329
* Record the two IPs (IPv6 and IPv4)
1330
* Configure all routers
1331
1332
1333
h4. Add DNS record
1334
1335
To make the ingress usable as a CNAME destination, create an "ingress" DNS record, such as:
1336
1337
<pre>
1338
; k8s ingress for dev
1339
dev-ingress                 AAAA 2a0a:e5c0:10:1b::ce11
1340
dev-ingress                 A 147.78.194.23
1341
1342
</pre> 
1343
1344
h4. Add supporting wildcard DNS
1345
1346
If you plan to add various sites under a specific domain, we can add a wildcard DNS entry, such as *.k8s-dev.django-hosting.ch:
1347
1348
<pre>
1349
*.k8s-dev         CNAME dev-ingress.ungleich.ch.
1350
</pre>
1351
1352 76 Nico Schottelius
h2. Harbor
1353
1354 175 Nico Schottelius
* We use "Harbor":https://goharbor.io/ as an image registry for our own images. Internal app reference: apps/prod/harbor.
1355
* The admin password is in the password store, it is Harbor12345 by default
1356 76 Nico Schottelius
* At the moment harbor only authenticates against the internal ldap tree
1357
1358
h3. LDAP configuration
1359
1360
* The url needs to be ldaps://...
1361
* uid = uid
1362
* rest standard
1363 75 Nico Schottelius
1364 89 Nico Schottelius
h2. Monitoring / Prometheus
1365
1366 90 Nico Schottelius
* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/
1367 89 Nico Schottelius
1368 91 Nico Schottelius
Access via the following in-cluster URLs (a port-forward sketch follows the list):
1369
1370
* http://prometheus-k8s.monitoring.svc:9090
1371
* http://grafana.monitoring.svc:3000
1372
* http://alertmanager.monitoring.svc:9093
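From outside the cluster network, a port-forward can be used instead; a sketch for grafana, assuming the service name and port listed above:

<pre>
kubectl -n monitoring port-forward svc/grafana 3000:3000
</pre>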
1373
1374
1375 100 Nico Schottelius
h3. Prometheus Options
1376
1377
* "helm/kube-prometheus-stack":https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
1378
** Includes dashboards and co.
1379
* "manifest based kube-prometheus":https://github.com/prometheus-operator/kube-prometheus
1380
** Includes dashboards and co.
1381
* "Prometheus Operator (mainly CRD manifest":https://github.com/prometheus-operator/prometheus-operator
1382
1383 171 Nico Schottelius
h3. Grafana default password
1384
1385 218 Nico Schottelius
* If not changed: admin / @prom-operator@
1386
** Can be changed via:
1387
1388
<pre>
1389
    helm:
1390
      values: |-
1391
        configurations: |-
1392
          grafana:
1393
            adminPassword: "..."
1394
</pre>
1395 171 Nico Schottelius
1396 82 Nico Schottelius
h2. Nextcloud
1397
1398 85 Nico Schottelius
h3. How to get the nextcloud credentials 
1399 84 Nico Schottelius
1400
* The initial username is set to "nextcloud"
1401
* The password is autogenerated and saved in a kubernetes secret
1402
1403
<pre>
1404 85 Nico Schottelius
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo "" 
1405 84 Nico Schottelius
</pre>
1406
1407 83 Nico Schottelius
h3. How to fix "Access through untrusted domain"
1408
1409 82 Nico Schottelius
* Nextcloud stores the initial domain configuration
1410 1 Nico Schottelius
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
1411 82 Nico Schottelius
* To fix, edit /var/www/html/config/config.php and correct the domain (see the snippet after this list)
1412 1 Nico Schottelius
* Then delete the pods
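The relevant part of config.php is the @trusted_domains@ array; a sketch with a placeholder FQDN:

<pre>
'trusted_domains' =>
  array (
    0 => 'nextcloud.example.com',
  ),
</pre>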
1413 165 Nico Schottelius
1414
h3. Running occ commands inside the nextcloud container
1415
1416
* Find the pod in the right namespace
1417
1418
Exec:
1419
1420
<pre>
1421
su www-data -s /bin/sh -c ./occ
1422
</pre>
1423
1424
* -s /bin/sh is needed as the default shell is set to /bin/false
1425
1426 166 Nico Schottelius
h4. Rescanning files
1427 165 Nico Schottelius
1428 166 Nico Schottelius
* If files have been added without nextcloud's knowledge
1429
1430
<pre>
1431
su www-data -s /bin/sh -c "./occ files:scan --all"
1432
</pre>
1433 82 Nico Schottelius
1434 201 Nico Schottelius
h2. Sealed Secrets
1435
1436 202 Jin-Guk Kwon
* install kubeseal
1437 1 Nico Schottelius
1438 202 Jin-Guk Kwon
<pre>
1439
KUBESEAL_VERSION='0.23.0'
1440
wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz" 
1441
tar -xvzf kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz kubeseal
1442
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
1443
</pre>
1444
1445
* fetch the public certificate for sealed-secrets
1446
1447
<pre>
1448
kubeseal --fetch-cert > /tmp/public-key-cert.pem
1449
</pre>
1450
1451
* create the secret
1452
1453
<pre>
1454 203 Jin-Guk Kwon
ex)
1455 202 Jin-Guk Kwon
apiVersion: v1
1456
kind: Secret
1457
metadata:
1458
  name: Release.Name-postgres-config
1459
  annotations:
1460
    secret-generator.v1.mittwald.de/autogenerate: POSTGRES_PASSWORD
1461
    hosting: Release.Name
1462
  labels:
1463
    app.kubernetes.io/instance: Release.Name
1464
    app.kubernetes.io/component: postgres
1465
stringData:
1466
  POSTGRES_USER: postgresUser
1467
  POSTGRES_DB: postgresDBName
1468
  POSTGRES_INITDB_ARGS: "--no-locale --encoding=UTF8"
1469
</pre>
1470
1471
* convert secret.yaml to sealed-secret.yaml
1472
1473
<pre>
1474
kubeseal -n <namespace> --cert=/tmp/public-key-cert.pem --format=yaml < ./secret.yaml  > ./sealed-secret.yaml
1475
</pre>
1476
1477
* use the sealed-secret.yaml in the helm chart directory
1478 201 Nico Schottelius
1479 205 Jin-Guk Kwon
* See tickets #11989 and #12120
1480 204 Jin-Guk Kwon
1481 1 Nico Schottelius
h2. Infrastructure versions
1482 35 Nico Schottelius
1483 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v5 (2021-10)
1484 1 Nico Schottelius
1485 57 Nico Schottelius
Clusters are configured / setup in this order:
1486
1487
* Bootstrap via kubeadm
1488 59 Nico Schottelius
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
1489
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
1490
** "rook for storage via argocd":https://rook.io/
1491 58 Nico Schottelius
** haproxy as an IPv4-to-IPv6 proxy within the IPv6-only cluster, via argocd
1492
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
1493
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1494
1495 57 Nico Schottelius
1496
h3. ungleich kubernetes infrastructure v4 (2021-09)
1497
1498 54 Nico Schottelius
* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
1499 1 Nico Schottelius
* The rook operator is still being installed via helm
1500 35 Nico Schottelius
1501 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v3 (2021-07)
1502 1 Nico Schottelius
1503 10 Nico Schottelius
* rook is now installed via helm via argocd instead of directly via manifests
1504 28 Nico Schottelius
1505 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v2 (2021-05)
1506 28 Nico Schottelius
1507
* Replaced fluxv2 from ungleich k8s v1 with argocd
1508 1 Nico Schottelius
** argocd can apply helm templates directly without needing to go through Chart releases
1509 28 Nico Schottelius
* We are also using argoflow for build flows
1510
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building
1511
1512 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v1 (2021-01)
1513 28 Nico Schottelius
1514
We are using the following components:
1515
1516
* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
1517
** Needed for basic networking
1518
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
1519
** Needed so that secrets are not stored in the git repository, but only in the cluster
1520
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1521
** Needed to get letsencrypt certificates for services
1522
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
1523
** rbd for almost everything, *ReadWriteOnce*
1524
** cephfs for smaller things, multi access *ReadWriteMany*
1525
** Needed for providing persistent storage
1526
* "flux v2":https://fluxcd.io/
1527
** Needed to manage resources automatically