h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual
2 1 Nico Schottelius
3 3 Nico Schottelius
{{toc}}
4
5 1 Nico Schottelius
h2. Status
6
7 211 Nico Schottelius
This document is **production**.
8
This document is the ungleich kubernetes infrastructure overview as well as the ungleich kubernetes manual.
9 1 Nico Schottelius
10 10 Nico Schottelius
h2. k8s clusters
11
12 123 Nico Schottelius
| Cluster            | Purpose/Setup     | Maintainer | Master(s)                     | argo                                                   | v4 http proxy | last verified |
13
| c0.k8s.ooo         | Dev               | -          | UNUSED                        |                                                        |               |    2021-10-05 |
14
| c1.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
15
| c2.k8s.ooo         | Dev p7 HW         | Nico       | server47 server53 server54    | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo     |               |    2021-10-05 |
16
| c3.k8s.ooo         | retired           | -          | -                             |                                                        |               |    2021-10-05 |
17
| c4.k8s.ooo         | Dev2 p7 HW        | Jin-Guk    | server52 server53 server54    |                                                        |               |             - |
18
| c5.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
19
| c6.k8s.ooo         | Dev p6 VM Jin-Guk | Jin-Guk    |                               |                                                        |               |               |
20
| [[p5.k8s.ooo]]     | production        |            | server34 server36 server38    | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo     | -             |               |
21
| [[p5-cow.k8s.ooo]] | production        | Nico       | server47 server51 server55    | "argo":https://argocd-server.argocd.svc.p5-cow.k8s.ooo |               |    2022-08-27 |
22
| [[p6.k8s.ooo]]     | production        |            | server67 server69 server71    | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo     | 147.78.194.13 |    2021-10-05 |
23 184 Nico Schottelius
| [[p6-cow.k8s.ooo]] | production        |            | server134 server135 server136 | "argo":https://argocd-server.argocd.svc.p6in10.k8s.ooo | ?             |    2023-05-17 |
24 177 Nico Schottelius
| [[p10.k8s.ooo]]    | production        |            | server131 server132 server133 | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo    | 147.78.194.12 |    2021-10-05 |
25 123 Nico Schottelius
| [[k8s.ge.nau.so]]  | development       |            | server107 server108 server109 | "argo":https://argocd-server.argocd.svc.k8s.ge.nau.so  |               |               |
26
| [[dev.k8s.ooo]]    | development       |            | server110 server111 server112 | "argo":https://argocd-server.argocd.svc.dev.k8s.ooo    | -             |    2022-07-08 |
27 164 Nico Schottelius
| [[r1r2p15k8sooo|r1.p15.k8s.ooo]] | production | Nico | server120 | | | 2022-10-30 |
28
| [[r1r2p15k8sooo|r2.p15.k8s.ooo]] | production | Nico | server121 | | | 2022-09-06 |
29 162 Nico Schottelius
| [[r1r2p10k8sooo|r1.p10.k8s.ooo]] | production | Nico | server122 | | | 2022-10-30 |
30
| [[r1r2p10k8sooo|r2.p10.k8s.ooo]] | production | Nico | server123 | | | 2022-10-15 |
31
| [[r1r2p5k8sooo|r1.p5.k8s.ooo]] | production | Nico | server137 | | | 2022-10-30 |
32
| [[r1r2p5k8sooo|r2.p5.k8s.ooo]] | production | Nico | server138 | | | 2022-10-30 |
33
| [[r1r2p6k8sooo|r1.p6.k8s.ooo]] | production | Nico | server139 | | | 2022-10-30 |
34
| [[r1r2p6k8sooo|r2.p6.k8s.ooo]] | production | Nico | server140 | | | 2022-10-30 |
35 21 Nico Schottelius
36 1 Nico Schottelius
h2. General architecture and components overview
37
38
* All k8s clusters are IPv6 only
39
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
40
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
41 18 Nico Schottelius
** Private configurations are found in the **k8s-config** repository
42 1 Nico Schottelius
43
h3. Cluster types
44
45 28 Nico Schottelius
| **Type/Feature**            | **Development**                | **Production**         |
46
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
47
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
48
| Separation of control plane | optional                       | recommended            |
49
| Persistent storage          | required                       | required               |
50
| Number of storage monitors  | 3                              | 5                      |
51 1 Nico Schottelius
52 43 Nico Schottelius
h2. General k8s operations
53 1 Nico Schottelius
54 46 Nico Schottelius
h3. Cheat sheet / external great references
55
56
* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/
57
58 214 Nico Schottelius
Some examples:
59
60
h4. Use kubectl to print only the node names
61
62
<pre>
63
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
64
</pre>
65
66
Can easily be used in a shell loop like this:
67
68
<pre>
69
for host in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do echo $host; ssh root@${host} uptime; done
70
</pre>
71
72 117 Nico Schottelius
h3. Allowing work to be scheduled on the control plane / removing node taints
73 69 Nico Schottelius
74
* Mostly for single node / test / development clusters
75
* Just remove the master taint as follows
76
77
<pre>
78
kubectl taint nodes --all node-role.kubernetes.io/master-
79 118 Nico Schottelius
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
80 69 Nico Schottelius
</pre>
81 1 Nico Schottelius
82 117 Nico Schottelius
You can check the node taints using @kubectl describe node ...@
83 69 Nico Schottelius
84 208 Nico Schottelius
h3. Adding taints
85
86
* For instance, to limit nodes to specific customers (see the toleration sketch below)
87
88
<pre>
89
kubectl taint nodes serverXX customer=CUSTOMERNAME:NoSchedule
90
</pre>
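
Pods that should still be scheduled on such a tainted node need a matching toleration (usually combined with a node label/nodeSelector). A minimal sketch, reusing the taint from above:

<pre>
apiVersion: v1
kind: Pod
...
spec:
  tolerations:
    - key: "customer"
      operator: "Equal"
      value: "CUSTOMERNAME"
      effect: "NoSchedule"
</pre>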
91
92 44 Nico Schottelius
h3. Get the cluster admin.conf
93
94
* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
95
* To be able to administrate the cluster you can copy the admin.conf to your local machine
96
* Multi-cluster debugging becomes very easy if you name the config ~/cX-admin.conf (see example below)
97
98
<pre>
99
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
100
% export KUBECONFIG=~/c2-admin.conf    
101
% kubectl get nodes
102
NAME       STATUS                     ROLES                  AGE   VERSION
103
server47   Ready                      control-plane,master   82d   v1.22.0
104
server48   Ready                      control-plane,master   82d   v1.22.0
105
server49   Ready                      <none>                 82d   v1.22.0
106
server50   Ready                      <none>                 82d   v1.22.0
107
server59   Ready                      control-plane,master   82d   v1.22.0
108
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
109
server61   Ready                      <none>                 82d   v1.22.0
110
server62   Ready                      <none>                 82d   v1.22.0               
111
</pre>
112
113 18 Nico Schottelius
h3. Installing a new k8s cluster
114 8 Nico Schottelius
115 9 Nico Schottelius
* Decide on the cluster name (usually *cX.k8s.ooo*), X counting upwards
116 28 Nico Schottelius
** Using pXX.k8s.ooo for production clusters of placeXX
117 9 Nico Schottelius
* Use cdist to configure the nodes with requirements like crio
118
* Decide between single or multi node control plane setups (see below)
119 28 Nico Schottelius
** Single control plane suitable for development clusters
120 9 Nico Schottelius
121 28 Nico Schottelius
Typical init procedure:
122 9 Nico Schottelius
123 206 Nico Schottelius
h4. Single control plane:
124
125
<pre>
126
kubeadm init --config bootstrap/XXX/kubeadm.yaml
127
</pre>
128
129
h4. Multi control plane (HA):
130
131
<pre>
132
kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs
133
</pre>
134
135 10 Nico Schottelius
136 29 Nico Schottelius
h3. Deleting a pod that is hanging in terminating state
137
138
<pre>
139
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
140
</pre>
141
142
(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)
143
144 42 Nico Schottelius
h3. Listing nodes of a cluster
145
146
<pre>
147
[15:05] bridge:~% kubectl get nodes
148
NAME       STATUS   ROLES                  AGE   VERSION
149
server22   Ready    <none>                 52d   v1.22.0
150
server23   Ready    <none>                 52d   v1.22.2
151
server24   Ready    <none>                 52d   v1.22.0
152
server25   Ready    <none>                 52d   v1.22.0
153
server26   Ready    <none>                 52d   v1.22.0
154
server27   Ready    <none>                 52d   v1.22.0
155
server63   Ready    control-plane,master   52d   v1.22.0
156
server64   Ready    <none>                 52d   v1.22.0
157
server65   Ready    control-plane,master   52d   v1.22.0
158
server66   Ready    <none>                 52d   v1.22.0
159
server83   Ready    control-plane,master   52d   v1.22.0
160
server84   Ready    <none>                 52d   v1.22.0
161
server85   Ready    <none>                 52d   v1.22.0
162
server86   Ready    <none>                 52d   v1.22.0
163
</pre>
164
165 41 Nico Schottelius
h3. Removing / draining a node
166
167
Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:
168
169 1 Nico Schottelius
<pre>
170 103 Nico Schottelius
kubectl drain --delete-emptydir-data --ignore-daemonsets serverXX
171 42 Nico Schottelius
</pre>
172
173
h3. Readding a node after draining
174
175
<pre>
176
kubectl uncordon serverXX
177 1 Nico Schottelius
</pre>
178 43 Nico Schottelius
179 50 Nico Schottelius
h3. (Re-)joining worker nodes after creating the cluster
180 49 Nico Schottelius
181
* We need to have an up-to-date token
182
* We use different join commands for the workers and control plane nodes
183
184
Generating the join command on an existing control plane node:
185
186
<pre>
187
kubeadm token create --print-join-command
188
</pre>
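
The printed command is then executed on the worker node that should (re-)join; it looks roughly like this (token and hash are placeholders):

<pre>
kubeadm join p10-api.k8s.ooo:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH
</pre>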
189
190 50 Nico Schottelius
h3. (Re-)joining control plane nodes after creating the cluster
191 1 Nico Schottelius
192 50 Nico Schottelius
* We generate the token again
193
* We upload the certificates
194
* We need to combine/create the join command for the control plane node
195
196
Example session:
197
198
<pre>
199
% kubeadm token create --print-join-command
200
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash 
201
202
% kubeadm init phase upload-certs --upload-certs
203
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
204
[upload-certs] Using certificate key:
205
CERTKEY
206
207
# Then we use these two outputs on the joining node:
208
209
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
210
</pre>
211
212
Commands to be used on a control plane node:
213
214
<pre>
215
kubeadm token create --print-join-command
216
kubeadm init phase upload-certs --upload-certs
217
</pre>
218
219
Commands to be used on the joining node:
220
221
<pre>
222
JOINCOMMAND --control-plane --certificate-key CERTKEY
223
</pre>
224 49 Nico Schottelius
225 51 Nico Schottelius
SEE ALSO
226
227
* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
228
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/
229
230 53 Nico Schottelius
h3. How to fix etcd not starting when rejoining a kubernetes cluster as a control plane
231 52 Nico Schottelius
232
If during the above step etcd does not come up, @kubeadm join@ can hang as follows:
233
234
<pre>
235
[control-plane] Creating static Pod manifest for "kube-apiserver"                                                              
236
[control-plane] Creating static Pod manifest for "kube-controller-manager"                                                     
237
[control-plane] Creating static Pod manifest for "kube-scheduler"                                                              
238
[check-etcd] Checking that the etcd cluster is healthy                                                                         
239
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:37
240
8a]:2379 with maintenance client: context deadline exceeded                                                                    
241
To see the stack trace of this error execute with --v=5 or higher         
242
</pre>
243
244
Then the problem is likely that the etcd server is still a member of the cluster. We first need to remove it from the etcd cluster and then the join works.
245
246
To fix this we do:
247
248
* Find a working etcd pod
249
* Find the etcd members / member list
250
* Remove the etcd member that we want to re-join the cluster
251
252
253
<pre>
254
# Find the etcd pods
255
kubectl -n kube-system get pods -l component=etcd,tier=control-plane
256
257
# Get the list of etcd servers with the member id 
258
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
259
260
# Remove the member
261
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
262
</pre>
263
264
Sample session:
265
266
<pre>
267
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
268
NAME            READY   STATUS    RESTARTS     AGE
269
etcd-server63   1/1     Running   0            3m11s
270
etcd-server65   1/1     Running   3            7d2h
271
etcd-server83   1/1     Running   8 (6d ago)   7d2h
272
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
273
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
274
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
275
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false
276
277
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
278
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77
279 1 Nico Schottelius
280
</pre>
281
282
SEE ALSO
283
284
* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster
285 56 Nico Schottelius
286 213 Nico Schottelius
h4. Updating the members
287
288
1) get alive member
289
290
<pre>
291
% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
292
NAME            READY   STATUS    RESTARTS   AGE
293
etcd-server67   1/1     Running   1          185d
294
etcd-server69   1/1     Running   1          185d
295
etcd-server71   1/1     Running   2          185d
296
[20:57] sun:~% 
297
</pre>
298
299
2) get member list
300
301
* in this case via crictl, as the api does not work correctly anymore
302
303
<pre>
304
305
306
</pre>
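
A possible approach via crictl on a control plane node (a sketch; it assumes etcdctl is available inside the etcd container and that the certificates are in the usual kubeadm locations):

<pre>
# Find the etcd container
crictl ps --name etcd

# Run etcdctl inside it
crictl exec -it CONTAINERID etcdctl --endpoints '[::1]:2379' \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key member list
</pre>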
307
308
309
3) update
310
311
<pre>
312
etcdctl member update MEMBERID  --peer-urls=https://[...]:2380
313
314
315
</pre>
316
317 147 Nico Schottelius
h3. Node labels (adding, showing, removing)
318
319
Listing the labels:
320
321
<pre>
322
kubectl get nodes --show-labels
323
</pre>
324
325
Adding labels:
326
327
<pre>
328
kubectl label nodes LIST-OF-NODES label1=value1 
329
330
</pre>
331
332
For instance:
333
334
<pre>
335
kubectl label nodes router2 router3 hosttype=router 
336
</pre>
337
338
Selecting nodes in pods:
339
340
<pre>
341
apiVersion: v1
342
kind: Pod
343
...
344
spec:
345
  nodeSelector:
346
    hosttype: router
347
</pre>
348
349 148 Nico Schottelius
Removing labels by adding a minus at the end of the label name:
350
351
<pre>
352
kubectl label node <nodename> <labelname>-
353
</pre>
354
355
For instance:
356
357
<pre>
358
kubectl label nodes router2 router3 hosttype- 
359
</pre>
360
361 147 Nico Schottelius
SEE ALSO
362 1 Nico Schottelius
363 148 Nico Schottelius
* https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
364
* https://stackoverflow.com/questions/34067979/how-to-delete-a-node-label-by-command-and-api
365 147 Nico Schottelius
366 199 Nico Schottelius
h3. Listing all pods on a node
367
368
<pre>
369
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=serverXX
370
</pre>
371
372
Found on https://stackoverflow.com/questions/62000559/how-to-list-all-the-pods-running-in-a-particular-worker-node-by-executing-a-comm
373
374 101 Nico Schottelius
h3. Hardware Maintenance using ungleich-hardware
375
376
Use the following manifest and replace the HOST with the actual host:
377
378
<pre>
379
apiVersion: v1
380
kind: Pod
381
metadata:
382
  name: ungleich-hardware-HOST
383
spec:
384
  containers:
385
  - name: ungleich-hardware
386
    image: ungleich/ungleich-hardware:0.0.5
387
    args:
388
    - sleep
389
    - "1000000"
390
    volumeMounts:
391
      - mountPath: /dev
392
        name: dev
393
    securityContext:
394
      privileged: true
395
  nodeSelector:
396
    kubernetes.io/hostname: "HOST"
397
398
  volumes:
399
    - name: dev
400
      hostPath:
401
        path: /dev
402
</pre>
403
404 102 Nico Schottelius
Also see: [[The_ungleich_hardware_maintenance_guide]]
405
406 105 Nico Schottelius
h3. Triggering a cronjob / creating a job from a cronjob
407 104 Nico Schottelius
408
To test a cronjob, we can create a job from a cronjob:
409
410
<pre>
411
kubectl create job --from=cronjob/volume2-daily-backup volume2-manual
412
</pre>
413
414
This creates a job @volume2-manual@ based on the cronjob @volume2-daily-backup@.
415
416 112 Nico Schottelius
h3. su-ing into a user that has nologin shell set
417
418
Often users have nologin set as their shell inside the container. To be able to execute maintenance commands within the
419
container, we can use @su -s /bin/sh@ like this:
420
421
<pre>
422
su -s /bin/sh -c '/path/to/your/script' testuser
423
</pre>
424
425
Found on https://serverfault.com/questions/351046/how-to-run-command-as-user-who-has-usr-sbin-nologin-as-shell
426
427 113 Nico Schottelius
h3. How to print a secret value
428
429
Assuming you want the "password" item from a secret, use:
430
431
<pre>
432
kubectl get secret SECRETNAME -o jsonpath="{.data.password}" | base64 -d; echo "" 
433
</pre>
434
435 209 Nico Schottelius
h3. Fixing the "ImageInspectError"
436
437
If you see this problem:
438
439
<pre>
440
# kubectl get pods
441
NAME                                                       READY   STATUS                   RESTARTS   AGE
442
bird-router-server137-bird-767f65bb47-g4xsh                0/1     Init:ImageInspectError   0          77d
443
bird-router-server137-openvpn-server120-5c987b7ffb-cn9xf   0/1     ImageInspectError        1          159d
444
bird-router-server137-unbound-5c6f5d4bb6-cxbpr             0/1     ImageInspectError        1          159d
445
</pre>
446
447
Fixes so far:
448
449
* correct registries.conf (see the sketch below)
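
For crio the file is usually @/etc/containers/registries.conf@; a minimal sketch that makes unqualified image names resolvable (adjust to the registries actually in use):

<pre>
unqualified-search-registries = ["docker.io"]
</pre>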
450
451 212 Nico Schottelius
h3. Automatic cleanup of images
452
453
* options to kubelet
454
455
<pre>
456
  --image-gc-high-threshold=90: The percent of disk usage after which image garbage collection is always run. Default: 90%
457
  --image-gc-low-threshold=80: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Default: 80%
458
</pre>
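
The same thresholds can also be set in the kubelet config file (typically @/var/lib/kubelet/config.yaml@); a sketch with the values from above:

<pre>
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 90
imageGCLowThresholdPercent: 80
</pre>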
459 209 Nico Schottelius
460 173 Nico Schottelius
h3. How to upgrade a kubernetes cluster
461 172 Nico Schottelius
462
h4. General
463
464
* Should be done every X months to stay up-to-date
465
** X probably something like 3-6
466
* kubeadm based clusters
467
* Needs specific kubeadm versions for upgrade
468
* Follow instructions on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
469 190 Nico Schottelius
* Finding releases: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG
470 172 Nico Schottelius
471
h4. Getting a specific kubeadm or kubelet version
472
473
<pre>
474 190 Nico Schottelius
RELEASE=v1.22.17
475
RELEASE=v1.23.17
476 181 Nico Schottelius
RELEASE=v1.24.9
477 1 Nico Schottelius
RELEASE=v1.25.9
478
RELEASE=v1.26.6
479 190 Nico Schottelius
RELEASE=v1.27.2
480
481 187 Nico Schottelius
ARCH=amd64
482 172 Nico Schottelius
483
curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
484 182 Nico Schottelius
chmod u+x kubeadm kubelet
485 172 Nico Schottelius
</pre>
486
487
h4. Steps
488
489
* kubeadm upgrade plan
490
** On one control plane node
491
* kubeadm upgrade apply vXX.YY.ZZ
492
** On one control plane node
493 189 Nico Schottelius
* kubeadm upgrade node
494
** On all other control plane nodes
495
** On all worker nodes afterwards
496
497 172 Nico Schottelius
498 173 Nico Schottelius
Repeat for all control plane nodes. Then upgrade kubelet on all other nodes via the package manager. A possible command sequence is sketched below.
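
A sketch of the sequence using the versioned kubeadm binaries downloaded above (the version is an example):

<pre>
# On the first control plane node
/usr/local/bin/kubeadm-v1.27.2 upgrade plan
/usr/local/bin/kubeadm-v1.27.2 upgrade apply -y v1.27.2

# On the remaining control plane nodes and afterwards on the workers
/usr/local/bin/kubeadm-v1.27.2 upgrade node
</pre>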
499 172 Nico Schottelius
500 193 Nico Schottelius
h4. Upgrading to 1.22.17
501 1 Nico Schottelius
502 193 Nico Schottelius
* https://v1-22.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
503 194 Nico Schottelius
* Need to create a kubeadm config map
504 198 Nico Schottelius
** f.i. using the following
505
** @/usr/local/bin/kubeadm-v1.22.17   upgrade --config kubeadm.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins,CoreDNSMigration apply -y v1.22.17@
506 193 Nico Schottelius
* Done for p6 on 2023-10-04
507
508
h4. Upgrading to 1.23.17
509
510
* https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
511
* No special notes
512
* Done for p6 on 2023-10-04
513
514
h4. Upgrading to 1.24.17
515
516
* https://v1-24.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
517
* No special notes
518
* Done for p6 on 2023-10-04
519
520
h4. Upgrading to 1.25.14
521
522
* https://v1-25.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
523
* No special notes
524
* Done for p6 on 2023-10-04
525
526
h4. Upgrading to 1.26.9
527
528 1 Nico Schottelius
* https://v1-26.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
529 193 Nico Schottelius
* No special notes
530
* Done for p6 on 2023-10-04
531 188 Nico Schottelius
532 196 Nico Schottelius
h4. Upgrading to 1.27
533 186 Nico Schottelius
534 192 Nico Schottelius
* https://v1-27.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
535 186 Nico Schottelius
* kubelet will not start anymore
536
* reason: @"command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"@
537
* /var/lib/kubelet/kubeadm-flags.env contains that parameter
538
* remove it and restart kubelet (see the sketch below)
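
A minimal sketch of the fix, assuming the flag appears as @--container-runtime=remote@ in the flags line:

<pre>
sed -i 's/--container-runtime=remote //' /var/lib/kubelet/kubeadm-flags.env
/etc/init.d/kubelet restart
</pre>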
539 192 Nico Schottelius
540 197 Nico Schottelius
h4. Upgrading to 1.28
541 192 Nico Schottelius
542
* https://v1-28.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
543 186 Nico Schottelius
544 223 Nico Schottelius
h4. Upgrading to 1.29
545
546
* Done for many clusters around 2024-01-10
547
* Unsure if it was properly released
548
* https://v1-29.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
549
550 219 Nico Schottelius
h4. Upgrading to 1.31
551
552
* The cluster needs to be updated FIRST, before kubelet/the OS
553
554
Otherwise you run into errors in the pod like this:
555
556
<pre>
557
  Warning  Failed     11s (x3 over 12s)  kubelet            Error: services have not yet been read at least once, cannot construct envvars
558
</pre>
559
560 210 Nico Schottelius
And the resulting pod state is:
561
562
<pre>
563
Init:CreateContainerConfigError
564
</pre>
565
566 224 Nico Schottelius
Fix: 
567
568
* find an old 1.30 kubelet package, downgrade kubelet, upgrade the control plane, upgrade kubelet again
569
570 225 Nico Schottelius
<pre>
571
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-1.30.0-r3.apk
572
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-openrc-1.30.0-r3.apk
573
apk add ./kubelet-1.30.0-r3.apk ./kubelet-openrc-1.30.0-r3.apk
574 226 Nico Schottelius
/etc/init.d/kubelet restart
575 225 Nico Schottelius
</pre>
576 226 Nico Schottelius
577
Then upgrade:
578
579
<pre>
580
/usr/local/bin/kubeadm-v1.31.3   upgrade apply -y v1.31.3
581
</pre>
582
583
Then re-upgrade the kubelet:
584
585
<pre>
586
apk upgrade -a
587
</pre>
588
589 225 Nico Schottelius
590 186 Nico Schottelius
h4. Upgrade to crio 1.27: missing crun
591
592
Error message
593
594
<pre>
595
level=fatal msg="validating runtime config: runtime validation: \"crun\" not found in $PATH: exec: \"crun\": executable file not found in $PATH"
596
</pre>
597 1 Nico Schottelius
598 186 Nico Schottelius
Fix:
599
600
<pre>
601
apk add crun
602
</pre>
603 223 Nico Schottelius
604 186 Nico Schottelius
605 157 Nico Schottelius
h2. Reference CNI
606
607
* Mainly "stupid", but effective plugins
608
* Main documentation on https://www.cni.dev/plugins/current/
609 158 Nico Schottelius
* Plugins
610
** bridge
611
*** Can create the bridge on the host
612
*** But seems not to be able to add host interfaces to it as well
613
*** Has support for vlan tags
614
** vlan
615
*** creates vlan tagged sub interface on the host
616 160 Nico Schottelius
*** "It's a 1:1 mapping (i.e. no bridge in between)":https://github.com/k8snetworkplumbingwg/multus-cni/issues/569
617 158 Nico Schottelius
** host-device
618
*** moves the interface from the host into the container
619
*** very easy for physical connections to containers
620 159 Nico Schottelius
** ipvlan
621
*** "virtualisation" of a host device
622
*** routing based on IP
623
*** Same MAC for everyone
624
*** Cannot reach the master interface
625
** macvlan
626
*** With mac addresses
627
*** Supports various modes (to be checked)
628
** ptp ("point to point")
629
*** Creates a host device and connects it to the container
630
** win*
631 158 Nico Schottelius
*** Windows implementations
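
As an illustration of how these plugins are configured, a bridge config with a vlan tag could look roughly like this (a sketch; names and the subnet are placeholders, see cni.dev for the authoritative reference):

<pre>
{
  "cniVersion": "1.0.0",
  "name": "vlan100-net",
  "type": "bridge",
  "bridge": "br100",
  "vlan": 100,
  "ipam": {
    "type": "host-local",
    "ranges": [[ { "subnet": "2a0a:e5c0:0:XX::/64" } ]]
  }
}
</pre>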
632 157 Nico Schottelius
633 62 Nico Schottelius
h2. Calico CNI
634
635
h3. Calico Installation
636
637
* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
638 228 Nico Schottelius
* Check the tags on https://github.com/projectcalico/calico/tags for the latest release
639 62 Nico Schottelius
* This has the following advantages:
640
** Easy to upgrade
641
** Does not require us to configure IPv6/dual-stack settings, as the tigera operator figures things out on its own
642 230 Nico Schottelius
* As of 2025-05-28, tigera should be installed in the *tigera-operator* namespace
643 62 Nico Schottelius
644
Usually plain calico can be installed directly using:
645
646 1 Nico Schottelius
<pre>
647 228 Nico Schottelius
VERSION=v3.30.0
648 149 Nico Schottelius
649 1 Nico Schottelius
helm repo add projectcalico https://docs.projectcalico.org/charts
650
helm repo update
651 230 Nico Schottelius
helm upgrade --install calico projectcalico/tigera-operator --version $VERSION --namespace tigera-operator --create-namespace
652 92 Nico Schottelius
</pre>
653 1 Nico Schottelius
654 229 Nico Schottelius
h3. Calico upgrade
655 92 Nico Schottelius
656 229 Nico Schottelius
* As of 3.30 or so, CRDs need to be applied manually beforehand
657
658
<pre>
659
VERSION=v3.30.0
660
661
kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/projectcalico/calico/${VERSION}/manifests/operator-crds.yaml
662
helm upgrade --install --namespace tigera calico projectcalico/tigera-operator --version $VERSION --create-namespace
663
</pre>
664 228 Nico Schottelius
665 62 Nico Schottelius
h3. Installing calicoctl
666
667 115 Nico Schottelius
* General installation instructions, including binary download: https://projectcalico.docs.tigera.io/maintenance/clis/calicoctl/install
668
669 62 Nico Schottelius
To be able to manage and configure calico, we need to 
670
"install calicoctl (we choose the version as a pod)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod
671
672
<pre>
673
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
674
</pre>
675
676 93 Nico Schottelius
Or version specific:
677
678
<pre>
679
kubectl apply -f https://github.com/projectcalico/calico/blob/v3.20.4/manifests/calicoctl.yaml
680 97 Nico Schottelius
681
# For 3.22
682
kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.22/manifests/calicoctl.yaml
683 93 Nico Schottelius
</pre>
684
685 70 Nico Schottelius
And making it easily accessible via an alias:
686
687
<pre>
688
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
689
</pre>
690
691 62 Nico Schottelius
h3. Calico configuration
692
693 63 Nico Schottelius
By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp
694
with an upstream router to propagate podcidr and servicecidr.
695 62 Nico Schottelius
696
Default settings in our infrastructure:
697
698
* We use a full-mesh using the @nodeToNodeMeshEnabled: true@ option
699
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ecmp)
700 1 Nico Schottelius
* We use private ASNs for k8s clusters
701 63 Nico Schottelius
* We do *not* use any overlay
702 62 Nico Schottelius
703
After installing calico and calicoctl the last step of the installation is usually:
704
705 1 Nico Schottelius
<pre>
706 79 Nico Schottelius
calicoctl create -f - < calico-bgp.yaml
707 62 Nico Schottelius
</pre>
708
709
710
A sample BGP configuration:
711
712
<pre>
713
---
714
apiVersion: projectcalico.org/v3
715
kind: BGPConfiguration
716
metadata:
717
  name: default
718
spec:
719
  logSeverityScreen: Info
720
  nodeToNodeMeshEnabled: true
721
  asNumber: 65534
722
  serviceClusterIPs:
723
  - cidr: 2a0a:e5c0:10:3::/108
724
  serviceExternalIPs:
725
  - cidr: 2a0a:e5c0:10:3::/108
726
---
727
apiVersion: projectcalico.org/v3
728
kind: BGPPeer
729
metadata:
730
  name: router1-place10
731
spec:
732
  peerIP: 2a0a:e5c0:10:1::50
733
  asNumber: 213081
734
  keepOriginalNextHop: true
735
</pre>
736
737 227 Nico Schottelius
h3. Get installed calico version
738
739
* might be in calico or tigera namespace
740
741
<pre>
742
helm ls -A | grep calico
743
</pre>
744
745 126 Nico Schottelius
h2. Cilium CNI (experimental)
746
747 137 Nico Schottelius
h3. Status
748
749 138 Nico Schottelius
*NO WORKING CILIUM CONFIGURATION FOR IPV6 only modes*
750 137 Nico Schottelius
751 146 Nico Schottelius
h3. Latest error
752
753
It seems cilium does not run on IPv6 only hosts:
754
755
<pre>
756
level=info msg="Validating configured node address ranges" subsys=daemon
757
level=fatal msg="postinit failed" error="external IPv4 node address could not be derived, please configure via --ipv4-node" subsys=daemon
758
level=info msg="Starting IP identity watcher" subsys=ipcache
759
</pre>
760
761
It crashes after that log entry
762
763 128 Nico Schottelius
h3. BGP configuration
764
765
* The cilium-operator will not start without a correct configmap being present beforehand (see error message below)
766
* Creating the bgp config beforehand as a configmap is thus required.
767
768
The error one gets without the configmap present:
769
770
Pods are hanging with:
771
772
<pre>
773
cilium-bpqm6                       0/1     Init:0/4            0             9s
774
cilium-operator-5947d94f7f-5bmh2   0/1     ContainerCreating   0             9s
775
</pre>
776
777
The error message in the cilium-operator is:
778
779
<pre>
780
Events:
781
  Type     Reason       Age                From               Message
782
  ----     ------       ----               ----               -------
783
  Normal   Scheduled    80s                default-scheduler  Successfully assigned kube-system/cilium-operator-5947d94f7f-lqcsp to server56
784
  Warning  FailedMount  16s (x8 over 80s)  kubelet            MountVolume.SetUp failed for volume "bgp-config-path" : configmap "bgp-config" not found
785
</pre>
786
787
A correct bgp config looks like this:
788
789
<pre>
790
apiVersion: v1
791
kind: ConfigMap
792
metadata:
793
  name: bgp-config
794
  namespace: kube-system
795
data:
796
  config.yaml: |
797
    peers:
798
      - peer-address: 2a0a:e5c0::46
799
        peer-asn: 209898
800
        my-asn: 65533
801
      - peer-address: 2a0a:e5c0::47
802
        peer-asn: 209898
803
        my-asn: 65533
804
    address-pools:
805
      - name: default
806
        protocol: bgp
807
        addresses:
808
          - 2a0a:e5c0:0:14::/64
809
</pre>
810 127 Nico Schottelius
811
h3. Installation
812 130 Nico Schottelius
813 127 Nico Schottelius
Adding the repo
814 1 Nico Schottelius
<pre>
815 127 Nico Schottelius
816 129 Nico Schottelius
helm repo add cilium https://helm.cilium.io/
817 130 Nico Schottelius
helm repo update
818
</pre>
819 129 Nico Schottelius
820 135 Nico Schottelius
Installing + configuring cilium
821 129 Nico Schottelius
<pre>
822 130 Nico Schottelius
ipv6pool=2a0a:e5c0:0:14::/112
823 1 Nico Schottelius
824 146 Nico Schottelius
version=1.12.2
825 129 Nico Schottelius
826
helm upgrade --install cilium cilium/cilium --version $version \
827 1 Nico Schottelius
  --namespace kube-system \
828
  --set ipv4.enabled=false \
829
  --set ipv6.enabled=true \
830 146 Nico Schottelius
  --set enableIPv6Masquerade=false \
831
  --set bgpControlPlane.enabled=true 
832 1 Nico Schottelius
833 146 Nico Schottelius
#  --set ipam.operator.clusterPoolIPv6PodCIDRList=$ipv6pool
834
835
# Old style bgp?
836 136 Nico Schottelius
#   --set bgp.enabled=true --set bgp.announce.podCIDR=true \
837 127 Nico Schottelius
838
# Show possible configuration options
839
helm show values cilium/cilium
840
841 1 Nico Schottelius
</pre>
842 132 Nico Schottelius
843
Using a /64 for ipam.operator.clusterPoolIPv6PodCIDRList fails with:
844
845
<pre>
846
level=fatal msg="Unable to init cluster-pool allocator" error="unable to initialize IPv6 allocator New CIDR set failed; the node CIDR size is too big" subsys=cilium-operator-generic
847
</pre>
848
849 126 Nico Schottelius
850 1 Nico Schottelius
See also https://github.com/cilium/cilium/issues/20756
851 135 Nico Schottelius
852
Seems a /112 is actually working.
853
854
h3. Kernel modules
855
856
Cilium requires the following modules to be loaded on the host (not loaded by default):
857
858
<pre>
859 1 Nico Schottelius
modprobe  ip6table_raw
860
modprobe  ip6table_filter
861
</pre>
862 146 Nico Schottelius
863
h3. Interesting helm flags
864
865
* autoDirectNodeRoutes
866
* bgpControlPlane.enabled = true
867
868
h3. SEE ALSO
869
870
* https://docs.cilium.io/en/v1.12/helm-reference/
871 133 Nico Schottelius
872 179 Nico Schottelius
h2. Multus
873 168 Nico Schottelius
874
* https://github.com/k8snetworkplumbingwg/multus-cni
875
* Installing a deployment w/ CRDs
876 150 Nico Schottelius
877 169 Nico Schottelius
<pre>
878 176 Nico Schottelius
VERSION=v4.0.1
879 169 Nico Schottelius
880 170 Nico Schottelius
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-crio.yml
881
</pre>
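
Once multus runs, additional networks are defined as @NetworkAttachmentDefinition@ objects; a minimal sketch (interface name and subnet are placeholders):

<pre>
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-eth0
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth0",
    "mode": "bridge",
    "ipam": { "type": "host-local", "ranges": [[ { "subnet": "2a0a:e5c0:0:XX::/64" } ]] }
  }'
</pre>

Pods then reference it via the annotation @k8s.v1.cni.cncf.io/networks: macvlan-eth0@.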
882 169 Nico Schottelius
883 191 Nico Schottelius
h2. ArgoCD
884 56 Nico Schottelius
885 60 Nico Schottelius
h3. Argocd Installation
886 1 Nico Schottelius
887 116 Nico Schottelius
* See https://argo-cd.readthedocs.io/en/stable/
888
889 60 Nico Schottelius
As there is no configuration management present yet, argocd is installed using
890
891 1 Nico Schottelius
<pre>
892 60 Nico Schottelius
kubectl create namespace argocd
893 1 Nico Schottelius
894
# OR: latest stable
895
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
896
897 191 Nico Schottelius
# OR Specific Version
898
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.3.2/manifests/install.yaml
899 56 Nico Schottelius
900 191 Nico Schottelius
901
</pre>
902 1 Nico Schottelius
903 60 Nico Schottelius
h3. Get the argocd credentials
904
905
<pre>
906
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
907
</pre>
908 52 Nico Schottelius
909 87 Nico Schottelius
h3. Accessing argocd
910
911
In regular IPv6 clusters:
912
913
* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN
914
915
In legacy IPv4 clusters:
916
917
<pre>
918
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
919
</pre>
920
921 88 Nico Schottelius
* Navigate to https://localhost:8080
922
923 68 Nico Schottelius
h3. Using the argocd webhook to trigger changes
924 67 Nico Schottelius
925
* To trigger changes, POST JSON to https://argocd.example.com/api/webhook (see the sketch below)
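
A sketch of triggering it manually with curl; the payload has to mimic a git provider push event, so the header and fields shown here are assumptions (a real gitea/github webhook sends more):

<pre>
curl -X POST https://argocd.example.com/api/webhook \
  -H 'Content-Type: application/json' \
  -H 'X-GitHub-Event: push' \
  -d '{"ref": "refs/heads/master", "repository": {"html_url": "https://code.ungleich.ch/ungleich-intern/k8s-config"}}'
</pre>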
926
927 72 Nico Schottelius
h3. Deploying an application
928
929
* Applications are deployed via git towards gitea (code.ungleich.ch) and then pulled by argo
930 73 Nico Schottelius
* Always include the *redmine-url* pointing to the (customer) ticket
931
** Also add the support-url if it exists
932 72 Nico Schottelius
933
Application sample
934
935
<pre>
936
apiVersion: argoproj.io/v1alpha1
937
kind: Application
938
metadata:
939
  name: gitea-CUSTOMER
940
  namespace: argocd
941
spec:
942
  destination:
943
    namespace: default
944
    server: 'https://kubernetes.default.svc'
945
  source:
946
    path: apps/prod/gitea
947
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
948
    targetRevision: HEAD
949
    helm:
950
      parameters:
951
        - name: storage.data.storageClass
952
          value: rook-ceph-block-hdd
953
        - name: storage.data.size
954
          value: 200Gi
955
        - name: storage.db.storageClass
956
          value: rook-ceph-block-ssd
957
        - name: storage.db.size
958
          value: 10Gi
959
        - name: storage.letsencrypt.storageClass
960
          value: rook-ceph-block-hdd
961
        - name: storage.letsencrypt.size
962
          value: 50Mi
963
        - name: letsencryptStaging
964
          value: 'no'
965
        - name: fqdn
966
          value: 'code.verua.online'
967
  project: default
968
  syncPolicy:
969
    automated:
970
      prune: true
971
      selfHeal: true
972
  info:
973
    - name: 'redmine-url'
974
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
975
    - name: 'support-url'
976
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
977
</pre>
978
979 80 Nico Schottelius
h2. Helm related operations and conventions
980 55 Nico Schottelius
981 61 Nico Schottelius
We use helm charts extensively.
982
983
* In production, they are managed via argocd
984
* In development, helm charts can be developed and deployed manually using the helm utility.
985
986 55 Nico Schottelius
h3. Installing a helm chart
987
988
One can use the usual pattern of
989
990
<pre>
991
helm install <releasename> <chartdirectory>
992
</pre>
993
994
However, when testing helm charts you often want to reinstall/update. The following pattern is "better", because it also works if the release is already installed:
995
996
<pre>
997
helm upgrade --install <releasename> <chartdirectory>
998 1 Nico Schottelius
</pre>
999 80 Nico Schottelius
1000
h3. Naming services and deployments in helm charts [Application labels]
1001
1002
* We always have {{ .Release.Name }} to identify the current "instance"
1003
* Deployments:
1004
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ...
1005 81 Nico Schottelius
* See more about standard labels on
1006
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
1007
** https://helm.sh/docs/chart_best_practices/labels/
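
A sketch of how these conventions look inside a chart template (names are illustrative):

<pre>
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
...
</pre>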
1008 55 Nico Schottelius
1009 151 Nico Schottelius
h3. Show all versions of a helm chart
1010
1011
<pre>
1012
helm search repo -l repo/chart
1013
</pre>
1014
1015
For example:
1016
1017
<pre>
1018
% helm search repo -l projectcalico/tigera-operator 
1019
NAME                         	CHART VERSION	APP VERSION	DESCRIPTION                            
1020
projectcalico/tigera-operator	v3.23.3      	v3.23.3    	Installs the Tigera operator for Calico
1021
projectcalico/tigera-operator	v3.23.2      	v3.23.2    	Installs the Tigera operator for Calico
1022
....
1023
</pre>
1024
1025 152 Nico Schottelius
h3. Show possible values of a chart
1026
1027
<pre>
1028
helm show values <repo/chart>
1029
</pre>
1030
1031
Example:
1032
1033
<pre>
1034
helm show values ingress-nginx/ingress-nginx
1035
</pre>
1036
1037 207 Nico Schottelius
h3. Show all possible charts in a repo
1038
1039
<pre>
1040
helm search repo REPO
1041
</pre>
1042
1043 178 Nico Schottelius
h3. Download a chart
1044
1045
For instance for checking it out locally. Use:
1046
1047
<pre>
1048
helm pull <repo/chart>
1049
</pre>
1050 152 Nico Schottelius
1051 139 Nico Schottelius
h2. Rook + Ceph
1052
1053
h3. Installation
1054
1055
* Usually directly via argocd
1056
1057 71 Nico Schottelius
h3. Executing ceph commands
1058
1059
Using the ceph-tools pod as follows:
1060
1061
<pre>
1062
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
1063
</pre>
1064
1065 43 Nico Schottelius
h3. Inspecting the logs of a specific server
1066
1067
<pre>
1068
# Get the related pods
1069
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare 
1070
...
1071
1072
# Inspect the logs of a specific pod
1073
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx
1074
1075 71 Nico Schottelius
</pre>
1076
1077
h3. Inspecting the logs of the rook-ceph-operator
1078
1079
<pre>
1080
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
1081 43 Nico Schottelius
</pre>
1082
1083 200 Nico Schottelius
h3. (Temporarily) Disabling the rook-operator
1084
1085
* first disabling the sync in argocd
1086
* then scale it down
1087
1088
<pre>
1089
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
1090
</pre>
1091
1092
When done with the work/maintenance, re-enable sync in argocd.
1093
The following command is thus strictly speaking not required, as argocd will fix it on its own:
1094
1095
<pre>
1096
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
1097
</pre>
1098
1099 121 Nico Schottelius
h3. Restarting the rook operator
1100
1101
<pre>
1102
kubectl -n rook-ceph delete pods  -l app=rook-ceph-operator
1103
</pre>
1104
1105 43 Nico Schottelius
h3. Triggering server prepare / adding new osds
1106
1107
The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re scan", simply delete that pod:
1108
1109
<pre>
1110
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
1111
</pre>
1112
1113
This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.
1114
1115
h3. Removing an OSD
1116
1117
* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html
1118 77 Nico Schottelius
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
1119 99 Nico Schottelius
* Then delete the related deployment
1120 41 Nico Schottelius
1121 98 Nico Schottelius
Set the OSD id in the osd-purge.yaml and apply it. The OSD should be down beforehand.
1122
1123
<pre>
1124
apiVersion: batch/v1
1125
kind: Job
1126
metadata:
1127
  name: rook-ceph-purge-osd
1128
  namespace: rook-ceph # namespace:cluster
1129
  labels:
1130
    app: rook-ceph-purge-osd
1131
spec:
1132
  template:
1133
    metadata:
1134
      labels:
1135
        app: rook-ceph-purge-osd
1136
    spec:
1137
      serviceAccountName: rook-ceph-purge-osd
1138
      containers:
1139
        - name: osd-removal
1140
          image: rook/ceph:master
1141
          # TODO: Insert the OSD ID in the last parameter that is to be removed
1142
          # The OSD IDs are a comma-separated list. For example: "0" or "0,2".
1143
          # If you want to preserve the OSD PVCs, set `--preserve-pvc true`.
1144
          #
1145
          # A --force-osd-removal option is available if the OSD should be destroyed even though the
1146
          # removal could lead to data loss.
1147
          args:
1148
            - "ceph"
1149
            - "osd"
1150
            - "remove"
1151
            - "--preserve-pvc"
1152
            - "false"
1153
            - "--force-osd-removal"
1154
            - "false"
1155
            - "--osd-ids"
1156
            - "SETTHEOSDIDHERE"
1157
          env:
1158
            - name: POD_NAMESPACE
1159
              valueFrom:
1160
                fieldRef:
1161
                  fieldPath: metadata.namespace
1162
            - name: ROOK_MON_ENDPOINTS
1163
              valueFrom:
1164
                configMapKeyRef:
1165
                  key: data
1166
                  name: rook-ceph-mon-endpoints
1167
            - name: ROOK_CEPH_USERNAME
1168
              valueFrom:
1169
                secretKeyRef:
1170
                  key: ceph-username
1171
                  name: rook-ceph-mon
1172
            - name: ROOK_CEPH_SECRET
1173
              valueFrom:
1174
                secretKeyRef:
1175
                  key: ceph-secret
1176
                  name: rook-ceph-mon
1177
            - name: ROOK_CONFIG_DIR
1178
              value: /var/lib/rook
1179
            - name: ROOK_CEPH_CONFIG_OVERRIDE
1180
              value: /etc/rook/config/override.conf
1181
            - name: ROOK_FSID
1182
              valueFrom:
1183
                secretKeyRef:
1184
                  key: fsid
1185
                  name: rook-ceph-mon
1186
            - name: ROOK_LOG_LEVEL
1187
              value: DEBUG
1188
          volumeMounts:
1189
            - mountPath: /etc/ceph
1190
              name: ceph-conf-emptydir
1191
            - mountPath: /var/lib/rook
1192
              name: rook-config
1193
      volumes:
1194
        - emptyDir: {}
1195
          name: ceph-conf-emptydir
1196
        - emptyDir: {}
1197
          name: rook-config
1198
      restartPolicy: Never
1199
1200
1201 99 Nico Schottelius
</pre>
1202
1203 1 Nico Schottelius
Deleting the deployment:
1204
1205
<pre>
1206
[18:05] bridge:~% kubectl -n rook-ceph delete deployment rook-ceph-osd-6
1207 99 Nico Schottelius
deployment.apps "rook-ceph-osd-6" deleted
1208
</pre>
1209 185 Nico Schottelius
1210
h3. Placement of mons/osds/etc.
1211
1212
See https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#placement-configuration-settings
1213 98 Nico Schottelius
1214 215 Nico Schottelius
h3. Setting up and managing S3 object storage
1215
1216 217 Nico Schottelius
h4. Endpoints
1217
1218
| Location | Endpoint |
1219
| p5 | https://s3.k8s.place5.ungleich.ch |
1220
1221
1222 215 Nico Schottelius
h4. Setting up a storage class
1223
1224
* This will store the buckets of a specific customer
1225
1226
Similar to this:
1227
1228
<pre>
1229
apiVersion: storage.k8s.io/v1
1230
kind: StorageClass
1231
metadata:
1232
  name: ungleich-archive-bucket-sc
1233
  namespace: rook-ceph
1234
provisioner: rook-ceph.ceph.rook.io/bucket
1235
reclaimPolicy: Delete
1236
parameters:
1237
  objectStoreName: place5
1238
  objectStoreNamespace: rook-ceph
1239
</pre>
1240
1241
h4. Setting up the Bucket
1242
1243
Similar to this:
1244
1245
<pre>
1246
apiVersion: objectbucket.io/v1alpha1
1247
kind: ObjectBucketClaim
1248
metadata:
1249
  name: ungleich-archive-bucket-claim
1250
  namespace: rook-ceph
1251
spec:
1252
  generateBucketName: ungleich-archive-ceph-bkt
1253
  storageClassName: ungleich-archive-bucket-sc
1254
  additionalConfig:
1255
    # To set for quota for OBC
1256
    #maxObjects: "1000"
1257
    maxSize: "100G"
1258
</pre>
1259
1260
* See also: https://rook.io/docs/rook/latest-release/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/#obc-custom-resource
1261
1262
h4. Getting the credentials for the bucket
1263
1264
* Get "public" information from the configmap
1265
* Get secret from the secret
1266
1267 216 Nico Schottelius
<pre>
1268 1 Nico Schottelius
name=BUCKETNAME
1269 221 Nico Schottelius
s3host=s3.k8s.place5.ungleich.ch
1270
endpoint=https://${s3host}
1271 1 Nico Schottelius
1272
cm=$(kubectl -n rook-ceph get configmap -o yaml ${name}-bucket-claim)
1273 217 Nico Schottelius
1274 1 Nico Schottelius
sec=$(kubectl -n rook-ceph get secrets -o yaml ${name}-bucket-claim)
1275 222 Nico Schottelius
export AWS_ACCESS_KEY_ID=$(echo $sec | yq .data.AWS_ACCESS_KEY_ID | base64 -d ; echo "")
1276
export AWS_SECRET_ACCESS_KEY=$(echo $sec | yq .data.AWS_SECRET_ACCESS_KEY | base64 -d ; echo "")
1277 1 Nico Schottelius
1278 217 Nico Schottelius
1279 216 Nico Schottelius
bucket_name=$(echo $cm | yq .data.BUCKET_NAME)
1280 1 Nico Schottelius
</pre>
1281 217 Nico Schottelius
1282 220 Nico Schottelius
h5. Access via s3cmd
1283 1 Nico Schottelius
1284 221 Nico Schottelius
Note that the following does *NOT* work:
1285
1286 220 Nico Schottelius
<pre>
1287 221 Nico Schottelius
s3cmd --host ${s3host}:443 --access_key=${AWS_ACCESS_KEY_ID} --secret_key=${AWS_SECRET_ACCESS_KEY} ls s3://${name}
1288 220 Nico Schottelius
</pre>
1289
1290 217 Nico Schottelius
h5. Access via s4cmd
1291
1292
<pre>
1293 1 Nico Schottelius
s4cmd --endpoint-url ${endpoint} --access-key=${AWS_ACCESS_KEY_ID} --secret-key=${AWS_SECRET_ACCESS_KEY} ls
1294
</pre>
1295 221 Nico Schottelius
1296
h5. Access via s5cmd
1297
1298
* Uses environment variables
1299
1300
<pre>
1301
s5cmd --endpoint-url ${endpoint} ls
1302
</pre>
1303 215 Nico Schottelius
1304 145 Nico Schottelius
h2. Ingress + Cert Manager
1305
1306
* We deploy "nginx-ingress":https://docs.nginx.com/nginx-ingress-controller/ to get an ingress
1307
* we deploy "cert-manager":https://cert-manager.io/ to handle certificates
1308
* We independently deploy @ClusterIssuer@ to allow the cert-manager app to deploy and the issuer to be created once the CRDs from cert manager are in place
1309
1310
h3. IPv4 reachability 
1311
1312
The ingress is by default IPv6 only. To make it reachable from the IPv4 world, get its IPv6 address and configure a NAT64 mapping in Jool.
1313
1314
Steps:
1315
1316
h4. Get the ingress IPv6 address
1317
1318
Use @kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''@
1319
1320
Example:
1321
1322
<pre>
1323
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''
1324
2a0a:e5c0:10:1b::ce11
1325
</pre>
1326
1327
h4. Add NAT64 mapping
1328
1329
* Update the __dcl_jool_siit cdist type
1330
* Record the two IPs (IPv6 and IPv4)
1331
* Configure all routers
1332
1333
1334
h4. Add DNS record
1335
1336
To make the ingress usable as a CNAME destination, create an "ingress" DNS record, such as:
1337
1338
<pre>
1339
; k8s ingress for dev
1340
dev-ingress                 AAAA 2a0a:e5c0:10:1b::ce11
1341
dev-ingress                 A 147.78.194.23
1342
1343
</pre> 
1344
1345
h4. Add supporting wildcard DNS
1346
1347
If you plan to add various sites under a specific domain, we can add a wildcard DNS entry, such as *.k8s-dev.django-hosting.ch:
1348
1349
<pre>
1350
*.k8s-dev         CNAME dev-ingress.ungleich.ch.
1351
</pre>
1352
1353 76 Nico Schottelius
h2. Harbor
1354
1355 175 Nico Schottelius
* We user "Harbor":https://goharbor.io/ as an image registry for our own images. Internal app reference: apps/prod/harbor.
1356
* The admin password is in the password store, it is Harbor12345 by default
1357 76 Nico Schottelius
* At the moment harbor only authenticates against the internal ldap tree
1358
1359
h3. LDAP configuration
1360
1361
* The url needs to be ldaps://...
1362
* uid = uid
1363
* the rest is standard
1364 75 Nico Schottelius
1365 89 Nico Schottelius
h2. Monitoring / Prometheus
1366
1367 90 Nico Schottelius
* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/
1368 89 Nico Schottelius
1369 91 Nico Schottelius
Access via ...
1370
1371
* http://prometheus-k8s.monitoring.svc:9090
1372
* http://grafana.monitoring.svc:3000
1373
* http://alertmanager.monitoring.svc:9093
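
These are cluster-internal service URLs; from a workstation they can be reached with a port-forward, for example (service names as listed above):

<pre>
kubectl -n monitoring port-forward svc/grafana 3000:3000
</pre>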
1374
1375
1376 100 Nico Schottelius
h3. Prometheus Options
1377
1378
* "helm/kube-prometheus-stack":https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
1379
** Includes dashboards and co.
1380
* "manifest based kube-prometheus":https://github.com/prometheus-operator/kube-prometheus
1381
** Includes dashboards and co.
1382
* "Prometheus Operator (mainly CRD manifest":https://github.com/prometheus-operator/prometheus-operator
1383
1384 171 Nico Schottelius
h3. Grafana default password
1385
1386 218 Nico Schottelius
* If not changed: admin / @prom-operator@
1387
** Can be changed via:
1388
1389
<pre>
1390
    helm:
1391
      values: |-
1392
        configurations: |-
1393
          grafana:
1394
            adminPassword: "..."
1395
</pre>
1396 171 Nico Schottelius
1397 82 Nico Schottelius
h2. Nextcloud
1398
1399 85 Nico Schottelius
h3. How to get the nextcloud credentials 
1400 84 Nico Schottelius
1401
* The initial username is set to "nextcloud"
1402
* The password is autogenerated and saved in a kubernetes secret
1403
1404
<pre>
1405 85 Nico Schottelius
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo "" 
1406 84 Nico Schottelius
</pre>
1407
1408 83 Nico Schottelius
h3. How to fix "Access through untrusted domain"
1409
1410 82 Nico Schottelius
* Nextcloud stores the initial domain configuration
1411 1 Nico Schottelius
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
1412 82 Nico Schottelius
* To fix, edit /var/www/html/config/config.php and correct the domain
1413 1 Nico Schottelius
* Then delete the pods
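
Alternatively the trusted domain can be adjusted with occ from inside the container (see the occ section below); a sketch, index and domain are placeholders:

<pre>
su www-data -s /bin/sh -c "./occ config:system:set trusted_domains 1 --value=NEW.FQDN.HERE"
</pre>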
1414 165 Nico Schottelius
1415
h3. Running occ commands inside the nextcloud container
1416
1417
* Find the pod in the right namespace
1418
1419
Exec:
1420
1421
<pre>
1422
su www-data -s /bin/sh -c ./occ
1423
</pre>
1424
1425
* -s /bin/sh is needed as the default shell is set to /bin/false
1426
1427 166 Nico Schottelius
h4. Rescanning files
1428 165 Nico Schottelius
1429 166 Nico Schottelius
* If files have been added without nextcloud's knowledge
1430
1431
<pre>
1432
su www-data -s /bin/sh -c "./occ files:scan --all"
1433
</pre>
1434 82 Nico Schottelius
1435 201 Nico Schottelius
h2. Sealed Secrets
1436
1437 202 Jin-Guk Kwon
* install kubeseal
1438 1 Nico Schottelius
1439 202 Jin-Guk Kwon
<pre>
1440
KUBESEAL_VERSION='0.23.0'
1441
wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz" 
1442
tar -xvzf kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz kubeseal
1443
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
1444
</pre>
1445
1446
* fetch the sealed-secrets public certificate (used to encrypt secrets)
1447
1448
<pre>
1449
kubeseal --fetch-cert > /tmp/public-key-cert.pem
1450
</pre>
1451
1452
* create the secret
1453
1454
<pre>
1455 203 Jin-Guk Kwon
# Example:
1456 202 Jin-Guk Kwon
apiVersion: v1
1457
kind: Secret
1458
metadata:
1459
  name: Release.Name-postgres-config
1460
  annotations:
1461
    secret-generator.v1.mittwald.de/autogenerate: POSTGRES_PASSWORD
1462
    hosting: Release.Name
1463
  labels:
1464
    app.kubernetes.io/instance: Release.Name
1465
    app.kubernetes.io/component: postgres
1466
stringData:
1467
  POSTGRES_USER: postgresUser
1468
  POSTGRES_DB: postgresDBName
1469
  POSTGRES_INITDB_ARGS: "--no-locale --encoding=UTF8"
1470
</pre>
1471
1472
* convert secret.yaml to sealed-secret.yaml
1473
1474
<pre>
1475
kubeseal -n <namespace> --cert=/tmp/public-key-cert.pem --format=yaml < ./secret.yaml  > ./sealed-secret.yaml
1476
</pre>
1477
1478
* use sealed-secret.yaml on helm-chart directory
1479 201 Nico Schottelius
1480 205 Jin-Guk Kwon
* See tickets #11989 and #12120
1481 204 Jin-Guk Kwon
1482 1 Nico Schottelius
h2. Infrastructure versions
1483 35 Nico Schottelius
1484 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v5 (2021-10)
1485 1 Nico Schottelius
1486 57 Nico Schottelius
Clusters are configured / setup in this order:
1487
1488
* Bootstrap via kubeadm
1489 59 Nico Schottelius
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
1490
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
1491
** "rook for storage via argocd":https://rook.io/
1492 58 Nico Schottelius
** haproxy for in IPv6-cluster-IPv4-to-IPv6 proxy via argocd
1493
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
1494
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1495
1496 57 Nico Schottelius
1497
h3. ungleich kubernetes infrastructure v4 (2021-09)
1498
1499 54 Nico Schottelius
* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
1500 1 Nico Schottelius
* The rook operator is still being installed via helm
1501 35 Nico Schottelius
1502 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v3 (2021-07)
1503 1 Nico Schottelius
1504 10 Nico Schottelius
* rook is now installed via helm via argocd instead of directly via manifests
1505 28 Nico Schottelius
1506 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v2 (2021-05)
1507 28 Nico Schottelius
1508
* Replaced fluxv2 from ungleich k8s v1 with argocd
1509 1 Nico Schottelius
** argocd can apply helm templates directly without needing to go through Chart releases
1510 28 Nico Schottelius
* We are also using argoflow for build flows
1511
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building
1512
1513 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v1 (2021-01)
1514 28 Nico Schottelius
1515
We are using the following components:
1516
1517
* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
1518
** Needed for basic networking
1519
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
1520
** Needed so that secrets are not stored in the git repository, but only in the cluster
1521
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1522
** Needed to get letsencrypt certificates for services
1523
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
1524
** rbd for almost everything, *ReadWriteOnce*
1525
** cephfs for smaller things, multi access *ReadWriteMany*
1526
** Needed for providing persistent storage
1527
* "flux v2":https://fluxcd.io/
1528
** Needed to manage resources automatically