h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual

{{toc}}

h2. Status

This document is **production**.

This document is the ungleich kubernetes infrastructure overview as well as the ungleich kubernetes manual.

h2. k8s clusters

| Cluster            | Purpose/Setup     | Maintainer | Master(s)                     | argo                                                   | v4 http proxy | last verified |
| c0.k8s.ooo         | Dev               | -          | UNUSED                        |                                                        |               |    2021-10-05 |
| c1.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
| c2.k8s.ooo         | Dev p7 HW         | Nico       | server47 server53 server54    | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo     |               |    2021-10-05 |
| c3.k8s.ooo         | retired           | -          | -                             |                                                        |               |    2021-10-05 |
| c4.k8s.ooo         | Dev2 p7 HW        | Jin-Guk    | server52 server53 server54    |                                                        |               |             - |
| c5.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
| c6.k8s.ooo         | Dev p6 VM Jin-Guk | Jin-Guk    |                               |                                                        |               |               |
| [[p5.k8s.ooo]]     | production        |            | server34 server36 server38    | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo     | -             |               |
| [[p5-cow.k8s.ooo]] | production        | Nico       | server47 server51 server55    | "argo":https://argocd-server.argocd.svc.p5-cow.k8s.ooo |               |    2022-08-27 |
| [[p6.k8s.ooo]]     | production        |            | server67 server69 server71    | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo     | 147.78.194.13 |    2021-10-05 |
| [[p6-cow.k8s.ooo]] | production        |            | server134 server135 server136 | "argo":https://argocd-server.argocd.svc.p6in10.k8s.ooo | ?             |    2023-05-17 |
| [[p10.k8s.ooo]]    | production        |            | server131 server132 server133 | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo    | 147.78.194.12 |    2021-10-05 |
| [[k8s.ge.nau.so]]  | development       |            | server107 server108 server109 | "argo":https://argocd-server.argocd.svc.k8s.ge.nau.so  |               |               |
| [[dev.k8s.ooo]]    | development       |            | server110 server111 server112 | "argo":https://argocd-server.argocd.svc.dev.k8s.ooo    | -             |    2022-07-08 |
| [[r1r2p15k8sooo|r1.p15.k8s.ooo]] | production | Nico | server120 | | | 2022-10-30 |
| [[r1r2p15k8sooo|r2.p15.k8s.ooo]] | production | Nico | server121 | | | 2022-09-06 |
| [[r1r2p10k8sooo|r1.p10.k8s.ooo]] | production | Nico | server122 | | | 2022-10-30 |
| [[r1r2p10k8sooo|r2.p10.k8s.ooo]] | production | Nico | server123 | | | 2022-10-15 |
| [[r1r2p5k8sooo|r1.p5.k8s.ooo]] | production | Nico | server137 | | | 2022-10-30 |
| [[r1r2p5k8sooo|r2.p5.k8s.ooo]] | production | Nico | server138 | | | 2022-10-30 |
| [[r1r2p6k8sooo|r1.p6.k8s.ooo]] | production | Nico | server139 | | | 2022-10-30 |
| [[r1r2p6k8sooo|r2.p6.k8s.ooo]] | production | Nico | server140 | | | 2022-10-30 |

h2. General architecture and components overview

* All k8s clusters are IPv6 only
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
** Private configurations are found in the **k8s-config** repository

h3. Cluster types

| **Type/Feature**            | **Development**                | **Production**         |
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
| Separation of control plane | optional                       | recommended            |
| Persistent storage          | required                       | required               |
| Number of storage monitors  | 3                              | 5                      |

h2. General k8s operations

h3. Cheat sheet / external great references

* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/

Some examples:

h4. Use kubectl to print only the node names

<pre>
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
</pre>

Can easily be used in a shell loop like this:

<pre>
for host in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do echo $host; ssh root@${host} uptime; done
</pre>

h3. Allowing to schedule work on the control plane / removing node taints

* Mostly for single node / test / development clusters
* Just remove the master taint as follows

<pre>
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
</pre>

You can check the node taints using @kubectl describe node ...@

h3. Adding taints

* For instance to limit nodes to specific customers

<pre>
kubectl taint nodes serverXX customer=CUSTOMERNAME:NoSchedule
</pre>
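
A pod that should run on such a reserved node then needs a matching toleration and, assuming the nodes are additionally *labeled* with @customer=CUSTOMERNAME@, a node selector. A minimal sketch (all names are placeholders):

<pre>
apiVersion: v1
kind: Pod
metadata:
  name: customer-workload
spec:
  # schedule only on the customer's nodes ...
  nodeSelector:
    customer: CUSTOMERNAME
  # ... and tolerate the NoSchedule taint set above
  tolerations:
    - key: "customer"
      operator: "Equal"
      value: "CUSTOMERNAME"
      effect: "NoSchedule"
  containers:
    - name: example
      image: nginx
</pre>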

h3. Get the cluster admin.conf

* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
* To be able to administrate the cluster you can copy the admin.conf to your local machine
* Multi cluster debugging becomes very easy if you name the config ~/cX-admin.conf (see example below)

<pre>
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
% export KUBECONFIG=~/c2-admin.conf
% kubectl get nodes
NAME       STATUS                     ROLES                  AGE   VERSION
server47   Ready                      control-plane,master   82d   v1.22.0
server48   Ready                      control-plane,master   82d   v1.22.0
server49   Ready                      <none>                 82d   v1.22.0
server50   Ready                      <none>                 82d   v1.22.0
server59   Ready                      control-plane,master   82d   v1.22.0
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
server61   Ready                      <none>                 82d   v1.22.0
server62   Ready                      <none>                 82d   v1.22.0
</pre>
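
With one admin.conf per cluster, several clusters can be inspected side by side by pointing @KUBECONFIG@ at a colon separated list of files (the file names below are just examples following the naming scheme above):

<pre>
# merge several configs for a combined view
export KUBECONFIG=~/c2-admin.conf:~/p6-admin.conf
kubectl config get-contexts

# or select a single cluster per shell
export KUBECONFIG=~/c2-admin.conf
kubectl get nodes
</pre>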

h3. Installing a new k8s cluster

* Decide on the cluster name (usually *cX.k8s.ooo*), X counting upwards
** Using pXX.k8s.ooo for production clusters of placeXX
* Use cdist to configure the nodes with requirements like crio
* Decide between single or multi node control plane setups (see below)
** Single control plane suitable for development clusters

Typical init procedure:

h4. Single control plane:

<pre>
kubeadm init --config bootstrap/XXX/kubeadm.yaml
</pre>

h4. Multi control plane (HA):

<pre>
kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs
</pre>
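
The referenced bootstrap/XXX/kubeadm.yaml is cluster specific. As a rough orientation, a minimal sketch of such a config could look like the following (the apiVersion depends on the kubeadm version; cluster name, endpoint and CIDRs are placeholders, not the real values):

<pre>
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
clusterName: pX.k8s.ooo
controlPlaneEndpoint: pX-api.k8s.ooo:6443
networking:
  # IPv6 only; these networks are announced via BGP (see calico below)
  podSubnet: 2a0a:e5c0:XXXX::/64
  serviceSubnet: 2a0a:e5c0:XXXX::/108
</pre>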

h3. Deleting a pod that is hanging in terminating state

<pre>
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
</pre>

(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)

h3. Listing nodes of a cluster

<pre>
[15:05] bridge:~% kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
server22   Ready    <none>                 52d   v1.22.0
server23   Ready    <none>                 52d   v1.22.2
server24   Ready    <none>                 52d   v1.22.0
server25   Ready    <none>                 52d   v1.22.0
server26   Ready    <none>                 52d   v1.22.0
server27   Ready    <none>                 52d   v1.22.0
server63   Ready    control-plane,master   52d   v1.22.0
server64   Ready    <none>                 52d   v1.22.0
server65   Ready    control-plane,master   52d   v1.22.0
server66   Ready    <none>                 52d   v1.22.0
server83   Ready    control-plane,master   52d   v1.22.0
server84   Ready    <none>                 52d   v1.22.0
server85   Ready    <none>                 52d   v1.22.0
server86   Ready    <none>                 52d   v1.22.0
</pre>

h3. Removing / draining a node

Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:

<pre>
kubectl drain --delete-emptydir-data --ignore-daemonsets serverXX
</pre>

h3. Re-adding a node after draining

<pre>
kubectl uncordon serverXX
</pre>

h3. (Re-)joining worker nodes after creating the cluster

* We need to have an up-to-date token
* We use different join commands for the workers and control plane nodes

Generating the join command on an existing control plane node:

<pre>
kubeadm token create --print-join-command
</pre>

h3. (Re-)joining control plane nodes after creating the cluster

* We generate the token again
* We upload the certificates
* We need to combine/create the join command for the control plane node

Example session:

<pre>
% kubeadm token create --print-join-command
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash

% kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
CERTKEY

# Then we use these two outputs on the joining node:

kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
</pre>

Commands to be used on a control plane node:

<pre>
kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs
</pre>

Commands to be used on the joining node:

<pre>
JOINCOMMAND --control-plane --certificate-key CERTKEY
</pre>

SEE ALSO

* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/

h3. Fixing outdated certificates

If the upgrade process is delayed for too long, the certificates are not renewed and will expire. They can be renewed via:

<pre>
kubeadm certs renew all
</pre>

Components need to be (manually) restarted afterwards.

h3. How to fix etcd does not start when rejoining a kubernetes cluster as a control plane

If during the above step etcd does not come up, @kubeadm join@ can hang as follows:

<pre>
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
</pre>

Then the problem is likely that the etcd server is still a member of the cluster. We first need to remove it from the etcd cluster, then the join works.

To fix this we do:

* Find a working etcd pod
* Find the etcd members / member list
* Remove the etcd member that we want to re-join to the cluster

<pre>
# Find the etcd pods
kubectl -n kube-system get pods -l component=etcd,tier=control-plane

# Get the list of etcd servers with the member id
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list

# Remove the member
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
</pre>

Sample session:

<pre>
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
NAME            READY   STATUS    RESTARTS     AGE
etcd-server63   1/1     Running   0            3m11s
etcd-server65   1/1     Running   3            7d2h
etcd-server83   1/1     Running   8 (6d ago)   7d2h
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false

[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77
</pre>

SEE ALSO

* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster

h4. Updating the members

1) get a live member:

<pre>
% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
NAME            READY   STATUS    RESTARTS   AGE
etcd-server67   1/1     Running   1          185d
etcd-server69   1/1     Running   1          185d
etcd-server71   1/1     Running   2          185d
[20:57] sun:~% 
</pre>

2) get the member list

* in this case via crictl, as the api does not work correctly anymore
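
A possible sketch for this step (assuming etcd runs as a crio-managed container and using the same certificate paths as in the section above; adjust the container id):

<pre>
# find the etcd container on a control plane node
crictl ps --name etcd

# run etcdctl inside that container to list the members
crictl exec -it CONTAINERID etcdctl --endpoints '[::1]:2379' \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key member list
</pre>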

3) update

<pre>
etcdctl member update MEMBERID --peer-urls=https://[...]:2380
</pre>

h3. Node labels (adding, showing, removing)

Listing the labels:

<pre>
kubectl get nodes --show-labels
</pre>

Adding labels:

<pre>
kubectl label nodes LIST-OF-NODES label1=value1
</pre>

For instance:

<pre>
kubectl label nodes router2 router3 hosttype=router
</pre>

Selecting nodes in pods:

<pre>
apiVersion: v1
kind: Pod
...
spec:
  nodeSelector:
    hosttype: router
</pre>

Removing labels by adding a minus at the end of the label name:

<pre>
kubectl label node <nodename> <labelname>-
</pre>

For instance:

<pre>
kubectl label nodes router2 router3 hosttype-
</pre>

SEE ALSO

* https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
* https://stackoverflow.com/questions/34067979/how-to-delete-a-node-label-by-command-and-api

h3. Listing all pods on a node

<pre>
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=serverXX
</pre>

Found on https://stackoverflow.com/questions/62000559/how-to-list-all-the-pods-running-in-a-particular-worker-node-by-executing-a-comm

h3. Hardware Maintenance using ungleich-hardware

Use the following manifest and replace the HOST with the actual host:

<pre>
apiVersion: v1
kind: Pod
metadata:
  name: ungleich-hardware-HOST
spec:
  containers:
  - name: ungleich-hardware
    image: ungleich/ungleich-hardware:0.0.5
    args:
    - sleep
    - "1000000"
    volumeMounts:
      - mountPath: /dev
        name: dev
    securityContext:
      privileged: true
  nodeSelector:
    kubernetes.io/hostname: "HOST"

  volumes:
    - name: dev
      hostPath:
        path: /dev
</pre>

Also see: [[The_ungleich_hardware_maintenance_guide]]

h3. Triggering a cronjob / creating a job from a cronjob

To test a cronjob, we can create a job from a cronjob:

<pre>
kubectl create job --from=cronjob/volume2-daily-backup volume2-manual
</pre>

This creates a job volume2-manual based on the cronjob volume2-daily-backup.

h3. su-ing into a user that has nologin shell set

Often users have nologin set as their shell inside the container. To be able to execute maintenance commands within the container, we can use @su -s /bin/sh@ like this:

<pre>
su -s /bin/sh -c '/path/to/your/script' testuser
</pre>

Found on https://serverfault.com/questions/351046/how-to-run-command-as-user-who-has-usr-sbin-nologin-as-shell

h3. How to print a secret value

Assuming you want the "password" item from a secret, use:

<pre>
kubectl get secret SECRETNAME -o jsonpath="{.data.password}" | base64 -d; echo ""
</pre>

h3. Fixing the "ImageInspectError"

If you see this problem:

<pre>
# kubectl get pods
NAME                                                       READY   STATUS                   RESTARTS   AGE
bird-router-server137-bird-767f65bb47-g4xsh                0/1     Init:ImageInspectError   0          77d
bird-router-server137-openvpn-server120-5c987b7ffb-cn9xf   0/1     ImageInspectError        1          159d
bird-router-server137-unbound-5c6f5d4bb6-cxbpr             0/1     ImageInspectError        1          159d
</pre>

Fixes so far:

* correct registries.conf

h3. Automatic cleanup of images

* options to kubelet

<pre>
  --image-gc-high-threshold=90: The percent of disk usage after which image garbage collection is always run. Default: 90%
  --image-gc-low-threshold=80: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Default: 80%
</pre>
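
On newer kubelets these flags are typically set via the kubelet configuration file instead; a minimal sketch of the equivalent @KubeletConfiguration@ fields (values are the defaults mentioned above):

<pre>
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# image garbage collection thresholds (percent of disk usage)
imageGCHighThresholdPercent: 90
imageGCLowThresholdPercent: 80
</pre>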

h3. How to upgrade a kubernetes cluster

h4. General

* Should be done every X months to stay up-to-date
** X probably something like 3-6
* kubeadm based clusters
* Needs specific kubeadm versions for upgrade
* Follow instructions on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* Finding releases: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG

h4. Getting a specific kubeadm or kubelet version

<pre>
RELEASE=v1.22.17
RELEASE=v1.23.17
RELEASE=v1.24.9
RELEASE=v1.25.9
RELEASE=v1.26.6
RELEASE=v1.27.2

ARCH=amd64

curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
chmod u+x kubeadm kubelet
</pre>

h4. Steps

* kubeadm upgrade plan
** On one control plane node
* kubeadm upgrade apply vXX.YY.ZZ
** On one control plane node
* kubeadm upgrade node
** On all other control plane nodes
** On all worker nodes afterwards

Repeat for all control plane nodes. Then upgrade the kubelet on all other nodes via the package manager.
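
Put together, one upgrade round looks roughly like this (a sketch; the versioned kubeadm binary name follows the convention used in the sections below and the Alpine commands are the ones from the 1.31 notes):

<pre>
# on the first control plane node
/usr/local/bin/kubeadm-vX.YY.ZZ upgrade plan
/usr/local/bin/kubeadm-vX.YY.ZZ upgrade apply -y vX.YY.ZZ

# on the other control plane nodes, afterwards on the workers
/usr/local/bin/kubeadm-vX.YY.ZZ upgrade node

# then upgrade and restart the kubelet (Alpine based nodes)
apk upgrade -a
/etc/init.d/kubelet restart
</pre>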

h4. Upgrading to 1.22.17

* https://v1-22.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* Need to create a kubeadm config map
** f.i. using the following
** @/usr/local/bin/kubeadm-v1.22.17   upgrade --config kubeadm.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins,CoreDNSMigration apply -y v1.22.17@
* Done for p6 on 2023-10-04

h4. Upgrading to 1.23.17

* https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* No special notes
* Done for p6 on 2023-10-04

h4. Upgrading to 1.24.17

* https://v1-24.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* No special notes
* Done for p6 on 2023-10-04

h4. Upgrading to 1.25.14

* https://v1-25.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* No special notes
* Done for p6 on 2023-10-04

h4. Upgrading to 1.26.9

* https://v1-26.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* No special notes
* Done for p6 on 2023-10-04

h4. Upgrading to 1.27

* https://v1-27.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
* kubelet will not start anymore
* reason: @"command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"@
* /var/lib/kubelet/kubeadm-flags.env contains that parameter
* remove it, start kubelet

h4. Upgrading to 1.28

* https://v1-28.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

h4. Upgrading to 1.29

* Done for many clusters around 2024-01-10
* Unsure if it was properly released
* https://v1-29.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

h4. Upgrading to 1.31

* The cluster needs to be updated FIRST, before kubelet/the OS

Otherwise you run into errors in the pod like this:

<pre>
  Warning  Failed     11s (x3 over 12s)  kubelet            Error: services have not yet been read at least once, cannot construct envvars
</pre>

And the resulting pod state is:

<pre>
Init:CreateContainerConfigError
</pre>

Fix:

* find an old 1.30 kubelet package, downgrade kubelet, upgrade the control plane, upgrade kubelet again

<pre>
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-1.30.0-r3.apk
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-openrc-1.30.0-r3.apk
apk add ./kubelet-1.30.0-r3.apk ./kubelet-openrc-1.30.0-r3.apk
/etc/init.d/kubelet restart
</pre>

Then upgrade:

<pre>
/usr/local/bin/kubeadm-v1.31.3   upgrade apply -y v1.31.3
</pre>

Then re-upgrade the kubelet:

<pre>
apk upgrade -a
</pre>

h4. Upgrade to crio 1.27: missing crun

Error message:

<pre>
level=fatal msg="validating runtime config: runtime validation: \"crun\" not found in $PATH: exec: \"crun\": executable file not found in $PATH"
</pre>

Fix:

<pre>
apk add crun
</pre>

h2. Reference CNI

* Mainly "stupid", but effective plugins
* Main documentation on https://www.cni.dev/plugins/current/
* Plugins
** bridge
*** Can create the bridge on the host
*** But seems not to be able to add host interfaces to it as well
*** Has support for vlan tags
** vlan
*** creates vlan tagged sub interface on the host
*** "It's a 1:1 mapping (i.e. no bridge in between)":https://github.com/k8snetworkplumbingwg/multus-cni/issues/569
** host-device
*** moves the interface from the host into the container
*** very easy for physical connections to containers
** ipvlan
*** "virtualisation" of a host device
*** routing based on IP
*** Same MAC for everyone
*** Cannot reach the master interface
** macvlan
*** With mac addresses
*** Supports various modes (to be checked)
** ptp ("point to point")
*** Creates a host device and connects it to the container
** win*
*** Windows implementations

h2. Calico CNI

h3. Calico Installation

* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
* Check the tags on https://github.com/projectcalico/calico/tags for the latest release
* This has the following advantages:
** Easy to upgrade
** Does not require the OS to configure IPv6/dual stack settings as the tigera operator figures out things on its own
* As of 2025-05-28, tigera should be installed in the *tigera-operator* namespace

Usually plain calico can be installed directly using:

<pre>
VERSION=v3.30.0

helm repo add projectcalico https://docs.projectcalico.org/charts
helm repo update
helm upgrade --install calico projectcalico/tigera-operator --version $VERSION --namespace tigera-operator --create-namespace
</pre>

h3. Calico upgrade

* As of 3.30 or so, CRDs need to be applied manually beforehand

<pre>
VERSION=v3.30.1

kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/projectcalico/calico/${VERSION}/manifests/operator-crds.yaml
helm repo update
helm upgrade --install --namespace tigera-operator calico projectcalico/tigera-operator --version $VERSION --create-namespace
</pre>

h3. Installing calicoctl

* General installation instructions, including binary download: https://projectcalico.docs.tigera.io/maintenance/clis/calicoctl/install

To be able to manage and configure calico, we need to "install calicoctl (we choose the version as a pod)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod

<pre>
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
</pre>

Or version specific:

<pre>
kubectl apply -f https://github.com/projectcalico/calico/blob/v3.20.4/manifests/calicoctl.yaml

# For 3.22
kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.22/manifests/calicoctl.yaml
</pre>

And making it more easily accessible via an alias:

<pre>
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
</pre>

h3. Calico configuration

By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp with an upstream router to propagate podcidr and servicecidr.

Default settings in our infrastructure:

* We use a full-mesh using the @nodeToNodeMeshEnabled: true@ option
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ecmp)
* We use private ASNs for k8s clusters
* We do *not* use any overlay

After installing calico and calicoctl the last step of the installation is usually:

<pre>
calicoctl create -f - < calico-bgp.yaml
</pre>

A sample BGP configuration:

<pre>
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  asNumber: 65534
  serviceClusterIPs:
  - cidr: 2a0a:e5c0:10:3::/108
  serviceExternalIPs:
  - cidr: 2a0a:e5c0:10:3::/108
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: router1-place10
spec:
  peerIP: 2a0a:e5c0:10:1::50
  asNumber: 213081
  keepOriginalNextHop: true
</pre>

h3. Get installed calico version

* might be in calico or tigera namespace

<pre>
helm ls -A | grep calico
</pre>

h2. Cilium CNI (experimental)

h3. Status

*NO WORKING CILIUM CONFIGURATION FOR IPV6 only modes*

h3. Latest error

It seems cilium does not run on IPv6 only hosts:

<pre>
level=info msg="Validating configured node address ranges" subsys=daemon
level=fatal msg="postinit failed" error="external IPv4 node address could not be derived, please configure via --ipv4-node" subsys=daemon
level=info msg="Starting IP identity watcher" subsys=ipcache
</pre>

It crashes after that log entry.

h3. BGP configuration

* The cilium-operator will not start without a correct configmap being present beforehand (see error message below)
* Creating the bgp config beforehand as a configmap is thus required.

The error one gets without the configmap present:

Pods are hanging with:

<pre>
cilium-bpqm6                       0/1     Init:0/4            0             9s
cilium-operator-5947d94f7f-5bmh2   0/1     ContainerCreating   0             9s
</pre>

The error message in the cilium-operator is:

<pre>
Events:
  Type     Reason       Age                From               Message
  ----     ------       ----               ----               -------
  Normal   Scheduled    80s                default-scheduler  Successfully assigned kube-system/cilium-operator-5947d94f7f-lqcsp to server56
  Warning  FailedMount  16s (x8 over 80s)  kubelet            MountVolume.SetUp failed for volume "bgp-config-path" : configmap "bgp-config" not found
</pre>

A correct bgp config looks like this:

<pre>
apiVersion: v1
kind: ConfigMap
metadata:
  name: bgp-config
  namespace: kube-system
data:
  config.yaml: |
    peers:
      - peer-address: 2a0a:e5c0::46
        peer-asn: 209898
        my-asn: 65533
      - peer-address: 2a0a:e5c0::47
        peer-asn: 209898
        my-asn: 65533
    address-pools:
      - name: default
        protocol: bgp
        addresses:
          - 2a0a:e5c0:0:14::/64
</pre>

h3. Installation

Adding the repo:

<pre>
helm repo add cilium https://helm.cilium.io/
helm repo update
</pre>

Installing + configuring cilium:

<pre>
ipv6pool=2a0a:e5c0:0:14::/112

version=1.12.2

helm upgrade --install cilium cilium/cilium --version $version \
  --namespace kube-system \
  --set ipv4.enabled=false \
  --set ipv6.enabled=true \
  --set enableIPv6Masquerade=false \
  --set bgpControlPlane.enabled=true

#  --set ipam.operator.clusterPoolIPv6PodCIDRList=$ipv6pool

# Old style bgp?
#   --set bgp.enabled=true --set bgp.announce.podCIDR=true \

# Show possible configuration options
helm show values cilium/cilium
</pre>

Using a /64 for ipam.operator.clusterPoolIPv6PodCIDRList fails with:

<pre>
level=fatal msg="Unable to init cluster-pool allocator" error="unable to initialize IPv6 allocator New CIDR set failed; the node CIDR size is too big" subsys=cilium-operator-generic
</pre>

See also https://github.com/cilium/cilium/issues/20756

Seems a /112 is actually working.

h3. Kernel modules

Cilium requires the following modules to be loaded on the host (not loaded by default):

<pre>
modprobe ip6table_raw
modprobe ip6table_filter
</pre>

h3. Interesting helm flags

* autoDirectNodeRoutes
* bgpControlPlane.enabled = true

h3. SEE ALSO

* https://docs.cilium.io/en/v1.12/helm-reference/

h2. Multus

* https://github.com/k8snetworkplumbingwg/multus-cni
* Installing a deployment w/ CRDs

<pre>
VERSION=v4.0.1

kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-crio.yml
</pre>
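
Once the CRDs are installed, additional interfaces are described as @NetworkAttachmentDefinition@ objects that reference one of the reference CNI plugins above; a sketch (interface name, vlan id and IPAM are placeholders):

<pre>
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan-example
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "vlan",
      "master": "eth0",
      "vlanId": 100,
      "ipam": { "type": "static" }
    }
</pre>

A pod then requests the extra interface via the annotation @k8s.v1.cni.cncf.io/networks: vlan-example@.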

h2. ArgoCD

h3. Argocd Installation

* See https://argo-cd.readthedocs.io/en/stable/

As there is no configuration management present yet, argocd is installed using

<pre>
kubectl create namespace argocd

# OR: latest stable
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# OR Specific Version
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.3.2/manifests/install.yaml
</pre>

h3. Get the argocd credentials

<pre>
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
</pre>

h3. Accessing argocd

In regular IPv6 clusters:

* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN

In legacy IPv4 clusters:

<pre>
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
</pre>

* Navigate to https://localhost:8080

h3. Using the argocd webhook to trigger changes

* To trigger changes, POST json to https://argocd.example.com/api/webhook
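
Normally the git server (gitea) is configured to call this endpoint on push; to trigger it manually, something along these lines can be used (a sketch: the payload mimics a minimal GitHub-style push event and the repository URL is a placeholder):

<pre>
curl -X POST https://argocd.example.com/api/webhook \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: push" \
  -d '{"ref": "refs/heads/master", "repository": {"html_url": "https://code.ungleich.ch/ungleich-intern/k8s-config"}}'
</pre>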

h3. Deploying an application

* Applications are deployed via git towards gitea (code.ungleich.ch) and then pulled by argo
* Always include the *redmine-url* pointing to the (customer) ticket
** Also add the support-url if it exists

Application sample:

<pre>
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gitea-CUSTOMER
  namespace: argocd
spec:
  destination:
    namespace: default
    server: 'https://kubernetes.default.svc'
  source:
    path: apps/prod/gitea
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
    targetRevision: HEAD
    helm:
      parameters:
        - name: storage.data.storageClass
          value: rook-ceph-block-hdd
        - name: storage.data.size
          value: 200Gi
        - name: storage.db.storageClass
          value: rook-ceph-block-ssd
        - name: storage.db.size
          value: 10Gi
        - name: storage.letsencrypt.storageClass
          value: rook-ceph-block-hdd
        - name: storage.letsencrypt.size
          value: 50Mi
        - name: letsencryptStaging
          value: 'no'
        - name: fqdn
          value: 'code.verua.online'
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
  info:
    - name: 'redmine-url'
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
    - name: 'support-url'
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
</pre>

h2. Helm related operations and conventions

We use helm charts extensively.

* In production, they are managed via argocd
* In development, helm charts can be developed and deployed manually using the helm utility.

h3. Installing a helm chart

One can use the usual pattern of

<pre>
helm install <releasename> <chartdirectory>
</pre>

However often you want to reinstall/update when testing helm charts. The following pattern is "better", because it allows you to reinstall, if it is already installed:

<pre>
helm upgrade --install <releasename> <chartdirectory>
</pre>

h3. Naming services and deployments in helm charts [Application labels]

* We always have {{ .Release.Name }} to identify the current "instance"
* Deployments:
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ... (see the sketch below)
* See more about standard labels on
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
** https://helm.sh/docs/chart_best_practices/labels/
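
A minimal sketch of how this looks in a chart template (the nginx deployment is illustrative):

<pre>
apiVersion: apps/v1
kind: Deployment
metadata:
  # instance specific name, based on the release
  name: {{ .Release.Name }}-nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
</pre>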

h3. Show all versions of a helm chart

<pre>
helm search repo -l repo/chart
</pre>

For example:

<pre>
% helm search repo -l projectcalico/tigera-operator
NAME                         	CHART VERSION	APP VERSION	DESCRIPTION
projectcalico/tigera-operator	v3.23.3      	v3.23.3    	Installs the Tigera operator for Calico
projectcalico/tigera-operator	v3.23.2      	v3.23.2    	Installs the Tigera operator for Calico
....
</pre>

h3. Show possible values of a chart

<pre>
helm show values <repo/chart>
</pre>

Example:

<pre>
helm show values ingress-nginx/ingress-nginx
</pre>

h3. Show all possible charts in a repo

<pre>
helm search repo REPO
</pre>

h3. Download a chart

For instance for checking it out locally. Use:

<pre>
helm pull <repo/chart>
</pre>

h2. Rook + Ceph

h3. Installation

* Usually directly via argocd

h3. Executing ceph commands

Using the ceph-tools pod as follows:

<pre>
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
</pre>

h3. Inspecting the logs of a specific server

<pre>
# Get the related pods
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
...

# Inspect the logs of a specific pod
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx
</pre>

h3. Inspecting the logs of the rook-ceph-operator

<pre>
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
</pre>

h3. (Temporarily) Disabling the rook-operator

* first disable the sync in argocd
* then scale it down

<pre>
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
</pre>

When done with the work/maintenance, re-enable sync in argocd.
The following command is thus strictly speaking not required, as argocd will fix it on its own:

<pre>
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
</pre>

h3. Restarting the rook operator

<pre>
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
</pre>

h3. Triggering server prepare / adding new osds

The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re scan", simply delete that pod:

<pre>
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
</pre>

This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.

h3. Removing an OSD

* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
* Then delete the related deployment

Set the OSD id in the osd-purge.yaml and apply it. The OSD should be down before.

<pre>
apiVersion: batch/v1
kind: Job
metadata:
  name: rook-ceph-purge-osd
  namespace: rook-ceph # namespace:cluster
  labels:
    app: rook-ceph-purge-osd
spec:
  template:
    metadata:
      labels:
        app: rook-ceph-purge-osd
    spec:
      serviceAccountName: rook-ceph-purge-osd
      containers:
        - name: osd-removal
          image: rook/ceph:master
          # TODO: Insert the OSD ID in the last parameter that is to be removed
          # The OSD IDs are a comma-separated list. For example: "0" or "0,2".
          # If you want to preserve the OSD PVCs, set `--preserve-pvc true`.
          #
          # A --force-osd-removal option is available if the OSD should be destroyed even though the
          # removal could lead to data loss.
          args:
            - "ceph"
            - "osd"
            - "remove"
            - "--preserve-pvc"
            - "false"
            - "--force-osd-removal"
            - "false"
            - "--osd-ids"
            - "SETTHEOSDIDHERE"
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: ROOK_MON_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  key: data
                  name: rook-ceph-mon-endpoints
            - name: ROOK_CEPH_USERNAME
              valueFrom:
                secretKeyRef:
                  key: ceph-username
                  name: rook-ceph-mon
            - name: ROOK_CEPH_SECRET
              valueFrom:
                secretKeyRef:
                  key: ceph-secret
                  name: rook-ceph-mon
            - name: ROOK_CONFIG_DIR
              value: /var/lib/rook
            - name: ROOK_CEPH_CONFIG_OVERRIDE
              value: /etc/rook/config/override.conf
            - name: ROOK_FSID
              valueFrom:
                secretKeyRef:
                  key: fsid
                  name: rook-ceph-mon
            - name: ROOK_LOG_LEVEL
              value: DEBUG
          volumeMounts:
            - mountPath: /etc/ceph
              name: ceph-conf-emptydir
            - mountPath: /var/lib/rook
              name: rook-config
      volumes:
        - emptyDir: {}
          name: ceph-conf-emptydir
        - emptyDir: {}
          name: rook-config
      restartPolicy: Never
</pre>

Deleting the deployment:

<pre>
[18:05] bridge:~% kubectl -n rook-ceph delete deployment rook-ceph-osd-6
deployment.apps "rook-ceph-osd-6" deleted
</pre>

h3. Placement of mons/osds/etc.

See https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#placement-configuration-settings

h3. Setting up and managing S3 object storage

h4. Endpoints

| Location | Endpoint |
| p5 | https://s3.k8s.place5.ungleich.ch |

h4. Setting up a storage class

* This will store the buckets of a specific customer

Similar to this:

<pre>
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ungleich-archive-bucket-sc
  namespace: rook-ceph
provisioner: rook-ceph.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: place5
  objectStoreNamespace: rook-ceph
</pre>

h4. Setting up the Bucket

Similar to this:

<pre>
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ungleich-archive-bucket-claim
  namespace: rook-ceph
spec:
  generateBucketName: ungleich-archive-ceph-bkt
  storageClassName: ungleich-archive-bucket-sc
  additionalConfig:
    # To set a quota for the OBC
    #maxObjects: "1000"
    maxSize: "100G"
</pre>

* See also: https://rook.io/docs/rook/latest-release/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/#obc-custom-resource

h4. Getting the credentials for the bucket

* Get "public" information from the configmap
* Get the secret from the secret

<pre>
name=BUCKETNAME
s3host=s3.k8s.place5.ungleich.ch
endpoint=https://${s3host}

cm=$(kubectl -n rook-ceph get configmap -o yaml ${name}-bucket-claim)
sec=$(kubectl -n rook-ceph get secrets -o yaml ${name}-bucket-claim)
export AWS_ACCESS_KEY_ID=$(echo $sec | yq .data.AWS_ACCESS_KEY_ID | base64 -d ; echo "")
export AWS_SECRET_ACCESS_KEY=$(echo $sec | yq .data.AWS_SECRET_ACCESS_KEY | base64 -d ; echo "")

bucket_name=$(echo $cm | yq .data.BUCKET_NAME)
</pre>

h5. Access via s3cmd

Note, the following does *NOT* work:

<pre>
s3cmd --host ${s3host}:443 --access_key=${AWS_ACCESS_KEY_ID} --secret_key=${AWS_SECRET_ACCESS_KEY} ls s3://${name}
</pre>

h5. Access via s4cmd

<pre>
s4cmd --endpoint-url ${endpoint} --access-key=${AWS_ACCESS_KEY_ID} --secret-key=${AWS_SECRET_ACCESS_KEY} ls
</pre>

h5. Access via s5cmd

* Uses environment variables

<pre>
s5cmd --endpoint-url ${endpoint} ls
</pre>

h2. Ingress + Cert Manager

* We deploy "nginx-ingress":https://docs.nginx.com/nginx-ingress-controller/ to get an ingress
* We deploy "cert-manager":https://cert-manager.io/ to handle certificates
* We independently deploy @ClusterIssuer@ to allow the cert-manager app to deploy and the issuer to be created once the CRDs from cert manager are in place (see the sketch below)
1322
h3. IPv4 reachability 
1323
1324
The ingress is by default IPv6 only. To make it reachable from the IPv4 world, get its IPv6 address and configure a NAT64 mapping in Jool.
1325
1326
Steps:
1327
1328
h4. Get the ingress IPv6 address
1329
1330
Use @kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''@
1331
1332
Example:
1333
1334
<pre>
1335
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''
1336
2a0a:e5c0:10:1b::ce11
1337
</pre>
1338
1339
h4. Add NAT64 mapping
1340
1341
* Update the __dcl_jool_siit cdist type
1342
* Record the two IPs (IPv6 and IPv4)
1343
* Configure all routers
1344
1345
1346
h4. Add DNS record
1347
1348
To use the ingress capable as a CNAME destination, create an "ingress" DNS record, such as:
1349
1350
<pre>
1351
; k8s ingress for dev
1352
dev-ingress                 AAAA 2a0a:e5c0:10:1b::ce11
1353
dev-ingress                 A 147.78.194.23
1354
1355
</pre> 
1356
1357
h4. Add supporting wildcard DNS
1358
1359
If you plan to add various sites under a specific domain, we can add a wildcard DNS entry, such as *.k8s-dev.django-hosting.ch:
1360
1361
<pre>
1362
*.k8s-dev         CNAME dev-ingress.ungleich.ch.
1363
</pre>

h2. Harbor

* We use "Harbor":https://goharbor.io/ as an image registry for our own images. Internal app reference: apps/prod/harbor.
* The admin password is in the password store, it is Harbor12345 by default
* At the moment harbor only authenticates against the internal ldap tree

h3. LDAP configuration

* The url needs to be ldaps://...
* uid = uid
* the rest is standard

h2. Monitoring / Prometheus

* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/

Access via ...

* http://prometheus-k8s.monitoring.svc:9090
* http://grafana.monitoring.svc:3000
* http://alertmanager.monitoring.svc:9093

h3. Prometheus Options

* "helm/kube-prometheus-stack":https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
** Includes dashboards and co.
* "manifest based kube-prometheus":https://github.com/prometheus-operator/kube-prometheus
** Includes dashboards and co.
* "Prometheus Operator (mainly CRD manifests)":https://github.com/prometheus-operator/prometheus-operator

h3. Grafana default password

* If not changed: admin / @prom-operator@
** Can be changed via:

<pre>
    helm:
      values: |-
        configurations: |-
          grafana:
            adminPassword: "..."
</pre>

h2. Nextcloud

h3. How to get the nextcloud credentials

* The initial username is set to "nextcloud"
* The password is autogenerated and saved in a kubernetes secret

<pre>
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo ""
</pre>

h3. How to fix "Access through untrusted domain"

* Nextcloud stores the initial domain configuration
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
* To fix, edit /var/www/html/config/config.php and correct the domain (see the sketch below)
* Then delete the pods
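
The relevant part of config.php is the @trusted_domains@ array; a sketch (the domain is a placeholder):

<pre>
  'trusted_domains' =>
  array (
    0 => 'nextcloud.example.ch',
  ),
</pre>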

h3. Running occ commands inside the nextcloud container

* Find the pod in the right namespace

Exec:

<pre>
su www-data -s /bin/sh -c ./occ
</pre>

* -s /bin/sh is needed as the default shell is set to /bin/false

h4. Rescanning files

* If files have been added without nextcloud's knowledge

<pre>
su www-data -s /bin/sh -c "./occ files:scan --all"
</pre>

h2. Sealed Secrets

* install kubeseal

<pre>
KUBESEAL_VERSION='0.23.0'
wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz"
tar -xvzf kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
</pre>

* fetch the public certificate of the sealed-secrets controller

<pre>
kubeseal --fetch-cert > /tmp/public-key-cert.pem
</pre>

* create the secret

<pre>
ex)
apiVersion: v1
kind: Secret
metadata:
  name: Release.Name-postgres-config
  annotations:
    secret-generator.v1.mittwald.de/autogenerate: POSTGRES_PASSWORD
    hosting: Release.Name
  labels:
    app.kubernetes.io/instance: Release.Name
    app.kubernetes.io/component: postgres
stringData:
  POSTGRES_USER: postgresUser
  POSTGRES_DB: postgresDBName
  POSTGRES_INITDB_ARGS: "--no-locale --encoding=UTF8"
</pre>

* convert secret.yaml to sealed-secret.yaml

<pre>
kubeseal -n <namespace> --cert=/tmp/public-key-cert.pem --format=yaml < ./secret.yaml  > ./sealed-secret.yaml
</pre>

* use the sealed-secret.yaml in the helm chart directory

* see tickets: #11989, #12120

h2. Infrastructure versions

h3. ungleich kubernetes infrastructure v5 (2021-10)

Clusters are configured / set up in this order:

* Bootstrap via kubeadm
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
** "rook for storage via argocd":https://rook.io/
** haproxy as IPv4-to-IPv6 proxy into the IPv6-only cluster, via argocd
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot

h3. ungleich kubernetes infrastructure v4 (2021-09)

* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
* The rook operator is still being installed via helm

h3. ungleich kubernetes infrastructure v3 (2021-07)

* rook is now installed via helm via argocd instead of directly via manifests

h3. ungleich kubernetes infrastructure v2 (2021-05)

* Replaced fluxv2 from ungleich k8s v1 with argocd
** argocd can apply helm templates directly without needing to go through Chart releases
* We are also using argoflow for build flows
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building

h3. ungleich kubernetes infrastructure v1 (2021-01)

We are using the following components:

* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
** Needed for basic networking
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
** Needed so that secrets are not stored in the git repository, but only in the cluster
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
** Needed to get letsencrypt certificates for services
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
** rbd for almost everything, *ReadWriteOnce*
** cephfs for smaller things, multi access *ReadWriteMany*
** Needed for providing persistent storage
* "flux v2":https://fluxcd.io/
** Needed to manage resources automatically