
1 22 Nico Schottelius
h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual
2 1 Nico Schottelius
3 3 Nico Schottelius
{{toc}}
4
5 1 Nico Schottelius
h2. Status
6
7 211 Nico Schottelius
This document is **production**.
8
This document is the ungleich kubernetes infrastructure overview as well as the ungleich kubernetes manual.
9 1 Nico Schottelius
10 10 Nico Schottelius
h2. k8s clusters
11
12 123 Nico Schottelius
| Cluster            | Purpose/Setup     | Maintainer | Master(s)                     | argo                                                   | v4 http proxy | last verified |
13
| c0.k8s.ooo         | Dev               | -          | UNUSED                        |                                                        |               |    2021-10-05 |
14
| c1.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
15
| c2.k8s.ooo         | Dev p7 HW         | Nico       | server47 server53 server54    | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo     |               |    2021-10-05 |
16
| c3.k8s.ooo         | retired           | -          | -                             |                                                        |               |    2021-10-05 |
17
| c4.k8s.ooo         | Dev2 p7 HW        | Jin-Guk    | server52 server53 server54    |                                                        |               |             - |
18
| c5.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
19
| c6.k8s.ooo         | Dev p6 VM Jin-Guk | Jin-Guk    |                               |                                                        |               |               |
20
| [[p5.k8s.ooo]]     | production        |            | server34 server36 server38    | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo     | -             |               |
21
| [[p5-cow.k8s.ooo]] | production        | Nico       | server47 server51 server55    | "argo":https://argocd-server.argocd.svc.p5-cow.k8s.ooo |               |    2022-08-27 |
22
| [[p6.k8s.ooo]]     | production        |            | server67 server69 server71    | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo     | 147.78.194.13 |    2021-10-05 |
23 184 Nico Schottelius
| [[p6-cow.k8s.ooo]] | production        |            | server134 server135 server136 | "argo":https://argocd-server.argocd.svc.p6in10.k8s.ooo | ?             |    2023-05-17 |
24 177 Nico Schottelius
| [[p10.k8s.ooo]]    | production        |            | server131 server132 server133 | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo    | 147.78.194.12 |    2021-10-05 |
25 123 Nico Schottelius
| [[k8s.ge.nau.so]]  | development       |            | server107 server108 server109 | "argo":https://argocd-server.argocd.svc.k8s.ge.nau.so  |               |               |
26
| [[dev.k8s.ooo]]    | development       |            | server110 server111 server112 | "argo":https://argocd-server.argocd.svc.dev.k8s.ooo    | -             |    2022-07-08 |
27 164 Nico Schottelius
| [[r1r2p15k8sooo|r1.p15.k8s.ooo]] | production | Nico | server120 | | | 2022-10-30 |
28
| [[r1r2p15k8sooo|r2.p15.k8s.ooo]] | production | Nico | server121 | | | 2022-09-06 |
29 162 Nico Schottelius
| [[r1r2p10k8sooo|r1.p10.k8s.ooo]] | production | Nico | server122 | | | 2022-10-30 |
30
| [[r1r2p10k8sooo|r2.p10.k8s.ooo]] | production | Nico | server123 | | | 2022-10-15 |
31
| [[r1r2p5k8sooo|r1.p5.k8s.ooo]] | production | Nico | server137 | | | 2022-10-30 |
32
| [[r1r2p5k8sooo|r2.p5.k8s.ooo]] | production | Nico | server138 | | | 2022-10-30 |
33
| [[r1r2p6k8sooo|r1.p6.k8s.ooo]] | production | Nico | server139 | | | 2022-10-30 |
34
| [[r1r2p6k8sooo|r2.p6.k8s.ooo]] | production | Nico | server140 | | | 2022-10-30 |
35 21 Nico Schottelius
36 1 Nico Schottelius
h2. General architecture and components overview
37
38
* All k8s clusters are IPv6 only
39
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
40
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
41 18 Nico Schottelius
** Private configurations are found in the **k8s-config** repository
42 1 Nico Schottelius
43
h3. Cluster types
44
45 28 Nico Schottelius
| **Type/Feature**            | **Development**                | **Production**         |
46
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
47
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
48
| Separation of control plane | optional                       | recommended            |
49
| Persistent storage          | required                       | required               |
50
| Number of storage monitors  | 3                              | 5                      |
51 1 Nico Schottelius
52 43 Nico Schottelius
h2. General k8s operations
53 1 Nico Schottelius
54 46 Nico Schottelius
h3. Cheat sheet / great external references
55
56
* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/
57
58 214 Nico Schottelius
Some examples:
59
60
h4. Use kubectl to print only the node names
61
62
<pre>
63
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
64
</pre>
65
66
This can easily be used in a shell loop like this:
67
68
<pre>
69
for host in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do echo $host; ssh root@${host} uptime; done
70
</pre>
71
72 117 Nico Schottelius
h3. Allowing work to be scheduled on the control plane / removing node taints
73 69 Nico Schottelius
74
* Mostly for single node / test / development clusters
75
* Just remove the master taint as follows
76
77
<pre>
78
kubectl taint nodes --all node-role.kubernetes.io/master-
79 118 Nico Schottelius
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
80 69 Nico Schottelius
</pre>
81 1 Nico Schottelius
82 117 Nico Schottelius
You can check the node taints using @kubectl describe node ...@
83 69 Nico Schottelius
84 208 Nico Schottelius
h3. Adding taints
85
86
* For instance to limit nodes to specific customers
87
88
<pre>
89
kubectl taint nodes serverXX customer=CUSTOMERNAME:NoSchedule
90
</pre>
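
A pod that should still run on such a tainted node needs a matching toleration. A minimal sketch, mirroring the taint above (the customer key/value are placeholders):

<pre>
apiVersion: v1
kind: Pod
...
spec:
  tolerations:
  - key: "customer"
    operator: "Equal"
    value: "CUSTOMERNAME"
    effect: "NoSchedule"
</pre>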
91
92 44 Nico Schottelius
h3. Get the cluster admin.conf
93
94
* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
95
* To be able to administer the cluster you can copy the admin.conf to your local machine
96
* Multi cluster debugging can be very easy if you name the config ~/cX-admin.conf (see the example below)
97
98
<pre>
99
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
100
% export KUBECONFIG=~/c2-admin.conf    
101
% kubectl get nodes
102
NAME       STATUS                     ROLES                  AGE   VERSION
103
server47   Ready                      control-plane,master   82d   v1.22.0
104
server48   Ready                      control-plane,master   82d   v1.22.0
105
server49   Ready                      <none>                 82d   v1.22.0
106
server50   Ready                      <none>                 82d   v1.22.0
107
server59   Ready                      control-plane,master   82d   v1.22.0
108
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
109
server61   Ready                      <none>                 82d   v1.22.0
110
server62   Ready                      <none>                 82d   v1.22.0               
111
</pre>
112
113 18 Nico Schottelius
h3. Installing a new k8s cluster
114 8 Nico Schottelius
115 9 Nico Schottelius
* Decide on the cluster name (usually *cX.k8s.ooo*), X counting upwards
116 28 Nico Schottelius
** Using pXX.k8s.ooo for production clusters of placeXX
117 9 Nico Schottelius
* Use cdist to configure the nodes with requirements like crio
118
* Decide between single or multi node control plane setups (see below)
119 28 Nico Schottelius
** Single control plane suitable for development clusters
120 9 Nico Schottelius
121 28 Nico Schottelius
Typical init procedure:
122 9 Nico Schottelius
123 206 Nico Schottelius
h4. Single control plane:
124
125
<pre>
126
kubeadm init --config bootstrap/XXX/kubeadm.yaml
127
</pre>
128
129
h4. Multi control plane (HA):
130
131
<pre>
132
kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs
133
</pre>
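
The referenced kubeadm.yaml files live in the (private) bootstrap directories. As a rough sketch, such a config may look like this (API version and all values are illustrative, not taken from our actual bootstrap files):

<pre>
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0
clusterName: pXX.k8s.ooo
controlPlaneEndpoint: "pXX-api.k8s.ooo:6443"
networking:
  podSubnet: 2a0a:e5c0:XXXX::/48
  serviceSubnet: 2a0a:e5c0:XXXX:1::/108
</pre>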
134
135 10 Nico Schottelius
136 29 Nico Schottelius
h3. Deleting a pod that is hanging in terminating state
137
138
<pre>
139
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
140
</pre>
141
142
(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)
143
144 42 Nico Schottelius
h3. Listing nodes of a cluster
145
146
<pre>
147
[15:05] bridge:~% kubectl get nodes
148
NAME       STATUS   ROLES                  AGE   VERSION
149
server22   Ready    <none>                 52d   v1.22.0
150
server23   Ready    <none>                 52d   v1.22.2
151
server24   Ready    <none>                 52d   v1.22.0
152
server25   Ready    <none>                 52d   v1.22.0
153
server26   Ready    <none>                 52d   v1.22.0
154
server27   Ready    <none>                 52d   v1.22.0
155
server63   Ready    control-plane,master   52d   v1.22.0
156
server64   Ready    <none>                 52d   v1.22.0
157
server65   Ready    control-plane,master   52d   v1.22.0
158
server66   Ready    <none>                 52d   v1.22.0
159
server83   Ready    control-plane,master   52d   v1.22.0
160
server84   Ready    <none>                 52d   v1.22.0
161
server85   Ready    <none>                 52d   v1.22.0
162
server86   Ready    <none>                 52d   v1.22.0
163
</pre>
164
165 41 Nico Schottelius
h3. Removing / draining a node
166
167
Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:
168
169 1 Nico Schottelius
<pre>
170 103 Nico Schottelius
kubectl drain --delete-emptydir-data --ignore-daemonsets serverXX
171 42 Nico Schottelius
</pre>
172
173
h3. Re-adding a node after draining
174
175
<pre>
176
kubectl uncordon serverXX
177 1 Nico Schottelius
</pre>
178 43 Nico Schottelius
179 50 Nico Schottelius
h3. (Re-)joining worker nodes after creating the cluster
180 49 Nico Schottelius
181
* We need to have an up-to-date token
182
* We use different join commands for the workers and control plane nodes
183
184
Generating the join command on an existing control plane node:
185
186
<pre>
187
kubeadm token create --print-join-command
188
</pre>
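
The printed command is then run as-is on the worker node, for example (token and hash are placeholders):

<pre>
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash
</pre>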
189
190 50 Nico Schottelius
h3. (Re-)joining control plane nodes after creating the cluster
191 1 Nico Schottelius
192 50 Nico Schottelius
* We generate the token again
193
* We upload the certificates
194
* We need to combine/create the join command for the control plane node
195
196
Example session:
197
198
<pre>
199
% kubeadm token create --print-join-command
200
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash 
201
202
% kubeadm init phase upload-certs --upload-certs
203
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
204
[upload-certs] Using certificate key:
205
CERTKEY
206
207
# Then we use these two outputs on the joining node:
208
209
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
210
</pre>
211
212
Commands to be used on a control plane node:
213
214
<pre>
215
kubeadm token create --print-join-command
216
kubeadm init phase upload-certs --upload-certs
217
</pre>
218
219
Commands to be used on the joining node:
220
221
<pre>
222
JOINCOMMAND --control-plane --certificate-key CERTKEY
223
</pre>
224 49 Nico Schottelius
225 51 Nico Schottelius
SEE ALSO
226
227
* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
228
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/
229
230 53 Nico Schottelius
h3. How to fix etcd not starting when rejoining a kubernetes cluster as a control plane
231 52 Nico Schottelius
232
If during the above step etcd does not come up, @kubeadm join@ can hang as follows:
233
234
<pre>
235
[control-plane] Creating static Pod manifest for "kube-apiserver"                                                              
236
[control-plane] Creating static Pod manifest for "kube-controller-manager"                                                     
237
[control-plane] Creating static Pod manifest for "kube-scheduler"                                                              
238
[check-etcd] Checking that the etcd cluster is healthy                                                                         
239
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:37
240
8a]:2379 with maintenance client: context deadline exceeded                                                                    
241
To see the stack trace of this error execute with --v=5 or higher         
242
</pre>
243
244
Then the problem is likely that the rejoining node is still registered as an etcd member. We first need to remove it from the etcd cluster; afterwards the join works.
245
246
To fix this we do:
247
248
* Find a working etcd pod
249
* Find the etcd members / member list
250
* Remove the etcd member that we want to re-join the cluster
251
252
253
<pre>
254
# Find the etcd pods
255
kubectl -n kube-system get pods -l component=etcd,tier=control-plane
256
257
# Get the list of etcd servers with the member id 
258
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
259
260
# Remove the member
261
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
262
</pre>
263
264
Sample session:
265
266
<pre>
267
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
268
NAME            READY   STATUS    RESTARTS     AGE
269
etcd-server63   1/1     Running   0            3m11s
270
etcd-server65   1/1     Running   3            7d2h
271
etcd-server83   1/1     Running   8 (6d ago)   7d2h
272
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
273
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
274
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
275
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false
276
277
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
278
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77
279 1 Nico Schottelius
280
</pre>
281
282
SEE ALSO
283
284
* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster
285 56 Nico Schottelius
286 213 Nico Schottelius
h4. Updating the members
287
288
1) get alive member
289
290
<pre>
291
% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
292
NAME            READY   STATUS    RESTARTS   AGE
293
etcd-server67   1/1     Running   1          185d
294
etcd-server69   1/1     Running   1          185d
295
etcd-server71   1/1     Running   2          185d
296
[20:57] sun:~% 
297
</pre>
298
299
2) get member list
300
301
* in this case via crictl, as the api does not work correctly anymore
302
303
<pre>
304
305
306
</pre>
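
A sketch of how this can look via crictl, assuming the etcd container is visible on the node and the same certificate paths as above apply:

<pre>
# find the etcd container on the node
crictl ps --name etcd

# run etcdctl inside it (CONTAINERID taken from the output above)
crictl exec -i -t CONTAINERID etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
</pre>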
307
308
309
3) update
310
311
<pre>
312
etcdctl member update MEMBERID  --peer-urls=https://[...]:2380
313
314
315
</pre>
316
317 147 Nico Schottelius
h3. Node labels (adding, showing, removing)
318
319
Listing the labels:
320
321
<pre>
322
kubectl get nodes --show-labels
323
</pre>
324
325
Adding labels:
326
327
<pre>
328
kubectl label nodes LIST-OF-NODES label1=value1 
329
330
</pre>
331
332
For instance:
333
334
<pre>
335
kubectl label nodes router2 router3 hosttype=router 
336
</pre>
337
338
Selecting nodes in pods:
339
340
<pre>
341
apiVersion: v1
342
kind: Pod
343
...
344
spec:
345
  nodeSelector:
346
    hosttype: router
347
</pre>
348
349 148 Nico Schottelius
Removing labels by adding a minus at the end of the label name:
350
351
<pre>
352
kubectl label node <nodename> <labelname>-
353
</pre>
354
355
For instance:
356
357
<pre>
358
kubectl label nodes router2 router3 hosttype- 
359
</pre>
360
361 147 Nico Schottelius
SEE ALSO
362 1 Nico Schottelius
363 148 Nico Schottelius
* https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
364
* https://stackoverflow.com/questions/34067979/how-to-delete-a-node-label-by-command-and-api
365 147 Nico Schottelius
366 199 Nico Schottelius
h3. Listing all pods on a node
367
368
<pre>
369
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=serverXX
370
</pre>
371
372
Found on https://stackoverflow.com/questions/62000559/how-to-list-all-the-pods-running-in-a-particular-worker-node-by-executing-a-comm
373
374 101 Nico Schottelius
h3. Hardware Maintenance using ungleich-hardware
375
376
Use the following manifest and replace the HOST with the actual host:
377
378
<pre>
379
apiVersion: v1
380
kind: Pod
381
metadata:
382
  name: ungleich-hardware-HOST
383
spec:
384
  containers:
385
  - name: ungleich-hardware
386
    image: ungleich/ungleich-hardware:0.0.5
387
    args:
388
    - sleep
389
    - "1000000"
390
    volumeMounts:
391
      - mountPath: /dev
392
        name: dev
393
    securityContext:
394
      privileged: true
395
  nodeSelector:
396
    kubernetes.io/hostname: "HOST"
397
398
  volumes:
399
    - name: dev
400
      hostPath:
401
        path: /dev
402
</pre>
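
After applying the manifest, the pod can be entered to run the hardware tools (the filename is illustrative; assuming a POSIX shell is available in the image):

<pre>
kubectl apply -f ungleich-hardware-HOST.yaml
kubectl exec -ti ungleich-hardware-HOST -- /bin/sh
</pre>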
403
404 102 Nico Schottelius
Also see: [[The_ungleich_hardware_maintenance_guide]]
405
406 105 Nico Schottelius
h3. Triggering a cronjob / creating a job from a cronjob
407 104 Nico Schottelius
408
To test a cronjob, we can create a job from a cronjob:
409
410
<pre>
411
kubectl create job --from=cronjob/volume2-daily-backup volume2-manual
412
</pre>
413
414
This creates a job @volume2-manual@ based on the cronjob @volume2-daily-backup@.
415
416 112 Nico Schottelius
h3. su-ing into a user that has nologin shell set
417
418
Often users have nologin set as their shell inside the container. To be able to execute maintenance commands within the
419
container, we can use @su -s /bin/sh@ like this:
420
421
<pre>
422
su -s /bin/sh -c '/path/to/your/script' testuser
423
</pre>
424
425
Found on https://serverfault.com/questions/351046/how-to-run-command-as-user-who-has-usr-sbin-nologin-as-shell
426
427 113 Nico Schottelius
h3. How to print a secret value
428
429
Assuming you want the "password" item from a secret, use:
430
431
<pre>
432
kubectl get secret SECRETNAME -o jsonpath="{.data.password}" | base64 -d; echo "" 
433
</pre>
434
435 209 Nico Schottelius
h3. Fixing the "ImageInspectError"
436
437
If you see this problem:
438
439
<pre>
440
# kubectl get pods
441
NAME                                                       READY   STATUS                   RESTARTS   AGE
442
bird-router-server137-bird-767f65bb47-g4xsh                0/1     Init:ImageInspectError   0          77d
443
bird-router-server137-openvpn-server120-5c987b7ffb-cn9xf   0/1     ImageInspectError        1          159d
444
bird-router-server137-unbound-5c6f5d4bb6-cxbpr             0/1     ImageInspectError        1          159d
445
</pre>
446
447
Fixes so far:
448
449
* correct registries.conf
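
For reference, a minimal /etc/containers/registries.conf sketch (assuming docker.io as the only unqualified-search registry; adjust to the registries actually in use):

<pre>
# /etc/containers/registries.conf
unqualified-search-registries = ["docker.io"]
</pre>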
450
451 212 Nico Schottelius
h3. Automatic cleanup of images
452
453
* options to kubelet
454
455
<pre>
456
  --image-gc-high-threshold=90: The percent of disk usage after which image garbage collection is always run. Default: 90%
457
  --image-gc-low-threshold=80: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Default: 80%
458
</pre>
459 209 Nico Schottelius
460 173 Nico Schottelius
h3. How to upgrade a kubernetes cluster
461 172 Nico Schottelius
462
h4. General
463
464
* Should be done every X months to stay up-to-date
465
** X probably something like 3-6
466
* kubeadm based clusters
467
* Needs specific kubeadm versions for upgrade
468
* Follow instructions on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
469 190 Nico Schottelius
* Finding releases: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG
470 172 Nico Schottelius
471
h4. Getting a specific kubeadm or kubelet version
472
473
<pre>
474 190 Nico Schottelius
RELEASE=v1.22.17
475
RELEASE=v1.23.17
476 181 Nico Schottelius
RELEASE=v1.24.9
477 1 Nico Schottelius
RELEASE=v1.25.9
478
RELEASE=v1.26.6
479 190 Nico Schottelius
RELEASE=v1.27.2
480
481 187 Nico Schottelius
ARCH=amd64
482 172 Nico Schottelius
483
curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
484 182 Nico Schottelius
chmod u+x kubeadm kubelet
485 172 Nico Schottelius
</pre>
486
487
h4. Steps
488
489
* kubeadm upgrade plan
490
** On one control plane node
491
* kubeadm upgrade apply vXX.YY.ZZ
492
** On one control plane node
493 189 Nico Schottelius
* kubeadm upgrade node
494
** On all other control plane nodes
495
** On all worker nodes afterwards
496
497 172 Nico Schottelius
498 173 Nico Schottelius
Repeat for all control plane nodes. Then upgrade the kubelet on all other nodes via the package manager.
499 172 Nico Schottelius
500 193 Nico Schottelius
h4. Upgrading to 1.22.17
501 1 Nico Schottelius
502 193 Nico Schottelius
* https://v1-22.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
503 194 Nico Schottelius
* Need to create a kubeadm config map
504 198 Nico Schottelius
** f.i. using the following
505
** @/usr/local/bin/kubeadm-v1.22.17   upgrade --config kubeadm.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins,CoreDNSMigration apply -y v1.22.17@
506 193 Nico Schottelius
* Done for p6 on 2023-10-04
507
508
h4. Upgrading to 1.23.17
509
510
* https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
511
* No special notes
512
* Done for p6 on 2023-10-04
513
514
h4. Upgrading to 1.24.17
515
516
* https://v1-24.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
517
* No special notes
518
* Done for p6 on 2023-10-04
519
520
h4. Upgrading to 1.25.14
521
522
* https://v1-25.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
523
* No special notes
524
* Done for p6 on 2023-10-04
525
526
h4. Upgrading to 1.26.9
527
528 1 Nico Schottelius
* https://v1-26.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
529 193 Nico Schottelius
* No special notes
530
* Done for p6 on 2023-10-04
531 188 Nico Schottelius
532 196 Nico Schottelius
h4. Upgrading to 1.27
533 186 Nico Schottelius
534 192 Nico Schottelius
* https://v1-27.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
535 186 Nico Schottelius
* kubelet will not start anymore
536
* reason: @"command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"@
537
* /var/lib/kubelet/kubeadm-flags.env contains that parameter
538
* remove it, start kubelet
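
A sketch of the fix, assuming the flag appears as @--container-runtime=remote@ in the flags file:

<pre>
sed -i 's/--container-runtime=remote //' /var/lib/kubelet/kubeadm-flags.env
/etc/init.d/kubelet restart
</pre>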
539 192 Nico Schottelius
540 197 Nico Schottelius
h4. Upgrading to 1.28
541 192 Nico Schottelius
542
* https://v1-28.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
543 186 Nico Schottelius
544 223 Nico Schottelius
h4. Upgrading to 1.29
545
546
* Done for many clusters around 2024-01-10
547
* Unsure if it was properly released
548
* https://v1-29.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
549
550 219 Nico Schottelius
h4. Upgrading to 1.31
551
552
* The cluster needs to be updated FIRST, before kubelet/the OS
553
554
Otherwise you run into errors in the pod like this:
555
556
<pre>
557
  Warning  Failed     11s (x3 over 12s)  kubelet            Error: services have not yet been read at least once, cannot construct envvars
558
</pre>
559
560 210 Nico Schottelius
And the resulting pod state is:
561
562
<pre>
563
Init:CreateContainerConfigError
564
</pre>
565
566 224 Nico Schottelius
Fix: 
567
568
* find an old 1.30 kubelet package, downgrade kubelet, upgrade the control plane, upgrade kubelet again
569
570 225 Nico Schottelius
<pre>
571
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-1.30.0-r3.apk
572
wget https://mirror.ungleich.ch/mirror/packages/alpine/v3.20/community/x86_64/kubelet-openrc-1.30.0-r3.apk
573
apk add ./kubelet-1.30.0-r3.apk ./kubelet-openrc-1.30.0-r3.apk
574 226 Nico Schottelius
/etc/init.d/kubelet restart
575 225 Nico Schottelius
</pre>
576 226 Nico Schottelius
577
Then upgrade:
578
579
<pre>
580
/usr/local/bin/kubeadm-v1.31.3   upgrade apply -y v1.31.3
581
</pre>
582
583
Then re-upgrade the kubelet:
584
585
<pre>
586
apk upgrade -a
587
</pre>
588
589 225 Nico Schottelius
590 186 Nico Schottelius
h4. Upgrade to crio 1.27: missing crun
591
592
Error message
593
594
<pre>
595
level=fatal msg="validating runtime config: runtime validation: \"crun\" not found in $PATH: exec: \"crun\": executable file not found in $PATH"
596
</pre>
597 1 Nico Schottelius
598 186 Nico Schottelius
Fix:
599
600
<pre>
601
apk add crun
602
</pre>
603 223 Nico Schottelius
604 186 Nico Schottelius
605 157 Nico Schottelius
h2. Reference CNI
606
607
* Mainly "stupid", but effective plugins
608
* Main documentation on https://www.cni.dev/plugins/current/
609 158 Nico Schottelius
* Plugins
610
** bridge
611
*** Can create the bridge on the host
612
*** But seems not to be able to add host interfaces to it as well
613
*** Has support for vlan tags
614
** vlan
615
*** creates vlan tagged sub interface on the host
616 160 Nico Schottelius
*** "It's a 1:1 mapping (i.e. no bridge in between)":https://github.com/k8snetworkplumbingwg/multus-cni/issues/569
617 158 Nico Schottelius
** host-device
618
*** moves the interface from the host into the container
619
*** very easy for physical connections to containers
620 159 Nico Schottelius
** ipvlan
621
*** "virtualisation" of a host device
622
*** routing based on IP
623
*** Same MAC for everyone
624
*** Cannot reach the master interface
625
** macvlan
626
*** With mac addresses
627
*** Supports various modes (to be checked)
628
** ptp ("point to point")
629
*** Creates a host device and connects it to the container
630
** win*
631 158 Nico Schottelius
*** Windows implementations
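
These plugins are configured with plain CNI JSON. A sketch for the vlan plugin (master interface, VLAN id and address are illustrative):

<pre>
{
  "cniVersion": "0.4.0",
  "name": "vlan100",
  "type": "vlan",
  "master": "eth0",
  "vlanId": 100,
  "ipam": {
    "type": "static",
    "addresses": [
      { "address": "2a0a:e5c0:0:14::2/64" }
    ]
  }
}
</pre>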
632 157 Nico Schottelius
633 62 Nico Schottelius
h2. Calico CNI
634
635
h3. Calico Installation
636
637
* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
638 228 Nico Schottelius
* Check the tags on https://github.com/projectcalico/calico/tags for the latest release
639 62 Nico Schottelius
* This has the following advantages:
640
** Easy to upgrade
641
** Does not require us to configure IPv6/dual stack settings, as the tigera operator figures things out on its own
642 230 Nico Schottelius
* As of 2025-05-28, tigera should be installed in the *tigera-operator* namespace
643 62 Nico Schottelius
644
Usually plain calico can be installed directly using:
645
646 1 Nico Schottelius
<pre>
647 228 Nico Schottelius
VERSION=v3.30.0
648 149 Nico Schottelius
649 1 Nico Schottelius
helm repo add projectcalico https://docs.projectcalico.org/charts
650
helm repo update
651 230 Nico Schottelius
helm upgrade --install calico projectcalico/tigera-operator --version $VERSION --namespace tigera-operator --create-namespace
652 92 Nico Schottelius
</pre>
653 1 Nico Schottelius
654 229 Nico Schottelius
h3. Calico upgrade
655 92 Nico Schottelius
656 229 Nico Schottelius
* As of 3.30 or so, CRDs need to be applied manually beforehand
657
658
<pre>
659 231 Nico Schottelius
VERSION=v3.30.1
660 229 Nico Schottelius
661 1 Nico Schottelius
kubectl apply --server-side --force-conflicts -f https://raw.githubusercontent.com/projectcalico/calico/${VERSION}/manifests/operator-crds.yaml
662 231 Nico Schottelius
helm repo update
663
helm upgrade --install --namespace tigera-operator calico projectcalico/tigera-operator --version $VERSION --create-namespace
664 229 Nico Schottelius
</pre>
665 228 Nico Schottelius
666 62 Nico Schottelius
h3. Installing calicoctl
667
668 115 Nico Schottelius
* General installation instructions, including binary download: https://projectcalico.docs.tigera.io/maintenance/clis/calicoctl/install
669
670 62 Nico Schottelius
To be able to manage and configure calico, we need to 
671
"install calicoctl (we choose the version as a pod)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod
672
673
<pre>
674
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
675
</pre>
676
677 93 Nico Schottelius
Or version specific:
678
679
<pre>
680
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.20.4/manifests/calicoctl.yaml
681 97 Nico Schottelius
682
# For 3.22
683
kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.22/manifests/calicoctl.yaml
684 93 Nico Schottelius
</pre>
685
686 70 Nico Schottelius
And making it more easily accessible via an alias:
687
688
<pre>
689
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
690
</pre>
691
692 62 Nico Schottelius
h3. Calico configuration
693
694 63 Nico Schottelius
By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp
695
with an upstream router to propagate podcidr and servicecidr.
696 62 Nico Schottelius
697
Default settings in our infrastructure:
698
699
* We use a full-mesh using the @nodeToNodeMeshEnabled: true@ option
700
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ecmp)
701 1 Nico Schottelius
* We use private ASNs for k8s clusters
702 63 Nico Schottelius
* We do *not* use any overlay
703 62 Nico Schottelius
704
After installing calico and calicoctl the last step of the installation is usually:
705
706 1 Nico Schottelius
<pre>
707 79 Nico Schottelius
calicoctl create -f - < calico-bgp.yaml
708 62 Nico Schottelius
</pre>
709
710
711
A sample BGP configuration:
712
713
<pre>
714
---
715
apiVersion: projectcalico.org/v3
716
kind: BGPConfiguration
717
metadata:
718
  name: default
719
spec:
720
  logSeverityScreen: Info
721
  nodeToNodeMeshEnabled: true
722
  asNumber: 65534
723
  serviceClusterIPs:
724
  - cidr: 2a0a:e5c0:10:3::/108
725
  serviceExternalIPs:
726
  - cidr: 2a0a:e5c0:10:3::/108
727
---
728
apiVersion: projectcalico.org/v3
729
kind: BGPPeer
730
metadata:
731
  name: router1-place10
732
spec:
733
  peerIP: 2a0a:e5c0:10:1::50
734
  asNumber: 213081
735
  keepOriginalNextHop: true
736
</pre>
737
738 227 Nico Schottelius
h3. Get installed calico version
739
740
* might be in calico or tigera namespace
741
742
<pre>
743
helm ls -A | grep calico
744
</pre>
745
746 126 Nico Schottelius
h2. Cilium CNI (experimental)
747
748 137 Nico Schottelius
h3. Status
749
750 138 Nico Schottelius
*NO WORKING CILIUM CONFIGURATION FOR IPV6 only modes*
751 137 Nico Schottelius
752 146 Nico Schottelius
h3. Latest error
753
754
It seems cilium does not run on IPv6 only hosts:
755
756
<pre>
757
level=info msg="Validating configured node address ranges" subsys=daemon
758
level=fatal msg="postinit failed" error="external IPv4 node address could not be derived, please configure via --ipv4-node" subsys=daemon
759
level=info msg="Starting IP identity watcher" subsys=ipcache
760
</pre>
761
762
It crashes after that log entry
763
764 128 Nico Schottelius
h3. BGP configuration
765
766
* The cilium-operator will not start without a correct configmap being present beforehand (see error message below)
767
* Creating the bgp config beforehand as a configmap is thus required.
768
769
The error one gets without the configmap present:
770
771
Pods are hanging with:
772
773
<pre>
774
cilium-bpqm6                       0/1     Init:0/4            0             9s
775
cilium-operator-5947d94f7f-5bmh2   0/1     ContainerCreating   0             9s
776
</pre>
777
778
The error message in the cilium-operator is:
779
780
<pre>
781
Events:
782
  Type     Reason       Age                From               Message
783
  ----     ------       ----               ----               -------
784
  Normal   Scheduled    80s                default-scheduler  Successfully assigned kube-system/cilium-operator-5947d94f7f-lqcsp to server56
785
  Warning  FailedMount  16s (x8 over 80s)  kubelet            MountVolume.SetUp failed for volume "bgp-config-path" : configmap "bgp-config" not found
786
</pre>
787
788
A correct bgp config looks like this:
789
790
<pre>
791
apiVersion: v1
792
kind: ConfigMap
793
metadata:
794
  name: bgp-config
795
  namespace: kube-system
796
data:
797
  config.yaml: |
798
    peers:
799
      - peer-address: 2a0a:e5c0::46
800
        peer-asn: 209898
801
        my-asn: 65533
802
      - peer-address: 2a0a:e5c0::47
803
        peer-asn: 209898
804
        my-asn: 65533
805
    address-pools:
806
      - name: default
807
        protocol: bgp
808
        addresses:
809
          - 2a0a:e5c0:0:14::/64
810
</pre>
811 127 Nico Schottelius
812
h3. Installation
813 130 Nico Schottelius
814 127 Nico Schottelius
Adding the repo
815 1 Nico Schottelius
<pre>
816 127 Nico Schottelius
817 129 Nico Schottelius
helm repo add cilium https://helm.cilium.io/
818 130 Nico Schottelius
helm repo update
819
</pre>
820 129 Nico Schottelius
821 135 Nico Schottelius
Installing + configuring cilium
822 129 Nico Schottelius
<pre>
823 130 Nico Schottelius
ipv6pool=2a0a:e5c0:0:14::/112
824 1 Nico Schottelius
825 146 Nico Schottelius
version=1.12.2
826 129 Nico Schottelius
827
helm upgrade --install cilium cilium/cilium --version $version \
828 1 Nico Schottelius
  --namespace kube-system \
829
  --set ipv4.enabled=false \
830
  --set ipv6.enabled=true \
831 146 Nico Schottelius
  --set enableIPv6Masquerade=false \
832
  --set bgpControlPlane.enabled=true 
833 1 Nico Schottelius
834 146 Nico Schottelius
#  --set ipam.operator.clusterPoolIPv6PodCIDRList=$ipv6pool
835
836
# Old style bgp?
837 136 Nico Schottelius
#   --set bgp.enabled=true --set bgp.announce.podCIDR=true \
838 127 Nico Schottelius
839
# Show possible configuration options
840
helm show values cilium/cilium
841
842 1 Nico Schottelius
</pre>
843 132 Nico Schottelius
844
Using a /64 for ipam.operator.clusterPoolIPv6PodCIDRList fails with:
845
846
<pre>
847
level=fatal msg="Unable to init cluster-pool allocator" error="unable to initialize IPv6 allocator New CIDR set failed; the node CIDR size is too big" subsys=cilium-operator-generic
848
</pre>
849
850 126 Nico Schottelius
851 1 Nico Schottelius
See also https://github.com/cilium/cilium/issues/20756
852 135 Nico Schottelius
853
Seems a /112 is actually working.
854
855
h3. Kernel modules
856
857
Cilium requires the following modules to be loaded on the host (not loaded by default):
858
859
<pre>
860 1 Nico Schottelius
modprobe  ip6table_raw
861
modprobe  ip6table_filter
862
</pre>
863 146 Nico Schottelius
864
h3. Interesting helm flags
865
866
* autoDirectNodeRoutes
867
* bgpControlPlane.enabled = true
868
869
h3. SEE ALSO
870
871
* https://docs.cilium.io/en/v1.12/helm-reference/
872 133 Nico Schottelius
873 179 Nico Schottelius
h2. Multus
874 168 Nico Schottelius
875
* https://github.com/k8snetworkplumbingwg/multus-cni
876
* Installing a deployment w/ CRDs
877 150 Nico Schottelius
878 169 Nico Schottelius
<pre>
879 176 Nico Schottelius
VERSION=v4.0.1
880 169 Nico Schottelius
881 170 Nico Schottelius
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-crio.yml
882
</pre>
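
Additional networks are then described as NetworkAttachmentDefinition objects that embed a CNI config and are referenced from pods via the @k8s.v1.cni.cncf.io/networks@ annotation. A sketch using the host-device plugin (name and interface are illustrative):

<pre>
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: host-eth1
spec:
  config: '{
      "cniVersion": "0.4.0",
      "type": "host-device",
      "device": "eth1"
    }'
</pre>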
883 169 Nico Schottelius
884 191 Nico Schottelius
h2. ArgoCD
885 56 Nico Schottelius
886 60 Nico Schottelius
h3. Argocd Installation
887 1 Nico Schottelius
888 116 Nico Schottelius
* See https://argo-cd.readthedocs.io/en/stable/
889
890 60 Nico Schottelius
As there is no configuration management present yet, argocd is installed using
891
892 1 Nico Schottelius
<pre>
893 60 Nico Schottelius
kubectl create namespace argocd
894 1 Nico Schottelius
895
# OR: latest stable
896
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
897
898 191 Nico Schottelius
# OR Specific Version
899
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.3.2/manifests/install.yaml
900 56 Nico Schottelius
901 191 Nico Schottelius
902
</pre>
903 1 Nico Schottelius
904 60 Nico Schottelius
h3. Get the argocd credentials
905
906
<pre>
907
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
908
</pre>
909 52 Nico Schottelius
910 87 Nico Schottelius
h3. Accessing argocd
911
912
In regular IPv6 clusters:
913
914
* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN
915
916
In legacy IPv4 clusters:
917
918
<pre>
919
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
920
</pre>
921
922 88 Nico Schottelius
* Navigate to https://localhost:8080
923
924 68 Nico Schottelius
h3. Using the argocd webhook to trigger changes
925 67 Nico Schottelius
926
* To trigger changes, POST JSON to https://argocd.example.com/api/webhook
927
928 72 Nico Schottelius
h3. Deploying an application
929
930
* Applications are deployed via git towards gitea (code.ungleich.ch) and then pulled by argo
931 73 Nico Schottelius
* Always include the *redmine-url* pointing to the (customer) ticket
932
** Also add the support-url if it exists
933 72 Nico Schottelius
934
Application sample
935
936
<pre>
937
apiVersion: argoproj.io/v1alpha1
938
kind: Application
939
metadata:
940
  name: gitea-CUSTOMER
941
  namespace: argocd
942
spec:
943
  destination:
944
    namespace: default
945
    server: 'https://kubernetes.default.svc'
946
  source:
947
    path: apps/prod/gitea
948
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
949
    targetRevision: HEAD
950
    helm:
951
      parameters:
952
        - name: storage.data.storageClass
953
          value: rook-ceph-block-hdd
954
        - name: storage.data.size
955
          value: 200Gi
956
        - name: storage.db.storageClass
957
          value: rook-ceph-block-ssd
958
        - name: storage.db.size
959
          value: 10Gi
960
        - name: storage.letsencrypt.storageClass
961
          value: rook-ceph-block-hdd
962
        - name: storage.letsencrypt.size
963
          value: 50Mi
964
        - name: letsencryptStaging
965
          value: 'no'
966
        - name: fqdn
967
          value: 'code.verua.online'
968
  project: default
969
  syncPolicy:
970
    automated:
971
      prune: true
972
      selfHeal: true
973
  info:
974
    - name: 'redmine-url'
975
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
976
    - name: 'support-url'
977
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
978
</pre>
979
980 80 Nico Schottelius
h2. Helm related operations and conventions
981 55 Nico Schottelius
982 61 Nico Schottelius
We use helm charts extensively.
983
984
* In production, they are managed via argocd
985
* In development, helm charts can be developed and deployed manually using the helm utility.
986
987 55 Nico Schottelius
h3. Installing a helm chart
988
989
One can use the usual pattern of
990
991
<pre>
992
helm install <releasename> <chartdirectory>
993
</pre>
994
995
However, when testing helm charts you often want to reinstall/update. The following pattern is "better", because it allows you to reinstall even if the release is already installed:
996
997
<pre>
998
helm upgrade --install <releasename> <chartdirectory>
999 1 Nico Schottelius
</pre>
1000 80 Nico Schottelius
1001
h3. Naming services and deployments in helm charts [Application labels]
1002
1003
* We always have {{ .Release.Name }} to identify the current "instance"
1004
* Deployments:
1005
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ...
1006 81 Nico Schottelius
* See more about standard labels on
1007
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
1008
** https://helm.sh/docs/chart_best_practices/labels/
1009 55 Nico Schottelius
1010 151 Nico Schottelius
h3. Show all versions of a helm chart
1011
1012
<pre>
1013
helm search repo -l repo/chart
1014
</pre>
1015
1016
For example:
1017
1018
<pre>
1019
% helm search repo -l projectcalico/tigera-operator 
1020
NAME                         	CHART VERSION	APP VERSION	DESCRIPTION                            
1021
projectcalico/tigera-operator	v3.23.3      	v3.23.3    	Installs the Tigera operator for Calico
1022
projectcalico/tigera-operator	v3.23.2      	v3.23.2    	Installs the Tigera operator for Calico
1023
....
1024
</pre>
1025
1026 152 Nico Schottelius
h3. Show possible values of a chart
1027
1028
<pre>
1029
helm show values <repo/chart>
1030
</pre>
1031
1032
Example:
1033
1034
<pre>
1035
helm show values ingress-nginx/ingress-nginx
1036
</pre>
1037
1038 207 Nico Schottelius
h3. Show all possible charts in a repo
1039
1040
<pre>
1041
helm search repo REPO
1042
</pre>
1043
1044 178 Nico Schottelius
h3. Download a chart
1045
1046
For instance for checking it out locally, use:
1047
1048
<pre>
1049
helm pull <repo/chart>
1050
</pre>
1051 152 Nico Schottelius
1052 139 Nico Schottelius
h2. Rook + Ceph
1053
1054
h3. Installation
1055
1056
* Usually directly via argocd
1057
1058 71 Nico Schottelius
h3. Executing ceph commands
1059
1060
Using the ceph-tools pod as follows:
1061
1062
<pre>
1063
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
1064
</pre>
1065
1066 43 Nico Schottelius
h3. Inspecting the logs of a specific server
1067
1068
<pre>
1069
# Get the related pods
1070
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare 
1071
...
1072
1073
# Inspect the logs of a specific pod
1074
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx
1075
1076 71 Nico Schottelius
</pre>
1077
1078
h3. Inspecting the logs of the rook-ceph-operator
1079
1080
<pre>
1081
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
1082 43 Nico Schottelius
</pre>
1083
1084 200 Nico Schottelius
h3. (Temporarily) Disabling the rook-operator
1085
1086
* first disable the sync in argocd
1087
* then scale it down
1088
1089
<pre>
1090
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
1091
</pre>
1092
1093
When done with the work/maintenance, re-enable sync in argocd.
1094
The following command is thus strictly speaking not required, as argocd will fix it on its own:
1095
1096
<pre>
1097
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
1098
</pre>
1099
1100 121 Nico Schottelius
h3. Restarting the rook operator
1101
1102
<pre>
1103
kubectl -n rook-ceph delete pods  -l app=rook-ceph-operator
1104
</pre>
1105
1106 43 Nico Schottelius
h3. Triggering server prepare / adding new osds
1107
1108
The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re scan", simply delete that pod:
1109
1110
<pre>
1111
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
1112
</pre>
1113
1114
This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.
1115
1116
h3. Removing an OSD
1117
1118
* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html
1119 77 Nico Schottelius
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
1120 99 Nico Schottelius
* Then delete the related deployment
1121 41 Nico Schottelius
1122 98 Nico Schottelius
Set the OSD id in the osd-purge.yaml and apply it. The OSD should be down before removal.
1123
1124
<pre>
1125
apiVersion: batch/v1
1126
kind: Job
1127
metadata:
1128
  name: rook-ceph-purge-osd
1129
  namespace: rook-ceph # namespace:cluster
1130
  labels:
1131
    app: rook-ceph-purge-osd
1132
spec:
1133
  template:
1134
    metadata:
1135
      labels:
1136
        app: rook-ceph-purge-osd
1137
    spec:
1138
      serviceAccountName: rook-ceph-purge-osd
1139
      containers:
1140
        - name: osd-removal
1141
          image: rook/ceph:master
1142
          # TODO: Insert the OSD ID in the last parameter that is to be removed
1143
          # The OSD IDs are a comma-separated list. For example: "0" or "0,2".
1144
          # If you want to preserve the OSD PVCs, set `--preserve-pvc true`.
1145
          #
1146
          # A --force-osd-removal option is available if the OSD should be destroyed even though the
1147
          # removal could lead to data loss.
1148
          args:
1149
            - "ceph"
1150
            - "osd"
1151
            - "remove"
1152
            - "--preserve-pvc"
1153
            - "false"
1154
            - "--force-osd-removal"
1155
            - "false"
1156
            - "--osd-ids"
1157
            - "SETTHEOSDIDHERE"
1158
          env:
1159
            - name: POD_NAMESPACE
1160
              valueFrom:
1161
                fieldRef:
1162
                  fieldPath: metadata.namespace
1163
            - name: ROOK_MON_ENDPOINTS
1164
              valueFrom:
1165
                configMapKeyRef:
1166
                  key: data
1167
                  name: rook-ceph-mon-endpoints
1168
            - name: ROOK_CEPH_USERNAME
1169
              valueFrom:
1170
                secretKeyRef:
1171
                  key: ceph-username
1172
                  name: rook-ceph-mon
1173
            - name: ROOK_CEPH_SECRET
1174
              valueFrom:
1175
                secretKeyRef:
1176
                  key: ceph-secret
1177
                  name: rook-ceph-mon
1178
            - name: ROOK_CONFIG_DIR
1179
              value: /var/lib/rook
1180
            - name: ROOK_CEPH_CONFIG_OVERRIDE
1181
              value: /etc/rook/config/override.conf
1182
            - name: ROOK_FSID
1183
              valueFrom:
1184
                secretKeyRef:
1185
                  key: fsid
1186
                  name: rook-ceph-mon
1187
            - name: ROOK_LOG_LEVEL
1188
              value: DEBUG
1189
          volumeMounts:
1190
            - mountPath: /etc/ceph
1191
              name: ceph-conf-emptydir
1192
            - mountPath: /var/lib/rook
1193
              name: rook-config
1194
      volumes:
1195
        - emptyDir: {}
1196
          name: ceph-conf-emptydir
1197
        - emptyDir: {}
1198
          name: rook-config
1199
      restartPolicy: Never
1200
1201
1202 99 Nico Schottelius
</pre>
1203
1204 1 Nico Schottelius
Deleting the deployment:
1205
1206
<pre>
1207
[18:05] bridge:~% kubectl -n rook-ceph delete deployment rook-ceph-osd-6
1208 99 Nico Schottelius
deployment.apps "rook-ceph-osd-6" deleted
1209
</pre>
1210 185 Nico Schottelius
1211
h3. Placement of mons/osds/etc.
1212
1213
See https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#placement-configuration-settings
1214 98 Nico Schottelius
1215 215 Nico Schottelius
h3. Setting up and managing S3 object storage
1216
1217 217 Nico Schottelius
h4. Endpoints
1218
1219
| Location | Endpoint |
1220
| p5 | https://s3.k8s.place5.ungleich.ch |
1221
1222
1223 215 Nico Schottelius
h4. Setting up a storage class
1224
1225
* This will store the buckets of a specific customer
1226
1227
Similar to this:
1228
1229
<pre>
1230
apiVersion: storage.k8s.io/v1
1231
kind: StorageClass
1232
metadata:
1233
  name: ungleich-archive-bucket-sc
1234
  namespace: rook-ceph
1235
provisioner: rook-ceph.ceph.rook.io/bucket
1236
reclaimPolicy: Delete
1237
parameters:
1238
  objectStoreName: place5
1239
  objectStoreNamespace: rook-ceph
1240
</pre>
1241
1242
h4. Setting up the Bucket
1243
1244
Similar to this:
1245
1246
<pre>
1247
apiVersion: objectbucket.io/v1alpha1
1248
kind: ObjectBucketClaim
1249
metadata:
1250
  name: ungleich-archive-bucket-claim
1251
  namespace: rook-ceph
1252
spec:
1253
  generateBucketName: ungleich-archive-ceph-bkt
1254
  storageClassName: ungleich-archive-bucket-sc
1255
  additionalConfig:
1256
    # To set for quota for OBC
1257
    #maxObjects: "1000"
1258
    maxSize: "100G"
1259
</pre>
1260
1261
* See also: https://rook.io/docs/rook/latest-release/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/#obc-custom-resource
1262
1263
h4. Getting the credentials for the bucket
1264
1265
* Get "public" information from the configmap
1266
* Get secret from the secret
1267
1268 216 Nico Schottelius
<pre>
1269 1 Nico Schottelius
name=BUCKETNAME
1270 221 Nico Schottelius
s3host=s3.k8s.place5.ungleich.ch
1271
endpoint=https://${s3host}
1272 1 Nico Schottelius
1273
cm=$(kubectl -n rook-ceph get configmap -o yaml ${name}-bucket-claim)
1274 217 Nico Schottelius
1275 1 Nico Schottelius
sec=$(kubectl -n rook-ceph get secrets -o yaml ${name}-bucket-claim)
1276 222 Nico Schottelius
export AWS_ACCESS_KEY_ID=$(echo "$sec" | yq .data.AWS_ACCESS_KEY_ID | base64 -d ; echo "")
1277
export AWS_SECRET_ACCESS_KEY=$(echo "$sec" | yq .data.AWS_SECRET_ACCESS_KEY | base64 -d ; echo "")
1278 1 Nico Schottelius
1279 217 Nico Schottelius
1280 216 Nico Schottelius
bucket_name=$(echo "$cm" | yq .data.BUCKET_NAME)
1281 1 Nico Schottelius
</pre>
1282 217 Nico Schottelius
1283 220 Nico Schottelius
h5. Access via s3cmd
1284 1 Nico Schottelius
1285 221 Nico Schottelius
Note that it is *NOT*:
1286
1287 220 Nico Schottelius
<pre>
1288 221 Nico Schottelius
s3cmd --host ${s3host}:443 --access_key=${AWS_ACCESS_KEY_ID} --secret_key=${AWS_SECRET_ACCESS_KEY} ls s3://${name}
1289 220 Nico Schottelius
</pre>
1290
1291 217 Nico Schottelius
h5. Access via s4cmd
1292
1293
<pre>
1294 1 Nico Schottelius
s4cmd --endpoint-url ${endpoint} --access-key=${AWS_ACCESS_KEY_ID} --secret-key=${AWS_SECRET_ACCESS_KEY} ls
1295
</pre>
1296 221 Nico Schottelius
1297
h5. Access via s5cmd
1298
1299
* Uses environment variables
1300
1301
<pre>
1302
s5cmd --endpoint-url ${endpoint} ls
1303
</pre>
1304 215 Nico Schottelius
1305 145 Nico Schottelius
h2. Ingress + Cert Manager
1306
1307
* We deploy "nginx-ingress":https://docs.nginx.com/nginx-ingress-controller/ to get an ingress
1308
* We deploy "cert-manager":https://cert-manager.io/ to handle certificates
1309
* We independently deploy @ClusterIssuer@ to allow the cert-manager app to deploy and the issuer to be created once the CRDs from cert manager are in place
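
A sketch of such a @ClusterIssuer@ (ACME/letsencrypt with an http01 solver through the nginx ingress class; name and email are illustrative):

<pre>
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
    - http01:
        ingress:
          class: nginx
</pre>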
1310
1311
h3. IPv4 reachability 
1312
1313
The ingress is by default IPv6 only. To make it reachable from the IPv4 world, get its IPv6 address and configure a NAT64 mapping in Jool.
1314
1315
Steps:
1316
1317
h4. Get the ingress IPv6 address
1318
1319
Use @kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''@
1320
1321
Example:
1322
1323
<pre>
1324
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''
1325
2a0a:e5c0:10:1b::ce11
1326
</pre>
1327
1328
h4. Add NAT64 mapping
1329
1330
* Update the __dcl_jool_siit cdist type
1331
* Record the two IPs (IPv6 and IPv4)
1332
* Configure all routers
1333
1334
1335
h4. Add DNS record
1336
1337
To make the ingress usable as a CNAME destination, create an "ingress" DNS record, such as:
1338
1339
<pre>
1340
; k8s ingress for dev
1341
dev-ingress                 AAAA 2a0a:e5c0:10:1b::ce11
1342
dev-ingress                 A 147.78.194.23
1343
1344
</pre> 
1345
1346
h4. Add supporting wildcard DNS
1347
1348
If you plan to add various sites under a specific domain, you can add a wildcard DNS entry, such as *.k8s-dev.django-hosting.ch:
1349
1350
<pre>
1351
*.k8s-dev         CNAME dev-ingress.ungleich.ch.
1352
</pre>
1353
1354 76 Nico Schottelius
h2. Harbor
1355
1356 175 Nico Schottelius
* We use "Harbor":https://goharbor.io/ as an image registry for our own images. Internal app reference: apps/prod/harbor.
1357
* The admin password is in the password store, it is Harbor12345 by default
1358 76 Nico Schottelius
* At the moment harbor only authenticates against the internal ldap tree
1359
1360
h3. LDAP configuration
1361
1362
* The url needs to be ldaps://...
1363
* uid = uid
1364
* the rest is standard
1365 75 Nico Schottelius
1366 89 Nico Schottelius
h2. Monitoring / Prometheus
1367
1368 90 Nico Schottelius
* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/
1369 89 Nico Schottelius
1370 91 Nico Schottelius
Access via ...
1371
1372
* http://prometheus-k8s.monitoring.svc:9090
1373
* http://grafana.monitoring.svc:3000
1374
* http://alertmanager.monitoring.svc:9093
1375
1376
1377 100 Nico Schottelius
h3. Prometheus Options
1378
1379
* "helm/kube-prometheus-stack":https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
1380
** Includes dashboards and co.
1381
* "manifest based kube-prometheus":https://github.com/prometheus-operator/kube-prometheus
1382
** Includes dashboards and co.
1383
* "Prometheus Operator (mainly CRD manifest":https://github.com/prometheus-operator/prometheus-operator
1384
1385 171 Nico Schottelius
h3. Grafana default password
1386
1387 218 Nico Schottelius
* If not changed: admin / @prom-operator@
1388
** Can be changed via:
1389
1390
<pre>
1391
    helm:
1392
      values: |-
1393
        configurations: |-
1394
          grafana:
1395
            adminPassword: "..."
1396
</pre>
1397 171 Nico Schottelius
1398 82 Nico Schottelius
h2. Nextcloud
1399
1400 85 Nico Schottelius
h3. How to get the nextcloud credentials 
1401 84 Nico Schottelius
1402
* The initial username is set to "nextcloud"
1403
* The password is autogenerated and saved in a kubernetes secret
1404
1405
<pre>
1406 85 Nico Schottelius
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo "" 
1407 84 Nico Schottelius
</pre>
1408
1409 83 Nico Schottelius
h3. How to fix "Access through untrusted domain"
1410
1411 82 Nico Schottelius
* Nextcloud stores the initial domain configuration
1412 1 Nico Schottelius
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
1413 82 Nico Schottelius
* To fix, edit /var/www/html/config/config.php and correct the domain
1414 1 Nico Schottelius
* Then delete the pods
1415 165 Nico Schottelius
1416
h3. Running occ commands inside the nextcloud container
1417
1418
* Find the pod in the right namespace
1419
1420
Exec:
1421
1422
<pre>
1423
su www-data -s /bin/sh -c ./occ
1424
</pre>
1425
1426
* -s /bin/sh is needed as the default shell is set to /bin/false
1427
1428 166 Nico Schottelius
h4. Rescanning files
1429 165 Nico Schottelius
1430 166 Nico Schottelius
* If files have been added without nextcloud's knowledge
1431
1432
<pre>
1433
su www-data -s /bin/sh -c "./occ files:scan --all"
1434
</pre>
1435 82 Nico Schottelius
1436 201 Nico Schottelius
h2. Sealed Secrets
1437
1438 202 Jin-Guk Kwon
* install kubeseal
1439 1 Nico Schottelius
1440 202 Jin-Guk Kwon
<pre>
1441
KUBESEAL_VERSION='0.23.0'
1442
wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz" 
1443
tar -xvzf kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz kubeseal
1444
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
1445
</pre>
1446
1447
* fetch the certificate (public key) used for sealing secrets
1448
1449
<pre>
1450
kubeseal --fetch-cert > /tmp/public-key-cert.pem
1451
</pre>
1452
1453
* create the secret
1454
1455
<pre>
1456 203 Jin-Guk Kwon
ex)
1457 202 Jin-Guk Kwon
apiVersion: v1
1458
kind: Secret
1459
metadata:
1460
  name: Release.Name-postgres-config
1461
  annotations:
1462
    secret-generator.v1.mittwald.de/autogenerate: POSTGRES_PASSWORD
1463
    hosting: Release.Name
1464
  labels:
1465
    app.kubernetes.io/instance: Release.Name
1466
    app.kubernetes.io/component: postgres
1467
stringData:
1468
  POSTGRES_USER: postgresUser
1469
  POSTGRES_DB: postgresDBName
1470
  POSTGRES_INITDB_ARGS: "--no-locale --encoding=UTF8"
1471
</pre>
1472
1473
* convert secret.yaml to sealed-secret.yaml
1474
1475
<pre>
1476
kubeseal -n <namespace> --cert=/tmp/public-key-cert.pem --format=yaml < ./secret.yaml  > ./sealed-secret.yaml
1477
</pre>
1478
1479
* use the sealed-secret.yaml in the helm chart directory
1480 201 Nico Schottelius
1481 205 Jin-Guk Kwon
* see tickets #11989, #12120
1482 204 Jin-Guk Kwon
1483 1 Nico Schottelius
h2. Infrastructure versions
1484 35 Nico Schottelius
1485 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v5 (2021-10)
1486 1 Nico Schottelius
1487 57 Nico Schottelius
Clusters are configured / set up in this order:
1488
1489
* Bootstrap via kubeadm
1490 59 Nico Schottelius
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
1491
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
1492
** "rook for storage via argocd":https://rook.io/
1493 58 Nico Schottelius
** haproxy for in IPv6-cluster-IPv4-to-IPv6 proxy via argocd
1494
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
1495
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1496
1497 57 Nico Schottelius
1498
h3. ungleich kubernetes infrastructure v4 (2021-09)
1499
1500 54 Nico Schottelius
* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
1501 1 Nico Schottelius
* The rook operator is still being installed via helm
1502 35 Nico Schottelius
1503 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v3 (2021-07)
1504 1 Nico Schottelius
1505 10 Nico Schottelius
* rook is now installed via helm via argocd instead of directly via manifests
1506 28 Nico Schottelius
1507 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v2 (2021-05)
1508 28 Nico Schottelius
1509
* Replaced fluxv2 from ungleich k8s v1 with argocd
1510 1 Nico Schottelius
** argocd can apply helm templates directly without needing to go through Chart releases
1511 28 Nico Schottelius
* We are also using argoflow for build flows
1512
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building
1513
1514 57 Nico Schottelius
h3. ungleich kubernetes infrastructure v1 (2021-01)
1515 28 Nico Schottelius
1516
We are using the following components:
1517
1518
* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
1519
** Needed for basic networking
1520
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
1521
** Needed so that secrets are not stored in the git repository, but only in the cluster
1522
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
1523
** Needed to get letsencrypt certificates for services
1524
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
1525
** rbd for almost everything, *ReadWriteOnce*
1526
** cephfs for smaller things, multi access *ReadWriteMany*
1527
** Needed for providing persistent storage
1528
* "flux v2":https://fluxcd.io/
1529
** Needed to manage resources automatically