The ungleich kubernetes infrastructure » History » Version 218

Nico Schottelius, 10/02/2024 06:00 AM

1 22 Nico Schottelius
h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual
2 1 Nico Schottelius
3 3 Nico Schottelius
{{toc}}
4
5 1 Nico Schottelius
h2. Status
6
7 211 Nico Schottelius
This document is **production**.
8
This document is the ungleich kubernetes infrastructure overview as well as the ungleich kubernetes manual.
9 1 Nico Schottelius
10 10 Nico Schottelius
h2. k8s clusters
11
12 123 Nico Schottelius
| Cluster            | Purpose/Setup     | Maintainer | Master(s)                     | argo                                                   | v4 http proxy | last verified |
| c0.k8s.ooo         | Dev               | -          | UNUSED                        |                                                        |               |    2021-10-05 |
| c1.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
| c2.k8s.ooo         | Dev p7 HW         | Nico       | server47 server53 server54    | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo     |               |    2021-10-05 |
| c3.k8s.ooo         | retired           | -          | -                             |                                                        |               |    2021-10-05 |
| c4.k8s.ooo         | Dev2 p7 HW        | Jin-Guk    | server52 server53 server54    |                                                        |               |             - |
| c5.k8s.ooo         | retired           |            | -                             |                                                        |               |    2022-03-15 |
| c6.k8s.ooo         | Dev p6 VM Jin-Guk | Jin-Guk    |                               |                                                        |               |               |
| [[p5.k8s.ooo]]     | production        |            | server34 server36 server38    | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo     | -             |               |
| [[p5-cow.k8s.ooo]] | production        | Nico       | server47 server51 server55    | "argo":https://argocd-server.argocd.svc.p5-cow.k8s.ooo |               |    2022-08-27 |
| [[p6.k8s.ooo]]     | production        |            | server67 server69 server71    | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo     | 147.78.194.13 |    2021-10-05 |
| [[p6-cow.k8s.ooo]] | production        |            | server134 server135 server136 | "argo":https://argocd-server.argocd.svc.p6in10.k8s.ooo | ?             |    2023-05-17 |
| [[p10.k8s.ooo]]    | production        |            | server131 server132 server133 | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo    | 147.78.194.12 |    2021-10-05 |
| [[k8s.ge.nau.so]]  | development       |            | server107 server108 server109 | "argo":https://argocd-server.argocd.svc.k8s.ge.nau.so  |               |               |
| [[dev.k8s.ooo]]    | development       |            | server110 server111 server112 | "argo":https://argocd-server.argocd.svc.dev.k8s.ooo    | -             |    2022-07-08 |
| [[r1r2p15k8sooo|r1.p15.k8s.ooo]] | production | Nico | server120 | | | 2022-10-30 |
| [[r1r2p15k8sooo|r2.p15.k8s.ooo]] | production | Nico | server121 | | | 2022-09-06 |
| [[r1r2p10k8sooo|r1.p10.k8s.ooo]] | production | Nico | server122 | | | 2022-10-30 |
| [[r1r2p10k8sooo|r2.p10.k8s.ooo]] | production | Nico | server123 | | | 2022-10-15 |
| [[r1r2p5k8sooo|r1.p5.k8s.ooo]] | production | Nico | server137 | | | 2022-10-30 |
| [[r1r2p5k8sooo|r2.p5.k8s.ooo]] | production | Nico | server138 | | | 2022-10-30 |
| [[r1r2p6k8sooo|r1.p6.k8s.ooo]] | production | Nico | server139 | | | 2022-10-30 |
| [[r1r2p6k8sooo|r2.p6.k8s.ooo]] | production | Nico | server140 | | | 2022-10-30 |
35 21 Nico Schottelius
36 1 Nico Schottelius
h2. General architecture and components overview
37
38
* All k8s clusters are IPv6 only
39
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
40
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
41 18 Nico Schottelius
** Private configurations are found in the **k8s-config** repository
42 1 Nico Schottelius
43
h3. Cluster types
44
45 28 Nico Schottelius
| **Type/Feature**            | **Development**                | **Production**         |
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
| Separation of control plane | optional                       | recommended            |
| Persistent storage          | required                       | required               |
| Number of storage monitors  | 3                              | 5                      |
51 1 Nico Schottelius
52 43 Nico Schottelius
h2. General k8s operations
53 1 Nico Schottelius
54 46 Nico Schottelius
h3. Cheat sheet / external great references
55
56
* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/
57
58 214 Nico Schottelius
Some examples:
59
60
h4. Use kubectl to print only the node names
61
62
<pre>
63
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
64
</pre>
65
66
Can easily be used in a shell loop like this:
67
68
<pre>
69
for host in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do echo $host; ssh root@${host} uptime; done
70
</pre>
71
72 117 Nico Schottelius
h3. Allowing to schedule work on the control plane / removing node taints
73 69 Nico Schottelius
74
* Mostly for single node / test / development clusters
75
* Just remove the master taint as follows
76
77
<pre>
78
kubectl taint nodes --all node-role.kubernetes.io/master-
79 118 Nico Schottelius
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
80 69 Nico Schottelius
</pre>
81 1 Nico Schottelius
82 117 Nico Schottelius
You can check the node taints using @kubectl describe node ...@
83 69 Nico Schottelius
84 208 Nico Schottelius
h3. Adding taints
85
86
* For instance to limit nodes to specific customers
87
88
<pre>
89
kubectl taint nodes serverXX customer=CUSTOMERNAME:NoSchedule
90
</pre>
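
A taint like the one above only keeps other pods away; the customer's own pods then need a matching toleration. A minimal sketch, assuming a hypothetical pod name and image (@customer-pod@, @alpine:3@) and the placeholder @CUSTOMERNAME@ from the command above:

```shell
# Placeholder values: replace CUSTOMERNAME, the pod name and the image.
customer="CUSTOMERNAME"

# Write a pod manifest that tolerates the customer=CUSTOMERNAME:NoSchedule taint.
cat > customer-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: customer-pod
spec:
  containers:
  - name: main
    image: alpine:3
    args: ["sleep", "1000000"]
  tolerations:
  - key: "customer"
    operator: "Equal"
    value: "${customer}"
    effect: "NoSchedule"
EOF

# Apply afterwards with: kubectl apply -f customer-pod.yaml
```

The toleration only allows scheduling on the tainted nodes. To also force the pod onto them, combine it with a @nodeSelector@ as described in the node labels section.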
91
92 44 Nico Schottelius
h3. Get the cluster admin.conf
93
94
* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
95
* To be able to administrate the cluster you can copy the admin.conf to your local machine
96
* Multi cluster debugging can be very easy if you name the config ~/cX-admin.conf (see example below)
97
98
<pre>
99
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
100
% export KUBECONFIG=~/c2-admin.conf    
101
% kubectl get nodes
102
NAME       STATUS                     ROLES                  AGE   VERSION
103
server47   Ready                      control-plane,master   82d   v1.22.0
104
server48   Ready                      control-plane,master   82d   v1.22.0
105
server49   Ready                      <none>                 82d   v1.22.0
106
server50   Ready                      <none>                 82d   v1.22.0
107
server59   Ready                      control-plane,master   82d   v1.22.0
108
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
109
server61   Ready                      <none>                 82d   v1.22.0
110
server62   Ready                      <none>                 82d   v1.22.0               
111
</pre>
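
With the @~/cX-admin.conf@ naming scheme, switching between clusters can be wrapped in a small shell helper. This is only a sketch; the function names @kubeconfig_for@ and @kc@ are made up for this example and are not part of any ungleich tooling:

```shell
# Print the path of the admin.conf copied for a cluster short name (c2, p6, ...).
kubeconfig_for() {
    printf '%s/%s-admin.conf\n' "$HOME" "$1"
}

# Run kubectl against one cluster without touching the exported KUBECONFIG.
# Usage: kc c2 get nodes
kc() {
    cluster="$1"
    shift
    KUBECONFIG="$(kubeconfig_for "$cluster")" kubectl "$@"
}
```

For instance @kc c2 get nodes@ followed by @kc p6 get nodes@ compares two clusters from the same shell.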
112
113 18 Nico Schottelius
h3. Installing a new k8s cluster
114 8 Nico Schottelius
115 9 Nico Schottelius
* Decide on the cluster name (usually *cX.k8s.ooo*), X counting upwards
116 28 Nico Schottelius
** Using pXX.k8s.ooo for production clusters of placeXX
117 9 Nico Schottelius
* Use cdist to configure the nodes with requirements like crio
118
* Decide between single or multi node control plane setups (see below)
119 28 Nico Schottelius
** Single control plane suitable for development clusters
120 9 Nico Schottelius
121 28 Nico Schottelius
Typical init procedure:
122 9 Nico Schottelius
123 206 Nico Schottelius
h4. Single control plane:
124
125
<pre>
126
kubeadm init --config bootstrap/XXX/kubeadm.yaml
127
</pre>
128
129
h4. Multi control plane (HA):
130
131
<pre>
132
kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs
133
</pre>
134
135 10 Nico Schottelius
136 29 Nico Schottelius
h3. Deleting a pod that is hanging in terminating state
137
138
<pre>
139
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
140
</pre>
141
142
(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)
143
144 42 Nico Schottelius
h3. Listing nodes of a cluster
145
146
<pre>
147
[15:05] bridge:~% kubectl get nodes
148
NAME       STATUS   ROLES                  AGE   VERSION
149
server22   Ready    <none>                 52d   v1.22.0
150
server23   Ready    <none>                 52d   v1.22.2
151
server24   Ready    <none>                 52d   v1.22.0
152
server25   Ready    <none>                 52d   v1.22.0
153
server26   Ready    <none>                 52d   v1.22.0
154
server27   Ready    <none>                 52d   v1.22.0
155
server63   Ready    control-plane,master   52d   v1.22.0
156
server64   Ready    <none>                 52d   v1.22.0
157
server65   Ready    control-plane,master   52d   v1.22.0
158
server66   Ready    <none>                 52d   v1.22.0
159
server83   Ready    control-plane,master   52d   v1.22.0
160
server84   Ready    <none>                 52d   v1.22.0
161
server85   Ready    <none>                 52d   v1.22.0
162
server86   Ready    <none>                 52d   v1.22.0
163
</pre>
164
165 41 Nico Schottelius
h3. Removing / draining a node
166
167
Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:
168
169 1 Nico Schottelius
<pre>
170 103 Nico Schottelius
kubectl drain --delete-emptydir-data --ignore-daemonsets serverXX
171 42 Nico Schottelius
</pre>
172
173
h3. Re-adding a node after draining
174
175
<pre>
176
kubectl uncordon serverXX
177 1 Nico Schottelius
</pre>
178 43 Nico Schottelius
179 50 Nico Schottelius
h3. (Re-)joining worker nodes after creating the cluster
180 49 Nico Schottelius
181
* We need to have an up-to-date token
182
* We use different join commands for the workers and control plane nodes
183
184
Generating the join command on an existing control plane node:
185
186
<pre>
187
kubeadm token create --print-join-command
188
</pre>
189
190 50 Nico Schottelius
h3. (Re-)joining control plane nodes after creating the cluster
191 1 Nico Schottelius
192 50 Nico Schottelius
* We generate the token again
193
* We upload the certificates
194
* We need to combine/create the join command for the control plane node
195
196
Example session:
197
198
<pre>
199
% kubeadm token create --print-join-command
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash

% kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
CERTKEY

# Then we use these two outputs on the joining node:

kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
210
</pre>
211
212
Commands to be used on a control plane node:
213
214
<pre>
215
kubeadm token create --print-join-command
216
kubeadm init phase upload-certs --upload-certs
217
</pre>
218
219
Commands to be used on the joining node:
220
221
<pre>
222
JOINCOMMAND --control-plane --certificate-key CERTKEY
223
</pre>
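
Combining the two outputs can be scripted. The following is a sketch only: @build_control_plane_join@ is a hypothetical helper name, and taking the certificate key from the last output line is an assumption based on the example session above.

```shell
# Combine the worker join command and the certificate key into the
# join command for a control plane node.
build_control_plane_join() {
    join_command="$1"
    cert_key="$2"
    printf '%s --control-plane --certificate-key %s\n' "$join_command" "$cert_key"
}

# On an existing control plane node the inputs could be captured like this
# (not executed here, as it creates a new token and re-uploads the certs):
#   join_command=$(kubeadm token create --print-join-command)
#   cert_key=$(kubeadm init phase upload-certs --upload-certs | tail -n 1)
#   build_control_plane_join "$join_command" "$cert_key"
```

The printed line is then executed on the joining node.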
224 49 Nico Schottelius
225 51 Nico Schottelius
SEE ALSO
226
227
* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
228
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/
229
230 53 Nico Schottelius
h3. How to fix etcd not starting when rejoining a kubernetes cluster as a control plane node
231 52 Nico Schottelius
232
If during the above step etcd does not come up, @kubeadm join@ can hang as follows:
233
234
<pre>
235
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
242
</pre>
243
244
Then the problem is likely that the etcd server is still a member of the cluster. We first need to remove it from the etcd cluster and then the join works.
245
246
To fix this we do:
247
248
* Find a working etcd pod
249
* Find the etcd members / member list
250
* Remove the etcd member that we want to re-join the cluster
251
252
253
<pre>
254
# Find the etcd pods
kubectl -n kube-system get pods -l component=etcd,tier=control-plane

# Get the list of etcd servers with the member id
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list

# Remove the member
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
262
</pre>
263
264
Sample session:
265
266
<pre>
267
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
268
NAME            READY   STATUS    RESTARTS     AGE
269
etcd-server63   1/1     Running   0            3m11s
270
etcd-server65   1/1     Running   3            7d2h
271
etcd-server83   1/1     Running   8 (6d ago)   7d2h
272
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
273
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
274
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
275
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false
276
277
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert  /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
278
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77
279 1 Nico Schottelius
280
</pre>
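
To avoid copying the member id by hand, it can be picked out of the @member list@ output by node name. A minimal sketch, assuming the comma separated output format shown in the sample session above; @etcd_member_id@ is a hypothetical helper name:

```shell
# Read "etcdctl member list" output on stdin and print the member id
# (first column) of the member whose name (third column) matches $1.
etcd_member_id() {
    awk -F', ' -v name="$1" '$3 == name { print $1 }'
}
```

Usage: pipe the @member list@ command from above (without @-t@, as the output is piped) into @etcd_member_id server63@ and pass the result to @member remove@.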
281
282
SEE ALSO
283
284
* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster
285 56 Nico Schottelius
286 213 Nico Schottelius
h4. Updating the members
287
288
1) get alive member
289
290
<pre>
291
% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
292
NAME            READY   STATUS    RESTARTS   AGE
293
etcd-server67   1/1     Running   1          185d
294
etcd-server69   1/1     Running   1          185d
295
etcd-server71   1/1     Running   2          185d
296
[20:57] sun:~% 
297
</pre>
298
299
2) get member list
300
301
* in this case via crictl, as the api does not work correctly anymore
302
303
<pre>
304
305
306
</pre>
307
308
309
3) update
310
311
<pre>
312
etcdctl member update MEMBERID  --peer-urls=https://[...]:2380
313
314
315
</pre>
316
317 147 Nico Schottelius
h3. Node labels (adding, showing, removing)
318
319
Listing the labels:
320
321
<pre>
322
kubectl get nodes --show-labels
323
</pre>
324
325
Adding labels:
326
327
<pre>
328
kubectl label nodes LIST-OF-NODES label1=value1 
329
330
</pre>
331
332
For instance:
333
334
<pre>
335
kubectl label nodes router2 router3 hosttype=router 
336
</pre>
337
338
Selecting nodes in pods:
339
340
<pre>
341
apiVersion: v1
kind: Pod
...
spec:
  nodeSelector:
    hosttype: router
347
</pre>
348
349 148 Nico Schottelius
Removing labels by adding a minus at the end of the label name:
350
351
<pre>
352
kubectl label node <nodename> <labelname>-
353
</pre>
354
355
For instance:
356
357
<pre>
358
kubectl label nodes router2 router3 hosttype- 
359
</pre>
360
361 147 Nico Schottelius
SEE ALSO
362 1 Nico Schottelius
363 148 Nico Schottelius
* https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/
364
* https://stackoverflow.com/questions/34067979/how-to-delete-a-node-label-by-command-and-api
365 147 Nico Schottelius
366 199 Nico Schottelius
h3. Listing all pods on a node
367
368
<pre>
369
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=serverXX
370
</pre>
371
372
Found on https://stackoverflow.com/questions/62000559/how-to-list-all-the-pods-running-in-a-particular-worker-node-by-executing-a-comm
373
374 101 Nico Schottelius
h3. Hardware Maintenance using ungleich-hardware
375
376
Use the following manifest and replace the HOST with the actual host:
377
378
<pre>
379
apiVersion: v1
kind: Pod
metadata:
  name: ungleich-hardware-HOST
spec:
  containers:
  - name: ungleich-hardware
    image: ungleich/ungleich-hardware:0.0.5
    args:
    - sleep
    - "1000000"
    volumeMounts:
      - mountPath: /dev
        name: dev
    securityContext:
      privileged: true
  nodeSelector:
    kubernetes.io/hostname: "HOST"

  volumes:
    - name: dev
      hostPath:
        path: /dev
402
</pre>
403
404 102 Nico Schottelius
Also see: [[The_ungleich_hardware_maintenance_guide]]
405
406 105 Nico Schottelius
h3. Triggering a cronjob / creating a job from a cronjob
407 104 Nico Schottelius
408
To test a cronjob, we can create a job from a cronjob:
409
410
<pre>
411
kubectl create job --from=cronjob/volume2-daily-backup volume2-manual
412
</pre>
413
414
This creates the job @volume2-manual@ based on the cronjob @volume2-daily-backup@.
415
416 112 Nico Schottelius
h3. su-ing into a user that has nologin shell set
417
418
Users often have nologin set as their shell inside the container. To be able to execute maintenance commands within the container, we can use @su -s /bin/sh@ like this:
420
421
<pre>
422
su -s /bin/sh -c '/path/to/your/script' testuser
423
</pre>
424
425
Found on https://serverfault.com/questions/351046/how-to-run-command-as-user-who-has-usr-sbin-nologin-as-shell
426
427 113 Nico Schottelius
h3. How to print a secret value
428
429
Assuming you want the "password" item from a secret, use:
430
431
<pre>
432
kubectl get secret SECRETNAME -o jsonpath="{.data.password}" | base64 -d; echo "" 
433
</pre>
434
435 209 Nico Schottelius
h3. Fixing the "ImageInspectError"
436
437
If you see this problem:
438
439
<pre>
440
# kubectl get pods
441
NAME                                                       READY   STATUS                   RESTARTS   AGE
442
bird-router-server137-bird-767f65bb47-g4xsh                0/1     Init:ImageInspectError   0          77d
443
bird-router-server137-openvpn-server120-5c987b7ffb-cn9xf   0/1     ImageInspectError        1          159d
444
bird-router-server137-unbound-5c6f5d4bb6-cxbpr             0/1     ImageInspectError        1          159d
445
</pre>
446
447
Fixes so far:
448
449
* correct registries.conf
450
451 212 Nico Schottelius
h3. Automatic cleanup of images
452
453
* options to kubelet
454
455
<pre>
456
  --image-gc-high-threshold=90: The percent of disk usage after which image garbage collection is always run. Default: 90%
457
  --image-gc-low-threshold=80: The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Default: 80%
458
</pre>
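
The same thresholds can also be set in the kubelet configuration file instead of via command line flags. A sketch, assuming the @KubeletConfiguration@ fields @imageGCHighThresholdPercent@ and @imageGCLowThresholdPercent@; the file name @kubelet-image-gc.yaml@ is a placeholder and the values are just the ones quoted above:

```shell
# Write a KubeletConfiguration fragment with explicit image GC thresholds.
# Merge the two *Percent fields into the kubelet config file that your
# nodes actually use; this file name is only an example.
cat > kubelet-image-gc.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 90
imageGCLowThresholdPercent: 80
EOF
```

The kubelet needs to be restarted after changing its configuration file.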
459 209 Nico Schottelius
460 173 Nico Schottelius
h3. How to upgrade a kubernetes cluster
461 172 Nico Schottelius
462
h4. General
463
464
* Should be done every X months to stay up-to-date
465
** X probably something like 3-6
466
* kubeadm based clusters
467
* Needs specific kubeadm versions for upgrade
468
* Follow instructions on https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
469 190 Nico Schottelius
* Finding releases: https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG
470 172 Nico Schottelius
471
h4. Getting a specific kubeadm or kubelet version
472
473
<pre>
474 190 Nico Schottelius
# Set RELEASE to the version you need; only the last assignment takes effect.
# Versions used so far:
RELEASE=v1.22.17
RELEASE=v1.23.17
RELEASE=v1.24.9
RELEASE=v1.25.9
RELEASE=v1.26.6
RELEASE=v1.27.2

ARCH=amd64

# Download kubeadm and kubelet for the selected release and make them executable
curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet}
chmod u+x kubeadm kubelet
485 172 Nico Schottelius
</pre>
486
487
h4. Steps
488
489
* kubeadm upgrade plan
490
** On one control plane node
491
* kubeadm upgrade apply vXX.YY.ZZ
492
** On one control plane node
493 189 Nico Schottelius
* kubeadm upgrade node
494
** On all other control plane nodes
495
** On all worker nodes afterwards
496
497 172 Nico Schottelius
498 173 Nico Schottelius
Repeat for all control plane nodes. Then upgrade kubelet on all other nodes via the package manager.
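
The order of the steps can be written down as a checklist generator. This is a sketch only: @print_upgrade_plan@ is a hypothetical helper that merely prints which command belongs on which node and executes nothing.

```shell
# Print the kubeadm upgrade commands per node.
# Arguments: target version, first control plane node, then all remaining
# nodes (other control plane nodes first, workers afterwards).
print_upgrade_plan() {
    version="$1"
    first="$2"
    shift 2
    echo "${first}: kubeadm upgrade plan"
    echo "${first}: kubeadm upgrade apply ${version}"
    for node in "$@"; do
        echo "${node}: kubeadm upgrade node"
    done
}
```

For instance @print_upgrade_plan v1.27.2 server67 server69 server71@ prints two lines for server67 and one @kubeadm upgrade node@ line for each of the other nodes.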
499 172 Nico Schottelius
500 193 Nico Schottelius
h4. Upgrading to 1.22.17
501 1 Nico Schottelius
502 193 Nico Schottelius
* https://v1-22.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
503 194 Nico Schottelius
* Need to create a kubeadm config map
504 198 Nico Schottelius
** f.i. using the following
505
** @/usr/local/bin/kubeadm-v1.22.17   upgrade --config kubeadm.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins,CoreDNSMigration apply -y v1.22.17@
506 193 Nico Schottelius
* Done for p6 on 2023-10-04
507
508
h4. Upgrading to 1.23.17
509
510
* https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
511
* No special notes
512
* Done for p6 on 2023-10-04
513
514
h4. Upgrading to 1.24.17
515
516
* https://v1-24.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
517
* No special notes
518
* Done for p6 on 2023-10-04
519
520
h4. Upgrading to 1.25.14
521
522
* https://v1-25.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
523
* No special notes
524
* Done for p6 on 2023-10-04
525
526
h4. Upgrading to 1.26.9
527
528 1 Nico Schottelius
* https://v1-26.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
529 193 Nico Schottelius
* No special notes
530
* Done for p6 on 2023-10-04
531 188 Nico Schottelius
532 196 Nico Schottelius
h4. Upgrading to 1.27
533 186 Nico Schottelius
534 192 Nico Schottelius
* https://v1-27.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
535 186 Nico Schottelius
* kubelet will not start anymore
536
* reason: @"command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"@
537
* /var/lib/kubelet/kubeadm-flags.env contains that parameter
538
* remove it, start kubelet
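
Removing the parameter can be scripted. A minimal sketch, assuming the flag appears as @--container-runtime=VALUE@ inside the file; @strip_container_runtime_flag@ is a hypothetical helper name. The @--container-runtime-endpoint@ flag is left untouched:

```shell
# Remove the obsolete --container-runtime=... flag from a kubelet flags file.
# Usage: strip_container_runtime_flag /var/lib/kubelet/kubeadm-flags.env
strip_container_runtime_flag() {
    file="$1"
    # Match only "--container-runtime=<value>" plus one trailing space;
    # "--container-runtime-endpoint=..." does not match because of the "=".
    sed -E 's/--container-runtime=[^ "]+ ?//' "$file" > "${file}.new" &&
        mv "${file}.new" "$file"
}
```

Afterwards start kubelet again using the init system of the node.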
539 192 Nico Schottelius
540 197 Nico Schottelius
h4. Upgrading to 1.28
541 192 Nico Schottelius
542
* https://v1-28.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
543 186 Nico Schottelius
544 210 Nico Schottelius
h4. Upgrading to 1.29
545
546
* Done for many clusters around 2024-01-10
547
* Unsure if it was properly released
548
* https://v1-29.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
549
550 186 Nico Schottelius
h4. Upgrade to crio 1.27: missing crun
551
552
Error message
553
554
<pre>
555
level=fatal msg="validating runtime config: runtime validation: \"crun\" not found in $PATH: exec: \"crun\": executable file not found in $PATH"
556
</pre>
557
558
Fix:
559
560
<pre>
561
apk add crun
562
</pre>
563
564 157 Nico Schottelius
h2. Reference CNI
565
566
* Mainly "stupid", but effective plugins
567
* Main documentation on https://www.cni.dev/plugins/current/
568 158 Nico Schottelius
* Plugins
569
** bridge
570
*** Can create the bridge on the host
571
*** But seems not to be able to add host interfaces to it as well
572
*** Has support for vlan tags
573
** vlan
574
*** creates vlan tagged sub interface on the host
575 160 Nico Schottelius
*** "It's a 1:1 mapping (i.e. no bridge in between)":https://github.com/k8snetworkplumbingwg/multus-cni/issues/569
576 158 Nico Schottelius
** host-device
577
*** moves the interface from the host into the container
578
*** very easy for physical connections to containers
579 159 Nico Schottelius
** ipvlan
580
*** "virtualisation" of a host device
581
*** routing based on IP
582
*** Same MAC for everyone
583
*** Cannot reach the master interface
584
** macvlan
585
*** With mac addresses
586
*** Supports various modes (to be checked)
587
** ptp ("point to point")
588
*** Creates a host device and connects it to the container
589
** win*
590 158 Nico Schottelius
*** Windows implementations
591 157 Nico Schottelius
592 62 Nico Schottelius
h2. Calico CNI
593
594
h3. Calico Installation
595
596
* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
597
* This has the following advantages:
598
** Easy to upgrade
599
** Does not require us to configure IPv6/dual stack settings as the tigera operator figures out things on its own
600
601
Usually plain calico can be installed directly using:
602
603
<pre>
604 174 Nico Schottelius
VERSION=v3.25.0
605 149 Nico Schottelius
606 1 Nico Schottelius
helm repo add projectcalico https://docs.projectcalico.org/charts
607 167 Nico Schottelius
helm repo update
608 124 Nico Schottelius
helm upgrade --install --namespace tigera calico projectcalico/tigera-operator --version $VERSION --create-namespace
609 1 Nico Schottelius
</pre>
610 92 Nico Schottelius
611
* Check the tags on https://github.com/projectcalico/calico/tags for the latest release
612 62 Nico Schottelius
613
h3. Installing calicoctl
614
615 115 Nico Schottelius
* General installation instructions, including binary download: https://projectcalico.docs.tigera.io/maintenance/clis/calicoctl/install
616
617 62 Nico Schottelius
To be able to manage and configure calico, we need to "install calicoctl (we chose the pod variant)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod
619
620
<pre>
621
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
622
</pre>
623
624 93 Nico Schottelius
Or version specific:
625
626
<pre>
627
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.20.4/manifests/calicoctl.yaml
628 97 Nico Schottelius
629
# For 3.22
630
kubectl apply -f https://projectcalico.docs.tigera.io/archive/v3.22/manifests/calicoctl.yaml
631 93 Nico Schottelius
</pre>
632
633 70 Nico Schottelius
And make it more easily accessible via an alias:
634
635
<pre>
636
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
637
</pre>
638
639 62 Nico Schottelius
h3. Calico configuration
640
641 63 Nico Schottelius
By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp
642
with an upstream router to propagate podcidr and servicecidr.
643 62 Nico Schottelius
644
Default settings in our infrastructure:
645
646
* We use a full-mesh using the @nodeToNodeMeshEnabled: true@ option
647
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ecmp)
648 1 Nico Schottelius
* We use private ASNs for k8s clusters
649 63 Nico Schottelius
* We do *not* use any overlay
650 62 Nico Schottelius
651
After installing calico and calicoctl the last step of the installation is usually:
652
653 1 Nico Schottelius
<pre>
654 79 Nico Schottelius
calicoctl create -f - < calico-bgp.yaml
655 62 Nico Schottelius
</pre>
656
657
658
A sample BGP configuration:
659
660
<pre>
661
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  asNumber: 65534
  serviceClusterIPs:
  - cidr: 2a0a:e5c0:10:3::/108
  serviceExternalIPs:
  - cidr: 2a0a:e5c0:10:3::/108
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: router1-place10
spec:
  peerIP: 2a0a:e5c0:10:1::50
  asNumber: 213081
  keepOriginalNextHop: true
683
</pre>
684
685 126 Nico Schottelius
h2. Cilium CNI (experimental)
686
687 137 Nico Schottelius
h3. Status
688
689 138 Nico Schottelius
*NO WORKING CILIUM CONFIGURATION FOR IPV6 only modes*
690 137 Nico Schottelius
691 146 Nico Schottelius
h3. Latest error
692
693
It seems cilium does not run on IPv6 only hosts:
694
695
<pre>
696
level=info msg="Validating configured node address ranges" subsys=daemon
697
level=fatal msg="postinit failed" error="external IPv4 node address could not be derived, please configure via --ipv4-node" subsys=daemon
698
level=info msg="Starting IP identity watcher" subsys=ipcache
699
</pre>
700
701
It crashes after that log entry
702
703 128 Nico Schottelius
h3. BGP configuration
704
705
* The cilium-operator will not start without a correct configmap being present beforehand (see error message below)
706
* Creating the bgp config beforehand as a configmap is thus required.
707
708
The error one gets without the configmap present:
709
710
Pods are hanging with:
711
712
<pre>
713
cilium-bpqm6                       0/1     Init:0/4            0             9s
714
cilium-operator-5947d94f7f-5bmh2   0/1     ContainerCreating   0             9s
715
</pre>
716
717
The error message in the cilium-operator is:
718
719
<pre>
720
Events:
721
  Type     Reason       Age                From               Message
722
  ----     ------       ----               ----               -------
723
  Normal   Scheduled    80s                default-scheduler  Successfully assigned kube-system/cilium-operator-5947d94f7f-lqcsp to server56
724
  Warning  FailedMount  16s (x8 over 80s)  kubelet            MountVolume.SetUp failed for volume "bgp-config-path" : configmap "bgp-config" not found
725
</pre>
726
727
A correct bgp config looks like this:
728
729
<pre>
730
apiVersion: v1
kind: ConfigMap
metadata:
  name: bgp-config
  namespace: kube-system
data:
  config.yaml: |
    peers:
      - peer-address: 2a0a:e5c0::46
        peer-asn: 209898
        my-asn: 65533
      - peer-address: 2a0a:e5c0::47
        peer-asn: 209898
        my-asn: 65533
    address-pools:
      - name: default
        protocol: bgp
        addresses:
          - 2a0a:e5c0:0:14::/64
749
</pre>

h3. Installation

Adding the repo:

<pre>
helm repo add cilium https://helm.cilium.io/
helm repo update
</pre>

Installing and configuring cilium:

<pre>
ipv6pool=2a0a:e5c0:0:14::/112
version=1.12.2

helm upgrade --install cilium cilium/cilium --version $version \
  --namespace kube-system \
  --set ipv4.enabled=false \
  --set ipv6.enabled=true \
  --set enableIPv6Masquerade=false \
  --set bgpControlPlane.enabled=true

#  --set ipam.operator.clusterPoolIPv6PodCIDRList=$ipv6pool

# Old style bgp?
#   --set bgp.enabled=true --set bgp.announce.podCIDR=true \

# Show possible configuration options
helm show values cilium/cilium
</pre>

Using a /64 for @ipam.operator.clusterPoolIPv6PodCIDRList@ fails with:

<pre>
level=fatal msg="Unable to init cluster-pool allocator" error="unable to initialize IPv6 allocator New CIDR set failed; the node CIDR size is too big" subsys=cilium-operator-generic
</pre>

See also https://github.com/cilium/cilium/issues/20756

A /112, in contrast, appears to work.
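
Why a /64 is rejected while a /112 is accepted can be sketched with a bit of arithmetic. The sketch rests on two assumptions that are not stated on this page and should be checked against the cilium version in use: the per-node IPv6 mask defaults to /120 (helm value @ipam.operator.clusterPoolIPv6MaskSize@), and the allocator refuses pools whose prefix differs from the node mask by more than 16 bits. @check_pool@ is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: the allocator accepts at most 16 bits between pool prefix and node mask
MAX_DIFF=16

# check_pool POOL_PREFIX [NODE_MASK]
# NODE_MASK defaults to 120 (assumed cilium default for IPv6)
check_pool() {
    pool_prefix=$1
    node_mask=${2:-120}
    diff=$((node_mask - pool_prefix))
    if [ "$diff" -lt 0 ]; then
        echo "invalid: pool /$pool_prefix is smaller than node mask /$node_mask"
    elif [ "$diff" -gt "$MAX_DIFF" ]; then
        echo "too big: $diff bits between /$pool_prefix and /$node_mask"
    else
        echo "ok: /$pool_prefix holds $((1 << diff)) node CIDRs of size /$node_mask"
    fi
}

check_pool 64
check_pool 112
```

Under these assumptions the /64 is 56 bits away from the node mask and is refused, while the /112 is only 8 bits away and yields 256 per-node ranges.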

h3. Kernel modules

Cilium requires the following modules to be loaded on the host (they are not loaded by default):

<pre>
modprobe ip6table_raw
modprobe ip6table_filter
</pre>
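
Checking which of the two modules are still missing can be sketched as below. @missing_modules@ and @load_missing_modules@ are hypothetical helper names; the sketch only assumes the usual @/proc/modules@ layout, in which every line starts with the module name.

```shell
#!/bin/sh
REQUIRED_MODULES="ip6table_raw ip6table_filter"

# Print each required module that is not listed in the given file.
# The file defaults to /proc/modules, where every line starts with the module name.
missing_modules() {
    modules_file=${1:-/proc/modules}
    for module in $REQUIRED_MODULES; do
        grep -q "^${module} " "$modules_file" || echo "$module"
    done
}

# Load whatever is missing (needs root); does nothing if all modules are present.
load_missing_modules() {
    for module in $(missing_modules "$@"); do
        modprobe "$module"
    done
}
```

Loading with @modprobe@ does not survive a reboot. To persist the modules, they are usually listed in a distribution specific file (for instance @/etc/modules@ on Alpine or a file below @/etc/modules-load.d/@ on systemd based hosts); which one applies here is an assumption to verify per host.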

h3. Interesting helm flags

* autoDirectNodeRoutes
* bgpControlPlane.enabled = true

h3. SEE ALSO

* https://docs.cilium.io/en/v1.12/helm-reference/

h2. Multus

* https://github.com/k8snetworkplumbingwg/multus-cni
* Installing a deployment w/ CRDs

<pre>
VERSION=v4.0.1

kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-crio.yml
</pre>

h2. ArgoCD

h3. Argocd Installation

* See https://argo-cd.readthedocs.io/en/stable/

As there is no configuration management present yet, argocd is installed using:

<pre>
kubectl create namespace argocd

# Either: latest stable
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Or: a specific version
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.3.2/manifests/install.yaml
</pre>

h3. Get the argocd credentials

<pre>
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
</pre>

h3. Accessing argocd

In regular IPv6 clusters:

* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN

In legacy IPv4 clusters:

<pre>
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
</pre>

* Navigate to https://localhost:8080

h3. Using the argocd webhook to trigger changes

* To trigger changes, POST JSON to https://argocd.example.com/api/webhook

h3. Deploying an application

* Applications are pushed via git to gitea (code.ungleich.ch) and then pulled by argo
* Always include the *redmine-url* pointing to the (customer) ticket
** Also add the support-url if it exists

Application sample:

<pre>
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gitea-CUSTOMER
  namespace: argocd
spec:
  destination:
    namespace: default
    server: 'https://kubernetes.default.svc'
  source:
    path: apps/prod/gitea
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
    targetRevision: HEAD
    helm:
      parameters:
        - name: storage.data.storageClass
          value: rook-ceph-block-hdd
        - name: storage.data.size
          value: 200Gi
        - name: storage.db.storageClass
          value: rook-ceph-block-ssd
        - name: storage.db.size
          value: 10Gi
        - name: storage.letsencrypt.storageClass
          value: rook-ceph-block-hdd
        - name: storage.letsencrypt.size
          value: 50Mi
        - name: letsencryptStaging
          value: 'no'
        - name: fqdn
          value: 'code.verua.online'
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
  info:
    - name: 'redmine-url'
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
    - name: 'support-url'
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
</pre>

h2. Helm related operations and conventions

We use helm charts extensively.

* In production, they are managed via argocd.
* In development, helm charts can be developed and deployed manually using the helm utility.

h3. Installing a helm chart

One can use the usual pattern of

<pre>
helm install <releasename> <chartdirectory>
</pre>

However, when testing helm charts you often want to reinstall or update a release. The following pattern is "better", because it installs the release if it is missing and upgrades it if it is already installed:

<pre>
helm upgrade --install <releasename> <chartdirectory>
</pre>

h3. Naming services and deployments in helm charts [Application labels]

* We always have {{ .Release.Name }} to identify the current "instance"
* Deployments:
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ...
* See more about standard labels on
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
** https://helm.sh/docs/chart_best_practices/labels/

h3. Show all versions of a helm chart

<pre>
helm search repo -l repo/chart
</pre>

For example:

<pre>
% helm search repo -l projectcalico/tigera-operator
NAME                            CHART VERSION   APP VERSION     DESCRIPTION
projectcalico/tigera-operator   v3.23.3         v3.23.3         Installs the Tigera operator for Calico
projectcalico/tigera-operator   v3.23.2         v3.23.2         Installs the Tigera operator for Calico
....
</pre>

h3. Show possible values of a chart

<pre>
helm show values <repo/chart>
</pre>

Example:

<pre>
helm show values ingress-nginx/ingress-nginx
</pre>

h3. Show all possible charts in a repo

<pre>
helm search repo REPO
</pre>

h3. Download a chart

For instance, to inspect it locally, use:

<pre>
helm pull <repo/chart>
</pre>

h2. Rook + Ceph

h3. Installation

* Usually directly via argocd

h3. Executing ceph commands

Using the ceph-tools pod as follows:

<pre>
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
</pre>
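
For repeated use, the lookup of the tools pod and the exec can be wrapped in a small function. This is only a sketch: @rook_ceph@ and the @KUBECTL@ override are hypothetical names, and it picks the first tools pod instead of all of them, so that the command stays valid if more than one pod matches.

```shell
#!/bin/sh
# KUBECTL can be overridden, for instance to point at a wrapper for another cluster
KUBECTL=${KUBECTL:-kubectl}

# rook_ceph ARGS...: run "ceph ARGS..." inside the first rook-ceph-tools pod
rook_ceph() {
    tools_pod=$("$KUBECTL" -n rook-ceph get pods -l app=rook-ceph-tools \
        -o jsonpath='{.items[0].metadata.name}') || return 1
    if [ -z "$tools_pod" ]; then
        echo "no rook-ceph-tools pod found" >&2
        return 1
    fi
    "$KUBECTL" -n rook-ceph exec -i "$tools_pod" -- ceph "$@"
}

# Examples:
#   rook_ceph -s
#   rook_ceph osd tree
```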

h3. Inspecting the logs of a specific server

<pre>
# Get the related pods
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
...

# Inspect the logs of a specific pod
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx
</pre>

h3. Inspecting the logs of the rook-ceph-operator

<pre>
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
</pre>

h3. (Temporarily) Disabling the rook-operator

* First disable the sync in argocd
* Then scale the operator down:

<pre>
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
</pre>

When done with the work/maintenance, re-enable sync in argocd.
The following command is thus, strictly speaking, not required, as argocd will restore the replica count on its own:

<pre>
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
</pre>

h3. Restarting the rook operator

<pre>
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
</pre>

h3. Triggering server prepare / adding new osds

The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re scan", simply delete that pod:

<pre>
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
</pre>

This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.

h3. Removing an OSD

* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
* Then delete the related deployment

Set the OSD id in osd-purge.yaml and apply it. The OSD should be down beforehand.

<pre>
apiVersion: batch/v1
kind: Job
metadata:
  name: rook-ceph-purge-osd
  namespace: rook-ceph # namespace:cluster
  labels:
    app: rook-ceph-purge-osd
spec:
  template:
    metadata:
      labels:
        app: rook-ceph-purge-osd
    spec:
      serviceAccountName: rook-ceph-purge-osd
      containers:
        - name: osd-removal
          image: rook/ceph:master
          # TODO: Insert the OSD ID in the last parameter that is to be removed
          # The OSD IDs are a comma-separated list. For example: "0" or "0,2".
          # If you want to preserve the OSD PVCs, set `--preserve-pvc true`.
          #
          # A --force-osd-removal option is available if the OSD should be destroyed even though the
          # removal could lead to data loss.
          args:
            - "ceph"
            - "osd"
            - "remove"
            - "--preserve-pvc"
            - "false"
            - "--force-osd-removal"
            - "false"
            - "--osd-ids"
            - "SETTHEOSDIDHERE"
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: ROOK_MON_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  key: data
                  name: rook-ceph-mon-endpoints
            - name: ROOK_CEPH_USERNAME
              valueFrom:
                secretKeyRef:
                  key: ceph-username
                  name: rook-ceph-mon
            - name: ROOK_CEPH_SECRET
              valueFrom:
                secretKeyRef:
                  key: ceph-secret
                  name: rook-ceph-mon
            - name: ROOK_CONFIG_DIR
              value: /var/lib/rook
            - name: ROOK_CEPH_CONFIG_OVERRIDE
              value: /etc/rook/config/override.conf
            - name: ROOK_FSID
              valueFrom:
                secretKeyRef:
                  key: fsid
                  name: rook-ceph-mon
            - name: ROOK_LOG_LEVEL
              value: DEBUG
          volumeMounts:
            - mountPath: /etc/ceph
              name: ceph-conf-emptydir
            - mountPath: /var/lib/rook
              name: rook-config
      volumes:
        - emptyDir: {}
          name: ceph-conf-emptydir
        - emptyDir: {}
          name: rook-config
      restartPolicy: Never
</pre>
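
Filling in the OSD id can be scripted instead of editing the file by hand. The following is a minimal sketch, assuming the manifest above is stored as @osd-purge.yaml@ with the @SETTHEOSDIDHERE@ placeholder untouched; @render_osd_purge@ is a hypothetical helper name.

```shell
#!/bin/sh
# render_osd_purge OSD_IDS < osd-purge.yaml
# Replaces the SETTHEOSDIDHERE placeholder; OSD_IDS is "6" or a list such as "0,2".
render_osd_purge() {
    osd_ids=$1
    # Refuse anything that is not a comma-separated list of numbers
    if ! printf '%s\n' "$osd_ids" | grep -Eq '^[0-9]+(,[0-9]+)*$'; then
        echo "invalid OSD id list: $osd_ids" >&2
        return 1
    fi
    sed "s/SETTHEOSDIDHERE/${osd_ids}/"
}

# Usage:
#   render_osd_purge 6 < osd-purge.yaml | kubectl apply -f -
```

Validating the id list first avoids applying a job that still carries the placeholder or a mistyped id.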

Deleting the deployment:

<pre>
[18:05] bridge:~% kubectl -n rook-ceph delete deployment rook-ceph-osd-6
deployment.apps "rook-ceph-osd-6" deleted
</pre>

h3. Placement of mons/osds/etc.

See https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#placement-configuration-settings

h3. Setting up and managing S3 object storage

h4. Endpoints

| Location | Endpoint |
| p5 | https://s3.k8s.place5.ungleich.ch |

h4. Setting up a storage class

* This will store the buckets of a specific customer

Similar to this:

<pre>
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ungleich-archive-bucket-sc
  namespace: rook-ceph
provisioner: rook-ceph.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: place5
  objectStoreNamespace: rook-ceph
</pre>

h4. Setting up the Bucket

Similar to this:

<pre>
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ungleich-archive-bucket-claim
  namespace: rook-ceph
spec:
  generateBucketName: ungleich-archive-ceph-bkt
  storageClassName: ungleich-archive-bucket-sc
  additionalConfig:
    # To set a quota for the OBC
    #maxObjects: "1000"
    maxSize: "100G"
</pre>

* See also: https://rook.io/docs/rook/latest-release/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/#obc-custom-resource

h4. Getting the credentials for the bucket

* Get "public" information from the configmap
* Get the secret values from the secret

<pre>
name=BUCKETNAME
endpoint=https://s3.k8s.place5.ungleich.ch

# The configmap and the secret carry the same name as the ObjectBucketClaim
cm=$(kubectl -n rook-ceph get configmap -o yaml ${name}-bucket-claim)
sec=$(kubectl -n rook-ceph get secrets -o yaml ${name}-bucket-claim)

# Quote the variables so that the YAML line breaks survive in sh/bash
AWS_ACCESS_KEY_ID=$(echo "$sec" | yq .data.AWS_ACCESS_KEY_ID | base64 -d)
AWS_SECRET_ACCESS_KEY=$(echo "$sec" | yq .data.AWS_SECRET_ACCESS_KEY | base64 -d)

bucket_name=$(echo "$cm" | yq .data.BUCKET_NAME)
</pre>

h5. Access via s4cmd

<pre>
s4cmd --endpoint-url ${endpoint} --access-key=${AWS_ACCESS_KEY_ID} --secret-key=${AWS_SECRET_ACCESS_KEY} ls
</pre>

h2. Ingress + Cert Manager

* We deploy "nginx-ingress":https://docs.nginx.com/nginx-ingress-controller/ to get an ingress
* We deploy "cert-manager":https://cert-manager.io/ to handle certificates
* We independently deploy @ClusterIssuer@ to allow the cert-manager app to deploy and the issuer to be created once the CRDs from cert manager are in place

h3. IPv4 reachability

The ingress is by default IPv6 only. To make it reachable from the IPv4 world, get its IPv6 address and configure a NAT64 mapping in Jool.

Steps:

h4. Get the ingress IPv6 address

Use @kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''@

Example:

<pre>
kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.spec.clusterIP}'; echo ''
2a0a:e5c0:10:1b::ce11
</pre>

h4. Add NAT64 mapping

* Update the @__dcl_jool_siit@ cdist type
* Record the two IPs (IPv6 and IPv4)
* Configure all routers

h4. Add DNS record

To make the ingress usable as a CNAME destination, create an "ingress" DNS record, such as:

<pre>
; k8s ingress for dev
dev-ingress                 AAAA 2a0a:e5c0:10:1b::ce11
dev-ingress                 A 147.78.194.23
</pre>

h4. Add supporting wildcard DNS

If you plan to add various sites under a specific domain, you can add a wildcard DNS entry, such as @*.k8s-dev.django-hosting.ch@:

<pre>
*.k8s-dev         CNAME dev-ingress.ungleich.ch.
</pre>

h2. Harbor

* We use "Harbor":https://goharbor.io/ as an image registry for our own images. Internal app reference: apps/prod/harbor.
* The admin password is in the password store; by default it is Harbor12345
* At the moment harbor only authenticates against the internal ldap tree

h3. LDAP configuration

* The url needs to be ldaps://...
* uid = uid
* The rest is standard

h2. Monitoring / Prometheus

* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/

Access via ...

* http://prometheus-k8s.monitoring.svc:9090
* http://grafana.monitoring.svc:3000
* http://alertmanager.monitoring.svc:9093

h3. Prometheus Options

* "helm/kube-prometheus-stack":https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
** Includes dashboards and co.
* "manifest based kube-prometheus":https://github.com/prometheus-operator/kube-prometheus
** Includes dashboards and co.
* "Prometheus Operator":https://github.com/prometheus-operator/prometheus-operator (mainly CRD manifests)

h3. Grafana default password

* If not changed: admin / @prom-operator@
** Can be changed via:

<pre>
    helm:
      values: |-
        configurations: |-
          grafana:
            adminPassword: "..."
</pre>

h2. Nextcloud

h3. How to get the nextcloud credentials

* The initial username is set to "nextcloud"
* The password is autogenerated and saved in a kubernetes secret

<pre>
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo ""
</pre>

h3. How to fix "Access through untrusted domain"

* Nextcloud stores the initial domain configuration
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
* To fix, edit /var/www/html/config/config.php and correct the domain
* Then delete the pods

h3. Running occ commands inside the nextcloud container

* Find the pod in the right namespace

Then run inside the pod:

<pre>
su www-data -s /bin/sh -c ./occ
</pre>

* -s /bin/sh is needed as the default shell is set to /bin/false
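
Both steps (entering the pod and switching to www-data) can be combined from the outside. A minimal sketch, assuming the container's working directory is the nextcloud root so that @./occ@ resolves; @nextcloud_occ@ and the @KUBECTL@ override are hypothetical names.

```shell
#!/bin/sh
# KUBECTL can be overridden, for instance to point at a wrapper for another cluster
KUBECTL=${KUBECTL:-kubectl}

# nextcloud_occ NAMESPACE POD [OCC_ARGS...]
nextcloud_occ() {
    if [ "$#" -lt 2 ]; then
        echo "usage: nextcloud_occ NAMESPACE POD [OCC_ARGS...]" >&2
        return 1
    fi
    namespace=$1
    pod=$2
    shift 2
    # -s /bin/sh is required because www-data has /bin/false as its shell
    "$KUBECTL" -n "$namespace" exec "$pod" -- su www-data -s /bin/sh -c "./occ $*"
}

# Example:
#   nextcloud_occ NAMESPACE PODNAME files:scan --all
```

The occ arguments are joined into one string for @su -c@, so arguments containing spaces would need extra quoting.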

h4. Rescanning files

* If files have been added without nextcloud's knowledge:

<pre>
su www-data -s /bin/sh -c "./occ files:scan --all"
</pre>

h2. Sealed Secrets

* Install kubeseal

<pre>
KUBESEAL_VERSION='0.23.0'
wget "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz"
tar -xvzf kubeseal-${KUBESEAL_VERSION:?}-linux-amd64.tar.gz kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal
</pre>

* Fetch the public certificate used for sealing

<pre>
kubeseal --fetch-cert > /tmp/public-key-cert.pem
</pre>

* Create the secret, for example:

<pre>
apiVersion: v1
kind: Secret
metadata:
  name: Release.Name-postgres-config
  annotations:
    secret-generator.v1.mittwald.de/autogenerate: POSTGRES_PASSWORD
    hosting: Release.Name
  labels:
    app.kubernetes.io/instance: Release.Name
    app.kubernetes.io/component: postgres
stringData:
  POSTGRES_USER: postgresUser
  POSTGRES_DB: postgresDBName
  POSTGRES_INITDB_ARGS: "--no-locale --encoding=UTF8"
</pre>

* Convert secret.yaml to sealed-secret.yaml

<pre>
kubeseal -n <namespace> --cert=/tmp/public-key-cert.pem --format=yaml < ./secret.yaml > ./sealed-secret.yaml
</pre>

* Place sealed-secret.yaml in the helm chart directory
* Related tickets: #11989, #12120

h2. Infrastructure versions

h3. ungleich kubernetes infrastructure v5 (2021-10)

Clusters are configured / set up in this order:

* Bootstrap via kubeadm
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
** "rook for storage via argocd":https://rook.io/
** haproxy as IPv4-to-IPv6 proxy inside IPv6 clusters, via argocd
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot

h3. ungleich kubernetes infrastructure v4 (2021-09)

* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
* The rook operator is still being installed via helm

h3. ungleich kubernetes infrastructure v3 (2021-07)

* rook is now installed via helm via argocd instead of directly via manifests

h3. ungleich kubernetes infrastructure v2 (2021-05)

* Replaced fluxv2 from ungleich k8s v1 with argocd
** argocd can apply helm templates directly without needing to go through Chart releases
* We are also using argoflow for build flows
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building

h3. ungleich kubernetes infrastructure v1 (2021-01)

We are using the following components:

* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
** Needed for basic networking
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
** Needed so that secrets are not stored in the git repository, but only in the cluster
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
** Needed to get letsencrypt certificates for services
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
** rbd for almost everything, *ReadWriteOnce*
** cephfs for smaller things, multi access *ReadWriteMany*
** Needed for providing persistent storage
* "flux v2":https://fluxcd.io/
** Needed to manage resources automatically