h1. The ungleich kubernetes infrastructure and ungleich kubernetes manual

{{toc}}

h2. Status

This document is **pre-production**.
This document is to become the ungleich kubernetes infrastructure overview as well as the ungleich kubernetes manual.

h2. k8s clusters

| Cluster         | Purpose/Setup     | Maintainer   | Master(s)                                                | argo                                                | rook | v4 http proxy | last verified |
| c0.k8s.ooo      | Dev               | -            | UNUSED                                                   |                                                     |      |               |    2021-10-05 |
| c1.k8s.ooo      | Dev p6 VM         | Nico         | 2a0a-e5c0-2-11-0-62ff-fe0b-1a3d.k8s-1.place6.ungleich.ch |                                                     |      |               |    2021-10-05 |
| c2.k8s.ooo      | Dev p7 HW         | Nico         | server47 server53 server54                               | "argo":https://argocd-server.argocd.svc.c2.k8s.ooo  | x    |               |    2021-10-05 |
| c3.k8s.ooo      | Test p7 PI        | -            | UNUSED                                                   |                                                     |      |               |    2021-10-05 |
| c4.k8s.ooo      | Dev2 p7 HW        | Fran/Jin-Guk | server52 server53 server54                               |                                                     |      |               |             - |
| c5.k8s.ooo      | Dev p6 VM Amal    | Nico/Amal    | 2a0a-e5c0-2-11-0-62ff-fe0b-1a46.k8s-1.place6.ungleich.ch |                                                     |      |               |               |
| c6.k8s.ooo      | Dev p6 VM Jin-Guk | Jin-Guk      |                                                          |                                                     |      |               |               |
| [[p5.k8s.ooo]]  | production        |              | server34 server36 server38                               | "argo":https://argocd-server.argocd.svc.p5.k8s.ooo  |      |             - |               |
| [[p6.k8s.ooo]]  | production        |              | server67 server69 server71                               | "argo":https://argocd-server.argocd.svc.p6.k8s.ooo  | x    | 147.78.194.13 |    2021-10-05 |
| [[p10.k8s.ooo]] | production        |              | server63 server65 server83                               | "argo":https://argocd-server.argocd.svc.p10.k8s.ooo | x    | 147.78.194.12 |    2021-10-05 |

h2. General architecture and components overview

* All k8s clusters are IPv6 only
* We use BGP peering to propagate podcidr and serviceCidr networks to our infrastructure
* The main public testing repository is "ungleich-k8s":https://code.ungleich.ch/ungleich-public/ungleich-k8s
** Private configurations are found in the **k8s-config** repository

h3. Cluster types

| **Type/Feature**            | **Development**                | **Production**         |
| Min No. nodes               | 3 (1 master, 3 worker)         | 5 (3 master, 3 worker) |
| Recommended minimum         | 4 (dedicated master, 3 worker) | 8 (3 master, 5 worker) |
| Separation of control plane | optional                       | recommended            |
| Persistent storage          | required                       | required               |
| Number of storage monitors  | 3                              | 5                      |

h2. General k8s operations

h3. Cheat sheet / great external references

* "kubectl cheatsheet":https://kubernetes.io/docs/reference/kubectl/cheatsheet/

h3. Allowing work to be scheduled on the control plane

* Mostly for single node / test / development clusters
* Just remove the master taint as follows

<pre>
kubectl taint nodes --all node-role.kubernetes.io/master-
</pre>

h3. Get the cluster admin.conf

* On the masters of each cluster you can find the file @/etc/kubernetes/admin.conf@
* To be able to administrate the cluster you can copy the admin.conf to your local machine
* Multi-cluster debugging becomes very easy if you name the config ~/cX-admin.conf (see the example below)

<pre>
% scp root@server47.place7.ungleich.ch:/etc/kubernetes/admin.conf ~/c2-admin.conf
% export KUBECONFIG=~/c2-admin.conf
% kubectl get nodes
NAME       STATUS                     ROLES                  AGE   VERSION
server47   Ready                      control-plane,master   82d   v1.22.0
server48   Ready                      control-plane,master   82d   v1.22.0
server49   Ready                      <none>                 82d   v1.22.0
server50   Ready                      <none>                 82d   v1.22.0
server59   Ready                      control-plane,master   82d   v1.22.0
server60   Ready,SchedulingDisabled   <none>                 82d   v1.22.0
server61   Ready                      <none>                 82d   v1.22.0
server62   Ready                      <none>                 82d   v1.22.0
</pre>

h3. Installing a new k8s cluster

* Decide on the cluster name (usually *cX.k8s.ooo*), with X counting upwards
** Use pXX.k8s.ooo for production clusters of placeXX
* Use cdist to configure the nodes with requirements like crio
* Decide between single or multi node control plane setups (see below)
** A single control plane is suitable for development clusters

Typical init procedure (a minimal kubeadm.yaml sketch is shown below):

* Single control plane: @kubeadm init --config bootstrap/XXX/kubeadm.yaml@
* Multi control plane (HA): @kubeadm init --config bootstrap/XXX/kubeadm.yaml --upload-certs@
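
A minimal sketch of what such a @bootstrap/XXX/kubeadm.yaml@ could look like; the endpoint, cluster name and CIDRs below are placeholders, not the values from the real bootstrap repository:

<pre>
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.22.0
clusterName: cX.k8s.ooo
controlPlaneEndpoint: "cX-api.k8s.ooo:6443"
networking:
  # IPv6 only cluster: both CIDRs are propagated via BGP (see the Calico section)
  podSubnet: 2001:db8:1::/56
  serviceSubnet: 2001:db8:2::/108
</pre>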

h3. Deleting a pod that is hanging in terminating state

<pre>
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>
</pre>

(from https://stackoverflow.com/questions/35453792/pods-stuck-in-terminating-status)

h3. Listing nodes of a cluster

<pre>
[15:05] bridge:~% kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
server22   Ready    <none>                 52d   v1.22.0
server23   Ready    <none>                 52d   v1.22.2
server24   Ready    <none>                 52d   v1.22.0
server25   Ready    <none>                 52d   v1.22.0
server26   Ready    <none>                 52d   v1.22.0
server27   Ready    <none>                 52d   v1.22.0
server63   Ready    control-plane,master   52d   v1.22.0
server64   Ready    <none>                 52d   v1.22.0
server65   Ready    control-plane,master   52d   v1.22.0
server66   Ready    <none>                 52d   v1.22.0
server83   Ready    control-plane,master   52d   v1.22.0
server84   Ready    <none>                 52d   v1.22.0
server85   Ready    <none>                 52d   v1.22.0
server86   Ready    <none>                 52d   v1.22.0
</pre>

h3. Removing / draining a node

Usually @kubectl drain server@ should do the job, but sometimes we need to be more aggressive:

<pre>
kubectl drain --delete-emptydir-data --ignore-daemonsets server23
</pre>

h3. Re-adding a node after draining

<pre>
kubectl uncordon serverXX
</pre>

h3. (Re-)joining worker nodes after creating the cluster

* We need to have an up-to-date token
* We use different join commands for the workers and control plane nodes

Generating the join command on an existing control plane node:

<pre>
kubeadm token create --print-join-command
</pre>
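
The printed @kubeadm join ...@ command is then run on the (re-)joining worker itself. A sketch, with the endpoint, token and hash as placeholders:

<pre>
# on the worker node
kubeadm join pX-api.k8s.ooo:6443 --token TOKEN --discovery-token-ca-cert-hash sha256:HASH
</pre>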

h3. (Re-)joining control plane nodes after creating the cluster

* We generate the token again
* We upload the certificates
* We need to combine/create the join command for the control plane node

Example session:

<pre>
% kubeadm token create --print-join-command
kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash

% kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
CERTKEY

# Then we use these two outputs on the joining node:

kubeadm join p10-api.k8s.ooo:6443 --token xmff4i.ABC --discovery-token-ca-cert-hash sha256:longhash --control-plane --certificate-key CERTKEY
</pre>

Commands to be used on a control plane node:

<pre>
kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs
</pre>

Commands to be used on the joining node:

<pre>
JOINCOMMAND --control-plane --certificate-key CERTKEY
</pre>

SEE ALSO

* https://stackoverflow.com/questions/63936268/how-to-generate-kubeadm-token-for-secondary-control-plane-nodes
* https://blog.scottlowe.org/2019/08/15/reconstructing-the-join-command-for-kubeadm/

h3. How to fix etcd not starting when rejoining a kubernetes cluster as a control plane

If during the above step etcd does not come up, @kubeadm join@ can hang as follows:

<pre>
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
</pre>

In that case the node is most likely still registered as a member of the etcd cluster. We first need to remove it from the etcd cluster, after which the join works.

To fix this we do:

* Find a working etcd pod
* Find the etcd members / member list
* Remove the etcd member that we want to re-join the cluster

<pre>
# Find the etcd pods
kubectl -n kube-system get pods -l component=etcd,tier=control-plane

# Get the list of etcd servers with the member id
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list

# Remove the member
kubectl exec -n kube-system -ti ETCDPODNAME -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove MEMBERID
</pre>

Sample session:

<pre>
[10:48] line:~% kubectl -n kube-system get pods -l component=etcd,tier=control-plane
NAME            READY   STATUS    RESTARTS     AGE
etcd-server63   1/1     Running   0            3m11s
etcd-server65   1/1     Running   3            7d2h
etcd-server83   1/1     Running   8 (6d ago)   7d2h
[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
356891cd676df6e4, started, server65, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:375c]:2379, false
371b8a07185dee7e, started, server63, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2380, https://[2a0a:e5c0:10:1:225:b3ff:fe20:378a]:2379, false
5942bc58307f8af9, started, server83, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2380, https://[2a0a:e5c0:10:1:3e4a:92ff:fe79:bb98]:2379, false

[10:48] line:~% kubectl exec -n kube-system -ti etcd-server65 -- etcdctl --endpoints '[::1]:2379' --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member remove 371b8a07185dee7e
Member 371b8a07185dee7e removed from cluster e3c0805f592a8f77

</pre>

SEE ALSO

* We found the solution using https://stackoverflow.com/questions/67921552/re-installed-node-cannot-join-kubernetes-cluster

h2. Calico CNI

h3. Calico Installation

* We install "calico using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
* This has the following advantages:
** Easy to upgrade
** Does not require the OS to configure IPv6/dual-stack settings, as the tigera operator figures this out on its own

Usually plain calico can be installed directly using:

<pre>
helm repo add projectcalico https://docs.projectcalico.org/charts
helm install calico projectcalico/tigera-operator --version v3.20.2
</pre>

* Check the tags on https://github.com/projectcalico/calico/tags for the latest release

h3. Installing calicoctl

To be able to manage and configure calico, we need to "install calicoctl (we choose to run it as a pod)":https://docs.projectcalico.org/getting-started/clis/calicoctl/install#install-calicoctl-as-a-kubernetes-pod

<pre>
kubectl apply -f https://docs.projectcalico.org/manifests/calicoctl.yaml
</pre>

And make it more easily accessible via an alias:

<pre>
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
</pre>

h3. Calico configuration

By default our k8s clusters "BGP peer":https://docs.projectcalico.org/networking/bgp
with an upstream router to propagate podcidr and servicecidr.

Default settings in our infrastructure:

* We use a full mesh using the @nodeToNodeMeshEnabled: true@ option
* We keep the original next hop so that *only* the server with the pod is announcing it (instead of ECMP)
* We use private ASNs for k8s clusters
* We do *not* use any overlay

After installing calico and calicoctl the last step of the installation is usually:

<pre>
calicoctl create -f - < calico-bgp.yaml
</pre>

A sample BGP configuration:

<pre>
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  asNumber: 65534
  serviceClusterIPs:
  - cidr: 2a0a:e5c0:10:3::/108
  serviceExternalIPs:
  - cidr: 2a0a:e5c0:10:3::/108
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: router1-place10
spec:
  peerIP: 2a0a:e5c0:10:1::50
  asNumber: 213081
  keepOriginalNextHop: true
</pre>

h2. ArgoCD / ArgoWorkFlow

h3. Argocd Installation

As there is no configuration management present yet, argocd is installed using:

<pre>
kubectl create namespace argocd

# Version 2.2.3
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.2.3/manifests/install.yaml

# OR: latest stable
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
</pre>

* See https://argo-cd.readthedocs.io/en/stable/

h3. Get the argocd credentials

<pre>
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo ""
</pre>

h3. Accessing argocd

In regular IPv6 clusters:

* Navigate to https://argocd-server.argocd.CLUSTERDOMAIN

In legacy IPv4 clusters:

<pre>
kubectl --namespace argocd port-forward svc/argocd-server 8080:80
</pre>

* Navigate to https://localhost:8080

h3. Using the argocd webhook to trigger changes

* To trigger changes, POST JSON to https://argocd.example.com/api/webhook (see the sketch below)
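
A rough sketch of such a trigger; the payload format depends on the git provider, and the example below assumes a GitHub/Gitea-style push event with placeholder values:

<pre>
curl -X POST https://argocd.example.com/api/webhook \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: push" \
  -d '{"ref": "refs/heads/master", "repository": {"html_url": "https://code.ungleich.ch/ungleich-intern/k8s-config", "default_branch": "master"}}'
</pre>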

h3. Deploying an application

* Applications are deployed via git towards gitea (code.ungleich.ch) and then pulled by argo
* Always include the *redmine-url* pointing to the (customer) ticket
** Also add the support-url if it exists

Application sample:

<pre>
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gitea-CUSTOMER
  namespace: argocd
spec:
  destination:
    namespace: default
    server: 'https://kubernetes.default.svc'
  source:
    path: apps/prod/gitea
    repoURL: 'https://code.ungleich.ch/ungleich-intern/k8s-config.git'
    targetRevision: HEAD
    helm:
      parameters:
        - name: storage.data.storageClass
          value: rook-ceph-block-hdd
        - name: storage.data.size
          value: 200Gi
        - name: storage.db.storageClass
          value: rook-ceph-block-ssd
        - name: storage.db.size
          value: 10Gi
        - name: storage.letsencrypt.storageClass
          value: rook-ceph-block-hdd
        - name: storage.letsencrypt.size
          value: 50Mi
        - name: letsencryptStaging
          value: 'no'
        - name: fqdn
          value: 'code.verua.online'
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
  info:
    - name: 'redmine-url'
      value: 'https://redmine.ungleich.ch/issues/ISSUEID'
    - name: 'support-url'
      value: 'https://support.ungleich.ch/Ticket/Display.html?id=TICKETID'
</pre>

h2. Helm related operations and conventions

We use helm charts extensively.

* In production, they are managed via argocd
* In development, helm charts can be developed and deployed manually using the helm utility.

h3. Installing a helm chart

One can use the usual pattern of

<pre>
helm install <releasename> <chartdirectory>
</pre>

However, when testing helm charts you often want to reinstall/update them. The following pattern is "better", because it also works if the release is already installed:

<pre>
helm upgrade --install <releasename> <chartdirectory>
</pre>

h3. Naming services and deployments in helm charts [Application labels]

* We always have {{ .Release.Name }} to identify the current "instance"
* Deployments:
** use @app: <what it is>@, f.i. @app: nginx@, @app: postgres@, ... (see the sketch below)
* See more about standard labels on
** https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
** https://helm.sh/docs/chart_best_practices/labels/
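
A minimal sketch of this labelling convention in a chart template; the names and image below are made up for illustration and not taken from our real charts:

<pre>
# templates/deployment.yaml (illustrative only)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
</pre>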

h2. Rook / Ceph Related Operations

h3. Executing ceph commands

Using the ceph-tools pod as follows:

<pre>
kubectl exec -n rook-ceph -ti $(kubectl -n rook-ceph get pods -l app=rook-ceph-tools -o jsonpath='{.items[*].metadata.name}') -- ceph -s
</pre>

h3. Inspecting the logs of a specific server

<pre>
# Get the related pods
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
...

# Inspect the logs of a specific pod
kubectl -n rook-ceph logs -f rook-ceph-osd-prepare-server23--1-444qx

</pre>

h3. Inspecting the logs of the rook-ceph-operator

<pre>
kubectl -n rook-ceph logs -f -l app=rook-ceph-operator
</pre>

h3. Triggering server prepare / adding new osds

The rook-ceph-operator triggers/watches/creates pods to maintain hosts. To trigger a full "re-scan", simply delete that pod:

<pre>
kubectl -n rook-ceph delete pods -l app=rook-ceph-operator
</pre>

This will cause all the @rook-ceph-osd-prepare-..@ jobs to be recreated and thus OSDs to be created, if new disks have been added.

h3. Removing an OSD

* See "Ceph OSD Management":https://rook.io/docs/rook/v1.7/ceph-osd-mgmt.html (a sketch of the flow follows below)
* More specifically: https://github.com/rook/rook/blob/release-1.7/cluster/examples/kubernetes/ceph/osd-purge.yaml
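
A rough sketch of the flow described in those links, assuming the upstream @osd-purge.yaml@ is used; review it before applying:

<pre>
# 1. download osd-purge.yaml from the link above and set the OSD id(s)
#    to remove in its --osd-ids argument
# 2. create the purge job
kubectl -n rook-ceph create -f osd-purge.yaml
# 3. follow its log and verify with "ceph osd tree" in the ceph-tools pod
</pre>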

h2. Harbor

* We use "Harbor":https://goharbor.io/ for caching and as an image registry. Internal app reference: apps/prod/harbor.
* The admin password is in the password store, auto-generated per cluster
* At the moment harbor only authenticates against the internal ldap tree

h3. LDAP configuration

* The url needs to be ldaps://...
* uid = uid
* the rest is standard

h2. Monitoring / Prometheus

* Via "kube-prometheus":https://github.com/prometheus-operator/kube-prometheus/

Access via the following URLs (or via port-forward, see the sketch below):

* http://prometheus-k8s.monitoring.svc:9090
* http://grafana.monitoring.svc:3000
* http://alertmanager.monitoring.svc:9093
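
If the service IPs above are not directly reachable from your workstation, a port-forward is an alternative. A sketch, assuming the standard kube-prometheus service names in the @monitoring@ namespace:

<pre>
kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090
kubectl -n monitoring port-forward svc/grafana 3000:3000
</pre>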

h2. Nextcloud

h3. How to get the nextcloud credentials

* The initial username is set to "nextcloud"
* The password is autogenerated and saved in a kubernetes secret

<pre>
kubectl get secret RELEASENAME-nextcloud -o jsonpath="{.data.PASSWORD}" | base64 -d; echo ""
</pre>

h3. How to fix "Access through untrusted domain"

* Nextcloud stores the initial domain configuration
* If the FQDN is changed, it will show the error message "Access through untrusted domain"
* To fix, edit /var/www/html/config/config.php and correct the domain (see the sketch below)
* Then delete the pods
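
A sketch of those two steps; the pod name and label below are placeholders/assumptions, check them with @kubectl get pods --show-labels@:

<pre>
# fix the trusted_domains entry in the config
kubectl exec -ti NEXTCLOUDPOD -- vi /var/www/html/config/config.php
#   'trusted_domains' =>
#     array (
#       0 => 'new.fqdn.example.org',
#     ),

# then recreate the pods so all of them pick up the corrected config
kubectl delete pods -l app.kubernetes.io/name=nextcloud
</pre>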

h2. Infrastructure versions

h3. ungleich kubernetes infrastructure v5 (2021-10)

Clusters are configured / setup in this order:

* Bootstrap via kubeadm
* "Networking via calico + BGP (non ECMP) using helm":https://docs.projectcalico.org/getting-started/kubernetes/helm
* "ArgoCD for CD":https://argo-cd.readthedocs.io/en/stable/
** "rook for storage via argocd":https://rook.io/
** haproxy as the IPv4-to-IPv6 proxy into the IPv6-only clusters, deployed via argocd
** "kubernetes-secret-generator for in cluster secrets":https://github.com/mittwald/kubernetes-secret-generator
** "ungleich-certbot managing certs and nginx":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot

h3. ungleich kubernetes infrastructure v4 (2021-09)

* rook is configured via manifests instead of using the rook-ceph-cluster helm chart
* The rook operator is still being installed via helm

h3. ungleich kubernetes infrastructure v3 (2021-07)

* rook is now installed via helm via argocd instead of directly via manifests

h3. ungleich kubernetes infrastructure v2 (2021-05)

* Replaced fluxv2 from ungleich k8s v1 with argocd
** argocd can apply helm templates directly without needing to go through Chart releases
* We are also using argoflow for build flows
* Planned to add "kaniko":https://github.com/GoogleContainerTools/kaniko for image building

h3. ungleich kubernetes infrastructure v1 (2021-01)

We are using the following components:

* "Calico as a CNI":https://www.projectcalico.org/ with BGP, IPv6 only, no encapsulation
** Needed for basic networking
* "kubernetes-secret-generator":https://github.com/mittwald/kubernetes-secret-generator for creating secrets
** Needed so that secrets are not stored in the git repository, but only in the cluster
* "ungleich-certbot":https://hub.docker.com/repository/docker/ungleich/ungleich-certbot
** Needed to get letsencrypt certificates for services
* "rook with ceph rbd + cephfs":https://rook.io/ for storage
** rbd for almost everything, *ReadWriteOnce*
** cephfs for smaller things, multi access *ReadWriteMany*
** Needed for providing persistent storage
* "flux v2":https://fluxcd.io/
** Needed to manage resources automatically