ungleich redmine: Issues (http://localhost:3000/, 2021-02-11T15:50:32Z)
Open Infrastructure - Task #8888 (Closed): Meta-Issue for Matrix late-winter 2021 cleanup (http://localhost:3000/issues/8888, 2021-02-11T15:50:32Z, Timothée Floure)
<p>Hello there,</p>
<p>We have some performance issues and infrastructure rot on our Matrix deployments. I'll work on them here and there over the next few weeks. This meta-issue will make it easier to follow what's going on. I might add things on the fly as I encounter them, or link to other issues later on.</p>
<ul>
<li>[x] Cleaning up and upstreaming the __matrix-synapse cdist type. <a class="issue tracker-5 status-5 priority-1 priority-lowest closed" title="Task: Cleanup & upstream matrix-related types (Closed)" href="http://localhost:3000/issues/7345">#7345</a><br /> - [x] Clean-up.<br /> - [x] Bring configuration template up-to-date.<br /> - [x] Add more performance-related flags.<br /> - [x] Add support for multi-workers (a new __matrix_synapse_worker type might be needed)<br /> - [x] Upstream to cdist-contrib See <a class="external" href="https://code.ungleich.ch/ungleich-public/cdist-contrib/-/merge_requests/9">https://code.ungleich.ch/ungleich-public/cdist-contrib/-/merge_requests/9</a></li>
<li>[ ] Cleanup and simplify the __ungleich_matrix type<br /> - [ ] Allow PGSQL tuning / auto-tune from explorer if not provided.<br /> - [x] Adapt to updated __matrix_synapse type</li>
<li>[ ] Revamp matrix monitoring: we need something simpler and more robust.<br /> - [x] Get back missing instances in monitoring.<br /> - [ ] Add alerts.<br /> - [x] Add PGSQL performance monitoring.</li>
<li>[x] Update admin UI</li>
<li>[ ] Investigate performance issues.<br /> - [~] Checking out database bottlenecks.<br /> - [~] Checking out synapse bottlenecks.<br /> - [ ] Possibly add periodic database cleanup.</li>
<li>[ ] Check out the state of the Jitsi integration.<br /> - [x] Rebuilt with CDIST (small issue with watermark - see <a class="external" href="https://code.ungleich.ch/ungleich-public/cdist-contrib/-/issues/4">https://code.ungleich.ch/ungleich-public/cdist-contrib/-/issues/4</a>)<br /> - [x] Wire Prometheus to the new Jitsi Exporter<br /> - [ ] Add simple blackbox monitoring</li>
<li>[x] Check state of ext.ungleich.ch homeserver</li>
<li>[ ] LOW_PRIO check out if it is useful to deploy our own integration server</li>
<li>[ ] Don't forget to document!</li>
</ul>
Open Infrastructure - Task #8887 (Closed): Update synapse-admin (http://localhost:3000/issues/8887, 2021-02-11T15:14:53Z, Timothée Floure)
<p>The synapse-admin instance running at admin.matrix.ungleich.cloud is outdated; we need to update it.</p>
Open Infrastructure - Task #8877 (Rejected): Checkout ext.ungleich.ch matrix instance on server1.... (http://localhost:3000/issues/8877, 2021-02-10T08:38:57Z, Timothée Floure)
<blockquote>
<p>we used to have an external/smaller/simpler matrix server.<br />I have asked Timothee to have a look at this to revive it on</p>
</blockquote>
<p>server1.place4 as ext.ungleich.ch again.</p>
Open Infrastructure - Task #8852 (Closed): Investigate matrix.ungleich.ch slowness (http://localhost:3000/issues/8852, 2021-02-05T08:32:39Z, Timothée Floure)
<p>matrix.ungleich.ch is so slow that it has become unusable. I'm currently investigating and trying the following:</p>
<ul>
<li>Disable Presence</li>
<li>Increase cache sizes</li>
<li>Configure Synapse with multiple workers</li>
</ul>
Open Infrastructure - Task #8201 (Rejected): Setup our own NTP pool (http://localhost:3000/issues/8201, 2020-06-23T17:47:53Z, Timothée Floure)
<p>Likely on black1..3.</p>
Open Infrastructure - Task #8111 (Rejected): Monitor unbound nodes (http://localhost:3000/issues/8111, 2020-06-03T08:06:48Z, Timothée Floure)
<p>There's a prometheus exporter for unbound: <a class="external" href="https://github.com/wish/unbound_exporter">https://github.com/wish/unbound_exporter</a></p>
<p>TODO: deploy it against service-monitoring, sexy grafana graph + alerts.</p>
Open Infrastructure - Task #8110 (Closed): Investigate unbound{1,2}.place6.ungleich.ch crashes (http://localhost:3000/issues/8110, 2020-06-03T08:02:11Z, Timothée Floure)
<p>I increased log verbosity on unbound1.p6, and will try to see if there's anything amiss.</p>
Open Infrastructure - Task #8091 (Closed): Alpine-based Opennebula workers (http://localhost:3000/issues/8091, 2020-05-30T08:57:26Z, Timothée Floure)
<p>Plan: move our ONE workers from devuan to alpine.</p>
<ul>
<li>I managed to get an alpine node to join my test ONE cluster.
<ul>
<li>Now waiting for llnu to set me up a pet CEPH cluster.</li>
</ul>
</li>
<li>TODO: package/cdistify/upstream alpine node configuration.
<ul>
<li>Related (where are ONE package definitions tracked?): <a class="external" href="https://github.com/OpenNebula/one/issues/4844">https://github.com/OpenNebula/one/issues/4844</a></li>
</ul></li>
</ul>
Open Infrastructure - Task #8069 (Closed): Investigate potential bottleneck on storage/CEPH at DCL (http://localhost:3000/issues/8069, 2020-05-27T10:54:55Z, Timothée Floure)
Open Infrastructure - Task #7953 (Rejected): Normalize user management (http://localhost:3000/issues/7953, 2020-05-02T11:23:46Z, Timothée Floure)
<blockquote>
<p>Timothée: Can you create a ticket in open infra for above suggestions? (i.e. single point of registration, allow user to choose username, add filter to disallow common usernames (info, admin, ...) and sharing the design with other web projects)</p>
</blockquote>
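<p>The username filter suggested in the quote above could look roughly like the sketch below. It is only an illustration: the deny-list beyond "info" and "admin", the format rule, and the function name <code>validate_username</code> are all assumptions, not an existing ungleich implementation.</p>

```python
import re

# Hypothetical deny-list: the ticket only names "info" and "admin".
RESERVED = {"info", "admin"}

# Assumed format rule (not from the ticket): 3 to 32 characters, starting
# with a lowercase letter, then lowercase letters, digits, '-' or '_'.
USERNAME_RE = re.compile(r"[a-z][a-z0-9_-]{2,31}")


def validate_username(name):
    """Return a list of problems; an empty list means the name is accepted."""
    problems = []
    normalized = name.strip().lower()
    if not USERNAME_RE.fullmatch(normalized):
        problems.append("invalid format")
    if normalized in RESERVED:
        problems.append("reserved name")
    return problems


print(validate_username("Admin"))
print(validate_username("fnux"))
```

<p>Keeping the check in one shared function is what would let every registration flow apply the same rules.</p>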
<p>Customers can register on DCL/IPv6OnlyHosting OR on account.ungleich.ch: it's confusing AND both services have their own implementation of the registration flow. We should centralize everything against a single service / codebase and implement more password and username checks.</p>
Open Infrastructure - Task #7930 (Rejected): Monitoring LAN in place6 (http://localhost:3000/issues/7930, 2020-04-21T14:00:52Z, Timothée Floure)
<a name="Fnux"></a>
<h2 >Fnux<a href="#Fnux" class="wiki-anchor">¶</a></h2>
<blockquote>
<p>Monitoring services is a pain at the moment: either I have to configure prometheus by hand to monitor a service, or I have to make a hole for the node's specific IP so that it joins the consul cluster.<br />Could we have some kind of internal "monitoring LAN" that we attach to the VM in ONE? This subnet could be wired to be able to access the consul cluster.</p>
</blockquote>
<a name="Nico"></a>
<h2 >Nico<a href="#Nico" class="wiki-anchor">¶</a></h2>
<blockquote>
<p>ok. Proceed as follows:<br />delegate a new /64 from 2a0a:e5c0:2::/48 in netbox<br />Create an opennebula network for it, cluster = place6, ciara (all clusters that are in place6)<br />Don't configure a gateway - we keep this as an add-on network<br />Reconfigure the firewall to allow accessing consul from this network<br />(all in a redmine ticket, cc llnu kjg)</p>
</blockquote>
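<p>As a rough aid for the first step (delegating a new /64 from 2a0a:e5c0:2::/48), the sketch below picks the first /64 that does not overlap any existing allocation. The list of already-used prefixes is made up for illustration; the real allocations live in netbox, and the helper name <code>first_free_64</code> is hypothetical.</p>

```python
import ipaddress

# The /48 named in the ticket.
PARENT = ipaddress.ip_network("2a0a:e5c0:2::/48")


def first_free_64(parent, allocated):
    """Return the first /64 inside `parent` that overlaps none of `allocated`."""
    taken = [ipaddress.ip_network(a) for a in allocated]
    for candidate in parent.subnets(new_prefix=64):
        if not any(candidate.overlaps(t) for t in taken):
            return candidate
    raise ValueError("no free /64 left in %s" % parent)


# Hypothetical existing allocations, for illustration only.
already_used = ["2a0a:e5c0:2::/64", "2a0a:e5c0:2:1::/64"]
monitoring_net = first_free_64(PARENT, already_used)
print(monitoring_net)  # 2a0a:e5c0:2:2::/64
```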
<a name="2020-05-29-vxlan"></a>
<h2 >2020-05-29, vxlan<a href="#2020-05-29-vxlan" class="wiki-anchor">¶</a></h2>
<ul>
<li>We create a vxlan device</li>
<li>We create a bridge containing the vxlan device</li>
<li>And we are happy</li>
</ul>
Open Infrastructure - Task #7768 (Closed): Add monitoring to Matrix-as-a-Service (http://localhost:3000/issues/7768, 2020-02-24T08:27:57Z, Timothée Floure)
Open Infrastructure - Task #7580 (Closed): Preparing for matrix-as-a-service (http://localhost:3000/issues/7580, 2020-01-07T12:53:44Z, Timothée Floure)
<p>Once matrix is deployed at ungleich:</p>
<ul>
<li>Build & document MaaS deployment and maintenance pipeline.<br /> - Wiki page.<br /> - A staging environment will be required to test upgrades.</li>
<li>1 or 2 blog entries about it? First one maybe a bit more as introduction, why we want to support matrix and second one more about the technical details? (quoting Nico here)</li>
<li>Be mentioned in "This Week In Matrix" (Weekly matrix news) and on <a class="external" href="https://matrix.org/docs/projects/hosting/">https://matrix.org/docs/projects/hosting/</a><br /> - We should emphasize the decent/green (hydro/old building/second-hand servers/...) factor, as I expect it will interest some (sub-)communities.</li>
<li>Upstream `__matrix_*` cdist types.</li>
<li>Investigate the application services we could offer.</li>
</ul>
<p>Feel free to put this task in another project if it doesn't fit here.</p>
Open Infrastructure - Task #7545 (Closed): Switch production LDAPs to cdist-managed alpine (http://localhost:3000/issues/7545, 2019-12-31T15:20:43Z, Timothée Floure)
<p>Our production LDAP nodes do not seem to be managed by cdist (anymore?):</p>
<ul>
<li>No relevant mention in `grep -R __ungleich_ldap dot-cdist/` or `grep -R ldap1 dot-cdist/`</li>
<li>Deployed configuration does not exactly match the `__ungleich_ldap` type.</li>
</ul>
<p>=> Investigate and update dot-cdist to handle production ldap{1,2}.ungleich.ch</p>
Open Infrastructure - Task #7544 (Rejected): Write "beginner's guide" for datacenterlight customers (http://localhost:3000/issues/7544, 2019-12-31T07:36:38Z, Timothée Floure)
<p>Such a guide should cover:</p>
<ul>
<li>What is a VM? How do I choose CPU/Memory/Storage?</li>
<li>How do I choose a GNU/Linux or *BSD distribution?</li>
<li>How do I connect to my VM?<br /> - GNU/Linux, *BSD<br /> - macOS<br /> - Windows</li>
<li>Managing my VM:<br /> - What can I do with my shell?<br /> - Graphical management helpers (= cockpit project, ...).<br /> - Good practices: automatic updates, firewall, permissions, versioning configuration, ...<br /> - Example deploying simple web server?</li>
</ul>