Task #8888
Updated by Timothée Floure almost 4 years ago
Hello there,

We have some performance issues and infrastructure rot on our matrix deployments: I'll work on it here and there over the next few weeks. This meta-issue will make it easier to follow what's going on. I might add things on the fly as I encounter them, or link to other issues later on.

* [ ] Cleaning up and upstreaming the __matrix-synapse cdist type. #7345
  - [x] Clean-up.
  - [x] Bring configuration template up-to-date.
  - [x] Add more performance-related flags.
  - [x] Add support for multi-workers (a new __matrix_synapse_worker type might be needed)
  - [ ] Upstream to cdist-contrib. See https://code.ungleich.ch/ungleich-public/cdist-contrib/-/merge_requests/9
* [ ] Cleanup and simplify the __ungleich_matrix type
  - [ ] Allow PGSQL tuning / auto-tune from explorer if not provided.
  - [x] Adapt to updated __matrix_synapse type
* [ ] Revamp matrix monitoring: we need something simpler and more robust.
  - [x] [ ] Get back missing instances in monitoring.
  - [ ] Add alerts. Send alerts on high message latency.
  - [x] [ ] Add PGSQL performance monitoring.
* [x] Update admin UI
* [ ] Investigate performance issues.
  - [ ] Checking out database bottlenecks.
  - [ ] Checking out synapse bottlenecks.
  - [ ] Possibly add periodic database cleanup.
* [ ] Check out the state of the Jitsi integration.
  - [x] Rebuilt with CDIST (small issue with watermark, see https://code.ungleich.ch/ungleich-public/cdist-contrib/-/issues/4)
  - [x] [ ] Wire Prometheus to the new Jitsi Exporter
  - [ ] Add simple blackbox monitoring
* [ ] Check state of ext.ungleich.ch homeserver
* [ ] LOW_PRIO check out if it is useful to deploy our own integration server
* [ ] Don't forget to document!