Task #8888

Updated by Timothée Floure almost 4 years ago

Hello there, 

 We have some performance issues and infrastructure rot on our Matrix deployments; I'll work on them here and there over the next few weeks. This meta-issue should make it easier to follow what's going on. I might add items on the fly as I encounter them, or link to other issues later on. 

 * [ ] Clean up and upstream the __matrix_synapse cdist type. #7345 
   - [x] Clean-up. 
   - [x] Bring configuration template up-to-date. 
   - [x] Add more performance-related flags. 
   - [x] Add support for multi-workers (a new __matrix_synapse_worker type might be needed) 
   - [ ] Upstream to cdist-contrib. See https://code.ungleich.ch/ungleich-public/cdist-contrib/-/merge_requests/9 
 * [ ] Clean up and simplify the __ungleich_matrix type 
   - [ ] Allow PGSQL tuning / auto-tune from explorer if not provided. 
   - [x] Adapt to updated __matrix_synapse type 
 * [ ] Revamp matrix monitoring: we need something simpler and more robust. 
   - [x] [ ] Get back missing instances in monitoring. 
   - [ ] Add alerts. Send alerts on high message latency. 
   - [x] [ ] Add PGSQL performance monitoring. 
 * [x] Update admin UI 
 * [ ] Investigate performance issues. 
   - [ ] Checking out database bottlenecks. 
   - [ ] Checking out synapse bottlenecks. 
   - [ ] Possibly add periodic database cleanup. 
 * [ ] Check out the state of the Jitsi integration. 
   - [x] Rebuilt with CDIST (small issue with watermark - see https://code.ungleich.ch/ungleich-public/cdist-contrib/-/issues/4) 
   - [x] [ ] Wire Prometheus to the new Jitsi Exporter 
   - [ ] Add simple blackbox monitoring 
 * [ ] Check state of ext.ungleich.ch homeserver 
 * [ ] LOW_PRIO: check whether it is useful to deploy our own integration server 
 * [ ] Don't forget to document!
