I spent most of my morning playing around with this synapse issue. I moved some processing out of the main process and increased the cache a bit, but there's still high latency on registering new events. An in-depth investigation is required, including a deeper look at what's going on in synapse's internals and potential slowness with PGSQL.
I'd also like to:
- Revamp the whole monitoring pipeline, which is utterly broken right now: autodiscovery with consul and a monitoring LAN sounds nice in theory, but involves too many moving parts in practice. I want to drop it in favor of good old cdist-managed static configuration files. We also need some alerts (instead of relying on ranting customers...).
- Clean up the synapse type (written in my early cdist days ~ 1 year ago...) and merge it into cdist-contrib.
- Add the multi-worker setup to cdist. Adding more cores doesn't help as long as we run a single synapse process...
- Make our internal cdist manifest more concise; it's somewhat messy right now.
This means a few work-days. Your thoughts @Nico Schottelius?