We identified the cause of the failure and isolated it; there have been no further freezes in the last 4 days. Additional mitigations will be rolled out in the coming days to make us more resilient to this kind of issue.
Posted over 1 year ago. Jan 11, 2018 - 06:31 EST
The issue just showed up again. We are going to recreate one of the servers. You might experience trouble connecting to your projects for a few minutes.
Sorry again for the inconvenience!
Posted over 1 year ago. Dec 28, 2017 - 03:47 EST
We added various monitoring metrics that will help us identify this issue more quickly in the future and fix it more easily.
Posted over 1 year ago. Dec 25, 2017 - 18:06 EST
A component of our infrastructure is causing very high load on our servers, which eventually causes one of them to hang. At that point we have to terminate it, but a few hours later the problem reappears on a different server. The likely origin is a rogue project that triggers an edge case in our services.