A note on recent outages and instability on CommCare HQ
Incident Report for CommCare HQ
Resolved
Dear users, as you may have noticed from our status updates over the last month, CommCare HQ has experienced recurring outages and performance degradation. These outages are related to a steep increase in load on our infrastructure in recent months. Total requests to CommCare HQ have tripled since July, and requests to our Web Apps services specifically have increased nearly fivefold over the same period.

We regularly handle changes in request volume without incident. However, over the past month, the rapid scale-up has exposed bottlenecks in our infrastructure, which have caused instability and outages while we work through them. To mitigate these bottlenecks, we have taken several steps to help the core services perform effectively during peak usage, including tuning our web servers and optimizing memory usage on our databases.

We are also continuing to tune our infrastructure as the load on the system rises. As a result, the frequency of outages has declined over the last week. We are closely monitoring the platform, and the changes we’re making will give users better performance on CommCare HQ going forward.

We sincerely appreciate your patience and understanding during these outages. We remain committed to making the platform experience better for all our users as we continue to keep up with the demand and prepare ourselves for the future.

Sincerely,
– The CommCare HQ Support Team
Posted Jan 21, 2022 - 18:00 UTC