Multi Select on Case Search is causing Errors for users
Incident Report for CommCare HQ
Postmortem

Overview

On 11/17/22, a standard deploy was released at 2:00 am ET. This deploy included code changes for a new validation that checks the maximum number of cases a user can select on a multi-select case list screen. A subset of BHA users first experienced an issue at 8:42 am ET, when they were presented with an error message while attempting to choose a case from a multi-select case list. By 11:46 am ET, our incident response team had resolved the issue, and Web Apps was fully functional.

Summary of the incident

On 11/17/2022, a standard code deploy went out to CommCare’s production environment. It contained code changes that prevented some users from choosing a case within a multi-select case list.

At 10:02 am ET, a ticket was opened by the Dimagi BHA project team alerting support to several errors in the error log.

At 10:42 am ET, a Priority 1 incident was declared and an emergency response team was assembled to address the issue. Based on the timing of the deploy and the fact that the deploy included a code change related to multi-select, the team had a strong hypothesis that the deploy had introduced the issue.

At 10:52 am ET, the developer who had created the change confirmed the origin of the issue and that the code needed to be removed. The response team then began work to revert the change that introduced the issue and re-deploy the system.

At 11:34 am ET, the reversion of the problematic change was deployed. Our error monitoring tool confirmed that the errors had stopped, and we asked users to confirm functionality.

By 11:46 am ET, the BHA delivery team had heard back from the users and were able to confirm that they were no longer experiencing the issue. 

Our Next Steps

The errors the users saw were due to functionality that constrains the total number of cases a user can select from a multi-select case list. Thursday morning's deploy introduced changes that broke case selection for some users who had previously used multi-select case lists.

We have protection against these types of issues (changes to the serialization schema) when updating model definitions, but in this case it was a single attribute's value that changed, which was not caught.
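As an illustration only, the sketch below shows how a changed attribute value can break the reading of data stored before a deploy, and one way a reader can stay tolerant of older values. It is not CommCare code: the `Selection` record, the `kind` attribute, its values, and both loader functions are hypothetical names chosen for the example, and the actual attribute involved in this incident is not described here.

```python
import json
from dataclasses import dataclass

# Hypothetical example: all names and values here are assumptions for illustration.
# Suppose an older release wrote "multi_select" and the new release writes "multi-select".
LEGACY_KIND_ALIASES = {"multi_select": "multi-select"}
KNOWN_KINDS = {"single-select", "multi-select"}


@dataclass
class Selection:
    kind: str
    case_ids: list


def load_strict(payload: str) -> Selection:
    """Accept only the values that the current code writes."""
    data = json.loads(payload)
    if data["kind"] not in KNOWN_KINDS:
        raise ValueError(f"unknown kind: {data['kind']!r}")
    return Selection(kind=data["kind"], case_ids=list(data["case_ids"]))


def load_tolerant(payload: str) -> Selection:
    """Map values written by older releases to current ones before validating."""
    data = json.loads(payload)
    kind = LEGACY_KIND_ALIASES.get(data["kind"], data["kind"])
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return Selection(kind=kind, case_ids=list(data["case_ids"]))


# A payload that was stored before the (hypothetical) value change.
stored_before_deploy = json.dumps({"kind": "multi_select", "case_ids": ["a", "b"]})
```

In this sketch, `load_strict(stored_before_deploy)` raises a `ValueError` even though the field names and types are unchanged, which is why a check on the schema alone would not flag the change. `load_tolerant` accepts the same payload by translating the older value first.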

Going forward, the engineering team will incorporate awareness of this type of risk into code review. The engineering team is also exploring a method to reduce the number of errors reported so that this kind of issue will no longer block users.

We understand that our users expect a positive user experience, and we are sorry for the inconvenience this has caused. Thank you for your patience and support. Please reach out to support@dimagi.com if you have further questions about the incident.

Posted Nov 21, 2022 - 19:07 UTC

Resolved
Dear Users, the response team has confirmed that we are no longer experiencing the errors with Multi Select and the fix has resolved the issue. We thank you for your patience during this event which affected a subset of our users. We again apologize for any inconvenience.
Posted Nov 17, 2022 - 16:51 UTC
Monitoring
Dear users, Multi Select on Case Search is causing errors for a subset of users in CommCare Web Apps. The response team has applied a fix to resolve the issue and users should no longer be experiencing errors. We are monitoring for errors. We apologize for the inconvenience.
Posted Nov 17, 2022 - 16:37 UTC
Update
Dear users, Multi Select on Case Search is causing errors for a subset of users in CommCare Web Apps. The response team is applying a fix to resolve the issue, which may cause ~1 minute of disruption. We apologize for the inconvenience.
Posted Nov 17, 2022 - 16:28 UTC
Identified
Dear Users, this issue is only affecting Multi Select for Case Search for a subset of users in CommCare Web Apps. The response team has determined a fix and is working on releasing it. We will update the status after the fix has been applied. We apologize for the inconvenience.
Posted Nov 17, 2022 - 16:19 UTC
Investigating
Dear Users, Multi Select on Case Search is causing errors for a subset of users in CommCare Web Apps. We have a response team investigating the issue with the highest priority. We apologize for the inconvenience.
Posted Nov 17, 2022 - 16:04 UTC
This incident affected: www.commcarehq.org (Web Apps).