Errors for the Twilio Rest API Impacting Multiple Twilio Services

Incident
February 17, 10:01pm EST

Status: closed
Start: February 17, 8:57am EST
End: February 17, 10:01pm EST
Duration: 13 hours 3 minutes
Affected Components:
Cloud Providers Twilio

Investigating

February 17, 8:57am EST

We are currently investigating errors for the Twilio Rest API, impacting multiple Twilio services. Our engineering team has been alerted and is actively investigating. We will update as soon as we have more information.

Investigating

February 17, 9:29am EST

We are experiencing issues with the Twilio Rest API, impacting multiple Twilio services. Customers are experiencing delayed Voice Insights call events for client/SDK, as well as 408 errors on retrieval requests. Customers may also be experiencing issues with Flex edge services, SMS conversion rate, and Verify. We’re consolidating the following StatusPage posts: https://status.twilio.com/incidents/pzhb9l80kkg7, https://status.twilio.com/incidents/vy2l3p2pp7xm, https://status.twilio.com/incidents/6qbh9vr54f8p, https://status.twilio.com/incidents/xtn4n1vwyrjf. Our engineering team is actively working to resolve the issues. We expect to provide another update in 30 minutes or as soon as more information becomes available.

Investigating

February 17, 9:57am EST

We have declared a sev0 incident related to ongoing issues affecting multiple products, including Flex, Programmable Voice, ElasticSearch, Insights, Voice Insights, Programmable SMS, Automated SMS routing, TaskRouter Statistics, Agent Login, and Fraud Guard. All necessary teams have been engaged to remediate the issue and restore services as soon as possible. We will provide an update in 30 minutes or as soon as we have further information.

Investigating

February 17, 10:16am EST

We have declared a sev0 incident related to ongoing issues affecting multiple products, including Flex, Programmable Voice, ElasticSearch, Insights, Voice Insights, Programmable SMS, Automated SMS routing, TaskRouter Statistics, Agent Login, support live chat, and Fraud Guard. All necessary teams have been engaged to remediate the issue and restore services as soon as possible. We will provide an update in 1 hour or as soon as we have further information.

Identified

February 17, 10:38am EST

Our engineering team has identified that a degraded underlying service is causing failures on upstream services affecting multiple products. We are redeploying the underlying service across all other services. Our initial estimate was that partial mitigation would start at 8:00am PST (11:00am EST), but our engineers have now revised that to 9:00am PST (12:00pm EST). As we redeploy the service, this estimate may change. We will continue to provide updates every 30 minutes or sooner if we have further information.

Identified

February 17, 11:08am EST

Our engineering teams are continuing to work on the redeployment of the underlying service. No additional information is available at this time. We will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 11:38am EST

Our engineering teams continue to work on the redeployment of the underlying service. No additional information is available at this time. We will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 12:04pm EST

Our engineering teams are continuing to redeploy the underlying service, but the estimated start of partial mitigation has been pushed back to 9:30am PST (12:30pm EST). No additional information is available at this time. We will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 12:33pm EST

We are seeing intermittent recovery of Flex Console login, but the underlying services within Flex are still facing issues. Our engineering teams are continuing to work to remediate these issues, and we will provide updates every 30 minutes or sooner if we have further pertinent information. Please note that automated SMS routing and SMS conversion rates are not impacted at this time.

Identified

February 17, 1:03pm EST

We are continuing to see intermittent recovery of the Flex Console login, but the underlying services within Flex are still facing issues. Our engineering teams are continuing to work to remediate these issues, and we will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 1:33pm EST

We are continuing to see intermittent recovery of the Flex Console login, but the underlying services within Flex are still facing issues. Our engineering teams are continuing to work to remediate these issues, and we will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 2:11pm EST

We are observing intermittent recovery of Flex console login and other underlying services within Flex. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 2:41pm EST

We are observing intermittent recovery of TaskRouter, Flex console login, and other underlying services within Flex. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 3:15pm EST

We are continuing to observe intermittent recovery of TaskRouter, Flex console login and other underlying services within Flex. Separately, our teams have observed a degradation in Twilio Notify services which is being triaged and assessed for remediation. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 3:45pm EST

We are continuing to observe intermittent recovery of TaskRouter, Flex console login and other underlying services within Flex. Separately, our teams continue to observe a degradation in Twilio Notify and IoT - Programmable Wireless services which are being triaged and assessed for remediation. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 4:15pm EST

We are continuing to observe intermittent recovery of TaskRouter, Flex console login and other underlying services within Flex. Separately, our teams have observed a degradation in Twilio Notify services which is being triaged and assessed for remediation. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 4:50pm EST

We are continuing to observe intermittent recovery of TaskRouter, Flex console login and other underlying services within Flex. Separately, our teams continue to observe a degradation in Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services which are being triaged and assessed for remediation. Our engineering teams are continuing to work towards full restoration of services and we will continue to provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 5:29pm EST

Our engineering team is observing further indications of recovery on our TaskRouter and Flex services, but with continued intermittent issues, which are being addressed. In parallel, our teams are working on the remediation of the following services: Voice Insights, Verify, Sync, Debugger & Alerts, Event Streams, Rest API, Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services. Our engineering teams will continue to work towards full restoration of all affected services and will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 6:07pm EST

Our engineering team continues to observe indications of recovery on our TaskRouter and Flex services, but with continued intermittent issues, which are being identified and addressed. In parallel, our teams are continuing to work on the remediation of the following services: Voice Insights, Verify, Sync, Debugger & Alerts, Event Streams, Rest API, Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services. Our engineering teams will continue to work towards full restoration of all affected services and will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 6:35pm EST

Our engineering team continues to observe indications of recovery on our TaskRouter and Flex services, but with continued intermittent issues, which are being identified and addressed. In parallel, our teams are continuing to work on the remediation of the following services: Voice Insights, Verify, Sync, Debugger & Alerts, Event Streams, Rest API, Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services. Our engineering teams will continue to work towards full restoration of all affected services and will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 7:04pm EST

Our engineering team continues to observe indications of recovery on our TaskRouter and Flex services, but with continued intermittent issues, which are being identified and addressed. In parallel, our teams are continuing to work on the remediation of the following services: Voice Insights, Verify, Sync, Debugger & Alerts, Event Streams, Rest API, Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services. Our engineering teams will continue to work towards full restoration of all affected services and will provide updates every 30 minutes or sooner if we have further pertinent information.

Identified

February 17, 7:59pm EST

Our engineering team continues to observe indications of recovery on our TaskRouter and Flex services, but with continued intermittent issues, which are being identified and addressed. In parallel, our teams are continuing to work on the remediation of the following services: Voice Insights, Verify, Sync, Debugger & Alerts, Rest API, Twilio Notify, IoT - Programmable Wireless, and Serverless Functions logs services. To clarify a prior update, there was no impact to Event Streams services. Our engineering teams will continue to work towards full restoration of all affected services and will provide an update in 1 hour or sooner if we have further information.

Monitoring

February 17, 9:20pm EST

All affected services are now operating normally. We will continue to monitor for system stability. We’ll provide another update in 1 hour or as soon as more information becomes available.

Resolved

February 17, 10:01pm EST

The issue impacting multiple products has been resolved, and Twilio services are operating normally at this time.