StatusCast engineers have identified a backup in background processing that is delaying the completion of some actions.
Engineers have determined the root cause of the processing backup. In the meantime, we have scaled out the responsible service to help clear the remaining items faster.
Services should be operating as expected.
StatusCast engineers have been alerted to a possible performance-impacting event affecting response times for status pages and the admin application. This event is not impacting notification processing. We apologize for the inconvenience and will provide an update shortly.
All services should be fully functional and performing normally.
StatusCast's engineers determined that at approximately 8:00PM EST on November 15th, 2024, several of StatusCast's application servers experienced an issue that caused response times to spike. StatusCast's infrastructure in Azure is designed to perform scaling procedures for services under duress, and for all but one of the application servers in question this was done successfully, restoring the service to an acceptable level of performance.
The remaining application server did not scale correctly and stayed in a degraded state for multiple days. On November 18th, engineers were alerted that some customers were still experiencing load time delays, and at that point the last server was corrected.
After this event, StatusCast's DevOps team began an audit of all scaling procedures to ensure that all application services across our Azure operational regions adhere to a consistent scaling process and, more importantly, that monitoring is properly deployed to all services.
StatusCast engineers were alerted earlier that some users were experiencing sporadic issues attempting to connect to the status page and admin portal. Our hosting provider, Microsoft Azure, has alerted us via their status page that they are experiencing some network issues globally. We will provide an update as soon as more information is available.
Access to status pages has remained stable, and Azure has updated its status to indicate that failover processes have been engaged to improve service availability. StatusCast's engineers will continue to watch this closely and will post additional updates as necessary.
StatusCast's application has remained stable. Our engineers will continue to watch the system closely, as Microsoft has not fully closed out the event on their side. For more specific details on Azure's issue, please refer to their status page. We will provide additional updates as necessary.
Microsoft has closed the issue on their side, and StatusCast's platform continues to operate as expected. Once Microsoft has published more details, we will provide them here in the form of an RCA.
Mitigation Statement - Azure Front Door Issues accessing a subset of Microsoft services
At approximately 8:19PM EDT, StatusCast’s engineers were alerted that some status pages and admin applications were inaccessible. The team identified that its hosting partner, Microsoft, was experiencing issues in its US East region related to app services and SQL database connections. As of 9:03PM EDT, services have been restored and StatusCast’s team is working with Microsoft to fully investigate the incident. Once the team has completed its investigation, we will follow up with an RCA.
At this time, StatusCast should be operating fully as expected. If you continue to have any issues, please contact us at support@statuscast.com.
As of 9:03PM EDT, services have been restored and StatusCast’s team is working with Microsoft to fully investigate the incident. Once the team has completed its investigation, we will follow up with an RCA.
In working with Microsoft, StatusCast’s team confirmed that the disruption was due to an outage of SQL databases in Azure’s US East region, where StatusCast is primarily hosted:
StatusCast itself was impacted by this outage from approximately 8:19 PM EDT and had fully recovered by 9:03 PM EDT. StatusCast’s team will continue to work closely with Microsoft to further optimize its offering and help ensure that the impact of service provider outages is as minimal as possible.