EU DC Partial Outage Resolved: A Detailed RCA

Incident Summary

One of the Zoho Desk nodes in the EU data center (DC) became overloaded and could not keep up with the heavy traffic. Requests slowed down, resulting in a partial outage for customers whose data resides on that node.

On May 2, 8:26 AM CEST, our team identified the incident. Our engineers determined the root cause and began addressing it by adjusting our system configurations. In parallel, we started transferring high-traffic organizations to other nodes to reduce the load on the affected node. A second round of configuration adjustments mitigated the issue and allowed the portals to load and work by May 2, 03:19 PM CEST, but it did not completely resolve the problem.

After completing the scheduled movement of the top traffic-generating organizations, we deployed a bug fix that prevents the system from holding connections for extended periods. Stability was fully restored on May 3, 12:40 PM CEST. The incident and its history were also recorded on the service availability page (status.zoho.eu).
 
Technical Breakdown

We observed that the system was slow, and a significant number of users had difficulty connecting to Desk. This directly affected the availability and performance of our service.
 
Analysis of our monitoring and log data showed that the high traffic was legitimate and not the result of a DDoS attack. The incident was solely a load-surge issue, and the partial outage caused no data loss or other data impact. Our engineering team tuned the system configuration to handle the traffic, but this did not completely resolve the issue. We also moved the top traffic-generating organizations off the impacted node to reduce its traffic, which brought some improvement.
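
As a rough illustration of how moving the busiest organizations off an overloaded node might be planned, here is a minimal Python sketch. It is not Zoho's tooling: the function name `plan_moves`, the per-organization traffic figures, the node names, and the `target_load` parameter are all assumptions made for this example.

```python
def plan_moves(org_traffic, node_of, nodes, hot_node, target_load):
    """Greedy sketch: move the busiest orgs off hot_node until its load
    is at or below target_load.

    org_traffic: dict of org -> traffic (hypothetical requests per minute)
    node_of:     dict of org -> node currently hosting it
    nodes:       all node names, including nodes hosting no orgs yet
    Returns a list of (org, destination_node) pairs.
    """
    # Current load per node, starting every known node at zero.
    load = dict.fromkeys(nodes, 0)
    load.setdefault(hot_node, 0)
    for org, node in node_of.items():
        load[node] = load.get(node, 0) + org_traffic[org]

    others = [n for n in load if n != hot_node]
    if not others:
        return []

    # Consider the busiest orgs on the hot node first.
    candidates = sorted(
        (org for org, node in node_of.items() if node == hot_node),
        key=lambda org: org_traffic[org],
        reverse=True,
    )

    moves = []
    for org in candidates:
        if load[hot_node] <= target_load:
            break
        dest = min(others, key=lambda n: load[n])  # least-loaded node
        # Skip a move that would only shift the hotspot elsewhere.
        if load[dest] + org_traffic[org] > target_load:
            continue
        load[hot_node] -= org_traffic[org]
        load[dest] += org_traffic[org]
        moves.append((org, dest))
    return moves


# Hypothetical data: node "n1" carries 900 units of traffic.
traffic = {"a": 500, "b": 300, "c": 100, "d": 50}
placement = {"a": "n1", "b": "n1", "c": "n1", "d": "n2"}
print(plan_moves(traffic, placement, ["n1", "n2", "n3"], "n1", 600))
```

The sketch always sends an organization to the least-loaded node and refuses moves that would push the destination over the same target, so relieving one node does not overload another.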

Furthermore, we identified a code bug that held connections for an extended period of time and quickly deployed a live build to rectify the issue.
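
The bug itself is not described here, so the following is only a generic Python sketch of this class of problem: code that holds a pooled connection longer than it should. The `BoundedPool` class, its parameters, and the `long_holds` list are hypothetical names invented for the example. The idea is that a connection is always released when the block exits, waiting for a free connection fails fast, and unusually long holds are recorded for review.

```python
import threading
import time
from contextlib import contextmanager


class BoundedPool:
    """Hypothetical connection pool with a fixed number of slots."""

    def __init__(self, size, max_hold_seconds):
        self._slots = threading.BoundedSemaphore(size)
        self._lock = threading.Lock()
        self.max_hold_seconds = max_hold_seconds
        self.in_use = 0
        self.long_holds = []  # (label, seconds held) for holds over the limit

    @contextmanager
    def connection(self, label, acquire_timeout=1.0):
        # Fail fast instead of queueing forever when the pool is exhausted.
        if not self._slots.acquire(timeout=acquire_timeout):
            raise TimeoutError(f"no free connection for {label}")
        with self._lock:
            self.in_use += 1
        started = time.monotonic()
        try:
            yield object()  # stand-in for a real connection handle
        finally:
            # Runs on normal exit and on exceptions, so the slot is
            # never leaked by the calling code.
            held = time.monotonic() - started
            with self._lock:
                self.in_use -= 1
                if held > self.max_hold_seconds:
                    self.long_holds.append((label, held))
            self._slots.release()


pool = BoundedPool(size=2, max_hold_seconds=0.1)
with pool.connection("report-export"):
    time.sleep(0.3)  # simulate work that keeps the connection too long
print([label for label, _ in pool.long_holds])
```

Scoping the connection to a `with` block keeps the hold time tied to the work that needs it, and the recorded long holds give a starting point for finding code paths that keep connections open.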
 
Timeline (in CEST)

May 2, 09:26 AM    Incident identified
May 2, 11:40 AM    Root cause identified - high number of connections
May 2, 12:15 PM    System configurations tuned
May 2, 01:51 PM    Started moving top-traffic generating orgs
May 2, 03:54 PM    Second-level system configuration tuning
May 3, 12:12 PM    Preparation of build to fix code bug began
May 3, 12:29 PM    Movement of top-traffic generating orgs completed
May 3, 02:40 PM    Bug fix build went live, stabilizing the system.
 
Future Preventive Measures to Avoid Recurrence of the Issue
  • Relocating selected organizations to other nodes to keep the connection count on the affected node to a minimum.
  • Monitoring system connections proactively and rebalancing them as necessary.
  • Setting a lower connection threshold so that we are notified early when it is breached and can act promptly, before customers are affected.
  • Adding a code-check rule that catches code holding connections for extended periods before it is shipped to production.
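
The lower connection threshold mentioned above can be pictured with a small Python sketch. The function names, the `warn_ratio` default of 0.7, and the node names and connection counts are assumptions chosen for illustration, not values from our systems.

```python
def classify_connections(current, hard_limit, warn_ratio=0.7):
    """Return "ok", "warn", or "critical" for a node's connection count.

    warn_ratio sets the early-warning threshold as a fraction of the
    hard limit, so alerts fire before the node is saturated.
    """
    if hard_limit <= 0:
        raise ValueError("hard_limit must be positive")
    if not 0 < warn_ratio < 1:
        raise ValueError("warn_ratio must be between 0 and 1")
    if current >= hard_limit:
        return "critical"
    if current >= hard_limit * warn_ratio:
        return "warn"
    return "ok"


def nodes_needing_attention(counts, hard_limit, warn_ratio=0.7):
    """Map each node that is not "ok" to its alert level."""
    levels = {
        node: classify_connections(count, hard_limit, warn_ratio)
        for node, count in counts.items()
    }
    return {node: level for node, level in levels.items() if level != "ok"}


# Hypothetical connection counts per node, with a hard limit of 1000.
counts = {"node-1": 500, "node-2": 800, "node-3": 1000}
print(nodes_needing_attention(counts, hard_limit=1000))
```

Lowering `warn_ratio` trades more frequent notifications for more time to rebalance organizations before customers notice a slowdown.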

Regards,
Zoho Desk Team