Case Study
Cracking the Code to Excellence
How Shiprocket, a logistics tech unicorn, achieved a 22% reduction in production bugs
The rise to unicorn status for Shiprocket as a logistics tech company means disrupting the warehousing, fulfillment, and shipping industry. Shiprocket's keen focus on tech-led growth sets it apart in the logistics tech space. This means when your e-commerce orders are being shipped lightning-fast, the behind-the-scenes technology that delivers operational excellence is what makes it happen. To bring in customer delight, this unicorn strived for engineering excellence.
Their engineering teams had already achieved excellent velocity. But they needed insights into how to improve engineering delivery so that they could ship high-quality features, along with data to justify a slowdown of the release cycles.
Reduced production issues by 22% while balancing speed
The quality of features released by software engineers determines the on-ground efficiency of fulfillment and the success of delivered features. This is even more critical for an eCommerce logistics and shipping software solution that runs millions of deliveries every week. Quality also ties back to process efficiency and the speed at which developers release features. So far, so good. But the question we want to highlight is:
“How to maintain a high quality of feature releases while maintaining a good feature release velocity?”
Before hivel.ai
Faster release cycles with compromised review time.
This led to production issues arising out of unreviewed PRs.
The issues forced the engineering teams to resort to hotfixes in every sprint, and this went unnoticed.
More rework resulted in engineer burnout from context switching between new work & rework.
Engineering teams were bogged down resolving production issues.
After hivel.ai
Increased time spent on reviews and slashed unreviewed PRs to almost zero.
With increased PR reviews, bugs went down every sprint.
With fewer bugs, the team focused on fewer feature releases but with higher quality.
With fewer bugs to resolve, there was less context switching, and developers released more high-quality features every sprint.
Delivered high-quality software to end-users.
Data from hivel.ai allowed people to meet quality standards & empowered engineering leaders to communicate the reason for slower delivery using data.
Why did Shiprocket need data from hivel.ai to solve this problem?
Problem
Their engineering teams were caught in a vicious cycle while delivering feature releases at high velocity. The engineers were busy but were experiencing quality issues. Poor quality led to more production issues. Fixing them over and over led to engineer burnout. But the reason for the high rate of production issues (the change failure rate) could not be pinned down.
They had to find answers to these questions.
- Is high deployment frequency always good? What’s the optimal pace for our team?
- Why has the deployment frequency dropped?
- Why has there been an increase in the change failure rate?
How did Shiprocket reduce Production Bugs with hivel.ai?
Shiprocket could reduce production bugs only once its engineering leadership had better visibility into what was hurting the quality of features.
Defined Objectives:
- Find an optimal pace of deployment frequency to deliver good code quality.
- Find the culprit behind the high change failure rate.
Resolution:
Improve the Speed and Quality of Features released at the same time. Reduce the Change Failure Rate and find an optimal deployment frequency without hurting the code quality.
Some Investigations
Deep dive into the PR review process, code review, and process dashboards to identify reasons for the high change failure rate.
Which dashboards revealed data to support their gut feeling?
Solution:
Shiprocket's engineering leadership identified the reasons for increased production bugs using data from hivel.ai. The Process breakdown and Cockpit dashboards revealed the reasons for the high change failure rate.
Their deep dive into the Process breakdown and Pull request dashboards revealed that the recorded PR review times were unrealistically short: quick 1-minute reviews were letting bugs through. These effectively unreviewed PRs were sent to the production pipeline with their bugs, which led to many hotfixes.
Resolution:
The engineering team integrated their version control (Git) and Jira boards to understand the correlation between work estimates and actual work done.
They used Process, Pull Request, and Cockpit screens to measure metrics such as:
- Deployment frequency
- Change failure rate
- Number of unreviewed PRs
- Time taken to review PRs
This chart shows that the change failure rate increased in September 2022, at the same time as an increase in deployment frequency. On further investigation, the engineering leaders found that PRs were being merged without review, while the time spent on PR review had not changed. This was a red flag.
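To make the two headline metrics concrete, here is a minimal sketch of how deployment frequency and change failure rate could be computed per month. It assumes the common definition of change failure rate (failed deployments divided by total deployments). The `Deployment` record, the `monthly_metrics` helper, and the sample rows are hypothetical illustrations; they are not hivel.ai's data model and not Shiprocket's actual figures.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Deployment:
    # Hypothetical record shape, assumed for this sketch only.
    day: date
    caused_incident: bool  # True if the deployment needed a hotfix or rollback


def monthly_metrics(deployments):
    """Return {(year, month): (deployment_count, change_failure_rate)}."""
    counts = defaultdict(int)
    failures = defaultdict(int)
    for d in deployments:
        key = (d.day.year, d.day.month)
        counts[key] += 1
        if d.caused_incident:
            failures[key] += 1
    # Change failure rate = failed deployments / all deployments in the month.
    return {k: (counts[k], failures[k] / counts[k]) for k in sorted(counts)}


# Made-up sample rows for illustration; not data from the case study.
sample = [
    Deployment(date(2022, 8, 3), False),
    Deployment(date(2022, 8, 17), False),
    Deployment(date(2022, 9, 1), True),
    Deployment(date(2022, 9, 8), False),
    Deployment(date(2022, 9, 15), True),
    Deployment(date(2022, 9, 22), False),
]
metrics = monthly_metrics(sample)
for (year, month), (count, cfr) in metrics.items():
    print(f"{year}-{month:02d}: {count} deployments, CFR {cfr:.0%}")
```

Reading both numbers side by side for each month is what exposes the pattern described above: a month where deployments go up and the failure rate goes up with them deserves a closer look at the review process.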
LOOK DEEPER: Is the speed of your feature releases affecting customers' satisfaction or their underlying experience?
Are unreviewed PRs a good or bad sign of software feature release efficiency?
When PRs are merged without any review, it leads to a lot of buggy code being pushed into the applications. The user would have a poor experience with your software, resulting in customer attrition, an increase in incidents, and bugs in the code during quality control and testing.
How to identify a poor PR review process & arrest flashy reviews?
When a PR is reviewed in a few seconds or minutes, it is flagged as a ‘flashy review.’ It simply means the person responsible for reviewing has not paid enough attention to the code. There could be many reasons, such as long lines of code, unfavorable bias within the team, negligence, etc.
Another issue that goes unnoticed in the absence of data is whether the reviewer actually reviewed the code before it was published. Both flaws in the PR review process show up as reduced PR review time.
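The two review flaws described above can be sketched as a simple classification rule. This is an illustration only: the `PullRequest` fields, the `classify` helper, and the two-minute cutoff are assumptions made for the example, since the case study says only "a few seconds or minutes" and does not describe how hivel.ai detects flashy reviews.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Assumed cutoff for a "flashy" review; tune it to your own team's PR sizes.
FLASHY_THRESHOLD = timedelta(minutes=2)


@dataclass
class PullRequest:
    # Hypothetical fields, assumed for this sketch only.
    number: int
    review_duration: Optional[timedelta]  # None means merged with no review


def classify(pr: PullRequest) -> str:
    """Label a merged PR as 'unreviewed', 'flashy', or 'reviewed'."""
    if pr.review_duration is None:
        return "unreviewed"
    if pr.review_duration < FLASHY_THRESHOLD:
        return "flashy"
    return "reviewed"


# Made-up examples for illustration.
prs = [
    PullRequest(1, None),
    PullRequest(2, timedelta(seconds=45)),
    PullRequest(3, timedelta(minutes=20)),
]
labels = {pr.number: classify(pr) for pr in prs}
print(labels)
```

A fixed time cutoff is the simplest possible rule. A fairer variant might scale the threshold with the size of the change, so that a one-line fix reviewed in a minute is not treated the same as a large PR approved in the same time.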
Leadership Buy-in
Using data from Hivel, the engineering leadership was able to represent the problem and validate their decision to slow down release cycles. They were able to provide direction to the engineering managers on how to manage sprints better. This helped them reduce production bugs and increase the quality of the features shipped.
Productivity Insights and Learnings:
- Don’t overvalue speed at the cost of quality.
- If your engineering teams see a high change failure rate, they may need to reevaluate their PR review procedures.
- Watch out for flashy reviews, unreviewed PRs merged, and PR review time.
Shiprocket's engineering leadership proactively used data from hivel.ai to encourage engineers to implement a meaningful review cycle that supports the quality of the review process. To get the buy-in from the stakeholders and leadership on slowing down the release cycle, they used this data as a point of proof to help them validate their proposal.
We are on a mission to empower engineering leaders with data to build High Velocity teams.
hivel.ai is a Productivity Insights platform designed by engineers for engineers.
We have built-in Software Engineering Analytics Dashboards, made for engineering teams, that deliver Productivity Insights. They aid in creating a data-driven culture for people and process efficiency. We aim to make engineering teams consistently release high-quality features in less time, prevent and minimize production bugs, and also avoid team burnout.