In the previous article, Observability: What Matters, we explored how observability helps engineers understand what's happening inside a system.
But here's the uncomfortable truth:
A healthy system doesn't always mean a healthy business.
When I was growing as a software engineer, it was hard to connect what I was building with what actually mattered for the business. We were efficient at shipping code, but were we solving the right problems?
It took years (and plenty of painful launches, rewrites, and misaligned requirements) to understand this shift.
Here's a real case that changed how I think about engineering.
Case: Search Product. PM Wants LTR, but the Real Fix Was Simpler
1. Problem Statement
"Search results feel irrelevant. We need to implement LTR so we can reorder results based on user behavior and improve conversion."
LTR (learning to rank) sounds exciting, but it's heavy:
- ML pipeline
- Training & evaluation
- Feedback loop
- Feature store
- A/B testing
- …
A 3–6 month initiative.
2. Step Back: Understand the Business Context
Before jumping into a multi-month LTR project, we took a step back to understand how the system actually behaved. Once we looked closely, the patterns were obvious:
- Users weren't complaining about ranking accuracy.
- They were complaining about clutter, duplicates, and confusing ordering.
In grocery search, a single item (Indomie Goreng, Ultra Milk, etc.) often explodes into multiple SKUs:
- different pack sizes
- bundles
- promo SKUs
- warehouse-specific SKUs
This noise looked like poor relevance.
Deep investigation showed the root cause wasn't the ranking logic at all.
3. Deep Dive Into the Catalog Structure
The catalog already had a solid data model:
- SKU: granular item (Coca-Cola 330ml, Coca-Cola 1.5L)
- Variant: product type (Beverage, Dairy, Snack)
- Product Group: shared identity (brand + flavor)
The structure wasn't the issue. How we used it was.
We tested two high-leverage improvements:
Group variants into a single product card
- Reduced noise
- Results looked cleaner and "more relevant"
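To make the grouping idea concrete, here is a minimal Python sketch. It assumes each search hit already carries a product group ID from the catalog. The names `SkuHit`, `ProductCard`, and `group_into_cards`, along with the sample scores, are made up for illustration and are not the actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SkuHit:
    """One raw search hit. All field names are illustrative assumptions."""
    sku_id: str
    name: str
    product_group_id: str
    score: float


@dataclass
class ProductCard:
    """One card shown to the user, backed by one or more SKUs."""
    product_group_id: str
    title: str
    best_score: float
    skus: list = field(default_factory=list)


def group_into_cards(hits):
    """Collapse hits that share a product group into a single card.

    Hits are visited from highest to lowest score, so each card is created
    by its best-scoring SKU and the cards come out ordered by that score.
    """
    cards = {}  # dicts keep insertion order in Python 3.7+
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        card = cards.get(hit.product_group_id)
        if card is None:
            card = ProductCard(hit.product_group_id, hit.name, hit.score)
            cards[hit.product_group_id] = card
        card.skus.append(hit)
    return list(cards.values())


# Made-up sample data: three Ultra Milk SKUs and one Indomie SKU.
hits = [
    SkuHit("s1", "Ultra Milk 250ml", "ultra-milk-plain", 0.91),
    SkuHit("s2", "Ultra Milk 1L", "ultra-milk-plain", 0.88),
    SkuHit("s3", "Ultra Milk 250ml x6 bundle", "ultra-milk-plain", 0.85),
    SkuHit("s4", "Indomie Goreng 85g", "indomie-goreng", 0.60),
]
cards = group_into_cards(hits)
for card in cards:
    print(card.title, "-", len(card.skus), "SKU(s)")
```

Four raw hits become two cards, and the pack sizes and bundles sit behind the card instead of competing for space in the result list. The ranking scores themselves are left untouched.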
Apply contextual boosting (Brand Positioning)
- Search "milk" → boost dairy, not unrelated categories
- Search "apple" → fruit first, not the electronics brand
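Contextual boosting can be sketched in a few lines as well. This is plain Python rather than any search engine's API, and it assumes a hand-maintained mapping from query terms to the category shoppers most likely mean. The mapping, the `Hit` fields, and the `2.0` multiplier are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping from a query term to the intended category.
QUERY_CATEGORY_INTENT = {
    "milk": "Dairy",
    "apple": "Fruit",
}

CATEGORY_BOOST = 2.0  # assumed multiplier; a real value would need tuning


@dataclass
class Hit:
    name: str
    category: str
    score: float


def apply_contextual_boost(query, hits, boost=CATEGORY_BOOST):
    """Re-order hits so the category implied by the query ranks higher.

    Queries without a known intent fall back to the raw score order.
    """
    intended = QUERY_CATEGORY_INTENT.get(query.strip().lower())

    def effective_score(hit):
        if intended is not None and hit.category == intended:
            return hit.score * boost
        return hit.score

    return sorted(hits, key=effective_score, reverse=True)


# Made-up sample data for the "apple" case.
apple_hits = [
    Hit("Apple charging cable", "Electronics", 0.90),
    Hit("Fuji Apple 1kg", "Fruit", 0.70),
]
ranked = apply_contextual_boost("apple", apple_hits)
print([h.name for h in ranked])
```

With the boost applied, the fruit hit scores 0.70 × 2.0 = 1.40 and moves ahead of the 0.90 electronics hit. The point is that a lookup table plus a multiplier can cover the most visible cases without a trained ranking model.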
Result: 1–2 Weeks, Not 3–6 Months
We delivered the improvement in weeks, not quarters.
- Search CTR improved from X% to Y%
- Zero-result rate dropped from X% to Y%
- Perceived relevance (via user feedback/NPS) improved significantly
High impact, minimal complexity.
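If you haven't worked with these metrics before, here's roughly how search CTR and zero-result rate could be computed from search logs. Definitions vary between teams. This sketch assumes one log record per search, counts a search as "clicked" if the user clicked at least one result, and uses a hypothetical `SearchEvent` shape with made-up sample records.

```python
from dataclasses import dataclass


@dataclass
class SearchEvent:
    """One logged search. Field names are illustrative assumptions."""
    query: str
    result_count: int
    clicked: bool


def search_ctr(events):
    """Share of searches that led to at least one click."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.clicked) / len(events)


def zero_result_rate(events):
    """Share of searches that returned no results at all."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.result_count == 0) / len(events)


# Made-up sample log, only to show the calculation.
events = [
    SearchEvent("milk", 12, True),
    SearchEvent("apple", 8, False),
    SearchEvent("indomie", 5, True),
    SearchEvent("mlik", 0, False),
]
print(f"Search CTR: {search_ctr(events):.0%}")
print(f"Zero-result rate: {zero_result_rate(events):.0%}")
```

Both numbers come straight from logs, which is why they react to a search change long before conversion or revenue does.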
Product teams often ask for large features because they don't see the underlying system behavior. But once you understand the constraints deeply, the opportunity becomes clear: a small, high-leverage change can unlock the outcome they were aiming for. This is why deep technical insight isn't just engineering; it's product strategy.
Learn to Practice Tai Chi
This experience taught me something important: the best engineering wins often come not from building what's requested, but from redirecting the conversation toward high-leverage solutions.
As engineers, we can use our technical insight to guide the product conversation toward the right solution, the high-leverage one, rather than just the requested one. And once we develop a deeper understanding of business context and product language, we become far better negotiators.
Instead of responding with:
- "That's hard to build."
- "We don't have time."
- "There are too many bugs."
We can redirect the conversation toward high-impact, low-effort alternatives.
This is where engineering tai chi comes in. In martial arts, tai chi is about redirecting an opponent's energy rather than blocking it head-on. The same principle applies here: use your understanding to gently redirect the energy of the request. Not by blocking, but by reframing.
Do that by connecting the why, the how, and the what:
- why the system causes that pain
- how existing infrastructure can fix it faster
- what small change actually moves the needle
The goal isn't to push back for the sake of it. The goal is to guide the team toward a solution that is faster, simpler, aligned with real user behavior, and significantly more impactful.
Appendix: Understand The Product Language
| Metric Type | Purpose | Example Signals | Risk When Ignored | Opportunity When Used Well |
|---|---|---|---|---|
| Lagging Metrics | Final business outcomes. Slow to detect issues. | Conversion Rate (CVR), Revenue, Retention, Orders per user, Avg. basket size | High risk: problems detected too late; revenue may drop significantly before action is taken. | Align engineering with business goals. Validate long-term feature impact. |
| Leading Metrics | Early indicators that predict movement in lagging metrics. | Search CTR, ATC per session, PDP views, Scroll depth, Zero-result rate | Medium–High risk: teams may build the wrong solution (e.g., LTR) by misreading user behavior. | Detect issues early. Find small, high-leverage improvements that move conversion quickly. |
| Watch Metrics | Guardrail metrics that maintain reliability and UX quality. | API latency, Error rate, Crash rate, Index freshness, SKU duplication count | Low–Medium risk: system appears healthy but UX silently degrades; hidden issues accumulate over time. | Prevent fires. Distinguish between product issues vs. technical issues. Ensure a stable experience. |