MontyCloud · Senior PM Interview
Real-Time Order Tracking
via Webhook Infrastructure

How a logistics OMS went from manual polling and missed deliveries to a real-time event infrastructure — and what I learned building it.

Presenter
Pratyush Ranjan
Company
Blowhorn · 2022–23
Scale
10,000+ orders/day · 70+ cities
01 — Customer
Not every partner needs the same thing
Tier 1 — SMB
Small & Mid Merchants
D2C brands, marketplace sellers. No dedicated tech team.
Built-in OMS UI — zero integration
Templated SMS via built-in gateway, toggle on
~60%
of partners · <200 orders/day
Tier 2 — Growth
Mid-to-Large Merchants
Mid-market e-commerce sellers with in-house tech.
Webhook consumers, own comms stack
Self-serve dashboard
~30%
of partners · 200–800 orders/day
Tier 3 — Enterprise
Decathlon · Croma · Apollo
Dedicated teams, existing CRM stacks.
Full webhook + CRM/marketing automation
Versioned schema, 12-month compat
~50%
of GMV · 1,000–3,000+ orders/day
01b — The Core Question
How do you give 100% of partners
real-time tracking —
not just the 10% with
engineering teams?
The Wrong Answer
Build a webhook API. Require partners to integrate. Ship it as "real-time tracking."

Result: 60% of your partner base — the SMBs with no engineering teams — get nothing. You've solved the problem for enterprise clients and left everyone else behind.
The Right Answer
Build two layers on top of one shared event infrastructure. Layer 1 gives SMBs real-time tracking with zero integration — toggle on. Layer 2 gives enterprise clients full programmatic control via webhooks.

Same underlying architecture. Full coverage on day one.
02 — Problem
Three compounding pains
WISMO overload
30–90 min polling lag → support teams drowning in "Where Is My Order?" calls
22%
Delivery failures from delayed OTP
Rider details reached customers only when the rider called — minutes before arrival or not at all
17%
Missed marketing & CX triggers
Review campaigns fired 6–12 hrs post-delivery. Recovery flows ran next business day.
₹97K
02b — Cost of Inaction
1,500
re-deliveries every single day

At 10k daily orders, a 15–17% first-attempt failure rate means at least 1,500 failed attempts a day. At ₹45–65 per failed attempt, that is ₹67,500–97,500/day in direct margin erosion, before accounting for CX damage, customer churn, or partner relationship cost.
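The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted on this slide; the function name is illustrative only, and the calculation uses the 15% end of the failure-rate range, which is what yields 1,500.

```python
# Inputs are the figures quoted on the slide; the helper name is hypothetical.
def daily_redelivery_cost(orders_per_day, failure_rate, cost_per_attempt):
    """Return (failed attempts per day, direct cost per day in rupees)."""
    failed = round(orders_per_day * failure_rate)
    return failed, failed * cost_per_attempt

# Lower bound of the failure rate (15%) at both ends of the per-attempt cost.
failed_low, cost_low = daily_redelivery_cost(10_000, 0.15, 45)
_, cost_high = daily_redelivery_cost(10_000, 0.15, 65)

# Upper bound of the failure rate (17%) for comparison.
failed_high, _ = daily_redelivery_cost(10_000, 0.17, 45)

print(failed_low, cost_low, cost_high)  # 1500 67500 97500
print(failed_high)                      # 1700
```

At the 17% end of the range the same calculation gives 1,700 failed attempts a day, so the ₹67,500–97,500 band is the conservative estimate.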

22%
of all support calls
were WISMO — "Where Is My Order?" — every single one avoidable with real-time status
6–12 hrs
CRM trigger delay
post-delivery review campaigns and recovery flows firing hours late — missing the conversion window
7 days
to onboard one partner
median integration time before the self-serve dashboard and single-schema design existed
03 — Solution
Two layers. One architecture.
Layer 1 — SMB partners
Built-in OMS Tracking
Toggle-on dashboard. Automated templated SMS via Blowhorn's built-in messaging service. Zero API keys, zero engineering overhead. Every SMB partner gets real-time tracking on day one.
→ 100% partner coverage, zero integration cost
Layer 2 — Growth + Enterprise
Webhook Notification Infrastructure
Event-driven push to registered partner endpoints. Configurable per event type. Retry guarantees. Partners build their own downstream workflows.
→ Programmatic control for those who need it
Both layers share the same underlying event infrastructure — exposed differently based on partner capability.
03b — Five Events Shipped
Every delivery moment covered
order_confirmed
Order accepted, SLA set
🛵
out_for_delivery
Rider + phone + OTP injected
📦
delivered
Confirmed timestamp
delivery_failed
Reason + reschedule
return_initiated
Condition + pickup ETA
Key payload: out_for_delivery
Rider name, masked phone, and OTP injected the moment dispatch is confirmed — before the rider arrives. Customers are ready. This one event drove the +9.6% improvement in first-attempt delivery success.
Key payload: delivery_failed
Failure reason code and reschedule window are delivered within 90 seconds of the rider app update. Enterprise clients trigger CRM recovery flows immediately, not the next business day. Reduced failed-delivery support contacts by 31%.
One schema. All five events.
Every event carries the same JSON shape. Irrelevant fields return null — never omitted. Partners integrate once and receive every event type automatically, forever. No re-integration.
03c — Schema Design
One schema. Every event. Null — never missing.
out_for_delivery · v2 · consistent shape
{
  "event": "out_for_delivery",
  "rider": {
    "name": "Rajesh Kumar",
    "phone_masked": "+91 98xxx xx214"
  },
  "delivery": {
    "otp": "8821",
    "delivered_at": null        // not yet, but always present
  },
  "failure": {
    "reason_code": null,        // not applicable here
    "rescheduled_for": null
  },
  "returns": {
    "initiated": null,          // not applicable here
    "reason_code": null
  }
}
Integrate once. Never re-integrate.
The schema shape is stable: new event types or fields are only ever added, so existing integrations never break. This directly enabled our 1.5-day median onboarding, down from 7 business days.
12-month backward compatibility
Schema versioned v1 to v2. Enterprise clients build with confidence — no breaking changes without advance notice.
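A minimal sketch of the "null, never missing" rule, assuming the envelope contains exactly the fields shown in the out_for_delivery sample. The helper names and the reason code `CUSTOMER_UNAVAILABLE` are hypothetical and not taken from the production system.

```python
import json

# Field layout copied from the out_for_delivery sample; everything else is illustrative.
EMPTY_ENVELOPE = {
    "event": None,
    "rider": {"name": None, "phone_masked": None},
    "delivery": {"otp": None, "delivered_at": None},
    "failure": {"reason_code": None, "rescheduled_for": None},
    "returns": {"initiated": None, "reason_code": None},
}

def build_event(event, **sections):
    """Fill the fixed envelope; unknown sections or fields raise instead of changing the shape."""
    payload = {k: (dict(v) if isinstance(v, dict) else v) for k, v in EMPTY_ENVELOPE.items()}
    payload["event"] = event
    for section, fields in sections.items():
        if not isinstance(payload.get(section), dict):
            raise KeyError(f"unknown section: {section}")
        for field, value in fields.items():
            if field not in payload[section]:
                raise KeyError(f"unknown field: {section}.{field}")
            payload[section][field] = value
    return payload

def shape(obj):
    """Return the nested key structure of a payload, ignoring values."""
    return {k: shape(v) for k, v in obj.items()} if isinstance(obj, dict) else None

ofd = build_event(
    "out_for_delivery",
    rider={"name": "Rajesh Kumar", "phone_masked": "+91 98xxx xx214"},
    delivery={"otp": "8821"},
)
# "CUSTOMER_UNAVAILABLE" is a made-up reason code for illustration.
failed = build_event("delivery_failed", failure={"reason_code": "CUSTOMER_UNAVAILABLE"})

# Both events expose the same keys; fields that do not apply are null, not absent.
assert shape(ofd) == shape(failed)
print(json.dumps(failed["delivery"]))  # {"otp": null, "delivered_at": null}
```

Because every event type goes through the same envelope, a partner parser written against one event keeps working when it receives another.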
04 — User Story 1 / 3
Merchant Partner · 500–2,000 daily shipments
Automated Delivery Communication
"As a merchant partner, I want to receive a real-time push the moment an order moves to Out for Delivery — with rider details and OTP — so that I can immediately relay this to my end-customer via SMS, without any manual intervention."
Acceptance Criteria
out_for_delivery webhook fires within 60 seconds of dispatch update — not on a polling schedule
Payload includes rider_name, rider_phone_masked, delivery_otp, eta_window, order_id
Up to 3 retries with exponential backoff (30s → 2min → 8min) on any non-200 partner response
Merchant configures which endpoint receives each event type independently
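The retry criterion can be sketched as follows. This is an illustration under stated assumptions rather than the dispatcher's actual code: the 4x multiplier is inferred from the 30s → 2min → 8min schedule, and `send` and `sleep` are hypothetical injected callables so the example runs without a network.

```python
from typing import Callable, List

BASE_DELAY_S = 30  # first retry delay, from the acceptance criteria
FACTOR = 4         # 30s -> 2min -> 8min implies a 4x multiplier (inferred)
MAX_RETRIES = 3

def retry_delays() -> List[int]:
    """Delays in seconds before each retry."""
    return [BASE_DELAY_S * FACTOR ** i for i in range(MAX_RETRIES)]

def deliver(send: Callable[[], int], sleep: Callable[[float], None]) -> bool:
    """Attempt delivery once, then retry on any non-200 status code.

    `send` returns the partner's HTTP status; `sleep` waits the given seconds.
    """
    if send() == 200:
        return True
    for delay in retry_delays():
        sleep(delay)
        if send() == 200:
            return True
    return False  # exhausted; the caller would surface this for manual retry

# Simulated partner endpoint: fails twice, then succeeds.
responses = iter([500, 503, 200])
waits: List[float] = []
ok = deliver(lambda: next(responses), waits.append)
print(retry_delays(), ok, waits)  # [30, 120, 480] True [30, 120]
```

Injecting `sleep` keeps the backoff schedule testable without waiting ten minutes of wall-clock time.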
Measured Outcome
83% → 91%
first-attempt delivery success (+8 points, +9.6% relative)
~120 fewer re-deliveries/day at pilot scale
04 — User Story 2 / 3
Technical Integration Lead · Merchant / Enterprise
Self-Serve Integration and Observability
"As a technical integration lead, I want to register, test, and monitor my webhook endpoints from a self-serve dashboard — so that I can go live in under 2 hours, without raising a support ticket to Blowhorn's integration team."
Acceptance Criteria
Register up to 5 endpoint URLs per account, mapped to specific event types independently
'Send Test Event' fires a synthetic payload and shows HTTP response code in real time
Event Log shows last 500 events, filterable by type, status, and timestamp
Failed events can be manually retried from dashboard with full retry history visible
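One way to picture the first criterion is a small per-account registry. This is a hypothetical in-memory sketch: the class name, method names, and example.com URLs are assumptions, while the 5-endpoint cap and the event names come from the slides.

```python
MAX_ENDPOINTS = 5  # per-account cap from the acceptance criteria
EVENT_TYPES = {
    "order_confirmed", "out_for_delivery", "delivered",
    "delivery_failed", "return_initiated",
}

class EndpointRegistry:
    """Hypothetical in-memory registry for a single partner account."""

    def __init__(self):
        self._subscriptions = {}  # url -> set of subscribed event types

    def register(self, url, event_types):
        unknown = set(event_types) - EVENT_TYPES
        if unknown:
            raise ValueError(f"unknown event types: {sorted(unknown)}")
        if url not in self._subscriptions and len(self._subscriptions) >= MAX_ENDPOINTS:
            raise ValueError(f"at most {MAX_ENDPOINTS} endpoints per account")
        self._subscriptions.setdefault(url, set()).update(event_types)

    def targets(self, event_type):
        """URLs that should receive the given event type."""
        return sorted(u for u, t in self._subscriptions.items() if event_type in t)

registry = EndpointRegistry()
registry.register("https://partner.example.com/hooks/sms", ["out_for_delivery"])
registry.register("https://partner.example.com/hooks/crm", ["delivered", "delivery_failed"])
print(registry.targets("delivery_failed"))  # ['https://partner.example.com/hooks/crm']
```

Mapping URLs to sets of event types lets one endpoint serve several events while the cap still counts distinct URLs.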
Measured Outcome
7d → 1.5d
partner integration time (−79%)
Setup support tickets: −68% in 6 weeks
04 — User Story 3 / 3
CX / Marketing Manager · Enterprise Client
CRM and Marketing Trigger Automation
"As a CX manager, I want structured webhook events the moment an order is Delivered or Delivery Failed — so that I can trigger post-purchase CRM workflows within minutes of the outcome, not hours later."
Acceptance Criteria
delivered and delivery_failed events fire within 90 seconds of rider app update — no batching
Payload includes order_id, delivery_timestamp, failure_reason_code, customer_id for CRM routing
Separate endpoint URLs configurable per event type for independent CRM workflow routing
Versioned schema (v1 → v2) with 12-month backward compatibility — no re-integration required
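On the partner side, the routing described in this story might look like the sketch below. The workflow names and sample IDs are hypothetical, and the flat payload layout is an assumption based on the field names listed in the acceptance criteria.

```python
# Workflow names are made up; field names follow the acceptance criteria.
def route_to_crm(payload):
    """Pick a CRM workflow for a delivery outcome event, or None to ignore it."""
    event = payload["event"]
    base = {"order_id": payload["order_id"], "customer_id": payload["customer_id"]}
    if event == "delivered":
        return {"workflow": "review_request",
                "delivered_at": payload["delivery_timestamp"], **base}
    if event == "delivery_failed":
        return {"workflow": "delivery_recovery",
                "reason": payload["failure_reason_code"], **base}
    return None  # other event types are not handled by this consumer

sample = {
    "event": "delivery_failed",
    "order_id": "ORD-1",            # hypothetical identifiers
    "customer_id": "CUST-9",
    "delivery_timestamp": None,     # null, not missing, for a failed delivery
    "failure_reason_code": "CUSTOMER_UNAVAILABLE",
}
print(route_to_crm(sample)["workflow"])  # delivery_recovery
```

Since irrelevant fields arrive as null rather than being omitted, the consumer can read every key without guarding against missing ones.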
Measured Outcome
+22% reviews
post-delivery · within 8 weeks
Failed-delivery support contacts: −31%
05 — Business Impact
Numbers that moved the business
Metric                             | Before    | After       | Change
First-attempt delivery success     | ~83%      | ~91%        | +9.6%
WISMO calls (% of support volume)  | ~22%      | ~9%         | -59%
Partner integration lead time      | 7 days    | 1.5 days    | -79%
Status propagation latency         | 30-90 min | <90 sec     | up to 60x faster
Webhook event delivery reliability | N/A       | >99.5%      | New capability
Daily re-delivery cost savings     | -         | ₹60-80K/day | Network-wide
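The Change column can be reproduced from the Before and After values. This is a sketch of the arithmetic only; the helper name is illustrative. The percentages are relative changes, and the latency speed-up spans 20x to 60x depending on which end of the 30-90 minute polling lag is compared with the 90-second bound.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent, rounded to one decimal."""
    return round((after - before) / before * 100, 1)

first_attempt = pct_change(83, 91)   # first-attempt delivery success
wismo = pct_change(22, 9)            # WISMO share of support volume
lead_time = pct_change(7, 1.5)       # integration lead time in days

# Latency: old polling lag in seconds divided by the new 90-second bound.
speedup_low = (30 * 60) / 90
speedup_high = (90 * 60) / 90

print(first_attempt, wismo, lead_time)  # 9.6 -59.1 -78.6
print(speedup_low, speedup_high)        # 20.0 60.0
```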
06 — Technical Architecture
Built for scale. Decoupled by design.
🔄
CDC + Event Streaming
Order status changes captured at the database level — completely decoupled from the OMS write path.
🚀
Event Dispatcher Service
Consumes the event stream, enriches payloads with rider and OTP data via internal service calls, then delivers over HTTPS.
In-Memory Retry Queue
3 retries: 30s, 2min, 8min. Failures surfaced for manual retry in dashboard.
📊
Log Search + Dashboarding
Every dispatch attempt logged with status, latency, and retry count — powers the Event Log UI and internal SLA dashboards.
Managed Container Infrastructure
Auto-scaling to 3x volume. 99.9% uptime through sale event spikes.
💻
Self-Serve Partner Dashboard
Register, test, monitor, retry — no Blowhorn engineer required.
Thank You
Questions?

Happy to go deeper on any part — the tiered partner model, the schema design, the reliability architecture, or the numbers behind any outcome.

Pratyush Ranjan
[email protected]
linkedin.com/in/ranpra
Supporting materials on next slide if needed →
Reference Materials
Supporting documents

Open in a new tab if a question warrants more detail.
