The Road to Extracting Information from Chats in Real-Time
June 18, 2025 10 min read
By Haroon Hassan, CEO

Real‑Time OTC Chat Intelligence: Move Fast, But Measure Each Step
Real‑time chat data APIs are coming to over‑the‑counter (OTC) FICC. Done right, they can surface fleeting quotes, enrich your situational awareness while a deal is still alive, and in some cases dramatically improve efficiency. Done badly, they burn GPU hours and engineer morale without adding a single basis point of edge. The trick is to earn your way down the latency curve – capturing ROI at higher latency first, then committing to more speed only while each step still pays.
Some pragmatic considerations (when I use the term trader below, I mean trader or salesperson):
Map Value and Cost on the Same Axis
- Information value tends to rise as latency falls; fresher info is more valuable than stale info. But the returns diminish in different ways depending on the application: value might flatten gently, or it might hit a hard limit, e.g. when a human in the loop has to act on the information.
- Compute and ops costs are modest to start – basic plumbing (a pipeline to set up the stream, etc.) – but because language processing needs machine inference, cost and complexity jump when you demand lower‑latency inference.
- The wedge between those curves is your net value zone: real‑time is profitable only while you sit inside it. The goal is to capture as much of it as possible.
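To make the net value zone concrete, here is a minimal sketch. The two curve shapes (a half-life decay for value, a cost term that grows as latency shrinks) and every number in them are illustrative assumptions, not measurements; in practice you would replace both functions with estimates from your own traders and infrastructure bills.

```python
def info_value(latency_s: float, v_max: float = 100.0, half_life_s: float = 30.0) -> float:
    """Hypothetical value curve: information loses half its value every half_life_s seconds."""
    return v_max * 0.5 ** (latency_s / half_life_s)


def run_cost(latency_s: float, base: float = 10.0, k: float = 60.0) -> float:
    """Hypothetical cost curve: fixed plumbing cost plus a term that rises as latency falls."""
    return base + k / latency_s


def net_value_zone(latencies):
    """Return (latency, net value) pairs for the latencies where value exceeds cost."""
    return [
        (lat, info_value(lat) - run_cost(lat))
        for lat in latencies
        if info_value(lat) > run_cost(lat)
    ]


# Candidate latency targets in seconds (assumed for illustration).
candidates = [0.1, 0.5, 1, 5, 30, 120, 600]
zone = net_value_zone(candidates)
best = max(zone, key=lambda pair: pair[1])

for lat, net in zone:
    print(f"{lat:>6} s  net value {net:6.1f}")
print(f"best latency target under these assumptions: {best[0]} s")
```

With these made-up curves the zone covers 1 s to 30 s and the best target is 5 s: below 1 s the inference cost swamps the value, and beyond 30 s the information is too stale to pay for the plumbing.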

Understanding how the cost and value curves behave with respect to latency is crucial in most applications; without it, additional investment in real time may be needlessly painful.

A good candidate for maxing out real‑time development is one where ROI keeps growing for arbitrary reductions in latency, up to some limit (e.g. HFT).
Start with a T+1 Foundation with Humans in the Loop (HITL)
- Ingest yesterday’s chat logs and, over time, work out the minimal normalization your application requires.
- Create a “gold” corpus by manually labelling important information in the chats.
- Train baseline models offline and carefully monitor them against human truth. This is more subtle than you think. What needs monitoring depends on your annotation schema. Good luck getting that right!
- Ship next‑day dashboards that traders use on a T+1 basis.
This is not a sandbox. It generates business value while you generate the high‑fidelity dataset and the processes you need for reliable automation at lower latency with minimal to no HITL.
Let the HITL - Non‑HITL Spread Collapse Before Chasing Speed
Your T+1 pipeline likely uses human validators to correct model errors. Before you shorten latency:
- Clone the pipeline without the humans (same models, same data).
- Track precision, recall and value uplift for both versions. (To measure value, you need feedback from traders.)
- Retrain until the non-HITL metrics are within ~2-3 percentage points (pp) of the HITL baseline.
When that performance spread is stably narrow, you’ve de‑risked automation and can justify the move to lower latency, ultimately real-time.
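The spread check can be sketched as follows. This is a rough illustration, assuming you already count true positives, false positives and false negatives against the gold corpus for each evaluation window; the `Counts` type, the function names, the 3 pp threshold default and the sample numbers are all hypothetical. Value uplift is left out because it comes from trader feedback rather than from labels.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # extractions that match the gold corpus
    fp: int  # extractions not present in the gold corpus
    fn: int  # gold items the pipeline missed


def precision(c: Counts) -> float:
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Counts) -> float:
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def spread_pp(hitl: Counts, auto: Counts) -> dict:
    """Gap between HITL and non-HITL pipelines, in percentage points."""
    return {
        "precision": 100 * (precision(hitl) - precision(auto)),
        "recall": 100 * (recall(hitl) - recall(auto)),
    }


def ready_to_automate(windows, max_spread_pp: float = 3.0) -> bool:
    """True only if every evaluation window keeps both gaps within the threshold."""
    return all(
        gap <= max_spread_pp
        for hitl, auto in windows
        for gap in spread_pp(hitl, auto).values()
    )


# Hypothetical evaluation windows: (HITL counts, non-HITL counts).
stable = [
    (Counts(960, 20, 40), Counts(940, 35, 60)),
    (Counts(955, 25, 45), Counts(938, 36, 62)),
]
unstable = stable + [(Counts(960, 20, 40), Counts(880, 90, 120))]

print(spread_pp(*stable[0]))
print(ready_to_automate(stable), ready_to_automate(unstable))
```

Requiring every window to pass, rather than the average, is one simple way to encode "stably narrow": a single bad window keeps the humans in the loop.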
Pinpoint Which Use‑Cases Benefit from Lower Latency
Not every application benefits equally from shaving off milliseconds. Think in terms of “latency leverage.”

[Figure: Latency Leverage]
Know when to Stop Optimising Latency
If sales-traders capture more than 90% of the value at 5-second latency, chasing 1 second may cost more than it returns. The art in all this, of course, is estimating value: ASK THE SALES-TRADERS! At that point, shift to improving coverage (more products) and depth (deeper information extraction), i.e. work on increasing value at the achieved level of latency.
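One way to turn this stopping rule into a check is to compare the marginal value and marginal cost of each latency milestone. The sketch below assumes you have per-milestone estimates of cumulative value (from sales-trader feedback) and cumulative cost in the same units; the milestone list and its numbers are invented for illustration.

```python
def stop_latency(milestones):
    """Pick the latency at which to stop.

    milestones: (latency_s, cumulative_value, cumulative_cost) tuples,
    ordered from slowest to fastest. Walks down the latency curve and stops
    before the first step whose added value does not exceed its added cost.
    """
    chosen = milestones[0]
    for prev, cur in zip(milestones, milestones[1:]):
        marginal_value = cur[1] - prev[1]
        marginal_cost = cur[2] - prev[2]
        if marginal_value <= marginal_cost:
            break
        chosen = cur
    return chosen[0]


# Hypothetical estimates: T+1, hourly, one minute, 5 seconds, 1 second.
milestones = [
    (86_400, 40, 10),
    (3_600, 60, 15),
    (60, 80, 25),
    (5, 92, 35),
    (1, 95, 80),
]

target = stop_latency(milestones)
share = {lat: value / milestones[-1][1] for lat, value, _ in milestones}
print(f"stop at {target} s, capturing {share[target]:.0%} of the estimated value")
```

In this made-up example the step from 5 s to 1 s adds 3 units of value for 45 units of cost, so the rule stops at 5 s, where roughly 97% of the estimated value is already captured.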
Key take‑aways
- Collapse the HITL vs. non-HITL performance gap first; that is your safety net.
- Match latency to task type: automate live manual steps quickly; accept that behaviour‑mining tasks harvest most value over hours or days.
- Stop chasing latency when marginal cost exceeds marginal value – instead build value at a level of latency that doesn’t break the bank or engineer morale.
- Most importantly, judge each latency milestone by realised ROI to find the right level of latency for the application – keeping both engineers and traders firmly in the green!