The Road to Extracting Information from Chats in Real-Time

June 18, 2025   10 min read

By Haroon Hassan, CEO

Real‑Time OTC Chat Intelligence: Move Fast, But Measure Each Step

Real-time chat data APIs are coming to over-the-counter (OTC) FICC. Done right, they can surface fleeting quotes, enrich your situational awareness while a deal is still alive, and in some cases dramatically improve efficiency. Done badly, they burn GPU hours and engineer morale without adding a single basis point of edge. The trick is to earn your way down the latency curve: capture ROI at higher latency first, then commit to more speed only while every step still pays.

Some pragmatic considerations (when I use the term trader below, I mean trader or salesperson):

Map Value and Cost on the Same Axis

  • Information value tends to rise as latency falls: fresher information is worth more than stale information. But the gains diminish, and how they diminish depends on the application. The curve may flatten gently, or it may hit a hard limit, e.g. a human in the loop who has to act before anything else can happen.

  • Compute and ops costs are modest to start (basic plumbing, such as the pipeline that sets up the stream). But because language processing needs machine inference, cost and complexity jump when you demand lower-latency inference.

  • The wedge between those curves is your net value zone: real-time is profitable only while you sit inside it. The goal is to capture as much of that zone as possible.

In most cases, understanding how cost and value each vary with latency is crucial. Without that understanding, additional investment in real time may be needlessly painful.
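To make the net value zone concrete, here is a minimal sketch. Both curves are invented for illustration: the `value` function assumes value decays with latency and stops improving below a 2-second human reaction floor, and the `cost` function assumes fixed plumbing plus inference cost that climbs steeply as latency shrinks. Your own curves have to come from trader feedback and real infrastructure bills.

```python
import math

def value(latency_s: float) -> float:
    """Hypothetical value curve (arbitrary units): fresher is better,
    but a human in the loop means nothing improves below 2 seconds."""
    effective = max(latency_s, 2.0)  # assumed human reaction floor
    return 100.0 * math.exp(-effective / 60.0)

def cost(latency_s: float) -> float:
    """Hypothetical cost curve: fixed plumbing plus inference cost
    that rises sharply as the latency target gets tighter."""
    return 10.0 + 120.0 / latency_s

def net_value_zone(latencies):
    """Return (latency, net value) pairs where value exceeds cost."""
    return [(l, value(l) - cost(l)) for l in latencies if value(l) > cost(l)]

# Candidate latency targets in seconds, from sub-second to one hour.
candidates = [0.5, 1, 2, 5, 10, 30, 60, 300, 3600]
zone = net_value_zone(candidates)
best_latency, best_net = max(zone, key=lambda pair: pair[1])

for latency, net in zone:
    print(f"{latency:>6} s  net value {net:6.1f}")
print(f"best target under these assumptions: {best_latency} s")
```

With these made-up curves the zone is bounded on both sides: too slow and the information is stale, too fast and inference cost swamps the value.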

A good candidate for maxing out real-time development is an application where ROI keeps growing for arbitrary reductions in latency, up to some limit (e.g. HFT).

Start with a T+1 Foundation with Humans in the Loop (HITL)

  1. Ingest yesterday’s chat logs and, over time, work out the minimal normalization your application requires.
  2. Create a “gold” corpus by manually labelling important information in the chats.
  3. Train baseline models offline and carefully monitor them against human truth. This is more subtle than you think. What needs monitoring depends on your annotation schema. Good luck getting that right!
  4. Ship next-day dashboards that traders use on a T+1 basis.

This is not a sandbox. It generates business value while you build the high-fidelity dataset and the processes you need for reliable automation at lower latency with minimal to no HITL.

Your T+1 pipeline likely uses human validators to correct model errors. Before you shorten latency:

  1. Clone the pipeline without the humans (same models, same data).
  2. Track precision, recall and value uplift for both versions. (How do you measure value? Get feedback from traders.)
  3. Retrain until the non-HITL metrics are within ~2-3 pp of the HITL baseline.

 

When that performance spread is stably narrow, you’ve de‑risked automation and can justify the move to lower latency, ultimately real-time.
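The gap check can be sketched in a few lines. This assumes each pipeline's output and the gold corpus can be reduced to sets of comparable items (plain integers stand in for extracted facts here), and the function names and the 3 pp threshold are illustrative choices rather than a prescribed method. Value uplift is left out because it needs trader feedback, not a formula.

```python
def precision_recall(predicted: set, gold: set):
    """Precision and recall of a set of extracted items against gold labels."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

def gap_pp(hitl_metrics, auto_metrics):
    """Gap in percentage points (HITL minus non-HITL) for each metric."""
    return tuple(round((h - a) * 100, 1) for h, a in zip(hitl_metrics, auto_metrics))

def ready_to_automate(hitl_metrics, auto_metrics, max_gap_pp=3.0):
    """True when every metric of the non-HITL clone is within max_gap_pp."""
    return all(g <= max_gap_pp for g in gap_pp(hitl_metrics, auto_metrics))

# Toy data: 100 gold items; items 100 and 101 are false positives.
gold = set(range(100))
hitl_output = set(range(98)) | {100}        # humans corrected most errors
auto_output = set(range(96)) | {100, 101}   # same models, no validators

hitl = precision_recall(hitl_output, gold)
auto = precision_recall(auto_output, gold)
print("gap (precision, recall) in pp:", gap_pp(hitl, auto))
print("ready to automate:", ready_to_automate(hitl, auto))
```

A single passing check is not enough: run it on every day's data and act only once the gap has stayed narrow for a sustained period.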

Pinpoint Which Use‑Cases Benefit from Lower Latency

Not every application benefits equally from shaving milliseconds. Think in terms of “latency leverage.”

Latency Leverage

Know When to Stop Optimising Latency

If sales-traders capture more than 90% of the value at 5-second latency, chasing 1 second may cost more than it returns. The art in all this, of course, is estimating value: ASK THE SALES-TRADERS! At that point, shift to improving coverage (more products) and depth (deeper information extraction), i.e. work on increasing value at the latency you have already achieved.
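One way to turn that judgement into a rule is to compare marginal value with marginal cost at each latency milestone. The figures below are invented (arbitrary units, with value standing in for what sales-traders tell you they capture) and the function name is hypothetical; only the comparison matters.

```python
# (milestone, cumulative value captured, cumulative cost), all illustrative.
milestones = [
    ("T+1", 40.0, 10.0),
    ("1 hour", 70.0, 15.0),
    ("5 seconds", 92.0, 30.0),
    ("1 second", 100.0, 80.0),
]

def last_worthwhile_milestone(milestones):
    """Walk from slowest to fastest and stop at the first step whose
    extra value no longer exceeds its extra cost."""
    chosen = milestones[0]
    for previous, candidate in zip(milestones, milestones[1:]):
        marginal_value = candidate[1] - previous[1]
        marginal_cost = candidate[2] - previous[2]
        if marginal_value <= marginal_cost:
            break  # the next step down the latency curve does not pay
        chosen = candidate
    return chosen

name, captured, spent = last_worthwhile_milestone(milestones)
print(f"stop at {name}: value {captured}, cost {spent}")
```

In this toy data the 5-second milestone already captures 92 of 100 units of value, and the step to 1 second adds 8 units of value for 50 units of cost, so the rule stops at 5 seconds.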

 

Key take‑aways

  1. Collapse the HITL vs. non-HITL performance gap first; that is your safety net.
  2. Match latency to task type: automate live manual steps quickly; accept that behaviour‑mining tasks harvest most value over hours or days.
  3. Stop chasing latency when marginal cost exceeds marginal value. Instead, build value at a level of latency that doesn’t break the bank or engineer morale.
  4. Most importantly, judge each latency milestone by realised ROI to find the right level of latency for the application, keeping both engineers and traders firmly in the green!
