Beyond Dashboards: An HFT Engineer on Real-Time Operational Intelligence

3forge sits at the center of some of the most demanding trading environments in the world. To understand what that looks like from the inside, we spoke with Amit Kumar Mishra, an engineer who has spent years building real-time operational systems in high-frequency trading. He walked us through the journey from 150 disconnected screens to a single operational seat, the architecture that makes it possible, and why, in fast-moving markets, visibility alone is no longer enough.

How did you first encounter 3forge, and what problem brought it into view?

My first exposure to 3forge was around 2023, while I was working in a high-frequency trading environment. One of the biggest operational challenges at the time was fragmentation. Operational monitoring depended on more than 150 active RDP sessions across the trading infrastructure, with each server maintaining its own reporting system, scripts, monitoring utilities, and isolated workflows.

A large part of our visibility depended on separate tools and custom scripts stitched together over time. Many summaries and exchange-level reports were only available post-market, which meant operational decisions were often reactive instead of real-time.

The requirement from management was very clear: bring everything into a unified operational view. They wanted visibility into trades, algorithm-level PnLs, net positions, losses, rejections, and alerts, with real-time monitoring from a single seat, whether in the office or remote. That was my first practical introduction to 3forge.

What were you trying to solve, and what stood out first?

Initially, the focus was operational consolidation and continuous monitoring. We were trying to eliminate the dependency on disconnected reporting systems and create a unified layer where critical trading and infrastructure information could be tracked continuously.

What stood out immediately was the ability to aggregate live data from multiple systems into a structured, interactive operational view. Instead of waiting for end-of-day summaries, we could see losses, rejections, trade activity, and strategy-level PnLs in real time.

Another important shift was that the platform was not limited to visualization alone. The same data could also drive analytics, monitoring, and workflow-driven actions. That changed the way we thought about dashboards entirely: our thinking moved from passive reporting toward operational intelligence.

How were real-time trading systems typically assembled before 3forge?

Most real-time trading environments I encountered were assembled as collections of disconnected systems rather than a unified operational platform. Different teams would build their own reporting utilities, dashboards, scripts, and monitoring layers around individual applications or servers.

Operationally, this created duplication and fragmentation. Multiple RDP sessions, isolated reporting tools, separate databases, spreadsheets, custom scripts, and standalone monitoring utilities were common. Real-time systems often became tightly coupled to specific workflows or individuals, so troubleshooting required jumping across multiple systems and depended heavily on manual coordination.

Another major friction point was delayed visibility. Many important summaries and exchange-level reports were only available post-market or end-of-day. By the time anomalies, losses, rejections, or execution issues became visible, the chance to react proactively had already passed.

There was also significant complexity in integrating live market feeds, historical datasets, analytics engines, visualization layers, and operational workflows. In many environments these evolved independently, leading to duplicated logic, repeated data movement, and infrastructure overhead. What stood out with 3forge was the ability to move toward a centralized, event-driven model where monitoring, analytics, workflows, and live operational state could coexist in a far more integrated way.

How did your perception of the platform change over time?

Initially, I viewed 3forge primarily as a reporting and dashboarding layer, essentially a way to visualize real-time PnL summaries and operational metrics more cleanly. As we scaled usage in production, the more important architectural aspects became visible.

One of the first realizations was that in HFT environments the dashboard itself is actually the easy part. The real challenge is the movement and synchronization of data across a large distributed infrastructure. Pulling operational data continuously from hundreds of servers through polling creates network overhead, scaling challenges, latency, and operational complexity.

What became increasingly important was the way 3forge handled real-time state propagation and delta-based processing. Instead of repeatedly transferring complete datasets, only incremental changes moved across the network. In large environments that difference matters: it reduces unnecessary traffic while maintaining continuous visibility.
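As a rough illustration of the delta idea described above, the following Python sketch compares two snapshots of a keyed table and emits only the rows and columns that changed. It is a generic example under stated assumptions, not a description of how 3forge implements state propagation: the names `compute_delta` and `apply_delta`, the snapshot layout, and the sample strategy ids are all hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
Snapshot = Dict[str, Row]  # row key (e.g. a strategy id) -> column values


def compute_delta(previous: Snapshot, current: Snapshot) -> Dict[str, Any]:
    """Return only the rows and columns that differ between two snapshots.

    Assumption: every row keeps the same set of columns, so dropped
    columns are not tracked.
    """
    inserted = {key: dict(row) for key, row in current.items() if key not in previous}
    deleted: List[str] = [key for key in previous if key not in current]
    updated: Dict[str, Row] = {}
    for key in current.keys() & previous.keys():
        old_row = previous[key]
        changed = {col: val for col, val in current[key].items()
                   if col not in old_row or old_row[col] != val}
        if changed:
            updated[key] = changed
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


def apply_delta(state: Snapshot, delta: Dict[str, Any]) -> Snapshot:
    """Rebuild the receiver's view from its old state plus one delta."""
    new_state = {key: dict(row) for key, row in state.items()}
    for key in delta["deleted"]:
        new_state.pop(key, None)
    for key, row in delta["inserted"].items():
        new_state[key] = dict(row)
    for key, columns in delta["updated"].items():
        new_state[key].update(columns)
    return new_state


previous = {"ALGO1": {"pnl": 100.0, "pos": 10}, "ALGO2": {"pnl": -5.0, "pos": 0}}
current = {"ALGO1": {"pnl": 120.0, "pos": 10}, "ALGO3": {"pnl": 0.0, "pos": 1}}

delta = compute_delta(previous, current)
# Only the changed pnl column travels for ALGO1; its unchanged pos does not.
rebuilt = apply_delta(previous, delta)
```

The receiver ends up with the same state as the sender, while the unchanged `pos` value for `ALGO1` never crosses the network. Across hundreds of servers and mostly static rows, that is the saving the answer above refers to.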

The modularity of the platform also stood out. As we moved beyond initial dashboards, it became clear the system could support broader workflows where different components evolved independently while staying operationally connected. And the ease of use mattered: teams with basic technical understanding could build useful operational views without long development cycles, which lowered the barrier between infrastructure teams, operational teams, and business users. Over time, my understanding shifted from seeing 3forge as a visualization platform to seeing it as a real-time operational framework.

What does "going beyond the defaults" mean in HFT environments?

In HFT, reaction time is critical, so the architecture cannot be built around traditional polling-based models. Polling introduces delay, unnecessary network traffic, and scaling overhead, and by the time data is fetched and processed the opportunity to react may already be lost.

One limitation of many default deployments is the assumption that systems scale comfortably in monolithic environments. In practice, monolithic architectures quickly become bottlenecks, because every additional dependency introduces contention, delay, and recovery complexity. For us, modular deployment became extremely important: decoupling ingestion, workflows, visualization, and analytics reduced both infrastructure stress and operational fragility while keeping state synchronized.

Another challenge is data-flow reliability. Even when latency is low, the receiving layer must ingest bursts of updates continuously without dropping information.

And visibility alone is not enough. Even if humans can see real-time changes instantly, human reaction time becomes the bottleneck. The next evolution is event-driven workflows, where systems subscribe directly to trade data, operational events, and market-state changes and trigger actions automatically. That is where the architecture shifts from dashboards and monitoring into real-time operational orchestration.

What role does modularity play, especially around Center, Web, and Relay?

In trading infrastructure, modularity is not just a design preference; it is an operational necessity. Even when components run on the same physical system, keeping responsibilities separated makes the environment easier to scale, troubleshoot, and evolve.

The way I think about it is simple: Relay should handle ingestion, Center instances should process and maintain data, and Web should focus on presentation and user interaction. We used relays to fetch data from Kafka and conditionally forward it to the relevant consuming instances, which avoided unnecessary data movement and let downstream modules subscribe only to what they needed.
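The conditional-forwarding pattern can be sketched in a few lines of Python. This is a minimal stand-in under stated assumptions: `RelaySketch`, the consumer names, and the message fields are hypothetical, a plain list replaces the Kafka consumer, and none of it reflects the actual 3forge Relay API.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List, Tuple

Message = Dict[str, Any]
Predicate = Callable[[Message], bool]


class RelaySketch:
    """Hypothetical router: forwards a message only to consumers whose filter matches."""

    def __init__(self) -> None:
        self._routes: List[Tuple[str, Predicate]] = []
        # In-memory stand-in for delivery to downstream instances.
        self.delivered: DefaultDict[str, List[Message]] = defaultdict(list)

    def subscribe(self, consumer: str, predicate: Predicate) -> None:
        """Register a consumer together with the condition it cares about."""
        self._routes.append((consumer, predicate))

    def forward(self, message: Message) -> List[str]:
        """Deliver the message to matching consumers and return their names."""
        targets = [name for name, matches in self._routes if matches(message)]
        for name in targets:
            self.delivered[name].append(message)
        return targets


relay = RelaySketch()
relay.subscribe("pnl_instance", lambda m: m.get("type") == "trade")
relay.subscribe("alert_instance", lambda m: m.get("type") == "reject")

# Stand-in for messages read from a Kafka topic.
incoming = [
    {"type": "trade", "order_id": "O1", "qty": 50},
    {"type": "reject", "order_id": "O2", "reason": "price band"},
    {"type": "heartbeat"},
]
routing = [relay.forward(message) for message in incoming]
```

Each downstream instance receives only the traffic it asked for, and the heartbeat is forwarded to no one. Filtering at the ingestion edge rather than in every consumer is what keeps unnecessary data off the network.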

When responsibilities are merged into one monolithic setup, even small changes such as schema modifications can require restarts or cause wider disruption. With modularity, each component does its part independently: one module processes trade data, another calculates PnL, another handles alerts or subscriptions, while the Web layer collates and displays the outputs. That lets calculations and workflows run in parallel instead of forcing every stage to wait for the previous one.

Processed data from one module can also become the source for another, creating a chain of real-time workflows where different instances operate simultaneously, each adding value without blocking the system. That is where decoupling creates real operational value: better resilience, less unnecessary dependency, parallel processing, and the ability to scale or modify specific parts independently.

What makes real-time data architecture hard in low-latency environments?

It is difficult because the system has to balance speed, reliability, and continuous state accuracy at the same time. In HFT, data does not arrive evenly. There are ingestion spikes, bursts of events, and periods where the receiving layer must keep queues clear continuously. If queues start building up, the system may still be running, but operationally it has already started falling behind the market.
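A small simulation makes the "running but already behind" condition concrete. The arrival numbers and drain rate below are invented for illustration, and `simulate_backlog` is a hypothetical helper rather than part of any trading system.

```python
from typing import List


def simulate_backlog(arrivals: List[int], drain_per_tick: int) -> List[int]:
    """Return the queue depth at the end of each tick.

    Each tick, `arrived` events are enqueued and up to `drain_per_tick`
    events are processed; depth never goes below zero.
    """
    depth = 0
    depths = []
    for arrived in arrivals:
        depth = max(0, depth + arrived - drain_per_tick)
        depths.append(depth)
    return depths


DRAIN_PER_TICK = 400  # hypothetical processing capacity per tick
arrivals = [100, 100, 900, 900, 100, 100, 100]  # quiet, burst, quiet

depths = simulate_backlog(arrivals, DRAIN_PER_TICK)
average_arrival = sum(arrivals) / len(arrivals)
worst_lag_ticks = max(depths) / DRAIN_PER_TICK

print(depths)            # [0, 0, 500, 1000, 700, 400, 100]
print(worst_lag_ticks)   # 2.5
```

The average arrival rate (about 329 events per tick) is below capacity, so the process never stops and looks healthy on a throughput chart. Yet after the burst the consumer is working on data up to 2.5 ticks old, which is why queue depth, not just uptime, is the metric to watch.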

The biggest challenges are ingestion spikes, latency versus reliability, and the convergence of live and historical data. Live systems cannot wait: order rejections, for example, need to be visible and acted on immediately, or the business impact has already occurred. Historical systems have a different nature; they can batch process, replay, aggregate, and analyze patterns over longer periods. In our case, viewing market standing and operational behavior over a rolling six-month period became valuable for analysis and strategy improvement.

The challenge grows when teams combine live feeds, historical data, analytics, visualization, and workflows in the same architecture. Live systems need real-time resources, fast processing, and immediate response, while historical systems prioritize storage, aggregation, and replayability. In live environments, throughput, queue clearance, and workflow subscription response time all matter. It is not enough to display data; systems must subscribe to relevant events, process them quickly, and trigger responses without creating bottlenecks. That is why real-time trading architecture must be designed around event flow, buffering, state management, and response time from the beginning.

What trends are you seeing in India's trading technology ecosystem?

India's trading technology ecosystem is evolving very quickly, and expectations around real-time systems have changed significantly in the last few years. Earlier, many environments were assembled with scripts, isolated reporting tools, manual monitoring, and end-of-day summaries. Today, firms increasingly expect live visibility into orders, trades, rejections, PnL, risk, infrastructure health, and operational exceptions.

There is growing emphasis on automation, real-time monitoring, market-data handling, operational dashboards, and faster reaction to events. Algorithmic trading and infrastructure modernization are no longer niche; they are mainstream operational priorities. Another major trend is the move from dashboards as passive displays toward dashboards as operational control layers: teams want systems that subscribe to live data, detect anomalies, trigger actions, feed downstream systems, and reduce human reaction time.

At the same time, many firms still face friction from fragmented tools, people-dependent workflows, legacy systems, and delayed visibility. The opportunity now is to move toward modular, event-driven architectures where analytics, monitoring, workflows, and live data processing are connected without unnecessary duplication. The Indian market is entering a phase where firms no longer ask only, "Can we see the data?" They are asking, "Can the system understand what changed and react fast enough to matter?"

How are expectations shifting from dashboards to event-driven workflows?

The expectation is clearly moving beyond dashboards as passive visualization tools. A dashboard that shows PnL, trades, positions, orders, and exceptions in one place is still important, but in high-frequency environments visibility alone is no longer enough.

The real question is: what happens after the system detects a change? If an order is rejected, a strategy moves outside expected behavior, PnL deteriorates suddenly, or infrastructure health changes, the system should not simply display it. It should subscribe to the event, evaluate the condition, trigger workflows, notify the right people, and where appropriate feed downstream systems for further action. That is where trading infrastructure is evolving from dashboards into event-driven operational systems.

In our environment, real-time subscriptions and operational workflows became critical. We implemented Slack-based alerts and kill-switch mechanisms triggered by real-time monitoring conditions, to reduce dependency on manual observation and shorten reaction time. Human reaction time is a major factor: in fast markets, by the time someone notices an issue on a dashboard and reacts, the impact may already have occurred. Firms increasingly want platforms that do more than visualize, offering subscriptions, alerts, automated actions, integrations with external systems, and the ability to convert operational intelligence into immediate response.
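The subscribe, evaluate, and act loop described here can be outlined as a small rule engine. The sketch below is an assumption-laden illustration, not the setup used in the interviewee's environment: the threshold, the event fields, and the `notify` and `kill_strategy` functions are hypothetical, and both actions write to in-memory stand-ins instead of calling Slack or a real order gateway.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set

Event = Dict[str, Any]

alerts_sent: List[str] = []          # stand-in for a chat or webhook notifier
halted_strategies: Set[str] = set()  # stand-in for a real kill switch


def notify(event: Event) -> None:
    """Record an alert instead of posting to a chat channel."""
    alerts_sent.append(f"{event['type']} on {event['strategy']}")


def kill_strategy(event: Event) -> None:
    """Mark the strategy as halted instead of cancelling live orders."""
    halted_strategies.add(event["strategy"])


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], None]


MAX_LOSS = -50_000.0  # hypothetical per-strategy loss limit

RULES = [
    Rule("reject_alert", lambda e: e["type"] == "reject", notify),
    Rule("loss_kill_switch",
         lambda e: e["type"] == "pnl" and e["value"] < MAX_LOSS,
         kill_strategy),
]


def on_event(event: Event, rules: List[Rule] = RULES) -> List[str]:
    """Evaluate every rule against one event and run the matching actions."""
    fired = []
    for rule in rules:
        if rule.condition(event):
            rule.action(event)
            fired.append(rule.name)
    return fired


first = on_event({"type": "reject", "strategy": "ALGO7", "reason": "price band"})
second = on_event({"type": "pnl", "strategy": "ALGO7", "value": -62_000.0})
third = on_event({"type": "pnl", "strategy": "ALGO9", "value": -100.0})
```

No person has to be watching a screen for the second event to halt `ALGO7`. Keeping conditions and actions as separate callables also means a rule can start as alert-only and be promoted to an automated action once the team trusts it.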

How has 3forge itself evolved as you have spent more time with it?

Initially I viewed 3forge primarily as a reporting and visualization layer, and our first use case was operational consolidation: bringing trades, PnLs, losses, rejections, and monitoring into a unified real-time view. Over time my understanding changed significantly.

One of the first major realizations was that 3forge, combined with custom workflows and integrations, could evolve into a much broader operational architecture rather than just a dashboarding system. As we explored deeper, capabilities like in-memory tabular data handling, historical storage, real-time subscriptions, triggers, and AMIScript workflows became increasingly important. We started building operational logic around the platform instead of just displaying information on top of it.

The evolution was gradual. Initially we used external Python processes to poll data, run calculations, and push processed outputs back in to support workflows beyond the default capabilities at the time, such as strategy-specific PnL and custom operational analytics. Later we developed custom AMI modules in Java for tighter, more native integrations. That changed our perception of the platform from a visualization tool into a flexible real-time operational framework.

Another major shift was the subscription-driven model: instead of continuously polling for state changes, systems could subscribe to relevant updates and react in real time. That was especially important in workflows such as monitoring and squaring off open positions, where reaction time directly matters. What became strategically valuable was the openness of the architecture: integrating custom processing layers, combining historical and real-time data, building triggers and workflows, and evolving the system progressively without redesigning everything. Over time, my perception shifted from a reporting platform to a real-time operational ecosystem that supports analytics, workflows, monitoring, subscriptions, automation, and custom integrations together.

What advice would you give teams implementing 3forge for demanding environments?

Start by mapping the complete lifecycle of an event through the trading system: how data moves from ingestion to workflows, analytics, storage, and finally to dashboards or automated actions. In trading, a single ingested order can trigger multiple independent workflows at once. The same packet may need to update pending order books, order history, trade books, reject books, cancel books, net positions, and order-analysis modules simultaneously. Later, netbook calculations may depend on tradebook data, while real-time prices may need to be fetched separately from market-data systems.

The important architectural decision is identifying which workflows are sequential and which can run in parallel. If processes can run simultaneously, forward packets to multiple instances immediately rather than forcing one long sequential chain. One of the biggest mistakes teams make is placing the complete flow inside the same processing instance, where the next order cannot move efficiently until the previous one finishes every stage, creating unnecessary latency and bottlenecks. Design instead around modular, parallel processing, where each instance has a specific responsibility and downstream modules subscribe only to the data they need.
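The split between parallel and sequential stages can be shown with a short Python sketch. It is only a structural illustration under stated assumptions: the handler names and order fields are hypothetical, and a thread pool stands in for what the interview describes as separate processing instances.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict

Order = Dict[str, Any]


# Independent stages: none of these needs another stage's output.
def update_order_history(order: Order) -> str:
    return f"history:{order['id']}"


def update_pending_book(order: Order) -> str:
    return f"pending:{order['id']}"


def update_trade_book(order: Order) -> int:
    """Return the signed traded quantity (buys positive, sells negative)."""
    return order["qty"] if order["side"] == "BUY" else -order["qty"]


INDEPENDENT_HANDLERS = [update_order_history, update_pending_book, update_trade_book]


# Dependent stage: the net position needs the trade book result first.
def update_net_position(current_position: int, signed_qty: int) -> int:
    return current_position + signed_qty


def process_order(order: Order, position: int, pool: ThreadPoolExecutor) -> Dict[str, Any]:
    """Fan the order out to independent handlers, then run the dependent step."""
    futures = {h.__name__: pool.submit(h, order) for h in INDEPENDENT_HANDLERS}
    results = {name: future.result() for name, future in futures.items()}
    results["net_position"] = update_net_position(position, results["update_trade_book"])
    return results


with ThreadPoolExecutor(max_workers=3) as pool:
    outcome = process_order({"id": "O1", "side": "SELL", "qty": 50}, position=200, pool=pool)
```

Only the net position waits on another stage; the other books are updated side by side. Drawing this dependency graph first shows which stages can be moved to their own instances without changing results.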

Data design matters too. Teams often store data exactly as received without evaluating field cardinality. Because 3forge uses a columnar storage model, high-cardinality fields can create unnecessary memory pressure in live environments. Historical systems are easier to optimize because data can be compressed or reorganized over time, but live systems need careful memory planning from the start.
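One way to evaluate field cardinality before loading data is to measure the ratio of distinct values per column. The sketch below pairs that with generic dictionary encoding to show why the ratio matters. It does not model 3forge's internal storage; `cardinality_report`, `dictionary_encode`, and the sample rows are hypothetical.

```python
from typing import Any, Dict, Hashable, List, Tuple


def cardinality_report(rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return distinct-values / total-rows for each column (1.0 = all unique)."""
    if not rows:
        return {}
    total = len(rows)
    return {column: len({row[column] for row in rows}) / total for column in rows[0]}


def dictionary_encode(values: List[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Replace each value with a small integer code into a shared dictionary."""
    dictionary: List[Hashable] = []
    index: Dict[Hashable, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


rows = [
    {"order_id": "O1", "exchange": "NSE"},
    {"order_id": "O2", "exchange": "NSE"},
    {"order_id": "O3", "exchange": "BSE"},
    {"order_id": "O4", "exchange": "NSE"},
]

report = cardinality_report(rows)
exchange_dictionary, exchange_codes = dictionary_encode([r["exchange"] for r in rows])
order_dictionary, _ = dictionary_encode([r["order_id"] for r in rows])
```

The `exchange` column collapses to a two-entry dictionary however many rows arrive, whereas the `order_id` dictionary grows with every row. In a column-oriented layout, fields like the latter are the ones worth questioning before they reach a live table.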

I would also strongly recommend thinking beyond dashboards early in the design process. The real advantage comes from identifying workflows that can be automated with subscriptions, triggers, and custom logic. The biggest mindset shift is understanding that dashboards are only one layer of the architecture; the actual competitive advantage comes from reducing operational reaction time, distributing workflows intelligently, and converting real-time visibility into real-time action. A phrase I keep coming back to is: "What next... why not better?"
