Most AI security tools inspect messages. Arc Gate inspects sessions.

June 22, 2026



Most tools that monitor an AI in real time scan one message at a time, like checking each text in isolation. As /u/Turbulent-Tap6723 points out on Reddit, that is not how many prompt injection attacks actually work. Attackers spread subtle nudges across multiple messages, emails, and web pages; nothing looks suspicious alone, but together they lead the AI astray. Arc Gate instead tracks the entire session, including how the context shifts and where each piece of content came from, since a prompt and a tool output should not carry the same weight. The aim is to stop agents from acting on hidden instructions buried in untrusted data. /u/Turbulent-Tap6723 asks whether this is the right direction, or whether existing methods already cover this ground.

One thing that’s always felt weird to me about prompt injection defenses is that they usually evaluate one message at a time.

But a lot of the attacks I’m seeing don’t really work that way.

A webpage says something subtle. A tool result reinforces it. An email adds another nudge. Nothing looks obviously malicious on its own, but a few turns later the agent is heading somewhere it definitely shouldn’t.

That was the motivation behind Arc Gate.

Instead of looking at each message in isolation, it keeps track of what’s happening across the entire session. It also treats different sources differently. A system prompt, a user message, a webpage, and a tool output shouldn’t all have the same authority just because they ended up in the same context window.
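
To make the idea concrete, here is a minimal sketch of session-level scoring with per-source trust. It is not Arc Gate's actual code or API; the `Source` enum, the `TRUST` values, the `Session` class, and the `suspicion` score (assumed to come from some single-message detector) are all hypothetical names and numbers picked for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Source(Enum):
    SYSTEM = "system"
    USER = "user"
    WEBPAGE = "webpage"
    TOOL = "tool"


# Hypothetical trust levels: higher means more authority to issue instructions.
TRUST = {
    Source.SYSTEM: 1.0,
    Source.USER: 0.8,
    Source.TOOL: 0.3,
    Source.WEBPAGE: 0.1,
}


@dataclass
class Session:
    threshold: float = 1.0
    risk: float = 0.0
    history: list = field(default_factory=list)

    def observe(self, source: Source, suspicion: float) -> bool:
        """Record one message; return True once the whole session looks risky.

        `suspicion` is a per-message score in [0, 1] from any single-message
        detector (assumed to exist; not shown here).
        """
        # Low-trust sources contribute more of their suspicion to session risk.
        self.risk += suspicion * (1.0 - TRUST[source])
        self.history.append((source, suspicion))
        return self.risk >= self.threshold


# Three mildly suspicious messages, none of which would trip a
# per-message cutoff of, say, 0.5 on its own.
session = Session(threshold=0.6)
flags = [
    session.observe(Source.WEBPAGE, 0.3),
    session.observe(Source.TOOL, 0.3),
    session.observe(Source.WEBPAGE, 0.3),
]
print(flags)  # [False, False, True]
```

The point of the sketch is only that risk accumulates across turns and is weighted by where the content came from. A real system would probably also need decay, topic-shift detection, and a better scoring function than a running sum.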

The goal isn’t just to catch bad prompts. It’s to stop agents from taking actions based on instructions hidden inside untrusted data.
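
A rough sketch of what gating actions (rather than prompts) could look like, assuming the agent can attribute each proposed action to the source of the instruction behind it. That attribution is the hard part and is simply assumed here; `ProposedAction`, `AUTHORITATIVE`, and `allow` are hypothetical names, not taken from the repo.

```python
from dataclasses import dataclass

# Hypothetical set of sources allowed to authorize side-effecting actions.
AUTHORITATIVE = {"system", "user"}


@dataclass(frozen=True)
class ProposedAction:
    name: str           # e.g. "send_email"
    origin: str         # source of the instruction that motivated the action
    side_effects: bool  # True if it changes state outside the agent


def allow(action: ProposedAction) -> bool:
    """Permit read-only actions freely; otherwise require an authoritative origin."""
    if not action.side_effects:
        return True
    return action.origin in AUTHORITATIVE


# An instruction buried in a webpage cannot trigger an email...
print(allow(ProposedAction("send_email", "webpage", True)))  # False
# ...but the same action requested by the user goes through.
print(allow(ProposedAction("send_email", "user", True)))     # True
```

Under this framing, untrusted data can still inform what the agent knows, but it cannot authorize what the agent does.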

I’m curious whether other people building agents think this is the right direction, or if I’m overthinking a problem that existing approaches already solve.

Repo: https://github.com/9hannahnine-jpg/arc-gate

submitted by /u/Turbulent-Tap6723
