MosaicLeaks: Detecting Privacy Leaks in Research Agents

Agents & MCP

June 19, 2026 6 min read

Curated by Oleksandr Kuzmenko, AI Product Engineer · Updated June 19, 2026 · Sources cited on every story

AI-assisted · editor-reviewed

Researchers found that agents often leak internal data by interleaving private context into public web search queries. The new PA-DR training method reduces this leakage by penalizing queries that reveal proprietary fragments.

Impact: Medium

Why it matters

If your agent architecture combines local document retrieval with web search, it may be leaking snippets of your private data to search providers.

TL;DR

01 Giving an agent specific local facts while it searches the web makes data leakage more likely.
02 Simple system prompts to 'be private' do not provide consistent protection and can reduce task success.
03 PA-DR training balances task performance with privacy protection by evaluating each query individually.

Key facts

Original strict chain success: 48.7%
PA-DR strict chain success: 58.7%
Original leakage rate: 34.0%
PA-DR leakage rate: 9.9%

The Mechanism of Mosaic Leakage

The core issue is the interleaving of local knowledge with external search. A search query such as 'MediConn 70% migration' may look innocuous on its own, but when it is aggregated with earlier queries about security disclosures, an adversary can deduce internal business facts. Researchers identified three levels of leakage: Intent (inferring research goals), Answer (answering private questions), and Full-Information (stating true private facts without prior context).
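The aggregation effect can be illustrated with a minimal sketch. The fragment list, the function names, and the first query below are hypothetical and are not taken from the paper; the point is only that each query exposes part of a private fact while the session as a whole exposes all of it.

```python
from __future__ import annotations

# Hypothetical private fragments; a real audit would derive these from local documents.
PRIVATE_FRAGMENTS = ("mediconn", "70%", "migration", "security disclosure")


def fragments_in(query: str) -> set[str]:
    """Return the private fragments that appear verbatim in one query."""
    text = query.lower()
    return {f for f in PRIVATE_FRAGMENTS if f in text}


def mosaic_exposure(queries: list[str]) -> set[str]:
    """Return the union of fragments exposed across a whole search session."""
    exposed: set[str] = set()
    for q in queries:
        exposed |= fragments_in(q)
    return exposed


session = [
    "MediConn security disclosure timeline",  # hypothetical earlier query
    "MediConn 70% migration",                 # query quoted in the article
]

per_query = [fragments_in(q) for q in session]
combined = mosaic_exposure(session)

for q, hits in zip(session, per_query):
    print(f"{q!r}: {len(hits)} of {len(PRIVATE_FRAGMENTS)} fragments")
print(f"session total: {len(combined)} of {len(PRIVATE_FRAGMENTS)} fragments")
```

Simple substring matching is used here only for illustration; it would miss paraphrases, which is one reason per-query keyword checks alone are unlikely to catch mosaic leaks.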

Challenges in Mitigation

Simply prompting the agent to 'not leak data' is insufficient. For models like Qwen3-4B, this intervention often decreases task performance (strict chain success dropping from 48.7% to 44.5%) without providing consistent protection. Training for higher task performance actually worsens the problem; models learn to include more specific context in search queries to improve retrieval results, creating richer 'leakage' paths.

The PA-DR Solution

Privacy-Aware Deep Research (PA-DR) uses a granular reward system. It assigns a privacy cost to each search query at the moment the agent plans it. If a query is estimated to leak information directly or to contribute to a mosaic leak, the model is penalized. This lets the model balance retrieval quality against data safety.
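As a rough sketch of what a per-query privacy penalty could look like, the snippet below subtracts a cost from a task reward for every query that exposes private fragments. The article does not give the actual PA-DR reward formula, so the weights, the fragment list, and the function names are all assumptions made for illustration.

```python
from __future__ import annotations

# Hypothetical fragments of one private fact; not taken from the paper.
PRIVATE_FRAGMENTS = ("mediconn", "70%", "migration")


def query_cost(query: str, seen: set[str],
               direct_weight: float = 1.0,
               mosaic_weight: float = 0.5) -> tuple[float, set[str]]:
    """Return (privacy cost of this query, fragments exposed so far)."""
    text = query.lower()
    hits = {f for f in PRIVATE_FRAGMENTS if f in text}
    # Direct leak (assumed definition): the query alone carries every fragment.
    direct = 1.0 if len(hits) == len(PRIVATE_FRAGMENTS) else 0.0
    # Mosaic contribution (assumed definition): share of fragments newly exposed.
    mosaic = len(hits - seen) / len(PRIVATE_FRAGMENTS)
    return direct_weight * direct + mosaic_weight * mosaic, seen | hits


def trajectory_reward(task_reward: float, queries: list[str],
                      penalty: float = 1.0) -> float:
    """Task reward minus the summed per-query privacy costs."""
    seen: set[str] = set()
    total_cost = 0.0
    for q in queries:
        cost, seen = query_cost(q, seen)
        total_cost += cost
    return task_reward - penalty * total_cost


safe = trajectory_reward(1.0, ["typical timeline for cloud rollouts"])
leaky = trajectory_reward(1.0, ["MediConn 70% migration"])
print(safe, leaky)
```

Charging the cost per query, rather than once per finished task, is what gives the model a signal about which specific query was the problem.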

✓ When to use

For agents performing deep research on proprietary internal documents.
In enterprise systems requiring strict data isolation.

What to do today

Audit the outbound search queries of your research agents for sensitive entity references.
Limit the amount of context passed to web search functions in agent tools.
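Both steps can be prototyped with a thin wrapper around the agent's search tool. This is a minimal sketch: `SENSITIVE_TERMS`, `guarded_search`, and the `search_fn` callable are hypothetical names, and a production audit would need a maintained entity list and more than exact-match detection.

```python
import re

# Hypothetical denylist of sensitive entities; replace with your own inventory.
SENSITIVE_TERMS = ["MediConn", "Project Falcon"]

_PATTERN = re.compile(
    "|".join(re.escape(term) for term in SENSITIVE_TERMS), re.IGNORECASE
)


def audit_query(query: str) -> list[str]:
    """Return the sensitive terms (lowercased, sorted) found in a query."""
    return sorted({m.group(0).lower() for m in _PATTERN.finditer(query)})


def redact_query(query: str, placeholder: str = "[REDACTED]") -> str:
    """Replace each sensitive term with a placeholder before the query leaves."""
    return _PATTERN.sub(placeholder, query)


def guarded_search(query: str, search_fn, audit_log: list) -> object:
    """Log sensitive hits, then forward a redacted query to the real search tool."""
    hits = audit_query(query)
    if hits:
        audit_log.append({"query": query, "hits": hits})
        query = redact_query(query)
    return search_fn(query)


log: list = []
# The lambda stands in for a real web search call and just echoes its input.
sent = guarded_search("mediconn 70% migration", lambda q: q, log)
print(sent)
print(log)
```

Redaction can hurt retrieval quality, so running the wrapper in log-only mode first is a reasonable way to measure how often sensitive entities actually reach the search provider.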

#Qwen3-4B

Sources

MosaicLeaks: Can your research agent keep a secret?
