Large language models (LLMs) are powerful, but their outputs can be unpredictable. Even small inconsistencies or ambiguous values can silently corrupt downstream systems, creating hidden errors that are difficult to detect.
The Adaptive Anti-Corruption Layer (AACL) addresses this challenge with a two-layer boundary that normalises LLM outputs and enforces type-safe processing. By providing structured correction signals, AACL enables real-time self-correction, allowing probabilistic LLMs to integrate reliably with deterministic business logic.
Problem solved: LLM-produced format hallucinations
The Synpulse8 team formalised the AACL design pattern to integrate probabilistic LLM agents with deterministic systems via a self-healing mechanism. A version of this pattern is also documented in the project's README on GitHub.
The AACL provides a normalisation boundary that converts ambiguous LLM outputs into strictly typed inputs, returning structured correction signals that allow the model to self-correct at runtime. This eliminates silent format corruption and enhances reliable agentic behaviour without model retraining.
| Design pattern | Adaptive Anti-Corruption Layer (AACL) |
|---|---|
| Context | Integrating probabilistic LLM agents with deterministic systems |
| Problem | LLMs produce chaotic, non-type-safe outputs; direct integration causes silent format corruption |
| Solution | Two-layer architecture with a normalisation boundary that provides structured feedback |
| Result | Self-correcting system where structured feedback enables runtime error correction |
The reference implementation and adversarial tests can be found on GitHub.
The self-healing mechanism requires an agentic loop architecture where the LLM can receive tool execution feedback and retry with corrected inputs. Specifically, the system must support:
• Surfacing tool errors back to the model as messages in the conversation
• Allowing the model to re-invoke the tool with corrected arguments
Framework example:
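A minimal sketch of such a loop, with `call_model` as a hypothetical stand-in for any function-calling LLM client (not a real API):

```python
def call_model(messages):
    """Placeholder for your LLM provider's function-calling API (assumption)."""
    raise NotImplementedError("wire up an LLM client here")


def run_tool(tool, arguments, max_retries=3):
    """Invoke a tool; feed structured errors back so the model can retry."""
    messages = []
    for _ in range(max_retries):
        try:
            return tool(**arguments)          # interface layer validates here
        except ValueError as err:             # structured correction signal
            messages.append({"role": "tool", "content": str(err)})
            arguments = call_model(messages)  # model re-plans with feedback
    raise RuntimeError("tool call did not converge within the retry budget")
```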
If your system lacks an agentic loop (i.e., one-shot tool calls with no retry), the AACL pattern still provides value by preventing silent format corruption, but self-healing requires the retry mechanism.
Why it’s important
LLMs are semantic sequence models. They are not type-safe, schema-stable, or reliable data serialisers. Therefore, LLMs must provide values, while code must provide structure.
The correct architecture is a two-layer boundary separating free-form model output from deterministic business logic.
LLM (semantic planner)
↓
Interface Layer (normalisation + validation + structured errors)
↓
Implementation Layer (strict types, pure logic)
This boundary is where the system becomes self-correcting. The interface boundary is the only location where ambiguity is allowed to exist. Once execution passes into the implementation layer, ambiguity must be ZERO.
Structured output belongs in function results, not token streams. Use function calling to receive structured data from code, not to parse it from LLM-generated text. Need the JSON visible to users? Put the function result in your output queue - same place the response stream goes.
Common LLM output failures (non-exhaustive)
LLMs freely interchange:
• "true", "True", "yes", "1", True
• 5, "05", "five", "5"
• "null", "none", None, "n/a", ""
• "a.com b.com", "a.com,b.com", ["a.com", "b.com"]
• Dates in any human-readable permutation
Passing these directly to an API layer introduces silent format corruption, the worst class of system failure: it has some probability of “working”, and then it breaks for no apparent reason.
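As a concrete illustration (hypothetical values), a string can slip through where an integer is expected and appear to work:

```python
import json

# The model emits "5" (a string) where the schema expects the integer 5.
payload = json.loads('{"max_results": "5"}')

# Downstream arithmetic "works" - by repeating the string instead of doubling.
print(payload["max_results"] * 2)  # "55", not 10: silent corruption
```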
Architecture
This two-layer boundary is the core of the AACL. The LLM operates in a semantic space; the implementation layer operates in a typed deterministic space. The AACL is the boundary that translates between them through normalisation + structured failure signals.
1. Interface layer (LLM-facing)
Function: Convert arbitrary inputs into typed inputs.
Requirements:
• Accept union and ambiguous input types
• Normalise to canonical representations
• Validate according to strict schema expectations
• Return structured error messages when normalisation fails
This layer must be total:
Every input either normalises or fails with an LLM-usable correction signal.
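A minimal sketch of what "total" means here, using a hypothetical boolean normaliser:

```python
def normalize_bool(value):
    """Total: every input either normalises or raises a correction signal."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        s = value.strip().lower()
        if s in ("true", "1", "yes", "on"):
            return True
        if s in ("false", "0", "no", "off"):
            return False
    raise ValueError(
        f"INVALID_BOOLEAN: expected a true/false value; received {value!r}. "
        "Retry with a boolean."
    )
```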
2. Implementation layer (logic-facing)
Function: Perform business operations with strict typing.
• No normalisation
• No LLM-awareness
• No ambiguity handling
• Pure deterministic execution
If incorrect values reach this layer, the architecture is wrong.
Minimal example (Python)
The pattern uses a two-file structure. See the full reference implementation on GitHub.
Interface layer (LLM-facing)
```python
from fastmcp import FastMCP

from .implementations.tavily_impl import tavily_search_impl

mcp = FastMCP("My MCP Server")


@mcp.tool()
def search_web(
    query,
    search_depth="basic",
    max_results=5,
    include_domains=None,
    time_range=None,
) -> dict:
    """
    Search the web using Tavily's search API.

    Args:
        query: Search query (required)
        search_depth: "basic" or "advanced" (Optional, defaults to "basic")
        max_results: Number of results, 1-10 (Optional, defaults to 5)
        include_domains: Domain filter (comma/space-separated or list) (Optional)
        time_range: Time filter ("day", "week", "month", "year") (Optional)

    Returns:
        Dictionary containing search results

    Raises:
        ValueError: With structured correction signals for invalid inputs
    """
    # Normalize ambiguous inputs to canonical forms
    search_depth = _normalize_search_depth(search_depth)
    include_domains = _normalize_domains(include_domains)
    time_range = _normalize_optional_string(time_range)

    # Validate with structured error messages
    try:
        max_results = int(max_results)
    except (ValueError, TypeError):
        raise ValueError(
            "TYPE_ERROR: field 'max_results' must be an integer between 1 and 10; "
            f"received {max_results!r}. Retry with a valid integer."
        )
    if time_range and time_range not in ("day", "week", "month", "year"):
        raise ValueError(
            "INVALID_TIME_RANGE: expected one of ['day', 'week', 'month', 'year']; "
            f"received '{time_range}'. Retry with a valid value."
        )

    # Pass typed, normalized inputs to implementation
    return tavily_search_impl(
        query=query,
        search_depth=search_depth,
        max_results=max_results,
        include_domains=include_domains,
        time_range=time_range,
    )


def _normalize_optional_string(value):
    """Normalize null-like values to None."""
    if value is None:
        return None
    if isinstance(value, str):
        s = value.strip().lower()
        if s in ("", "null", "none", "n/a", "na"):
            return None
    return value


def _normalize_domains(value):
    """Normalize a comma/space-separated string or a list to list[str] | None."""
    value = _normalize_optional_string(value)
    if value is None:
        return None
    if isinstance(value, str):
        return [d for d in value.replace(",", " ").split() if d]
    return [str(d).strip() for d in value]


def _normalize_search_depth(depth):
    """Normalize search depth to 'basic' or 'advanced'."""
    if not depth:
        return "basic"
    d = str(depth).strip().lower()
    if d in ("advanced", "deep", "thorough"):
        return "advanced"
    return "basic"
```
Implementation layer (logic-facing)

```python
import os

from tavily import TavilyClient


def tavily_search_impl(
    query: str,
    search_depth: str,
    max_results: int,
    include_domains: list[str] | None,
    time_range: str | None,
) -> dict:
    """
    Pure implementation - expects strictly typed, normalized inputs.
    No validation or normalization should happen here.
    """
    client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
    params = {
        "query": query,
        "search_depth": search_depth,
        "max_results": max_results,
    }
    if include_domains:
        params["include_domains"] = include_domains
    if time_range:
        params["time_range"] = time_range
    return client.search(**params)
```
Why this is self-healing
The loop:
1. LLM emits a parameter in an arbitrary format.
2. Interface layer attempts normalisation.
3. If normalisation succeeds → call implementation logic.
4. If normalisation fails → return a structured correction signal.
5. LLM re-plans and retries (ReAct pattern, no human involvement).
This produces adaptive convergence: The system self-heals at runtime by guiding the LLM to correct inputs without human supervision.
Why you never let the LLM produce JSON
This is not about syntax errors. It is about responsibility boundaries.
JSON is a deterministic serialisation format. LLMs are probabilistic sequence models.
If the LLM is responsible for producing formatted JSON, you guarantee:
• Silent type drift (“5” instead of 5)
• Mixed boolean encodings (“true” vs true vs “yes”)
• Key-order instability (breaks hashing, caching, diff-based tests)
• Schema drift over iterative refinement
• Random breakage triggered by prompt context state
These are not mistakes. They are the statistical nature of token generation. Once the model is permitted to define structure, the structure becomes non-deterministic.
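A short demonstration of the key-order problem (illustrative strings, not real model output):

```python
import hashlib
import json

# Two token streams a model might emit for the "same" object.
gen1 = '{"a": 1, "b": true}'
gen2 = '{"b": true, "a": 1}'

# Raw-text hashing, caching, or diffing breaks even though the data is equal.
print(hashlib.sha256(gen1.encode()).hexdigest() ==
      hashlib.sha256(gen2.encode()).hexdigest())  # False

# Code-constructed JSON is stable across calls.
print(json.dumps({"a": 1, "b": True}, sort_keys=True))  # always identical
```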
Applying the same two-layer architecture
LLM → untyped values
Interface Layer → normalisation + schema enforcement
Implementation Layer → constructs JSON deterministically
The model never formats JSON.
Need the structured data visible in the user interface? That's fine - your function returns it, and your application layer displays it. The point is the LLM doesn't generate the structure, your code does.
Example
See the full reference implementation on GitHub.
Interface layer (LLM-facing)
```python
from fastmcp import FastMCP

from .implementations.json_writer_impl import create_structured_data_impl

mcp = FastMCP("My MCP Server")


@mcp.tool()
def create_json(x, y, flag) -> dict:
    """
    Create and return structured JSON from LLM-provided values.

    Args:
        x: Integer value (accepts "5", "05", 5)
        y: String value
        flag: Boolean flag (accepts "true", "yes", "1", True, etc.)

    Returns:
        Dictionary with deterministic structure (ready for JSON serialization)

    Raises:
        ValueError: With structured correction signals for invalid inputs
    """
    # Normalize integer with structured error
    try:
        x = int(x)
    except (ValueError, TypeError):
        raise ValueError(
            f"TYPE_ERROR: field 'x' must be an integer; "
            f"received {x!r} (type: {type(x).__name__}). "
            "Retry with a valid integer value."
        )

    # Normalize string
    y = str(y)

    # Normalize boolean from various representations
    if isinstance(flag, str):
        flag = flag.strip().lower() in ("true", "1", "yes", "on")
    else:
        flag = bool(flag)

    # Pass typed inputs to implementation
    return create_structured_data_impl(x=x, y=y, flag=flag)
```
Implementation layer (logic-facing | strict, deterministic JSON)

```python
def create_structured_data_impl(x: int, y: str, flag: bool) -> dict:
    """
    Construct JSON structure deterministically from typed values.
    The LLM never generates JSON - it only provides values.
    Code defines keys, order, and types.
    """
    # Structure is defined by code, not LLM
    return {
        "x": x,
        "y": y,
        "flag": flag,
    }
```
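A brief usage sketch: because code owns the structure, serialisation is stable across calls:

```python
import json

data = create_structured_data_impl(x=5, y="hello", flag=True)
print(json.dumps(data))  # {"x": 5, "y": "hello", "flag": true} - stable keys and types
```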
Why This Works
| Responsibility | LLM | Interface Layer | Implementation Layer |
|---|---|---|---|
| Interpret Intent | Yes | No | No |
| Normalise Values | No | Yes | No |
| Enforce Schema | No | Yes | No |
| Construct Data Structures | No | No | Yes |
| Serialise Data | No | No | Yes |
Core principle
LLMs plan. Code types. Never let the model define structure. Always enforce structure at the boundary.
Summary
| Layer | Handles | Must Be | Failure Mode | Output |
|---|---|---|---|---|
| LLM | Semantics | Flexible | Format Hallucination | Unstructured Values |
| Interface Layer | Normalisation + Validation | Total / Deterministic | Structured Correction (Intentional Exception Raised) | Typed Inputs |
| Implementation Layer | Business Logic | Pure / Strict | Hard Failure (if reached incorrectly) | Stable Data / JSON / YAML |
The invariant: If the implementation layer sees garbage, the interface layer is incorrect.
This pattern is general and applies to every LLM-tooling integration, including MCP, ReAct, function-calling APIs, and agentic planning systems. This architecture is not a workaround for LLM weaknesses. The AACL is the correct separation of concerns for any system in which a probabilistic language generator interacts with deterministic software components.
Related Patterns
| Pattern | Relationship |
|---|---|
| DDD Anti-Corruption Layer | Conceptual ancestor — but assumes deterministic upstream domain |
| Adapter Pattern | Handles interface mismatch, but not semantic ambiguity |
| Retry with Backoff | Handles failure, but not interpretation |
| ReAct | Handles iterative convergence, but centres on LLM output rather than on deterministic code at the boundary |
The following sections are structured for pattern catalogue inclusion.
Applicability
Use the AACL pattern when:
• Integrating LLMs with deterministic APIs, databases, or business logic
• Building function-calling or tool-use systems
• Creating MCP servers or LLM integration points
• Generating structured data (JSON, YAML) from LLM outputs
Do not use when:
• Input is already strictly typed (traditional API)
• Format variation is acceptable downstream
• Only semantic correctness matters (content, not format)
Consequences
Benefits:
• Eliminates silent format corruption
• Enables self-healing via structured errors
• Clear separation of concerns
• Works with any LLM that supports function calling (no retraining)
• Composable with existing frameworks
Trade-offs:
• Requires two-layer architecture
• Normalisation adds (minimal) latency
• Interface must evolve with edge cases
• Does not solve content hallucinations
Known uses
• OpenAI/Anthropic function calling APIs
• MCP server implementations
• LangChain custom tools
• Agentic systems
• Any LLM-to-database/API boundary
Forces resolved
The pattern balances:
• Flexibility vs. Correctness: LLM freedom + type safety
• Fail-Fast vs. Teach: Structured errors guide correction
• When to Normalise vs. Validate: Intentional design choice per parameter
• Boundary Location: Interface handles ambiguity, implementation stays pure
The resolution: The interface layer is total (handles all inputs), while the implementation is pure (assumes correctness).
Author: Morgan Lee
Organisation: Synpulse8
First published: 12 November 2025