Source of Truth: docs/ARCHITECTURE.md
Status: Canonical repo cleanup aligned to the current architecture as of 2026-05-16. This post continues the core series from tool execution security seams into durable traces, reconstructable evidence, operational validation, and AI-assisted troubleshooting workflows.

VS MCP Bridge Blog Series: Part 7

Durable Evidence, Trace Workflows, and AI-Assisted Troubleshooting

Part 6 explained why BridgeToolExecutor is the consistent execution boundary for policy, approval, secret-reference handling, redaction, audit, classification, correlation, and tool invocation.

Part 7 moves from the boundary to the evidence around it. The bridge is not just trying to execute tools. It is trying to make tool execution reconstructable later, by a developer or by an AI session that was not present when the behavior happened.

That is the shift that changed this project: diagnostics stopped being an afterthought and became part of the architecture.

Why Durable Evidence Matters

AI-assisted development can move quickly, but fast progress is fragile if the only record of a decision lives in chat history or a local debugging session.

VS MCP Bridge now treats important validation runs as durable evidence. A useful run should leave behind enough material to answer:

what code version was observed
what workflow was exercised
which request id and operation id were used
which boundary handled the request
where the request succeeded, failed, or stopped
which logs support that conclusion
which Mermaid diagram matches the observed flow
what the next session should trust or revalidate

That evidence turns a one-time manual observation into something another person can replay, inspect, and challenge.

The Artifact Triad

The most useful pattern has become a small triad:

a log file under artifacts/logs/
a metadata file beside it, usually named with a .metadata.json suffix
a Mermaid sequence diagram under docs/diagrams/

The log captures what happened. The metadata captures the run context: branch, commit, host, request id, operation id, input summary, observed result, and scope exclusions. The Mermaid diagram explains the sequence in a form that can be reviewed without rereading every log line.

None of those artifacts replaces the others. The log is the observed evidence. The metadata is the index card. The Mermaid diagram is the map.
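
As a minimal sketch of the "index card" idea, the following Python helper writes a metadata sidecar next to a log file. The function name, the field names, and the rule that the sidecar shares the log's base name are illustrative assumptions drawn from the list of run context above, not the repo's actual schema or tooling.

```python
import json
from pathlib import Path

# Hypothetical field names, taken from the run context listed above.
REQUIRED_FIELDS = {
    "branch", "commit", "host", "request_id", "operation_id",
    "input_summary", "observed_result", "scope_exclusions",
}


def write_run_metadata(log_path: Path, context: dict) -> Path:
    """Write a .metadata.json sidecar beside log_path and return its path."""
    missing = sorted(REQUIRED_FIELDS - context.keys())
    if missing:
        # Refuse to write an index card that cannot identify the run.
        raise ValueError(f"metadata is missing fields: {missing}")
    # Assumed naming rule: tool-trace.log -> tool-trace.metadata.json
    sidecar = log_path.with_name(log_path.stem + ".metadata.json")
    sidecar.write_text(
        json.dumps(context, indent=2, sort_keys=True), encoding="utf-8"
    )
    return sidecar
```

Rejecting incomplete metadata at write time is one possible design choice: a sidecar without a commit or request id cannot tie the log back to a specific run.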

Session Handoffs Complete The Record

For larger slices, the repo also keeps session handoffs under docs/session-handoffs/. These are not essays. They are resume points.

A good handoff records what was validated, what commit or branch was involved, what artifacts were produced, what constraints still apply, and what the next session should do first. That matters because future AI sessions should not reconstruct project state from memory or from a conversation transcript that may be incomplete.

The architecture document remains the source of truth for current behavior. The handoffs explain how the project arrived there and what evidence supports particular claims.

Trace Workflows Are Reproducible Procedures

The repo now has documented workflows for important validation paths:

tool-execution-trace-workflow.md explains how to validate the shared bridge tool catalog and executor path with correlated logs, audit metadata, redaction, policy, approval, and Mermaid output.
vsix-host-selected-text-trace-workflow.md explains how to validate the Visual Studio selected-text prompt path against a real editor selection.
LOGGING_DIAGNOSTIC_RUNBOOK.md explains how to localize hangs, keep MCP stdout clean, and collect the right UI, stderr, file, and correlation evidence.

The important detail is that these workflows are not just documentation written after the fact. They are part of the development method. When the system changes, the workflow can be rerun, the artifacts regenerated, and the diagram compared against the current code path.

Correlation Makes Replay Possible

Request and operation identifiers are what make trace replay practical.

Without correlation, logs become a loose pile of events. With correlation, a run can be reconstructed across layers: MCP request, pipe attempt, catalog lookup, policy decision, approval decision, tool execution, audit envelope, result, and visible host behavior.

The point is not to add identifiers for decoration. The point is to let a future reader find the first missing or failing boundary. If a request id appears at the MCP layer but never reaches the pipe client, the failure is different from one that reaches the VSIX host and fails during service execution.
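
The search for the first missing or failing boundary can be sketched in a few lines of Python. This assumes log lines have already been parsed into dictionaries with `request_id`, `boundary`, and `status` keys. Those keys, the boundary names, and the function name are hypothetical and only mirror the layer list above. They are not the bridge's real log format.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical boundary names, ordered like the layers described above.
BOUNDARIES = [
    "mcp_request", "pipe_attempt", "catalog_lookup", "policy_decision",
    "approval_decision", "tool_execution", "audit_envelope", "result",
]


def first_problem(
    events: Iterable[dict], request_id: str
) -> Optional[Tuple[str, str]]:
    """Return (boundary, "missing" | "failed") for the first bad boundary."""
    statuses = {}
    for event in events:
        # Only events carrying the request id belong to this run.
        if event.get("request_id") == request_id:
            statuses.setdefault(event["boundary"], set()).add(
                event.get("status", "ok")
            )
    for boundary in BOUNDARIES:
        if boundary not in statuses:
            return boundary, "missing"
        if "failed" in statuses[boundary]:
            return boundary, "failed"
    return None  # every boundary was reached without a recorded failure


events = [
    {"request_id": "req-1", "boundary": "mcp_request", "status": "ok"},
    {"request_id": "req-2", "boundary": "pipe_attempt", "status": "ok"},
]
print(first_problem(events, "req-1"))  # ('pipe_attempt', 'missing')
```

In the usage example, `req-1` appears at the MCP layer but never reaches the pipe, so the pipe attempt is reported as the first missing boundary even though another request did reach it.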

Diagnostics Must Stay Transport-Safe

Durable evidence only helps if it does not corrupt the transport it is trying to explain.

For MCP stdio, stdout must remain clean for JSON protocol traffic. Diagnostics belong in approved channels such as stderr, UI logs, file logs, Debug output, audit envelopes, and durable artifacts. The bridge follows this rule because the client reads everything on stdout as protocol messages, so a single stray log line can make a valid MCP server look broken.

This is why the activation diagnostics and pipe-failure diagnostics are emitted either as trace-only output or as structured tool failures. They should help the operator understand what to do without polluting the MCP protocol stream.
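
The stdout rule can be illustrated with a small Python sketch. It is not the bridge's implementation. The logger name and both function names are assumptions, and it shows only the general pattern: protocol messages go to stdout as one JSON document per line, and every diagnostic goes to stderr.

```python
import json
import logging
import sys


def configure_diagnostics() -> logging.Logger:
    """Return a logger that writes to stderr and never to stdout."""
    logger = logging.getLogger("bridge.diagnostics")  # hypothetical name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep root handlers from echoing records
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def send_message(message: dict, out=sys.stdout) -> None:
    """Write one protocol message as a single line of JSON."""
    out.write(json.dumps(message, separators=(",", ":")) + "\n")
    out.flush()


log = configure_diagnostics()
log.info("pipe connect attempt request_id=%s", "req-1")  # goes to stderr
send_message({"jsonrpc": "2.0", "id": 1, "result": {}})  # goes to stdout
```

Disabling propagation is deliberate in this sketch: it keeps a root handler configured elsewhere from receiving these records and writing them to a stream the transport depends on.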

Durable Evidence Improved The Architecture

The traces did more than prove that code worked. They changed the design.

Sequence diagrams and logs made it easier to see where responsibilities were blurred. That led to clearer boundaries around proposal management, host correctness, tool descriptors, catalog registration, executor-owned policy, approval-aware execution, redaction, audit metadata, MEF discovery, and VSIX activation diagnostics.

In other words, observability did not just describe the architecture. It shaped the architecture.

AI-Assisted Troubleshooting Uses Evidence First

This repo is being developed with AI assistance, so the evidence standard is practical: a future assistant should be able to inspect files in the repo and understand the current system without trusting prior chat history.

That means a good troubleshooting loop starts with durable artifacts:

read AI_START.md for the current resume map
read docs/ARCHITECTURE.md for current behavior
read the relevant workflow document for the validation path
inspect the matching log, metadata, and Mermaid files
compare the observed diagram against current code
treat the first missing or failed boundary as the next actionable problem

That process keeps the assistant grounded in repository evidence instead of inventing a story that sounds plausible.
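
Before reading a trace, a session can first confirm that the artifacts actually exist. The sketch below assumes that one run's log, metadata file, and diagram share a base name, which is a hypothetical convention used only for illustration. The directory names come from the artifact locations described earlier, and the function name is also an assumption.

```python
from pathlib import Path
from typing import List


def missing_artifacts(repo_root: Path, run_name: str) -> List[str]:
    """List the expected artifact files for a run that are not present."""
    expected = [
        repo_root / "artifacts" / "logs" / f"{run_name}.log",
        repo_root / "artifacts" / "logs" / f"{run_name}.metadata.json",
        repo_root / "docs" / "diagrams" / f"{run_name}.mmd",
    ]
    # Report repo-relative paths so the result reads the same on any machine.
    return [
        path.relative_to(repo_root).as_posix()
        for path in expected
        if not path.is_file()
    ]
```

An empty list means the log, metadata, and diagram are all available to inspect. Anything else names the evidence that has to be regenerated before the run can be trusted.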

Related Mermaid Trace Sources

The most useful diagram sources for this topic are:

tool-regex-search-trace-20260509.mmd for the compiled tool execution baseline.
tool-security-trace-20260509.mmd for policy, redaction, audit, and correlation around tool execution.
tool-approval-trace-20260516.mmd for approved and denied approval-required execution outcomes.
mef-discovery-trace-20260516.mmd for discovery-only MEF behavior feeding executor-routed tools.
vsix-activation-diagnostic-trace-20260516.mmd for inactive VSIX/named-pipe diagnostics and the operator activation path.
vsix-host-selected-text-trace-20260509.mmd for the Visual Studio selected-text prompt workflow.

Those .mmd files remain the diagram source of truth. Generated images can be useful later, but the source diagram should stay reviewable in the repo.

What This Does Not Mean

Durable traces are not a full observability platform. They are not telemetry ingestion, distributed tracing infrastructure, SIEM export, compliance storage, or production monitoring.

They are smaller and more immediate: checked-in evidence that the architecture can be understood and validated. That is enough for the current stage of the bridge.

Takeaway

VS MCP Bridge became easier to evolve when the team stopped treating diagnostics as cleanup work and started treating them as architecture.

The durable evidence pattern is simple: capture logs, preserve metadata, draw the observed sequence, and write a handoff when the result changes what future sessions should know. That pattern makes failures localizable, decisions reviewable, and AI-assisted troubleshooting much less dependent on memory.

Next In The Series

The next useful topic is how these evidence and architecture practices should shape the public BlogAI narrative: which posts should teach the transport boundary, which should teach tool execution, and which should teach the operational discipline that keeps AI-assisted systems explainable.

Adventures On The Edge

Keeping up with technologies