RAW Test Framework¶
The RAW test framework is the deterministic test harness for Repair Action Workflows. It exercises RAW decision branches without a live device, without RADKit, without an LLM in the headless path, and optionally without Webex.
A RAW is a decision tree. Every terminal leaf - resolution, escalation, failure, and denial paths where applicable - should have a test with known CLI or log output. That makes RAW changes reviewable in pull requests and practical to run in CI.
What the Framework Provides¶
Two runners share one bundle format:
| Runner | Purpose |
|---|---|
scripts/run_raw_tests.py |
Pure-Python headless runner for CI and local validation. It uses scripts/lib/raw_interpreter.py and does not use an LLM, Webex, or RADKit. |
network-troubleshooter test mode |
Agent runner for the full demo path, including Webex messages tagged [TEST], scripted approvals, troubleshooting bundle output, and parent-agent orchestration. |
Bundles are co-located with the artifacts they validate so RAW logic, fixtures, and expected outcomes can move through review together.
Bundle Layout¶
Use one test bundle per RAW:
intelligence-artifacts/
AD000004-example-fault/
FS000004-EXAMPLE_FAULT.yml
RAW000004-EXAMPLE_FAULT_REPAIR.yml
tests/
RAW000004-EXAMPLE_FAULT_REPAIR.tests.yml
The minimum coverage target is one test for each terminal leaf of the RAW's decision tree.
Bundle Schema¶
The current schema version is "1.0.0". The authoritative JSON Schema is
packaged with the raw-test-author skill at
.opencode/skills/raw-test-author/assets/test-bundle.schema.json.
Validate a bundle through the skill or directly with the packaged validator:
python .opencode/skills/raw-test-author/scripts/validate_test_bundle.py <bundle-path>
python .opencode/skills/raw-test-author/scripts/validate_test_bundle.py <bundle-path> --strict
Other skills and agents should refer to the validator by skill ownership
(raw-test-author) rather than treating the script path as a separate public
API.
schema_version: "1.0.0"
raw_id: RAW000004
raw_path: intelligence-artifacts/AD000004-example-fault/RAW000004-EXAMPLE_FAULT_REPAIR.yml
fs_path: intelligence-artifacts/AD000004-example-fault/FS000004-EXAMPLE_FAULT.yml
default_approvals:
default: APPROVED
overrides: []
tests:
- name: resolve_via_rollback
description: Happy path with rollback and verification.
source: synthetic
mode: strict
agent_only: false
webex_notify: true
alert_payload:
incident_id: INC-20260529T120102Z
alert_def_id: AD000004
device_hostname: r1
alert_vars:
peer_ip: 192.0.2.10
prefix_count: 142000
kb_context_override: |
sev_level: P2
change_window_active: true
known_issue_match: KB002
ia_artifacts_override: null
responses:
- step_id: "1"
command: "show bgp ipv4 unicast neighbors 192.0.2.10 received-routes | count"
output: |
Number of routes: 142000
approvals:
default: APPROVED
overrides:
- step_id: "4"
command: "rollback configuration to checkpoint pre-anomaly"
decision: APPROVED
expected:
outcome: resolution
step_path: ["1", "2", "3", "4", "5"]
must_visit: []
must_not_visit: []
variables:
rollback_applied: true
webex_events:
- fault-received
- step-progress
- approval-card
- resolution
Field Reference¶
| Field | Required | Notes |
|---|---|---|
schema_version |
yes | Current value is "1.0.0". |
raw_id |
yes | Matches the RAW metadata ID. |
raw_path / fs_path |
yes | Repo-relative paths resolved by both runners. |
default_approvals |
no | Applied to every test that does not set approvals. |
tests[].name |
yes | Snake case and unique within the bundle. |
tests[].source |
yes if synthetic | Use synthetic or captured; generated fixtures must be marked synthetic. |
tests[].mode |
no | Defaults to strict. hybrid-reasoning tests are skipped by the headless runner. |
tests[].agent_only |
no | Defaults to false. true means the headless runner skips the test. |
tests[].webex_notify |
no | Defaults to true. false suppresses Webex notification in the agent runner. |
tests[].alert_payload |
yes | Same shape as the webhook relay payload. |
tests[].alert_payload.incident_id |
no | Display and correlation metadata for logs and notifications. The agent can mint a local fallback when omitted. |
tests[].responses |
yes | Canned CLI responses. Match by (step_id, command), then by command only. |
tests[].approvals |
no | Same shape as default_approvals; consulted for every approval-needed event. |
tests[].expected |
yes | Must include at least outcome. step_path is strongly recommended for strict tests. |
Headless Runner¶
Use the headless runner for CI and deterministic local checks:
python scripts/run_raw_tests.py --bundle intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.yml
python scripts/run_raw_tests.py --all
python scripts/run_raw_tests.py --bundle <path> --test resolve_via_rollback
python scripts/run_raw_tests.py --all --junit out/raw-test-results.xml --summary out/raw-test-summary.md --json out/raw-test-results.json
python scripts/run_raw_tests.py --all --no-webex
The runner skips tests with agent_only: true or mode: hybrid-reasoning.
It exits with 0 when all selected tests pass or skip, and non-zero on any
failure or error. Per-test artifacts are written under
logs/test-runs/<UTC>-<raw-id>-<test-name>/.
TestResult.status is one of pass, fail, skipped, or error.
Agent Runner¶
Invoke the network-troubleshooter agent with a payload containing
test_bundle_path and, optionally, test_name.
Run the RAW test bundle for AD000002 / RAW000002 in test mode.
{
"alert_def_id": "AD000002",
"device_hostname": "xr-43",
"mode": "strict",
"test_bundle_path": "intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.yml",
"test_name": "resolve_via_no_shutdown",
"webex_notify": false
}
When test_name is omitted, the agent runs every test in the bundle
sequentially and emits one result block per test.
The agent runner:
- Reads the bundle and loads the FS and RAW declared by
fs_pathandraw_path. - Uses
kb_context_overrideoria_artifacts_overridewhen provided; otherwise it uses the normal reader path. - Injects each test's
responsesandapprovalsintofault-remediationastest_bundle. - Tags Webex messages with
[TEST]andtest_run_idwhen Webex is enabled. - Writes
session.md,result.json, and bundle artifacts underlogs/test-runs/<UTC>-<raw-id>-<test-name>/.
Hard Rules¶
- Do not call
radkit_*while atest_bundleis in play. Any such call is aTEST_MODE_VIOLATIONand the agent must abort the test. - Do not execute real waits.
waitactions are no-ops in test mode. - Do not write test sessions under
logs/troubleshooting/; uselogs/test-runs/. - Missing canned CLI output is a test failure with
reason: missing-canned-responseand must name the(step_id, command)pair. - Approval cards can still be posted when Webex is enabled, but the agent must
consume the scripted decision from
approvals.overridesorapprovals.defaultwithout waiting for a human click. - Tests that depend on parent-agent behavior the headless RAW interpreter
cannot see, such as Webex click verification or LLM judgement, must set
agent_only: true.
Authoring Bundles¶
Use the raw-test-author skill for new RAWs. It reads the RAW and paired FS,
enumerates terminal leaves, creates deterministic CLI or log fixtures, marks
generated fixtures as source: synthetic, and writes the bundle at the
canonical tests/<RAW-id>-<slug>.tests.yml path.
Do not hand-author bundles for new RAWs when the skill is available. Let the skill enumerate paths so coverage gaps are visible in review.
Published Examples¶
intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.ymlintelligence-artifacts/AD000003-bgp-max-prefix-adjchange-xr/tests/RAW000003-BGP_NEIGHBOR_MAX_PREFIX_LIMIT_EXCEEDED_REPAIR.tests.yml
File Map¶
| Path | Role |
|---|---|
scripts/lib/raw_interpreter.py |
Pure-Python RAW interpreter. |
scripts/run_raw_tests.py |
Headless test runner CLI. |
.opencode/skills/fault-remediation/SKILL.md |
Runtime RAW interpreter skill with test_bundle handling. |
.opencode/agents/network-troubleshooter.md |
Parent agent test-mode orchestration. |
.opencode/skills/webex-notify/SKILL.md |
Webex templates; accepts test_title_prefix and test_run_id. |
.opencode/skills/raw-test-author/SKILL.md |
Test bundle authoring and validation skill. |
intelligence-artifacts/<alert-def-id>/tests/<raw-id>.tests.yml |
Co-located RAW test bundle. |
logs/test-runs/<UTC>-<raw-id>-<test-name>/ |
Per-test session log and result.json. |