RAW Test Framework¶

The RAW test framework is the deterministic test harness for Repair Action Workflows. It exercises RAW decision branches without a live device, without RADKit, without an LLM in the headless path, and optionally without Webex.

A RAW is a decision tree. Every terminal leaf - resolution, escalation, failure, and denial paths where applicable - should have a test with known CLI or log output. That makes RAW changes reviewable in pull requests and practical to run in CI.

What the Framework Provides¶

Two runners share one bundle format:

Runner	Purpose
`scripts/run_raw_tests.py`	Pure-Python headless runner for CI and local validation. It uses `scripts/lib/raw_interpreter.py` and does not use an LLM, Webex, or RADKit.
`network-troubleshooter` test mode	Agent runner for the full demo path, including Webex messages tagged `[TEST]`, scripted approvals, troubleshooting bundle output, and parent-agent orchestration.

Bundles are co-located with the artifacts they validate so RAW logic, fixtures, and expected outcomes can move through review together.

Bundle Layout¶

Use one test bundle per RAW:

intelligence-artifacts/
  AD000004-example-fault/
    FS000004-EXAMPLE_FAULT.yml
    RAW000004-EXAMPLE_FAULT_REPAIR.yml
    tests/
      RAW000004-EXAMPLE_FAULT_REPAIR.tests.yml

The minimum coverage target is one test for each terminal leaf of the RAW's decision tree.

Bundle Schema¶

The current schema version is "1.0.0". The authoritative JSON Schema is packaged with the raw-test-author skill at .opencode/skills/raw-test-author/assets/test-bundle.schema.json.

Validate a bundle through the skill or directly with the packaged validator:

python .opencode/skills/raw-test-author/scripts/validate_test_bundle.py <bundle-path>
python .opencode/skills/raw-test-author/scripts/validate_test_bundle.py <bundle-path> --strict

Other skills and agents should refer to the validator by skill ownership (raw-test-author) rather than treating the script path as a separate public API.

schema_version: "1.0.0"
raw_id: RAW000004
raw_path: intelligence-artifacts/AD000004-example-fault/RAW000004-EXAMPLE_FAULT_REPAIR.yml
fs_path: intelligence-artifacts/AD000004-example-fault/FS000004-EXAMPLE_FAULT.yml

default_approvals:
  default: APPROVED
  overrides: []

tests:
  - name: resolve_via_rollback
    description: Happy path with rollback and verification.
    source: synthetic
    mode: strict
    agent_only: false
    webex_notify: true

    alert_payload:
      incident_id: INC-20260529T120102Z
      alert_def_id: AD000004
      device_hostname: r1
      alert_vars:
        peer_ip: 192.0.2.10
        prefix_count: 142000

    kb_context_override: |
      sev_level: P2
      change_window_active: true
      known_issue_match: KB002
    ia_artifacts_override: null

    responses:
      - step_id: "1"
        command: "show bgp ipv4 unicast neighbors 192.0.2.10 received-routes | count"
        output: |
          Number of routes: 142000

    approvals:
      default: APPROVED
      overrides:
        - step_id: "4"
          command: "rollback configuration to checkpoint pre-anomaly"
          decision: APPROVED

    expected:
      outcome: resolution
      step_path: ["1", "2", "3", "4", "5"]
      must_visit: []
      must_not_visit: []
      variables:
        rollback_applied: true
      webex_events:
        - fault-received
        - step-progress
        - approval-card
        - resolution

Field Reference¶

Field	Required	Notes
`schema_version`	yes	Current value is `"1.0.0"`.
`raw_id`	yes	Matches the RAW metadata ID.
`raw_path` / `fs_path`	yes	Repo-relative paths resolved by both runners.
`default_approvals`	no	Applied to every test that does not set `approvals`.
`tests[].name`	yes	Snake case and unique within the bundle.
`tests[].source`	yes if synthetic	Use `synthetic` or `captured`; generated fixtures must be marked `synthetic`.
`tests[].mode`	no	Defaults to `strict`. `hybrid-reasoning` tests are skipped by the headless runner.
`tests[].agent_only`	no	Defaults to `false`. `true` means the headless runner skips the test.
`tests[].webex_notify`	no	Defaults to `true`. `false` suppresses Webex notification in the agent runner.
`tests[].alert_payload`	yes	Same shape as the webhook relay payload.
`tests[].alert_payload.incident_id`	no	Display and correlation metadata for logs and notifications. The agent can mint a local fallback when omitted.
`tests[].responses`	yes	Canned CLI responses. Match by `(step_id, command)`, then by `command` only.
`tests[].approvals`	no	Same shape as `default_approvals`; consulted for every approval-needed event.
`tests[].expected`	yes	Must include at least `outcome`. `step_path` is strongly recommended for `strict` tests.

Headless Runner¶

Use the headless runner for CI and deterministic local checks:

python scripts/run_raw_tests.py --bundle intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.yml
python scripts/run_raw_tests.py --all
python scripts/run_raw_tests.py --bundle <path> --test resolve_via_rollback
python scripts/run_raw_tests.py --all --junit out/raw-test-results.xml --summary out/raw-test-summary.md --json out/raw-test-results.json
python scripts/run_raw_tests.py --all --no-webex

The runner skips tests with agent_only: true or mode: hybrid-reasoning. It exits with 0 when all selected tests pass or skip, and non-zero on any failure or error. Per-test artifacts are written under logs/test-runs/<UTC>-<raw-id>-<test-name>/.

TestResult.status is one of pass, fail, skipped, or error.

Agent Runner¶

Invoke the network-troubleshooter agent with a payload containing test_bundle_path and, optionally, test_name.

Run the RAW test bundle for AD000002 / RAW000002 in test mode.

{
  "alert_def_id": "AD000002",
  "device_hostname": "xr-43",
  "mode": "strict",
  "test_bundle_path": "intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.yml",
  "test_name": "resolve_via_no_shutdown",
  "webex_notify": false
}

When test_name is omitted, the agent runs every test in the bundle sequentially and emits one result block per test.

The agent runner:

Reads the bundle and loads the FS and RAW declared by fs_path and raw_path.
Uses kb_context_override or ia_artifacts_override when provided; otherwise it uses the normal reader path.
Injects each test's responses and approvals into fault-remediation as test_bundle.
Tags Webex messages with [TEST] and test_run_id when Webex is enabled.
Writes session.md, result.json, and bundle artifacts under logs/test-runs/<UTC>-<raw-id>-<test-name>/.

Hard Rules¶

Do not call radkit_* while a test_bundle is in play. Any such call is a TEST_MODE_VIOLATION and the agent must abort the test.
Do not execute real waits. wait actions are no-ops in test mode.
Do not write test sessions under logs/troubleshooting/; use logs/test-runs/.
Missing canned CLI output is a test failure with reason: missing-canned-response and must name the (step_id, command) pair.
Approval cards can still be posted when Webex is enabled, but the agent must consume the scripted decision from approvals.overrides or approvals.default without waiting for a human click.
Tests that depend on parent-agent behavior the headless RAW interpreter cannot see, such as Webex click verification or LLM judgement, must set agent_only: true.

Authoring Bundles¶

Use the raw-test-author skill for new RAWs. It reads the RAW and paired FS, enumerates terminal leaves, creates deterministic CLI or log fixtures, marks generated fixtures as source: synthetic, and writes the bundle at the canonical tests/<RAW-id>-<slug>.tests.yml path.

Do not hand-author bundles for new RAWs when the skill is available. Let the skill enumerate paths so coverage gaps are visible in review.

Published Examples¶

intelligence-artifacts/AD000002-bgp-neighbor-admin-shutdown-xr/tests/RAW000002-BGP_NEIGHBOR_ADMIN_SHUTDOWN_REPAIR.tests.yml
intelligence-artifacts/AD000003-bgp-max-prefix-adjchange-xr/tests/RAW000003-BGP_NEIGHBOR_MAX_PREFIX_LIMIT_EXCEEDED_REPAIR.tests.yml

File Map¶

Path	Role
`scripts/lib/raw_interpreter.py`	Pure-Python RAW interpreter.
`scripts/run_raw_tests.py`	Headless test runner CLI.
`.opencode/skills/fault-remediation/SKILL.md`	Runtime RAW interpreter skill with `test_bundle` handling.
`.opencode/agents/network-troubleshooter.md`	Parent agent test-mode orchestration.
`.opencode/skills/webex-notify/SKILL.md`	Webex templates; accepts `test_title_prefix` and `test_run_id`.
`.opencode/skills/raw-test-author/SKILL.md`	Test bundle authoring and validation skill.
`intelligence-artifacts/<alert-def-id>/tests/<raw-id>.tests.yml`	Co-located RAW test bundle.
`logs/test-runs/<UTC>-<raw-id>-<test-name>/`	Per-test session log and `result.json`.