Number of items completed.
Tracks how many items have been successfully processed so far in the current operation. This value increments as each item is completed, providing real-time progress indication.
The ratio of completed to total gives the completion percentage:
progress = (completed / total) * 100
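As a minimal sketch, a helper computing this percentage from the two fields described here (the zero-total guard is an added assumption):

```typescript
// Minimal progress helper; the { completed, total } shape follows this
// documentation, while the zero-total guard is an added assumption.
function progressPercent(event: { completed: number; total: number }): number {
  if (event.total === 0) return 0; // nothing to process yet
  return (event.completed / event.total) * 100;
}

// progressPercent({ completed: 3, total: 8 }) === 37.5
```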
Timestamp when the event was created.
ISO 8601 formatted date-time string indicating when this event was emitted by the system. This timestamp is crucial for event ordering, performance analysis, and debugging the agent workflow execution timeline.
Format: "YYYY-MM-DDTHH:mm:ss.sssZ" (e.g., "2024-01-15T14:30:45.123Z")
The API endpoint being tested.
Identifies the specific operation (method + path) that this test scenario targets. Used to correlate the review result with the corresponding API operation.
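A plausible shape for this value, assuming a conventional method-and-path pair (the actual AutoBE interface may name these fields differently):

```typescript
// Hypothetical endpoint shape used to correlate a review with its operation.
interface Endpoint {
  method: "get" | "post" | "put" | "patch" | "delete";
  path: string; // e.g. "/articles/{articleId}/comments"
}

const target: Endpoint = { method: "post", path: "/articles/{articleId}/comments" };
```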
A unique identifier for the event.
The review result: improved scenario, deletion flag, or null.
Three possible outcomes reflecting the review priority system:
1. "erase" - Scenario must be deleted (PRIORITY 1 violation):
2. AutoBeTestScenario - Scenario has been improved (PRIORITY 2 fixes):
endpoint and functionName as the original3. null - No improvements needed (PRIORITY 3 - perfect scenario):
The review agent follows this decision tree:
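A minimal sketch of consuming the three outcomes (AutoBeTestScenario is stubbed here so the example stands alone):

```typescript
// The review result is one of three values; handling each restores the
// decision tree described above. AutoBeTestScenario is stubbed for brevity.
interface AutoBeTestScenario {
  endpoint: { method: string; path: string };
  functionName: string;
}
type ReviewResult = AutoBeTestScenario | "erase" | null;

function applyReview(
  original: AutoBeTestScenario,
  result: ReviewResult,
): AutoBeTestScenario | undefined {
  if (result === "erase") return undefined; // PRIORITY 1: delete the scenario
  if (result === null) return original;     // PRIORITY 3: already perfect, keep as-is
  return result;                            // PRIORITY 2: adopt the improved scenario
}
```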
Function calling trial statistics for the operation.
Records the complete trial history of function calling attempts, tracking total executions, successful completions, consent requests, validation failures, and invalid JSON responses. These metrics reveal the reliability and quality of AI agent autonomous operation with tool usage.
Trial statistics are critical for identifying operations where agents struggle with tool interfaces, generate invalid outputs, or require multiple correction attempts through self-healing spiral loops. High failure rates indicate opportunities for system prompt optimization or tool interface improvements.
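A hypothetical shape inferred from the metrics listed above; the real AutoBE type may use different names:

```typescript
// Field names inferred from the metrics described above; illustrative only.
interface FunctionCallingTrials {
  total: number;       // all function calling attempts
  success: number;     // attempts that completed successfully
  consent: number;     // attempts where the agent asked for consent instead of calling
  validation: number;  // attempts rejected by argument validation
  invalidJson: number; // attempts that produced unparsable JSON
}

// A high failure rate flags operations whose tool interface the agent
// struggles with, suggesting prompt or interface improvements.
const failureRate = (t: FunctionCallingTrials): number =>
  t.total === 0 ? 0 : (t.total - t.success) / t.total;
```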
The original test scenario before review.
Contains the initial scenario as generated by the Test Scenario Agent, preserving the original for comparison and audit purposes.
Iteration number of the interface specification this review was performed for.
Indicates which version of the API specification this review reflects. This step number ensures that the scenario review is aligned with the current interface structure.
The step value enables proper synchronization between scenario review and the underlying interface definitions, ensuring that test scenarios remain valid as the API evolves through iterations.
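A sketch of the synchronization check this field enables (names are illustrative):

```typescript
// A review is only trustworthy if it was performed against the interface
// specification iteration that is currently active. Names are illustrative.
function isReviewCurrent(event: { step: number }, currentStep: number): boolean {
  return event.step === currentStep;
}
```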
Detailed token usage metrics for the operation.
Contains comprehensive token consumption data including total usage, input token breakdown with cache hit rates, and output token categorization by generation type (reasoning, predictions). This component-level tracking enables precise cost analysis and identification of operations that benefit most from prompt caching or require optimization.
Token usage directly translates to operational costs, making this metric essential for understanding the financial implications of different operation types and guiding resource allocation decisions.
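A hypothetical breakdown mirroring the categories described above; the actual AutoBE token-usage type may differ in naming and nesting:

```typescript
// Illustrative token-usage shape: totals, cached input, categorized output.
interface TokenUsage {
  total: number;
  input: { total: number; cached: number };                       // cache hits reduce cost
  output: { total: number; reasoning: number; prediction: number };
}

// Cache hit rate highlights operations that benefit most from prompt caching.
const cacheHitRate = (u: TokenUsage): number =>
  u.input.total === 0 ? 0 : u.input.cached / u.input.total;
```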
Total number of items to process.
Represents the complete count of operations, files, endpoints, or other entities that need to be processed in the current workflow step. This value is typically determined at the beginning of an operation and remains constant throughout the process.
Used together with the completed field to calculate the progress percentage and to estimate time to completion, as sketched below.
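A naive estimate assuming a roughly constant per-item duration (all names illustrative):

```typescript
// Naive remaining-time estimate from progress so far; assumes each item
// takes about the same time. Returns Infinity before the first completion.
function estimateRemainingMs(
  completed: number,
  total: number,
  elapsedMs: number,
): number {
  if (completed === 0) return Number.POSITIVE_INFINITY;
  return (elapsedMs / completed) * (total - completed);
}
```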
Unique identifier for the event type.
A literal string that discriminates between different event types in the AutoBE system. This field enables TypeScript's discriminated union feature, allowing type-safe event handling through switch statements or conditional checks.
Examples: "analyzeWrite", "databaseSchema", "interfaceOperation", "testScenario"
Event emitted when a single test scenario's review is completed.
This event is triggered when the Test Scenario Review Agent completes analyzing one test scenario to validate authentication correctness, dependency completeness, execution order, and business logic coverage.
The review process follows a three-tier priority system:
PRIORITY 1: Validation Error Testing Detection (can result in "erase")
PRIORITY 2: Technical Correctness (can result in an improved scenario)
PRIORITY 3: Quality Assessment (can result in null)
By extending multiple base interfaces, this event provides comprehensive tracking capabilities including progress monitoring for one-by-one scenario processing and token usage analytics for cost optimization.
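A sketch of how such composition could look; the base interface names below are assumptions, not the actual AutoBE declarations:

```typescript
// Assumed base interfaces standing in for the real AutoBE ones.
interface ProgressEventBase { completed: number; total: number; }
interface TokenUsageEventBase { tokenUsage: { total: number }; }
interface ScenarioStub { functionName: string; } // stands in for AutoBeTestScenario

interface TestScenarioReviewEvent extends ProgressEventBase, TokenUsageEventBase {
  type: "testScenarioReview";            // discriminator literal (name assumed)
  review: ScenarioStub | "erase" | null; // three-tier review outcome
  step: number;                          // interface specification iteration
  created_at: string;                    // ISO 8601 timestamp (field name assumed)
}
```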
Author: Michael
Author: Samchon