2026-05-04 12:01:57 - evalscope - INFO: Benchmarking summary:
+---------------------------------------------+----------+
| Key                                         | Value    |
+=============================================+==========+
| Time taken for tests (s)                    | 134.487  |
+---------------------------------------------+----------+
| Number of concurrency                       | 2        |
+---------------------------------------------+----------+
| Request rate (req/s)                        | -1       |
+---------------------------------------------+----------+
| Total requests                              | 30       |
+---------------------------------------------+----------+
| Succeed requests                            | 30       |
+---------------------------------------------+----------+
| Failed requests                             | 0        |
+---------------------------------------------+----------+
| Request throughput (req/s)                  | 0.2231   |
+---------------------------------------------+----------+
| Average latency (s)                         | 8.8185   |
+---------------------------------------------+----------+
| Average input tokens per request            | 29.4333  |
+---------------------------------------------+----------+
| Output token throughput (tok/s)             | 216.921  |
+---------------------------------------------+----------+
| Total token throughput (tok/s)              | 223.487  |
+---------------------------------------------+----------+
| Average time to first token (s)             | 0.0467   |
+---------------------------------------------+----------+
| Average time per output token (s)           | 0.009    |
+---------------------------------------------+----------+
| Average inter-token latency (s)             | 0.009    |
+---------------------------------------------+----------+
| Average output tokens per request           | 972.433  |
+---------------------------------------------+----------+
| Average decoded tokens per iter (tok/iter)  | 1.002    |
+---------------------------------------------+----------+
| Approx speculative decoding acceptance rate | 0.002    |
+---------------------------------------------+----------+
Running[perf]: 0%| | 0/1 [02:24