Evaluation API Reference

Reference for the benchmark suite, the efficiency and memory profilers, and the metric functions.

BenchmarkSuite

Run standard LLM benchmarks.

BenchmarkResult

Container for the results of a benchmark run.

EfficiencyProfiler

Profile inference efficiency.
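
This reference does not document the profiler's methods, so the following is only a generic sketch of what inference efficiency profiling usually measures: mean latency per call and token throughput. The names `profile_latency` and `fake_generate` are hypothetical and are not part of EfficiencyProfiler.

```python
import time
from typing import Callable, Dict


def profile_latency(fn: Callable[[], int], warmup: int = 1, runs: int = 5) -> Dict[str, float]:
    """Hypothetical helper: time `fn`, which returns the number of tokens it produced."""
    # Warm-up calls are excluded from timing (caches, lazy initialisation).
    for _ in range(warmup):
        fn()

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += fn()
    elapsed = time.perf_counter() - start

    return {
        "mean_latency_s": elapsed / runs,
        "tokens_per_s": total_tokens / elapsed if elapsed > 0 else float("inf"),
    }


def fake_generate() -> int:
    """Stand-in for a model call: sleeps briefly and reports 16 generated tokens."""
    time.sleep(0.001)
    return 16


stats = profile_latency(fake_generate)
print(stats)
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock meant for measuring short durations.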

MemoryProfiler

Profile memory usage.
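
As a rough illustration of peak-memory measurement, the sketch below uses the standard-library `tracemalloc` module. It is an assumption-laden example, not MemoryProfiler's implementation: `measure_peak_memory` is a hypothetical name, and `tracemalloc` only sees allocations made through Python's allocator, so it does not cover GPU memory.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak_memory(fn: Callable[[], Any]) -> Tuple[Any, int]:
    """Hypothetical helper: run `fn` and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = fn()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Stand-in workload: allocate a list of 100,000 elements.
result, peak_bytes = measure_peak_memory(lambda: [0] * 100_000)
print(f"peak: {peak_bytes / 1024:.1f} KiB")
```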

Metric Functions

compute_perplexity
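
The signature of compute_perplexity is not given here. As a reminder of the quantity itself, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below assumes natural-log token probabilities are already available; `perplexity_from_log_probs` is a hypothetical name, not this library's function.

```python
import math
from typing import Sequence


def perplexity_from_log_probs(token_log_probs: Sequence[float]) -> float:
    """Hypothetical helper: perplexity = exp(-mean(log p(token)))."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to each of four tokens
# has a perplexity of approximately 4.
log_probs = [math.log(0.25)] * 4
ppl = perplexity_from_log_probs(log_probs)
print(ppl)
```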

compute_accuracy
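
The signature of compute_accuracy is likewise not given here. Accuracy in the usual sense is the fraction of predictions that exactly match their references; the sketch below illustrates that definition with a hypothetical `accuracy` helper and is not this library's implementation.

```python
from typing import Hashable, Sequence


def accuracy(predictions: Sequence[Hashable], references: Sequence[Hashable]) -> float:
    """Hypothetical helper: fraction of positions where prediction == reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Three of the four multiple-choice answers match.
acc = accuracy(["A", "C", "B", "D"], ["A", "B", "B", "D"])
print(acc)
```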