API Overview
This section provides comprehensive API documentation for Relax components.
Core Modules
Controller
The Controller module manages the overall experiment lifecycle and service coordination.
from relax.core.controller import Controller
controller = Controller(config)
controller.register_all_serve()
controller.training_loop()Services
The Service class wraps Ray Serve deployments for each RL component.
from relax.core.service import Service
service = Service(cls=Actor, role="actor", healthy=health_handle, config=config, num_gpus=2)Service HTTP APIs
The Implementation module deploys concrete RL components as Ray Serve services with FastAPI HTTP endpoints for lifecycle management, recovery, and coordination.
| Service | Description | API Docs |
|---|---|---|
| Actor | Policy model training | Actor API |
| Rollout | Sample generation via SGLang | Rollout API |
| GenRM | Generative reward model (LLM-as-judge) | GenRM API |
| ActorFwd | Forward-only log-prob computation | ActorFwd API |
OpenAPI Specification
Each service page includes an interactive Swagger UI generated from the OpenAPI specification. The specs are generated offline via python scripts/tools/generate_openapi.py.
Utility Modules
Checkpoint Engine
Distributed checkpoint management.
from relax.distributed.checkpoint_service.client.engine import CheckpointEngineClient
engine = CheckpointEngineClient(config)Metrics Service
Metrics collection and reporting.
from relax.utils.metrics.client import MetricsClient
metrics = MetricsClient(service_url="http://localhost:8000/metrics")
metrics.log_metric(step=100, metric_name="reward", metric_value=0.75)
metrics.log_metrics_batch(step=100, metrics={"reward": 0.75, "loss": 0.3})Health System
Service health monitoring.
from relax.utils.health_system import HealthManager
health_manager = HealthManager(check_interval=1.0)
health_manager.start(on_unhealthy=callback_fn)Data Structures
Sample
from relax.utils.types import Sample
sample = Sample(
prompt="What is the capital of France?",
response="Paris",
reward=1.0,
metadata={"source": "dataset"}
)Episode
from relax.utils.types import Episode
episode = Episode(
samples=[sample1, sample2, sample3],
total_reward=2.5,
length=3
)Configuration
Configuration
Relax uses command-line arguments for configuration. See the Configuration Guide for details.
# Configuration is done via argparse
from relax.utils.arguments import parse_args
args = parse_args()
# args contains all configuration parametersQuick Reference
| Module | Description | Documentation |
|---|---|---|
relax.core | Experiment coordination | Controller API |
relax.components.actor | Policy training | Actor API |
relax.components.rollout | Sample generation | Rollout API |
relax.components.genrm | Generative reward model | GenRM API |
relax.components.actor_fwd | Log-prob computation | ActorFwd API |
relax.distributed.checkpoint_service | Checkpoint management | Checkpoint API |
relax.utils.metrics | Metrics collection | Metrics API |
relax.utils | Utilities | Utils API |
Type Hints
Relax uses type hints throughout the codebase:
from typing import Dict, List, Optional
from relax.components import Actor
def train_actor(
actor: Actor,
episodes: List[Episode],
learning_rate: float = 1e-5
) -> Dict[str, float]:
"""Train actor on episodes."""
...Error Handling
Common exceptions:
from relax.exceptions import (
ConfigurationError,
ServiceDeploymentError,
CheckpointError
)
try:
controller.deploy_services()
except ServiceDeploymentError as e:
logger.error(f"Failed to deploy services: {e}")Next Steps
- Actor API - Actor (policy training) HTTP API
- Rollout API - Rollout (sample generation) HTTP API
- GenRM API - GenRM (generative reward model) HTTP API
- ActorFwd API - ActorFwd (log-prob computation) HTTP API
