API Overview

This page gives an overview of the Relax API and points to the detailed reference for each component.

Core Modules

Controller

The Controller module manages the overall experiment lifecycle and service coordination.

python
from relax.core.controller import Controller

controller = Controller(config)
controller.register_all_serve()
controller.training_loop()

Services

The Service class wraps Ray Serve deployments for each RL component.

python
from relax.components import Actor
from relax.core.service import Service

# health_handle and config are assumed to exist already.
service = Service(cls=Actor, role="actor", healthy=health_handle, config=config, num_gpus=2)

Service HTTP APIs

The Implementation module deploys concrete RL components as Ray Serve services with FastAPI HTTP endpoints for lifecycle management, recovery, and coordination.

| Service | Description | API Docs |
| --- | --- | --- |
| Actor | Policy model training | Actor API |
| Rollout | Sample generation via SGLang | Rollout API |
| GenRM | Generative reward model (LLM-as-judge) | GenRM API |
| ActorFwd | Forward-only log-prob computation | ActorFwd API |

OpenAPI Specification

Each service page includes an interactive Swagger UI generated from the OpenAPI specification. The specs are generated offline via python scripts/tools/generate_openapi.py.
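
A generated spec can also be inspected without the Swagger UI. The sketch below lists the operations in an OpenAPI document using only the standard library. The inline spec and its `/health` and `/recover` paths are hypothetical placeholders, not Relax's actual endpoints; in practice the dictionary would come from `json.load` on a generated file.

```python
import json

# Hypothetical minimal OpenAPI document; real specs come from the generator script.
SPEC_JSON = """
{
  "openapi": "3.1.0",
  "info": {"title": "Example service", "version": "0.0.0"},
  "paths": {
    "/health": {"get": {"summary": "Liveness probe"}},
    "/recover": {"post": {"summary": "Trigger recovery"}}
  }
}
"""

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI spec."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in sorted(item.items()):
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


spec = json.loads(SPEC_JSON)
operations = list_operations(spec)
for method, path, summary in operations:
    print(f"{method:6} {path}  {summary}")
```

Filtering on the HTTP method names matters because a path item may also carry shared keys such as `parameters` that are not operations.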

Utility Modules

Checkpoint Engine

Distributed checkpoint management.

python
from relax.distributed.checkpoint_service.client.engine import CheckpointEngineClient

engine = CheckpointEngineClient(config)

Metrics Service

Metrics collection and reporting.

python
from relax.utils.metrics.client import MetricsClient

metrics = MetricsClient(service_url="http://localhost:8000/metrics")
metrics.log_metric(step=100, metric_name="reward", metric_value=0.75)
metrics.log_metrics_batch(step=100, metrics={"reward": 0.75, "loss": 0.3})
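
For unit tests it can help to swap the client for an in-memory stand-in so that no metrics service has to be running. The class below is a hypothetical test double, not part of Relax; it assumes only the two method names and keyword arguments shown in the example above.

```python
from collections import defaultdict


class InMemoryMetrics:
    """Hypothetical test double mirroring log_metric / log_metrics_batch."""

    def __init__(self) -> None:
        # Maps step -> {metric_name: metric_value}.
        self.records: dict[int, dict[str, float]] = defaultdict(dict)

    def log_metric(self, step: int, metric_name: str, metric_value: float) -> None:
        self.records[step][metric_name] = metric_value

    def log_metrics_batch(self, step: int, metrics: dict[str, float]) -> None:
        # A batch is recorded as a series of single-metric writes for the same step.
        for name, value in metrics.items():
            self.log_metric(step=step, metric_name=name, metric_value=value)


metrics = InMemoryMetrics()
metrics.log_metric(step=100, metric_name="reward", metric_value=0.75)
metrics.log_metrics_batch(step=101, metrics={"reward": 0.8, "loss": 0.3})
```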

Health System

Service health monitoring.

python
from relax.utils.health_system import HealthManager

health_manager = HealthManager(check_interval=1.0)
health_manager.start(on_unhealthy=callback_fn)
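
The periodic check-and-callback pattern suggested by `check_interval` and `on_unhealthy` can be sketched with the standard library. This illustrates the pattern only and is not HealthManager's implementation; the `PollingMonitor` class, its `register`, `check_once`, and `stop` methods, and the probe functions are all hypothetical.

```python
import threading
from typing import Callable, Optional


class PollingMonitor:
    """Hypothetical sketch: poll registered probes and report failures."""

    def __init__(self, check_interval: float = 1.0) -> None:
        self.check_interval = check_interval
        self._probes: dict[str, Callable[[], bool]] = {}
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def register(self, name: str, probe: Callable[[], bool]) -> None:
        self._probes[name] = probe

    def check_once(self, on_unhealthy: Callable[[str], None]) -> None:
        """Run every probe once; a probe that returns False or raises is unhealthy."""
        for name, probe in self._probes.items():
            try:
                healthy = probe()
            except Exception:
                healthy = False
            if not healthy:
                on_unhealthy(name)

    def start(self, on_unhealthy: Callable[[str], None]) -> None:
        def loop() -> None:
            # Event.wait returns True once stop() is called, which ends the loop.
            while not self._stop.wait(self.check_interval):
                self.check_once(on_unhealthy)

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


monitor = PollingMonitor(check_interval=0.05)
monitor.register("actor", lambda: True)
monitor.register("rollout", lambda: False)

unhealthy: list[str] = []
monitor.check_once(on_unhealthy=unhealthy.append)  # single deterministic pass
```

Separating `check_once` from the background loop keeps the probing logic testable without relying on timing.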

Data Structures

Sample

python
from relax.utils.types import Sample

sample = Sample(
    prompt="What is the capital of France?",
    response="Paris",
    reward=1.0,
    metadata={"source": "dataset"}
)

Episode

python
from relax.utils.types import Episode

# sample1, sample2, and sample3 are Sample instances built as shown above.
episode = Episode(
    samples=[sample1, sample2, sample3],
    total_reward=2.5,
    length=3,
)
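
The field names in the two examples suggest that an episode's `length` and `total_reward` summarize its samples. As a sketch of that relationship only, the stand-in dataclasses below (hypothetical `SampleSketch` and `EpisodeSketch`, not the classes in `relax.utils.types`) derive both values from the sample list. Whether Relax computes or validates them this way is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SampleSketch:
    """Hypothetical stand-in with the fields used in the Sample example."""
    prompt: str
    response: str
    reward: float
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class EpisodeSketch:
    """Hypothetical stand-in; derives the summary fields from its samples."""
    samples: list[SampleSketch]

    @property
    def total_reward(self) -> float:
        return sum(s.reward for s in self.samples)

    @property
    def length(self) -> int:
        return len(self.samples)


episode = EpisodeSketch(
    samples=[
        SampleSketch("What is the capital of France?", "Paris", reward=1.0),
        SampleSketch("What is 2 + 2?", "4", reward=1.0),
        SampleSketch("Name a prime number above 10.", "11 or 15", reward=0.5),
    ]
)
print(episode.length, episode.total_reward)
```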

Configuration

Relax uses command-line arguments for configuration. See the Configuration Guide for details.

python
# Configuration is done via argparse
from relax.utils.arguments import parse_args

args = parse_args()
# args contains all configuration parameters
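
Because configuration goes through argparse, settings reach the program as attributes on a namespace object. The sketch below shows that pattern with a standalone parser; the flag names `--num-gpus` and `--learning-rate` are hypothetical and are not taken from Relax's argument list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags for illustration; see the Configuration Guide for the real ones.
    parser = argparse.ArgumentParser(description="Example training configuration")
    parser.add_argument("--num-gpus", type=int, default=1)
    parser.add_argument("--learning-rate", type=float, default=1e-5)
    return parser


# Passing an explicit list instead of reading sys.argv keeps the example testable.
args = build_parser().parse_args(["--num-gpus", "2"])
print(args.num_gpus, args.learning_rate)
```

argparse converts dashes in flag names to underscores, so `--num-gpus` is read back as `args.num_gpus`.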

Quick Reference

| Module | Description | Documentation |
| --- | --- | --- |
| relax.core | Experiment coordination | Controller API |
| relax.components.actor | Policy training | Actor API |
| relax.components.rollout | Sample generation | Rollout API |
| relax.components.genrm | Generative reward model | GenRM API |
| relax.components.actor_fwd | Log-prob computation | ActorFwd API |
| relax.distributed.checkpoint_service | Checkpoint management | Checkpoint API |
| relax.utils.metrics | Metrics collection | Metrics API |
| relax.utils | Utilities | Utils API |

Type Hints

Relax uses type hints throughout the codebase:

python
from typing import Dict, List

from relax.components import Actor
from relax.utils.types import Episode

def train_actor(
    actor: Actor,
    episodes: List[Episode],
    learning_rate: float = 1e-5
) -> Dict[str, float]:
    """Train actor on episodes."""
    ...

Error Handling

Common exceptions:

python
import logging

from relax.exceptions import (
    ConfigurationError,
    ServiceDeploymentError,
    CheckpointError,
)

logger = logging.getLogger(__name__)

# controller is an existing Controller instance.
try:
    controller.deploy_services()
except ServiceDeploymentError as e:
    logger.error(f"Failed to deploy services: {e}")
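
If a deployment failure turns out to be transient, a caller may want to retry a few times before giving up. The sketch below shows one way to wrap a deployment call with bounded retries. It defines a local stand-in `ServiceDeploymentError` so that it runs without Relax installed; the `deploy_with_retry` helper and the `flaky_deploy` function are hypothetical.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)


class ServiceDeploymentError(Exception):
    """Local stand-in so the sketch runs without Relax installed."""


def deploy_with_retry(deploy: Callable[[], None], attempts: int = 3, delay: float = 0.0) -> int:
    """Call deploy() up to `attempts` times; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            deploy()
            return attempt
        except ServiceDeploymentError as exc:
            logger.error("Deployment attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky_deploy() -> None:
    # Hypothetical deployment that fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ServiceDeploymentError("replica not ready")


succeeded_on = deploy_with_retry(flaky_deploy, attempts=3)
```

Only the deployment error is caught, so unrelated exceptions such as a configuration error still propagate immediately.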

Released under the Apache 2.0 License.