xdrl.trainer_hooks.logging
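Classes
LoggingCollectionMetricsHook | Log collection metrics in the collection/ namespace.
LoggingTrainingMetricsHook | Log reduced optimization metrics under the train/ namespace.
LoggingCountersHook | Track frame and iteration counters in the counters/ namespace.
LoggingProgressMetricsHook | Emit a compact progress-bar view for reward and frame counters.
LoggingEvaluationMetricsHook | Run periodic evaluation and log metrics under eval/<subgroup>/.
LoggingEvaluationHookSet | Compose deterministic and non-deterministic evaluation hooks.
LoggingHookSet | Compose the default xdrl logging hooks.
WandbFinishHook | Optional shutdown hook for config-driven Weights & Biases cleanup.
WandbFlushHook | Flush pending W&B scalar rows emitted through TorchRL's scalar logger.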
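Functions
_as_float(value)
_min_mean_max(prefix, value)
_summarize_metric(prefix, value, *, reduce_stats=None)
_collector_mask(batch)
_scalarize_last_dim(value, weights)
_prepare_episode_reward(value, done, *, weights)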
Module Contents
- xdrl.trainer_hooks.logging._as_float(value)
- Parameters:
value (torch.Tensor)
- Return type:
float
- xdrl.trainer_hooks.logging._min_mean_max(prefix, value)
- Parameters:
prefix (str)
value (torch.Tensor)
- Return type:
dict[str, float]
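The signature above suggests that a tensor is summarized as three scalars keyed under a prefix. The following is a minimal sketch of that idea, not the module's implementation: a plain list stands in for the tensor, and the function name `min_mean_max` and the `_min`/`_mean`/`_max` key suffixes are assumptions made for illustration.

```python
from statistics import fmean


def min_mean_max(prefix: str, values: list[float]) -> dict[str, float]:
    """Summarize a flat sequence as three scalars keyed under ``prefix``.

    The key suffixes are hypothetical; the real helper may name them differently.
    """
    if not values:
        raise ValueError("values must not be empty")
    return {
        f"{prefix}_min": float(min(values)),
        f"{prefix}_mean": float(fmean(values)),
        f"{prefix}_max": float(max(values)),
    }


summary = min_mean_max("collection/agents/reward", [0.5, 1.5, 4.0])
```

Reducing to three scalars keeps dashboards readable when a metric has one value per environment or per agent.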
- xdrl.trainer_hooks.logging._summarize_metric(prefix, value, *, reduce_stats=None)
- Parameters:
prefix (str)
value (torch.Tensor)
reduce_stats (bool | None)
- Return type:
dict[str, float]
- xdrl.trainer_hooks.logging._collector_mask(batch)
- Parameters:
batch (tensordict.TensorDictBase)
- Return type:
torch.Tensor | None
- xdrl.trainer_hooks.logging._scalarize_last_dim(value, weights)
- Parameters:
value (torch.Tensor)
weights (tuple[float, Ellipsis] | None)
- Return type:
torch.Tensor
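Scalarizing the last dimension with explicit weights can be pictured as a weighted sum per row. The sketch below uses nested lists instead of tensors and is only an illustration: the function name `scalarize_last_dim` is hypothetical, and returning the input unchanged when `weights` is `None` is an assumption, not documented behavior.

```python
from typing import Optional, Sequence


def scalarize_last_dim(
    rows: list[list[float]], weights: Optional[Sequence[float]] = None
):
    """Collapse each row to one scalar via a weighted sum over its entries.

    Assumption: with ``weights=None`` the rows are returned unchanged.
    """
    if weights is None:
        return rows
    scalars = []
    for row in rows:
        if len(row) != len(weights):
            raise ValueError("weights must match the size of the last dimension")
        scalars.append(sum(w * r for w, r in zip(weights, row)))
    return scalars


# Two steps of a two-objective reward, weighted 0.5 and 0.25.
rewards = [[1.0, 10.0], [2.0, 20.0]]
scalar = scalarize_last_dim(rewards, (0.5, 0.25))
```

With tensors, the same operation would typically be a multiply-and-sum over the last axis.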
- xdrl.trainer_hooks.logging._prepare_episode_reward(value, done, *, weights)
- Parameters:
value (torch.Tensor)
done (torch.Tensor)
weights (tuple[float, Ellipsis] | None)
- Return type:
torch.Tensor
- class xdrl.trainer_hooks.logging.LoggingCollectionMetricsHook(group='agents', reward_key=None, done_key=('next', 'done'), episode_reward_key=None, episode_reward_weights=None, reduce_stats=None)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Log collection metrics in the collection/ namespace.
The hook reads reward, done, and optional episode-reward tensors from a collected TensorDict. Vector-valued rewards can be scalarized with explicit weights, which is useful for MO-Gymnasium and other multi-objective runs.
- Parameters:
group (str)
reward_key (tuple[str, Ellipsis] | None)
done_key (tuple[str, Ellipsis])
episode_reward_key (tuple[str, Ellipsis] | None)
episode_reward_weights (collections.abc.Sequence[float] | None)
reduce_stats (bool | None)
- __call__(batch)
- Parameters:
batch (tensordict.TensorDictBase)
- Return type:
dict[str, float]
- register(trainer, name='logging_collection_metrics')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
- class xdrl.trainer_hooks.logging.LoggingTrainingMetricsHook(group='agents')
Bases:
torchrl.trainers.trainers.TrainerHookBase
Log reduced optimization metrics under the train/ namespace.
TorchRL loss modules may emit TensorDict metrics with vector values. This hook reduces tensors to scalars and mirrors them under train/<group>/ so logger dashboards and progress outputs use stable names.
- Parameters:
group (str)
- __call__(_optim_steps, average_losses)
- Parameters:
_optim_steps (int)
average_losses (tensordict.TensorDictBase | None)
- Return type:
dict[str, float]
- register(trainer, name='logging_training_metrics')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
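The reduce-and-mirror behavior described for LoggingTrainingMetricsHook can be sketched with a plain dictionary in place of a TensorDict. This is an illustration under assumptions: the function name `mirror_training_metrics` is hypothetical, and reducing vectors by their mean is a guess at the reduction, which the documentation above does not specify.

```python
from statistics import fmean


def mirror_training_metrics(group: str, losses: dict) -> dict[str, float]:
    """Reduce each loss entry to a scalar and rename it under train/<group>/.

    Assumption: vector-valued entries are reduced with a mean.
    """
    metrics = {}
    for name, value in losses.items():
        if isinstance(value, (list, tuple)):
            scalar = float(fmean(value))
        else:
            scalar = float(value)
        metrics[f"train/{group}/{name}"] = scalar
    return metrics


logged = mirror_training_metrics(
    "agents", {"loss_actor": [0.25, 0.75], "loss_critic": 1.5}
)
```

Because every key carries the same `train/<group>/` prefix, dashboards can group the metrics without knowing which loss module produced them.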
- class xdrl.trainer_hooks.logging.LoggingCountersHook(frame_skip=1)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Track frame and iteration counters in the counters/ namespace.
- Parameters:
frame_skip (int)
- register(trainer, name='logging_counters')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
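One plausible reading of `frame_skip` is that each collected step stands for several environment frames, so the frame counter grows by the batch size times `frame_skip`. The sketch below models that reading only. The class `FrameCounter`, its `update` method, and the `counters/iterations` and `counters/frames` key names are hypothetical and not taken from the module.

```python
from dataclasses import dataclass


@dataclass
class FrameCounter:
    """Toy counter: one update per trainer iteration."""

    frame_skip: int = 1
    iterations: int = 0
    frames: int = 0

    def update(self, batch_size: int) -> dict[str, int]:
        # Assumption: every collected step represents `frame_skip` frames.
        self.iterations += 1
        self.frames += batch_size * self.frame_skip
        return {
            "counters/iterations": self.iterations,
            "counters/frames": self.frames,
        }


counter = FrameCounter(frame_skip=4)
counter.update(256)
latest = counter.update(256)
```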
- class xdrl.trainer_hooks.logging.LoggingProgressMetricsHook(*, group, counters_hook, reward_key=None)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Emit a compact progress-bar view for reward and frame counters.
- Parameters:
group (str)
counters_hook (LoggingCountersHook)
reward_key (tuple[str, Ellipsis] | None)
- __call__(batch)
- Parameters:
batch (tensordict.TensorDictBase)
- Return type:
dict[str, float | bool]
- register(trainer, name='logging_progress_metrics')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
- class xdrl.trainer_hooks.logging.LoggingEvaluationMetricsHook(*, policy, environment, group, metric_subgroup, interval_frames, max_steps, deterministic, render, video_fps, render_kwargs=None, reward_key=None, reduce_stats=None, logger=None)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Run periodic evaluation and log metrics under eval/<subgroup>/.
- Parameters:
policy (torch.nn.Module) – Policy module used during rollout.
environment – TorchRL environment with a rollout method.
group (str) – Agent/group namespace used for reward keys.
metric_subgroup (str) – Evaluation label such as "deterministic".
interval_frames (int) – Collected-frame interval between evaluations.
max_steps (int) – Maximum rollout length.
deterministic (bool) – Whether to force deterministic exploration.
render (bool) – Whether to capture rendered frames and log a video.
video_fps (int) – Video frame rate passed to the logger.
render_kwargs (dict[str, Any] | None) – Optional keyword arguments passed to environment render.
reward_key (tuple[str, Ellipsis] | None) – TensorDict key for rollout rewards.
reduce_stats (bool | None) – Whether vector metrics are reduced to min/mean/max.
logger (Any | None) – Optional logger used before the hook is registered on a trainer.
- _log_direct(metrics, step)
- Parameters:
metrics (dict[str, float])
step (int)
- Return type:
None
- __call__(_batch)
- Parameters:
_batch (tensordict.TensorDictBase)
- Return type:
dict[str, float]
- register(trainer, name='logging_evaluation_metrics')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
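The `interval_frames` parameter implies a schedule keyed on collected frames rather than on iterations. A minimal sketch of such a schedule follows. The class `EvalSchedule` and its method are hypothetical, and triggering at most one evaluation per call (even when a large batch crosses several intervals) is an assumption about the intended behavior.

```python
class EvalSchedule:
    """Decide when to evaluate, based on the total number of collected frames."""

    def __init__(self, interval_frames: int) -> None:
        if interval_frames <= 0:
            raise ValueError("interval_frames must be positive")
        self.interval_frames = interval_frames
        self._next_eval = interval_frames

    def should_evaluate(self, collected_frames: int) -> bool:
        if collected_frames < self._next_eval:
            return False
        # Move the threshold past the current count so that one call
        # never schedules more than one evaluation.
        while self._next_eval <= collected_frames:
            self._next_eval += self.interval_frames
        return True


schedule = EvalSchedule(interval_frames=1000)
decisions = [schedule.should_evaluate(f) for f in (400, 800, 1200, 1600, 2400)]
```

Counting frames instead of iterations keeps the evaluation cadence comparable across runs that use different batch sizes.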
- class xdrl.trainer_hooks.logging.LoggingEvaluationHookSet(*, policy, environment, group, interval_frames, max_steps, deterministic, non_deterministic, render, video_fps, render_kwargs=None, reward_key=None, reduce_stats=None, logger=None)
Compose deterministic and non-deterministic evaluation hooks.
This wrapper is useful when the same environment and policy should be evaluated with both exploration settings on the same schedule.
- Parameters:
policy (torch.nn.Module)
group (str)
interval_frames (int)
max_steps (int)
deterministic (bool)
non_deterministic (bool)
render (bool)
video_fps (int)
render_kwargs (dict[str, Any] | None)
reward_key (tuple[str, Ellipsis] | None)
reduce_stats (bool | None)
logger (Any | None)
- hooks: list[LoggingEvaluationMetricsHook] = []
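The two boolean flags suggest that the hook set builds one evaluation hook per enabled exploration setting. The sketch below shows only that selection step. The function `evaluation_variants` is hypothetical, and the `"non_deterministic"` label is an assumption; only `"deterministic"` appears as an example label in the documentation above.

```python
def evaluation_variants(
    deterministic: bool, non_deterministic: bool
) -> list[tuple[str, bool]]:
    """Return (metric_subgroup, deterministic_flag) pairs for the enabled settings."""
    variants = []
    if deterministic:
        variants.append(("deterministic", True))
    if non_deterministic:
        # Label chosen for illustration; the real subgroup name may differ.
        variants.append(("non_deterministic", False))
    return variants


namespaces = [f"eval/{label}/" for label, _ in evaluation_variants(True, True)]
```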
- class xdrl.trainer_hooks.logging.LoggingHookSet(*, group, frame_skip, reward_key=None, done_key=('next', 'done'), episode_reward_key=None, episode_reward_weights=None, reduce_stats=None, eval_hook_set=None)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Compose the default xdrl logging hooks.
The hook set registers collection metrics, training metrics, counters, progress metrics, timers, and optional evaluation hooks as a single object. This keeps Hydra configs concise while preserving independently testable hook components.
- Parameters:
group (str)
frame_skip (int)
reward_key (tuple[str, Ellipsis] | None)
done_key (tuple[str, Ellipsis])
episode_reward_key (tuple[str, Ellipsis] | None)
episode_reward_weights (collections.abc.Sequence[float] | None)
reduce_stats (bool | None)
eval_hook_set (LoggingEvaluationHookSet | None)
- _timers_start(batch)
- Parameters:
batch (tensordict.TensorDictBase)
- Return type:
tensordict.TensorDictBase
- _timers_end(_batch)
- Parameters:
_batch (tensordict.TensorDictBase)
- Return type:
dict[str, float]
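The two timer signatures fit a start/end pair: the first passes the batch through unchanged and the second returns a metrics dictionary. A minimal sketch of that pattern follows. The class `IterationTimer` and the `timers/iteration_s` key are hypothetical, and what the real hooks actually time is not stated in this reference.

```python
import time


class IterationTimer:
    """Measure wall-clock time between a start call and an end call."""

    def __init__(self) -> None:
        self._start = None

    def start(self, batch):
        # Record the start time and hand the batch back untouched.
        self._start = time.perf_counter()
        return batch

    def end(self) -> dict[str, float]:
        if self._start is None:
            return {}
        elapsed = time.perf_counter() - self._start
        self._start = None
        return {"timers/iteration_s": elapsed}


timer = IterationTimer()
batch = {"reward": [1.0, 2.0]}
same_batch = timer.start(batch)
timing = timer.end()
```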
- register(trainer, name='logging_hooks')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
- class xdrl.trainer_hooks.logging.WandbFinishHook(enabled=True)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Optional shutdown hook for config-driven Weights & Biases cleanup.
The hook intentionally swallows import/runtime errors so offline or disabled W&B runs do not fail trainer shutdown.
- Parameters:
enabled (bool)
- register(trainer, name='wandb_finish')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
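The error-swallowing behavior described for WandbFinishHook can be sketched as a guarded call to `wandb.finish()`. The helper `finish_wandb_quietly` is hypothetical and is not the hook's implementation. It returns a boolean only so the outcome is observable in this example.

```python
def finish_wandb_quietly(enabled: bool = True) -> bool:
    """Try to close the active W&B run without ever raising.

    Returns True if ``wandb.finish()`` completed, False if the hook is
    disabled or if importing or finishing failed.
    """
    if not enabled:
        return False
    try:
        import wandb  # optional dependency; may be missing on offline machines

        wandb.finish()
    except Exception:
        # Shutdown must not fail because logging cleanup failed.
        return False
    return True


skipped = finish_wandb_quietly(enabled=False)
```

Catching broadly is a deliberate trade-off here: a cleanup failure is less costly than an exception that interrupts trainer shutdown.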
- class xdrl.trainer_hooks.logging.WandbFlushHook(enabled=True)
Bases:
torchrl.trainers.trainers.TrainerHookBase
Flush pending W&B scalar rows emitted through TorchRL's scalar logger.
TorchRL logs scalar metrics one by one, while its W&B logger defaults those calls to commit=False so metrics for the same step can be grouped. This hook commits the pending row after each trainer iteration and before W&B is finished, which makes metrics appear during long-running jobs.
- Parameters:
enabled (bool)
- static _wandb_step_registry(logger)
- Parameters:
logger (Any)
- Return type:
tuple[tuple[str, int], Ellipsis]
- register(trainer, name='wandb_flush')
Registers the hook in the trainer at a default location.
- Parameters:
trainer (Trainer) – the trainer where the hook must be registered.
name (str) – the name of the hook.
- Return type:
None
Note
To register the hook at another location than the default, use
register_op().
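The grouping effect of `commit=False` and the role of an explicit flush can be illustrated with a toy logger. This is a model of the idea only: `PendingRowLogger` is hypothetical and does not reproduce the W&B client or TorchRL's logger. It shows why uncommitted scalars stay invisible until something commits the row.

```python
class PendingRowLogger:
    """Toy logger that buffers scalars until a row is committed."""

    def __init__(self) -> None:
        self.pending = {}
        self.committed = []
        self._step = 0

    def log(self, metrics: dict, step: int, commit: bool = False) -> None:
        # Scalars logged one by one accumulate into the same pending row.
        self.pending.update(metrics)
        self._step = step
        if commit:
            self.flush()

    def flush(self) -> None:
        # Publish the pending row, if any, and start a fresh one.
        if self.pending:
            self.committed.append((self._step, dict(self.pending)))
            self.pending.clear()


logger = PendingRowLogger()
logger.log({"train/loss": 0.5}, step=1)
logger.log({"collection/reward": 2.0}, step=1)
visible_before_flush = list(logger.committed)
logger.flush()  # what a flush hook would do after each trainer iteration
visible_after_flush = list(logger.committed)
```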