rewards
Reusable reward models for training methods.
Classes
fastvideo.train.methods.rl.rewards.ClipScoreScorer
ClipScoreScorer(*, device: device | str = 'cuda')
Bases: Module
CLIPScore reward, matching DiffusionNFT normalization.
Ported from DiffusionNFT's flow_grpo/clip_scorer.py.
Source code in fastvideo/train/methods/rl/rewards/frame_rewards.py
fastvideo.train.methods.rl.rewards.MultiRewardScorer
Weighted sum of reusable media reward scorers.
Mirrors DiffusionNFT's flow_grpo/rewards.py::multi_score behavior,
while leaving frame selection to each concrete reward.
Source code in fastvideo/train/methods/rl/rewards/media.py
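As a rough illustration of the weighted-sum composition described above, the following sketch combines per-sample scores from several scorers. It is not the FastVideo implementation: `weighted_reward_sum`, the `Scorer` callable type, the `"total"` key, and the two toy scorers are hypothetical names chosen for this example, and plain Python lists stand in for tensors. The actual constructor and call signature of `MultiRewardScorer` are not shown on this page.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical scorer type (an assumption for this sketch): takes a batch of
# media items and their prompts, returns one float score per sample.
Scorer = Callable[[Sequence[object], Sequence[str]], List[float]]


def weighted_reward_sum(
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
    media: Sequence[object],
    prompts: Sequence[str],
) -> Dict[str, List[float]]:
    """Return per-scorer scores plus their weighted sum under the key "total"."""
    if len(media) != len(prompts):
        raise ValueError("media and prompts must have the same batch size")

    results: Dict[str, List[float]] = {}
    total = [0.0] * len(prompts)
    for name, scorer in scorers.items():
        # Each scorer sees the full media batch and decides for itself
        # which frames to look at.
        scores = [float(s) for s in scorer(media, prompts)]
        if len(scores) != len(prompts):
            raise ValueError(f"scorer {name!r} returned a wrong-sized batch")
        results[name] = scores
        weight = weights[name]
        total = [t + weight * s for t, s in zip(total, scores)]

    results["total"] = total
    return results


# Toy scorers, for illustration only. They ignore the media entirely.
def constant_half(media: Sequence[object], prompts: Sequence[str]) -> List[float]:
    return [0.5 for _ in prompts]


def prompt_length(media: Sequence[object], prompts: Sequence[str]) -> List[float]:
    return [len(p) / 4 for p in prompts]


result = weighted_reward_sum(
    scorers={"constant": constant_half, "length": prompt_length},
    weights={"constant": 1.0, "length": 2.0},
    media=["clip_0", "clip_1"],
    prompts=["cat", "a dog"],
)
```

Keeping the individual scores alongside the total makes it possible to log each reward term separately during training. Whether the real class returns them in this shape is an assumption.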
fastvideo.train.methods.rl.rewards.PickScoreScorer
PickScoreScorer(*, device: device | str = 'cuda', dtype: dtype = float32)
Bases: Module
PickScore reward, matching DiffusionNFT normalization.
Ported from DiffusionNFT's flow_grpo/pickscore_scorer.py.
Source code in fastvideo/train/methods/rl/rewards/frame_rewards.py
Functions
fastvideo.train.methods.rl.rewards.select_first_frame
Return the first frame of the media as a [B, C, H, W] tensor.
This is a helper for reward models that are intrinsically frame-based
(for example PickScore and CLIPScore). Video-aware rewards should inspect
the full [B, C, T, H, W] tensor themselves.
Source code in fastvideo/train/methods/rl/rewards/media.py
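The shape contract above can be sketched as a single slice along the time axis. This is a minimal illustration, not the FastVideo source: `first_frame` is a hypothetical stand-in for `select_first_frame`, NumPy arrays are used in place of torch tensors so the example runs without a GPU stack (the same indexing expression applies to a `torch.Tensor`), and passing 4-D input through unchanged is an assumption.

```python
import numpy as np


def first_frame(media: np.ndarray) -> np.ndarray:
    """Return [B, C, H, W] from a [B, C, T, H, W] batch (hypothetical helper)."""
    if media.ndim == 5:
        # Index 0 on the time axis (dim 2) drops T and keeps B, C, H, W.
        return media[:, :, 0]
    if media.ndim == 4:
        # Assumption: image batches are already [B, C, H, W] and pass through.
        return media
    raise ValueError(f"expected a 4-D or 5-D array, got {media.ndim} dimensions")


# A batch of 2 videos, 3 channels, 4 frames, 5x6 pixels.
video = np.arange(2 * 3 * 4 * 5 * 6, dtype=np.float32).reshape(2, 3, 4, 5, 6)
frames = first_frame(video)
```

A frame-based scorer would call such a helper before its image preprocessing, while a video-aware reward would skip it and consume the 5-D tensor directly.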
Modules
fastvideo.train.methods.rl.rewards.frame_rewards
Frame-based reward scorers used by RL training methods.
Classes
fastvideo.train.methods.rl.rewards.frame_rewards.ClipScoreScorer
ClipScoreScorer(*, device: device | str = 'cuda')
Bases: Module
CLIPScore reward, matching DiffusionNFT normalization.
Ported from DiffusionNFT's flow_grpo/clip_scorer.py.
Source code in fastvideo/train/methods/rl/rewards/frame_rewards.py
fastvideo.train.methods.rl.rewards.frame_rewards.PickScoreScorer
PickScoreScorer(*, device: device | str = 'cuda', dtype: dtype = float32)
Bases: Module
PickScore reward, matching DiffusionNFT normalization.
Ported from DiffusionNFT's flow_grpo/pickscore_scorer.py.
Source code in fastvideo/train/methods/rl/rewards/frame_rewards.py
fastvideo.train.methods.rl.rewards.media
Generic media reward composition utilities.
Classes
fastvideo.train.methods.rl.rewards.media.MultiRewardScorer
Weighted sum of reusable media reward scorers.
Mirrors DiffusionNFT's flow_grpo/rewards.py::multi_score behavior,
while leaving frame selection to each concrete reward.
Source code in fastvideo/train/methods/rl/rewards/media.py
Functions
fastvideo.train.methods.rl.rewards.media.select_first_frame
Return the first frame of the media as a [B, C, H, W] tensor.
This is a helper for reward models that are intrinsically frame-based
(for example PickScore and CLIPScore). Video-aware rewards should inspect
the full [B, C, T, H, W] tensor themselves.