PRIME-RL

Researching scalable reinforcement learning (RL) methods for language models.

Pinned

  1. Entropy-Mechanism-of-RL (Public)

    The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

    Python · 226 stars · 8 forks

  2. SimpleVLA-RL (Public)

    Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory

    Python · 259 stars · 9 forks

  3. PRIME (Public)

    Scalable RL solution for advanced reasoning of language models

    Python · 1.6k stars · 95 forks

  4. TTRL (Public)

    TTRL: Test-Time Reinforcement Learning

    Python · 683 stars · 53 forks

  5. ImplicitPRM (Public)

    Repository for the paper "Free Process Rewards without Process Labels"

    Python · 154 stars · 10 forks

Repositories

Showing 5 of 5 repositories
