Open Scalable RL/Agentic Alignment Framework

inclusionAI/AReaL: Distributed RL System for LLM Reasoning, an open-source and efficient reinforcement learning system developed at the RL Lab, Ant Research, git

veRL: Volcano Engine Reinforcement Learning for LLMs, a flexible, efficient, and production-ready RL training library for large language models, git

OpenRLHF: a high-performance RLHF framework built on Ray, DeepSpeed and HF Transformers, git

DeepSpeed: enables ChatGPT-like model training with a single click (DeepSpeed-Chat), reporting a 15x speedup over SOTA RLHF systems and substantial cost reductions at all scales, blog, git
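
As a flavor of the API, here is a minimal sketch of wrapping a Hugging Face model with deepspeed.initialize under a ZeRO stage-2 config; the model choice and config values are illustrative, not recommendations, and real runs are usually launched with the deepspeed launcher rather than plain python.

```python
# Minimal sketch: wrapping a Hugging Face model with DeepSpeed ZeRO.
# Config values are illustrative placeholders, not tuned settings.
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "zero_optimization": {"stage": 2},   # shard optimizer state + gradients
    "bf16": {"enabled": True},           # assumes Ampere-or-newer GPUs
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler).
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Training step: the engine handles loss scaling, accumulation, and ZeRO.
# loss = engine(**batch).loss
# engine.backward(loss)
# engine.step()
```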

TRL - Transformer Reinforcement Learning, a Hugging Face library for post-training LLMs with methods such as SFT, PPO, and DPO, git
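
A minimal sketch of preference tuning with TRL's DPOTrainer; note that argument names have shifted across trl releases (e.g. processing_class was previously tokenizer), and the model and toy dataset below are placeholders.

```python
# Minimal DPO fine-tuning sketch with TRL; assumes a recent trl release.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

name = "Qwen/Qwen2-0.5B-Instruct"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# A preference dataset needs prompt / chosen / rejected columns.
train_dataset = Dataset.from_dict({
    "prompt":   ["What is 2 + 2?"],
    "chosen":   ["2 + 2 = 4."],
    "rejected": ["2 + 2 = 5."],
})

args = DPOConfig(output_dir="dpo-out", beta=0.1,
                 per_device_train_batch_size=1)
# ref_model omitted: the trainer keeps a frozen copy of `model` as reference.
trainer = DPOTrainer(model=model, args=args, train_dataset=train_dataset,
                     processing_class=tokenizer)
trainer.train()
```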

Open Scalable RL/Agentic Alignment Algorithms

Kimi k1.5: Scaling Reinforcement Learning with LLMs, paper, git, 20250120

NVIDIA NeMo-Aligner: Scalable toolkit for efficient model alignment, git

RLHFlow: Open-Source Code for RLHF Workflow, git, paper; Code for Reward Modeling: git; Code for Online RLHF: git; Online-DPO-R1: Unlocking Effective Reasoning Without the PPO Overhead, git
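
The DPO objective that these online-DPO pipelines optimize is compact enough to write out; below is a self-contained PyTorch version of the standard loss (Rafailov et al., 2023), independent of any of the repos above.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is the summed log-prob of a response under the policy
    or the frozen reference model; beta scales the implicit KL penalty.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # -log sigmoid(beta * (policy margin - reference margin))
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy check: when the policy prefers the chosen response more strongly
# than the reference does, the loss drops below log(2) ~= 0.693.
loss = dpo_loss(torch.tensor([-3.0]), torch.tensor([-7.0]),
                torch.tensor([-5.0]), torch.tensor([-6.0]))
print(loss.item())  # ~0.554
```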

Open Scalable RL/Agentic Alignment Related Modules

SGLang: Efficient Execution of Structured Language Model Programs, blog, paper, git
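
A minimal sketch of SGLang's frontend DSL, assuming a server already launched with `python -m sglang.launch_server --model-path <model>`; the endpoint, port, and prompt are illustrative.

```python
# Minimal SGLang frontend sketch: a program is a decorated function that
# appends text and gen() calls to a running state `s`.
import sglang as sgl

@sgl.function
def qa(s, question):
    s += "Q: " + question + "\n"
    s += "A: " + sgl.gen("answer", max_tokens=64, temperature=0.0)

# Point the frontend at a locally running SGLang server (port assumed).
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = qa.run(question="What is reinforcement learning?")
print(state["answer"])
```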

vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs, blog, git
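
A minimal sketch of vLLM's offline batch-generation API; the model name is a small placeholder.

```python
# Minimal vLLM offline inference sketch.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```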

Efficient Memory Management for Large Language Model Serving with PagedAttention, paper
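
The core idea is OS-style paging for the KV cache: each sequence's logical token positions map through a block table onto fixed-size physical blocks, so memory is allocated on demand instead of reserved for the maximum length. Below is a conceptual toy of that block table, not vLLM's internals.

```python
# Conceptual sketch (not vLLM's actual code): a per-sequence block table
# maps logical token positions to fixed-size physical KV-cache blocks,
# much like virtual-to-physical page tables in an OS.
BLOCK_SIZE = 16

class BlockTable:
    def __init__(self, free_blocks):
        self.free = list(free_blocks)  # pool of physical block ids
        self.table = []                # logical block index -> physical id

    def append_token(self, logical_pos):
        # Allocate a new physical block only when a boundary is crossed,
        # so waste is bounded by one partially filled block per sequence.
        if logical_pos % BLOCK_SIZE == 0:
            self.table.append(self.free.pop())
        block = self.table[logical_pos // BLOCK_SIZE]
        return block, logical_pos % BLOCK_SIZE  # (physical block, offset)

seq = BlockTable(free_blocks=range(100))
for pos in range(40):     # 40 tokens -> ceil(40 / 16) = 3 blocks
    seq.append_token(pos)
print(seq.table)          # e.g. [99, 98, 97]
```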

Megatron-LM & Megatron-Core: Ongoing research training transformer models at scale, paper, git
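
A key Megatron technique is tensor (intra-layer) parallelism; below is a single-process toy in plain PyTorch (not Megatron's API) showing why column-splitting a linear layer's weight lets each rank compute its output shard without communication during the matmul.

```python
# Conceptual illustration of Megatron-style column parallelism.
import torch

torch.manual_seed(0)
x = torch.randn(4, 8)    # batch of activations
w = torch.randn(8, 16)   # full weight of a linear layer

full = x @ w             # what a single device would compute

w0, w1 = w.chunk(2, dim=1)        # each "rank" holds half the output columns
shard0, shard1 = x @ w0, x @ w1   # shards computed independently
combined = torch.cat([shard0, shard1], dim=1)  # all-gather in a real setup

assert torch.allclose(full, combined, atol=1e-6)
```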

Ray: a framework for scaling distributed Python and AI workloads; Ray v2 architecture doc, latest doc, git
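
A minimal sketch of the Ray task pattern that the RL frameworks above build on for parallel rollout collection; collect_rollout is a hypothetical stand-in for environment interaction or LLM generation.

```python
# Minimal Ray sketch: fan out independent rollout tasks, gather results.
import ray

ray.init()

@ray.remote
def collect_rollout(seed):
    import random
    random.seed(seed)
    # Stand-in for env stepping / LLM generation / reward scoring.
    return {"seed": seed, "reward": random.random()}

# Launch 8 rollouts in parallel and block until all finish.
futures = [collect_rollout.remote(i) for i in range(8)]
results = ray.get(futures)
print(sum(r["reward"] for r in results) / len(results))
```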

DeepSeekMoE

  • DeepSeek-V3 Technical Report, paper, 202502v2
  • DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model, paper, 202406v5
  • DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models, paper, 202401v1
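
A conceptual sketch (not DeepSeek's code) of the layer structure the papers above describe: many fine-grained routed experts selected per token by top-k gating, plus shared experts that every token always passes through.

```python
# Toy DeepSeekMoE-style block: top-k routed experts + always-on shared experts.
import torch
import torch.nn as nn

class DeepSeekMoEBlock(nn.Module):
    def __init__(self, dim=64, n_routed=8, n_shared=2, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, n_routed, bias=False)
        self.routed = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_routed))
        self.shared = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_shared))

    def forward(self, x):                      # x: (tokens, dim)
        scores = self.gate(x).softmax(dim=-1)  # routing probabilities
        weights, idx = scores.topk(self.top_k, dim=-1)
        out = sum(e(x) for e in self.shared)   # shared experts: always active
        for k in range(self.top_k):            # routed experts: sparse dispatch
            for e_id in idx[:, k].unique().tolist():
                mask = idx[:, k] == e_id
                out[mask] += weights[mask, k, None] * self.routed[e_id](x[mask])
        return out

block = DeepSeekMoEBlock()
print(block(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```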
