Marroh

🎓

Open to work

Hao Ma Marroh

PhD student at the Institute of Automation, Chinese Academy of Sciences (CASIA). Researcher, Builder.

5 followers · 1 following

UCAS CASIA
Beijing, China
https://orcid.org/0000-0001-9563-2518
in/hao-ma99

Marroh/README.md

Ph.D. candidate at CASIA with research interests spanning multiple subfields of machine learning, including reinforcement learning, multi-agent learning, and reinforcement fine-tuning for large language models. Expected to graduate in summer 2026 and currently seeking job opportunities.

Pinned

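binary-husky/gpt_academic Public

Provides a practical interaction interface for GPT, GLM, and other large language models (LLMs), with particular optimization for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel querying of multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, m…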

Python 69.1k 8.4k
V-GEPF-official-code Public

The official implementation of AAAI 2025 paper "Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning".

Python
Harry67Hu/CORY Public

Official implementation of the NeurIPS 2024 paper CORY

Python 19 2
Hand-coded-ML-algorithms Public

Hand-coded implementations of basic machine learning algorithms: logistic regression, softmax regression, perceptron algorithms, MLP, and linear regression.

Python
Medical-image-segmentation Public

A medical image segmentation tool with GUI.

Python 9 1
ppo_rnd Public

An improved PPO algorithm using RND (Random Network Distillation). Tested in Google Research Football.

Python 4