LunjunZhang

Pinned

  1. ema-pg Public

    Code for "EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL" (arxiv.org/abs/2602.04417)

    Python, 9 stars, 2 forks

  2. E-SPL Public

    Code for "Evolutionary System Prompt Learning for Reinforcement Learning in LLMs" (arxiv.org/abs/2602.14697)

    Python, 9 stars

  3. world-model-as-a-graph Public

    Code for "World Model as a Graph: Learning Latent Landmarks for Planning" (ICML 2021 Long Presentation)

    Python, 70 stars, 4 forks

  4. d2ac-actor-critic/d2ac-public Public

    Official code for D2AC: Diffusion Actor Meets Distributional Critic (TMLR 2025)

    Python, 2 stars

  5. genrm-star/genrm-critiques Public

    GenRM-CoT: Data release for verification rationales

    68 stars, 6 forks