Xingyu Lu

Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning

May 27, 2025
Via arXiv

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

May 05, 2025
Via arXiv

VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform

Apr 21, 2025
Via arXiv

InstructEngine: Instruction-driven Text-to-Image Alignment

Apr 14, 2025
Via arXiv

RLCAD: Reinforcement Learning Training Gym for Revolution Involved CAD Command Sequence Generation

Mar 24, 2025
Via arXiv

Aligning Multimodal LLM with Human Preference: A Survey

Mar 18, 2025
Via arXiv

Kwai-STaR: Transform LLMs into State-Transition Reasoners

Nov 07, 2024
Via arXiv

LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch

Oct 17, 2024
Via arXiv

Multiscale Representation Enhanced Temporal Flow Fusion Model for Long-Term Workload Forecasting

Jul 29, 2024
Via arXiv

Scaling Laws for Fact Memorization of Large Language Models

Jun 22, 2024
Via arXiv