Qiaosheng Zhang

The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

May 26, 2025
Via arXiv

MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

May 19, 2025
Via arXiv

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

May 18, 2025
Via arXiv

Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute

Apr 02, 2025
Via arXiv

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Mar 10, 2025
Via arXiv

If Multi-Agent Debate is the Answer, What is the Question?

Feb 12, 2025
Via arXiv

Graph Feedback Bandits on Similar Arms: With and Without Graph Structures

Jan 24, 2025
Via arXiv

Community detection for Contextual-LSBM: Theoretical limitation on misclassification ratio and efficient algorithm

Jan 19, 2025
Via arXiv

Understanding When and Why Graph Attention Mechanisms Work via Node Classification

Dec 20, 2024
Via arXiv

A Theory of Learnability for Offline Decision Making

Jun 03, 2024
Via arXiv