Xuweiyi Chen

Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts

May 29, 2025

Frame In-N-Out: Unbounded Controllable Image-to-Video Generation

May 27, 2025

Open Vocabulary Monocular 3D Object Detection

Nov 25, 2024

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Nov 25, 2024

Learning 3D Representations from Procedural 3D Programs

Nov 25, 2024

Multi-Object Hallucination in Vision-Language Models

Jul 08, 2024

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jun 12, 2024

3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs

Jun 07, 2024

UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control

Mar 06, 2024

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Sep 21, 2023