|
Peijie Qiu
I'm a PhD student at Washington University in St. Louis (WashU), advised by Prof. Aris Sotiras (now at University of Pennsylvania).
My current and past research encompasses VLMs/LLMs/MLLMs, Reinforcement Learning from Human Feedback (RLHF), Generative Models (VAEs, GANs and Diffusion/Flow Models), (Multimodal) Representation Learning, Medical Image Analysis,
and Sequential Modeling. I've also interned at Amazon as an applied scientist (hosted by Visual Search Team).
Email /
Google Scholar /
Github
|
|
News
[02-23-2026] Two papers accepted to CVPR 2026๐.
[05-23-2025] One paper accepted to MICCAI 2025๐.
[05-01-2025] Two papers accepted to ICML 2025๐.
[03-03-2025] I will be an intern at Amazon this summer, hosted by the Visual Search Team.
[12-09-2024] Two papers accepted to AAAI 2025๐.
|
|
|
SafeUDPO: Safeguard Large Language Models With Utility-Calibrated Preference Optimization
Author list is hidden for double-blind review.
[preprint coming soon]
TL;DR: A novel method to safeguard LLMs by calibrating the utility of listwise preference optimization.
|
|
|
ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models
Peijie Qiu, Hariharan Ramshankar, Arnau Ramisa, Renรฉ Vidal, Vamsi Salaka, Rahul Bhagat, Amit Kumar K C
[CVPR 2026]
TL;DR: Accelerating text-to-image generation by retrieving and fusing relevant image features from a large database, enabling one-step generation with diffusion models.
|
|
|
LLaDA-MedV: Exploring Large Language Diffusion Models for Biomedical Image Understanding
Xuanzhao Dong*, Wenhui Zhu*, Xiwen Chen*, Zhipeng Wang*, Peijie Qiu, Shao Tang, Xin Li Yalin Wang (*: co-first authors)
[CVPR 2026; preprint coming soon]
TL;DR: The first dLLM for biomedical image understanding, improving both performance and efficiency compared to auto-regressive LLMs.
|
|
|
DRA-GRPO: Your GRPO Needs to Know Diverse Reasoning Paths for Mathematical Reasoning
Xiwen Chen*, Wenhui Zhu*, Peijie Qiu*, Xuanzhao Dong, Hao Wang, Haiyu Wu, Huayu Li, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi (*: co-first authors)
[arXiv; Submitted to ACL]
TL;DR: A novel method to explore diverse reasoning paths for mathematical reasoning by downweighing the redundant paths, improving the performance of GRPO on MATH dataset by 7.5% (from 50.3% to 57.8%).
|
|
|
Active Source-Free Cross-Domain and Cross-Modality Adaptation for Volumetric Medical Image Segmentation by Image Sensitivity and Organ Heterogeneity Sampling
Jin Yang, Xiaobing Yu, Peijie Qiu, Daniel Marcus, Aristeidis Sotiras
[MICCAI 2025]
TL;DR: An active source-free 3D segmentation method that adapts across domains/modalities by sampling target volumes using sensitivity and heterogeneity.
|
|
|
How Effective Can Dropout Be in Multiple Instance Learning ?
Wenhui Zhu*, Peijie Qiu*, Xiwen Chen*, Zhangsihao Yang, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang (*: co-first authors)
[ICML 2025]
TL;DR: A novel Dropout approach for MIL that significantly improves performance.
|
|
|
Cracking Instance Jigsaw Puzzles: An Alternative to Multiple Instance Learning for Whole Slide Image Analysis
Xiwen Chen*, Peijie Qiu*, Wenhui Zhu*, Hao Wang, Huayu Li, Xuanzhao Dong, Xiaotong Sun, Xiaobing Yu, Yalin Wang, Abolfazl Razi, Aristeidis Sotiras (*: co-first authors)
[ICCV 2025]
TL;DR: An alternative to permuation-invariant MIL for whole-slide image analysis by solving instance-level โjigsawโ puzzles.
|
|
|
AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation
Peijie Qiu, Jin Yang, Sayantan Kumar, Soumyendu Sekhar Ghosh, Aristeidis Sotiras
[Biomed. Signal Processing & Control]
TL;DR: A roadmap to introduce spatially varying components into transformer-based UNets for segmentation.
|
|
|
Multimodal Variational Autoencoder: a Barycentric View
Peijie Qiu, Wenhui Zhu, Sayantan Kumar, Xiwen Chen, Xiaotong Sun, Jin Yang, Abolfazl Razi, Yalin Wang, Aristeidis Sotiras
[AAAI 2025; Oral]
TL;DR: A geometric view of multimodal VAEs that unifies existing probabilistic approaches and provides an information-theoretic view for probability mass allocation.
|
|
|
Context-Aware Optimal Transport Learning for Retinal Fundus Image Enhancement
Vamsi Krishna Vasa*, Peijie Qiu*, Wenhui Zhu, Yujian Xiong, Oana Dumitrascu, Yalin Wang (*: co-first authors)
[WACV 2025]
TL;DR: A novel context-aware optimal transport learning method for image-to-image translation.
|
|
|
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu*, Xiwen Chen*, Peijie Qiu*, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang (*: co-first authors)
[ECCV 2024]
TL;DR: A novel method to explore diverse global representations for MIL-based whole slide image classification, improving the performance by 5.5% compared to the state-of-the-art methods.
|
|
|
QCResUNet: Joint Subject-level and Voxel-level Segmentation Quality Prediction
Peijie Qiu, Satrajit Chakrabarty, Phuc Nguyen, Soumyendu Sekhar Ghosh, Aristeidis Sotiras
[Medical Image Analysis] [MICCAI 2023]
TL;DR: A novel method for predicting the quality of medical image segmentation at both subject and voxel levels, improving the reliability of automated segmentation tools.
|
|
|
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Xiwen Chen*, Peijie Qiu*, Wenhui Zhu*, Huayu Li, Hao Wang, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi (*: co-first authors)
[ICML 2024]
TL;DR: A novel time-aware MIL method for multivariate time series classification that models temporal dependencies and achieves state-of-the-art performance on several benchmarks.
|
|
Multimodal Variational Autoencoder: a Barycentric View. AAAI'25 [Slides]
|
|
A Deep Dive into Diffusion and Flow Models. Amazon Visual Search [Slides]
|
|
From Vision-Language Models to Multimodal Language Models. CIRC [Slides]
|
|
Reviewer
|
CVPR, ICCV, ECCV, WACV, ICLR, AAAI, IJCAI, AISTATS, KDD, MICCAI, ISBI, TMI, MedIA, etc.
|
Design credits to this webpage. Feel free to steal this website.
|
|