Peijie Qiu

I'm a PhD student at Washington University in St. Louis (WashU), advised by Prof. Aris Sotiras (now at University of Pennsylvania). My current and past research encompasses VLMs/LLMs/MLLMs, Reinforcement Learning from Human Feedback (RLHF), Generative Models (VAEs, GANs and Diffusion/Flow Models), (Multimodal) Representation Learning, Medical Image Analysis, and Sequential Modeling. I've also interned at Amazon as an applied scientist (hosted by Visual Search Team).

Email  /  Google Scholar  /  Github

profile photo

News

[02-23-2026] Two papers accepted to CVPR 2026๐ŸŒŸ.
[05-23-2025] One paper accepted to MICCAI 2025๐ŸŒŸ.
[05-01-2025] Two papers accepted to ICML 2025๐ŸŒŸ.
[03-03-2025] I will be an intern at Amazon this summer, hosted by the Visual Search Team.
[12-09-2024] Two papers accepted to AAAI 2025๐ŸŒŸ.

Research

SafeUDPO: Safeguard Large Language Models With Utility-Calibrated Preference Optimization


Author list is hidden for double-blind review.
[preprint coming soon]

TL;DR: A novel method to safeguard LLMs by calibrating the utility of listwise preference optimization.

ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models


Peijie Qiu, Hariharan Ramshankar, Arnau Ramisa, Renรฉ Vidal, Vamsi Salaka, Rahul Bhagat, Amit Kumar K C
[CVPR 2026]

TL;DR: Accelerating text-to-image generation by retrieving and fusing relevant image features from a large database, enabling one-step generation with diffusion models.

LLaDA-MedV: Exploring Large Language Diffusion Models for Biomedical Image Understanding


Xuanzhao Dong*, Wenhui Zhu*, Xiwen Chen*, Zhipeng Wang*, Peijie Qiu, Shao Tang, Xin Li Yalin Wang (*: co-first authors)
[CVPR 2026; preprint coming soon]

TL;DR: The first dLLM for biomedical image understanding, improving both performance and efficiency compared to auto-regressive LLMs.

DRA-GRPO: Your GRPO Needs to Know Diverse Reasoning Paths for Mathematical Reasoning


Xiwen Chen*, Wenhui Zhu*, Peijie Qiu*, Xuanzhao Dong, Hao Wang, Haiyu Wu, Huayu Li, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi (*: co-first authors)
[arXiv; Submitted to ACL]

TL;DR: A novel method to explore diverse reasoning paths for mathematical reasoning by downweighing the redundant paths, improving the performance of GRPO on MATH dataset by 7.5% (from 50.3% to 57.8%).

Active Source-Free Cross-Domain and Cross-Modality Adaptation for Volumetric Medical Image Segmentation by Image Sensitivity and Organ Heterogeneity Sampling


Jin Yang, Xiaobing Yu, Peijie Qiu, Daniel Marcus, Aristeidis Sotiras
[MICCAI 2025]

TL;DR: An active source-free 3D segmentation method that adapts across domains/modalities by sampling target volumes using sensitivity and heterogeneity.

How Effective Can Dropout Be in Multiple Instance Learning ?


Wenhui Zhu*, Peijie Qiu*, Xiwen Chen*, Zhangsihao Yang, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang (*: co-first authors)
[ICML 2025]

TL;DR: A novel Dropout approach for MIL that significantly improves performance.

Cracking Instance Jigsaw Puzzles: An Alternative to Multiple Instance Learning for Whole Slide Image Analysis


Xiwen Chen*, Peijie Qiu*, Wenhui Zhu*, Hao Wang, Huayu Li, Xuanzhao Dong, Xiaotong Sun, Xiaobing Yu, Yalin Wang, Abolfazl Razi, Aristeidis Sotiras (*: co-first authors)
[ICCV 2025]

TL;DR: An alternative to permuation-invariant MIL for whole-slide image analysis by solving instance-level โ€œjigsawโ€ puzzles.

AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation


Peijie Qiu, Jin Yang, Sayantan Kumar, Soumyendu Sekhar Ghosh, Aristeidis Sotiras
[Biomed. Signal Processing & Control]

TL;DR: A roadmap to introduce spatially varying components into transformer-based UNets for segmentation.

Multimodal Variational Autoencoder: a Barycentric View


Peijie Qiu, Wenhui Zhu, Sayantan Kumar, Xiwen Chen, Xiaotong Sun, Jin Yang, Abolfazl Razi, Yalin Wang, Aristeidis Sotiras
[AAAI 2025; Oral]

TL;DR: A geometric view of multimodal VAEs that unifies existing probabilistic approaches and provides an information-theoretic view for probability mass allocation.

Context-Aware Optimal Transport Learning for Retinal Fundus Image Enhancement


Vamsi Krishna Vasa*, Peijie Qiu*, Wenhui Zhu, Yujian Xiong, Oana Dumitrascu, Yalin Wang (*: co-first authors)
[WACV 2025]

TL;DR: A novel context-aware optimal transport learning method for image-to-image translation.

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification


Wenhui Zhu*, Xiwen Chen*, Peijie Qiu*, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang (*: co-first authors)
[ECCV 2024]

TL;DR: A novel method to explore diverse global representations for MIL-based whole slide image classification, improving the performance by 5.5% compared to the state-of-the-art methods.

QCResUNet: Joint Subject-level and Voxel-level Segmentation Quality Prediction


Peijie Qiu, Satrajit Chakrabarty, Phuc Nguyen, Soumyendu Sekhar Ghosh, Aristeidis Sotiras
[Medical Image Analysis] [MICCAI 2023]

TL;DR: A novel method for predicting the quality of medical image segmentation at both subject and voxel levels, improving the reliability of automated segmentation tools.

TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning


Xiwen Chen*, Peijie Qiu*, Wenhui Zhu*, Huayu Li, Hao Wang, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi (*: co-first authors)
[ICML 2024]

TL;DR: A novel time-aware MIL method for multivariate time series classification that models temporal dependencies and achieves state-of-the-art performance on several benchmarks.

Presentations/Talks

Multimodal Variational Autoencoder: a Barycentric View. AAAI'25 [Slides]

A Deep Dive into Diffusion and Flow Models. Amazon Visual Search [Slides]

From Vision-Language Models to Multimodal Language Models. CIRC [Slides]

Reviewer

CVPR, ICCV, ECCV, WACV, ICLR, AAAI, IJCAI, AISTATS, KDD, MICCAI, ISBI, TMI, MedIA, etc.

Design credits to this webpage. Feel free to steal this website.