Publications

(2024). Movie Gen: A Cast of Media Foundation Models. In arXiv:2410.13720.

(2024). Imagine yourself: Tuning-Free Personalized Image Generation. In arXiv:2409.13346.

(2023). Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression. In ECCV 2024.

(2023). Context Diffusion: In-Context Aware Image Generation. In ECCV 2024.

(2023). GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation. In CVPR 2024.

(2023). Gen2Det: Generate to Detect. In CVPRW 2024.

(2023). Unaligned Video-Text Pre-training using Iterative Alignment. Under Review.

(2022). FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning. In EMNLP 2022.

(2022). CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval. In KDD 2022.

(2021). Large-Scale Attribute-Object Compositions. In arXiv:2105.11373.
