Name: GRAIL-V at CVPR 2026
Start: 2026-06-03T07:30:00-06:00
End: 2026-06-03T12:30:00-06:00
Location: CVPR 2026

Accepted papers

GRAIL-V 2026 Papers

Accepted papers for GRAIL-V at CVPR 2026 are listed below. Links open the corresponding OpenReview pages and, where available, the official CVF Open Access proceedings pages.

Outstanding Paper · $1,500

HTEF: Holistic Brand-Theme Alignment Scoring as a Catalog Gate for Grounded Conversational Recommendation

Best Paper · $2,000

SHOE - Semantic HOI Open-vocabulary Evaluation metric

#4 Long paper

CompAgent: An Agentic Framework for Visual Compliance Verification

Rahul Ghosh, Baishali Chaudhury, Hari Prasanna Das, Meghana Ashok, Ryan Razkenari, Long Chen, Sungmin Hong, Chun-Hao Liu

OpenReview CVPR Proceedings

#5 Long paper

A Sanity Check on Composed Image Retrieval

Yikun Liu, Jiangchao Yao, Weidi Xie, Yanfeng Wang

OpenReview CVPR Proceedings

#7 Long paper

Emotional Vocabulary as Semantic Grounding: How Language Register Affects Diffusion Efficiency in Video Generation

Scott Boudreaux

OpenReview CVPR Proceedings

#9 Long paper

EFSA: Episodic Few-Shot Adaptation for Text-to-Image Retrieval

Muhammad Huzaifa, Yova Kementchedjhieva

OpenReview CVPR Proceedings

#10 Long paper

ViSS-R1: Self-Supervised Reinforcement Video Reasoning

Bo Fang, YuXin Song, Haoyuan Sun, Xinyao Zhang, Qiangqiang Wu, Wenhao Wu, Antoni B. Chan

OpenReview CVPR Proceedings

#11 Long paper

HIVE: Query, Hypothesize, Verify — A LLM Framework for Multimodal Reasoning-Intensive Retrieval

Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mohamed Mahmoud, Mostafa Farouk Senussi, Abdelrahman Abdallah, Hyun Soo Kang

OpenReview CVPR Proceedings

#12 Non-archival submission

MentalBlackboard: Evaluating Spatial Visualization via Mathematical Transformations

Nilay Yilmaz, Maitreya Patel, Naga Sai Abhiram Kusumba, Yixuan He, Yezhou Yang

OpenReview

#13 Long paper

RePlan-Bot: Multi-Level Replanning for Embodied Instruction Following

Xicheng Gong, Guozheng Sun, Peiran Xu, Yadong Mu

OpenReview CVPR Proceedings

#20 Long paper

Towards Context-Aware Image Anonymization with Multi-Agent Reasoning

Robert Aufschläger, Jakob Folz, Gautam Savaliya, Manjitha D Vidanalage, Michael Heigl, Martin Schramm

OpenReview CVPR Proceedings

#21 Long paper

CoCoA-DVC: Consistency and Concept Aware Training for Dense Video Captioning

Jay Nitin Paranjape, Yue Guo, Sankar Venkataraman, Vishal M. Patel, Nataraj Jammalamadaka

OpenReview CVPR Proceedings

#23 Long paper

DualProc: Dual-Process Prompting Reduces Confident Errors in Vision-Language Models for Grounded Retrieval and Agentic Pipelines

Aayam Bansal, Ishaan Gangwani

OpenReview CVPR Proceedings

#24 Long paper

CropVLM: Learning to Zoom for Fine-Grained Vision-Language Perception

Miguel Carvalho, Helder Dias, Bruno Martins

OpenReview CVPR Proceedings

#27 Long paper

BRIDGE: Multimodal-to-Text Retrieval via Reinforcement-Learned Query Alignment

Mohamed Darwish Mounis, Mohamed Mahmoud, Shaimaa Sedek, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Abdelrahman Abdallah, Hyun Soo Kang

OpenReview CVPR Proceedings

#29 Long paper

Memory in Multimodal AI Agents: Hardware-Software Co-Design for KV Caches, Attention IO, and Retrieval Stores

Shubham Khandelwal

OpenReview

#30 Short paper

ChatUMM: Robust Context Tracking for Conversational Interleaved Generation

Wenxun Dai, Zhiyuan Zhao, Yule Zhong, Yiji Cheng, Jian-Wei Zhang, Linqing Wang, Shiyi Zhang, Yunlong Lin, Runze He, Fellix Song, Wayne Zhuang, Yong Liu, Haoji Zhang, Yansong Tang, Chunyu Wang

OpenReview CVPR Proceedings

#31 Long paper

Gaze-Regularized Vision-Language-Action Models for Robotic Manipulation

Anupam Pani, Yanchao Yang

OpenReview CVPR Proceedings

#33 Non-archival submission

WIST: Web-Grounded Iterative Self-Play Tree for Domain-Targeted Reasoning Improvement

Fangyuan Li, Pengfei Li, Shijie Wang, Junqi Gao, Jianxing Liu, Biqing Qi, Yuqiang Li

OpenReview

#35 Non-archival submission

Investigating VLM Hallucination from a Cognitive Psychology Perspective: A First Step Toward Interpretation with Intriguing Observations

Xiangrui Liu, Man Luo, Agneet Chatterjee, Hua Wei, Chitta Baral, Yezhou Yang

OpenReview CVPR Proceedings

#36 Long paper Outstanding Paper · $1,500

HTEF: Holistic Brand-Theme Alignment Scoring as a Catalog Gate for Grounded Conversational Recommendation

Md Mahmudur Rahman, Dhruv Garg, Rishabh Rathod, Sanket Bindle

OpenReview CVPR Proceedings

#40 Long paper

The Race between Agentic AI Capabilities and Data Quality Control in Online Surveys

Sourav Panda, Hillmer Chona, Rupak Kumar Das, Shreyash Kale, Shikha Soneji, Jonathan Dodge

OpenReview

#43 Non-archival submission

Evaluating Reasoning Fidelity in Visual Text Generation

Jiajun Hong, Jiawei Zhou

OpenReview

#46 Non-archival submission

Lightweight and Production-Ready PDF Visual Element Parsing

Meizhu Liu, Yassi Abbasi, Matthew Rowe, M. Avendi, Paul Li

OpenReview CVPR Proceedings

#51 Non-archival submission

M3Grounder: Mask-Based Multi-Span and Multi-Granular Grounding for Document QA

Venkata Kesav Venna, Sai Madhusudan Gunda, Jyothi Swaroopa Jinka, Hrithik Sagar Rachakonda, Anirudh Srinivasan, Ravi Kiran Sarvadevabhatla

OpenReview

#52 Long paper Best Paper · $2,000

SHOE - Semantic HOI Open-vocabulary Evaluation metric

Maja Noack, Qinqian Lei, Taipeng Tian, Bihan Dong, Robby T. Tan, Yixin Chen, John Young, Saijun Zhang, Bo Wang

OpenReview CVPR Proceedings

#53 Long paper

RAGENT: Robust Optimization for Grounded Vision-Language Retrieval

Kathy Wu, Sarthak Srivastava

OpenReview CVPR Proceedings

#55 Short paper

Learning to Mix Flat and Curved Representations for Vision-Language Retrieval

Kathy Wu, Sarthak Srivastava

OpenReview

#56 Long paper

Neural-Symbolic Intention Refinement with User Feedback for Text-to-Image Retrieval

Bai Yu, Lei Zhang, Xiaoyan Hu, Feng Zhu, Rui Zhao

OpenReview CVPR Proceedings

#59 Long paper

Knowledge or Action? Automation Boundary Prediction with Intent Discovery and Knowledge Use-Case Enablement for Agentic Enterprise Support

Kumar Mayank, Ipseeta Sahu, Sajeetha Jaganathan

OpenReview CVPR Proceedings

#61 Long paper

Negation Matters: Training-Free Negation-Aware Image Retrieval

Aashish Pokhrel, Shivanand Venkanna Sheshappanavar

OpenReview CVPR Proceedings

#63 Long paper

Seeing without Looking: Do Vision-Language Benchmarks Really Test Vision?

Zixuan Lan, Luzhe Sun, Matthew Walter, Jiawei Zhou

OpenReview CVPR Proceedings

#64 Long paper

CMAG: Concept-Scaffolded Retrieval for Marketplace Avatar Generation

Rajeev Goel, Jason Ding, Phani Harish Wajjala, Pavan K. Turaga, Tejaswi Gowda, Krishna C. Garikipati

OpenReview CVPR Proceedings

#65 Long paper

A Multi-Agent Framework for Grounding Medical AI in Expert Clinical Knowledge under Domain Shift

Midhat Urooj, Ayan Banerjee, Sandeep Gupta

OpenReview CVPR Proceedings

#68 Long paper

SlotVTG: Object-Centric Adapter for Generalizable Video Temporal Grounding

Jiwook Han, Geo Ahn, Youngrae Kim, Jinwoo Choi

OpenReview CVPR Proceedings

#70 Long paper

Towards Robust Zero-Shot Video Temporal Grounding

Nutthadech Banditakkarakul, Bo Chen, Stephen Gould

OpenReview CVPR Proceedings

#71 Long paper

AgenticRAG-Driven Floorplan Parsing for Assistive Indoor Navigation for Blind and Low-Vision Users

Aydin Ayanzadeh, Tim Oates

OpenReview CVPR Proceedings

#72 Long paper

EVICT: Evidence-Sufficiency Verification via Counterfactual Dropout for Visually-Grounded Selective Question Answering

Varun Kotte

OpenReview CVPR Proceedings

#73 Long paper

CALIBRA: Calibration-Aware Multi-Agent Verification for Contactless Physiological Monitoring

Shadman Sakib, Gaurav Shinde, Nirmalya Roy

OpenReview CVPR Proceedings
