Pritish Sahu

Advanced Computer Scientist
SRI International, Princeton, USA

Email: pritish.sahu@gmail.com
Phone: +1 732-485-2582
Location: 304 White Pine Cir, Lawrenceville, NJ 08648

Research Interests

I focus on developing robust, adaptive models that address key limitations of LLMs and LVLMs, such as bias, hallucinations, and limited reasoning. My work enhances multimodal reasoning and decision-making in noisy, incomplete, and domain-specific contexts. I tackle challenges like task-specific overfitting and limited transferability through visual reasoning, domain adaptation, and disentangled representations.

Publications

2024
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification
Sahu, P., Sikka, K., & Divakaran, A.
EMNLP
2022
Unpacking large language models with conceptual consistency
Sahu, P., Cogswell, M., Gong, Y., & Divakaran, A.
arXiv preprint arXiv:2209.15093
Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark
Sahu, P., Sikka, K., & Divakaran, A.
Winter Conference on Applications of Computer Vision (WACV)
SAViR-T: Spatially attentive visual reasoning with transformers
Sahu, P., Basioti, K., & Pavlovic, V.
Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), pp. 460-476
DAReN: A Collaborative Approach Towards Visual Reasoning And Disentangling
Sahu, P., Basioti, K., & Pavlovic, V.
International Conference on Pattern Recognition (ICPR), pp. 4448-4455
2021
Comprehension Based Question Answering using Bloom's Taxonomy
Sahu, P., Cogswell, M., Divakaran, A., & Rutherford-Quach, S.
Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), pp. 20-28
Towards solving multimodal comprehension
Sahu, P., Sikka, K., & Divakaran, A.
arXiv preprint
2020
Zero-shot learning with knowledge enhanced visual semantic embeddings
Sikka, K., Huang, J., Silberfarb, A., Nayak, P., Rohrer, L., Sahu, P., ... & Rohwer, R.
arXiv preprint arXiv:2011.10889
Unsupervised multi-target domain adaptation: An information theoretic approach
Gholami, B., Sahu, P., Rudovic, O., Bousmalis, K., & Pavlovic, V.
IEEE Transactions on Image Processing, 29, 3993-4002
2019
Relevance factor VAE: Learning and identifying disentangled factors
Kim, M., Wang, Y., Sahu, P., & Pavlovic, V.
arXiv preprint arXiv:1902.01568
Task-discriminative domain alignment for unsupervised domain adaptation
Gholami, B., Sahu, P., Kim, M., & Pavlovic, V.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops
Bayes-Factor-VAE: Hierarchical Bayesian deep auto-encoder models for factor disentanglement
Kim, M., Wang, Y., Sahu, P., & Pavlovic, V.
Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2979-2987
Unsupervised visual domain adaptation: A deep max-margin Gaussian process approach
Kim, M., Sahu, P., Gholami, B., & Pavlovic, V.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4380-4390
2016
Filling in the blanks: Reconstructing microscopic crowd motion from multiple disparate noisy sensors
Yoon, S., Kapadia, M., Sahu, P., & Pavlovic, V.
IEEE Winter Applications of Computer Vision Workshops (WACVW), pp. 1-9

Patents

Adapting a language model for multimodal multi-task learning
Sikka, K., Cogswell, M., Sahu, P., Ye, M., Rahman, A., Sridhar, R. and Divakaran, A.
SRI International Inc
U.S. Patent Application 18/619,916 (2024)
Method and system for determining a measure of conceptual consistency in large language models
Cogswell, M., Divakaran, A., Gong, Y. and Sahu, P.
SRI International Inc
U.S. Patent Application 18/541,035 (2024)
System and method for content comprehension and response
Divakaran, A., Sikka, K., Yao, Y., Gong, Y., Nunn, S., Sahu, P., Cogswell, M.A., Hostetler, J. and Rutherford-Quach, S.
SRI International Inc
U.S. Patent 11,934,793 (2024)
System and method for comprehension based question answering using taxonomy
Divakaran, A., Cogswell, M.A. and Sahu, P.
SRI International Inc
U.S. Patent Application 17/869,589 (2023)

Academic Theses

Unlocking Visual Reasoning: Exploring Representations for Enhanced Problem-Solving
Ph.D. in Computer Science (2017-2024)
Rutgers University, School of Graduate Studies
Advisor: Dr. Vladimir Pavlovic
Cube Maze
M.S. in Computer Science (2015-2017)
Rutgers University, School of Graduate Studies
Advisor: Dr. James Abello
Study of Approaches to Remove Show-through and Bleed-through in Document Images
B.Tech. in Computer Science & Engineering (2007-2011)
National Institute of Technology, Rourkela, India
Advisor: Dr. Pankaj Sa

Conference/Journal Reviewer

  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • IEEE Transactions on Intelligent Systems
  • IEEE Transactions on Multimedia
  • Machine Vision and Applications
  • Major Conferences:
    • European Conference on Computer Vision (ECCV)
    • International Conference on Computer Vision (ICCV)
    • Computer Vision and Pattern Recognition (CVPR)
    • Winter Conference on Applications of Computer Vision (WACV)
    • Asian Conference on Computer Vision (ACCV)
    • ACM Multimedia Conference (ACMMM)
    • International Conference on Pattern Recognition (ICPR)

Technical Talks

  • SRI International, Princeton, USA, 2023

Teaching Experience

  • 2023, Principles of Information and Data Management, Teaching Assistant, Rutgers University
  • 2021-2022, Machine Learning, Teaching Assistant, Rutgers University
  • 2019, Introduction to Artificial Intelligence, Teaching Assistant, Rutgers University
  • 2018-2019, Discrete Structures, Teaching Assistant, Rutgers University
  • 2018, Data Interaction and Visual Analytics, Teaching Assistant, Rutgers University
  • 2017, Data Structures & Algorithms, Teaching Assistant, Rutgers University
  • 2016, Introduction to Computer Science, Teaching Assistant, Rutgers University

Honors and Awards

  • Piero Zamperoni Best Student Paper Award presented at ICPR 2022 for “DAReN: A Collaborative Approach Towards Visual Reasoning And Disentangling”.
    Featured in the IAPR newsletter, Volume 44, Number 4, October 2022 edition.
  • Received the “Outstanding Programming Application Award” and the “Outstanding Project Award” from the Computer Science Department at Rutgers University.
  • Samsung Best Project Award for successfully implementing Full HD and Smart TV features on Samsung TVs.

Press Coverage

  • SRI featured Pelican in an interview about efforts to enhance the trustworthiness of generative AI systems, discussing our research on making AI behavior more reliable and aligned with human values.