Niluthpol Chowdhury Mithun

I am an Advanced Computer Scientist at the Center for Vision Technologies, SRI International, in Princeton, USA. I work on solving real-world problems using computer vision and machine learning.

I received my Ph.D. from the University of California, Riverside, under the supervision of Prof. Amit K. Roy-Chowdhury. Previously, I completed my Bachelor's and Master's degrees at Bangladesh University of Engineering and Technology.

I worked at Samsung R&D Institute Bangladesh as a Software Engineer from 2011 to 2014. I spent Summer 2018 at the SRI Center for Vision Technologies and Summer 2017 at the Bosch Research Center, Pittsburgh, as a Research Intern.

Email  /  CV  /  Google Scholar  / LinkedIn      


 Research

I am interested in computer vision, machine learning, and multimedia. My research focuses mainly on weakly supervised learning, multimodal data analysis, semantic scene understanding, and vision-based localization.





 Selected Publications




   Text-based Temporal Localization of Novel Events
   Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K Roy-Chowdhury
   European Conference on Computer Vision (ECCV), 2022




  
   Striking the Right Balance: Recall Loss for Semantic Segmentation
    Junjiao Tian, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Zsolt Kira
   International Conference on Robotics and Automation (ICRA), 2022
   [Code]




  
   GraphMapper: Efficient Visual Navigation by Scene Graph Generation
   Zachary Seymour, Niluthpol Chowdhury Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
   International Conference on Pattern Recognition (ICPR), 2022



   
    SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
    Muhammad Irshad, Niluthpol C Mithun, Zachary Seymour, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
    International Conference on Pattern Recognition (ICPR), 2022



  
   Text-based Localization of Moments in a Video Corpus
    Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K Roy-Chowdhury
   IEEE Transactions on Image Processing (TIP), 2021



 
   Long-Range Augmented Reality with Dynamic Occlusion Rendering
    Mikhail Sizintsev, Niluthpol Chowdhury Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
   IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021




   MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
   Zachary Seymour, Kowshik Thopalli, Niluthpol C Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
   International Conference on Robotics and Automation (ICRA), 2021




   RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
    Niluthpol Chowdhury Mithun, Karan Sikka, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
    ACM International Conference on Multimedia (ACM MM), 2020
   [BEST Paper Candidate]  [GRAL (Ground RGB to Aerial LIDAR) Dataset]




   Weakly Supervised Video Moment Retrieval from Text Queries
    Niluthpol Chowdhury Mithun, Sujoy Paul, Amit K. Roy-Chowdhury
   IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
   [Code]

 



   Joint Embeddings with Multimodal Cues for Video-Text Retrieval
   Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury
   International Journal of Multimedia Information Retrieval (IJMIR), 2019




 
   Construction of Diverse Image Datasets from Web Collections with Limited Labeling
   Niluthpol Chowdhury Mithun, Rameswar Panda, Amit K. Roy-Chowdhury
   IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2019



   Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
    Niluthpol Chowdhury Mithun, Juncheng B Li, Florian Metze, Amit K. Roy-Chowdhury
   ACM International Conference on Multimedia Retrieval (ICMR), 2018
   [Winner of BEST Paper Award] [Code]



  
   Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval
    Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos Papalexakis, Amit K. Roy-Chowdhury
   ACM International Conference on Multimedia (ACM MM), 2018



  
   ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs
   Niluthpol Chowdhury Mithun, Sirajum Munir, Karen Guo, Charles Shelton
    ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), 2018
   [video]



   Learning Long-Term Invariant Features for Vision-based Localization
   Niluthpol Chowdhury Mithun, Cody Simons, Robert Casey, Stefan Hilligardt, Amit K. Roy-Chowdhury
   IEEE Winter Conference on Applications of Computer Vision (WACV), 2018




   Deep Learning Based Identity Verification in Renaissance Portraits
   Akash Gupta, Niluthpol Chowdhury Mithun, Conrad Rudolph, Amit K. Roy-Chowdhury
   IEEE International Conference on Multimedia and Expo (ICME), 2018





    Diversity-aware Multi Video Summarization
      Rameswar Panda, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury
    IEEE Transactions on Image Processing (TIP), 2017
    [Tour20 video summarization dataset]




   Generating Diverse Image Datasets with Limited Labeling
   Niluthpol Chowdhury Mithun, Rameswar Panda, Amit K. Roy-Chowdhury
   ACM International Conference on Multimedia (ACM MM), 2016
   [DivNet dataset]




   Detection and Classification of Vehicles from Video using Multiple Time-Spatial Images
   Niluthpol Chowdhury Mithun, Nafi Ur Rashid, S. M. Mahbubur Rahman
   IEEE Transactions on Intelligent Transportation Systems (TITS), 2012
   [EBVT dataset]



