Niluthpol C. Mithun

	Niluthpol Chowdhury Mithun
	I am a Senior Computer Vision Scientist at the Center for Vision Technologies, SRI International in Princeton, USA. I work on solving real-world problems using computer vision and machine learning.
	I received my Ph.D. from the University of California, Riverside, under the supervision of Prof. Amit K. Roy-Chowdhury. Previously, I completed my Bachelor's and Master's degrees at Bangladesh University of Engineering and Technology. I worked at Samsung R&D Institute Bangladesh as a Software Engineer from 2011 to 2014. I spent Summer 2018 at SRI Center for Vision Technologies and Summer 2017 at Bosch Research Center, Pittsburgh, as a Research Intern.
	Email / CV / Google Scholar / LinkedIn

Research

I am interested in computer vision, machine learning, and multimedia. My research focuses mainly on weakly supervised learning, semantic scene understanding, vision-based localization, and multimodal data analysis.

News

Paper on Domain Adaptation for Semantic Segmentation accepted at WACV 2024.
Paper on Source Free Domain Adaptation accepted at CVPR 2023.
Paper on Cross-View Visual Geo-Localization accepted at IEEE VR 2023.
Paper on Text-based Temporal Localization of Novel Events accepted at ECCV 2022.
Paper on a New Loss Function for Imbalanced Semantic Segmentation accepted at ICRA 2022.
Two papers on Learning-based Visual Navigation accepted at ICPR 2022.
Paper on GPS-Denied Navigation and Mapping accepted at WACV 2022.
Paper on Text-Based Video Corpus Moment Retrieval accepted at IEEE TIP 2021.
Paper on Long-Range Augmented Reality accepted at IEEE Transactions on Visualization and Computer Graphics. The paper was presented at ISMAR 2021.
Paper on Semantic Map Attention for Efficient Visual Navigation accepted at ICRA 2021.
Our paper on Cross-Modal Visual Localization is one of the four Best Paper Candidates at ACM MM 2020!

Selected Publications

   Unsupervised Domain Adaptation for Semantic Segmentation with Pseudo Label Self-Refinement
   X. Zhao, Niluthpol Chowdhury Mithun, A. Rajvanshi, H. Chiu, S. Samarasekera
   IEEE Winter Conference on Applications of Computer Vision (WACV), 2024

   C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation
   N. Karim, Niluthpol Chowdhury Mithun, A. Rajvanshi, H. Chiu, S. Samarasekera, N. Rahnavard
   IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

   Cross-View Visual Geo-Localization for Outdoor Augmented Reality
   Niluthpol Chowdhury Mithun, K. Minhas, H. Chiu, T. Oskiper, M. Sizintsev, S. Samarasekera, R. Kumar
   IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR), 2023

   Text-based Temporal Localization of Novel Events
   Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K Roy-Chowdhury
   European Conference on Computer Vision (ECCV), 2022

   Striking the Right Balance: Recall Loss for Semantic Segmentation
    Junjiao Tian, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Zsolt Kira
   International Conference on Robotics and Automation (ICRA), 2022
   [Code]

    SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
    M. Irshad, Niluthpol Chowdhury Mithun, Z. Seymour, H. Chiu, S. Samarasekera, R. Kumar
    International Conference on Pattern Recognition (ICPR), 2022

   Text-based Localization of Moments in a Video Corpus
    Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K Roy-Chowdhury
   IEEE Transactions on Image Processing (TIP), 2021

   Long-Range Augmented Reality with Dynamic Occlusion Rendering
    Mikhail Sizintsev, Niluthpol Chowdhury Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
   IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021

   MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
   Zachary Seymour, Kowshik Thopalli, Niluthpol C Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
   International Conference on Robotics and Automation (ICRA), 2021

   RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
    Niluthpol Chowdhury Mithun, Karan Sikka, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
    ACM International Conference on Multimedia (ACM MM), 2020
   [BEST Paper Candidate] [GRAL (Ground RGB to Aerial LIDAR) Dataset]

   Weakly Supervised Video Moment Retrieval from Text Queries
    Niluthpol Chowdhury Mithun, Sujoy Paul, Amit K. Roy-Chowdhury
   IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
   [Code]

   Joint Embeddings with Multimodal Cues for Video-Text Retrieval
   Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury
   International Journal of Multimedia Information Retrieval (IJMIR), 2019

   Construction of Diverse Image Datasets from Web Collections with Limited Labeling
   Niluthpol Chowdhury Mithun, Rameswar Panda, Amit K. Roy-Chowdhury
   IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2019

   Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
    Niluthpol Chowdhury Mithun, Juncheng B Li, Florian Metze, Amit K. Roy-Chowdhury
   ACM Int. Conference on Multimedia Retrieval (ICMR), 2018
   [Winner of BEST Paper Award] [Code]

   Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval
    Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos Papalexakis, Amit K. Roy-Chowdhury
   ACM International Conference on Multimedia (ACM MM), 2018

   ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs
   Niluthpol Chowdhury Mithun, Sirajum Munir, Karen Guo, Charles Shelton
    ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), 2018
   [video]

   Learning Long-Term Invariant Features for Vision-based Localization
   Niluthpol Chowdhury Mithun, Cody Simons, Robert Casey, Stefan Hilligardt, Amit K. Roy-Chowdhury
   IEEE Winter Conference on Applications of Computer Vision (WACV), 2018

   Diversity-aware Multi Video Summarization
   Rameswar Panda, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury
   IEEE Transactions on Image Processing (TIP), 2017
   [Tour20 video summarization dataset]

   Generating Diverse Image Datasets with Limited Labeling
   Niluthpol Chowdhury Mithun, Rameswar Panda, Amit K. Roy-Chowdhury
   ACM Multimedia Conference (ACM MM), 2016
   [DivNet dataset]

   Detection and Classification of Vehicles from Video using Multiple Time-Spatial Images
   Niluthpol Chowdhury Mithun, Nafi Ur Rashid, S. M. Mahbubur Rahman
   IEEE Transactions on Intelligent Transportation Systems (TITS), 2012
   [EBVT dataset]