Rohit Mujumdar

rohit dot mujumdar at yahoo dot com


πŸ‘¨β€πŸ’» I am a Research Engineer/Scientist in the OCTO and SATG (Office of the CTO and Software Advanced Tech Group) at Intel Corporation, Santa Clara (CA), where I work on research and development of software for Extreme Scale Graph Analytics. I previously worked as a Software Engineer at NCR Coproration. I graduated from Georgia Institute of Technology in Spring 2021 with an MS in Computer Science and a specialization in Machine Learning + Social Computing.

🔬 I collaborated with Prof. Srijan Kumar to develop HawkEye, a reputation system for Birdwatch (now called Community Notes), Twitter's community-driven feature to address misinformation. It was the first research study on Birdwatch and succeeded in influencing the Birdwatch platform. Our paper based on this work was published in ASONAM'21.

🚀 I interned with the Cloud Services Design Analytics group at IBM Research. I filed two patents based on my work and also authored a poster, which won the Best Poster (Honorable Mention) award at INFORMS'20.

πŸ‹ Prior to commencing my MS degree in Fall 2019, I worked as a Data Scientist (NLP Research) at Froot Research, (now acquired by Aarav Solutions). As one of the first few employees of an early-stage AI start-up, I was mentored by the founder Amit Gautam.

πŸ§‘β€πŸŽ“ I earned a Bachelor's degree in Computer Engineering in 2017 from Vishwakarma Institute of Technology, Pune, where I was advised by Dr. Manasi Patwardhan.

🎥 I run a video podcast, 'Talking To The Moon', where I interview people across professions and walks of life and listen to the stories they have to tell. More about the idea behind the vodcast in this article.

πŸ’ƒπŸ» Outside of work, I love to dance! (a few of my dance videos). I am an avid practitioner of Yoga and an ardent promoter of mental health awareness.

📧 Feel free to drop me an email or DM me on Twitter; I love making new friends and am always open to interesting conversations!

E-mail  |  Resume  |  LinkedIn  |  Github  |  Twitter  |  Blog

Interests

My interests lie in Applied Machine Learning and Data Science, in which I am particularly fond of Computational Social Science/Social Computing. Projects involving graph datasets and graph analytics have been a common theme in my journey. I have lately also been interested in Graph Neural Networks and Data Centric AI.

Publications
Recognizing Similar Relationships Within Ontology to Fine Tune Ontology
Neelam Chandolikar, Rishav Raj, Rohit Mujumdar
Keywords: knowledge graphs, natural language processing, semantic search engine, ontology learning, triplet extraction, education technology
ICDMAI

The ontology learning process involves identifying concepts and the relationships between them. Automated ontology learning based on ML/DL techniques identifies these triples but suffers from the problem of duplicate or similar relationships. We propose a solution that identifies similar relationships so that the ontology can be fine-tuned.

paper
Overcoming Language Disparity in Online Content Classification with Multimodal Learning
Gaurav Verma, Rohit Mujumdar, Zijie J. Wang, Munmun De Choudhury, Srijan Kumar
Keywords: social media, multimodal language models, language disparity, language translation
ICWSM 2022

We investigate the disparity between English and non-English language models and show that detection frameworks based on pre-trained large language models like BERT and multilingual-BERT systematically perform better on the English language. We demonstrate the promise of incorporating the information contained in images via multimodal machine learning to address this disparity.

website | paper
HawkEye: A Robust Reputation System for Community-based Counter-Misinformation
Rohit Mujumdar, Srijan Kumar
Keywords: social network analysis, misinformation, graph algorithms, adversarial attack
ASONAM 2021

We investigate the robustness of Birdwatch against adversaries and show that the current Birdwatch system is vulnerable to manipulation attacks. To overcome this vulnerability, we develop HawkEye, a cold-start-aware graph-based recursive algorithm, and show that it is more robust to such attacks.

paper | code | video
A Heuristic Approach To Compute Service Request Resolution Time (Poster)
Rohit Mujumdar, Pawan Chowdhary, Shubhi Asthana
Keywords: time series analysis, operations research, ticket resolution, predictive model
INFORMS 2020, Best Poster Award (Honorable Mention Award)

We use statistical analyses and regression-based techniques to predict the resolution times of incident tickets. We employ techniques like dynamic rolling window, auto-regressive window-flip and artificial data creation to handle data eccentricities.

poster | video
Patents
US20220270019A1 : Ticket-agent matching and agent skillset development
Rohit Mujumdar, Shubhi Asthana, Pawan Chowdhary, Aly Megahed, Bing Zhang
Keywords: operations research, ticket resolution, employee skill management, sentiment analysis, performance review
US20220164744A1 : Demand forecasting of service requests volume
Bing Zhang, Shubhi Asthana, Pawan Chowdhary, Aly Megahed, Rohit Mujumdar, Taiga Nakamura
Keywords: operations research, ticket resolution, human-in-the-loop, time series forecast
Selected Projects
Exploring Fairness in Graph Embeddings
Rohit Mujumdar, Sanjana Garg, Rohit Gajawada
Keywords: graph neural networks, fairness in AI, AI ethics, recommendation systems
(Web Search and Text Mining, Spring 2021. Georgia Tech)

  • Demonstrated bias in graph embeddings (generated for a movie recommendation system) produced by existing techniques such as node2vec and metapath2vec
  • Introduced fairness mitigation methods based on Fairwalk and demonstrated recommendations with less bias
  • report | code
Do Scientific Ideas Originating from more Prestigious Universities Spread Faster?
Rohit Mujumdar, David Kartchner
(Data Science for Epidemiology, Fall 2020. Georgia Tech)
Keywords: epidemiology, microsoft academic graph, natural language processing, epistemology

  • Investigated the imbalance in the spread of ideas across academic research networks caused by differences in academic prestige, using disease spread models adapted from epidemiology
  • Assessed whether idea spread is driven by connectivity amongst the original authors or by the explicit prestige of their institution
  • website | report | code | software
Can Machines Detect if you're a Jerk?
Rohit Mujumdar, Parvathy Sarat, Prathik Kaundinya, Sahith Dambekodi
(Deep Learning, Fall 2020. Georgia Tech)
Keywords: natural language processing, deep learning, ai ethics, reddit, computational social science

  • Used language models to assess whether we can replicate the sentiments shared by Redditors and classify a Redditor's original post according to the verdict declared by the rest of the Redditors
  • Attempted to understand how a machine performs on a task that is entirely subjective but is possibly objective
  • report | code
Conference Paper Acceptance Prediction
Rohit Mujumdar, Rohan Goel, Arthita Ghosh, Shravani Sistla, Neha Pande
(Machine Learning, Spring 2020. Georgia Tech)
Keywords: natural language processing, machine learning, feature engineering, peer read dataset

  • Investigated the role of peripheral features of research papers in their potential acceptability
  • Devised several innovative features such as abstract novelty and complexity, research strength score, title word-cloud, etc.
  • report | code | video
Explainable Content Moderation Using CNNs
Rohit Mujumdar, Shalini Chaudhuri, Sreehari Sreejith, Sushmita Singh
Keywords: computer vision, image classification, content moderation, convolutional neural networks, violence detection, image flagging
(Computer Vision, Fall 2019. Georgia Tech)

  • Built a minimum viable content moderation system to identify and flag regions of images containing violent/gory content
  • Achieved an accuracy of 0.89 by using Transfer Learning with Convolutional Neural Networks (VGG-16)
  • Captured model explainability by using Grad-CAM to generate visual explanations of the salient regions in the images
  • report | code | video
Semantic Search Engine using a Dynamic Ontology
Rohit Mujumdar, Poshraj Sharma, Pranjal Patil, Akanksha Patil, Dr Manasi Patwardhan
Keywords: natural language processing, semantic search, ontology, triplet extraction, entity-relation-entity, phrase2vec, education technology
(Undergrad Capstone Project, 2016-17, VIT Pune)

  • Developed an e-learning platform for government schools by implementing a dynamic science ontology to store triplets (Entity-Relation-Entity) extracted from web-scraped data
  • Devised a similarity-scoring algorithm driven by a Phrase2Vec model to replace similar Relations with a representative Relation
  • report | video | code
    Leadership

    In The Media


    "In the midst of winter, I found there was, within me, an invincible summer." ~ Albert Camus
    Design inspired from here
