Mushtari Sadia

About me

I'm Sadia (she/her), a first-year PhD student in the Department of CSE at the University of Michigan, Ann Arbor, and a member of Professor Ang Chen's research group. My research interests lie at the intersection of systems, security, privacy, and machine learning. I previously graduated from the Department of CSE at BUET, then worked as a lecturer in the Department of CSE at BRAC University and as an adjunct lecturer in the Department of CSE at BUET.

In my free time, I love travelling, dancing and binge-watching sitcoms.

My Resume

Work Experience

  1. Graduate Student Research Assistant

    Department of Computer Science and Engineering,

    University of Michigan, Ann Arbor

    August 2024 — Present

    Working with Professor Ang Chen.

  2. Lecturer (Full-time)

    Department of Computer Science and Engineering,

    BRAC University

    June 2023 — July 2024

    Course Instructor:

    (Summer 2023, Fall 2023, Spring 2024) CSE 220: Data Structures

    (Summer 2023, Fall 2023, Spring 2024, Summer 2024) CSE 221: Algorithms

    (Summer 2023, Fall 2023, Summer 2024) CSE 220: Data Structures Sessional

    (Summer 2023, Spring 2024, Summer 2024) CSE 110: Programming Language I Sessional

  3. Adjunct Lecturer

    Department of Computer Science and Engineering,

    Bangladesh University of Engineering and Technology

    November 2023 — March 2024

    Course Instructor:

    CSE391: Embedded Systems and Interfacing

    CSE392: Embedded Systems and Interfacing Sessional

    CSE102: Structured Programming Language Sessional

    CSE412: Simulation and Modeling Sessional

  4. Research Fellow

    Fatima Fellowship

    August 2023 — February 2024

    Worked with Dr. Praneeth Vepakomma (MIT) on a distributed machine learning project, implementing layer parallelization for training deep learning models.

Research Experience

  1. Effectiveness of Transformer-based Language Models in Detecting Advanced Persistent Threats from System Provenance Graphs (2022 - Current)

    Computer Security, Natural Language Processing

    Undergraduate thesis project under Dr. Anindya Iqbal and Dr. Shahrear Iqbal (Research Officer, National Research Council (NRC) Canada). In this work, I co-implemented a framework that builds a provenance graph from raw log data in the DARPA OPTC and DARPA TC E3 datasets, generates event sequences from that graph, and transforms the data into a format suitable for transformer-based language models such as BERT, RoBERTa, and GPT. My personal contributions were building the PostgreSQL database from the raw log datasets, designing SQL queries to generate the provenance graph, co-writing the code that preprocesses the graph data to extract traces with relevant attributes, and building the experiments with various pre-trained LLMs. Our framework achieved state-of-the-art performance in detecting APT attacks. This work is currently under review for publication.


    Read the preprint: [arXiv]
  2. Advancing Parallelization in Deep Learning Training: A Novel HSIC-Based Approach and Its Comparative Performance Against Traditional Federated Models (2023 - Current)

    Federated Learning, Privacy, Optimization

    Working with Dr. Praneeth Vepakomma (MIT Camera Culture Group) on a distributed machine learning project as part of the Fatima Fellowship Research Program. We developed a method for parallelizing forward propagation in neural network training, utilizing the HSIC objective function to eliminate the need for backpropagation while maintaining the same level of accuracy. We also found that, in some instances, using slightly outdated local updates can significantly reduce communication costs without compromising accuracy. We are currently reviewing the manuscript as we prepare it for publication.

  3. Development of Flood Forecasting System for Bangladesh-India Using Different Machine Learning Techniques (2020)

    Machine Learning

    In this study, I preprocessed datasets of weather parameters and applied five machine learning algorithms: exponent back propagation neural network (EBPNN), multilayer perceptron (MLP), support vector regression (SVR), decision tree regression (DTR), and extreme gradient boosting (XGBoost). These were used to develop a total of 180 independent models, each based on a different combination of input time lags and forecast lead time. The models were developed for the Someshwari-Kangsa sub-watershed of Bangladesh's North Central hydrological region, which has a drainage area of 5772 km².


    EGU General Assembly 2021: [Poster Presentation]
  4. Dengue Forecasting System (2020-2022)

    Machine Learning

    Worked with Dr. ABM Alim Al Islam, Ramisa Alam, Mashiat Mustaq and Tahiea Taz. In this project, we built a dengue forecasting system, based on a time series forecasting model, that predicts the number of dengue cases in any given region from the recent cases in that region and the state of different weather parameters. My contributions to the project were preprocessing the datasets and developing the models using MS Azure Services.

    Presented the idea and won the 1st Runner Up position in the Microsoft Virtual Hackathon 2022.

    [Github]

Education

  1. PhD in Computer Science and Engineering

    University of Michigan, Ann Arbor

    August 2024 - Present
  2. B.Sc. in Computer Science and Engineering

    Bangladesh University of Engineering and Technology

    April 2018 - May 2023

    CGPA: 3.84/4.00

    Notable Courses:

    CSE471: Machine Learning
    CSE405: Computer Security
    CSE453: High Performance Database Systems
    CSE309: Compiler Design
    CSE321: Computer Networks
    CSE313: Operating Systems
    CSE463: Introduction to Bioinformatics
    CSE409: Computer Graphics
    CSE411: Simulation and Modeling
    MATH247: Linear Algebra
    MATH245: Statistics and Probability
    CSE305: Computer Architecture

  3. Higher Secondary School Certificate (HSC)

    Engineering University School and College

    2017

    GPA: 5.00/5.00

    - Board General Scholarship

  4. Secondary School Certificate (SSC)

    Engineering University School and College

    2017

    GPA: 5.00/5.00

Technical Skills

  1. Programming Languages

    C/C++, x86 Assembly, Bison/Flex, Python, Java, JavaScript, Bash, MySQL

  2. Frameworks

    Docker, PyTorch, NS3, xv6, Django REST, ReactJS, Git, Oracle DBMS, LaTeX, Wireshark

  3. Libraries

    scikit-learn, pandas, Matplotlib, seaborn

Projects

Achievements

Competitions

  1. Microsoft Virtual Hackathon

    1st Runner Up (2022) (Among 700 teams around the world)

    We built an AI-based dengue forecast system using MS Azure Services that predicts the number of dengue cases in any given region from the recent cases in that region and the state of different weather parameters.

    Code: https://github.com/Mushtari-Sadia/Predictado-A-Dengue-Forecasting-Dashboard

    Image: The dashboard built with MS PowerBI [competition link]
  2. HerWILL Datathon (2022)

    Champion (Among 110 female contestants around the world)

    A machine learning based forecasting system for predicting taxi demand in a city.

    Code: https://github.com/ramisa2108/Taxi-Demand-Forecasting-System

    Image: What an honor and a privilege it is to have received this certificate from two of the most prominent scientists and educators of our lifetime, Dr. Pascal Van Hentenryck (https://en.wikipedia.org/wiki/Pascal_Van_Hentenryck) and Dr. M. Zafar Iqbal (https://en.wikipedia.org/wiki/Muhammed_Zafar_Iqbal). [article link]
  3. UNDP Women’s Digital Innovation Hackathon (2021)

    2nd Runner Up (Among 30 teams)

    Built an AI-based Dengue Monitor & Control System.

    Image: The jury listens to the final pitch about improving Dengue fever control [news article link]
  4. Dhaka-AI 2020

    Participant

    A month-long competition on vehicle detection and classification from traffic images using object detection models such as YOLOv5 and EfficientDet.

    Code: https://github.com/Mushtari-Sadia/Vehicle-Detection-with-State-of-the-Art-Deep-Learning-Models

  5. Ada Lovelace Datathon 2021

    Participant

    A competition on ML-based data analysis of COVID-19 mental health.

    Code: https://www.kaggle.com/code/mushtarisadia/team5-thepowerpuffcoders-ensemble-rf-log-reg/notebook

  6. NLP Hackathon 2023

    Participant

    A competition on named entity recognition for a Bangla-language dataset.

    Code: https://www.kaggle.com/code/mushtarisadia/nlp-hackathon-2023/notebook

  7. Robi Datathon 2.0

    Participant

    A competition on data analysis using ML on data collected by the Robi (mobile SIM) company.

    Code: https://www.kaggle.com/code/mushtarisadia/robi-datahon-2-0-final-e646e6/notebook

  8. Kaggledays competition

    Participant

    The aim of this competition was to use ML algorithms to predict the unit load power generation of a steam turbine from the given factors in specific working environments.

    Code: https://www.kaggle.com/code/mushtarisadia/fastai-power-lgb/notebook

Honors & Awards

  1. Fatima Fellowship

    August 2023-Present

    [Visit Their Website]

    The Fatima Al-Fihri Predoctoral Fellowship is a free 9-month program in which students from around the world, who are planning on applying to computer science or machine learning PhD programs in the United States or Europe, work with current PhD students or researchers on research projects to gain research experience and strengthen their applications.

  2. GHC Scholarship

    2022

    A scholarship granted to only a few selected candidates based on merit. Received the privilege of attending the Grace Hopper Celebration of Women in Computing, a conference that brings together women in tech from around the world.

  3. Dean’s List Scholarship

    2018-2022

    This scholarship is granted to undergraduate students for their academic excellence.

  4. Dhaka Board General Scholarship (HSC)

    2017

Leadership Experience

  1. President

    Bangladeshi Women In Computer Science & Engineering (BWCSE)

    April 2022 - May 2023

    [Visit the Website]

    The mission of BWCSE is to empower Bangladeshi women by fostering academic, social, and professional growth in the field of computer science and engineering. As president, my responsibilities included coordinating and organizing competitive programming contests, workshops, and seminars across CS fields.

  2. Batch Representative

    Bangladeshi Women In Computer Science & Engineering (BWCSE)

    April 2021 - April 2022
  3. Organizer

    BUET CSE FEST 2022

    June 2022 - August 2022

    [Visit our Facebook Page]

    Coordinated several inter-university competitions, including a hackathon, a programming contest, a deep learning competition, and an AI contest, as well as cultural programs, on behalf of the graduating class.

  4. Student Tutor

    December 2017 - April 2023

    Tutored students at the kindergarten, primary school, middle school, high school, and undergraduate levels during my undergraduate years.