Jackie Giang Vo

I am an enthusiastic graduate of Carnegie Mellon University with a Master's degree in Computational Biology, 3 years of industry and academic work experience in molecular biology and biochemistry, and an additional 3 years of research experience. I am interested in Machine Learning/AI in biomedical fields, especially drug discovery and medical diagnosis and treatment, as well as bioinformatics and the “song” of sequencing.

Beyond my academic pursuits, I am an avid athlete who finds immense joy and balance in leading an active lifestyle, continuously striving to challenge and improve myself physically. I am currently training for an upcoming full marathon and triathlon. I am also an amateur video creator who enjoys the art of filmmaking and editing.

Education

Carnegie Mellon University

Master's in Computational Biology

Pittsburgh, PA

St. John’s University

BS in Biology, Minor in Chemistry

Jamaica, NY

Experience

Computational Biologist Intern

Predictive Oncology Inc.

Pittsburgh, PA

May 2023 - Aug 2023

During my 12-week internship, I delved extensively into bioinformatics, software development, and machine learning, collaborating on projects that honed my skills in teamwork, communication, and efficient work management. This comprehensive internship experience, coupled with invaluable mentorship, propelled me beyond my comfort zone both personally and professionally, equipping me with an array of skills beyond my expectations.

Led a variant calling project using the Illumina DRAGEN pipeline and analyzed data from tumor patients, contributing to the enhancement of PEDAL, an ML/AI-based drug response prediction platform.
Contributed to the development of the Drug Combination Pipeline, leveraging experimental data encompassing cell lines, drug combinations, doses, and empirical responses to predict cell line responses to specific drug combinations.
Supported the analysis of stained tumor histopathology images using AWS.
Utilized Confluence for effective task logging and team communication, streamlining collaborative efforts.

Research Assistant

University of Pittsburgh

Pittsburgh, PA

Jan 2023 - May 2023

Dr. David Koes’ Lab focuses on developing computational machine learning methods for drug discovery, involving protein structures and docking mechanisms.

Developed a Next Generation CrossDock pipeline between proteins and ligands using PySpark and bioinformatics packages such as DeeplyTough.
Enhanced and debugged bioinformatics packages, ensuring optimal performance and reliability in research workflows.

Research Assistant

Memorial Sloan Kettering Cancer Center

New York, NY

Mar 2020 - Jul 2022

The Center for Molecular Oncology (CMO) Innovation Laboratory and Michael Berger’s Laboratory focus on novel computational and experimental techniques to characterize the spectrum of genetic mutations in human tumors in order to identify biomarkers of cancer progression and drug response. Additionally, the Innovation Laboratory develops assays and platforms for the molecular profiling of tumors and associated computational pipelines and evaluates emerging technologies for potential clinical use.

Devised and fine-tuned an innovative, automation-compatible dual DNA and RNA extraction method from FFPE samples, streamlining processes effectively.
Formulated comprehensive Standard Operating Procedures (SOPs) and collaborated closely with the clinical research department, facilitating the transition from manual methodologies to automated platforms.
Collaborated with a team of three and with computational biologists to analyze DNA and RNA sequencing data, ensuring thorough and accurate insights.
Conducted extensive research and experiments aimed at refining miRNA extraction methodologies from serum and plasma, contributing to ongoing advancements in the field.

SURF Fellow

SUNY Upstate Medical University

Syracuse, NY

Jun 2019 - Aug 2019

I worked in Dr. Bruce Knutson's laboratory for my summer fellowship. The Knutson Lab focuses on the molecular biology and biochemistry of RNA polymerase I transcription.

Used computational tools such as PatchDock and PyMOL, together with MS-XL data, to create an in silico model of the RNA Polymerase I PIC (Pre-Initiation Complex).
Successfully expressed Core Factors and TATA Binding Protein (TBP) utilizing different epitope tags. The proteins were purified via a meticulous series of column purification methods, including affinity chromatography, size exclusion chromatography, and AKTA purification.
Refined and executed in vitro Fe-BABE protein hydroxyl radical cleavage assays to pinpoint the precise location of TBP within the Pre-Initiation Complex of RNA Polymerase I.

Technical Skills

Python, R, Golang, Bash

Programming Languages

MySQL

Database Management

AWS, Spark

Cloud-based Technologies

Confluence, Notion

Task Management

PyTorch

Machine Learning Tools

Git

Version Control

Projects

To impute or not to impute

Carnegie Mellon University

Compared several data imputation methods for active learning and their drawbacks in handling missing data. Three methods were implemented and compared: (1) feature selection imputation, (2) imputation and sample query using MICE, and (3) feature and sample importance imputation, a proposed method.

Finding MIMO

Carnegie Mellon University

Implemented the Multiple Input Multiple Output (MIMO) neural network architecture on the TAPE protein dataset to predict the expression signal of different mutations, and compared its performance against a regular ensemble and a One Input One Output architecture.

Methods for Inducing Neural Network Sparsity

Carnegie Mellon University

Evaluated and compared neural network performance across pruning methods, including L1-norm pruning, magnitude-based pruning, and regression-based feature construction, to find the methods that best trade off computational expense (more masked weights) against performance (model accuracy, etc.).

Molecular Dynamics in Drug Discovery

Carnegie Mellon University

Performed molecular docking with GNINA and molecular dynamics simulations of T4 lysozyme ligand-binding complexes in different solvents, including indole and toluene, followed by trajectory analysis and comparison between docked and undocked conformers.

Relevant Experience

02604 - Bioinformatics Algorithms, Spring 2024, Carnegie Mellon University

Teaching Assistant

Dr. David Koes’ Lab, University of Pittsburgh

Research Assistant

Leadership

The Graduate Student Assembly (GSA) is the branch of student government that represents all graduate students at Carnegie Mellon University, whose mission is to advocate for and support the diverse needs of all CMU graduate students in their personal, professional, and public lives.

As a GSA Rep, I was responsible for organizing monthly social events to strengthen the community among all cohorts of Master's students in three majors: Computational Biology (CB), Quantitative Biology and Bioinformatics (QBB), and Biotechnology and Pharmaceutical Engineering (BTPE).

MSCB GSA Representatives

I served as the captain of two intramural sports teams (volleyball and soccer) for the department in Fall 2023 and Spring 2024, competing in the CMU intramural tournaments.

Departmental Intramural Sports Team Captain