Ph.D. Researcher · ML & Computational Health

Richmond Ewenam Azumah

M.S. Scientific Computing (completed) · Ph.D. Computer Science (in progress) @ Georgia State University. I build machine learning systems for genomic surveillance, epidemiological modeling, and health informatics — where data meets real-world impact.

3.2× · Cluster score improvement over DNABERT-2
3K+ · Excess congenital syphilis cases quantified post-COVID
1M · Genomic reads processed (SARS-CoV-2)
3.61 · Graduate GPA, M.S. Scientific Computing
01

About Me

Richmond Azumah

Graduate researcher and ML engineer passionate about building computational tools that improve human health outcomes.

  • Location: Atlanta, GA
  • Email: azumahrichmond6@gmail.com
  • Phone: +1 (470) 929-0724
  • Website: Ewenam.github.io
  • Status: Ph.D. Candidate, CS (GPA 3.67)

I am a Ph.D. researcher in Computer Science at Georgia State University (GPA 3.67), working at the intersection of machine learning, bioinformatics, and public health. I hold an M.S. in Scientific Computing (GPA 3.61) and a B.S. in Statistics (First Class Honors, KNUST Ghana). My doctoral research focuses on unsupervised genomic representation learning, probabilistic community structure inference, and epidemiological excess burden modeling.

This statistical training gives me a strong mathematical foundation for every project, from variational EM formulations for stochastic block models to masked reconstruction losses in VQ-VAEs trained across 4× GPU clusters.

Beyond research, I am passionate about broadening participation in computing, having taught and mentored students from middle school through university level across three countries.

Python · PyTorch · MATLAB
VQ-VAE · Transformers · GANs
Kraken2 · Bracken · BioPython
SBM · Louvain · Spectral-GMM
Epidemiological Modeling
OLS · AICc · Bootstrapping
Scikit-Learn · R · SQL
Distributed GPU Training
02

Research

⬤ Bioinformatics — GSU 2026
Probabilistic Reconstruction of Microbial/Viral Interaction Networks in Wastewater

Applied stochastic block models (SBM) and Spectral-GMM to NCBI wastewater metagenomic data. Built the full pipeline: FASTA → Kraken2 → Bracken → CLR normalization → Spearman correlation graph → probabilistic community detection. Identified human gut, pathogen-enriched, and environmental taxa communities, with biological validation.
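The two middle steps of this pipeline, CLR normalization and the Spearman correlation graph, can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project code: the toy count matrix, the pseudocount of 0.5, and the |rho| >= 0.6 edge threshold are all assumptions chosen for the example, and the simple ranking used here breaks ties arbitrarily rather than averaging them.

```python
import numpy as np

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount (an assumed value) avoids log(0) for absent taxa.
    """
    log_x = np.log(np.asarray(counts, dtype=float) + pseudocount)
    # Subtract each sample's mean log-abundance (log of its geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)

def spearman_matrix(values):
    """Taxon-by-taxon Spearman correlation via Pearson correlation of ranks.

    Double argsort ranks each column; ties are broken arbitrarily.
    """
    ranks = values.argsort(axis=0).argsort(axis=0)
    return np.corrcoef(ranks, rowvar=False)

def correlation_graph(counts, threshold=0.6):
    """Return (rho, adjacency); an edge joins taxa with |rho| >= threshold."""
    rho = spearman_matrix(clr(counts))
    adjacency = np.abs(rho) >= threshold
    np.fill_diagonal(adjacency, False)  # no self-loops
    return rho, adjacency

# Toy abundance table: 5 samples (rows) x 4 taxa (columns), made-up numbers.
counts = np.array([
    [10,  20,  0, 5],
    [30,  60,  2, 1],
    [ 5,   9, 40, 8],
    [50, 110,  1, 3],
    [ 8,  15, 25, 9],
])
rho, adjacency = correlation_graph(counts)
```

The adjacency matrix is what a community detection method such as an SBM or Spectral-GMM would then take as input.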

⬤ Epidemiology — GSU 2026
Modeling Excess Congenital Syphilis Cases Post-COVID-19 Using n-Subepidemic Toolbox

Fitted a 2-sub-epidemic logistic model to US congenital syphilis surveillance data (2010–2023, AICc = 138.50). Estimated 3,012 excess cases (~31% above the pre-pandemic baseline) across 2020–2023. State-level OLS regression identified the maternal syphilis rate as the strongest predictor of excess burden (r = 0.425, p = 0.005).
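For readers unfamiliar with the approach, the ideas behind these numbers can be sketched as follows. This is a simplified illustration, not the n-Subepidemic Toolbox itself: it assumes plain logistic sub-epidemics, a least-squares form of AICc, and excess cases defined as observed minus baseline. The function names and the parameter values in the demo are hypothetical.

```python
import numpy as np

def logistic_incidence(t, r, K, t0):
    """Incidence (time derivative) of a logistic cumulative curve.

    r is the growth rate, K the final size, t0 the inflection time.
    """
    e = np.exp(-r * (t - t0))
    return r * K * e / (1.0 + e) ** 2

def two_subepidemic_incidence(t, params):
    """Sum of two logistic sub-epidemics; params = (r1, K1, t1, r2, K2, t2)."""
    r1, K1, t1, r2, K2, t2 = params
    return logistic_incidence(t, r1, K1, t1) + logistic_incidence(t, r2, K2, t2)

def aicc_least_squares(observed, fitted, n_params):
    """Small-sample corrected AIC, assuming a least-squares (Gaussian) fit.

    Requires n > n_params + 1 and a nonzero sum of squared errors.
    """
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    n = observed.size
    sse = float(np.sum((observed - fitted) ** 2))
    aic = n * np.log(sse / n) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n - n_params - 1)

def excess_cases(observed, baseline):
    """Total excess: observed counts minus the counterfactual baseline."""
    return float(np.sum(np.asarray(observed, float) - np.asarray(baseline, float)))

# Demo on 14 annual time points with hypothetical parameters (not fitted values).
t = np.arange(14.0)
curve = two_subepidemic_incidence(t, (0.8, 1000.0, 4.0, 1.0, 3000.0, 11.0))
```

With only 14 annual observations and six parameters, the small-sample correction term is large, which is why AICc rather than plain AIC is the natural choice for this kind of series.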

03

Projects

P01
Brain Network Community Detection

Applied the Louvain method to functional brain connectivity data across temporal windows. Developed label standardization logic to consistently distinguish control-dominated from schizophrenia-dominated communities, even when the number of detected communities varies.

Python NetworkX Louvain Graph ML
2023–24
P02
Semi-Supervised Learning: MNIST & Two-Moon Datasets

Implemented and compared Entropy Minimization, Pseudo-Labeling, and Virtual Adversarial Training on MNIST and the Two-Moon toy dataset. Proposed a novel noisy Pseudo-Label variant that improved classification on unlabeled data beyond baseline methods.
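Two of the baseline ideas compared here can be written compactly. The sketch below shows the standard entropy-minimization term and confidence-thresholded pseudo-labeling in NumPy; it does not reproduce the noisy Pseudo-Label variant from the project, and the 0.95 threshold and example probabilities are assumptions for illustration.

```python
import numpy as np

def entropy_loss(probs, eps=1e-12):
    """Mean Shannon entropy of predicted class distributions.

    Minimizing this on unlabeled data pushes predictions toward confidence.
    """
    probs = np.asarray(probs, dtype=float)
    return float(-np.mean(np.sum(probs * np.log(probs + eps), axis=1)))

def pseudo_labels(probs, threshold=0.95):
    """Keep unlabeled samples whose top class probability meets the threshold.

    Returns (labels_of_kept_samples, boolean_keep_mask).
    """
    probs = np.asarray(probs, dtype=float)
    keep = probs.max(axis=1) >= threshold
    return probs.argmax(axis=1)[keep], keep

# Made-up model outputs for three unlabeled samples, two classes.
probs = np.array([[0.98, 0.02],
                  [0.60, 0.40],
                  [0.03, 0.97]])
labels, keep = pseudo_labels(probs)
```

Here the second sample is dropped because its top probability (0.60) falls below the threshold, so only confident predictions are recycled as training targets.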

PyTorch Semi-supervised VAT Pseudo-Labeling
2023
P03
GAN Survey: Theory, Variants & Applications

Comprehensive review of 7 GAN variants (DCGAN, cGAN, WGAN, LSGAN, CycleGAN, CoGAN, InfoGAN) covering mathematical formulations, KL/JS divergence relationships, minimax optimization theory, and domain-specific applications including image synthesis, NLP, and 3D modeling.

Deep Learning GANs Literature Review
2023
P04
Neural Network Fundamentals — From Scratch

Built dense layers, ReLU, and Softmax activations from scratch in NumPy. Explored forward pass mechanics, activation behavior, and multi-layer composition as mathematical foundations for deeper architecture study.
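A minimal sketch of what such a from-scratch forward pass looks like is shown below. The class name, layer sizes, weight scale, and random seed are illustrative choices, not the project's actual code.

```python
import numpy as np

class Dense:
    """Fully connected layer: outputs = inputs @ weights + biases."""

    def __init__(self, n_inputs, n_units, rng):
        # Small random weights and zero biases (an assumed initialization).
        self.weights = 0.01 * rng.standard_normal((n_inputs, n_units))
        self.biases = np.zeros((1, n_units))

    def forward(self, inputs):
        return inputs @ self.weights + self.biases

def relu(x):
    """Elementwise max(0, x)."""
    return np.maximum(0.0, x)

def softmax(x):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    exp = np.exp(x - x.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

# Two-layer composition on a random batch: 5 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
hidden = Dense(4, 8, rng)
output = Dense(8, 3, rng)
batch = rng.standard_normal((5, 4))
probs = softmax(output.forward(relu(hidden.forward(batch))))
```

Each row of `probs` is a probability distribution over the three classes, which is the property a cross-entropy loss would rely on in a full training loop.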

Python NumPy Neural Networks
2023
P05
COVID-19 Tweet Sentiment & Public Health Analysis

Analysis of a large-scale dataset of COVID-19-related tweets spanning 2020–2023. Ongoing research into NLP-based public health surveillance: sentiment trends, misinformation spread patterns, and geographic health signal extraction from social media at scale.

NLP Twitter API Public Health Pandas
2023–26
04

Education

Ph.D. Computer Science
Georgia State University — Atlanta, GA
GPA: 3.67 · In Progress
Relevant Coursework

Advanced Algorithms in Bioinformatics · Machine Learning · Applied Mathematics · Database Systems · Epidemiological Modeling

M.S. Scientific Computing
Georgia State University — Atlanta, GA
GPA: 3.61 · Completed
B.S. Statistics (First Class Honors)
Kwame Nkrumah University of Science & Technology (KNUST), Ghana
GPA: 3.32 · Completed

Let's Connect

I'm actively seeking research collaborations, GRA positions, and opportunities in ML for health informatics and bioinformatics.