Jack Lee Jian Ming

About

Welcome to my personal website! This is a place where I showcase my skills and experience to curious onlookers who may be on the lookout for collaboration or recruitment.

Hi there, my name's Jack! I'm a Data Scientist who simply loves to draw insights from data and use them to solve challenging business problems.

I'm also an ML Engineer who enjoys building functional ML applications, whether internal or external tooling, to support an organisation's technical requirements.

It is a passion of mine to study and implement best practices in every project I work on, especially when it comes to upholding quality attributes such as Scalability, Reliability, Maintainability, and Observability.

Data Scientist / ML Engineer / Web Developer

Here's a short description of my background and journey.

  • City: Kuala Lumpur, Malaysia
  • Vertical: Healthcare
  • Degree: Master's
  • Interest: Building Scalable ML Systems

I started out as a classically trained wet-lab biologist during my academic days, then soon became deeply interested in leveraging AI to empower research in the applied space.

Eventually, I realized the importance of proper ML infrastructure for end-to-end data science work. I thus became a major advocate of creating efficient data and ML pipelines that make my colleagues' lives better by streamlining routine processes.

Fun Facts

I occasionally dabble in graphic design and video production for fun.

Enjoys Teaching

Proactive Learner

Casual Gamer

Plays Kalimba

Skills

These are the skills that I have cultivated throughout my career. They span both backend and frontend development, from building containerised microservices with embedded databases to creating web applications hosted on cloud platforms.

All in all, my skill set is focused on leveraging AI to solve analytical problems and build data products.

Development

Windows

Linux

Git

PyTest

VSCode

Jupyter

Google Colab

Languages

Python

Java

R

SQL

Bash

CSS

HTML

JavaScript

Machine Learning

Pandas

NumPy

SciPy

Statsmodels

Scikit-Learn

Deep Learning

TensorFlow

Keras

PyTorch

Cloud Computing

GCP

AWS

Azure

Distributed Computing

Dask

PySpark

RAPIDS

Deployment

Docker

Kubernetes

Heroku

Web Framework

Flask

FastAPI

Streamlit

Bootstrap

Data Visualization

Matplotlib

Seaborn

Plotly

Power BI

Database

SQLite

PostgreSQL

MongoDB

Object Relational Mapper

SQLAlchemy

SQLModel

Resume

Summary

Passionate engineering-oriented data scientist with 3+ years of experience in building deep learning models and end-to-end ML pipelines. Excited to build and ship AI models to production. Proficient in predictive modeling, data engineering, data visualisation, and object-oriented programming (OOP).

Professional Experience


Data Scientist

February 2022 - November 2023

AMILI Pte Ltd, Singapore

  • Developed core components of the knowledge graph-oriented (KG) platform to deliver personalized insights to healthcare professionals for prescribing therapeutics and precise diet plans to consumers.
  • Collaborated with DevOps and Data Engineers to build a scalable & robust MLOps system infrastructure for facilitating data and ML pipelines.
  • Built solutions with commercial and open-source LLMs (e.g. ChatGPT, Zephyr, and Llama) for data cleansing, validation, and classification tasks.
  • Led a team of interns from industry placement programmes to build NLP and Computer Vision capabilities of the KG platform.
Languages, Platforms, Tools Used

Python, Bash, Cypher, VSCode, Neo4j, KNIME, AWS, Selenium, PyTest, JupyterLab, TensorFlow, PyTorch, spaCy, Pandas, SciPy, NumPy, Numba.

 

Data Science Consultant (Freelance)

March 2021 - Present

Kuala Lumpur, Malaysia

  • Conduct workshops to educate clients and colleagues on best practices in operationalising ML projects.
  • Workshop topics include ML Infrastructure and Platform Development, Data Governance, ML Lifecycle, Deployment Strategies, and DevOps.

Data Scientist

March 2019 - Sep 2020

NovaGlobal Pte Ltd, Singapore

  • Researched, developed, and validated processes involved in building end-to-end ML pipelines for computer vision tasks in the medical imaging domain.
  • Collaborated with NVIDIA to evaluate its proprietary platform in adopting novel organ models into Clara’s ML Pipeline via RESTful APIs.
  • Assisted in leading technical workshops for prospective clients on adopting the NVIDIA RAPIDS & Clara platforms.
  • Collaborated with Microsoft to build a bibliometric Power BI dashboard with its prototype dataset derived from Microsoft Academic Graph; conducted data cleansing and analytics on Microsoft Azure.
  • Developed an ETL & analytics pipeline for conducting a pilot study to examine correlations between climate and dengue fever via ARIMA time series analysis.
Languages, Platforms, Tools Used

Python, U-SQL, Bash, VSCode, Microsoft Azure, JupyterLab, TensorFlow, Keras, Pandas, NumPy, Dask, Statsmodels, Seaborn, Plotly, Power BI.

 

Data Analyst

March 2018 - Feb 2019

Alterquo Sdn Bhd, Malaysia

  • Collaborated with NUHS researchers to develop an ML pipeline that stratifies patients and predicts corresponding cardiovascular diseases from clinical data with XGBoost.
  • Conducted data cleaning, descriptive statistics, analysis, and predictive modeling on anonymised healthcare patient data in R.
Languages, Platforms, Tools Used

R, RStudio, XGBoost, MICE, Caret, Boruta, ANOVA, F-Statistics, Screeplot, Unsupervised Hierarchical Clustering, Gapminder Tools Offline.

 

Research Assistant (Intern)

Nov 2016 - Dec 2016

Perdana University, Malaysia

  • Assisted with a Master's student's research project.
  • Utilised a series of web tools to analyse the ebolavirus proteome and identify potential T-cell epitope targets for downstream drug discovery research.
  • Helped identify HLA-specific supertypes that cover ~86% of the human population.
Languages, Platforms, Tools Used

Bash, Cygwin, BLAST+, NetCTLpan, Multipred2, TEPITOPE, MAFFT, BioEdit, RasMol.

 

Content Writer (Intern)

Oct 2015 - March 2016

NetON, Australia

  • Composed original reviews of Content Management Systems (CMS) such as WordPress and Drupal, and of eCommerce platforms such as Magento.
  • Studied Search Engine Optimization (SEO) and applied it to articles to improve search engine rankings via specified keywords and on-site metadata.
  • Utilized Google Analytics, AdWords, and Webmaster Tools to observe and adjust keyword matching, structured data snippets and metadata.

Education


Master of Science (MSc) in Bioinformatics

2018 - 2020

Perdana University, Malaysia

  • Defended thesis in March 2021. Graduated in August 2021.
  • Built a Deep Learning (DL) web application to predict protein homology via metadata curated from proteomic databases. Essentially a full-stack project that spans from data engineering to model deployment with Streamlit.
  • Worked with structured data characterised by high cardinality, class imbalance, and many categorical variables to build, train, and evaluate a Deep Feedforward Neural Network with custom loss functions.
Project Overview
  • Retrieved, processed, and transformed XML data from a RESTful API into JSON.
  • Performed data wrangling with Pandas DataFrames and feature engineering with Scikit-Learn.
  • Developed custom neural networks with the TensorFlow Keras API.
  • Devised 2 AI strategies: Multiclass & Multilabel Classification.
  • Multilabel approach outperformed Gold Standard classifier by 12% in detecting remote protein homologs.
  • Conducted benchmarking tests of AI models using cloud infrastructure (i.e. GCP and AWS).
  • Deployed AI model to Streamlit, containerised with Docker, and hosted on Heroku Cloud Platform.

Bachelor of Science (BSc) in Biotechnology (Hons)

2015 - 2018

UCSI University, Malaysia

  • Graduated with CGPA 3.59
  • Conducted R&D on an immuno-centric approach to tackling colon cancer with targeted antibodies.

Volunteer Experience


Graphic Designer & IT/UX Mentor

March 2018 - March 2020

Perdana University, Malaysia

  • Worked primarily with Perdana University School of Data Science (PUScDS) to produce marketing and promotional materials such as logos, brochures, and infographics for events, conferences, and academic programs.
  • Responsible for designing the Faculty Dossier, MyBioInfoNet's Logo, InSyB2018 Conference Materials, and Data Science Infographics.
  • Provided mentorship to MSc and PhD students on programming fundamentals and design principles in scientific presentation/storytelling.

Design Lead

May 2017 - Oct 2018

TEDxUCSIUniversity, Malaysia

  • Supervised, coordinated, and mentored a savvy team of artists to design posters, booklets, tickets, t-shirts, bunting, and presenter profile templates.
  • Streamlined design workflows by delegating tasks according to each member's area of expertise.
  • Personally responsible for the designs of the event booklet, t-shirts, and presenter profile templates.

Video Producer

May 2016 - July 2017

UCSI University, Malaysia

  • Collaborated with an experienced illustrator to pitch, design, and implement visual storytelling elements of event highlights to captivate the audience at the university's opening ceremonies.
  • Produced visual masterpieces for the following events: 4th Applied Science Week (2016) and Applied Science Symposium (2017).
Platforms, Tools Used

Adobe Photoshop, Adobe Illustrator, Adobe After Effects, Adobe Premiere Pro

Portfolio

On the journey of continuous learning, these are the side projects that I have worked on to hone my craft and learn new skills.


Contact

Feel free to reach out to me through LinkedIn and GitHub for collaboration or recruitment purposes!