• Hi!
    I'm Ziyuan

    Experienced Engineer with a demonstrated history of working in healthcare analytics.

  • I'm
    a Science Lover

    My research interests lie in applications of machine learning to healthcare and precision medicine.

About Me

Who Am I?

I'm Ziyuan Shen. I hold a master's degree from Duke University, where I majored in Electrical and Computer Engineering with a course focus on software design, machine learning, and data science. I worked at the Duke Institute for Health Innovation as a data scientist intern.

My research interests lie in medical machine learning and precision medicine. My dream is to apply my technical skills to real applications that improve people's health.

Education

August 2018 - May 2020
Major: Electrical and Computer Engineering
Course Focus: Machine Learning and Data Science
Overall GPA: 3.97/4.0

Coursework:

  • Software Design
  • CMOS VLSI Design
  • Fundamentals of Computer Systems and Engineering
  • Statistical Programming (R programming)
  • Medical Deep Learning
  • Probabilistic Machine Learning
  • Pattern Classification and Recognition Technology
  • Data Science and Health
  • Vector Space Methods with Applications

September 2014 - June 2018
Major: Information Science and Engineering
Thesis: Synthesizing Bayesian System Using Chemical Reaction Networks
Overall GPA: 3.72/4.0

Coursework:

  • Fundamentals of Computer Science (C++)
  • Data Structures
  • Computer Architecture & Logic Design
  • Probability Statistics & Stochastic Processes
  • Digital Signal Processing
  • Statistical Signal Processing
  • Digital Circuits
  • Analog Circuits
  • Principles of Automatic Control, etc.

June 2012 - August 2012
Overall GPA: 3.85/4.0

Coursework:

  • Academic Writing
  • Precalculus Mathematics

Experience

Work Experience

Duke Institute for Health Innovation
May 2019 - May 2020

Data Scientist Intern

Interned at DIHI on ongoing machine learning projects related to healthcare. Used data science tools to process large-scale hospital data and develop predictive models for clinical use. Developed full-stack applications that monitor patients' heart rates for clinical use.

National Mobile Communications Research Laboratory
March 2015 - June 2018

Research Assistant

Conducted DNA computing research on deep neural networks (DNNs), Markov chain computation, and digital logic synthesis.

My Specialty

My Skills

Computer Languages

  • Python3: 90%
  • SQL: 90%
  • Shell Scripting: 70%
  • C++: 70%
  • Java: 80%
  • R: 80%
  • JavaScript: 70%


Other Tools

  • PostgreSQL: 70%
  • SQLite: 85%
  • MongoDB: 70%
  • AWS: 80%
  • Linux: 80%
  • Scikit-Learn: 80%
  • TensorFlow: 75%

My Projects

Recent Projects

 Machine Learning & Data Analytics

 Adult Inpatient Decompensation Prediction
  May 20, 2019 - Present

This project aims to develop initial machine learning models that predict adult inpatient decompensation (ICU admission, mortality, etc.) in real time. Preliminary work before model building includes data cleaning, data visualization, data quality assurance, and data manipulation.

Python, SQL, Shell Scripting (SQLite, AWK, etc.)

 Open Source Contribution
  Sep 10, 2019 - Present

Open source contribution: Add the SPIE-AAPM-NCI breast cancer whole slide image dataset to TensorFlow Datasets

The dataset consists of patches extracted from breast cancer whole slide images (WSI). Each patch is labelled by a tumor cellularity score. The dataset can be used to develop an automated method for evaluating cancer cellularity and enhancing tumor burden assessment.

Python, TensorFlow

 30-Day Hospital Readmission Prediction
  April 10, 2019 - May 10, 2019

With the development of modern large-scale computing technologies, machine learning techniques have been increasingly applied to large-scale healthcare datasets. Among the many research topics, predicting 30-day hospital readmission is of particular interest to us. If we can accurately predict readmission, we can identify high-risk patients and avoid discharging them too early.

Python, SQLite, Scikit-Learn, etc.

 Breast Cancer Prediction
  February 1, 2019 - May 10, 2019

Breast Cancer Prediction: Classify breast cancer subtypes using Breast Cancer Wisconsin Dataset

Cancer classification has historically required prior biological knowledge. Highly accurate prediction models aim to aid oncologists with diagnosis and prognosis. Apart from achieving high accuracy, the goal of this project is to produce a comprehensive data analysis that helps identify a suitable classifier.

Python, Scikit-Learn, Matplotlib, PCA, t-SNE

 Time Series Medication Administration Prediction
  November 1, 2019 - December 15, 2019

The project uses an LSTM to predict medical outcomes on the MIMIC-III dataset. Most of the work is based on the paper "An attention based deep learning model of clinical events in the intensive care unit" and aims to reproduce similar results, though the feature selection and model specifics may differ.

Python, PostgreSQL, Keras, LSTM

 Squeeze and Excitation Network
  October 1, 2019 - November 1, 2019

Implements the squeeze-and-excitation (SE) network proposed in the paper "Squeeze-and-Excitation Networks". ResNet50 and ResNet110, with and without the SE network, are implemented from scratch and used for multi-class image classification on the Oxford Pet and CIFAR-10 datasets.

Python, TensorFlow, CNN, image classification

 Web Scraping & App Design

 Central News Hub
  November 15, 2019 - December 1, 2019

A Shiny app that serves as a central news hub. Users can specify every argument the news API provides.

R, HTML, Shiny App

 NBA STATS HUB
  November 15, 2019 - December 15, 2019

NBA Statistics Hub

Scrapes data from the NBA stats website, performs statistical analyses including age and efficiency, and provides an R Shiny app for searching the stats.

R, HTML, ggplot2, JavaScript, Shiny App

Read

Publications

April 11, 2019
Springer Journal of Natural Computing

Molecular computing for Markov Chains

This paper presents a methodology for implementing arbitrarily constructed time-homogeneous Markov chains with biochemical systems. Both discrete-time and continuous-time Markov chains can be computed.

January 3, 2019
IEEE International Workshop on Signal Processing Systems (SiPS)

Synthesizing a Neuron Using Chemical Reactions

In this paper, by revealing the common probability base, we aim to implement the most fundamental element of a DNN, a neuron, using chemical reaction networks (CRNs).

October 22, 2018
SCIENCE CHINA Information Sciences (Sci China Inf Sci)

DNA computing for combinational logic

This timely overview introduces combinational logic synthesized in DNA computing from the analog and digital perspectives separately. State-of-the-art research progress is summarized to help interested readers quickly understand DNA computing, initiate discussion of existing techniques, and inspire innovative solutions.

December 27, 2017
Springer Journal of Signal Processing Systems

Molecular Synthesis for Probability Theory and Stochastic Process

Probability theory and stochastic processes, especially Markov chains, have long been used to study and explain the behavior of chemical reaction networks (CRNs). This paper takes the reverse angle and devotes itself to synthesizing common probability theory and stochastic processes with CRNs. The main motivation is to imitate and explore the evolution of large-scale, complex practical systems based on CRNs by making use of their inherent parallelism and randomness.

November 16, 2017
IEEE International Workshop on Signal Processing Systems (SiPS)

CRN-Based Design Methodology for Synchronous Sequential Logic

This design approach, which stores logic information in keysmith and releases it through key, primarily focuses on the underlying state transitions behind the required logic rather than the electronic circuit representation. Therefore, it can be uniformly and easily employed to implement any synchronous sequential logic with molecular reactions.

November 11, 2017
IEEE International Conference on Wireless Communications and Signal Processing (WCSP)

Synthesizing Markov Chain with Reversible Unimolecular Reactions

Here, a succinct and systematic approach based on reversible unimolecular reactions is proposed for any time-homogeneous Markov chain, whether discrete or continuous. In the chemical reaction networks (CRNs), molecular concentrations at time t reflect the probability distribution of the continuous-time Markov chain at time t, and the final concentrations indicate the steady state of the Markov chain.

December 12, 2016
IEEE International Workshop on Signal Processing Systems (SiPS)

Synthesis of Probability Theory Based on Molecular Computation

Considering the common knowledge that probability theory has long been used to explain stochastic behaviors of chemical reactions, this paper devotes itself to synthesizing probability theory based on molecular computation.

Get in Touch

Contact