Yunfei (Eric) Liao

📧 Email: [email protected]

📍 Location: Charlotte, NC, US

📪 LinkedIn: linkedin.com/in/ericlyf

🖥️ GitHub: github.com/Afei99357

🔬 Google Scholar: Yunfei Liao


🎓 Education

  • Ph.D. in Bioinformatics
    University of North Carolina at Charlotte (UNCC)
    Expected: May 2025

  • M.S. in Information Technology
    University of North Carolina at Charlotte (UNCC)
    December 2019

  • M.Eng. in Materials Processing
    Nanchang Hangkong University (NCHU), China
    June 2016

  • B.Eng. in Materials Processing
    Nanchang Hangkong University (NCHU), China
    June 2013


🧳 Professional Experience

  • Research Assistant
    Elizabeth Cooper Lab - UNCC
    January 2023 – Present

    • Led interdisciplinary research on West Nile Virus (WNV) and population genetics of Culex mosquitoes.
    • Developed data pipelines for ecological, meteorological, and genetic datasets.
    • Collaborated with CDC, USDA, and public health departments for data acquisition.

  • Research Assistant
    Xiuxia Du Lab - UNCC
    May 2019 – December 2022

    • Designed and implemented machine learning algorithms for metabolomics-based biomarker discovery.
    • Developed memory-efficient data analysis pipelines for mass spectrometry.

  • Project Manager
    AVIC Digital
    June 2016 – July 2018

    • Managed Manufacturing Execution System (MES) projects for aviation manufacturing.
    • Coordinated requirements gathering and technical implementation.

  • Research Assistant
    Nanchang Hangkong University
    February 2013 – April 2016

⚙️ Projects

West Nile Virus Prediction at the Cooper Lab

  • California WNV data visualization illustration

  • Sole investigator for a deeply interdisciplinary project analyzing WNV antecedents, overturning many long-held expectations and highlighting new avenues for investigation

  • Collected meteorological, demographic, ecological, genetic, and CDC disease surveillance data from over 1,000 sources, ranging from gigantic (Copernicus) to arcane (CA Arbovirus Bulletin)

  • Interpolated, resampled, and repaired datasets, consulting the relevant government authorities on data irregularities

  • Collaborated with USDA, CDC, CA Public Health Dept, taking an active role in data acquisition

  • Exhaustively analyzed datasets with a wide range of discipline-specific statistical algorithms

  • Innovatively exposed sensitivities and relationships using a wide variety of statistical analyses

  • Automated dozens of processes, ranging from scraping (BeautifulSoup, Selenium) and computer vision (cv2) to orchestration and database ingestion (Airflow)

ELT Pipeline Extracting Climate Data at the Cooper Lab

  • Designed and implemented an automated ELT pipeline to extract, load, and transform daily weather data from the Open-Meteo API using Apache Airflow, DBT, DuckDB and AWS.
  • Orchestrated data workflows in Airflow, triggering DBT runs via Dockerized environments to ensure seamless data processing.
  • Developed DBT models to transform and structure climate data for efficient analysis and visualization.
  • Scheduled daily updates, integrating Airflow DAG execution with an Evidence project build, then pushed processed data to AWS S3 for cloud storage and further analysis.
  • Built an interactive climate visualization that enables users to dynamically explore climate trends across the U.S.

Population Genetics in Culex Mosquitoes at the Cooper Lab

  • Conducted population genetics research on Culex tarsalis to identify genetic adaptations to environmental factors such as temperature and precipitation
  • Collected a wealth of environmental data, similar to that gathered for the WNV project above
  • Used field-specific statistical models such as Redundancy Analysis to search for significant environmental correlations and effects on the adaptiveness of individual SNPs
  • Displayed the discovered connections using a variety of visualizations, such as upset plots and several kinds of pairwise plots

Biomarker Discovery at the Du Lab

  • Designed and implemented a supervised machine learning algorithm in scikit-learn to detect early disease from untargeted metabolomics studies

Automated Data Analysis Pipeline at the Du Lab

  • Developed a memory-efficient prescreening algorithm for mass spectrometry search
  • Created visualizations for large datasets using HTML and JavaScript

MES Project Management at AVIC Digital

  • Led projects focused on Manufacturing Execution System (MES) for an airplane manufacturer, ensuring seamless integration of software with manufacturing workflows
  • Gathered customers' application requirements and translated them into actionable designs for development teams

📊 Skills

  • Programming Languages: Python, SQL, R, JavaScript, HTML, CSS
  • Data Engineering Tools: Pandas, NetCDF/Xarray, Airflow, Matplotlib
  • Data Science: Classical machine learning, visualization, statistical modeling, MLOps, MLflow
  • Software Tools: Docker, Git, AWS (EC2, S3)
  • Languages: English (Fluent), Chinese (Native)

📎 Publications (* co-first author)