Chenghao Lyu

Ph.D.

UMass Amherst

About Me

My name is Chenghao Lyu. I am a researcher and engineer developing the next generation of big data analytics systems with artificial intelligence solutions. I am currently an Applied Scientist at Amazon. Before joining Amazon, I earned my Ph.D. from UMass Amherst, and I hold MS and BS degrees from Fudan University. During my Ph.D., I was also a scientific collaborator with the CEDAR team at Ecole Polytechnique (X).

Email: clyu AT amazon.com

Interests
  • Adaptive Query Execution and Optimization
  • Big Data Analytics Systems
  • Machine Learning
  • Multi-objective Optimizations
Education
  • MS/PhD in Computer Science, 2018 - 2025.1

    UMass Amherst, MA, USA

  • MSc in Electronic Engineering, 2018

    Fudan University, Shanghai, China

  • BSc in Electronic Engineering, 2015

    Fudan University, Shanghai, China

News

[2025.06] Our paper “Graph Transformers for Query Plan Representation: Potentials and Challenges” has been accepted to PVLDB 2025 and will appear at VLDB 2026!

[2025.01] I joined Amazon as an Applied Scientist.

[2025.01] I successfully defended my PhD! Many thanks to my committee members—Yanlei Diao, Prashant Shenoy, Peter Haas, and David Irwin—for their invaluable support.

[2025.01] I received the Dr. Phil Bernstein Graduate Scholarship in Computer Science.

[2024.06] Our paper “A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning” was accepted to PVLDB 2024!

[2024.03] I returned to Amherst and defended my thesis proposal.

[2023.12] We released UDAO, the Unified Data Analytics Optimizer, to the public on PyPI. Try “pip install udao”.

[2023.10] I presented my ongoing work “An Adaptive, Multi-Resolution, and Multi-Objective Parameter Tuning Approach for Spark SQL” at the ERC BigFastData Workshop.

[2023.05] We released our Spark-TPCH dataset and the MOO framework (to which I contributed the internal solver) as part of our UDAO project, a Unified Data Analytics Optimizer.

[2022.07] Our paper with Alibaba Cloud on “Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing” was accepted to PVLDB 2022!

[2021.10] I started working as a scientific collaborator in the CEDAR project-team of Inria and LIX, at Ecole Polytechnique. Bonjour!

[2020.10] Our paper “Spark-based Cloud Data Analytics using Multi-Objective Optimization” was accepted to ICDE 2021!

Publications

(2025). Graph Transformers for Query Plan Representation: Potentials and Challenges. In PVLDB, 18(13), 2025.

(2024). A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning. In PVLDB, 17(11), 2024.

(2022). Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing. In PVLDB, 15(11), 2022.

(2021). Spark-based Cloud Data Analytics using Multi-Objective Optimization. In ICDE, 2021.

(2021). Neural-based Modeling for Performance Tuning of Spark Data Analytics. arXiv, 2021.

(2019). UDAO: A Next-Generation Unified Data Analytics Optimizer. In PVLDB, 12(12), 2019.
