Chenghao Lyu

Ph.D. Candidate

UMass Amherst

About Me

My name is Chenghao Lyu. I am a fifth-year Ph.D. candidate in the College of Information and Computer Sciences at the University of Massachusetts Amherst, where I am part of the DREAM Lab. I am advised by Prof. Yanlei Diao and co-advised by Prof. Prashant Shenoy. Before joining UMass Amherst, I received my BS in EE and MS in CS from Fudan University, where I was advised by Prof. X. Sean Wang. During my PhD, I worked for 2.5 years as a scientific collaborator in the CEDAR team at Ecole Polytechnique in France.

My research lies at the intersection of big data analytics systems, machine learning, and multi-objective optimization, with a focus on designing optimizers that automatically configure the parameters of large-scale systems to improve performance and reduce cost.

I am actively seeking a full-time industry position for 2024/2025.

Email: {first-name}@cs.umass.edu

Interests
  • Adaptive Query Execution and Optimization
  • Big Data Analytics Systems
  • Machine Learning
  • Multi-objective Optimization
Education
  • MS/PhD in Computer Science, 2018 - Present

    UMass Amherst, MA, USA

  • MSc in Electronic Engineering, 2018

    Fudan University, Shanghai, China

  • BSc in Electronic Engineering, 2015

    Fudan University, Shanghai, China

News

[2024.06] Our paper “A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning” was accepted to VLDB 2024!

[2024.03] I returned to Amherst and defended my thesis proposal!

[2023.12] We released UDAO, the unified data analytics optimizer, to the public on PyPI. Try “pip install udao”!

[2023.10] I presented my ongoing work “An Adaptive, Multi-Resolution, and Multi-Objective Parameter Tuning Approach for Spark SQL” at the ERC BigFastData Workshop.

[2023.05] We released our Spark-TPCH dataset and the MOO framework (for which I contributed the internal solver) as part of our UDAO project, a Unified Data Analytics Optimizer.

[2022.07] Our paper with Alibaba Cloud on “Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing” was accepted to VLDB 2022!

[2021.10] I started working as a scientific collaborator in the CEDAR project-team of Inria and LIX, at Ecole Polytechnique. Bonjour!

[2020.10] Our paper “Spark-based Cloud Data Analytics using Multi-Objective Optimization” was accepted to ICDE 2021!

[2020.02] I started my internship at Alibaba DAMO Academy.

Publications

(2024). A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning. In PVLDB, 17(11), 2024.

(2022). Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing. In PVLDB, 15(11), 2022.

(2021). Spark-based Cloud Data Analytics using Multi-Objective Optimization. In ICDE, 2021.

(2021). Neural-based Modeling for Performance Tuning of Spark Data Analytics. arXiv, 2021.

(2019). UDAO: A Next-Generation Unified Data Analytics Optimizer. In PVLDB 12(12), 2019.

Experience

The CEDAR project-team, LIX, Ecole Polytechnique
Scientific Collaborator
Oct 2021 – May 2024 Paris
Developing a Unified Data Analytics Optimizer (UDAO) system.

DAMO Academy, Alibaba
Research Intern
Feb 2020 – Dec 2021 Remote & Hangzhou
Designed the new architecture of a resource optimizer for big data systems. Reduced latency by 36-37% and cost by 37-75% on production workloads of 0.6M jobs and on a simulator of the extended Alibaba MaxCompute environment.