Yuyu Luo (骆昱宇)

Assistant Professor
Data Science and Analytics Thrust
Information Hub
The Hong Kong University of Science and Technology (Guangzhou)

Email: yuyuluo[at]hkust-gz.edu.cn
Office: E2-615, HKUST(GZ)

Google Scholar

Brief Biography

I am an Assistant Professor at the Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou). I also hold an affiliated position at The Hong Kong University of Science and Technology (Clear Water Bay campus). I direct the Data Intelligence and Analytics Lab (DIAL @ HKUST(GZ)), where our mission is to develop next-generation intelligent systems through research at the intersection of LLMs, Foundation Agents, AI for Databases, and Data-centric AI.

I am actively looking for self-motivated PhD students (Fall 2025, Spring/Fall 2026), MPhil students, and Research Assistants/Visiting Students to join our lab. If you’re passionate about AI and data systems and interested in working with us, please send your CV and transcripts (for all degrees) via email.

Preprints

  1. DeepFund: Will LLM be Professional at Fund Investment? A Live Arena Perspective
    Changlun Li, Yao Shi, Yuyu Luo, Nan Tang
  2. Rise of the Community Champions: From Reviewer Crunch to Community Power
    Changlun Li, Yao Shi, Yuyu Luo, Nan Tang
  3. Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search
    Boyan Li, Jiayi Zhang, Ju Fan, Yanwei Xu, Chong Chen, Nan Tang, Yuyu Luo
  4. Atom of Thoughts for Markov LLM Test-Time Scaling
    Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo
  5. NL2SQL-Bugs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation
    Xinyu Liu, Shuyu Shen, Boyan Li, Nan Tang, Yuyu Luo
  6. nvBench 2.0: A Benchmark for Natural Language to Visualization under Ambiguity
    Tianqi Luo, Chuhan Huang, Leixian Shen, Boyan Li, Shuyu Shen, Wei Zeng, Nan Tang, Yuyu Luo
  7. Self-Supervised Prompt Optimization
    Jinyu Xiang, Jiayi Zhang, et al., Sirui Hong, Chenglin Wu, Yuyu Luo
  8. AskChart: Universal Chart Understanding through Textual Enhancement
    Xudong Yang, Yifan Wu, Yizhang Zhu, Nan Tang, Yuyu Luo

Surveys

  1. A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?
    [NL2SQL Handbook]
  2. Graph Neural Networks for Databases: A Survey

Selected Publications

    Year 2025
  1. AFlow: Automating Agentic Workflow Generation
    Jiayi Zhang, Jinyu Xiang, et al., Yuyu Luo, Chenglin Wu
    ICLR 2025 (Oral, top 1.8%).
  2. Coreset Selection over Incomplete Data for Data-Effective and Data-Efficient Machine Learning
    Chengliang Chai, Nan Tang, Ju Fan, Yuyu Luo, Guoliang Li, Ye Yuan, Guoren Wang
    TODS 2025.
  3. Augmenting Realistic Charts with Virtual Overlays
    Yao Shi, Boyan Li, Yuyu Luo, Lei Chen, Nan Tang
    CHI 2025.

    Year 2024
  4. The Dawn of Natural Language to SQL: Are We Fully Ready?
    Boyan Li, Yuyu Luo, Chengliang Chai, Guoliang Li, Nan Tang
    VLDB 2024. [Homepage]
  5. HAIChart: Human and AI Paired Visualization System
    Yupeng Xie, Yuyu Luo, Guoliang Li, Nan Tang
    VLDB 2024. [Code]
  6. Are Large Language Models Good Statisticians?
    Yizhang Zhu, Shiyin Du, Boyan Li, Yuyu Luo, Nan Tang
    NeurIPS 2024. [Dataset]
  7. VerifAI: Verified Generative AI
    Nan Tang, Chenyu Yang, Ju Fan, Lei Cao, Yuyu Luo, Alon Halevy
    CIDR 2024.
  8. Data Playwright: Authoring Data Videos with Annotated Narration
    Leixian Shen, Haotian Li, Yun Wang, Tianqi Luo, Yuyu Luo, Huamin Qu
    TVCG 2024. [Homepage]
  9. ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering
    Yifan Wu, Lutao Yan, Leixian Shen, Yunhai Wang, Nan Tang, Yuyu Luo
    EMNLP 2024. [Dataset]
  10. Fast, Robust and Interpretable Participant Contribution Estimation for Federated Learning
    Yong Wang, Yuyu Luo, Kaiyu Li, Guoliang Li, Yunyan Guo, Zhuo Wang
    ICDE 2024.
  11. Mitigating Data Scarcity in Supervised Machine Learning through Reinforcement Learning Guided Data Generation
    Chengliang Chai, Kaisen Jin, Nan Tang, Ju Fan, Lianpeng Qiao, Yu-Ping Wang, Yuyu Luo, Ye Yuan, Guoren Wang
    ICDE 2024.
  12. CoInsight: Visual Storytelling for Hierarchical Tables with Connected Insights
    Guozheng Li, Runfei Li, Yunshan Feng, Yu Zhang, Yuyu Luo, Chi Harold Liu
    TVCG 2024.

    Year 2023
  13. Learned Data-aware Image Representations of Line Charts for Similarity Search
    Yuyu Luo, Yihui Zhou, Nan Tang, Guoliang Li, Chengliang Chai, Leixian Shen
    SIGMOD 2023. [Slides]
  14. GoodCore: Coreset Selection over Incomplete Data for Data-effective and Data-efficient Machine Learning
    Chengliang Chai, Jiabin Liu, Nan Tang, Ju Fan, Dongjing Miao, Jiayi Wang, Yuyu Luo, Guoliang Li
    SIGMOD 2023. (Best of SIGMOD 2023 Papers) [Slides]
  15. Demystifying Artificial Intelligence for Data Preparation
    Chengliang Chai, Nan Tang, Ju Fan, Yuyu Luo
    SIGMOD 2023. [Tutorial Slides: Part1, Part2, Part3]

    Year 2022
  16. Steerable Self-driving Data Visualization.
    Yuyu Luo, Xuedi Qin, Chengliang Chai, Nan Tang, Guoliang Li, Wenbo Li.
    IEEE TKDE 2022.
  17. Sevi: Speech-to-Visualization through Neural Machine Translation
    Jiawei Tang, Yuyu Luo, Mourad Ouzzani, Guoliang Li, Hongyang Chen.
    ACM SIGMOD 2022 (Demo Track).
  18. Data Management for Machine Learning: A Survey
    Chengliang Chai, Jiayi Wang, Yuyu Luo, Zeping Niu, Guoliang Li.
    IEEE TKDE 2022.
  19. Towards Natural Language Interfaces for Data Visualization: A Survey
    Leixian Shen, Enya Shen, Yuyu Luo, Xiaocong Yang, Xuming Hu, Xiongshuai Zhang, Zhiwei Tai, Jianmin Wang.
    IEEE TVCG 2022.
  20. Selective Data Acquisition in the Wild for Model Charging
    Chengliang Chai, Jiabin Liu, Nan Tang, Guoliang Li, Yuyu Luo.
    VLDB 2022.
  21. Feature Augmentation with Reinforcement Learning
    Jiabin Liu, Chengliang Chai, Yuyu Luo, Yin Lou, Jianhua Feng, Nan Tang.
    ICDE 2022.
  22. RW-Tree: A Learned Workload-aware Framework for R-tree Construction
    Haowen Dong, Chengliang Chai, Yuyu Luo, Jiabin Liu, Jianhua Feng, Chaoqun Zhan.
    ICDE 2022.
  23. Interactively Discovering and Ranking Desired Tuples by Data Exploration
    Xuedi Qin, Chengliang Chai, Yuyu Luo, Tianyu Zhao, Nan Tang, Guoliang Li, Jianhua Feng, Xiang Yu, Mourad Ouzzani.
    The VLDB Journal 2022.
  24. GALVIS: Visualization Construction through Example-Powered Declarative Programming.
    Leixian Shen, Enya Shen, Zhiwei Tai, Yun Wang, Yuyu Luo, Jianmin Wang.
    CIKM 2022 (Best Demo Paper Honorable Mention).

    Year 2021
  25. Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks
    Yuyu Luo, Nan Tang, Guoliang Li, Chengliang Chai, Wenbo Li, Xuedi Qin
    ACM SIGMOD 2021 [Project Page]
  26. Natural Language to Visualization by Neural Machine Translation
    Yuyu Luo, Nan Tang, Guoliang Li, Jiawei Tang, Chengliang Chai, Xuedi Qin
    IEEE VIS 2021 [Code] [Poster]
  27. nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task
    Yuyu Luo, Jiawei Tang, Guoliang Li
    Workshop on NL VIZ 2021 at IEEE VIS 2021

    Year 2020
  28. DeepTrack: Monitoring and Exploring Spatio-Temporal Data - A Case of Tracking COVID-19 -
    Yuyu Luo, Wenbo Li, Guoliang Li, Nan Tang
    VLDB 2020.
  29. VisClean: Interactive Cleaning for Progressive Visualization.
    Yuyu Luo, Chengliang Chai, Xuedi Qin, Nan Tang, Guoliang Li.
    VLDB 2020. [Video Demonstration]
  30. Interactive Cleaning for Progressive Visualization through Composite Questions.
    Yuyu Luo, Chengliang Chai, Xuedi Qin, Nan Tang, Guoliang Li.
    IEEE ICDE 2020. [Video]
  31. Human-in-the-loop Outlier Detection
    Chengliang Chai, Lei Cao, Guoliang Li, Jian Li, Yuyu Luo, Samuel Madden.
    ACM SIGMOD 2020.
  32. Interactively Discovering and Ranking Desired Tuples without Writing SQL Queries.
    Xuedi Qin, Chengliang Chai, Yuyu Luo, Nan Tang, Guoliang Li.
    ACM SIGMOD 2020. [Video Demonstration]
  33. DEEPEYE: A Data Science System for Monitoring and Exploring COVID-19 Data.
    Yuyu Luo, Nan Tang, Guoliang Li, Tianyu Zhao, Wenbo Li, Xiang Yu.
    IEEE Data Engineering Bulletin, 2020. (invited)
  34. CrowdChart: Crowdsourced-based Data Extraction from Visualization Chart.
    Chengliang Chai, Guoliang Li, Ju Fan, Yuyu Luo.
    IEEE TKDE 2020.

    Year 2019
  35. Making Data Visualization More Efficient and Effective: A Survey.
    Xuedi Qin, Yuyu Luo, Nan Tang, Guoliang Li.
    The VLDB Journal.
  36. MathGraph: A Knowledge Graph for Automatically Solving Mathematical Exercises.
    Tianyu Zhao, Yan Huang, Songfan Yang, Yuyu Luo, et al.
    DASFAA 2019. (Best Paper Award)

    Year 2018
  37. DeepEye: Towards Automatic Data Visualization.
    Yuyu Luo, Xuedi Qin, Nan Tang, Guoliang Li.
    IEEE ICDE 2018. (ICDE 2018 Highly Cited Papers Top-2) [DeepEye-APIs (Python3.6)]
  38. DeepEye: Creating Good Data Visualizations by Keyword Search (Demo).
    Yuyu Luo, Xuedi Qin, Nan Tang, Guoliang Li, Xinran Wang.
    ACM SIGMOD 2018. [Online Demo]

PhD Students

Selected Awards

Professional Services