Jiayu LIU 刘家毓
👋 Welcome to my homepage! 🥂
I’m Jiayu LIU 刘家毓, a third-year undergraduate CS student supervised by Prof. Yangqiu Song and Prof. Yiren Fung at HKUST.
💞️ I’m passionate about playing the piano, the violin, and football, and about working out at the gym.
🌱 I’m currently interested in Natural Language Processing, especially in:
Improving LLM trustworthiness:
GProofT (FEVER 2024),
MarConf (ACL 2025 Main),
MarPT (Under review in ACL Rolling Review),
CritiCal (Under review in ACL Rolling Review).
Evaluating and enhancing LLM reasoning capabilities:
RFMDataset (MathNLP 2025, under review in ACL Rolling Review),
Multirole-R1 (Under review in ICLR).
Advanced tool-use capabilities in agentic systems:
CostBench (Under review in ACL Rolling Review).
🖋️ Google Scholar
📫 Contact: jliufv@connect.ust.hk
😄 Pronouns: He/Him
🔥 News
- [2025/11] Released CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents, which got 20 upvotes and ranked #4 in Hugging Face Daily Papers (November 6th)!
- [2025/8] Honored to receive the UROP Support Grant and UROP Research Travel Sponsorship!
- [2025/7] Released Diversity-Enhanced Subjective Question-Answering, which got 22 upvotes and ranked #8 in Hugging Face Daily Papers (July 29th)!
- [2025/7] Will join University of Illinois Urbana-Champaign as an exchange undergraduate student in Spring 2026!
Actively seeking research internship opportunities!
- [2025/5] My first-author paper Revisiting Epistemic Markers in Confidence Estimation was accepted to ACL 2025 Main!
Sincere gratitude to all my collaborators!
- [2025/2] Honored to join HKUST RenAI Lab!
Looking forward to learning from Prof. Fung Yiren!
- [2025/1] Honored to receive HKIE Scholarship 2024/25!
- [2024/10] My co-first-author paper GProofT was accepted to The Seventh FEVER Workshop!
- [2024/9] Honored to receive The Joseph Lau Luen Hung Charitable Trust Scholarship 2024/25!
- [2024/6] Traveled to Charles University in Prague for summer exchange!
Wonderful experience; loved everything there 🥰
- [2024/6] Honored to join HKUST KnowComp Group!
Looking forward to learning from Prof. Song Yangqiu!
- [2023/9] Honored to receive China Soong Ching Ling Foundation Zhiyuan Bursary!
📖 Publications
🧩 Trustworthiness and Reliability
Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models’ Uncertainty?
Jiayu Liu, Qing Zong, Weiqi Wang, Yangqiu Song
ACL 2025 Main
GProofT: A Multi-dimension Multi-round Fact Checking Framework Based on Claim Fact Extraction
Jiayu Liu*, Junhao Tang*, Hanwen Wang*, Baixuan Xu, Haochen Shi, Weiqi Wang, Yangqiu Song
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
Prospect Theory Fails for LLMs: Revealing Instability of Decision-Making under Epistemic Uncertainty
Rui Wang*, Qihan Lin*, Jiayu Liu*, Qing Zong, Tianshi Zheng, Weiqi Wang, Yangqiu Song
Under review in ACL Rolling Review
CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?
Qing Zong, Jiayu Liu, Tianshi Zheng, Chunyang Li, Baixuan Xu, Haochen Shi, Weiqi Wang, Zhaowei Wang, Chunkit Chan, Yangqiu Song
Under review in ACL Rolling Review
🧠 Advanced Reasoning
Diversity-Enhanced Reasoning for Subjective Questions
Yumeng Wang*, Zhiyuan Fan*, Jiayu Liu*, Jen-tse Huang, Yi R. Fung
Under review in ICLR 2026
Mathematical Proof as a Litmus Test: Revealing Failure Modes of Advanced Large Reasoning Models
Dadi Guo*, Jiayu Liu*, Zhiyuan Fan, Zhitao He, Haoran Li, Yumeng Wang, Yi R. Fung
MathNLP 2025, under review in ACL Rolling Review
🧰 Tool Use
CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Agents
Jiayu Liu, Cheng Qian, Zhaochen Su, Qing Zong, Shijue Huang, Bingxiang He, Yi R. Fung
Under review in ACL Rolling Review
🤝 Collaboration
VLM-Dixit: Investigating Multi-Modal Abductive Reasoning and Entailment Verification with VLMs in Dixit Gameplay
MO Yunxiang*, Tianshi Zheng*, Qing Zong, Jiayu Liu, Baixuan Xu, Yauwai Yim, Chunkit Chan, Jiaxin Bai, Yangqiu Song
The 5th Wordplay: When Language Meets Games @ EMNLP 2025
🧾 Academic Services
- [2025/5] Reviewer for IJCAI 2025
