Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
portfolio
The Legend of Zelda (Unity Remaster)
This is a remaster of the original Legend of Zelda game. We authentically implemented the first dungeon of the original game. In this project, I implemented the room transitions, locked doors, enemy AI, the player’s attack and health systems, etc.
SpaceShift: The Size Puzzle
This is my first original game. The key mechanic is that the player can shift among three different sizes, each of which interacts with the environment differently. Players need to keep changing size to navigate through tunnels, air currents, and breakable platforms and to solve the platform puzzles.
Chrono Portals: A Journey Home
This is the final project game developed by our team of five: Shufeng Chen, Xuteng Luo, Yufan Wu, Yushi Li, and me. It’s a puzzle game featuring portals that change not only the position but also the age of the player and her belongings.
publications
Robust Sparse Mean Estimation via Incremental Learning
Jianhao Ma, Rui Ray Chen, Yinghui He, Salar Fattahi, and Wei Hu
Download Paper
ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning, 2023
Hi-ToM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models
Yinghui He, Yufan Wu, Yilin Jia, Rada Mihalcea, Yulong Chen, and Naihao Deng
Download Paper
EMNLP 2023, 2023
“They don’t know that we know they know we know” 🤯 — Does GPT-4 have Higher-Order Theory of Mind? Introducing 👋 Hi-ToM: a benchmark pushing LLMs to their limits in higher-order ToM (3rd order & beyond). LLMs’ performance declines drastically to near 0 📉 on 3rd and 4th!
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
Xi Ye, Fangcong Yin*, Yinghui He*, Joie Zhang*, Howard Yen*, Tianyu Gao, Greg Durrett, Danqi Chen
Download Paper
arXiv preprint, 2025
🤔 Now most LLMs have >= 128K context sizes, but are they good at generating long outputs, such as writing an 8K-token chain-of-thought for a planning problem? 🔔 Introducing LongProc (Long Procedural Generation), a new benchmark with 6 diverse tasks that challenge LLMs to synthesize highly dispersed information and generate long, structured outputs.
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu*, Yinghui He*, Xinzhe Juan*, Yiming Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang
Download Paper
arXiv preprint, 2025
Can AI Be Blamed for a Teen’s Suicide? Do AI Chatbots encourage suicide? 🧒📱What if your teen’s favorite AI character crossed the line? 💔 A 14-year-old boy in Florida took his own life after forming a deep bond with an AI character on http://Character.AI. The AI chatbot — modeled after a Game of Thrones persona — reportedly discussed his suicidal thoughts and encouraged these dangerous ideas. ⚠️AI can help, but unfortunately, it can also harm.
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models
Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora
Download Paper
arXiv preprint, 2025
Kids improve when a good teacher offers adaptive, targeted feedback. Can a small LLM benefit if a large LLM provides helpful feedback in-context? Naive ideas fail here. We propose AdaptMI: adaptive, skill-based in-context supervision that boosts 1B models by 6% on challenging math tasks.
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk; note the different field in type. You can put anything in this field.
teaching
Writing Consultant, 2021-2022
Undergraduate course, Shanghai Jiao Tong University, 2021
Chemistry I, Fall 2021
Undergraduate course, Shanghai Jiao Tong University, 2021
Physics I, Summer 2022
Undergraduate course, Shanghai Jiao Tong University, 2022
Intro to Computer Science theory, Winter 2024
Undergraduate course, University of Michigan, EECS Department, 2024