Yu Zhao

yuzhaouoe

https://yuzhaouoe.github.io/

AI & ML interests

NLP/ML

Recent Activity

upvoted a paper 3 days ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published 10 days ago • 138

upvoted a paper 23 days ago

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Paper • 2605.13831 • Published 25 days ago • 87

upvoted a paper about 2 months ago

Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Paper • 2602.12586 • Published Feb 13 • 2

upvoted a paper 7 months ago

OpenSIR: Open-Ended Self-Improving Reasoner

Paper • 2511.00602 • Published Nov 1, 2025 • 21

upvoted a paper 8 months ago

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Paper • 2509.21552 • Published Sep 25, 2025 • 11

upvoted 2 papers 11 months ago

Inverse Scaling in Test-Time Compute

Paper • 2507.14417 • Published Jul 19, 2025 • 28

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11, 2025 • 81

upvoted 2 papers about 1 year ago

Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem

Paper • 2506.03295 • Published Jun 3, 2025 • 17

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Paper • 2505.10610 • Published May 15, 2025 • 55

upvoted an article about 1 year ago

🦸🏻#1: Open-endedness and AI Agents – A Path from Generative to Creative AI?

Article • Kseniase • Dec 25, 2024 • 16

upvoted a paper over 1 year ago

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression

Paper • 2503.02812 • Published Mar 4, 2025 • 10

upvoted a collection over 1 year ago

Q-Filters

Collection • Pre-computed Q-Filters for efficient KV cache compression. • 15 items • Updated Mar 3, 2025 • 7

upvoted 5 papers over 1 year ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 20

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9, 2024 • 42

upvoted an article almost 2 years ago

Improving Hugging Face Training Efficiency Through Packing with Flash Attention 2

Article • RQlee, ArthurZ, achikundu, lwtr, rganti, mayank-mishra • Aug 21, 2024 • 41

upvoted a collection almost 2 years ago

🔍 Interpretability & Analysis of LMs

Collection • Outstanding research in LM interpretability and evaluation, summarized • 136 items • Updated 11 days ago • 119

upvoted an article almost 2 years ago

Introducing RWKV - An RNN with the advantages of a transformer

Article • BlinkDL, Hazzzardous, sgugger, ybelkada • May 15, 2023 • 25
