taehyeon
rlaxogus99
AI & ML interests
None yet
Organizations
None yet
multimodal
agent
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49 -
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
Paper • 2411.10323 • Published • 34 -
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper • 2411.17465 • Published • 89
coding
video
reasoning
peft
rag
quantize
-
BitNet a4.8: 4-bit Activations for 1-bit LLMs
Paper • 2411.04965 • Published • 69 -
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
Paper • 2411.02355 • Published • 51 -
Ultra-Sparse Memory Network
Paper • 2411.12364 • Published • 23 -
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper • 2412.04467 • Published • 117
CV
reasoning
multimodal
peft
agent
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49 -
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
Paper • 2411.10323 • Published • 34 -
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper • 2411.17465 • Published • 89
rag
coding
quantize
-
BitNet a4.8: 4-bit Activations for 1-bit LLMs
Paper • 2411.04965 • Published • 69 -
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
Paper • 2411.02355 • Published • 51 -
Ultra-Sparse Memory Network
Paper • 2411.12364 • Published • 23 -
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper • 2412.04467 • Published • 117
video