Julia K

juliak115

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 6 days ago

DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning

Paper • 2603.12257 • Published 8 days ago • 31

Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data

Paper • 2603.07534 • Published 12 days ago • 5

upvoted 11 papers about 2 months ago

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions

Paper • 2601.17640 • Published Jan 25 • 5

daVinci-Dev: Agent-native Mid-training for Software Engineering

Paper • 2601.18418 • Published Jan 26 • 126

Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

Paper • 2601.14417 • Published Jan 20 • 5

HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published Jan 15 • 48

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 49

Motion Attribution for Video Generation

Paper • 2601.08828 • Published Jan 13 • 71

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 318

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published Jan 11 • 214

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 200

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195

Rethinking Video Generation Model for the Embodied World

Paper • 2601.15282 • Published Jan 21 • 44

upvoted 2 papers 8 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 273

Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

Paper • 2508.01691 • Published Aug 3, 2025 • 10