gn00029914

AI & ML interests

None yet

Recent Activity

liked a model about 23 hours ago

sgrankin/gemma-4-E4B-it-oQ8-fp16

liked a model about 23 hours ago

z-lab/gemma-4-31B-it-DFlash

liked a model about 23 hours ago

mlx-community/gemma-4-31B-it-assistant-bf16

upvoted a collection about 24 hours ago

Gemma-4 Assistant (MTP)

4 items • Updated 15 days ago • 24

upvoted 4 papers 5 days ago

SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving

Paper • 2604.19157 • Published 30 days ago • 1

A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work

Paper • 2604.18555 • Published about 1 month ago • 1

Polynomial-Time Optimal Group Selection via the Double-Commutator Eigenvalue Problem

Paper • 2605.00834 • Published 13 days ago • 1

Unification of Signal Transform Theory

Paper • 2605.11589 • Published 9 days ago • 1

upvoted 3 papers 6 days ago

Approximating Uniform Random Rotations by Two-Block Structured Hadamard Rotations in High Dimensions

Paper • 2604.23418 • Published 26 days ago • 1

Generating Hadamard matrices with transformers

Paper • 2604.11101 • Published 10 days ago • 1

The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval

Paper • 2502.11276 • Published Feb 16, 2025 • 1

upvoted a collection 7 days ago

F2LLM

23 items • Updated Mar 20 • 5

upvoted 2 papers 7 days ago

F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World

Paper • 2603.19223 • Published Mar 19 • 33

Revisiting RaBitQ and TurboQuant: A Symmetric Comparison of Methods, Theory, and Experiments

Paper • 2604.19528 • Published 21 days ago • 1

upvoted 7 papers 8 days ago

The Newton-Muon Optimizer

Paper • 2604.01472 • Published Apr 1 • 1

PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective

Paper • 2505.21799 • Published Feb 5 • 1

Sparser, Faster, Lighter Transformer Language Models

Paper • 2603.23198 • Published 13 days ago • 2

Spectrum-Adaptive Generalization Bounds for Trained Deep Transformers

Paper • 2605.07297 • Published 13 days ago • 1

Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published 14 days ago • 185

ELF: Embedded Language Flows

Paper • 2605.10938 • Published 10 days ago • 14

Towards Closing the Autoregressive Gap in Language Modeling via Entropy-Gated Continuous Bitstream Diffusion

Paper • 2605.07013 • Published 14 days ago • 2

upvoted a collection 18 days ago

Granite 4.1 Language Models

Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 6 items • Updated 21 days ago • 51

upvoted a paper 21 days ago

Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers

Paper • 2410.08304 • Published Oct 10, 2024 • 1