tomg-group-umd's Collections
Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 16
smcleish/Recurrent-Llama-3.2-train-recurrence-32 • Text Generation • 1B • Updated • 29
smcleish/Recurrent-Llama-3.2-train-recurrence-16 • Text Generation • 1B • Updated • 11
smcleish/Recurrent-Llama-3.2-train-recurrence-8 • Text Generation • 1B • Updated • 1.11k
smcleish/Recurrent-Llama-3.2-train-recurrence-4 • Text Generation • 1B • Updated • 7
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32 • Text Generation • 0.8B • Updated • 11 • 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16 • Text Generation • 0.8B • Updated • 10 • 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8 • Text Generation • 0.8B • Updated • 12
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4 • Text Generation • 0.8B • Updated • 10
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32 • Text Generation • 1B • Updated • 8 • 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16 • Text Generation • 1B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8 • Text Generation • 1B • Updated • 10
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4 • Text Generation • 1B • Updated • 10 • 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase • Text Generation • 0.8B • Updated • 7
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase • Text Generation • 0.8B • Updated • 7
smcleish/Recurrent-Llama-3.2-untrained • Text Generation • 1B • Updated • 15
smcleish/Recurrent-TinyLlama-3T-untrained • Text Generation • 0.8B • Updated • 9
smcleish/Recurrent-OLMo-2-0425-untrained • Text Generation • 1B • Updated • 6
smcleish/Recurrent-Llama-3.2-2-4-2-untrained • Text Generation • 1B • Updated • 4 • 1
smcleish/retrofitting-llama-fineweb-edu-tokenized • Viewer • Updated • 332M • 462