Pietro Lesci
pietrolesci
AI & ML interests
I like developing and applying causal methods to study the effect of training choices on models’ behaviour, including memorisation, shortcut learning, and tokenisation.
models
27
pietrolesci/tokenisers
pietrolesci/tokenizers
pietrolesci/small_bpe128k • 26
pietrolesci/small_langspec128k • 14
pietrolesci/small_unigramlm128k
pietrolesci/small_tokmix128k
pietrolesci/small_multigram128k • 18
pietrolesci/me100M_finewebedu-20B_bpe32000minipile
pietrolesci/me100M-tied_finewebedu-20B_bpe32000minipile
pietrolesci/me850M_minipile_bpe32000minipile
datasets
56
pietrolesci/unimixlm • 81.9M • 226
pietrolesci/me-minipile-evals • 1.22M • 33
pietrolesci/pile-deduped • 748M • 380
pietrolesci/pythia-deduped-memorisation-profiles • 2.13M • 74
pietrolesci/pile-validation • 429k • 109
pietrolesci/pile-deduped-subset • 16.3k • 36
pietrolesci/pythia-deduped-stats • 16.3M • 1.55k
pietrolesci/pythia-deduped-stats-raw • 14.9M • 13.7k
pietrolesci/agnews • 510k • 139
pietrolesci/amazoncat-13k • 5.99M • 740 • 1