Update README.md #1
by TimeRobber - opened
README.md CHANGED

```diff
@@ -11,7 +11,9 @@ license: apache-2.0
 
 ## Version 1.1
 
-[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model
+[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model
+
+- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).
 
 - Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.
 
```
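For context on the GEGLU bullet added in this diff: GEGLU replaces the single ReLU hidden layer of the feed-forward block with a GELU-gated pair of projections. A minimal NumPy sketch, with made-up dimensions and the common tanh approximation of GELU (illustrative only, not T5's actual implementation):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU (the exact erf-based form also works).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def geglu_ffn(x, w_gate, w_lin, w_out):
    # GEGLU feed-forward: elementwise product of a GELU-activated "gate"
    # projection and a plain linear projection, then project back to d_model.
    return (gelu(x @ w_gate) * (x @ w_lin)) @ w_out

# Tiny demo with hypothetical sizes (d_model=8, d_ff=16).
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))        # (batch, d_model)
w_gate = rng.normal(size=(8, 16))  # gating projection
w_lin = rng.normal(size=(8, 16))   # linear projection
w_out = rng.normal(size=(16, 8))   # output projection back to d_model
y = geglu_ffn(x, w_gate, w_lin, w_out)
print(y.shape)
```

Note the gated form uses two input projections instead of one, so a GEGLU layer with the same `d_ff` has roughly 50% more feed-forward parameters than the ReLU version.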