you should try this.
#8
by
ZeroWw
- opened
By using another layer, it might improve even more.
https://huggingface.co/moelanoby/phi-3-M3-coder/blob/main/architecture.py
By using another layer, it might improve even more.
https://huggingface.co/moelanoby/phi-3-M3-coder/blob/main/architecture.py