sai_reddy

saireddy

AI & ML interests

None yet

New activity in moonshotai/Kimi-Linear-48B-A3B-Instruct 2 months ago

insights on comparisons with Qwen/Qwen3-Next-80B-A3B-Instruct ?

#14 opened 2 months ago by

New activity in Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 3 months ago

function calling

#4 opened 3 months ago by

New activity in Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 5 months ago

Possible to extend context to 1M tokens?

#5 opened 5 months ago by

New activity in google/gemma-2-9b over 1 year ago

RuntimeError: Index put requires the source and destination dtypes match, got BFloat16 for the destination and Float for the source.

#24 opened over 1 year ago by

model.generate is throwing AttributeError: 'HybridCache' object has no attribute 'float'

#18 opened over 1 year ago by

base vs instruct model

#17 opened over 1 year ago by

Inference error

#20 opened over 1 year ago by

New activity in google/gemma-7b over 1 year ago

8-bit precision error

#32 opened almost 2 years ago by

New activity in google/gemma-7b-it over 1 year ago

ValueError with multiple A100 GPUs

#28 opened almost 2 years ago by

New activity in meta-llama/Meta-Llama-3-8B-Instruct over 1 year ago

ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on.

#35 opened over 1 year ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct over 1 year ago

Base vs instruct

#17 opened over 1 year ago by

New activity in google/gemma-7b-it almost 2 years ago

Could not find GemmaForCausalLM neither in <module 'transformers.models.gemma'

#36 opened almost 2 years ago by