Will an extreme-context-size version of Mixtral be available in the future?
🥳 First, thanks and a salute to everyone who made Mixtral available to the public! 🥳
I run models purely on CPU at the highest precision available.
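For anyone who wants to reproduce that kind of setup, here is a minimal CPU-only sketch using llama-cpp-python; the GGUF file name and the parameter values below are placeholders, not a specific recommendation:

```python
# Minimal sketch of CPU-only inference at high precision (llama-cpp-python).
# The model file name is a placeholder; use whatever quantization you have.
from llama_cpp import Llama

llm = Llama(
    model_path="mixtral-8x22b-instruct-v0.1.Q8_0.gguf",  # placeholder path
    n_ctx=32768,     # requested context window
    n_threads=16,    # set to your physical core count
    n_gpu_layers=0,  # pure CPU: offload no layers to a GPU
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```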
For me, Mixtral-8x22B-Instruct's behavior is much easier to follow than Mixtral-8x7B-Instruct's. 🧐
However, Mixtral-8x22B-Instruct "loses its mind" once the context grows past roughly 20k tokens. 🤪
I believe there should be a strategy for directing attention at previous context, for example compressing or summarizing it, rather than sliding a raw window over the whole history. That preprocessing step alone could significantly reduce the attention budget required; a sketch of the idea follows.
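To be clear about what I mean, here is a minimal sketch of the idea, not Mixtral's actual attention mechanism: once the raw history exceeds a token budget, fold the oldest turns into a compact summary instead of dropping them off the edge of a window. The `summarize` helper is a hypothetical stand-in for a real model call, and the word-count tokenizer is a crude approximation:

```python
# Sketch: keep chat history under a token budget by summarizing old turns
# instead of sliding a raw window over them.

def summarize(text: str) -> str:
    # Stand-in: a real system would ask the model itself to summarize.
    return "[summary] " + text[:160]

def n_tokens(text: str) -> int:
    # Crude word-count proxy; swap in a real tokenizer in practice.
    return len(text.split())

def compress_history(history: list[str], budget: int = 8192) -> list[str]:
    """Keep total history under `budget` tokens by folding the oldest
    turns into summary entries, preserving the most recent turns raw."""
    while len(history) > 2 and sum(map(n_tokens, history)) > budget:
        # Fold the two oldest turns into a single compact summary.
        folded = summarize("\n".join(history[:2]))
        history = [folded] + history[2:]
    return history
```

The point of the sketch is the design choice: recent turns stay verbatim, while older context is handed to the model in compressed form, so attention is spent on a distilled version of the past rather than on every raw token.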
I'm going to try Mistral Large 2 soon. 🤗 I hope it stays more coherent.