Show HN: Made a batching LLM API for a project. Mistral 200 tk/s on RTX 3090 https://ift.tt/W7G1n2s

I ran into a vLLM bug that affected multi-GPU setups and needed a stand-in while it was being fixed: something with the same API format but better performance than the API in text-generation-webui. It's very rough (I'm not a coder by trade), but it gets very fast once there are many simultaneous connections, which is where the batching pays off. https://ift.tt/MvoUb48 December 26, 2023 at 10:52PM
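To see the "fast with many simultaneous connections" point in practice, here is a minimal load-test sketch that fires concurrent completion requests so the server has something to batch. It assumes an OpenAI-style completions endpoint at localhost:5000; the endpoint path, port, model name, and payload fields are my assumptions, not details from the post.

```python
# Hypothetical load test: many simultaneous requests give the server work to batch.
# Endpoint URL, model name, and payload fields are assumed (OpenAI-style), not from the post.
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:5000/v1/completions"  # assumed address of the batching API
N_CLIENTS = 16                                # number of concurrent connections

def one_request(i: int) -> int:
    payload = {
        "model": "mistral-7b",  # assumed model identifier
        "prompt": f"Question {i}: why does batching raise throughput?",
        "max_tokens": 128,
    }
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses report token usage; fall back to 0 if the field is absent.
    return body.get("usage", {}).get("completion_tokens", 0)

start = time.time()
with ThreadPoolExecutor(max_workers=N_CLIENTS) as pool:
    tokens = sum(pool.map(one_request, range(N_CLIENTS)))
elapsed = time.time() - start
print(f"{tokens} tokens in {elapsed:.1f}s ({tokens / elapsed:.0f} tok/s aggregate)")
```

With a single connection the server can only batch one request at a time, so aggregate tokens/sec should climb as N_CLIENTS grows until the GPU saturates.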
