Feedback on new computer build for locally running Mistral-7B

I deployed the Mistral-7B model on refurbished hardware and would like to take that learning path to the next level. So I am presently assembling parts for a new, dedicated computer to deploy Mistral-7B (or another small model) and develop applications on top of it.

As extra context, I have worked with computers for a while, but never with the goal of running a locally maintained, self-hosted AIaaS. I am fairly familiar with PC hardware, but could use some advice here.

Would you consider the following core components sufficient to house and power an AI model (such as Mistral-7B) for learning and tinkering purposes?

  • Corsair Vengeance LPX DDR4 RAM 32GB
  • Samsung 990 EVO Plus SSD 1TB, PCIe Gen 4x4, Gen 5x2 M.2 2280
  • ASUS TUF Gaming A520M-PLUS AMD AM4 microATX Motherboard
  • AMD Ryzen 5 5500 CPU
  • ASUS Dual GeForce RTX™ 5060 8GB GPU
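As a sanity check on the 8 GB card, here is a rough back-of-the-envelope sketch (my own numbers, not an official spec) of how much VRAM the weights of a ~7B model need at common GGUF quantization levels:

```python
# Rough VRAM estimate for the weights of a ~7B-parameter model.
# These are approximations; real usage adds KV cache and runtime
# overhead (assume roughly 1-2 GB on top of the weights).

PARAMS = 7.2e9  # approximate parameter count for Mistral-7B

def weight_gb(bits_per_param: float) -> float:
    """Size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

# Approximate effective bits per weight for common GGUF formats
for name, bits in [("FP16", 16), ("Q8_0", 8.5), ("Q4_K_M", 4.8)]:
    print(f"{name:>7}: ~{weight_gb(bits):.1f} GB")
```

The takeaway: a Q4-quantized 7B model fits comfortably in 8 GB, but FP16 does not, and the KV cache and runtime overhead come on top of the weights.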

My ultimate objective is not to train models, but to build private full stack applications using artificial intelligence. I would like to supercharge my understanding of and familiarity with AI using my own homelab.

I appreciate any feedback on this subject.

In my experience, GPU memory is king. Having a good CPU and fast storage helps a lot.

I have the following setup:

  • Ryzen 9 5950X
  • 32 GB RAM
  • 1 TB NVMe SSD
  • RTX 3080 10 GB

I run Ollama bare metal on this computer (Ubuntu 24.04 LTS), and on a separate computer I set up Ollama and Flowise in Docker. My biggest limitation is that only smaller models, such as llama3.1:8b, run relatively fast.
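For anyone following along, the setup described above looks roughly like this. This is a sketch: ports, tags, and container names are illustrative, and the Docker commands omit GPU passthrough flags.

```shell
# Bare-metal Ollama on the Ubuntu box (official install script):
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.1:8b   # pulls the model on first run, then chats

# On the second machine: Ollama and Flowise in Docker.
# The volume keeps pulled models across container restarts.
docker run -d --name ollama -p 11434:11434 \
  -v ollama:/root/.ollama ollama/ollama
docker run -d --name flowise -p 3000:3000 flowiseai/flowise
```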

The amount of GPU memory matters far more than the GPU's speed.

(I am about to deploy this as an agent to search and interact with our internal wiki, which runs on BookStack.)


Thank you for your response and for sharing your specs. It seems your local AI setup is rather similar to my own near-future machine, which is encouraging to read.

Did you do research ahead of assembling your hardware? I ask because my research shows a build such as yours is more than sufficient for running small LLMs.

Do you have any noticeable latency, delay or lag when interacting with your AI model?

You mention that GPU memory is important. Do you ever run out of VRAM? I am concerned that 8 GB is not enough for much more than the basics, such as chatting and simple app development.
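For what it's worth, the KV cache is usually what pushes an 8 GB card over the edge at longer contexts. Below is a rough sketch using Mistral-7B's published architecture (32 layers, 8 KV heads via grouped-query attention, head dimension 128, FP16 cache); treat the numbers as estimates:

```python
# Why long contexts eat VRAM: every token in the context stores keys
# and values for every layer. Shape constants are Mistral-7B's.

LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_VALUE = 32, 8, 128, 2  # FP16 cache

def kv_cache_gb(context_tokens: int) -> float:
    # 2x for keys + values, per layer, per KV head, per head dimension
    per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return context_tokens * per_token / 1e9

for ctx in (2048, 8192, 32768):
    print(f"{ctx:>6} tokens -> ~{kv_cache_gb(ctx):.2f} GB of KV cache")
```

So on top of ~4 GB of Q4 weights, a long context can add another gigabyte or more, which is exactly where an 8 GB card starts to feel tight.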

Mistral recommends a 16 GB GPU. Personally, I would double the system RAM and go for a bit more CPU; the Ryzen 7 5700X costs just a little more.


I did zero research 🙂, though not because I usually skip it. This PC was simply available to me; we purchased it to drive LED walls, but that never happened.

The VRAM seems a bit limited. llama3.1:8b does well, but I have had no chance to load-test the system with multiple users. Embedding models are fine at around 1.3 GB max usage, but that means Ollama has to unload the main model whenever I run embeddings.


That is all good to know.

Maybe the 5060 16 GB model would be more appropriate, then? I will consider going that route.

Thank you for the information.


Thank you for sharing your experiences.

I don’t plan on having multiple users access any AI models hosted locally on my LAN, as I am still in the early phases of learning how to deploy and work with these technologies. All of this tinkering is a means of figuring it out, though.

From what you’re saying, it sounds like 8GB (or even 16GB) might be insufficient for maintaining a reliable AI-driven service for an organization. Is that right?


With 16 GB, you can run decent models. It depends on whether you are constantly switching between them, for example, doing embedding and then running other workflows. If the computer has to unload the model and load a new one, it could take some time.
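If the unload/reload delay becomes a problem, Ollama exposes a couple of knobs for it. A sketch (the model name and values are illustrative):

```shell
# Preload a model and keep it resident for an hour so it isn't
# unloaded between requests (keep_alive also accepts -1 to pin it):
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.1:8b", "keep_alive": "1h"}'

# Or allow two models resident at once (e.g. chat + embedding).
# Set in the server's environment before starting Ollama:
export OLLAMA_MAX_LOADED_MODELS=2
```

Keeping both the chat and embedding models loaded trades VRAM headroom for avoiding the swap delay, so it fits better on a 16 GB card than an 8 GB one.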


I ultimately ended up purchasing a 5060 16 GB GPU for this project. However, I don’t (initially) plan on switching between different models, instead working with one LLM at a time.


As one last update for this thread, below I have attached a photo of the parts I am using for my AI machine:

Thank you for the recommendation to go with a 16 GB GPU instead of an 8 GB one. To the best of my understanding, that change is going to make a significant difference in how this learning exercise plays out for me.

Wish me luck as I adventure into territories unknown. Pretty excited TBH.


New PC/hardware day is always exciting.
