How I Built a Chatbot for wagah.lk on a Mac Mini

Running an online store, you notice the same questions come back over and over. Is this in stock? How much is it? Do you deliver to my town? Most of them landed in our WhatsApp, and answering them by hand all day isn’t a great use of anyone’s time.

So I decided to build a chatbot for wagah.lk that could handle the easy ones on its own and only pass the real questions to a human. The catch was that I didn’t want a monthly AI bill for it. This is a small business, and paying per message for a model in the cloud adds up fast once people actually start using it.

The answer was to run the model myself. I had a Mac Mini (the 24GB one) sitting on the desk, and that turned out to be plenty for the job.

Why a Mac Mini

The Mac Mini’s unified memory is the trick here. On a 24GB machine, the CPU and the GPU share the same pool of memory, so a model can use most of that 24GB without the usual juggling you’d do on a PC with a separate graphics card. That’s enough to run a small, capable language model comfortably and still leave room for everything else.

The hardware is also a one-time cost. Once it’s bought, the model runs for nothing on top of the electricity. No tokens to count, no per-message charge, and no surprise invoice at the end of the month. For a shop chatbot that mostly answers stock and price questions, that maths makes a lot of sense.

Running the model

I used Ollama to run the model locally. If you haven’t seen it, Ollama is the easiest way I’ve found to pull a model and start talking to it on your own machine. You install it, run one command, and you’ve got a model serving on a local port.

ollama run llama3.1:8b

An 8-billion-parameter model fits well within 24GB and responds quickly enough that a customer waiting on the website doesn’t feel it lagging. You can try a few sizes and see what your machine is happy with. Smaller models are faster but a bit more literal, and bigger ones read more naturally but eat more memory. The 8B size was a good middle ground for me.
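As a rough check on the memory claim, the weights of a model take about (number of parameters) x (bytes per weight). The sketch below assumes a few illustrative precisions (4, 8 and 16 bits per weight) and ignores the context cache and runtime overhead, so read the results as a lower bound, not a measurement.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def fits(params_billions: float, bits_per_weight: float, ram_gb: float) -> bool:
    """True if the weights alone are smaller than the available memory."""
    return weight_memory_gb(params_billions, bits_per_weight) < ram_gb


# Print the estimate for an 8B model at each assumed precision.
for bits in (4, 8, 16):
    print(f"8B model at {bits}-bit weights: about {weight_memory_gb(8, bits):.0f} GB")
```

Even at 16 bits per weight, an 8B model stays under 24GB on paper, and the quantized builds that local runners typically ship are much smaller than that.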

Ollama also exposes a simple HTTP endpoint, which is what makes the next part possible.
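Here is a minimal sketch of calling that endpoint from Python with only the standard library. It assumes Ollama is running on its default port (11434) and that the llama3.1:8b model has already been pulled. The helper names build_payload and ask_model are made up for this example.

```python
import json
import urllib.request

# Ollama's default local address; change it if yours is configured differently.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Request body for a single, non-streamed completion."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_model(prompt: str, timeout: float = 30.0) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the server returns one JSON object
        # whose "response" field holds the generated text.
        return json.loads(response.read())["response"]
```

With the server running, ask_model("Say hello in one short sentence.") returns the model’s reply as a plain string.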

Teaching it about our products

A plain language model knows a lot about the world, but it knows nothing about what we sell. Ask it “do you have this bag in stock” and it’ll happily make something up, which is the last thing you want.

The way around that is a technique called RAG (Retrieval Augmented Generation). The idea is simple. Before the model answers, you go and fetch the relevant facts and hand them to it, so it answers from real data instead of guessing.

In practice this meant:

1. Exporting our product list (name, category, price, stock, a line of description) into a plain file.
2. Turning each product into an embedding, which is a numeric fingerprint of its meaning, so I could search by what a customer means rather than just the exact words they typed.
3. When a question comes in, finding the few products that best match it and passing those to the model along with the question.
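The retrieval step can be sketched as follows. To keep the example runnable without a model, toy_embed is a stand-in that just counts words. A real setup would swap in an embedding model, which is what makes search by meaning possible. The sample products, the function names and the min_score threshold are all assumptions for illustration.

```python
import math
from collections import Counter

# Hypothetical sample rows with the fields listed above.
PRODUCTS = [
    {"name": "Leather handbag", "category": "bags", "price": 8500, "stock": 4,
     "description": "brown leather handbag with zip"},
    {"name": "Canvas tote", "category": "bags", "price": 2500, "stock": 0,
     "description": "cotton canvas tote bag"},
    {"name": "Ceramic mug", "category": "kitchen", "price": 1200, "stock": 12,
     "description": "glazed ceramic coffee mug"},
]


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: word counts. Replace with a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def product_text(product: dict) -> str:
    """The text that gets embedded for one product."""
    return f"{product['name']} {product['category']} {product['description']}"


def top_matches(question: str, products: list, k: int = 2, min_score: float = 0.1) -> list:
    """Return up to k products most similar to the question, best first."""
    q = toy_embed(question)
    scored = [(cosine(q, toy_embed(product_text(p))), p) for p in products]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]
```

In a real service the product embeddings would be computed once and stored, so only the question needs embedding at request time.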

So when someone asks “any leather handbags under 10,000?”, the system pulls the matching products first, then the model writes a friendly answer using only those. If nothing matches, it says so instead of inventing a product. That last part matters more than it sounds.
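One way to enforce the “say so instead of inventing” behaviour is to decide it in code before the model is ever called. This is a sketch under assumptions: the prompt wording, the NO_MATCH_REPLY text and the generate callable (any function that takes a prompt and returns text) are placeholders, not the store’s actual setup.

```python
from typing import Callable, Optional

NO_MATCH_REPLY = "Sorry, I couldn't find a product like that in our catalogue."


def build_prompt(question: str, products: list) -> Optional[str]:
    """Return a prompt grounded in the given products, or None if there are none."""
    if not products:
        return None
    lines = [
        f"- {p['name']}: price {p['price']}, "
        f"{'in stock' if p['stock'] > 0 else 'out of stock'}"
        for p in products
    ]
    return (
        "Answer the customer using ONLY the products listed below. "
        "If the list does not answer the question, say you don't know.\n\n"
        "Products:\n" + "\n".join(lines) + f"\n\nCustomer question: {question}"
    )


def answer(question: str, products: list, generate: Callable[[str], str]) -> str:
    """Skip the model entirely when retrieval found nothing."""
    prompt = build_prompt(question, products)
    if prompt is None:
        return NO_MATCH_REPLY
    return generate(prompt)
```

Because the empty case never reaches the model, there is nothing for it to make up.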

Wiring it into the site

The store itself runs as a static site with a small backend, so adding the chatbot was mostly a matter of giving it somewhere to talk to.

I put a small service in front of the Mac Mini. It takes a question from the website, runs the retrieval step, calls the local model through Ollama, and sends the answer back. The website has a little chat bubble in the corner that posts to that service and shows the reply.

The flow end to end looks like this:

1. A customer types a question in the chat bubble on the site.
2. The request hits my service, which finds the relevant products.
3. Those products plus the question go to the local model.
4. The model writes an answer, and it comes back to the chat bubble.
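The flow above can be sketched as a small HTTP service using Python’s standard library. The retrieve and generate arguments are placeholders for the retrieval step and the Ollama call. The JSON field names ("question", "answer") and the port are assumptions, and a real deployment would also need error handling and CORS headers for the browser.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_question(question, retrieve, generate):
    """Retrieve matching products, ask the model, and return the reply as a dict."""
    products = retrieve(question)
    if not products:
        return {"answer": "Sorry, I couldn't find anything matching that."}
    context = "\n".join(f"{p['name']}: {p['price']}" for p in products)
    return {"answer": generate(f"Products:\n{context}\n\nQuestion: {question}")}


def make_handler(retrieve, generate):
    """Build a request handler class bound to the given retrieve/generate functions."""

    class ChatHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read the JSON body posted by the chat bubble.
            length = int(self.headers.get("Content-Length", 0))
            data = json.loads(self.rfile.read(length) or b"{}")
            result = handle_question(data.get("question", ""), retrieve, generate)
            reply = json.dumps(result).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(reply)))
            self.end_headers()
            self.wfile.write(reply)

    return ChatHandler


def serve(retrieve, generate, port=8000):
    """Run the service until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(retrieve, generate)).serve_forever()
```

Keeping handle_question free of any HTTP details means the same logic can be tested with stub functions or moved behind a different web framework later.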

The whole round trip happens in a second or two, and the model is sitting on a Mac Mini on my desk the entire time.

Keeping it honest

A chatbot that’s loose with the truth is worse than no chatbot. A few rules made it behave.

The first is to stay on topic. In its instructions I told it to only answer about our products, delivery, and the store. If someone asks it to write their homework, it politely declines. The second is to never guess stock or price. It can only use the numbers it was handed from the product data, so if there’s nothing to go on, it says so rather than guessing.

The last rule is to hand off to a human. For anything about an actual order, a payment, or a complaint, it points the customer to WhatsApp instead of trying to handle it itself. Some things should reach a person, and a bot that knows its own limits is far more useful than one that bluffs.
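These rules can be written down as a short system prompt. The wording below is an illustration, not the store’s actual instructions. The keyword check is an extra assumption on top of what is described here: a deliberately blunt guard in code that routes obvious order, payment and complaint messages to a person before the model sees them.

```python
import re

# Illustrative wording only; keep the real instructions short and specific.
SYSTEM_PROMPT = (
    "You are the shop assistant for wagah.lk. "
    "Only answer questions about our products, delivery, and the store. "
    "Only use prices and stock figures given to you; never guess them. "
    "For orders, payments, or complaints, direct the customer to WhatsApp."
)

# Hypothetical keyword stems; tune them to how your customers actually write.
HANDOFF_KEYWORDS = ("order", "payment", "paid", "refund", "complaint")
HANDOFF_REPLY = "For that, please message us on WhatsApp so a person can help."


def needs_human(question: str) -> bool:
    """True if any word in the question starts with a hand-off keyword."""
    words = re.findall(r"[a-z]+", question.lower())
    return any(word.startswith(key) for word in words for key in HANDOFF_KEYWORDS)


def reply(question: str, generate) -> str:
    """Hand off obvious cases in code; let the model handle everything else."""
    if needs_human(question):
        return HANDOFF_REPLY
    return generate(SYSTEM_PROMPT, question)
```

A guard like this will sometimes hand off a question the bot could have answered, which is the safer direction to be wrong in.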

I also keep the model’s instructions short and clear. The more you over-explain to it, the more creative it gets, and creative is not what you want from a shop assistant.

What it cost and what I’d change

The running cost is electricity, and that was the whole point. The Mac Mini was a one-time spend, the software is free, and there’s no usage bill no matter how many people chat.

The one honest trade-off is that the model lives on a machine in one place. If that machine is off, the bot is off. For a small store that’s fine, but if I wanted it bulletproof I’d either keep the Mac Mini on a small backup power supply or fall back to a cloud model only when the local one can’t be reached. I might add that later.
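The fallback idea is small enough to sketch. Here local and cloud are placeholders for any two functions that take a prompt and return text, and no particular cloud provider is assumed. Connection failures and timeouts from Python’s networking code are subclasses of OSError, so that is the exception the sketch catches.

```python
from typing import Callable


def answer_with_fallback(
    prompt: str,
    local: Callable[[str], str],
    cloud: Callable[[str], str],
) -> str:
    """Try the local model first; use the cloud model only if it is unreachable."""
    try:
        return local(prompt)
    except OSError:
        # Covers refused connections, timeouts and urllib's URLError.
        return cloud(prompt)
```

The cloud model is only billed for the requests that arrive while the local machine is down, so the cost stays close to zero in normal operation.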

The other thing I’d improve is keeping the product data fresh on its own, so the bot always knows the current stock without me re-exporting anything by hand. That’s just plumbing, and it’s on the list.
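One half of that plumbing is getting the service to notice a new export without a restart. The sketch below assumes the product file is a CSV with a header row (the format is a guess) and reloads it whenever its modification time changes. Producing the export automatically, and recomputing embeddings for changed rows, would still have to happen elsewhere.

```python
import csv
import os


class ProductCatalogue:
    """Keeps product rows in memory and reloads them when the file changes."""

    def __init__(self, path: str):
        self.path = path
        self._mtime = None
        self._rows = []

    def rows(self) -> list:
        """Return the current rows, re-reading the file only if it was modified."""
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            with open(self.path, newline="", encoding="utf-8") as f:
                self._rows = list(csv.DictReader(f))
            self._mtime = mtime
        return self._rows
```

Calling rows() on every request is cheap, since it costs one file-stat call unless the export has actually changed.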

Conclusion

You don’t need a big budget or a cloud subscription to put a useful chatbot on your website. A Mac Mini, a free local model, and a bit of work to feed it your own data get you something that genuinely takes load off your day. Ours now answers the easy questions around the clock, and the messages that reach me are the ones actually worth my time.

If you’re thinking of doing the same for your own store, let me know. Happy to point you in the right direction.