LLM - Tag - vo.rs

Prompt Injection: The SQL Injection of the AI Era

Wed, 20 May 2026 10:30:00 +0000

Every generation of software gets the vulnerability it deserves. The web era handed us SQL injection, a flaw so persistent it still ranks near the top of vulnerability lists decades after the fix was well understood. The large language model era has produced its own signature weakness, and it rhymes almost perfectly with the old one. It is called prompt injection, and if you are building anything that lets a model read untrusted text, you need to understand it.

RAG Explained: How AI Stops Making Things Up

Tue, 07 Apr 2026 11:30:00 +0000

Imagine a brilliant colleague who has read most of the internet, speaks with unshakeable confidence, and occasionally invents a fact so smoothly that you only catch it because you happen to know the truth. That is a large language model on a bad day. It is not lying, exactly; it simply does not know what it does not know. Retrieval-Augmented Generation, or RAG, is the technique that hands that colleague a library card and a quiet instruction: before you answer, go and look it up. The result is an AI that grounds its words in real documents rather than in the foggy recollections of its training data.

What Is a Token, Really? How LLMs Read, Reason, and Bill You

Fri, 13 Mar 2026 16:00:00 +0000

Every conversation you have with a language model is quietly measured, chopped, and counted in a unit you almost never see. It is not the word, nor quite the letter. It is the token: the atom of AI text, the thing the model actually reads, the thing your bill is calculated from, and the reason your carefully crafted prompt sometimes behaves in ways that feel slightly arbitrary. Understand tokens and a great deal about how these systems read, reason, and charge suddenly clicks into place.

Local AI on Your Own Metal: Running LLMs Offline with Ollama

Tue, 24 Feb 2026 11:00:00 +0000

Not so long ago the idea of a capable language model running on the computer under your desk, with no internet connection and no monthly bill, sounded faintly absurd. We have written before about the leap from the stumbling early days of GPT-2 to the polished conversations of modern chatbots, and the assumption baked into all of it was that the clever part lived in someone else’s datacentre. That assumption no longer holds. A tool called Ollama has made running open-weight language models on your own hardware about as difficult as installing a music player. This guide shows you how to do it, what to expect from the machine you already own, and where the honest limits lie.