Essential terms and definitions for understanding large language models, AI, and prompt engineering.
API (Application Programming Interface): A way to access LLM capabilities programmatically from your applications.
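In practice, an LLM API call is usually an authenticated HTTP request carrying the prompt as JSON. The sketch below illustrates that shape only; the endpoint URL, model name, and response field are hypothetical placeholders, not any specific provider's schema, so consult your provider's documentation for the real details.

```python
# Hypothetical sketch of an LLM API call; the URL, model name, and JSON
# shapes are placeholders, not a real provider's schema.
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "example-model") -> dict:
    """Package a user prompt into a chat-style request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def send_chat_request(prompt: str, api_key: str) -> str:
    """POST the request and return the generated text (placeholder URL)."""
    req = urllib.request.Request(
        "https://api.example.com/v1/chat",  # placeholder, not a real endpoint
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]  # assumed response field
```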
Benchmark: A standardized test used to evaluate and compare LLM performance.
Context Window: The maximum amount of text (in tokens) an LLM can consider at once.
Embeddings: Numerical vector representations of text that capture semantic meaning.
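Because embeddings encode meaning as geometry, semantically similar texts end up with vectors that point in similar directions, which is typically measured with cosine similarity. A toy sketch (the 3-dimensional vectors are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions):

```python
# Toy cosine-similarity comparison; the vectors are invented for
# illustration, not produced by a real embedding model.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cat = [0.9, 0.1, 0.0]      # pretend embedding of "cat"
kitten = [0.8, 0.2, 0.1]   # pretend embedding of "kitten"
car = [0.0, 0.1, 0.9]      # pretend embedding of "car"

# Related texts should score higher than unrelated ones.
related = cosine_similarity(cat, kitten)
unrelated = cosine_similarity(cat, car)
```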
Fine-Tuning: Training a pre-trained model on specific data to improve performance on particular tasks.
Hallucination: When an LLM generates confident but factually incorrect or fabricated information.
Inference: The process of running a trained model to generate predictions or outputs.
Large Language Model (LLM): An AI model trained on massive text datasets to understand and generate human-like text.
Latency: The time delay between sending a prompt and receiving a response.
Multimodal: Describes AI models that can process multiple types of input, such as text, images, and audio.
Open-Weights Model: An LLM with publicly available weights that can be downloaded and self-hosted.
Parameter: A learnable value in a neural network; more parameters generally mean greater capability.
Prompt: The input text you provide to an LLM to get a response.
Prompt Engineering: The practice of crafting effective prompts to elicit optimal responses from LLMs.
Retrieval-Augmented Generation (RAG): A technique that retrieves relevant documents to provide context for LLM responses.
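The core of RAG is a two-step loop: rank your documents against the query, then paste the best matches into the prompt as context. As a minimal sketch, the ranking below uses simple word overlap as a stand-in scorer; real systems typically rank by embedding similarity against a vector index:

```python
# Minimal RAG sketch. Word overlap stands in for the embedding-similarity
# ranking a real system would use; everything here is illustrative.
def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    q_words = set(query.lower().split())
    return sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query, documents):
    """Inject the retrieved documents into the prompt as context."""
    context = "\n".join(retrieve(query, documents))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    "The context window is the maximum text an LLM can consider at once.",
    "Temperature controls randomness in LLM outputs.",
]
prompt = build_prompt("What does temperature control?", docs)
```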
System Prompt: Instructions that define the LLM's behavior, persona, and constraints.
Streaming: Receiving LLM output token by token as it is generated, rather than waiting for the complete response.
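From the caller's side, streaming means consuming an iterator of chunks and handling each one as it arrives. In this sketch a plain generator simulates the token stream, since a real one would come from the API client:

```python
# Streaming sketch: a generator simulates the network token stream,
# and the consumer handles each chunk as soon as it arrives.
def stream_tokens(text):
    """Simulate a token stream by yielding one whitespace-split piece at a time."""
    for token in text.split():
        yield token + " "

def consume(stream):
    """Accumulate chunks as they arrive, as a UI would when rendering live text."""
    parts = []
    for chunk in stream:
        parts.append(chunk)  # in a real app: append to the displayed text
    return "".join(parts).rstrip()
```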
Token: The basic unit of text that LLMs process; roughly 4 characters or 0.75 words in typical English.
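Those two rules of thumb can be turned into a rough token estimator. This is only a heuristic for back-of-the-envelope cost and context-window planning; real tokenizers vary by model, so use the model's own tokenizer for exact counts:

```python
# Rough token estimate from the ~4 chars/token and ~0.75 words/token
# heuristics above. Real tokenizers differ by model; this is only a guess.
def estimate_tokens(text: str) -> int:
    """Average the character-based and word-based estimates."""
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round((by_chars + by_words) / 2)
```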
Temperature: A parameter that controls randomness in LLM outputs; lower values are more deterministic.
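Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token (more deterministic), high values flatten it toward uniform (more varied). A small sketch with made-up logits:

```python
# How temperature reshapes the sampling distribution: logits are divided
# by the temperature before softmax. The logit values are illustrative.
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # sharply peaked: near-argmax
hot = softmax_with_temperature(logits, 2.0)   # flatter: closer to uniform
```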
Transformer: The neural network architecture underlying virtually all modern LLMs, introduced in 2017.
Now that you understand the terminology, test models side-by-side with your own prompts.