Temperature, Top-p, Top-k

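Language models don’t always pick the most likely next word (strictly, the next token). At each step the model assigns a probability to every candidate, and the next word is sampled from that distribution. Parameters like temperature, top‑p, and top‑k shape that sampling step, making the output more predictable or more varied.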

Temperature

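Temperature rescales the model's raw scores (logits) before they are turned into probabilities: each score is divided by the temperature, then softmax is applied. A temperature below 1 sharpens the distribution, so the top choices dominate and the output becomes more predictable and repetitive. A temperature above 1 flattens it, so less likely words are sampled more often and the output becomes more random and creative.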

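Typical ranges (rules of thumb; exact behavior varies by model):
0.0–0.2: near‑deterministic, almost always picks the most likely word; only 0.0 (greedy decoding) is strictly deterministic (good for facts, code).
0.5–0.7: balanced, slight variation (good for chatbots).
0.8–1.0: creative, may produce surprises (good for poetry, brainstorming).
>1.0: unlikely words are picked more and more often; output becomes increasingly incoherent as the value rises.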

Illustrative outputs for the same prompt:
temperature = 0.1: "The cat sat on the mat."
temperature = 0.9: "The fluffy feline perched upon the cozy rug."
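
To make the scaling concrete, here is a minimal sketch in plain Python. It assumes the model exposes raw scores (logits) for each candidate; the function name softmax_with_temperature and the three example logits are hypothetical and chosen only for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide each logit by the temperature, then apply softmax."""
    if temperature <= 0:
        # Temperature 0 is usually handled as greedy argmax, not by division.
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate words.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)   # sharpened distribution
high = softmax_with_temperature(logits, 1.5)  # flattened distribution
print("T=0.2:", [round(p, 3) for p in low])
print("T=1.5:", [round(p, 3) for p in high])
```

At the low temperature almost all of the probability lands on the first candidate; at the high temperature it is spread more evenly across all three.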

Top‑p (Nucleus Sampling)

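Instead of considering all possible next words, top‑p ranks them by probability and keeps the smallest set whose cumulative probability reaches p (e.g., 0.9). The next word is then sampled from that set only. The number of candidates adjusts dynamically: few when the model is confident, many when it is unsure.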

p = 1.0: consider all words.
p = 0.9: consider only the top 90% probability mass (filters out very unlikely words).
Lower p = more focused, less diversity.
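
The selection rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not any library's implementation: top_p_filter and the example probabilities are hypothetical, and the cutoff uses "reaches or exceeds p", which real implementations may handle slightly differently at the boundary.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of words whose cumulative probability reaches p,
    then renormalize the survivors so they sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {word: prob / total for word, prob in kept}

# Hypothetical next-word probabilities.
probs = {"mat": 0.5, "rug": 0.3, "sofa": 0.15, "moon": 0.05}
nucleus = top_p_filter(probs, p=0.9)
print(sorted(nucleus))  # "moon" falls outside the nucleus
```

With p = 0.9 the first two words cover only 0.8, so a third is needed; the unlikely fourth word is dropped.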

Top‑k

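Top‑k sampling limits the model to the k most likely next words. For example, top‑k=40 means only the 40 highest‑probability words are candidates; the rest are ignored and the remaining probabilities are renormalized. Unlike top‑p, the number of candidates is fixed no matter how confident the model is.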
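
A minimal sketch of the same idea in plain Python, assuming the probabilities are available as a dictionary. The name top_k_filter and the example values are hypothetical.

```python
def top_k_filter(probs, k=40):
    """Keep only the k most likely words and renormalize them to sum to 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(prob for _, prob in ranked)
    return {word: prob / total for word, prob in ranked}

# Hypothetical next-word probabilities.
probs = {"mat": 0.5, "rug": 0.3, "sofa": 0.15, "moon": 0.05}
top2 = top_k_filter(probs, k=2)
print(top2)  # only "mat" and "rug" remain, rescaled to sum to 1
```

If k is larger than the number of available words, the filter keeps everything and changes nothing.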

How to Use Together

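In practice, you often set temperature and top‑p together. As starting points, for factual tasks: temperature 0.1, top‑p 0.9. For creative writing: temperature 0.8, top‑p 0.9. Top‑k is less common but useful as a hard cap on how many candidate words are considered. When several are set, the filters stack: a word must pass every one of them to remain a candidate.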
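
One way the three settings might be combined is sketched below. The order used here (temperature, then top‑k, then top‑p) is one common choice and an assumption of this sketch; libraries and APIs can differ. The function sample_next_token and the example logits are hypothetical.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=1.0, rng=random):
    """Sample one word from a dict of {word: logit}.
    Assumed order: temperature, then top-k, then top-p."""
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if temperature == 0:
        return max(logits, key=logits.get)  # greedy decoding

    # 1. Temperature: scale the logits, then softmax.
    m = max(logits.values())
    weights = {w: math.exp((s - m) / temperature) for w, s in logits.items()}
    total = sum(weights.values())
    ranked = sorted(((w, v / total) for w, v in weights.items()),
                    key=lambda kv: kv[1], reverse=True)

    # 2. Top-k: keep the k most likely words and renormalize.
    if top_k is not None:
        ranked = ranked[:top_k]
        total = sum(p for _, p in ranked)
        ranked = [(w, p / total) for w, p in ranked]

    # 3. Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # 4. Sample from the survivors (random.choices accepts unnormalized weights).
    words = [w for w, _ in kept]
    probs = [p for _, p in kept]
    return rng.choices(words, weights=probs, k=1)[0]

# Hypothetical logits for four candidate words.
logits = {"mat": 2.0, "rug": 1.0, "sofa": 0.1, "moon": -1.0}
print(sample_next_token(logits, temperature=0.1, top_p=0.9))  # factual-style settings
print(sample_next_token(logits, temperature=0.8, top_p=0.9))  # creative-style settings
```

Passing a seeded random.Random instance as rng makes the draws reproducible, which helps when comparing settings.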

Two Minute Drill

Temperature controls randomness: low = predictable, high = creative.
Top‑p filters out low‑probability words dynamically.
Top‑k limits the number of candidate words.
Adjust these based on your task: facts vs. creativity.

Need more clarification?

Drop us an email at career@quipoinfotech.com