Token (LLM): definition | ReplyLabs

A token is the small chunk of text that a language model reads and writes. Models do not see whole words; they break text into tokens, which are common character sequences drawn from a fixed vocabulary. One token is roughly 3 to 4 characters of English, so a short word is often one token and a longer word may be split into two or three. Tokens matter because hosted AI models are typically billed per token, not per request, so the number of tokens in your prompt and its output is the main driver of what an AI run costs.

How is text turned into tokens?

The model uses a tokenizer that splits text into subword pieces, often using a method called Byte-Pair Encoding, where frequent character sequences are merged into single tokens. A common word like "email" might be one token, while an unusual name or a long technical word gets split across several. Punctuation and spaces count too. This is why "count the words" is never an exact guide to cost: the model counts tokens, not words.
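The merging idea can be sketched with a toy Byte-Pair Encoding trainer. This is a minimal illustration, not any provider's actual tokenizer: the function names (`learn_merges`, `apply_merge`, `tokenize`) and the tiny training list are made up for the example, and real tokenizers work on bytes, handle spaces and punctuation, and learn from very large corpora.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace each adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(words, num_merges):
    """Learn up to `num_merges` merges by repeatedly fusing the most frequent pair."""
    corpus = [list(w) for w in words]  # start from single characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        corpus = [apply_merge(symbols, best) for symbols in corpus]
    return merges


def tokenize(word, merges):
    """Split a word into characters, then replay the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


# Hypothetical training data: "email" and "mail" are frequent, so they get merged.
merges = learn_merges(["email"] * 5 + ["mail"] * 3, num_merges=10)
print(tokenize("email", merges))  # a frequent word collapses into one token
print(tokenize("zyx", merges))    # an unseen string stays split into pieces
```

In this toy run the frequent word ends up as a single token while the unfamiliar string stays as separate characters, which mirrors why unusual names cost more tokens than common words.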

Why do tokens drive AI cost?

Providers charge per token, and there are two sides to the meter:

Input tokens. Everything you send: the instruction, any column values substituted into the prompt, and any context. A long, verbose prompt costs more on every single row it runs against.
Output tokens. Everything the model writes back. Output tokens usually cost more than input tokens because the model generates them one at a time, which takes more computation than reading the input.

So total cost per row is roughly input tokens times the provider's input rate, plus output tokens times the output rate. Multiply that by your row count and small per-row differences become large bills at volume.
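The arithmetic can be sketched as a small function. The function name, the per-million-token pricing unit, and the rates in the usage line are assumptions for illustration, not any provider's real prices.

```python
def estimate_run_cost(input_tokens, output_tokens, rows,
                      input_rate_per_million, output_rate_per_million):
    """Estimate the cost of a run, given average per-row token counts.

    Rates are in currency units per one million tokens (an assumed pricing unit).
    """
    per_row = (input_tokens * input_rate_per_million
               + output_tokens * output_rate_per_million) / 1_000_000
    return per_row * rows


# Hypothetical figures: 200 input and 50 output tokens per row, 10,000 rows,
# at 1.00 per million input tokens and 4.00 per million output tokens.
cost = estimate_run_cost(200, 50, 10_000, 1.00, 4.00)
print(f"${cost:.2f}")  # prints $4.00
```

With these made-up rates the 50 output tokens cost as much as the 200 input tokens, which is why trimming output length can matter as much as trimming the prompt.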

What does this mean for spreadsheet runs?

When you run an AI prompt over a column, every row pays for the tokens in that row's prompt plus the tokens in its output. Two habits keep cost down:

Keep the prompt tight. Trim filler instructions. A leaner prompt cuts input tokens on every row.
Constrain the output. Asking for "one sentence" instead of a paragraph cuts output tokens, which are the pricier side.
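To get a feel for what a tighter prompt saves, the rough rule of about 4 characters per token can be turned into a quick estimate. This is only a back-of-the-envelope sketch: `estimate_tokens` and `tokens_saved` are hypothetical helpers, the 4-characters figure is an approximation for English, and a real tokenizer will give different counts.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count (approximation, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def tokens_saved(verbose_prompt, tight_prompt, rows):
    """Estimated input tokens saved across a run by using the tighter prompt."""
    per_row = estimate_tokens(verbose_prompt) - estimate_tokens(tight_prompt)
    return per_row * rows


verbose = ("Please read the following company description very carefully and then, "
           "taking everything into account, write a summary of what the company does.")
tight = "Summarize what this company does in one sentence."

print(estimate_tokens(verbose), estimate_tokens(tight))
print(tokens_saved(verbose, tight, rows=5_000))
```

The saving per row looks small, but it is paid on every row, so it scales directly with the size of the list.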

ReplyLabs shows a cost preview before you run, so you see the token-driven figure for your exact row count up front. To model it for a specific list, use the AI cost calculator, and for the wider context see AI in Google Sheets.
