What is Tokenisation?
The process of breaking text into smaller pieces called tokens that an AI model can process.
Why It Matters
Tokenisation affects how much text a model can handle at once and how well it understands different words and languages.
Real-World Example
The word 'understanding' might be split into tokens like 'under', 'stand', and 'ing'.
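The splitting above can be sketched in a few lines of code. This is a toy greedy longest-match tokeniser with a hand-picked vocabulary, purely for illustration: real models learn their vocabularies from data (for example with byte-pair encoding), and the `VOCAB` list here is an assumption, not any model's actual token set.

```python
# Hypothetical mini-vocabulary; real tokenisers learn tens of
# thousands of entries from training data.
VOCAB = ["under", "stand", "ing", "token", "ation"]

def tokenise(word: str) -> list[str]:
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(word):
        match = None
        # Try longer vocabulary pieces before shorter ones.
        for piece in sorted(VOCAB, key=len, reverse=True):
            if word.startswith(piece, i):
                match = piece
                break
        if match is None:
            match = word[i]  # fall back to a single character
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenise("understanding"))  # -> ['under', 'stand', 'ing']
```

Note how a word the vocabulary has never seen whole still gets represented, just as more pieces: that is why tokenisation lets models handle rare and made-up words.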
“Understanding terms like Tokenisation matters because it helps you have better conversations with developers and make smarter decisions about your software. You do not need to be technical. You just need to know enough to ask the right questions.”
Related Terms
Large Language Model (LLM)
An AI system trained on massive amounts of text that can understand and generate human language.
Context Window
The maximum amount of text an AI model can consider at once when generating a response.
Embeddings
A way of representing words, sentences, or other data as lists of numbers that capture their meaning.
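The embeddings idea can be made concrete with a tiny sketch. The three-number vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions, learned during training); the point is only that similar meanings end up as similar lists of numbers, which we can check with cosine similarity.

```python
import math

# Hypothetical 3-dimensional "embeddings" -- invented values,
# not output from any real model.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.2],
    "dog": [0.8, 0.2, 0.3],
    "car": [0.1, 0.9, 0.7],
}

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Measure how closely two vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# "cat" is closer in meaning (and in numbers) to "dog" than to "car".
print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"]))
print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"]))
```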
Learn More at buildDay Melbourne
Want to understand these concepts hands-on? Join our one-day workshop and build a real web application from scratch.
Related Terms
Transformer
A type of AI architecture that processes text by paying attention to relationships between all words at once, rather than one word at a time.
Attention Mechanism
A technique that lets AI models focus on the most relevant parts of the input when generating output.
Fine-tuning
Adapting a pre-trained AI model to perform better on a specific task by training it on additional specialised data.