Tokenizing is the operation of replacing one set of symbols with another, typically to make the resulting set of symbols smaller.

The term is most commonly used in computing, where a program's source code, a set of symbols in an English-like format, is converted into another format that is much smaller. Most BASIC interpreters used this technique to save room: a keyword such as PRINT would be replaced by a single number, which takes far less room in memory than the spelled-out word. In fact, most lossless compression systems use a form of tokenizing, although it is typically not referred to as such.
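A minimal sketch of this keyword-replacement idea, in Python. The token table and byte values here are illustrative assumptions, not the table of any particular interpreter, and the word-by-word splitting is a simplification of how real tokenizers scanned a line.

```python
# Hypothetical keyword-to-byte table; real BASIC interpreters
# each defined their own fixed table of one-byte token values.
TOKENS = {"PRINT": 0x99, "GOTO": 0x89, "IF": 0x8B, "THEN": 0xA7}

def tokenize(line: str) -> bytes:
    """Replace each known keyword with its one-byte token;
    pass everything else through as plain ASCII."""
    out = bytearray()
    for word in line.split(" "):
        if word.upper() in TOKENS:
            out.append(TOKENS[word.upper()])   # 1 byte instead of N characters
        else:
            out.extend(word.encode("ascii"))   # non-keywords stored verbatim
        out.append(ord(" "))
    return bytes(out[:-1])  # drop the trailing space

# 'PRINT "HELLO"' is 13 bytes of text but tokenizes to 9 bytes:
# the five-character keyword has become a single byte.
print(tokenize('PRINT "HELLO"').hex(" "))
```

The saving grows with program size: every occurrence of every keyword shrinks to one byte, and the interpreter can also dispatch on that byte directly instead of re-parsing the keyword text each time the line runs.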