The tokenize module generates tokens from a stream of text, returning a generator that yields one token at a time.
>>> import tokenize
>>> file = open('primes.py').readline  # note: no () — tokenize.generate_tokens needs the callable itself, which it calls repeatedly, one line per call
# tokenize.generate_tokens(readline) is a generator that requires one argument, readline, which must be a callable providing the same interface as the readline() method of built-in file objects.
# Each call to the callable must return one line of input as a string.
>>> tokens = tokenize.generate_tokens(file)
# The generator produces 5-tuples with the following members:
# 1. Token type
# 2. Token string
# 3. Tuple (srow, scol) specifying the row and column where the token begins in the file
# 4. Tuple (erow, ecol) specifying the row and column where the token ends in the file
# 5. The line on which the token was found
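The tuple layout above can be sketched as follows (using io.StringIO in place of an open file so the example is self-contained; the one-line source string is illustrative, not taken from primes.py):

```python
import io
import tokenize

# StringIO.readline plays the role of file.readline for generate_tokens
source = io.StringIO("primes = [2, 3, 5]\n")
tokens = list(tokenize.generate_tokens(source.readline))

# Each token is a 5-tuple: (type, string, (srow, scol), (erow, ecol), line)
for tok_type, tok_string, start, end, line in tokens:
    print(tokenize.tok_name[tok_type], repr(tok_string), start, end)
```

tokenize.tok_name maps the numeric token type to a readable name such as NAME, OP, or NUMBER.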
We get the tokens as:
1. Separate words (identifiers, keywords) are separate tokens
2. Alphanumeric runs separated by special characters become separate tokens
3. Special characters (operators, delimiters) are themselves separate tokens
4. Escape characters such as the newline produce tokens of their own (e.g. NEWLINE)
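A minimal sketch of points 2 and 3: the alphanumeric runs and the special character separating them each come out as their own token (the input string 'a+b' is an assumption for illustration):

```python
import io
import tokenize

# 'a+b' splits into three tokens (NAME, OP, NAME), followed by the
# NEWLINE token for '\n' and the final ENDMARKER token
toks = [t[1] for t in tokenize.generate_tokens(io.StringIO("a+b\n").readline)]
print(toks)  # → ['a', '+', 'b', '\n', '']
```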