Tokenization, Stemming and Lemmatization
What is Tokenization?
Tokenization is the process of breaking a large body of text into smaller units called tokens: a paragraph into sentences, a sentence into words, or a word into individual characters. In Python, tokenization can be done conveniently with the NLTK library.
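
As a minimal sketch of sentence- and word-level tokenization with NLTK's sent_tokenize and word_tokenize (the sample text is illustrative, and the punkt model download is a one-time setup step):

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

# One-time download of the Punkt tokenizer models used by NLTK
nltk.download("punkt")

text = "Tokenization splits text into pieces. NLTK makes this easy!"

# Paragraph -> sentences
sentences = sent_tokenize(text)
print(sentences)
# ['Tokenization splits text into pieces.', 'NLTK makes this easy!']

# Sentence -> words (punctuation marks become their own tokens)
words = word_tokenize(text)
print(words)
# ['Tokenization', 'splits', 'text', 'into', 'pieces', '.', 'NLTK', 'makes', 'this', 'easy', '!']

# Word -> characters needs no NLTK at all; plain Python suffices
chars = list("token")
print(chars)
# ['t', 'o', 'k', 'e', 'n']

Note that word_tokenize treats punctuation as separate tokens, which is usually what you want for downstream steps such as stemming and lemmatization.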