
2020/05/08

Text Data Handling in NLP

  1. Basic Feature Extraction Using Text Data
    • Number of words
    • Number of characters
    • Average word length
    • Number of stopwords
    • Number of special characters
    • Number of numerics
    • Number of uppercase words
  2. Basic Text Pre-processing
    • Lower casing
    • Punctuation removal
    • Stopwords removal
    • Frequent words removal
    • Rare words removal
    • Spelling correction
    • Tokenization
    • Stemming
    • Lemmatization
  3. Advanced Text Processing
    • N-grams
    • Term Frequency
    • Inverse Document Frequency
    • Term Frequency-Inverse Document Frequency (TF-IDF)
    • Bag of Words
    • Sentiment Analysis
    • Word Embedding
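
A few of the items in the outline above can be sketched with nothing but the Python standard library. This is only a minimal illustration, not a reference implementation: the function names (basic_features, preprocess, ngrams, tf_idf) and the tiny STOPWORDS set are made up for this example, "special characters" is assumed to mean ASCII punctuation, and IDF is assumed to be the plain log(N / df) variant. Spelling correction, stemming, lemmatization, sentiment analysis and word embeddings normally need a dedicated library and are not covered here.

```python
import math
import string
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "a", "of", "and", "in", "to"}


def basic_features(text):
    """Section 1: simple counts computed on the raw, unprocessed text."""
    words = text.split()
    return {
        "num_words": len(words),
        "num_chars": len(text),  # includes spaces
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "num_stopwords": sum(w.lower() in STOPWORDS for w in words),
        # Assumption: "special characters" means ASCII punctuation.
        "num_special": sum(ch in string.punctuation for ch in text),
        "num_numerics": sum(w.isdigit() for w in words),
        "num_upper": sum(w.isupper() for w in words),
    }


def preprocess(text):
    """Section 2: lower-case, strip punctuation, tokenize, drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]


def ngrams(tokens, n):
    """Section 3: all runs of n consecutive tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Section 3: docs is a list of token lists.

    tf(t, d) = count of t in d / len(d)
    idf(t)   = log(N / number of docs containing t)   (unsmoothed variant)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    sample = "The NLP course has 3 parts, and it is GREAT!"
    print(basic_features(sample))
    tokens = preprocess(sample)
    print(tokens)
    print(ngrams(tokens, 2))
    print(tf_idf([tokens, preprocess("The course is short.")]))
```

Note that the section 1 features are taken from the raw text on purpose: lower casing and punctuation removal would wipe out the uppercase-word and special-character counts. With the unsmoothed IDF used here, a term that appears in every document scores exactly zero; libraries often apply smoothing, so their numbers will differ from this sketch.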
