NLP Roadmap

Natural Language Processing: Text tells the story (✧ᗜ✧)

TOC

  • Text Preprocessing
    • Normalization
    • Tokenization
    • Remove stop words
    • Stemming
    • Lemmatization
  • Parser
    • JSON parser
    • HTML parser
  • Text encoding
    • BOW
    • One-hot encoding
    • TF-IDF
  • Text Classification
    • Intent detection
    • Named Entity Recognition
  • Text similarity
    • Embeddings: word2vec, TF-IDF
    • Metrics: cosine similarity, Jaccard similarity, Euclidean distance
    • Lexical similarity: for clustering and keyword matching
    • Semantic similarity: knowledge-based, string-based, and statistical approaches
    • Algorithms: global matrix factorization, local context window
    • Model: BERT
  • Text Clustering
    • K-means
  • Sentiment analysis
    • Fine-grained Sentiment Analysis (scale-based, e.g. a 5-star rating)
    • Emotion detection (happy, sad, angry)
    • Aspect-based Sentiment Analysis (e.g. customer reviews of a new product, such as a new headphone design)
    • Intent-based Analysis (intent-focused)
  • Languages
    • POS (part-of-speech) tagging
    • Language detection
    • Machine translation
  • Spell correction
  • PII
  • Conversational
    • Question answering
    • Text summarization
  • Evaluation metrics
    • F1 score
    • Perplexity
    • BERTScore
    • BLEU (Bilingual Evaluation Understudy)
    • ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
    • METEOR (Metric for Evaluation of Translation with Explicit ORdering)
  • Transformer
    • Self-attention
    • Transformer Architecture