NLP Roadmap
Natural Language Processing: Text tells the story (✧ᗜ✧)
TOC
- Text Preprocessing
- Normalization
- Tokenization
- Remove stop words
- Stemming
- Lemmatization
- Parser
- JSON parser
- HTML parser
- Text encoding
- Bag of Words (BoW)
- One-hot encoding
- TF-IDF
- Text Classification
- Intent detection
- Named Entity Recognition
- Text similarity
- Embeddings: word2vec, TF-IDF
- Metrics: cosine similarity, Jaccard similarity, Euclidean distance
- Lexical similarity: for clustering and keyword matching
- Semantic similarity: knowledge-based, string-based, and statistical approaches
- Algorithms: global matrix factorization, local context window
- Model: BERT
- Text Clustering
- K-means
- Sentiment analysis
- Fine-grained sentiment analysis (scale-based, e.g. a 5-star rating)
- Emotion detection (e.g. happiness, sadness, anger)
- Aspect-based sentiment analysis (e.g. customer reviews of a specific product aspect, such as a new headphone design)
- Intent-based analysis (focused on the intent behind the text)
- Languages
- Part-of-speech (POS) tagging
- Language detection
- Machine translation
- Spell correction
- PII (personally identifiable information)
- Conversational
- Question answering
- Text summarization
- Evaluation metrics
- F1 score
- Perplexity
- BERTScore
- BLEU (Bilingual Evaluation Understudy)
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
- METEOR (Metric for Evaluation of Translation with Explicit ORdering)
- Transformer
- Self-attention
- Transformer Architecture