Natural Language Processing Terminology
Named Entity Recognition (NER)
The process of identifying and classifying named entities in text into predefined categories such as persons, organizations, and locations.
Sequence to Sequence (Seq2Seq) Model
A type of model in NLP that transforms a given sequence of elements, such as words in one language, to another sequence, such as a translation in another language.
Transformer
An NLP model architecture that relies on self-attention mechanisms instead of sequence-aligned recurrent layers like RNNs for handling sequences of data.
Corpus
A large and structured set of texts used in NLP. It serves as data for modeling language and extracting statistical information.
Attention Mechanism
A technique in sequence learning that allows the model to focus on different parts of the input while generating each word of the output sequence.
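A minimal pure-Python sketch of dot-product attention over toy vectors (the function names and the 2-d example vectors are illustrative, not from any library):

```python
import math

def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Dot-product attention: weight each value by how well its key matches the query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights

# Toy example: the query aligns with the second key, so the second value dominates.
context, weights = attend([1.0, 0.0],
                          [[0.0, 1.0], [1.0, 0.0]],
                          [[10.0, 0.0], [0.0, 10.0]])
```

The weights always sum to 1, so the output is a convex blend of the values, focused where the query–key match is strongest.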
Latent Semantic Analysis (LSA)
A technique for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.
POS Tagging
Short for Part-of-Speech Tagging, it involves assigning word types (like noun, verb, adjective) to each token. Useful for syntactic parsing and word sense disambiguation.
Word Embeddings
Numerical vector representations of words that capture their meanings, syntactic properties, and relation with other words.
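One way to see what embeddings buy you is cosine similarity between vectors. A sketch with hand-made 3-d vectors (real embeddings have hundreds of dimensions; these toy values only illustrate the idea):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors: near 1 = similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical embeddings chosen so related words point in similar directions.
embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
}
sim_royal = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_fruit = cosine_similarity(embeddings["king"], embeddings["apple"])
```

With trained embeddings (e.g. word2vec or GloVe), the same comparison surfaces semantic neighbors.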
Bag of Words (BoW)
A simple feature extraction technique that describes the occurrence of words within a document, disregarding grammar and word order.
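A bag of words is essentially a word-count dictionary, which makes it a one-liner with the standard library:

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences, ignoring grammar and word order."""
    return Counter(text.lower().split())

bow = bag_of_words("the cat sat on the mat")
# bow["the"] is 2; "the cat sat on the mat" and "the mat sat on the cat"
# produce identical bags, which is exactly the information BoW discards.
```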
BERT
Bidirectional Encoder Representations from Transformers, a pre-training technique for NLP that considers both left and right context in all layers.
Lemmatization
The process of reducing words to their lemma or dictionary form. It is more sophisticated than stemming and uses lexical knowledge bases like WordNet.
n-gram
A contiguous sequence of n items (typically words or letters) from a given text or speech. Useful in text modeling and prediction.
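Word-level n-grams are contiguous slices of the token list, e.g.:

```python
def ngrams(tokens, n):
    """Return every contiguous n-length slice of the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

bigrams = ngrams("to be or not to be".split(), 2)
# [('to', 'be'), ('be', 'or'), ('or', 'not'), ('not', 'to'), ('to', 'be')]
```

The repeated ('to', 'be') bigram is the kind of frequency signal n-gram models exploit for prediction.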
Language Model
A statistical model that predicts the likelihood of a sequence of words. Used in various applications such as speech recognition, typing suggestions, and machine translation.
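The simplest statistical language model is a bigram model: estimate the probability of the next word from counts. A maximum-likelihood sketch, with illustrative function names:

```python
from collections import Counter

def bigram_probs(tokens):
    """Estimate P(next | current) as count(current, next) / count(current)."""
    unigrams = Counter(tokens[:-1])          # every token that has a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {(w1, w2): count / unigrams[w1] for (w1, w2), count in bigrams.items()}

probs = bigram_probs("i like tea i like coffee".split())
# "i" is always followed by "like"; "like" splits evenly between "tea" and "coffee".
```

Real language models smooth these estimates (or use neural networks) so unseen sequences do not get zero probability.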
Coreference Resolution
Determining whether and how words or phrases (such as pronouns) refer to the same entity within a text. It is crucial for understanding context and meaning.
Stop Words
Commonly used words in a language (like 'the', 'is', 'in') that are often removed in NLP tasks to reduce noise and focus on meaningful words.
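Stop-word removal is a simple filter. A sketch with a tiny hand-picked stop list (real lists, such as NLTK's, contain over a hundred entries):

```python
# Illustrative stop list; production systems use a curated, language-specific one.
STOP_WORDS = {"the", "is", "in", "a", "an", "of", "and"}

def remove_stop_words(tokens):
    """Drop high-frequency function words, keeping content-bearing ones."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

filtered = remove_stop_words("the cat is in the garden".split())
# Only "cat" and "garden" survive.
```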
Stemming
The process of reducing words to their base or root form. Generally simpler than lemmatization and often uses heuristic rules.
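A crude suffix-stripping stemmer shows both the idea and its limits (this is a toy heuristic, not the Porter algorithm, which applies a much larger ordered rule set):

```python
def crude_stem(word):
    """Strip a few common English suffixes with a simple heuristic rule."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least a 3-letter stem so "is" and "as" are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [crude_stem(w) for w in ["running", "jumped", "cats"]]
# "running" becomes "runn", not "run": heuristics overshoot, which is
# why lemmatization with a dictionary is more accurate.
```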
TF-IDF
Short for Term Frequency-Inverse Document Frequency. A statistical measure used to evaluate the importance of a word to a document in a corpus.
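The measure is the product of two factors, which a few lines of Python make concrete (function name and toy corpus are illustrative):

```python
import math

def tf_idf(term, doc, docs):
    """TF-IDF = (frequency of term in doc) * log(N / number of docs containing term)."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / df)
    return tf * idf

docs = [
    "the cat sat".split(),
    "the dog barked".split(),
    "the cat purred".split(),
]
score_cat = tf_idf("cat", docs[0], docs)
score_the = tf_idf("the", docs[0], docs)
# "the" appears in every document, so its IDF (and score) is zero;
# "cat" is rarer across the corpus, so it scores higher.
```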
Dependency Parsing
The process of analyzing the grammatical structure of a sentence by establishing head-dependent relationships between its words.
Chunking
Also known as shallow parsing, it is the process of grouping the words of a text into syntactically related segments, such as noun or verb phrases.
Tokenization
The process of splitting text into individual terms or phrases. It allows NLP systems to work with words and concepts rather than strings of characters.
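A minimal regex tokenizer that separates punctuation from words (real tokenizers handle contractions, URLs, and Unicode far more carefully):

```python
import re

def tokenize(text):
    """Split text into word tokens and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Hello, world!")
# ['Hello', ',', 'world', '!'] -- punctuation becomes its own token.
```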
© Hypatia.Tech. 2024 All rights reserved.