Natural Language Processing Terminology

20 Flashcards

Named Entity Recognition (NER)

The process of identifying named entities in text and classifying them into predefined categories such as persons, organizations, and locations.

Sequence to Sequence (Seq2Seq) Model

A type of model in NLP that transforms a given sequence of elements, such as words in one language, to another sequence, such as a translation in another language.

Transformer

A model architecture that relies on self-attention, rather than the sequence-aligned recurrence of RNNs, to process sequences of data.

Corpus

A large and structured set of texts used in NLP. It serves as data for modeling language and extracting statistical information.

Attention Mechanism

A technique in sequence learning that allows the model to focus on different parts of the input while generating each word of the output sequence.
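
As a rough illustration, the sketch below implements one common form of attention, scaled dot-product attention, in plain Python. The card does not name a specific variant, so this choice is an assumption, and the function names (`softmax`, `attend`) and the toy vectors are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights over the input positions and the
    weighted average of the value vectors (the context vector).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Toy input: three positions, each with a 2-dimensional key and value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

# The query points along the first axis, so positions 0 and 2 get more weight.
weights, context = attend([2.0, 0.0], keys, values)
```

In a real model the queries, keys, and values are learned projections of the input, and a new set of weights is computed for every output position.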

Latent Semantic Analysis (LSA)

A technique for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to both, typically via singular value decomposition of a term-document matrix.

POS Tagging

Short for Part-of-Speech Tagging. It assigns a grammatical category (such as noun, verb, or adjective) to each token. Useful for syntactic parsing and word sense disambiguation.

Word Embeddings

Numerical vector representations of words that capture their meanings, syntactic properties, and relationships to other words.
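
To make the idea of "relationships between vectors" concrete, here is a small sketch that compares words by cosine similarity. The three-dimensional vectors are hand-picked toy values, not output from any trained model, and `cosine_similarity` is a helper defined only for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy embeddings (real ones usually have hundreds of dimensions).
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

royal = cosine_similarity(embeddings["king"], embeddings["queen"])
fruit = cosine_similarity(embeddings["king"], embeddings["apple"])
# With these toy values, "king" is much closer to "queen" than to "apple".
```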

Bag of Words (BoW)

A simple feature extraction technique that describes the occurrence of words within a document, disregarding grammar and word order.
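
A minimal sketch of the idea, assuming whitespace tokenization and lowercasing (both are simplifications chosen for the example). The helper names `bag_of_words` and `to_vector` are illustrative, not taken from any library.

```python
from collections import Counter

def bag_of_words(document):
    """Count how often each word occurs, ignoring order and grammar."""
    return Counter(document.lower().split())

def to_vector(bag, vocabulary):
    """Lay the counts out in a fixed vocabulary order."""
    return [bag[word] for word in vocabulary]

docs = ["the cat sat on the mat", "the dog sat"]
bags = [bag_of_words(doc) for doc in docs]

# Shared vocabulary across all documents, sorted for a stable column order.
vocabulary = sorted(set().union(*bags))
vectors = [to_vector(bag, vocabulary) for bag in bags]
# vocabulary → ['cat', 'dog', 'mat', 'on', 'sat', 'the']
# vectors    → [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```

Because only counts are kept, "dog bites man" and "man bites dog" produce the same bag.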

BERT

Bidirectional Encoder Representations from Transformers, a pre-training technique for NLP that considers both left and right context in all layers.

Lemmatization

The process of reducing words to their lemma, or dictionary form. It is more sophisticated than stemming and often relies on lexical knowledge bases such as WordNet.

n-gram

A contiguous sequence of n items (typically words or letters) from a given text or speech. Useful in text modeling and prediction.
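
The definition can be sketched with a sliding window over any sequence. The function name `ngrams` is chosen for this example; it works on word lists and on character lists alike.

```python
def ngrams(items, n):
    """Return every contiguous run of n items as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

word_bigrams = ngrams("the quick brown fox".split(), 2)
# → [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]

char_trigrams = ngrams(list("text"), 3)
# → [('t', 'e', 'x'), ('e', 'x', 't')]
```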

Language Model

A statistical model that assigns a probability to a sequence of words. Used in applications such as speech recognition, typing suggestions, and machine translation.
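
One of the simplest language models is a bigram model estimated from counts. The sketch below assumes whitespace tokenization, sentence boundary markers `<s>` and `</s>`, and no smoothing, so any unseen word pair gets probability zero. The function names and the three-sentence corpus are illustrative only.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts

def bigram_probability(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev), without smoothing."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[word] / total if total else 0.0

def sentence_probability(counts, sentence):
    """Multiply the bigram probabilities along the sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        probability *= bigram_probability(counts, prev, word)
    return probability

corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram_model(corpus)

p_cat_given_the = bigram_probability(model, "the", "cat")   # 2 of 3 cases
p_seen = sentence_probability(model, "the cat sat")         # 2/3 * 1/2 = 1/3
p_unseen = sentence_probability(model, "the dog ran")       # 0.0, "dog ran" never seen
```

Practical models add smoothing or use neural networks so that unseen sequences do not collapse to zero probability.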

Co-reference Resolution

Determining if and how words or phrases (like pronouns) refer to the same entity within a text. It is crucial for understanding the context and meaning.

Stop Words

Commonly used words in a language (like 'the', 'is', 'in') that are often removed in NLP tasks to reduce noise and focus on meaningful words.
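
A minimal sketch of stop-word filtering. The `STOP_WORDS` set here is a tiny hand-written list for illustration; real projects typically use a much longer, language-specific list.

```python
# Illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {"the", "is", "in", "a", "an", "of", "and", "to"}

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop-word list."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]

filtered = remove_stop_words("The cat is in the garden".split())
# → ['cat', 'garden']
```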

Stemming

The process of reducing words to their base or root form. Generally simpler than lemmatization, it often uses heuristic rules and may produce stems that are not dictionary words.
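
To show what "heuristic rules" can look like, here is a deliberately naive suffix stripper. It is not the Porter algorithm or any other published stemmer; the suffix list and the `min_stem_length` parameter are assumptions made for this sketch.

```python
# Suffixes are tried in this order; the first match wins.
SUFFIXES = ("ing", "ed", "ly", "es", "s")

def naive_stem(word, min_stem_length=3):
    """Strip one common suffix if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[:-len(suffix)]
    return word

stems = [naive_stem(w) for w in ["jumping", "jumped", "jumps", "quickly"]]
# → ['jump', 'jump', 'jump', 'quick']

# Heuristics are imperfect: the result need not be a dictionary word.
rough = naive_stem("running")
# → 'runn'
```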

TF-IDF

Short for Term Frequency-Inverse Document Frequency. A statistical measure used to evaluate the importance of a word to a document in a corpus.
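
The measure can be sketched as follows. TF-IDF has several variants; this example assumes term frequency normalized by document length and an unsmoothed logarithmic inverse document frequency, which may differ from what a given library computes. The function name `tf_idf` and the sample documents are illustrative.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    document_frequency = Counter(
        term for tokens in tokenized for term in set(tokens)
    )
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / document_frequency[term])
            for term, count in counts.items()
        })
    return scores

docs = ["the cat sat", "the dog sat", "the cat ran fast"]
scores = tf_idf(docs)
# "the" occurs in every document, so its score is 0.0 everywhere.
# "dog" occurs in only one document, so it scores highest in that document.
```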

Dependency Parsing

The process of analyzing the grammatical structure of a sentence by establishing directed relationships between head words and the words that depend on them.

Chunking

Also known as shallow parsing, it is the process of dividing a text into syntactically related groups of words, such as noun phrases or verb phrases.
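
A crude sketch of noun-phrase chunking over tokens that already carry part-of-speech tags. The rule used here (a run of determiner, adjective, and noun tags that ends in a noun) is a simplification chosen for the example, the tags follow Penn Treebank style (`DT`, `JJ`, `NN`), and the input sentence is tagged by hand.

```python
NOUN_PHRASE_TAGS = {"DT", "JJ", "NN"}

def noun_phrase_chunks(tagged_tokens):
    """Group runs of determiner/adjective/noun tokens that end in a noun."""
    chunks, current = [], []
    # A sentinel with a non-matching tag flushes the final run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag in NOUN_PHRASE_TAGS:
            current.append((word, tag))
            continue
        if current and current[-1][1] == "NN":
            chunks.append(" ".join(w for w, _ in current))
        current = []
    return chunks

tagged = [
    ("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
chunks = noun_phrase_chunks(tagged)
# → ['the quick fox', 'the lazy dog']
```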

Tokenization

The process of splitting text into individual tokens, such as words, subwords, or punctuation marks. It allows NLP systems to work with discrete units rather than raw strings of characters.
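
A minimal regex-based sketch of word-level tokenization. The pattern and the decision to lowercase are assumptions for this example; production tokenizers handle many more cases (URLs, numbers, subword units).

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    # \w+(?:'\w+)? keeps contractions such as "don't" together;
    # [^\w\s] matches one punctuation character at a time.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text.lower())

tokens = tokenize("Don't split URLs, please!")
# → ["don't", 'split', 'urls', ',', 'please', '!']
```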


© Hypatia.Tech. 2024 All rights reserved.