Word Embedding Techniques
GloVe
Global Vectors for Word Representation. Incorporates global corpus statistics by factorizing the word co-occurrence matrix, fitting word vectors so that their dot products approximate the logarithm of co-occurrence counts.
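A toy illustration of the idea, not the original implementation: the sketch below fits word vectors to a made-up co-occurrence matrix by minimizing GloVe's weighted least-squares objective in plain NumPy. All sizes and counts are invented.

```python
# Toy GloVe: factorize log co-occurrence counts with a weighted
# least-squares loss. The co-occurrence matrix here is random.
import numpy as np

rng = np.random.default_rng(0)
V, d = 5, 8                                     # vocab size, embedding dim
X = rng.integers(1, 50, (V, V)).astype(float)   # fake co-occurrence counts

W = rng.normal(scale=0.1, size=(V, d))    # word vectors
Wt = rng.normal(scale=0.1, size=(V, d))   # context vectors
b = np.zeros(V); bt = np.zeros(V)         # biases

def f(x, x_max=100, alpha=0.75):
    """GloVe weighting: down-weights rare pairs, caps frequent ones."""
    return np.minimum((x / x_max) ** alpha, 1.0)

lr = 0.05
for step in range(500):
    # residual: w_i . w~_j + b_i + b~_j - log X_ij
    err = W @ Wt.T + b[:, None] + bt[None, :] - np.log(X)
    weighted = f(X) * err
    # gradient descent on the weighted squared residual
    W  -= lr * (weighted @ Wt)
    Wt -= lr * (weighted.T @ W)
    b  -= lr * weighted.sum(axis=1)
    bt -= lr * weighted.sum(axis=0)

embeddings = W + Wt        # the paper sums the two sets of vectors
print(embeddings.shape)    # (5, 8)
```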
BERT
Bidirectional Encoder Representations from Transformers. Generates deep bidirectional representations by jointly conditioning on both left and right context in all layers; it is designed primarily to be fine-tuned on downstream tasks.
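A minimal sketch of extracting contextual token embeddings from a pretrained BERT, assuming the Hugging Face transformers library and PyTorch are installed; bert-base-uncased is one publicly available checkpoint.

```python
# The same surface word gets different vectors in different contexts.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["The bank raised interest rates.",
             "We sat on the river bank."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# out.last_hidden_state: (batch, seq_len, 768) contextual embeddings
print(out.last_hidden_state.shape)
```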
ELMo
Embeddings from Language Models. Utilizes a deep, bidirectional LSTM network to create contextualized word embeddings by taking into account the entire context in which a word appears.
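A conceptual sketch only, not the released ELMo model or its API: it mimics the idea of mixing a static embedding layer with bidirectional LSTM outputs via learned scalar weights, and glosses over the language-model pretraining. All names and sizes here are invented.

```python
# Tiny ELMo-style encoder in PyTorch: contextual vectors come from a
# learned weighted sum of the embedding layer and biLSTM outputs.
import torch
import torch.nn as nn

class TinyELMo(nn.Module):
    def __init__(self, vocab_size, dim=64, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # bidirectional=True concatenates forward and backward LSTMs
        self.lstm = nn.LSTM(dim, dim // 2, num_layers=layers,
                            bidirectional=True, batch_first=True)
        # scalar mixing weights over {embedding layer, LSTM output}
        self.mix = nn.Parameter(torch.zeros(2))

    def forward(self, token_ids):
        e = self.embed(token_ids)        # (batch, seq, dim) static
        h, _ = self.lstm(e)              # (batch, seq, dim) contextual
        w = torch.softmax(self.mix, dim=0)
        return w[0] * e + w[1] * h       # context-dependent vectors

model = TinyELMo(vocab_size=100)
ids = torch.randint(0, 100, (1, 6))      # one fake 6-token sentence
print(model(ids).shape)                  # torch.Size([1, 6, 64])
```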
Transformer
A model architecture that relies solely on attention mechanisms, without recurrence. Designed for parallelization and speed, it has become the basis for many state-of-the-art NLP models.
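The core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, in which every position attends to every other in parallel. A minimal NumPy sketch with toy sizes:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (seq, seq) position similarities
    return softmax(scores) @ V        # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8                   # toy sizes
Q = rng.normal(size=(seq_len, d_k))   # queries
K = rng.normal(size=(seq_len, d_k))   # keys
V = rng.normal(size=(seq_len, d_k))   # values
print(attention(Q, K, V).shape)       # (4, 8)
```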
TF-IDF
Term Frequency-Inverse Document Frequency. Weighs the frequency of a word in a document against its frequency across many documents to identify importance. Often used in document retrieval and indexing.
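A short example using scikit-learn's TfidfVectorizer (library assumed installed; the three documents are invented):

```python
# Words frequent in one document but rare across the corpus
# receive the highest TF-IDF weights.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "embeddings capture word semantics",
]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)   # sparse (3 docs x vocab) matrix
print(vectorizer.get_feature_names_out())
print(tfidf.toarray().round(2))
```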
GPT
Generative Pre-trained Transformer. Pretrains a transformer network on a large corpus with an unsupervised language-modeling objective, then fine-tunes it for various tasks with supervised learning.
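A minimal usage sketch with the publicly released GPT-2 checkpoint via the Hugging Face pipeline API, assuming a recent transformers installation:

```python
# Autoregressive generation: the model continues the prompt
# one token at a time.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Word embeddings are", max_new_tokens=20)
print(out[0]["generated_text"])
```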
One-Hot Encoding
Represents words as sparse binary vectors, with a 1 in the position assigned to the word and 0s elsewhere. Captures no context or semantic information, so every pair of distinct words is equally dissimilar.
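One-hot encoding is simple enough to write by hand; the sketch below uses an invented four-word vocabulary and shows that distinct words share no similarity signal.

```python
import numpy as np

vocab = ["cat", "dog", "mat", "sat"]
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    vec = np.zeros(len(vocab), dtype=int)
    vec[index[word]] = 1
    return vec

print(one_hot("dog"))                   # [0 1 0 0]
# Every pair of distinct words has zero dot product:
print(one_hot("cat") @ one_hot("dog"))  # 0: no shared semantics
```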
FastText
An extension of Word2Vec that treats each word as composed of character n-grams, which helps in capturing morphological information, especially for languages with rich word forms.
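A toy training run with gensim's FastText implementation (library assumed installed; corpus and hyperparameters are invented), showing that an out-of-vocabulary word still gets a vector through its character n-grams.

```python
from gensim.models import FastText

sentences = [
    ["word", "embeddings", "capture", "meaning"],
    ["fasttext", "uses", "character", "ngrams"],
]
model = FastText(sentences, vector_size=32, window=3,
                 min_count=1, epochs=50)

# "embedding" never appears in the corpus, but its n-grams overlap
# with "embeddings", so FastText can still produce a vector for it.
print(model.wv["embedding"].shape)   # (32,)
```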
Word2Vec
A neural-network-based technique that learns distributed representations of words capturing semantic and syntactic patterns, via either the CBOW (continuous bag-of-words) or Skip-gram model.
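A toy training run with gensim's Word2Vec implementation (library assumed installed; the two-sentence corpus and hyperparameters are invented):

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
]
# sg=1 selects Skip-gram; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=32, window=2,
                 min_count=1, sg=1, epochs=100)

print(model.wv["cat"].shape)                 # (32,)
print(model.wv.most_similar("cat", topn=2))  # nearest neighbors
```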
CoVe
Context Vectors. Reuses the LSTM encoder of a sequence-to-sequence machine translation model to produce contextually informed word representations, encoding the meaning of words in context.
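A conceptual sketch, not the released CoVe code: an untrained stand-in for the MT encoder illustrates the shape of the transfer, whereas the real CoVe loads LSTM weights learned on English-German translation.

```python
# CoVe idea in PyTorch: treat the encoder of a trained seq2seq
# translation model as a contextual embedder for other tasks.
import torch
import torch.nn as nn

class MTEncoder(nn.Module):
    """Stands in for the LSTM encoder of a trained seq2seq MT model."""
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim // 2, num_layers=2,
                            bidirectional=True, batch_first=True)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return h                          # (batch, seq, dim)

encoder = MTEncoder(vocab_size=100)       # untrained stand-in
ids = torch.randint(0, 100, (1, 5))       # one fake 5-token sentence
cove_vectors = encoder(ids)               # context vectors per token
print(cove_vectors.shape)                 # torch.Size([1, 5, 64])
```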