Word Embedding Techniques
10 Flashcards
One-Hot Encoding
Represents each word as a sparse binary vector with a 1 at the word's index and 0s elsewhere. Captures no context or semantic information: every pair of distinct words is equally dissimilar.
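A minimal sketch with NumPy (the three-word vocabulary is invented for illustration):

    import numpy as np

    vocab = ["cat", "dog", "fish"]
    index = {word: i for i, word in enumerate(vocab)}

    def one_hot(word):
        # 1 at the word's index, 0 everywhere else
        vec = np.zeros(len(vocab), dtype=int)
        vec[index[word]] = 1
        return vec

    print(one_hot("dog"))  # [0 1 0]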
TF-IDF
Term Frequency-Inverse Document Frequency. Weighs how often a word appears in a document against how many documents contain it, so terms that are frequent in one document but rare across the corpus score highest. Often used in document retrieval and indexing.
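A quick sketch with scikit-learn's TfidfVectorizer (the two toy documents are made up):

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["the cat sat on the mat", "the dog chased the cat"]
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)  # sparse matrix, shape (n_docs, n_terms)
    # "the" appears in both documents, so it is down-weighted relative to rarer words
    print(dict(zip(vectorizer.get_feature_names_out(), X.toarray()[0].round(2))))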
Word2Vec
A neural-network-based technique that learns distributed word representations capturing semantic and syntactic patterns, trained with either the CBOW architecture (predict a word from its context) or the Skip-gram architecture (predict the context from a word).
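A minimal sketch with gensim (assumes gensim 4.x; the toy corpus is far too small for meaningful vectors):

    from gensim.models import Word2Vec

    sentences = [["the", "cat", "sat"], ["the", "dog", "barked"]]
    # sg=1 selects Skip-gram; sg=0 (the default) selects CBOW
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
    print(model.wv["cat"][:5])          # first 5 of 50 dimensions
    print(model.wv.most_similar("cat"))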
GloVe
Global Vectors for Word Representation. Incorporates global corpus statistics by factorizing the word co-occurrence matrix, fitting vectors so that their dot products approximate log co-occurrence counts.
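GloVe is usually consumed as pretrained vectors; a sketch that loads the published text files from https://nlp.stanford.edu/projects/glove/ (the filename assumes the glove.6B download):

    import numpy as np

    embeddings = {}
    with open("glove.6B.100d.txt", encoding="utf-8") as f:
        for line in f:
            word, *values = line.split()
            embeddings[word] = np.asarray(values, dtype=np.float32)

    print(embeddings["king"][:5])  # first 5 of 100 dimensions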
FastText
An extension of Word2Vec that represents each word as a bag of character n-grams, which captures morphological information and lets it build vectors for out-of-vocabulary words; especially useful for morphologically rich languages.
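A gensim sketch; the out-of-vocabulary query works because the vector is assembled from character n-grams (toy corpus again):

    from gensim.models import FastText

    sentences = [["the", "cat", "sat"], ["the", "dog", "barked"]]
    # min_n/max_n set the character n-gram range used for subword vectors
    model = FastText(sentences, vector_size=50, min_count=1, min_n=3, max_n=5)
    print(model.wv["catlike"][:5])  # never seen in training, built from n-grams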
ELMo
Embeddings from Language Models. Utilizes a deep, bidirectional LSTM network to create contextualized word embeddings by taking into account the entire context in which a word appears.
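A hedged sketch with the (now-archived) allennlp library; the file paths are placeholders for option/weight files downloaded separately:

    from allennlp.modules.elmo import Elmo, batch_to_ids

    # Placeholder paths: point these at downloaded ELMo options/weights
    elmo = Elmo("options.json", "weights.hdf5", num_output_representations=1)
    # Same surface form "bank", two different contexts -> two different vectors
    ids = batch_to_ids([["the", "river", "bank"], ["the", "bank", "vault"]])
    out = elmo(ids)["elmo_representations"][0]  # (2, 3, embedding_dim)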
BERT
Bidirectional Encoder Representations from Transformers. Generates deep bidirectional representations by jointly conditioning on both left and right context in all layers, primarily designed for fine-tuning on downstream tasks.
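A sketch with Hugging Face transformers, extracting final-layer token embeddings (bert-base-uncased is the standard published checkpoint):

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("The bank approved the loan", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)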
Transformer
A model architecture that relies solely on attention mechanisms, without recurrence. Designed for parallelization and speed, it has become the basis for many state-of-the-art NLP models.
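The core of the architecture is scaled dot-product attention; a NumPy sketch of that single operation (not a full Transformer):

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V -- every position attends to every position
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    x = np.random.randn(4, 8)  # 4 token positions, dimension 8
    print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)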
GPT
Generative Pretrained Transformer. Pretrains a transformer network on a large corpus with an unsupervised language-modeling objective, then fine-tunes it on various downstream tasks with supervised learning.
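A sketch using the published GPT-2 checkpoint via the transformers pipeline API:

    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    print(generator("Word embeddings are", max_new_tokens=15)[0]["generated_text"])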
CoVe
Context Vectors. Uses the encoder of a sequence-to-sequence machine-translation model to produce contextually informed word representations, encoding the meaning of each word in its sentence context.
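A toy sketch of the idea only (a bidirectional LSTM encoder over stand-in GloVe inputs), not the published MT-trained weights:

    import torch
    import torch.nn as nn

    encoder = nn.LSTM(input_size=300, hidden_size=300, num_layers=2,
                      bidirectional=True, batch_first=True)
    word_vectors = torch.randn(1, 5, 300)       # stand-in for GloVe inputs
    context_vectors, _ = encoder(word_vectors)  # (1, 5, 600), one per token
    print(context_vectors.shape)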