NLP Research Papers
15 flashcards
Deep Contextualized Word Representations
The paper introduces ELMo, a deep contextualized word representation that models both complex characteristics of word use (syntax and semantics) and how those uses vary across linguistic contexts (polysemy), significantly improving the state of the art across a range of challenging NLP tasks.
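A minimal NumPy sketch of the layer-mixing idea, with made-up shapes: the task-specific ELMo vector is a learned, softmax-normalised weighted sum of the internal layers of a pretrained bidirectional language model (the biLM itself is not implemented here):

# Sketch of ELMo-style layer mixing; shapes and values are illustrative, not the authors' code.
import numpy as np

num_layers, seq_len, dim = 3, 5, 8                         # e.g. char-CNN layer + 2 biLSTM layers
layer_states = np.random.randn(num_layers, seq_len, dim)   # stand-in for biLM activations

w = np.zeros(num_layers)                      # learnable scalar weight per layer
gamma = 1.0                                   # learnable global scale
s = np.exp(w) / np.exp(w).sum()               # softmax-normalised layer weights

# ELMo_k = gamma * sum_j s_j * h_{k,j}: one contextual vector per token position
elmo = gamma * np.einsum("j,jkd->kd", s, layer_states)
print(elmo.shape)                             # (seq_len, dim)
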
Neural Machine Translation by Jointly Learning to Align and Translate
This pivotal paper presents a neural machine translation model that jointly learns to align and translate, using a novel attention mechanism that lets the decoder softly search the source sentence for relevant words, improving translation quality, particularly on long sentences.
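A hedged NumPy sketch of the additive attention score the decoder uses to align with source positions; the weight matrices, dimensions, and variable names are illustrative, not the paper's exact configuration:

import numpy as np

src_len, enc_dim, dec_dim, attn_dim = 6, 8, 8, 10
H = np.random.randn(src_len, enc_dim)         # encoder annotations h_1..h_T
s_prev = np.random.randn(dec_dim)             # previous decoder hidden state s_{i-1}

W_a = np.random.randn(attn_dim, dec_dim)
U_a = np.random.randn(attn_dim, enc_dim)
v_a = np.random.randn(attn_dim)

# e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j); alpha = softmax(e); context = sum_j alpha_j h_j
e = np.tanh(W_a @ s_prev + H @ U_a.T) @ v_a   # one score per source position
alpha = np.exp(e - e.max()); alpha /= alpha.sum()
context = alpha @ H                            # expected annotation fed to the decoder
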
Know What You Don’t Know: Unanswerable Questions for SQuAD
The paper augments the Stanford Question Answering Dataset (SQuAD) with adversarially written unanswerable questions, requiring systems not only to answer questions when possible but also to determine when no answer is supported by the paragraph and abstain from answering.
End-to-End Sequence Labeling via Bi-directional LSTM-CNNs-CRF
This paper presents an end-to-end hybrid neural architecture for sequence labeling that combines a bi-directional LSTM, a character-level CNN, and a CRF output layer, capturing both word- and character-level information without task-specific feature engineering.
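A rough PyTorch sketch of the encoder: character-level CNN features are concatenated with word embeddings and fed to a bi-directional LSTM that produces per-token tag scores. All sizes are illustrative, and the CRF layer that scores whole tag sequences on top of these emissions is omitted for brevity:

import torch
import torch.nn as nn

class BiLSTMCNNEncoder(nn.Module):
    def __init__(self, n_words, n_chars, n_tags, w_dim=100, c_dim=30, c_filters=30, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, w_dim)
        self.char_emb = nn.Embedding(n_chars, c_dim)
        # character CNN: one feature vector per word via max-pooling over character windows
        self.char_cnn = nn.Conv1d(c_dim, c_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(w_dim + c_filters, hidden, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hidden, n_tags)    # per-token tag scores fed to the CRF

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        b, t, l = chars.shape
        c = self.char_emb(chars).view(b * t, l, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(b, t, -1)
        x = torch.cat([self.word_emb(words), c], dim=-1)
        h, _ = self.lstm(x)
        return self.emissions(h)                           # (batch, seq_len, n_tags)

enc = BiLSTMCNNEncoder(n_words=1000, n_chars=80, n_tags=9)
scores = enc(torch.randint(0, 1000, (2, 7)), torch.randint(0, 80, (2, 7, 12)))
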
A Neural Probabilistic Language Model
This early work lays the groundwork for neural language modeling, jointly learning distributed word representations and a probability function for word sequences with a feed-forward neural network.
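An illustrative NumPy sketch of the model's forward pass with toy sizes: the previous n-1 words are looked up in a shared embedding matrix, concatenated, passed through a tanh hidden layer, and a softmax over the vocabulary scores the next word (the paper's direct input-to-output connections are omitted):

import numpy as np

V, m, n, h = 50, 16, 4, 32                 # vocab size, embedding dim, n-gram order, hidden size
C = np.random.randn(V, m)                  # shared word-feature matrix (the embeddings)
H = np.random.randn(h, (n - 1) * m)
U = np.random.randn(V, h)
b, d = np.zeros(V), np.zeros(h)

context = [3, 17, 42]                      # indices of the previous n-1 words
x = C[context].reshape(-1)                 # concatenated context embeddings
y = b + U @ np.tanh(d + H @ x)             # unnormalised scores for every candidate next word
p_next = np.exp(y - y.max()); p_next /= p_next.sum()
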
Improving Language Understanding by Generative Pre-Training
This paper by OpenAI introduces the GPT model, which is pretrained using a language modeling objective on a large corpus and fine-tuned on a task-specific dataset, significantly improving performance across a range of tasks.
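A hedged PyTorch sketch of the two objectives, with random tensors standing in for model outputs; no actual Transformer is instantiated, and the auxiliary-loss weight lam is illustrative:

import torch
import torch.nn.functional as F

batch, seq_len, vocab, n_classes = 2, 10, 100, 3
tokens = torch.randint(0, vocab, (batch, seq_len))

# Stage 1: unsupervised pre-training -- maximise the likelihood of each token given its left context.
lm_logits = torch.randn(batch, seq_len - 1, vocab, requires_grad=True)   # stand-in for model output
lm_loss = F.cross_entropy(lm_logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))

# Stage 2: supervised fine-tuning -- a task head on the final hidden state, with the LM loss
# kept as an auxiliary term weighted by lam.
task_logits = torch.randn(batch, n_classes, requires_grad=True)
labels = torch.randint(0, n_classes, (batch,))
lam = 0.5
finetune_loss = F.cross_entropy(task_logits, labels) + lam * lm_loss
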
Universal Language Model Fine-tuning for Text Classification
The paper presents ULMFiT, a transfer learning technique that applies unsupervised pretraining of a language model followed by fine-tuning on a target task, yielding strong performance on text classification benchmarks.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
This paper presents the General Language Understanding Evaluation (GLUE) benchmark, a collection of diverse natural language understanding tasks with the goal of encouraging model generalization across multiple tasks.
A Structured Self-Attentive Sentence Embedding
This paper introduces a model that embeds a sentence as a matrix using multiple hops of self-attention over a biLSTM's hidden states, allowing it to capture different aspects of the sentence and yield interpretable representations.
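A NumPy sketch of the attention step with toy sizes, loosely following the paper's notation: r attention hops over the hidden states yield a matrix sentence embedding whose rows attend to different parts of the sentence (the penalisation term and downstream classifier are omitted):

import numpy as np

n, u, d_a, r = 7, 16, 10, 4                  # tokens, biLSTM state size, attention dim, hops
H = np.random.randn(n, u)                    # biLSTM hidden states, one row per token
W_s1 = np.random.randn(d_a, u)
W_s2 = np.random.randn(r, d_a)

scores = W_s2 @ np.tanh(W_s1 @ H.T)          # (r, n): one score per hop per token
A = np.exp(scores - scores.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)            # softmax over tokens, separately for each hop
M = A @ H                                    # (r, u): the matrix sentence embedding
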
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
This paper presents BERT, which pre-trains deep bidirectional Transformer representations with a masked language modeling objective and obtains state-of-the-art results on a variety of language understanding tasks.
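A plain-Python sketch of the masked-language-modelling input corruption; the token ids, mask id, and vocabulary size are made up, and the Transformer that recovers the masked tokens is not shown:

import random

MASK, vocab_size = 103, 30000                # [MASK] id and vocab size are illustrative
tokens = [101, 7592, 2088, 2003, 2307, 102]  # a toy input sequence
inputs, targets = list(tokens), [-100] * len(tokens)   # -100 = position ignored by the loss

for i in range(1, len(tokens) - 1):          # skip the [CLS]/[SEP] positions
    if random.random() < 0.15:               # choose ~15% of tokens to predict
        targets[i] = tokens[i]
        roll = random.random()
        if roll < 0.8:
            inputs[i] = MASK                             # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = random.randrange(vocab_size)     # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
# The bidirectional Transformer is then trained to recover `targets` from `inputs`.
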
Distributed Representations of Words and Phrases and their Compositionality
This seminal work by Mikolov et al. extends the Skip-gram (Word2Vec) model with negative sampling, subsampling of frequent words, and phrase vectors, efficiently learning word representations that capture syntactic and semantic relationships.
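A NumPy sketch of a single skip-gram-with-negative-sampling update on made-up word ids; real training loops over a large corpus and draws negatives from a frequency-based distribution:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

V, d, lr = 100, 32, 0.025
W_in, W_out = np.random.randn(V, d) * 0.01, np.zeros((V, d))

center, context, negatives = 5, 12, [7, 40, 88]      # word ids: center, true context, sampled noise
v = W_in[center]
grad_v = np.zeros(d)
for idx, label in [(context, 1.0)] + [(neg, 0.0) for neg in negatives]:
    u = W_out[idx].copy()
    g = sigmoid(u @ v) - label                       # gradient of the logistic loss w.r.t. the score
    W_out[idx] -= lr * g * v                         # update the output ("context") vector
    grad_v += g * u                                  # accumulate gradient for the input vector
W_in[center] -= lr * grad_v
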
Sequence to Sequence Learning with Neural Networks
This critical paper presents the sequence-to-sequence learning framework, in which one multilayer LSTM encodes the input sequence into a fixed-size vector and another decodes the output sequence, a breakthrough that enabled encoder-decoder approaches to machine translation and many other text generation tasks.
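A minimal PyTorch sketch of the encoder-decoder idea with greedy decoding and toy sizes; the paper's deep LSTMs, reversed source order, and beam search are omitted:

import torch
import torch.nn as nn

src_vocab, tgt_vocab, emb, hid, BOS = 50, 60, 32, 64, 1
enc_emb, dec_emb = nn.Embedding(src_vocab, emb), nn.Embedding(tgt_vocab, emb)
encoder = nn.LSTM(emb, hid, batch_first=True)
decoder = nn.LSTM(emb, hid, batch_first=True)
out = nn.Linear(hid, tgt_vocab)

src = torch.randint(0, src_vocab, (1, 8))             # a toy source sentence
_, state = encoder(enc_emb(src))                      # the whole source compressed into (h, c)

token, generated = torch.tensor([[BOS]]), []
for _ in range(10):                                   # generate up to 10 target tokens greedily
    dec_out, state = decoder(dec_emb(token), state)
    token = out(dec_out[:, -1]).argmax(dim=-1, keepdim=True)
    generated.append(token.item())
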
Dynamic Memory Networks for Visual and Textual Question Answering
This paper proposes the Dynamic Memory Network, a model capable of answering questions based on visual and textual inputs by using episodic memory and an attention mechanism to focus on relevant information.
Attention Is All You Need
The paper introduces the Transformer model, which eschews recurrence and convolutions in favor of self-attention mechanisms, and achieves state-of-the-art results in machine translation.
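A NumPy sketch of scaled dot-product self-attention, the core operation of the Transformer, for a single head without masking or learned projections; sizes are illustrative:

import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # similarity between queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over key positions
    return weights @ V                                   # each output mixes the value vectors

X = np.random.randn(6, 16)                               # 6 tokens, model dimension 16
out = attention(X, X, X)                                 # self-attention: Q, K, V from one sequence
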
GPT-3: Language Models are Few-Shot Learners
The paper introduces GPT-3, a 175-billion-parameter language model that performs strongly on many NLP tasks given only a few examples in its prompt, with no task-specific fine-tuning.
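An illustrative construction of a few-shot prompt: the "training data" is just a handful of input/output pairs placed in the model's context, with no gradient updates; the examples and labels here are made up:

examples = [
    ("I loved this film.", "positive"),
    ("The plot made no sense.", "negative"),
    ("A delightful surprise from start to finish.", "positive"),
]
query = "The acting was wooden and the pacing glacial."

prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt += f"\nReview: {query}\nSentiment:"
# The model is asked to continue this text; conditioning on the in-context examples is the
# only "task-specific training" it receives.
print(prompt)
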