Named Entity Recognition (NER) Primer
Entity Disambiguation
The task of determining which real-world entity, out of several possible candidates, an ambiguous mention in text refers to, typically using the surrounding context or an external knowledge base.
BiLSTM (Bidirectional Long Short-Term Memory)
A type of recurrent neural network (RNN) that reads a sequence in both directions, so each token's representation captures context from both the preceding and the following tokens. It is widely used in NER for context-aware entity prediction.
NER in Multilingual Context
Refers to the ability of NER systems to operate across multiple languages, not only English. Models must handle differing linguistic phenomena and writing systems, which increases the complexity of entity recognition.
Chunking
Also known as shallow parsing, it is the process of extracting phrases (such as noun phrases) from unstructured text. The extracted phrases can then be used as input for NER to identify named entities.
Gazetteer
A list or a dictionary of named entities used as a reference to aid in entity recognition. It's a lookup table that can be used to improve NER by providing external knowledge.
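A gazetteer lookup can be sketched as a greedy longest-match search over token n-grams. This is only an illustration: the function name `gazetteer_match`, the `max_len` parameter, and the tiny `GAZETTEER` dictionary are assumptions made for the example, not part of any particular NER library.

```python
def gazetteer_match(tokens, gazetteer, max_len=3):
    """Greedy longest-match lookup of token n-grams in a gazetteer.

    Returns (start, end, label) tuples, with `end` exclusive.
    """
    matches = []
    i = 0
    while i < len(tokens):
        found = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in gazetteer:
                matches.append((i, i + n, gazetteer[phrase]))
                i += n
                found = True
                break
        if not found:
            i += 1
    return matches


# Hypothetical gazetteer: lowercased phrase -> entity type.
GAZETTEER = {"new york": "LOC", "york": "LOC", "acme corp": "ORG"}

tokens = "She moved from New York to join Acme Corp".split()
print(gazetteer_match(tokens, GAZETTEER))
```

Trying longer phrases first means "New York" is matched as a single location instead of leaving "New" unmatched and tagging "York" alone. In practice such matches are often passed to a statistical model as features, since a lookup by itself cannot resolve ambiguous names.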
False Positive
Occurs when an NER system predicts an entity that is not actually present in the text, or assigns the wrong type or boundaries to a span, so the prediction matches no real entity.
IOB Tagging Scheme
A common tagging format for marking up text chunks where 'I' stands for Inside, 'O' for Outside, and 'B' for Beginning of a named entity. The tags are used for representing entities in a tokenized text.
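To make the scheme concrete, here is a minimal sketch that converts a sequence of IOB tags into entity spans. The function name `iob_to_spans` and the example sentence are assumptions for illustration. The sketch also assumes a lenient reading in which a stray `I-` tag with no preceding `B-` starts a new entity; stricter decoders may reject such sequences.

```python
def iob_to_spans(tags):
    """Convert IOB tags (e.g. 'B-PER', 'I-PER', 'O') to (start, end, label) spans.

    `end` is exclusive. A stray 'I-X' with no open entity of type X is
    treated as the start of a new entity (a lenient assumption).
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        is_begin = tag.startswith("B-")
        is_inside = tag.startswith("I-")
        # The open entity ends at "O", at any "B-", or at an "I-" of another type.
        if tag == "O" or is_begin or (is_inside and tag[2:] != label):
            if label is not None:
                spans.append((start, i, label))
            start, label = None, None
        # Open a new entity on "B-", or on an "I-" when nothing is open.
        if is_begin or (is_inside and label is None):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


tokens = ["Ada", "Lovelace", "visited", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for start, end, label in iob_to_spans(tags):
    print(label, " ".join(tokens[start:end]))
```

The `B-` prefix is what lets two adjacent entities of the same type stay separate: `["B-PER", "B-PER"]` decodes to two one-token spans, not one two-token span.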
Named Entity Recognition (NER)
A subtask of information extraction that seeks to locate named entities mentioned in text and classify them into predefined categories such as persons, organizations, locations, time expressions, quantities, monetary values, and percentages.
BERT (Bidirectional Encoder Representations from Transformers)
A transformer-based model designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. Fine-tuning it for token classification has yielded significant improvements on NER tasks.
CRF (Conditional Random Fields)
A type of discriminative probabilistic model often used for structured prediction in NER. It scores the label sequence as a whole, taking into account both the context in which entities appear and the dependencies between neighboring labels, which improves recognition accuracy.
Feature Engineering
The process of using domain knowledge to extract features from raw data that will be used to train machine learning models, especially crucial for NER to determine relevant attributes of named entities.
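As an illustration, hand-crafted token features for NER often describe a word's spelling and its immediate neighbors. The sketch below is one possible feature function; the name `token_features`, the specific features chosen, and the `<BOS>`/`<EOS>` boundary markers are assumptions for the example, not a fixed standard.

```python
def token_features(tokens, i):
    """Build a dictionary of simple hand-crafted features for tokens[i]."""
    word = tokens[i]
    return {
        "lower": word.lower(),                            # normalized word form
        "is_title": word.istitle(),                       # e.g. "Smith"
        "is_upper": word.isupper(),                       # e.g. "IBM"
        "has_digit": any(ch.isdigit() for ch in word),    # e.g. "2019"
        "suffix3": word[-3:],                             # last three characters
        # Neighboring words, with markers at the sentence boundaries.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


tokens = ["Dr.", "Smith", "joined", "IBM", "in", "2019"]
for i in range(len(tokens)):
    print(tokens[i], token_features(tokens, i))
```

Feature dictionaries of this kind are a common input format for classical sequence models such as CRFs, whereas neural models like BiLSTMs and BERT mostly learn comparable signals from the raw text.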
CoNLL
Short for 'Conference on Computational Natural Language Learning'. It is known for its shared tasks on NER, notably CoNLL-2002 and CoNLL-2003, which have advanced NER methods by providing common datasets and a common framework for evaluation.
Transfer Learning
A machine learning method where a model developed for one task is reused as the starting point for a different but related task. In NER, it allows leveraging pre-trained models to improve entity recognition performance.
False Negative
Occurs when an NER system fails to identify an entity that is actually present in the text and should have been recognized.
Precision and Recall
Precision is the fraction of the entities identified by the NER system that are correct, so it falls as false positives increase. Recall is the fraction of the actual entities in the text that the system identified, so it falls as false negatives increase. Both metrics are used to evaluate NER performance.
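The two metrics can be computed from sets of gold and predicted entity spans, as in the sketch below. It assumes exact-match scoring, where a prediction counts as correct only if its boundaries and type both match a gold entity. The function name and the example spans are made up for illustration, and F1 (the harmonic mean of precision and recall) is included as a commonly reported companion metric.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match entity scores from (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)   # true positives: predicted and in gold
    fp = len(predicted - gold)   # false positives: predicted but not in gold
    fn = len(gold - predicted)   # false negatives: in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical annotations: one exact match, one wrong type, one missed entity.
gold = [(0, 2, "PER"), (3, 4, "LOC"), (6, 8, "ORG")]
pred = [(0, 2, "PER"), (3, 4, "ORG")]

p, r, f1 = precision_recall_f1(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Under exact matching, the span predicted as ORG where the gold label is LOC counts as both a false positive (a wrong prediction) and a false negative (the LOC entity was missed), which is why a single type error lowers precision and recall together.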
© Hypatia.Tech. 2024 All rights reserved.