Big Data Concepts
Data Warehouse
A system used for reporting and data analysis, considered a core component of business intelligence. Example: Consolidated customer data for enterprise reporting.
Structured Data
Data that adheres to a pre-defined data model and is therefore easy to analyze. It is typically stored in a traditional (relational) database system. Example: SQL databases containing customer information.
ETL (Extract, Transform, Load)
A process that involves extracting data from outside sources, transforming it to fit operational needs, then loading it into the end target database or data warehouse. Example: Migrating CRM data to a data warehouse.
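The three ETL stages can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the CSV content, the column names, and the `customers` table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical CRM export; column names and values are assumptions for illustration.
RAW_CRM_CSV = """name,email,signup_date
 Ada Lovelace ,ADA@EXAMPLE.COM,2024-01-15
Alan Turing,alan@example.com,2024-02-03
"""

def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace and lower-case email addresses."""
    return [
        (r["name"].strip(), r["email"].strip().lower(), r["signup_date"].strip())
        for r in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers (name TEXT, email TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()

# An in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CRM_CSV)), conn)
rows = conn.execute("SELECT name, email FROM customers ORDER BY name").fetchall()
print(rows)
```

Keeping each stage as its own function makes it easy to swap the source or the target without touching the cleaning rules.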
NoSQL Databases
A class of non-relational databases designed for distributed, large-scale data storage and massive scalability. Example: Cassandra or MongoDB storing user profiles for millions of users.
Data Visualization
The graphical representation of information and data using visual elements like charts, graphs, and maps. Example: A dashboard showing sales data trends over time.
Big Data
Extremely large datasets that can be analyzed computationally to reveal patterns, trends, and associations, especially those relating to human behavior and interactions. Example: Social media platforms handling petabytes of user data.
Data Lake
A centralized repository that allows you to store all your structured and unstructured data at any scale. Example: Storing sensor data, social media data, images, and documents in raw form.
MapReduce
A programming model for processing large data sets with a distributed algorithm on a cluster. Example: Word count across millions of documents.
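The word-count example can be sketched on a single machine to show the map, shuffle, and reduce steps. This is only a toy illustration of the programming model: a real framework would run the map and reduce functions in parallel across a cluster. The function names are chosen here for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key (word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)

def word_count(documents):
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data is big", "data lakes store data"])
print(counts)  # {'big': 2, 'data': 3, 'is': 1, 'lakes': 1, 'store': 1}
```

Because each document is mapped independently and each word is reduced independently, both phases can be split across many workers.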
Cloud Computing
The delivery of computing services—including servers, storage, databases, networking, software, analytics, and intelligence—over the Internet ('the cloud') to offer faster innovation, flexible resources, and economies of scale. Example: Amazon Web Services (AWS).
Data Mining
The practice of examining large pre-existing databases in order to generate new information. Example: Discovering shopping patterns from retail sales data.
Semi-structured Data
A form of data that does not reside in a relational database but that does have some organizational properties that make it easier to analyze. Example: JSON, XML files.
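A short sketch of what "some organizational properties" means in practice: the JSON records below (made up for illustration) share keys such as `id` and `name`, but each record may carry fields the others lack, so the reader has to handle missing keys.

```python
import json

# Hypothetical records: both have "id" and "name", but the other fields differ.
RAW = (
    '[{"id": 1, "name": "Ada", "tags": ["admin"]},'
    ' {"id": 2, "name": "Alan", "address": {"city": "London"}}]'
)

records = json.loads(RAW)

# The set of fields is discovered from the data rather than fixed by a schema.
all_fields = sorted({key for record in records for key in record})

# Optional nested fields need a default when they are absent.
cities = [record.get("address", {}).get("city") for record in records]

print(all_fields)  # ['address', 'id', 'name', 'tags']
print(cities)      # [None, 'London']
```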
Unstructured Data
Information that does not have a pre-defined data model or is not organized in a pre-defined manner. Example: Images, videos, and email content.
Stream Processing
The processing of data in real time, as it arrives, rather than in stored batches. Example: Real-time fraud detection in credit card transactions.
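The idea of handling each event as it arrives can be sketched with a Python generator and a per-card sliding window. The rule used here (more than three transactions on one card within 60 seconds) is a made-up threshold for illustration, not a real fraud model, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

def flag_rapid_transactions(events, max_per_window=3, window_seconds=60):
    """Yield (timestamp, card_id) whenever a card exceeds the allowed
    number of transactions inside the sliding time window."""
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    for timestamp, card_id in events:
        window = recent[card_id]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_per_window:
            yield timestamp, card_id

# Hypothetical event stream: (seconds since start, card identifier).
events = [(0, "A"), (10, "A"), (15, "B"), (20, "A"), (30, "A"), (200, "A")]
alerts = list(flag_rapid_transactions(events))
print(alerts)  # [(30, 'A')]
```

Only the timestamps inside the current window are kept in memory, which is what lets this style of processing run over an unbounded stream.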
Data Analytics
The science of analyzing raw data in order to draw conclusions from it. Example: Market trend analysis from consumer data.
Natural Language Processing (NLP)
A branch of artificial intelligence that helps computers understand, interpret, and manipulate human language. Example: Chatbots understanding and responding to user requests.
Machine Learning
A branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention. Example: Predictive models in algorithmic trading.
Internet of Things (IoT)
A network of physical objects ('things') that are embedded with sensors, software, and other technologies for the purpose of connecting and exchanging data with other devices and systems over the internet. Example: Smart home devices.
Data Governance
The management of the availability, usability, integrity, and security of data used in an organization. Example: Policies for user data compliance and privacy.
Hadoop
An open-source software framework used for distributed storage and processing of large sets of data using the MapReduce programming model. Example: Log processing for user behavior analysis.
Predictive Analytics
The use of data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data. Example: Credit scoring for loan approvals.
© Hypatia.Tech. 2024 All rights reserved.