Big Data Concepts

20 Flashcards

Data Warehouse

A system used for reporting and data analysis, considered a core component of business intelligence. Example: Consolidated customer data for enterprise reporting.

Structured Data

Data that adheres to a predefined data model, is typically stored in a traditional database system, and is easy to analyze. Example: SQL databases containing customer information.

ETL (Extract, Transform, Load)

A process that involves extracting data from outside sources, transforming it to fit operational needs, then loading it into the end target database or data warehouse. Example: Migrating CRM data to a data warehouse.
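
The three stages can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `CRM_ROWS` sample data, the `customers` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical CRM export: raw rows as they might arrive from a source system.
CRM_ROWS = [
    {"id": "1", "name": "  Ada Lovelace ", "email": "ADA@EXAMPLE.COM"},
    {"id": "2", "name": "Alan Turing", "email": "alan@example.com "},
]


def extract():
    """Extract: read rows from the (illustrative, in-memory) source."""
    return list(CRM_ROWS)


def transform(rows):
    """Transform: cast ids to int, trim whitespace, lowercase emails."""
    return [
        (int(r["id"]), r["name"].strip(), r["email"].strip().lower())
        for r in rows
    ]


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run all three stages and return what ended up in the target."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT id, name, email FROM customers ORDER BY id"
    ).fetchall()
```

Calling `run_etl()` returns the cleaned rows as they are stored in the target table.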

NoSQL Databases

A type of database designed for distributed, large-scale data storage and massive scalability. Example: Cassandra or MongoDB used for storing user profiles for millions of users.

Data Visualization

The graphical representation of information and data using visual elements like charts, graphs, and maps. Example: A dashboard showing sales data trends over time.

Big Data

Refers to extremely large datasets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. Example: Social media platforms dealing with petabytes of user data.

Data Lake

A centralized repository that allows you to store all your structured and unstructured data at any scale. Example: Storing sensor data, social media data, images, and documents in raw form.

MapReduce

A programming model for processing large data sets with a distributed algorithm on a cluster. Example: Word count across millions of documents.
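
The word-count example can be sketched on a single machine to show the shape of the model. This is a simplified illustration with hypothetical function names: a real cluster would run the map and reduce phases in parallel on many nodes, with the framework handling the shuffle between them.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```

Because each document is mapped independently and each word is reduced independently, both phases could be split across machines without changing the result.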

Cloud Computing

The delivery of computing services—including servers, storage, databases, networking, software, analytics, and intelligence—over the Internet ('the cloud') to offer faster innovation, flexible resources, and economies of scale. Example: Amazon Web Services (AWS).

Data Mining

The practice of examining large pre-existing databases in order to generate new information. Example: Discovering shopping patterns from retail sales data.

Semi-structured Data

A form of data that does not reside in a relational database but that does have some organizational properties that make it easier to analyze. Example: JSON, XML files.
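
A short Python sketch shows what "some organizational properties" can mean in practice: JSON records share named fields, but not every record has every field. The `RAW` sample and the `summarize` helper are hypothetical and exist only for illustration.

```python
import json

# Hypothetical records: both have "name", but other fields vary per record.
RAW = (
    '[{"name": "Ada", "tags": ["admin"]},'
    ' {"name": "Alan", "address": {"city": "London"}}]'
)


def summarize(raw):
    """Return (name, city or None, number of tags) for each record."""
    records = json.loads(raw)
    return [
        (
            rec["name"],
            rec.get("address", {}).get("city"),  # field may be absent
            len(rec.get("tags", [])),            # field may be absent
        )
        for rec in records
    ]
```

The field names make the data easy to query, while the `.get` defaults handle the missing fields that a fixed relational schema would not allow.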

Unstructured Data

Information that does not have a pre-defined data model or is not organized in a pre-defined manner. Example: Images, videos, and email content.

Stream Processing

The processing of data in real time as it flows in. Example: Real-time fraud detection in credit card transactions.
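
The idea can be sketched with a Python generator that handles each transaction as it arrives instead of waiting for a complete batch. The rule used here (flag an amount more than `factor` times the card's running average) is an assumption chosen for illustration, not a real fraud model, and `flag_suspicious` is a hypothetical name.

```python
from collections import defaultdict


def flag_suspicious(transactions, factor=5.0):
    """Yield (card, amount) pairs that exceed `factor` times the
    running mean of that card's earlier transactions."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for card, amount in transactions:
        # Compare against history seen so far; the first transaction
        # on a card has no history and is never flagged.
        if counts[card] and amount > factor * (totals[card] / counts[card]):
            yield (card, amount)
        # Update the running state after the check.
        totals[card] += amount
        counts[card] += 1
```

The generator keeps only a running total and count per card, so it can consume an unbounded stream without storing past transactions.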

Data Analytics

The science of analyzing raw data in order to draw conclusions about that information. Example: Market trend analysis from consumer data.

Natural Language Processing (NLP)

A branch of artificial intelligence that helps computers understand, interpret, and manipulate human language. Example: Chatbots understanding and responding to user requests.

Machine Learning

A branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention. Example: Predictive models in algorithmic trading.

Internet of Things (IoT)

A network of physical objects ('things') that are embedded with sensors, software, and other technologies for the purpose of connecting and exchanging data with other devices and systems over the internet. Example: Smart home devices.

Data Governance

The management of the availability, usability, integrity, and security of data used in an organization. Example: Policies for user data compliance and privacy.

Hadoop

An open-source software framework used for distributed storage and processing of large sets of data using the MapReduce programming model. Example: Log processing for user behavior analysis.

Predictive Analytics

The use of data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data. Example: Credit scoring for loan approvals.

© Hypatia.Tech. 2024 All rights reserved.