Common Machine Learning Datasets

15 flashcards


MNIST


A dataset of 70,000 28x28 grayscale images of handwritten digits (60,000 for training and 10,000 for testing), widely used to train and test image classification models.


CIFAR-10


A dataset consisting of 60,000 32x32 color images in 10 different classes, used for computer vision tasks such as object recognition.
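The figures on this card are enough for a quick back-of-the-envelope check of how much data the set contains. The sketch below is only an illustration: the image count, resolution, and class count come from the card, while the three colour channels, one byte per channel value, and equally sized classes are assumptions added here. All function and variable names are hypothetical.

```python
# Figures taken from the card: 60,000 images, 32x32 pixels, 10 classes.
NUM_IMAGES = 60_000
HEIGHT, WIDTH = 32, 32
NUM_CLASSES = 10

# Assumptions (not stated on the card): RGB images with 3 channels,
# 1 byte per channel value, and perfectly balanced classes.
CHANNELS = 3
BYTES_PER_VALUE = 1


def images_per_class(num_images: int, num_classes: int) -> int:
    """Number of images in each class when classes are equally sized."""
    return num_images // num_classes


def raw_size_bytes(num_images: int, height: int, width: int,
                   channels: int, bytes_per_value: int) -> int:
    """Uncompressed size of the pixel data alone (no labels, no metadata)."""
    return num_images * height * width * channels * bytes_per_value


per_class = images_per_class(NUM_IMAGES, NUM_CLASSES)
total_bytes = raw_size_bytes(NUM_IMAGES, HEIGHT, WIDTH, CHANNELS, BYTES_PER_VALUE)

print(per_class)                        # 6000
print(total_bytes)                      # 184320000
print(round(total_bytes / 2**20, 1))    # 175.8 (MiB)
```

Under these assumptions the raw pixels fit comfortably in memory on an ordinary laptop, which is one reason the set is popular for quick experiments.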


ImageNet


A large visual dataset designed for visual object recognition research, with more than 14 million images organized into over 20,000 categories.


IMDb


A dataset containing 50,000 movie reviews for natural language processing or sentiment analysis, divided evenly into positive and negative reviews.
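The even split matters when reading accuracy numbers: a model that always predicts the same label scores exactly 50%, so anything above that reflects real signal. A minimal sketch of that baseline follows, using only the counts stated on the card; the helper name `majority_baseline_accuracy` is hypothetical and not part of any library.

```python
# Counts taken from the card: 50,000 reviews, split evenly by sentiment.
NUM_REVIEWS = 50_000
positive = NUM_REVIEWS // 2
negative = NUM_REVIEWS - positive


def majority_baseline_accuracy(class_counts: dict[str, int]) -> float:
    """Accuracy of a classifier that always predicts the most common class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


baseline = majority_baseline_accuracy({"positive": positive, "negative": negative})

print(positive, negative)  # 25000 25000
print(baseline)            # 0.5
```

On an imbalanced dataset the same function returns a higher baseline, which is why plain accuracy is less informative there.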


UCI Machine Learning Repository


A collection of databases, domain theories, and data generators widely used by the machine learning community for empirical analysis of machine learning algorithms.


COCO (Common Objects in Context)


A large-scale dataset for object detection, segmentation, and captioning, containing over 200,000 labeled images across 80 categories.


LFW (Labeled Faces in the Wild)


A database designed for studying the problem of unconstrained face recognition with more than 13,000 images of faces collected from the web.


20 Newsgroups


A collection of approximately 20,000 newsgroup documents, partitioned across 20 different newsgroups, suitable for text classification and clustering.


Boston Housing


A dataset describing housing in 506 census tracts of the Boston area, used for regression analysis to predict median home values.


Stanford Dogs Dataset


A dataset with over 20,000 images of 120 breeds of dogs from around the world, which is used for fine-grained image classification.


LibriSpeech


A dataset of approximately 1,000 hours of read English speech derived from audiobooks, used for training and evaluating speech recognition systems.


Yelp Review Dataset


A dataset consisting of user reviews for businesses across 11 metropolitan areas on Yelp, useful for sentiment analysis and recommendation systems.


MS COCO


A large-scale dataset for multiple computer vision tasks such as object detection, segmentation, and captioning. MS COCO (Microsoft COCO) is the full name of the COCO dataset, not a separate collection.


Google Open Images


A dataset with millions of annotated images spanning a broad range of categories, useful for training machine learning models for large-scale visual recognition.


Sentiment140


A dataset containing 1.6 million tweets annotated with sentiment labels, created for sentiment analysis in the context of social media.


© Hypatia.Tech. 2024 All rights reserved.