To help you get started, we've compiled a variety of datasets and APIs to draw inspiration from. Many of these datasets have already been cleaned and normalized, so they are ready to be explored using AI tools. Many are licensed for research use only. If you want to use the data in your startup, be sure to read any associated license agreements to understand whether there are commercial restrictions. Also note that you are not restricted to basing your idea on the datasets below. You may discover other open source datasets that inspire your creativity, or you may bring your own proprietary datasets if you wish.
And if there’s a dataset you think we should add to the list, please send it to us.
1. General Image Datasets: Large-scale datasets containing diverse images covering a wide range of categories and scenes. These datasets are often used for generic computer vision tasks like image classification and object detection. Examples include ImageNet, COCO, and Open Images.
2. Specific Object Datasets: Datasets focused on specific objects or subjects, such as faces (Labeled Faces in the Wild), birds (CUB-200-2011), or flowers (Oxford 102 Flower Dataset). These datasets are useful for fine-grained recognition tasks.
3. Scene Recognition Datasets: Datasets with images categorized based on different scene types, such as indoor scenes (MIT67) or urban environments (Cityscapes). These datasets are employed for scene recognition and understanding.
4. Medical Image Datasets: Datasets containing medical images, such as X-rays, MRI scans, or histopathology images, used for medical image analysis and diagnosis. Examples include DeepLesion and ChestX-ray8.
5. Handwritten Digits Datasets: Datasets with images of handwritten digits, like MNIST, used for digit recognition tasks and benchmarking algorithms.
6. Natural Language to Image Datasets: Pairings of images with corresponding natural language descriptions, useful for research in image captioning and multimodal learning. Examples include COCO Captions and Visual Genome.
7. Sketch Datasets: Datasets containing hand-drawn sketches, which are valuable for sketch-based recognition and image generation tasks. Examples include Sketchy Dataset and Quick, Draw!.
8. Domain Adaptation Datasets: Datasets designed for domain adaptation tasks, where images from different domains are provided to test algorithms' transferability. Examples include VisDA and Office-Home.
9. Social Media Image Datasets: Datasets sourced from social media platforms, capturing user-generated content, and covering a wide range of topics and events.
10. Geo-tagged Image Datasets: Image datasets linked with geographic location information, useful for research involving geospatial analysis and location-based image retrieval.
11. Video Frame Datasets: Extracted frames from videos, used for action recognition, video understanding, and video-to-image tasks.
12. Face Recognition Datasets: Datasets focused on face images, commonly used for face recognition and facial expression analysis. Examples include LFW and CelebA.
13. Remote Sensing and Satellite Image Datasets: Datasets containing aerial and satellite images, utilized for remote sensing, land cover classification, and environmental monitoring.
14. Art and Cultural Heritage Datasets: Image datasets featuring artwork, historical artifacts, or cultural heritage items, used for art analysis and preservation efforts.
15. Fashion Image Datasets: Datasets related to the fashion industry, containing images of clothing and fashion items for fashion analysis and recommendation systems.
1. ImageNet: A large-scale image database with millions of labeled images for computer vision research and object recognition tasks. - Website: http://www.image-net.org/
2. COCO (Common Objects in Context): A dataset containing images with complex everyday scenes, providing object detection and segmentation annotations. - Website: https://cocodataset.org/
3. Open Images: A dataset of millions of annotated images with labels for object detection, segmentation, and visual relationship recognition. - Website: https://storage.googleapis.com/openimages/web/index.html
4. Flickr Commons: A collection of public photography archives shared on Flickr by libraries, museums, and other cultural institutions, marked as having "no known copyright restrictions". - Website: https://www.flickr.com/commons
5. Unsplash: A website offering high-quality, free-to-use images contributed by photographers under the Unsplash license. - Website: https://unsplash.com/
6. Pixabay: A platform providing free images, photos, and videos under Pixabay's own content license (older uploads were released under CC0). - Website: https://pixabay.com/
7. Pexels: Another platform with a collection of high-quality, free-to-use images and videos for personal and commercial use. - Website: https://www.pexels.com/
8. Kaggle Datasets: Kaggle hosts various image datasets contributed by the community, including those related to computer vision challenges. - Website: https://www.kaggle.com/datasets?fileType=jpg
9. VisualData: A collection of diverse image datasets for research and benchmarking computer vision algorithms. - Website: https://www.visualdata.io/
10. Stanford Dogs Dataset: A dataset containing images of various dog breeds for image classification tasks. - Website: http://vision.stanford.edu/aditya86/ImageNetDogs/
11. MNIST Handwritten Digits: A dataset of 28x28 grayscale images of handwritten digits (0-9) for image classification tasks. - Website: http://yann.lecun.com/exdb/mnist/
12. Labeled Faces in the Wild (LFW): A dataset of face images with labeled identities for face recognition research. - Website: http://vis-www.cs.umass.edu/lfw/
13. DeepFashion2: A large-scale fashion image dataset with clothing category labels and keypoint annotations. - Website: https://github.com/switchablenorms/DeepFashion2
14. Cityscapes: A dataset of urban street scenes with pixel-level annotations for semantic segmentation tasks. - Website: https://www.cityscapes-dataset.com/
15. CelebA: A dataset of celebrity face images with attribute labels for various facial features. - Website: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
16. Caltech-256: A dataset with 256 object categories, containing images for object recognition tasks. - Website: http://www.vision.caltech.edu/Image_Datasets/Caltech256/
17. SUN Database: A dataset of scene images with a focus on scene recognition and understanding. - Website: https://groups.csail.mit.edu/vision/SUN/
18. CIFAR-10 and CIFAR-100: Datasets with small color images of various objects for image classification tasks. - Website: https://www.cs.toronto.edu/~kriz/cifar.html
19. Tiny ImageNet: A smaller version of ImageNet with 200 object classes, useful for smaller-scale experiments. - Website: https://tiny-imagenet.herokuapp.com/
20. ADE20K: A large-scale dataset for semantic segmentation, containing images with pixel-wise annotations. - Website: http://groups.csail.mit.edu/vision/datasets/ADE20K/
21. Aerial Maritime Drone Dataset: A dataset containing aerial images from maritime scenarios, suitable for object detection and tracking in maritime environments. - Website: https://www.aerial-maritime-drone.org/
22. NIST Face Recognition Vendor Test (FRVT) Databases: Datasets used for evaluating face recognition algorithms, including the Mugshot and Visa databases. - Website: https://www.nist.gov/itl/iad/image-group/frvt-databases-and-tools
23. Quick, Draw!: A dataset of over 50 million drawings across various categories, collected through the "Quick, Draw!" game. - Website: https://quickdraw.withgoogle.com/data
24. OpenEDS: An open-access database of scanning electron microscope images for material science and electron microscopy research. - Website: https://github.com/dsl-epfl/openeds
25. Flickr Faces-HQ Dataset (FFHQ): A high-quality dataset of human faces from Flickr, ideal for face-related research and analysis. - Website: https://github.com/NVlabs/ffhq-dataset
26. Geometric Shapes Dataset: A dataset containing images of geometric shapes, suitable for shape recognition and object detection. - Website: https://github.com/savarese/geometric-shapes
27. VisDA: The Visual Domain Adaptation Challenge datasets, featuring images from different visual domains for domain adaptation tasks. - Website: https://github.com/VisionLearningGroup/taskcv-2017-public
28. Image-to-Image Translation Datasets: A collection of paired images for image-to-image translation tasks, including Cityscapes, Facades, and Maps datasets. - Website: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
29. Visual Genome: A dataset with images and annotations, providing detailed visual relationships among objects for visual reasoning tasks. - Website: https://visualgenome.org/
30. Sketchy Dataset: A dataset of hand-drawn sketches across various object categories, useful for sketch recognition and analysis. - Website: http://sketchy.eye.gatech.edu/
31. Street View House Numbers (SVHN): A dataset of house numbers captured from Google Street View images for digit recognition tasks. - Website: http://ufldl.stanford.edu/housenumbers/
32. The FashionAI Global Challenge (Tianchi): A dataset with fashion images for tasks like fashion classification and attribute prediction. - Website: https://tianchi.aliyun.com/competition/entrance/231649/information
33. WildDash: A dataset of driving images with pixel-level annotations for semantic segmentation in urban driving scenarios. - Website: https://wilddash.cc/
34. Places2: A large-scale scene recognition dataset with images representing diverse indoor and outdoor scenes. - Website: http://places2.csail.mit.edu/
35. BDD100K: A diverse driving dataset with images and annotations for various computer vision tasks in autonomous driving. - Website: https://bdd-data.berkeley.edu/
36. ECCV 2020 Animal-AI: A dataset of images containing various animal species for animal recognition and classification tasks. - Website: https://www.kaggle.com/c/animalai-2020
37. CORe50: A dataset with images containing everyday objects, suitable for object recognition and manipulation tasks. - Website: https://vlomonaco.github.io/core50/
38. PlantCLEF: A dataset of plant images for plant species classification and identification. - Website: https://www.imageclef.org/PlantCLEF2021
39. PASCAL VOC: A popular dataset for object recognition and segmentation tasks, used in the PASCAL Visual Object Classes challenge. - Website: http://host.robots.ox.ac.uk/pascal/VOC/
40. Indoor Scene Recognition (MIT67): A dataset of indoor scene images for scene recognition and classification tasks. - Website: http://web.mit.edu/torralba/www/indoor.html
41. SketchyScene: A dataset with hand-drawn sketches of indoor scenes, useful for sketch-based scene recognition. - Website: https://sketchyscene.github.io/
42. Oxford 102 Flower Dataset: A dataset of flower images with 102 categories for fine-grained image classification. - Website: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/
43. Visual Sentiment Ontology (VSO): A dataset with images annotated with visual sentiment concepts, allowing sentiment analysis tasks. - Website: http://sentiment.cs.cmu.edu/
44. The Street View Text (SVT) Dataset: A dataset with images of text in outdoor scenes for text detection and recognition tasks. - Website: http://tc11.cvc.uab.es/datasets/SVT_1
45. DeepLesion: A dataset of medical images with annotated lesions for lesion detection and classification. - Website: https://nihcc.app.box.com/v/DeepLesion
46. CUB-200-2011: A dataset of bird images with fine-grained categories for bird species recognition. - Website: http://www.vision.caltech.edu/visipedia/CUB-200-2011.html
47. Food-101: A dataset of food images with 101 categories for food recognition and classification. - Website: https://www.vision.ee.ethz.ch/datasets_extra/food-101/
48. Quickdraw-Dataset: The GitHub repository for the "Quick, Draw!" data, a collection of 50 million drawings across 345 categories contributed by players of the game. - Website: https://github.com/googlecreativelab/quickdraw-dataset
49. Office-Home: A dataset containing images of objects in office and home environments for domain adaptation tasks. - Website: http://hemanthdv.org/OfficeHome-Dataset/
50. MIT Indoors: A dataset of indoor scenes spanning 67 categories for scene recognition and classification. - Website: http://web.mit.edu/torralba/www/indoor.html
51. Chest X-Ray Images (Pneumonia) - Website: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
52. NIH Chest X-Rays - Website: https://www.kaggle.com/datasets/nih-chest-xrays/data
53. Epic Kitchens: The largest dataset in first-person (egocentric) vision; multi-faceted non-scripted recordings in native environments - i.e. the wearer’s homes, capturing all daily activities in the kitchen over multiple days. Annotations are collected using a novel ‘live’ audio commentary approach. - Website: https://epic-kitchens.github.io/2018
54. Cat Dataset: Over 9000 images of cats with annotated facial features. - Website: https://www.kaggle.com/datasets/crawford/cat-dataset
55. Food Images - Website: https://www.kaggle.com/datasets/maddy1/food-train
56. Flower Image Dataset: This dataset contains 4242 labeled images of flowers. The images are divided into 5 classes: chamomile, tulip, rose, sunflower, dandelion. - Website: https://www.kaggle.com/datasets/alxmamaev/flowers-recognition
57. Dress Patterns: This dataset contains links to images of dresses, and the corresponding images are categorized into 17 pattern types. - Website: https://data.world/crowdflower/categorization-dress-patterns
58. Yahoo Image Datasets - Website: https://webscope.sandbox.yahoo.com/catalog.php?datatype=i
59. State Farm Distracted Driver Detection - Website: https://www.kaggle.com/competitions/state-farm-distracted-driver-detection/data
60. Yelp Restaurant Photo Classification - Website: https://www.kaggle.com/c/yelp-restaurant-photo-classification/data
61. YouTube-8M Segments Dataset: YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs and associated labels from a diverse vocabulary of 4700+ visual entities. - Website: https://research.google.com/youtube8m/
62. YouTube-BoundingBoxes Dataset: YouTube-BoundingBoxes is a large-scale dataset of video URLs with densely sampled, high-quality single-object bounding box annotations. The dataset consists of approximately 380,000 video segments of 15-20 seconds each, extracted from 240,000 different publicly visible YouTube videos and automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. - Website: https://research.google.com/youtube-bb/
63. Google Atomic Visual Actions (AVA): A dataset that provides multiple action labels for each person in extended video sequences. - Website: https://ai.googleblog.com/2017/10/announcing-ava-finely-labeled-video.html
64. Google Open Image Dataset (v3): A dataset of ~9 million URLs to images that have been annotated with image-level labels and bounding boxes spanning thousands of classes. - Website: https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html
65. CVonline: A collated list of image and video databases that people have found useful for computer vision research and algorithm evaluation. - Website: https://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm
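Several of the datasets above ship in simple binary or archive formats rather than as loose image files. As one illustration, the MNIST images listed above are distributed in the IDX format: a 16-byte big-endian header (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The sketch below decodes that layout using only the Python standard library. The function name `parse_idx_images` is our own, and the code assumes the file has already been decompressed (for example with `gzip.decompress`); treat it as a starting point rather than a reference loader.

```python
import struct

# Magic number for an IDX file holding unsigned bytes in 3 dimensions
# (count x rows x cols), as used by the MNIST image files.
IDX_IMAGE_MAGIC = 2051


def parse_idx_images(raw: bytes) -> list[list[list[int]]]:
    """Decode uncompressed IDX image bytes into a list of pixel grids.

    Each image is returned as a list of rows, each row a list of
    integer pixel intensities in the range 0-255.
    """
    if len(raw) < 16:
        raise ValueError("data is too short to contain an IDX header")

    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")

    expected_size = 16 + count * rows * cols
    if len(raw) != expected_size:
        raise ValueError(
            f"expected {expected_size} bytes for {count} images, got {len(raw)}"
        )

    images = []
    offset = 16
    for _ in range(count):
        # Pixels are stored row by row, one byte per pixel.
        image = [
            list(raw[offset + r * cols : offset + (r + 1) * cols])
            for r in range(rows)
        ]
        images.append(image)
        offset += rows * cols
    return images
```

For the real MNIST training file, the result would be 60,000 grids of 28x28 values. Other datasets in the list use different containers (JPEG folders, pickled arrays, JSON annotations), so check each dataset's own documentation before writing a loader.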
1. Google Cloud Vision API: This API provides advanced image analysis capabilities, including object detection, face detection, text recognition, and image labeling. It can identify various objects, landmarks, and logos in images and extract text from images. - Website: https://cloud.google.com/vision
2. Microsoft Azure Computer Vision API: This API offers image analysis features like image recognition, face detection, OCR (Optical Character Recognition), and image thumbnail generation. It can also analyze images for adult content and generate image descriptions for accessibility. - Website: https://azure.microsoft.com/en-us/products/ai-services/ai-vision
3. IBM Watson Visual Recognition API: IBM's API allows image classification into custom categories, object detection, and face recognition. It's commonly used for tasks like brand detection, image moderation, and visual search. - Website: https://mediacenter.ibm.com/media/IBM+Watson+Visual+Recognition/0_jbsmp6lq
4. Clarifai API: Clarifai's API provides image and video recognition capabilities, including image tagging, object recognition, face detection, and NSFW (Not Safe For Work) content moderation. - Website: https://docs.clarifai.com/api-guide/api-overview/
5. Amazon Rekognition: Amazon's API offers various image analysis features, such as object and scene detection, facial analysis, celebrity recognition, and image moderation. - Website: https://docs.aws.amazon.com/rekognition/latest/APIReference/Welcome.html
6. Sighthound Cloud API: Sighthound's API provides object detection, facial recognition, and vehicle recognition capabilities, suitable for applications like security and surveillance. - Website: https://www.sighthound.com/products/cloud
7. Kairos API: The Kairos API specializes in facial recognition and emotion analysis, making it useful for applications like user verification and sentiment analysis. - Website: https://www.kairos.com/docs/api/
8. Imagga API: Imagga's API offers image tagging and categorization, content moderation, and color analysis, making it valuable for content organization and filtering. - Website: https://imagga.com/
9. Cloudmersive Image Recognition API: This API provides image recognition for a wide range of objects and scenes, offering features like labeling, OCR, face detection, and content moderation. - Website: https://cloudmersive.com/image-recognition-and-processing-api
10. Pictur.io API: Pictur.io's API focuses on image categorization and organization, allowing developers to build image search and organization applications. - Website: https://github.com/pictura-io/API-description
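Most of the hosted APIs above follow the same pattern: you send an image (often base64-encoded) along with the analysis features you want, and receive JSON annotations back. The sketch below builds a label-detection request body in the shape we understand the Google Cloud Vision REST endpoint `images:annotate` to accept. The helper name `build_label_request` is hypothetical, authentication and the actual HTTP call are deliberately left out, and the endpoint and field names should be verified against the provider's current reference before use.

```python
import base64
import json

# Assumed REST endpoint for Cloud Vision image annotation; confirm in the
# official documentation before relying on it.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes: bytes, max_results: int = 5) -> str:
    """Return a JSON request body asking for label detection on one image."""
    payload = {
        "requests": [
            {
                # Raw image bytes are sent inline as base64 text.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                # Each feature names one kind of analysis to run.
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }
    return json.dumps(payload)
```

To use it, read an image file in binary mode, pass the bytes to `build_label_request`, and POST the resulting string to `ANNOTATE_URL` with your credentials. The other providers in this list expect different request shapes, so the payload would need to be adapted for each.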
1. Food-101: A dataset containing 101 food categories with 101,000 images. Useful for food recognition and classification. - Website: https://www.vision.ee.ethz.ch/datasets_extra/food-101/
2. Sketchy Dataset: A dataset of hand-drawn sketches, beneficial for sketch-based image retrieval and analysis. - Website: https://sketchy.eye.gatech.edu/
3. FashionAI Global Challenge (Tianchi): A dataset with fashion images for tasks like fashion classification and attribute prediction. - Website: https://tianchi.aliyun.com/competition/entrance/231649/information
4. DeepFashion2: A large-scale fashion image dataset with clothing category labels and keypoint annotations. - Website: https://github.com/switchablenorms/DeepFashion2
5. PlantCLEF: A dataset of plant images for plant species classification and identification. - Website: https://www.imageclef.org/PlantCLEF2021
6. Animals with Attributes (AwA): A dataset of animal images with attributes, useful for attribute-based recognition. - Website: https://cvml.ist.ac.at/AwA2/
7. GeoPose3K: A dataset of images from diverse geographic locations with camera pose and GPS metadata. - Website: https://geoml.org/dataset_geopose3k.html
8. German Traffic Sign Recognition Benchmark (GTSRB): A dataset of German traffic signs for traffic sign recognition tasks. - Website: https://benchmark.ini.rub.de/gtsrb_dataset.html
9. Google Landmarks Dataset v2: A collection of images from various landmarks around the world, useful for landmark recognition. - Website: https://github.com/cvdfoundation/google-landmark
10. Traffic4cast: A dataset of traffic camera images for traffic flow prediction and analysis. - Website: https://www.iarai.ac.at/traffic4cast/
11. VISOR-Vehicle: A dataset of vehicle images from various viewpoints for vehicle recognition. - Website: https://github.com/hasanirtiza/VISOR-Vehicle
12. Painting-91: A dataset of paintings covering various art styles for art style classification tasks. - Website: http://www.visionlab.cs.huberlin.de/resources/
13. WiderPerson: A dataset with images containing persons from a wider range of viewpoints and contexts. - Website: http://www.cbsr.ia.ac.cn/users/scliao/projects/widerperson/
14. SYSU 3D Human-Object Interaction (SYSU HOI): A dataset with images of human-object interactions for action recognition. - Website: http://sysu-hcp.net/lip/
15. Places365-Standard: An extended version of Places2 with 365 scene categories for scene recognition. - Website: http://places2.csail.mit.edu/
16. PyImageSearch - Best Machine Learning Datasets: Various machine learning datasets. - Website: https://pyimagesearch.com/2023/07/31/best-machine-learning-datasets/