Public Datasets
FloydHub hosts the popular public datasets listed below. Using them saves you from re-uploading large datasets to our servers.
To use a dataset, pass its Floyd Data ID with the --data option.
| Dataset Name | Floyd Data ID | Description |
|---|---|---|
| MS COCO 2014 Training Images | jq4ZXUCSVer4t65rWeyieG | Contains around 80k images |
| ImageNet VGG Very Deep 19 | jq4ZXUCSVer4t65rWeyieG | Pre-trained ConvNet model with 19 weight layers |
| MNIST | Gbya2j64ApqjSHt3vDpdSh | Database of handwritten digits |
| CALTECH 101/256 | Z48LF4K75SeyGbLnfpXbCP | Pictures of objects belonging to 101/256 categories |
| Quora Duplicate Questions | XeyQLG4nb2psqRjmzCTsbN | Contains over 400K lines of potential question duplicate pairs |
| CIFAR 10/100 | diSgciLH4WA7HpcHNasP9j | Labeled subsets of the 80 Million Tiny Images dataset |
| Dogs vs. Cats Redux: Kernels Edition | SyccinddLDdS7p3vzcwGQ2 | Dataset for Kaggle's famous Dogs vs. Cats competition |
| CycleGAN | f9RVzpea4vb9uCLaDggUgX | Dataset for CycleGAN |
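
As an illustration of passing a Floyd Data ID with --data, here is a minimal Python sketch that looks up an ID from the table and assembles the command line. It assumes jobs are started with `floyd run`, which this page does not state; the helper `build_floyd_command` and the script name `train.py` are hypothetical and only serve the example.

```python
import shlex

# Data IDs copied from the table above (only two entries shown).
PUBLIC_DATASETS = {
    "MNIST": "Gbya2j64ApqjSHt3vDpdSh",
    "CycleGAN": "f9RVzpea4vb9uCLaDggUgX",
}


def build_floyd_command(dataset_name, job_command):
    """Return the argument list for a job that uses a public dataset.

    Assumption: `floyd run` is the subcommand that accepts --data.
    This page only confirms the --data option itself.
    """
    data_id = PUBLIC_DATASETS[dataset_name]
    return ["floyd", "run", "--data", data_id, job_command]


# `train.py` is a hypothetical training script.
args = build_floyd_command("MNIST", "python train.py")

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(args))
```

The list returned by `build_floyd_command` can also be handed directly to `subprocess.run` if you prefer to launch the job from a script rather than paste the printed command into a terminal.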
If you have requests or suggestions for public datasets to add to our servers, let us know at contact@floydhub.com.