OCID – Object Clutter Indoor Dataset

Developing robot perception systems for handling objects in the real world requires computer vision algorithms to be carefully scrutinized with respect to the expected operating domain. This demands large quantities of ground-truth data to rigorously evaluate the performance of algorithms.

The Object Clutter Indoor Dataset (OCID) is an RGBD-dataset containing point-wise labeled point-clouds for each object. The data was captured using two ASUS-PRO Xtion cameras positioned at different heights. It covers diverse settings of objects, background, context, sensor-to-scene distance, viewpoint angle and lighting conditions. The main purpose of OCID is to allow systematic comparison of existing object segmentation methods in scenes with an increasing amount of clutter. In addition, OCID also provides ground-truth data for other vision tasks such as object classification and recognition.

Fig. 1. OCID provides structured and pixel-wise annotated point-cloud data of objects in cluttered scenes, ready to be used for robot vision tasks such as object segmentation, classification and recognition.


OCID comprises 96 fully built-up cluttered scenes. Each scene is a sequence of labeled point-clouds, created by building an increasingly cluttered scene incrementally, adding one object after the other. The first item in a sequence contains no objects, the second contains one object, and so on up to the final count of added objects.

The dataset uses 89 different objects, chosen as representatives of the Autonomous Robot Indoor Dataset (ARID) [1] classes and of the YCB Object and Model Set (YCB) [2] objects.

The ARID20 subset contains scenes including up to 20 objects from ARID. The ARID10 and YCB10 subsets include cluttered scenes with up to 10 objects from ARID and the YCB objects respectively. The scenes in each subset are composed of objects from only one set at a time to maintain separation between datasets. Scene variation includes different floor (plastic, wood, carpet) and table textures (wood, orange striped sheet, green patterned sheet). The complete set of data provides 2346 labeled point-clouds.

OCID subsets are structured so that specific real-world factors can be individually assessed.

ARID20 structure:
• location: floor, table
• view: bottom, top
• scene: sequence-id
• free: clearly separated (objects 1-9 in corresponding sequence)
• touching: physically touching (objects 10-16 in corresponding sequence)
• stacked: on top of each other (objects 17-20 in corresponding sequence)
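The object ranges listed above tie the position of an object in a 20-object sequence to a clutter level. A minimal sketch of that mapping follows; the function name `clutter_level` is hypothetical and not part of any OCID tooling.

```python
def clutter_level(object_number: int) -> str:
    """Map the 1-based number of an added object to its clutter level.

    Ranges follow the list above: 1-9 free, 10-16 touching, 17-20 stacked.
    """
    if not 1 <= object_number <= 20:
        raise ValueError("object_number must be between 1 and 20")
    if object_number <= 9:
        return "free"
    if object_number <= 16:
        return "touching"
    return "stacked"


# Clutter level for every step of a full 20-object sequence
levels = [clutter_level(n) for n in range(1, 21)]
```

Since the first item of a sequence contains no objects, the item at zero-based position `i` holds `i` objects, so `clutter_level(i)` would describe the most recently added object for `i >= 1`.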

ARID10 structure:
• location: floor, table
• view: bottom, top
• box: objects with sharp edges (e.g. cereal-boxes)
• curved: objects with smooth curved surfaces (e.g. ball)
• mixed: objects from both the box and curved categories
• fruits: fruit and vegetables
• non-fruits: mixed objects without fruits
• scene: sequence-id

YCB10 structure:
• location: floor, table
• view: bottom, top
• box: objects with sharp edges (e.g. cereal-boxes)
• curved: objects with smooth curved surfaces (e.g. ball)
• mixed: objects from both the box and curved categories
• scene: sequence-id

Example of directory structure:
You can find all labeled pointclouds of the ARID20 dataset for the first sequence on a table recorded with the lower mounted camera in this directory:

In addition to the labeled organized point-cloud files, corresponding depth images, RGB images and 2D label masks are available:

  • pcd: 640×480 organized XYZRGBL-pointcloud file with ground truth
  • rgb: 640×480 RGB png-image
  • depth: 640×480 16-bit png-image with depth in mm
  • label: 640×480 16-bit png-image with unique integer-label for each object at each pixel
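As a rough illustration of how the depth and label images might be used once loaded into arrays, the sketch below converts millimetre depth to metres and splits a label image into one mask per object. It works on small synthetic arrays instead of real files. The helper names are hypothetical, and treating a raw depth of 0 as "no reading" and label 0 as background are assumptions that should be checked against the data.

```python
import numpy as np


def depth_to_meters(depth_mm: np.ndarray) -> np.ndarray:
    """Convert a 16-bit depth image in millimetres to float32 metres."""
    depth_m = depth_mm.astype(np.float32) / 1000.0
    # Assumption: a raw value of 0 marks a pixel without a depth reading.
    depth_m[depth_mm == 0] = np.nan
    return depth_m


def object_masks(label: np.ndarray, background: int = 0) -> dict:
    """Return one boolean mask per integer label, skipping the background id.

    Assumption: `background` (0 here) is the id of unlabeled pixels.
    """
    return {int(i): label == i for i in np.unique(label) if i != background}


# Tiny synthetic stand-ins for a 640x480 depth image and label image
depth_mm = np.array([[0, 1000], [1500, 2000]], dtype=np.uint16)
label = np.array([[0, 1], [1, 2]], dtype=np.uint16)

depth_m = depth_to_meters(depth_mm)
masks = object_masks(label)
```

The PNG files themselves can be read into such arrays with any image library that preserves 16-bit values.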

Dataset creation using EasyLabel:

OCID was created using EasyLabel – a semi-automatic annotation tool for RGBD-data. EasyLabel processes recorded sequences of organized point-cloud files and exploits incrementally built-up scenes, where one additional object is placed in each take. The recorded point-cloud data is then accumulated, and the depth difference between two consecutive recordings is used to label new objects. The code is available here.
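The idea of labeling a newly placed object from the depth difference between two consecutive recordings can be sketched on plain depth images. This is not EasyLabel's implementation, which accumulates point-cloud data; the function name, the 10 mm threshold and the use of 0 for invalid depth are all assumptions made for illustration.

```python
import numpy as np


def label_new_object(prev_depth, curr_depth, labels, new_id, min_diff_mm=10):
    """Assign new_id to pixels whose depth changed between two recordings.

    min_diff_mm is a hypothetical noise threshold, not a value from the paper.
    """
    prev = prev_depth.astype(np.int32)
    curr = curr_depth.astype(np.int32)
    # Assumption: a depth of 0 means no measurement, so such pixels are skipped.
    valid = (prev > 0) & (curr > 0)
    changed = valid & (np.abs(prev - curr) > min_diff_mm)
    out = labels.copy()
    out[changed] = new_id
    return out


# Synthetic sequence: empty surface at 1000 mm, then two objects added in turn
empty = np.full((3, 3), 1000, dtype=np.uint16)
take1 = empty.copy()
take1[1, 1] = 900          # first object appears in the centre
take2 = take1.copy()
take2[0, 0] = 950          # second object appears in a corner

labels = np.zeros((3, 3), dtype=np.uint16)
labels = label_new_object(empty, take1, labels, new_id=1)
labels = label_new_object(take1, take2, labels, new_id=2)
```

Because each take adds exactly one object, comparing only consecutive recordings keeps earlier labels untouched while the new object receives the next integer id.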

OCID data for instance recognition/classification

For ARID10 and ARID20, additional data is available for object recognition and classification tasks. It contains semantically annotated RGB and depth image crops extracted from the OCID dataset.
The structure is as follows:
• type: depth, RGB
• class name: e.g. banana, kleenex, …
• class instance: e.g. banana_1, banana_2, kleenex_1, kleenex_2, …
The data is provided by Mohammad Reza Loghmani.
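Instance names such as banana_1 combine the class name with an instance number. A small sketch of recovering both parts is shown below; `split_instance` is a hypothetical helper, and it assumes every instance name ends in an underscore followed by digits.

```python
from collections import defaultdict


def split_instance(instance_name: str) -> tuple:
    """Split 'banana_1' into ('banana', 1), using the last underscore."""
    class_name, _, index = instance_name.rpartition("_")
    if not class_name or not index.isdigit():
        raise ValueError(f"unexpected instance name: {instance_name!r}")
    return class_name, int(index)


# Group the instance names from the list above by class
instances = ["banana_1", "banana_2", "kleenex_1", "kleenex_2"]
by_class = defaultdict(list)
for name in instances:
    class_name, index = split_instance(name)
    by_class[class_name].append(index)
```

Splitting at the last underscore keeps class names that themselves contain underscores intact.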


  • OCID dataset
    OCID dataset containing the subsets ARID20, ARID10, and YCB10. This includes RGB, depth, 2D label masks and ground-truth annotated point-cloud data.
  • OCID semantic crops
    Cropped objects from the ARID20 and ARID10 subsets of the OCID dataset, containing RGB and depth data organized according to the instance and category of the object.

Research paper

If you find our dataset useful, please cite the following paper:


@inproceedings{DBLP:conf/icra/SuchiPFV19,
  author    = {Markus Suchi and
               Timothy Patten and
               David Fischinger and
               Markus Vincze},
  title     = {EasyLabel: {A} Semi-Automatic Pixel-wise Object Annotation Tool for
               Creating Robotic {RGB-D} Datasets},
  booktitle = {International Conference on Robotics and Automation, {ICRA} 2019,
               Montreal, QC, Canada, May 20-24, 2019},
  pages     = {6678--6684},
  year      = {2019},
  crossref  = {DBLP:conf/icra/2019},
  url       = {https://doi.org/10.1109/ICRA.2019.8793917},
  doi       = {10.1109/ICRA.2019.8793917},
  timestamp = {Tue, 13 Aug 2019 20:25:20 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/icra/SuchiPFV19},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@proceedings{DBLP:conf/icra/2019,
  title     = {International Conference on Robotics and Automation, {ICRA} 2019,
               Montreal, QC, Canada, May 20-24, 2019},
  publisher = {{IEEE}},
  year      = {2019},
  url       = {http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8780387},
  isbn      = {978-1-5386-6027-0},
  timestamp = {Tue, 13 Aug 2019 20:23:21 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/icra/2019},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contact & credits

For any questions or issues with the OCID dataset, feel free to contact the authors:

  • Markus Suchi – email: suchi@acin.tuwien.ac.at
  • Tim Patten – email: patten@acin.tuwien.ac.at

For specific questions about the OCID-semantic crops data please contact:

  • Mohammad Reza Loghmani – email: loghmani@acin.tuwien.ac.at


[1] Mohammad Reza Loghmani et al., Recognizing Objects in-the-Wild: Where do we Stand?, 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 2170–2177, 2018.

[2] Berk Calli, Arjun Singh, James Bruce, Aaron Walsman, Kurt Konolige, Siddhartha Srinivasa, Pieter Abbeel, Aaron M. Dollar, Yale-CMU-Berkeley dataset for robotic manipulation research, The International Journal of Robotics Research, vol. 36, issue 3, pp. 261–268, April 2017.