Selective 3D Spatial Pyramid Matching Kernel for Object and Scene Categorization

3DSPMK approach

Project Description

In this project, we propose a novel approach to recognizing object and scene categories in depth images. We introduce a Bag of Words (BoW) representation in 3D, the Selective 3D Spatial Pyramid Matching Kernel (3DSPMK). The approach first quantizes 3D local descriptors, computed from point clouds, to build a vocabulary of 3D visual words. This codebook is then used to build the 3DSPMK, which partitions a working volume into increasingly fine sub-volumes and computes a hierarchical weighted sum of histogram intersections of visual words at each level of the 3D pyramid structure. To increase both the classification accuracy and the computational efficiency of the kernel, we propose two selective hierarchical volume decomposition strategies, based on representative and discriminative sub-volume selection processes, which drastically reduce the number of pyramid sub-volumes to consider. Results on different RGB-D datasets show that our approaches obtain state-of-the-art results for both object recognition and scene categorization.
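The kernel computation described above can be illustrated with a minimal sketch. This is not the released 3DSPMK software: the function names (`quantize`, `pyramid_histograms`, `pyramid_match`) are hypothetical, points are assumed to be normalized to the unit cube, the volume is assumed to be split into 2^l cells per axis at level l, and the level weights are assumed to follow the standard spatial pyramid matching scheme. The selective sub-volume strategies are not included.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each 3D local descriptor to its nearest visual word (Euclidean)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def pyramid_histograms(points, words, vocab_size, levels):
    """Build one (sub-volumes x vocabulary) histogram per pyramid level.

    Assumption: points lie in the unit cube [0, 1)^3, and level l splits
    each axis into 2**l cells.
    """
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Integer cell coordinates of every point at this level.
        idx = np.minimum((points * cells).astype(int), cells - 1)
        cell_id = (idx[:, 0] * cells + idx[:, 1]) * cells + idx[:, 2]
        h = np.zeros((cells ** 3, vocab_size))
        np.add.at(h, (cell_id, words), 1.0)  # count words per sub-volume
        hists.append(h)
    return hists

def pyramid_match(hists_a, hists_b):
    """Weighted sum of histogram intersections over all pyramid levels.

    Assumption: weights follow the standard spatial pyramid scheme,
    1/2**L for level 0 and 1/2**(L - l + 1) for level l >= 1.
    """
    levels = len(hists_a) - 1
    k = 0.0
    for level, (ha, hb) in enumerate(zip(hists_a, hists_b)):
        intersection = np.minimum(ha, hb).sum()
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        k += weight * intersection
    return k

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(5, 8))       # toy vocabulary of 5 words
    points = rng.random((50, 3))             # toy keypoint locations
    descriptors = rng.normal(size=(50, 8))   # toy local descriptors
    words = quantize(descriptors, codebook)
    hists = pyramid_histograms(points, words, vocab_size=5, levels=2)
    print(pyramid_match(hists, hists))
```

With these weights, which sum to one, matching a point cloud against itself returns the number of descriptors, which is a convenient sanity check. In practice the resulting kernel matrix would be passed to a kernel classifier such as an SVM.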



Errata (CVPR paper): Please note that the results reported in our CVPR 2012 paper on the RGB-D database are not valid, because we found errors in the experimental setup. We are grateful to Liefeng Bo and Xiaofeng Ren for alerting us to these errors. Results obtained with the correct experimental setup are included in the journal paper. When citing our RGB-D results, please use those reported in the journal paper.

We have also generated new versions of Table 1 and Figure 9 from the CVPR 2012 paper, now with the correct results. Here they are:

Table 1 Figure 9



Any questions or bug reports? Please email us: robertoj.lopez ...@...


When using our software, please acknowledge the effort that went into its development by referencing the papers:

  title = {Recognizing in the Depth: Selective {3D} Spatial Pyramid Matching Kernel for Object and Scene Categorization},
  author = {Redondo-Cabrera, C. and Lopez-Sastre, R.~J. and Acevedo-Rodriguez, J. and Maldonado-Bascon, S.},
  journal = {Image and Vision Computing},
  year = {2014},
  number = {32},
  pages = {965-978}

  title = {{SURF}ing the Point Clouds: Selective {3D} Spatial Pyramids for Category-level Object Recognition},
  author = {Redondo-Cabrera, C. and Lopez-Sastre, R.~J. and Acevedo-Rodriguez, J. and Maldonado-Bascon, S.},
  booktitle = {IEEE CVPR},
  year = {2012}


This work was partially supported by projects CCG2013/EXP-047, IPT-2012-0808-370000, TIN2010-20845-C03-03, UAH2011/EXP-030 and IPT-2011-1366-390000.