S. Agarwal and D. Roth, Learning a Sparse Representation for Object Detection, European Conference on Computer Vision (ECCV), pp.113-127, 2002.
DOI : 10.1007/3-540-47979-1_8

P. Agrawal, R. Girshick, and J. Malik, Analyzing the Performance of Multilayer Neural Networks for Object Recognition, European Conference on Computer Vision (ECCV), pp.329-344, 2014.
DOI : 10.1007/978-3-319-10584-0_22

R. Arandjelović and A. Zisserman, Smooth object retrieval using a bag of boundaries, International Conference on Computer Vision (ICCV), pp.37-93, 2011.

P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, Contour Detection and Hierarchical Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.5, pp.898-916, 2011.
DOI : 10.1109/TPAMI.2010.161

M. Aubry, D. Maturana, A. A. Efros, B. C. Russell, and J. Sivic, Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.72-79, 2014.
DOI : 10.1109/CVPR.2014.487
URL : https://hal.archives-ouvertes.fr/hal-01057240

M. Aubry and B. C. Russell, Understanding Deep Features with Computer-Generated Imagery, 2015 IEEE International Conference on Computer Vision (ICCV), pp.72-77, 2015.
DOI : 10.1109/ICCV.2015.329
URL : https://hal.archives-ouvertes.fr/hal-01240849

G. Baatz, O. Saurer, K. Köser, and M. Pollefeys, Large scale visual geolocalization of images in mountainous terrain, European Conference on Computer Vision (ECCV), p.71, 2012.

A. Babenko and V. Lempitsky, Aggregating local deep features for image retrieval

A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, Neural Codes for Image Retrieval, European Conference on Computer Vision (ECCV), pp.584-599, 2014.
DOI : 10.1007/978-3-319-10590-1_38

S. Bell and K. Bala, Learning visual similarity for product design with convolutional neural networks, ACM Transactions on Graphics, vol.34, issue.4, 2015.

Y. Bengio, Learning deep architectures for AI, Foundations and Trends in Machine Learning, vol.2, issue.1, pp.1-127, 2009.

Y. Bengio, Deep learning of representations for unsupervised and transfer learning, JMLR Workshop on Unsupervised and Transfer Learning, p.72, 2012.

C. M. Bishop, Pattern Recognition and Machine Learning, p.19, 2006.

J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), issue.6, pp.679-698, 1986.

G. Chaurasia, S. Duchene, O. Sorkine-hornung, and G. Drettakis, Depth synthesis and local warps for plausible image-based navigation, ACM Transactions on Graphics, vol.32, issue.3, 2013.
DOI : 10.1145/2487228.2487238
URL : https://hal.archives-ouvertes.fr/hal-00907793

T. Chen, M. Li, Y. Li, M. Lin, N. Wang et al., MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint, p.43, 2015.

T. Chen, Z. Zhu, A. Shamir, S. Hu, and D. Cohen-or, 3-Sweep, ACM Transactions on Graphics, vol.32, issue.6, p.195, 2013.
DOI : 10.1145/2508363.2508378

S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran et al., cuDNN: Efficient primitives for deep learning. arXiv preprint, p.43, 2014.

C. B. Choy, M. Stark, S. Corbett-davies, and S. Savarese, Enriching object detection with 2D-3D registration and continuous viewpoint estimation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.72, 2015.
DOI : 10.1109/CVPR.2015.7298866

O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, 2007 IEEE 11th International Conference on Computer Vision, p.71, 2007.
DOI : 10.1109/ICCV.2007.4408891

R. Collobert, K. Kavukcuoglu, and C. Farabet, Torch7: A Matlab-like environment for machine learning, BigLearn, NIPS Workshop, pp.43-44

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.31-74, 2005.
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.248-255, 2009.
DOI : 10.1109/CVPR.2009.5206848

P. Dollár, Z. Tu, and S. Belongie, Supervised Learning of Edges and Object Boundaries, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), p.37, 2006.
DOI : 10.1109/CVPR.2006.298

P. Dollár and C. L. Zitnick, Structured Forests for Fast Edge Detection, 2013 IEEE International Conference on Computer Vision, p.37, 2013.
DOI : 10.1109/ICCV.2013.231

A. Dosovitskiy and T. Brox, Inverting Visual Representations with Convolutional Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.522

D. Eigen, C. Puhrsch, and R. Fergus, Depth map prediction from a single image using a multi-scale deep network, Neural Information Processing Systems (NIPS), pp.2366-2374, 2014.

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

L. Fei-fei, R. Fergus, and P. Perona, Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories, Computer Vision and Image Understanding, vol.106, issue.1, pp.59-70, 2007.
DOI : 10.1016/j.cviu.2005.09.012

P. Felzenszwalb, R. Girshick, D. Mcallester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.80-81, 2010.
DOI : 10.1109/TPAMI.2009.167

P. Felzenszwalb, D. Mcallester, and D. Ramanan, A discriminatively trained, multiscale, deformable part model, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.18-32, 2008.
DOI : 10.1109/CVPR.2008.4587597

R. Fergus, P. Perona, and A. Zisserman, Object class recognition by unsupervised scale-invariant learning, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings., p.31, 2003.
DOI : 10.1109/CVPR.2003.1211479

S. Fidler, S. Dickinson, and R. Urtasun, 3D object detection and viewpoint estimation with a deformable 3D cuboid model, Neural Information Processing Systems (NIPS), p.71, 2012.

K. Fukushima, Neocognitron, Scholarpedia, vol.2, issue.1, pp.193-202, 1980.
DOI : 10.4249/scholarpedia.1717
URL : https://doi.org/10.4249/scholarpedia.1717

Y. Ganin and V. Lempitsky, Unsupervised domain adaptation by backpropagation, Proceedings of The 32nd International Conference on Machine Learning, pp.1180-1189, 2015.

S. Gidaris and N. Komodakis, Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1134-1142, 2015.
DOI : 10.1109/ICCV.2015.135

S. Gidaris and N. Komodakis, Attend refine repeat: Active box proposal generation via in-out localization, British Machine Vision Conference (BMVC), p.33, 2016.

R. Girshick, From Rigid Templates to Grammars: Object Detection with Structured Models, p.30, 2012.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.72-77, 2014.
DOI : 10.1109/CVPR.2014.81
URL : http://arxiv.org/pdf/1311.2524

R. Girshick, F. Iandola, T. Darrell, and J. Malik, Deformable part models are convolutional neural networks, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.437-446, 2015.
DOI : 10.1109/CVPR.2015.7298641
URL : http://arxiv.org/pdf/1409.5403

D. Glasner, M. Galun, S. Alpert, R. Basri, and G. Shakhnarovich, Viewpoint-aware object detection and pose estimation, International Conference on Computer Vision (ICCV), p.93, 2011.
DOI : 10.1109/iccv.2011.6126379
URL : http://www.ai.mit.edu/people/gregory/papers/iccv2011.pdf

X. Glorot, A. Bordes, and Y. Bengio, Deep sparse rectifier neural networks, JMLR W&CP: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, p.26, 2011.

R. Guo and D. Hoiem, Beyond the Line of Sight: Labeling the Underlying Surfaces, European Conference on Computer Vision (ECCV), pp.761-774, 2012.
DOI : 10.1007/978-3-642-33715-4_55

A. Gupta, A. A. Efros, and M. Hebert, Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, European Conference on Computer Vision (ECCV), p.71, 2010.
DOI : 10.1007/978-3-642-15561-1_35
URL : http://www.cs.cmu.edu/%7Eabhinavg/blocksworld/blocksworld.pdf

S. Gupta, P. A. Arbeláez, R. B. Girshick, and J. Malik, Aligning 3D models to RGB-D images of cluttered scenes, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.72, 2015.
DOI : 10.1109/CVPR.2015.7299105

S. Gupta, J. Hoffman, and J. Malik, Cross Modal Distillation for Supervision Transfer, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.72, 2016.
DOI : 10.1109/CVPR.2016.309

S. Han, H. Mao, and W. J. Dally, Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding, CoRR, abs/1510.00149, 2015.

S. Han, J. Pool, J. Tran, and W. Dally, Learning both weights and connections for efficient neural networks, Neural Information Processing Systems (NIPS), pp.1135-1143, 2015.

K. He, X. Zhang, S. Ren, and J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, European Conference on Computer Vision (ECCV), pp.346-361, 2014.

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.90

M. Hejrati and D. Ramanan, Analyzing 3D objects in cluttered images, Neural Information Processing Systems (NIPS), pp.37-93, 2012.

G. Hinton, O. Vinyals, and J. Dean, Distilling the knowledge in a neural network, NIPS Deep Learning Workshop, p.72, 2014.

J. Hoffman, S. Guadarrama, E. Tzeng, R. Hu, J. Donahue et al., LSDA: Large scale detection through adaptation, Neural Information Processing Systems (NIPS), pp.72-117, 2014.

K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks, vol.4, issue.2, pp.251-257, 1991.
DOI : 10.1016/0893-6080(91)90009-T

Q. Huang, H. Wang, and V. Koltun, Single-view reconstruction via joint analysis of image and shape collections, Proceedings of SIGGRAPH, 2015.

D. P. Huttenlocher and S. Ullman, Object recognition using alignment, International Conference on Computer Vision (ICCV), pp.36-93, 1987.

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning (ICML), pp.448-456, 2015.

M. Irani and P. Anandan, Robust multi-sensor image alignment, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), p.72, 1998.
DOI : 10.1109/ICCV.1998.710832

P. Isola, D. Zoran, D. Krishnan, and E. H. Adelson, Crisp Boundary Detection Using Pointwise Mutual Information, European Conference on Computer Vision (ECCV), p.37, 2014.
DOI : 10.1007/978-3-319-10578-9_52
URL : http://web.mit.edu/phillipi/www/publications/crisp_boundaries.pdf

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional architecture for fast feature embedding, Proceedings of the ACM International Conference on Multimedia, pp.675-678, 2014.

K. Kawaguchi, Deep learning without poor local minima, Neural Information Processing Systems (NIPS), 2016.

N. Kholgade, T. Simon, A. Efros, and Y. Sheikh, 3D object manipulation in a single photograph using stock 3D models, ACM Transactions on Graphics, vol.33, issue.4, p.127, 2014.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, pp.1097-1105, 2012.

J. Lafferty, A. Mccallum, and F. Pereira, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, 18th International Conference on Machine Learning (ICML), pp.282-289, 2001.

Y. Lecun, Une procédure d'apprentissage pour réseau à seuil asymétrique (A learning scheme for asymmetric threshold networks).

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1162/neco.1989.1.4.541

B. Leibe, A. Leonardis, and B. Schiele, An Implicit Shape Model for Combined Object Categorization and Segmentation, Workshop on statistical learning in computer vision, ECCV, p.31, 2004.
DOI : 10.1007/11957959_26

K. Lenc and A. Vedaldi, Understanding image representations by measuring their equivariance and equivalence, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298701

Y. Li, N. Snavely, D. Huttenlocher, and P. Fua, Worldwide pose estimation using 3D point clouds, European Conference on Computer Vision (ECCV), 2012.
DOI : 10.1007/978-3-642-33718-5_2
URL : https://infoscience.epfl.ch/record/201014/files/global_pose.pdf

Y. Li, H. Su, C. R. Qi, N. Fish, D. Cohen-or et al., Joint embeddings of shapes and images via CNN image purification, ACM Transactions on Graphics, vol.34, issue.6, 2015.

J. J. Lim, H. Pirsiavash, and A. Torralba, Parsing IKEA Objects: Fine Pose Estimation, 2013 IEEE International Conference on Computer Vision, pp.71-79
DOI : 10.1109/ICCV.2013.372
URL : http://people.csail.mit.edu/lim/paper/lpt_iccv2013.pdf

T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., Microsoft COCO: Common Objects in Context, European Conference on Computer Vision, pp.740-755, 2014.
DOI : 10.1007/978-3-319-10602-1_48

D. Lowe, The viewpoint consistency constraint, International Journal of Computer Vision, vol.1, issue.1, pp.57-72, 1987.

D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94
URL : http://www.cs.ubc.ca/~lowe/papers/ijcv03.ps

T. Malisiewicz, A. Gupta, and A. A. Efros, Ensemble of exemplar-SVMs for object detection and beyond, International Conference on Computer Vision (ICCV), p.71, 2011.

F. Massa, M. Aubry, and R. Marlet, Convolutional neural networks for joint object detection and pose estimation: A comparative study. arXiv preprint, p.96, 2014.

F. Massa, R. Marlet, and M. Aubry, Crafting a multi-task CNN for viewpoint estimation, Proceedings of the British Machine Vision Conference 2016, 2016.
DOI : 10.5244/C.30.91
URL : https://hal.archives-ouvertes.fr/hal-01743267

F. Massa, B. Russell, and M. Aubry, Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.648

J. L. Mundy, Object Recognition in the Geometric Era: A Retrospective, Toward Category-Level Object Recognition, pp.329-71, 2006.
DOI : 10.1007/11957959_1

R. Ortiz-cayon, A. Djelouah, F. Massa, M. Aubry, and G. Drettakis, Automatic 3D Car Model Alignment for Mixed Image-Based Rendering, 2016 Fourth International Conference on 3D Vision (3DV), p.68, 2016.
DOI : 10.1109/3DV.2016.37
URL : https://hal.archives-ouvertes.fr/hal-01368355

M. Osadchy, Y. Lecun, and M. L. Miller, Synergistic Face Detection and Pose Estimation with Energy-Based Models, The Journal of Machine Learning Research, vol.8, pp.1197-1215, 2007.
DOI : 10.1007/11957959_10

H. Penedones, R. Collobert, F. Fleuret, and D. Grangier, Improving object classification using pose information, pp.93-95, 1992.

X. Peng, K. Saenko, B. Sun, and K. Ali, Learning Deep Object Detectors from 3D Models, 2015 IEEE International Conference on Computer Vision (ICCV), pp.72-76, 2015.
DOI : 10.1109/ICCV.2015.151

B. Pepik, R. Benenson, T. Ritschel, and B. Schiele, What is holding back convnets for detection?, Pattern Recognition, pp.517-528, 2015.

B. Pepik, M. Stark, P. Gehler, and B. Schiele, Teaching 3D geometry to deformable part models, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.71-93
DOI : 10.1109/CVPR.2012.6248075

P. O. Pinheiro, R. Collobert, and P. Dollar, Learning to segment object candidates, Neural Information Processing Systems (NIPS), p.100, 2015.

M. Rhu, N. Gimelshein, J. Clemons, A. Zulfiqar, and S. W. Keckler, Virtualizing deep neural networks for memory-efficient neural network design, p.43, 2016.

L. Roberts, Machine perception of 3-D solids, pp.71-91, 1965.

A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta et al., FitNets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550, p.72, 2014.

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain., Psychological Review, vol.65, issue.6, pp.386-405, 1958.
DOI : 10.1037/h0042519

F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce, 3D object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints, International Journal of Computer Vision (IJCV), vol.66, issue.3, pp.231-259, 2006.
DOI : 10.1007/s11263-005-3674-1
URL : http://www.cs.cmu.edu/~drt/OSS/reading/match3D.pdf

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning internal representations by error propagation, pp.22-24, 1985.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet large scale visual recognition challenge, International Journal of Computer Vision (IJCV), vol.115, issue.3, pp.211-252, 2015.

H. Schneiderman and T. Kanade, A statistical method for 3D object detection applied to faces and cars, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662), pp.746-751, 2000.
DOI : 10.1109/CVPR.2000.855895

P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus et al., OverFeat: Integrated recognition, localization and detection using convolutional networks, p.34, 2013.

A. Sharif-razavian, H. Azizpour, J. Sullivan, and S. Carlsson, CNN features off-the-shelf: an astounding baseline for recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.806-813, 2014.

M. Simonovsky and N. Komodakis, OnionNet: Sharing Features in Cascaded Deep Classifiers, Proceedings of the British Machine Vision Conference 2016, p.35, 2016.
DOI : 10.5244/C.30.79

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, CoRR, abs/1409.1556, 2014.

S. Song and J. Xiao, Sliding Shapes for 3D Object Detection in Depth Images, ECCV, p.72, 2014.
DOI : 10.1007/978-3-319-10599-4_41

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research, vol.15, issue.1, pp.1929-1958, 2014.

H. Su, Q. Huang, N. Mitra, Y. Li, and L. Guibas, Estimating image depth using shape collections, ACM Transactions on Graphics, vol.33, issue.4, 2014.
URL : http://vecg.cs.ucl.ac.uk/Projects/SmartGeometry/image_shape_net/paper_docs/imageShapeNet_small_sigg14.pdf

H. Su, C. Qi, Y. Li, and L. Guibas, Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views, 2015 IEEE International Conference on Computer Vision (ICCV), pp.96-97, 2015.
DOI : 10.1109/ICCV.2015.308

H. Su, F. Wang, E. Yi, and L. J. Guibas, 3D-Assisted Feature Synthesis for Novel Views of an Object, 2015 IEEE International Conference on Computer Vision (ICCV), pp.2677-2685, 2015.
DOI : 10.1109/ICCV.2015.307

B. Sun and K. Saenko, From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains, Proceedings of the British Machine Vision Conference 2014, p.72, 2014.
DOI : 10.5244/C.28.82

A. Torralba and A. A. Efros, Unbiased look at dataset bias, CVPR 2011, p.49, 2011.
DOI : 10.1109/CVPR.2011.5995347
URL : http://people.csail.mit.edu/torralba/publications/datasets_cvpr11.pdf

S. Tulsiani and J. Malik, Viewpoints and keypoints, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.93-96
DOI : 10.1109/CVPR.2015.7298758

E. Tzeng, J. Hoffman, T. Darrell, and K. Saenko, Simultaneous Deep Transfer Across Domains and Tasks, 2015 IEEE International Conference on Computer Vision (ICCV), p.72, 2015.
DOI : 10.1109/ICCV.2015.463

J. Uijlings, K. Van-de-sande, T. Gevers, and A. Smeulders, Selective Search for Object Recognition, International Journal of Computer Vision, vol.104, issue.2, pp.154-171, 2013.
DOI : 10.1007/s11263-013-0620-5

R. Vaillant, C. Monrocq, and Y. Lecun, Original approach for the localisation of objects in images, IEE Proceedings - Vision, Image, and Signal Processing, vol.141, issue.4, pp.245-250, 1994.
DOI : 10.1049/ip-vis:19941301

K. E. Van-de-sande, J. R. Uijlings, T. Gevers, and A. W. Smeulders, Segmentation as selective search for object recognition, 2011 International Conference on Computer Vision, pp.1879-1886, 2011.
DOI : 10.1109/ICCV.2011.6126456

D. Vazquez, A. M. Lopez, J. Marin, D. Ponsa, and D. Geronimo, Virtual and Real World Adaptation for Pedestrian Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.4, p.123
DOI : 10.1109/TPAMI.2013.163

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, pp.511-542, 2001.
DOI : 10.1109/CVPR.2001.990517

Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang et al., 3D ShapeNets: A deep representation for volumetric shape modeling, International Conference on Computer Vision and Pattern Recognition (CVPR), p.38, 2015.

Y. Xiang, W. Kim, W. Chen, J. Ji, C. Choy et al., ObjectNet3D: A Large Scale Database for 3D Object Recognition, European Conference on Computer Vision (ECCV), p.39, 2016.

Y. Xiang, R. Mottaghi, and S. Savarese, Beyond PASCAL: A benchmark for 3D object detection in the wild, IEEE Winter Conference on Applications of Computer Vision, pp.76-91, 2014.
DOI : 10.1109/WACV.2014.6836101

Y. Xiang and S. Savarese, Estimating the aspect layout of object categories, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6248081

J. Xiao, B. Russell, and A. Torralba, Localizing 3D cuboids in single-view images, Neural Information Processing Systems (NIPS), p.71, 2012.

S. Xie and Z. Tu, Holistically-nested edge detection, International Conference on Computer Vision (ICCV), pp.1395-1403, 2015.
DOI : 10.1007/s11263-017-1004-z
URL : http://arxiv.org/pdf/1504.06375

J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, How transferable are features in deep neural networks?, Neural Information Processing Systems (NIPS), pp.3320-3328, 2014.

F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, ICLR, p.33, 2016.

S. Zagoruyko, A. Lerer, T. Lin, P. O. Pinheiro, S. Gross et al., A MultiPath Network for Object Detection, Proceedings of the British Machine Vision Conference 2016, p.35, 2016.
DOI : 10.5244/C.30.15

M. D. Zeiler and R. Fergus, Visualizing and Understanding Convolutional Networks, European Conference on Computer Vision (ECCV), pp.818-833, 2014.
DOI : 10.1007/978-3-319-10590-1_53
URL : http://cs.nyu.edu/%7Efergus/papers/zeilerECCV2014.pdf

M. Zia, M. Stark, B. Schiele, and K. Schindler, Detailed 3D Representations for Object Recognition and Modeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.11, 2013.
DOI : 10.1109/TPAMI.2013.87