H. Alwassel, D. Mahajan, L. Torresani, B. Ghanem, D. Tran et al., Self-supervised learning by cross-modal audio-video clustering, 2019.

R. Arandjelovic and A. Zisserman, Look, listen and learn, 2017.

I. Armeni, S. Sax, A. R. Zamir, and S. Savarese, Joint 2d-3d-semantic data for indoor scene understanding, 2017.

V. Badrinarayanan, A. Kendall, and R. Cipolla, Segnet: A deep convolutional encoder-decoder architecture for image segmentation, 2017.

J. Behley, M. Garbade, A. Milioto, J. Quenzel, S. Behnke et al., SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences, ICCV, 2019.

M. Bojarski, D. Del-testa, D. Dworakowski, B. Firner, B. Flepp et al., End to end learning for self-driving cars, 2016.

H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong et al., nuScenes: A multimodal dataset for autonomous driving, 2019.

A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang et al., Shapenet: An information-rich 3d model repository, 2015.

C. Chen, A. Seff, A. Kornhauser, and J. Xiao, Deepdriving: Learning affordance for direct perception in autonomous driving, 2015.

D. Chen, B. Zhou, V. Koltun, and P. Krähenbühl, Learning by cheating, 2019.

L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs, 2018.

L. Chen, G. Papandreou, F. Schroff, and H. Adam, Rethinking atrous convolution for semantic image segmentation, 2017.

X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, Multi-view 3d object detection network for autonomous driving, 2017.

Y. Chen, B. Yang, M. Liang, and R. Urtasun, Learning joint 2D-3D representations for depth completion, 2019.

X. Cheng, P. Wang, C. Guan, and R. Yang, CSPN++: Learning context and resource aware convolutional spatial propagation networks for depth completion, 2019.

X. Cheng, P. Wang, and R. Yang, Depth estimation via affinity learned with convolutional spatial propagation network, 2018.

H. Chiang, Y. Lin, Y. Liu, and W. H. Hsu, A unified point-based framework for 3D segmentation, 2019.

C. Choy, J. Gwak, and S. Savarese, 4D spatio-temporal convnets: Minkowski convolutional neural networks, 2019.

Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, 3D U-net: Learning dense volumetric segmentation from sparse annotation, MICCAI, 2016.

F. Codevilla, M. Miiller, A. López, V. Koltun, and A. Dosovitskiy, End-to-end driving via conditional imitation learning, 2018.

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler et al., The cityscapes dataset for semantic urban scene understanding, CVPR, 2016.

W. Dabney, G. Ostrovski, D. Silver, and R. Munos, Implicit quantile networks for distributional reinforcement learning, 2018.

A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser et al., ScanNet: Richly-annotated 3D reconstructions of indoor scenes, 2017.

A. Dai, C. Diller, and M. Nießner, Sg-nn: Sparse generative neural networks for self-supervised scene completion of rgb-d scans, 2019.

A. Dai and M. Nießner, 3dmv: Joint 3d-multi-view prediction for 3d semantic scene segmentation, 2018.

A. Dai, D. Ritchie, M. Bokeloh, S. Reed, J. Sturm et al., Scancomplete: Large-scale scene completion and semantic segmentation for 3d scans, CVPR, 2018.

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., Imagenet: A large-scale hierarchical image database, CVPR, 2009.

T. Devries and G. W. Taylor, Improved regularization of convolutional neural networks with cutout, 2017.

C. Doersch, A. Gupta, and A. A. Efros, Unsupervised visual representation learning by context prediction, 2015.

A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, Carla: An open urban driving simulator, 2017.

Y. Duan, X. Chen, R. Houthooft, J. Schulman, and P. Abbeel, Benchmarking deep reinforcement learning for continuous control, 2016.

A. A. Efros and T. K. Leung, Texture synthesis by non-parametric sampling, 1999.

D. Eigen, C. Puhrsch, and R. Fergus, Depth map prediction from a single image using a multi-scale deep network, NeurIPS, 2014.

A. Eldesokey, M. Felsberg, and F. S. Khan, Confidence propagation through CNNs for guided sparse depth regression, 2019.

Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle et al., Domain-adversarial training of neural networks, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01624607

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, Vision meets robotics: The kitti dataset, 2013.

J. Geyer, Y. Kassahun, M. Mahmudi, X. Ricou, R. Durgesh et al., A2D2: AEV autonomous driving dataset, 2019.

C. Godard, O. Mac-aodha, and G. J. Brostow, Unsupervised monocular depth estimation with left-right consistency, 2017.

C. Godard, O. Mac-aodha, M. Firman, and G. J. Brostow, Digging into self-supervised monocular depth estimation, 2019.

I. Goodfellow, J. Pouget-abadie, M. Mirza, B. Xu, D. Warde-farley et al., Generative adversarial nets, NIPS, 2014.

B. Graham, M. Engelcke, and L. Van-der-maaten, 3D semantic segmentation with submanifold sparse convolutional networks, 2018.

B. Graham and L. Van-der-maaten, Submanifold sparse convolutional networks, 2017.

F. Groh, P. Wieschollek, and H. P. Lensch, Flex-convolution (million-scale point-cloud learning beyond grid-worlds), 2018.

S. Gu, E. Holly, T. Lillicrap, and S. Levine, Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2017.

S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, Continuous deep Q-learning with model-based acceleration, 2016.

S. Gupta, R. Girshick, P. Arbeláez, and J. Malik, Learning rich features from rgb-d images for object detection and segmentation, 2014.

T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor, 2018.

T. Hackel, N. Savinov, L. Ladicky, J. D. Wegner, K. Schindler et al., SEMANTIC3D.NET: A new large-scale point cloud classification benchmark, 2017.

M. W. Hancock and B. Wright, A policy on geometric design of highways and streets. The American Association of State Highway and Transportation Officials, 2001.

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, Fusenet: Incorporating depth into semantic segmentation via fusion-based cnn architecture, 2016.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, 2016.

P. Hermosilla, T. Ritschel, P. Vázquez, À. Vinacua, and T. Ropinski, Monte carlo convolution for learning on non-uniformly sampled point clouds, SIGGRAPH Asia, 2018.

M. Hessel, J. Modayil, H. Van-hasselt, T. Schaul, G. Ostrovski et al., Rainbow: Combining improvements in deep reinforcement learning, AAAI, 2018.

G. Hinton, O. Vinyals, and J. Dean, Distilling the knowledge in a neural network, NIPS Workshop, 2014.

J. Hoffman, E. Tzeng, T. Park, J. Zhu, P. Isola et al., CyCADA: Cycle-consistent adversarial domain adaptation, 2018.

J. Hoffman, D. Wang, F. Yu, and T. Darrell, FCNs in the wild: Pixel-level adversarial and constraint-based adaptation, 2016.

J. Huang, H. Zhang, L. Yi, T. Funkhouser, M. Nießner et al., Texturenet: Consistent local parametrizations for learning from high-resolution signals on meshes, CVPR, 2019.

M. Jaritz, R. De-charette, M. Toromanoff, E. Perot, and F. Nashashibi, End-to-end race driving with deep reinforcement learning, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01848067

M. Jaritz, R. De-charette, E. Wirbel, X. Perrotton, and F. Nashashibi, Sparse and dense data with CNNs: depth completion and semantic segmentation, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01858241

M. Jaritz, J. Gu, and H. Su, Multi-view PointNet for 3D scene understanding, ICCV Workshop, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02387461

M. Jaritz, T. Vu, R. De-charette, E. Wirbel, and P. Pérez, xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D semantic segmentation, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02388974

L. Jiang, H. Zhao, S. Liu, X. Shen, C. Fu et al., Hierarchical point-edge interaction network for point cloud semantic segmentation, CVPR, 2019.

D. Kalashnikov, A. Irpan, P. Pastor, J. Ibarz, A. Herzog et al., QT-Opt: Scalable deep reinforcement learning for vision-based robotic manipulation, CoRL, 2018.

K. Kamnitsas, C. Ledig, V. F. Newcombe, J. P. Simpson, A. D. Kane et al., Efficient multi-scale 3d cnn with fully connected crf for accurate brain lesion segmentation, 2017.

M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaśkowski, Vizdoom: A doom-based ai research platform for visual reinforcement learning, 2016.

A. Kendall, V. Badrinarayanan, and R. Cipolla, Bayesian segnet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding, 2017.

A. Kendall, J. Hawke, D. Janz, P. Mazur, D. Reda et al., Learning to drive in a day, ICRA, 2019.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, 2012.

J. Ku, A. Harakeh, and S. L. Waslander, In defense of classical image processing: Fast depth completion on the cpu, 2018.

Y. Kuznietsov, J. Stückler, and B. Leibe, Semi-supervised deep learning for monocular depth map prediction, 2017.

I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab, Deeper depth prediction with fully convolutional residual networks, 2016.

L. Landrieu and M. Simonovsky, Large-scale point cloud semantic segmentation with superpoint graphs, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01801186

B. Lau, Using Keras and Deep Deterministic Policy Gradient to play TORCS, 2016.

D. Lee, Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks, ICML Workshop, 2013.

K. Lee, G. Ros, J. Li, and A. Gaidon, Spigan: Privileged adversarial learning from simulation, 2019.

S. Levine, C. Finn, T. Darrell, and P. Abbeel, End-to-end training of deep visuomotor policies, 2016.

S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection, 2018.

Y. Li, R. Bu, M. Sun, W. Wu, X. Di et al., Pointcnn: Convolution on x-transformed points, NeurIPS, 2018.

Y. Li, L. Yuan, and N. Vasconcelos, Bidirectional learning for domain adaptation of semantic segmentation, 2019.

M. Liang, B. Yang, S. Wang, and R. Urtasun, Deep continuous fusion for multi-sensor 3D object detection, 2018.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, 2016.

M. Liu, T. Breuel, and J. Kautz, Unsupervised image-to-image translation networks, 2017.

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015.

M. Long, Y. Cao, J. Wang, and M. Jordan, Learning transferable features with deep adaptation networks, 2015.

W. Luo, Y. Li, R. Urtasun, and R. Zemel, Understanding the effective receptive field in deep convolutional neural networks, 2016.

F. Ma, G. V. Cavalheiro, and S. Karaman, Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera, ICRA, 2019.

F. Ma and S. Karaman, Sparse-to-dense: Depth prediction from sparse depth samples and a single image, 2018.

J. Mairal, G. Sapiro, and M. Elad, Learning multiscale sparse representations for image and video restoration, Multiscale Modeling & Simulation, 2008.

V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap et al., Asynchronous methods for deep reinforcement learning, ICML, 2016.

V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou et al., Playing Atari with deep reinforcement learning, 2013.

M. Montemerlo, J. Becker, S. Bhat, H. Dahlkamp, D. Dolgov et al., Junior: The stanford entry in the urban challenge, Journal of Field Robotics, 2008.

P. Morerio, J. Cavazza, and V. Murino, Minimal-entropy correlation alignment for unsupervised deep domain adaptation, 2018.

Z. Murez, S. Kolouri, D. Kriegman, R. Ramamoorthi, and K. Kim, Image to image translation for domain adaptation, 2018.

E. Perot, M. Jaritz, M. Toromanoff, and R. De-charette, End-to-end driving in a realistic racing game with deep reinforcement learning, CVPR Workshop, 2017.

F. Pizzati, R. De-charette, M. Zaccaria, and P. Cerri, Domain bridge for unpaired image-to-image translation and unsupervised domain adaptation, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02436218

D. A. Pomerleau, ALVINN: An autonomous land vehicle in a neural network, NIPS, 1989.

C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, Frustum pointnets for 3d object detection from rgb-d data, CVPR, 2018.

C. R. Qi, H. Su, K. Mo, and L. J. Guibas, Pointnet: Deep learning on point sets for 3d classification and segmentation, 2017.

C. R. Qi, L. Yi, H. Su, and L. J. Guibas, Pointnet++: Deep hierarchical feature learning on point sets in a metric space, 2017.

J. Qiu, Z. Cui, Y. Zhang, X. Zhang, S. Liu et al., Deeplidar: Deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image, CVPR, 2019.

M. Ren, A. Pokrovsky, B. Yang, and R. Urtasun, Sbnet: Sparse blocks network for fast inference, 2018.

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, 2015.

G. Riegler, A. O. Ulusoy, and A. Geiger, Octnet: Learning deep 3d representations at high resolutions, 2017.

O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, 2015.

G. Ros, L. Sellart, J. Materzynska, D. Vazquez, and A. M. Lopez, The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes, CVPR, 2016.

J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, Trust region policy optimization, 2015.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal policy optimization algorithms, 2017.

S. Shah, D. Dey, C. Lovett, and A. Kapoor, Airsim: High-fidelity visual and physical simulation for autonomous vehicles, Field and service robotics, 2018.

S. Shi, X. Wang, and H. Li, Pointrcnn: 3d object proposal generation and detection from point cloud, 2019.

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, Indoor segmentation and support inference from rgbd images, 2012.

D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre et al., Mastering the game of go with deep neural networks and tree search, Nature, 2016.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2015.

S. Song, F. Yu, A. Zeng, A. X. Chang, M. Savva et al., Semantic scene completion from a single depth image, 2017.

J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, Striving for simplicity: The all convolutional net, 2015.

H. Su, V. Jampani, D. Sun, S. Maji, E. Kalogerakis et al., SPLATNet: Sparse lattice networks for point cloud processing, CVPR, 2018.

B. Sun and K. Saenko, Deep coral: Correlation alignment for deep domain adaptation, 2016.

Z. Sun, G. Bebis, and R. Miller, On-road vehicle detection: A review, 2006.

R. S. Sutton and A. G. Barto, Introduction to reinforcement learning, 1998.

J. Tang, F. Tian, W. Feng, J. Li, and P. Tan, Learning guided convolutional network for depth completion, 2019.

H. Thomas, C. R. Qi, J. Deschaud, B. Marcotegui, F. Goulette et al., KPConv: Flexible and deformable convolution for point clouds, ICCV, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02310026

J. Tian, W. Cheung, N. Glaser, Y. Liu, and Z. Kira, Uno: Uncertainty-aware noisy-or multimodal fusion for unanticipated input degradation, 2020.

Y. Tian, D. Krishnan, and P. Isola, Contrastive multiview coding, 2019.

E. Todorov, T. Erez, and Y. Tassa, Mujoco: A physics engine for model-based control, IROS 2012, 2012.

M. Toromanoff, E. Wirbel, and F. Moutarde, End-to-end model-free reinforcement learning for urban driving using implicit affordances, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02513566

Y. Tsai, W. Hung, S. Schulter, K. Sohn, M. Yang et al., Learning to adapt structured output space for semantic segmentation, CVPR, 2018.

E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, and T. Darrell, Deep domain confusion: Maximizing for domain invariance, 2014.

J. Uhrig, N. Schneider, L. Schneider, U. Franke, T. Brox et al., Sparsity invariant CNNs, 2017.

B. Ummenhofer, H. Zhou, J. Uhrig, N. Mayer, E. Ilg et al., Demon: Depth and motion network for learning monocular stereo, 2017.

C. Urmson, J. Anhalt, D. Bagnell, C. Baker, R. Bittner et al., Autonomous driving in urban environments: Boss and the urban challenge, Journal of Field Robotics, 2008.

A. Valada, R. Mohan, and W. Burgard, Self-supervised model adaptation for multimodal semantic segmentation, 2019.

A. Valada, J. Vertens, A. Dhall, and W. Burgard, Adapnet: Adaptive semantic segmentation in adverse environmental conditions, 2017.

W. Van-gansbeke, D. Neven, B. De-brabandere, and L. Van-gool, Sparse and noisy lidar completion with rgb guidance and uncertainty, 2019.

N. Verma, E. Boyer, and J. Verbeek, Feastnet: Feature-steered graph convolutions for 3D shape analysis, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01540389

O. Vinyals, I. Babuschkin, J. Chung, M. Mathieu, M. Jaderberg et al., AlphaStar: Mastering the Real-Time Strategy Game StarCraft II, 2019.

T. Vu, H. Jain, M. Bucher, M. Cord, and P. Pérez, Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation, CVPR, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01942465

T. Vu, H. Jain, M. Bucher, M. Cord, and P. Pérez, DADA: Depth-aware domain adaptation in semantic segmentation, 2019.

S. Wang, S. Suo, W. Ma, A. Pokrovsky, and R. Urtasun, Deep parametric continuous convolutional neural networks, 2018.

Y. Wang, W. Chao, D. Garg, B. Hariharan, M. Campbell et al., Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving, CVPR, 2019.

Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein et al., Dynamic graph cnn for learning on point clouds, Transactions on Graphics, 2019.

C. J. Watkins and P. Dayan, Q-learning, Machine learning, 1992.

R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine learning, 1992.

B. Wu, A. Wan, X. Yue, and K. Keutzer, Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3D LiDAR point cloud, 2018.

B. Wu, X. Zhou, S. Zhao, X. Yue, and K. Keutzer, Squeezesegv2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud, ICRA, 2019.

W. Wu, Z. Qi, and L. Fuxin, PointConv: Deep convolutional networks on 3D point clouds, 2019.

Z. Wu, X. Han, Y. Lin, M. Gokhan-uzunbas, T. Goldstein et al., DCAN: Dual channel-wise alignment networks for unsupervised scene adaptation, ECCV, 2018.

B. Wymann, E. Espié, C. Guionneau, C. Dimitrakakis, R. Coulom et al., Torcs, the open racing car simulator, 2000.

Y. Xu, T. Fan, M. Xu, L. Zeng, and Y. Qiao, Spidercnn: Deep learning on point sets with parameterized convolutional filters, 2018.

Y. Xu, X. Zhu, J. Shi, G. Zhang, H. Bao et al., Depth completion from sparse lidar data with depth-normal constraints, 2019.

Y. Yang, A. Wong, and S. Soatto, Dense depth posterior (DDP) from single image and sparse range, CVPR, 2019.

Y. You, Y. Wang, W. Chao, D. Garg et al., Pseudo-LiDAR++: Accurate depth for 3D object detection in autonomous driving, 2020.

J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu et al., Generative image inpainting with contextual attention, CVPR, 2018.

Y. Zhang, P. David, and B. Gong, Curriculum domain adaptation for semantic segmentation of urban scenes, 2017.

Y. Zhang and T. Funkhouser, Deep depth completion of a single rgb-d image, 2018.

Y. Zhang, T. Xiang, T. M. Hospedales, and H. Lu, Deep mutual learning, 2018.

H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, Pyramid scene parsing network, 2017.

T. Zhou, M. Brown, N. Snavely, and D. G. Lowe, Unsupervised learning of depth and ego-motion from video, 2017.

B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, Learning transferable architectures for scalable image recognition, 2018.

Y. Zou, Z. Yu, X. Liu, B. V. Kumar, and J. Wang, Confidence regularized self-training, 2019.

In this thesis, we address the challenges of annotation scarcity and of fusing heterogeneous data such as 3D point clouds and 2D images.

First, we propose an end-to-end approach, trained to translate the sensor input (a camera image) directly into vehicle controls, which makes the approach independent of annotations in the visual domain. We use deep reinforcement learning, where the algorithm learns from a reward obtained through interaction with a realistic simulator. We propose new training strategies and reward functions for better driving and faster convergence. However, training time remains high, which is why we focus on perception in the rest of this thesis, studying the fusion of 3D point clouds and 2D images.
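The reinforcement learning setup described above (an agent that learns driving controls only from a scalar reward obtained by interacting with a simulator) can be sketched minimally. Everything below is a hypothetical toy: `ToyDrivingEnv`, `run_episode`, and the `corrective` policy are illustrations of the interaction loop, not the thesis's actual simulator, network, or algorithm.

```python
import random

class ToyDrivingEnv:
    """Hypothetical stand-in for a realistic driving simulator: the agent
    chooses a steering action and is rewarded for staying near the lane center."""
    def __init__(self):
        self.position = 0.0  # lateral offset from the lane center

    def step(self, action):
        # action in {-1, 0, +1}: steer left, keep straight, steer right
        self.position += 0.5 * action + random.uniform(-0.1, 0.1)
        reward = 1.0 - min(abs(self.position), 1.0)  # higher near the center
        return self.position, reward

def run_episode(policy, steps=50):
    """Run one episode: the agent only ever sees observations and rewards,
    never labels -- this is what makes the approach annotation-free."""
    env = ToyDrivingEnv()
    obs, total = 0.0, 0.0
    for _ in range(steps):
        obs, reward = env.step(policy(obs))
        total += reward
    return total

# A simple corrective policy: always steer back toward the lane center.
corrective = lambda obs: -1 if obs > 0 else (1 if obs < 0 else 0)
```

A learning algorithm would replace the fixed `corrective` policy by one whose parameters are updated to increase the episode return `total`.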

We propose two different methods for 2D-3D fusion. First, we project 3D LiDAR point clouds into the 2D image space, resulting in sparse depth maps, and we propose a new encoder-decoder architecture that fuses image and depth information for the task of depth map completion, thereby improving the resolution of the point cloud projected into image space. Second, we fuse directly in 3D space to avoid the information loss caused by the projection. To this end, we compute image features from multiple views with a 2D CNN and project them into a global 3D point cloud, where they are fused with the 3D information. This enriched point cloud then serves as input to a "point-based" network.
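The first step of the image-space fusion (projecting 3D LiDAR points into the 2D image plane to obtain a sparse depth map) can be sketched with a standard pinhole camera model. The function name `project_to_sparse_depth` and the intrinsics `K` are assumptions for illustration, not the thesis's implementation:

```python
import numpy as np

def project_to_sparse_depth(points_cam, K, height, width):
    """Project 3D points (already in camera coordinates, z pointing forward)
    into the image plane, producing a sparse depth map: most pixels remain
    empty (0), which is why completion networks fuse in the dense RGB image.
    points_cam: (N, 3) array; K: (3, 3) camera intrinsics matrix."""
    z = points_cam[:, 2]
    pts = points_cam[z > 0]            # keep only points in front of the camera
    uv = (K @ pts.T).T                 # pinhole projection to homogeneous coords
    u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
    v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
    inside = (0 <= u) & (u < width) & (0 <= v) & (v < height)
    depth = np.zeros((height, width), dtype=np.float32)
    depth[v[inside], u[inside]] = pts[inside, 2]   # store metric depth at each pixel
    return depth
```

The second fusion method goes the other way: 2D CNN features are looked up at these projected pixel locations and attached to the 3D points instead of being rasterized into an image.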

Finally, we introduce the new task of cross-modal unsupervised domain adaptation, in which one has access to multi-sensor data in an annotated source dataset and an unannotated target dataset. We propose a 2D-3D cross-modal learning method, based on mutual mimicking between the image and point-cloud networks, to address the source-target domain gap. We further show that our method […]