Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, CVPR, 2018. ,
Real-Time Adaptation to Time-Varying Constraints for Reliable mHealth Video Communications, 2017. ,
Capturing Growth: Photo Apps and Open Graph, 2012. ,
Neural Machine Translation by Jointly Learning to Align and Translate, ICLR, 2015. ,
Modélisation de contextes pour l'annotation sémantique de vidéos, 2014. ,
Multimodal Machine Learning: A Survey and Taxonomy. Pattern Analysis and Machine Intelligence, 2017. ,
Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference, ACL, 2019. ,
Understanding and Predicting Importance in Images, CVPR, 2012. ,
Michael I. Jordan. Latent Dirichlet Allocation, J. Mach. Learn. Res., vol.3, 2003. ,
End to End Learning for Self-Driving Cars, 2016. ,
Web 1T 5-gram Version 1. Linguistic Data Consortium, 2006. ,
A systematic study of the class imbalance problem in convolutional neural networks, Neural Networks, 2018. ,
HICO: A benchmark for recognizing human-object interactions in images, ICCV, 2015. ,
SMOTE: Synthetic Minority Over-sampling Technique, Journal of Artificial Intelligence Research, 2002. ,
Learning Efficient Object Detection Models with Knowledge Distillation, NIPS, 2017. ,
, Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. PAMI, 2018.
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, EMNLP, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01433235
Three models for the description of language. IRE Transactions on Information Theory, 1956. ,
Efficient Video Generation on Complex Datasets, 2019. ,
Deep Neural Networks for YouTube Recommendations, RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems, 2016. ,
Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop, CVPR, 2016. ,
Detecting Visual Relationships with Deep Relational Networks, CVPR, 2017. ,
BoxSup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation, ICCV, 2015. ISBN 9781467383912 ,
Hedging Your Bets: Optimizing Accuracy-Specificity Trade-offs in Large Scale Visual Recognition, CVPR, 2012. ,
Large-Scale Object Classification Using Label Relation Graphs, European Conference on Computer Vision, 2014. ,
, ImageNet: A large-scale hierarchical image database, IEEE Conference on Computer Vision and Pattern Recognition, pp.2-9, 2009.
Ensemble Methods in Machine Learning, International workshop on multiple classifier systems, 2000. ,
Unsupervised Visual Representation Learning by Context Prediction, ICCV, 2015. ,
Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS, 2014. ,
Brandon Tran, and Dimitris Tsipras. A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Discussion and Author Responses. Distill, vol.219 ,
The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, vol.88, pp.303-338, 2010. ,
Object Detection Meets Knowledge Graphs, IJCAI, pp.1661-1667, 2017. ,
WordNet: An Electronic Lexical Database, Bradford Books, vol.71, 1998. ,
Object Detection with Discriminatively Trained Part Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010. ,
Self-Supervised Video Representation Learning With Odd-One-Out Networks, 2017. ,
, Annals of Eugenics, 1936.
DeViSE: A Deep Visual-Semantic Embedding Model, NIPS, 2013. ,
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, arXiv, 2016. ,
Object categorization using co-occurrence, location and appearance, CVPR, 2008. ,
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection, BMVC, 2018. ,
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets, EMNLP-IJCNLP, 2019. ,
Laplacian pyramid reconstruction and refinement for semantic segmentation, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ,
Unsupervised Representation Learning by Predicting Image Rotations, ICLR, 2018. ,
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Example Researchers Need to Expand What is Meant by 'Robustness'. Distill, 2019. ,
Attentional Pooling for Action Recognition, NIPS, 2017. ,
Fast R-CNN, ICCV, 2015. ,
Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.580-587, 2014. ,
Understanding the difficulty of training deep feedforward neural networks, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010. ,
Neighbourhood Components Analysis, NIPS, 2004. ,
Self-supervised learning of visual features through embedding images into text topic spaces, CVPR, 2017. ,
Deep Learning, 2016. ,
We need to talk about standard splits, ACL, 2019. ,
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering, International Journal of Computer Vision, 2019. ,
Cross Modal Distillation for Supervision Transfer, CVPR, 2016. ,
ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning, IEEE International Joint Conference on Neural Networks, 2008. ISBN 9781424418213 ,
Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction, NIPS, 2018. ,
Distilling the Knowledge in a Neural Network, NIPS Deep Learning Workshop, 2014. ,
, Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, 1997.
Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network, CVPR, 2016. ,
Harnessing Deep Neural Networks with Logic Rules, ACL, 2016. ISBN 9781510827585 ,
Billion-scale similarity search with GPUs ,
Image Retrieval using Scene Graphs ,
Learning to Remember Rare Events, ICLR, 2017. ,
, The Kinetics Human Action Video Dataset
Stochastic Estimation of the Maximum of a Regression Function, The Annals of Mathematical Statistics, 1952. ,
Semi-supervised Classification with Graph Convolutional Networks, ICLR, 2017. ,
, Multimodal Neural Language Models. ICML, pp.595-603, 2014.
Siamese Neural Networks for One-shot Image Recognition ,
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, 2016. ,
ImageNet Classification with Deep Convolutional Neural Networks, Advances In Neural Information Processing Systems, pp.1-9, 2012. ,
HMDB: A Large Video Database for Human Motion Recognition, High Performance Computing in Science and Engineering, 2012. ,
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale, IJCV submission in review, 2018. ,
One shot learning of simple visual concepts, Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011. ,
Human-level concept learning through probabilistic program induction, Science, 2015. ,
Building Machines That Learn and Think Like People, Behavioral and Brain Sciences, 2017. ,
Learning to detect unseen object classes by between-class attribute transfer, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops, 2009. ,
Filling in a sparse training space for word sense identification, 1994. ,
ViP-CNN: Visual Phrase Guided Convolutional Neural Network, CVPR, 2017. ,
Learning from Noisy Labels with Distillation, 2017. ,
Visual Relationship Detection with Deep Structural Ranking, AAAI, 2018. ,
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection, CVPR, 2017. ,
Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation, CVPR, 2016. ISBN 9781467388504 ,
Microsoft COCO: Common Objects in Context ,
Kaiming He, and Piotr Dollár. Focal Loss for Dense Object Detection, 2017. ,
Semantic image segmentation via deep parsing network, ICCV, 2015. ISBN 9781467383912 ,
Visual relationship detection with language priors, ECCV, 2016. ISBN 9783319464473 ,
Understanding Blind People's Experiences with Computer-Generated Captions of Social Media Images, 2017. ,
Some methods for classification and analysis of multivariate observations, Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967. ,
, The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision. In ICLR, 2019.
Show and Tell More: Topic-Oriented Multi-Sentence Image Captioning, In IJCAI, 2018. ,
Deep Learning: A Critical Appraisal ,
The More You Know: Using Knowledge Graphs for Image Classification, CVPR, 2017. ,
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference, ACL, 2019. ,
Efficient Estimation of Word Representations in Vector Space, ICLR, 2013. ,
Shuffle and Learn: Unsupervised Learning using Temporal Order Verification, ECCV, 2016. ,
Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels, CVPR, 2016. ,
Moments in Time Dataset: one million videos for event understanding, 2019. ,
Learning a Natural Language Interface with Neural Programmer, 2017. ,
Pixels to Graphs by Associative Embedding, NIPS, 2017. ,
Associative Embedding: End-to-End Learning for Joint Detection and Grouping, NIPS, 2017. ,
Poincaré Embeddings for Learning Hierarchical Representations, 2017. ,
Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks, CVPR, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00911179
From Large Scale Image Categorization to Entry-Level Categories, ICCV, 2013. ,
Zero-Shot Learning with Semantic Output Codes, Advances in Neural Information Processing Systems, vol.22, 2009. ,
Context Encoders: Feature Learning by Inpainting, CVPR, 2016. ,
On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 1901. ,
Glove: Global Vectors for Word Representation, EMNLP, 2014. ,
Weakly-supervised learning of visual relations, ICCV, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01576035
Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation, ICME, 2018. ,
Learning Prototypes for Visual Relationship Detection, CBMI, 2018. ,
On the Momentum Term in Gradient Descent Learning Algorithms ,
Data Distillation: Towards Omni-Supervised Learning, CVPR, 2018. ,
Learning semantic relationships for better action retrieval in images, CVPR, 2015. ,
You Only Look Once: Unified, Real-Time Object Detection, CVPR, 2016. ,
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS, 2015. ,
Explaining the Predictions of Any Classifier, KDD, 2016. ISBN 9781450342322 ,
A Stochastic Approximation Method, Annals of Mathematical Statistics, 1951. ,
What helps where - and why? Semantic relatedness for knowledge transfer, CVPR, 2010. ISBN 9781424469840 ,
On Class Imbalance and Background Filtering in Visual Relationship Detection, 2019. ,
Modeling Relational Data with Graph Convolutional Networks, 2017. ,
FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR, 2015. ,
Learning a Distance Metric from Relative Comparisons, NIPS, 2003. ,
Training Region-based Object Detectors with Online Hard Example Mining, CVPR, 2016. ,
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science, 2018. ,
, Fracking Deep Convolutional Image Descriptors. In ICLR, 2015.
Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR, 2015. ISBN 9781450341448 ,
Prototypical Networks for Few-shot Learning, NIPS, 2017. ,
Grounded Compositional Semantics for Finding and Describing Images with Sentences, Transactions of the Association for Computational Linguistics, 2014. ,
Improved Deep Metric Learning with Multi-class N-pair Loss Objective, NIPS, 2016. ,
Deep Metric Learning via Lifted Structured Feature Embedding, CVPR, 2016. ,
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, 2012. ,
Representing General Relational Knowledge in ConceptNet 5, LREC, 2012. ,
Sequence-based prediction of protein protein interaction using a deep-learning algorithm, BMC Bioinformatics, 2017. ,
A multiple expert approach to the class imbalance problem using inverse random under sampling, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol.5519, pp.82-91, 2009. ,
Selective Search for Object Recognition, International Journal of Computer Vision, 2013. ,
Visualizing Data using t-SNE, Journal of Machine Learning Research, vol.9, 2008. ,
Extracting implicit knowledge from text, 2010. ,
Attention Is All You Need, NIPS, 2017. ,
Order-Embeddings of Images and Language, ICLR, 2016. ,
Show and tell: A neural image caption generator, CVPR, 2015. ,
Matching Networks for One Shot Learning, NIPS, 2016. ,
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, vol.99, pp.1-1, 2016. ,
Exploring Context and Visual Pattern of Relationship for Scene Graph Generation, CVPR, 2019. ,
Unsupervised Learning of Visual Representations using Videos, ICCV, 2015. ,
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs, CVPR, 2018. ,
Dueling Network Architectures for Deep Reinforcement Learning, arXiv, issue.9, pp.1-16, 2016. ,
Learning and Using the Arrow of Time, CVPR, 2018. ,
Distance Metric Learning for Large Margin Nearest Neighbor Classification, Journal of Machine Learning Research, 2009. ,
LinkNet: Relational Embedding for Scene Graph, NIPS, 2018. ,
Bernt Schiele, and Zeynep Akata. Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly, 2018. ,
Distance Metric Learning, with Application to Clustering with Side-Information, NIPS, 2002. ,
Can humans fly? Action understanding with multiple classes of actors, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.7-12, 2015. ,
Scene Graph Generation by Iterative Message Passing, CVPR, 2017. ,
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML, 2015. ,
Deep Correlation for Matching Images and Text, CVPR, 2015. ,
Graph R-CNN for Scene Graph Generation, ECCV, 2018. ,
Human action recognition by learning bases of action attributes and parts, Proceedings of the IEEE International Conference on Computer Vision, pp.1331-1338, 2011. ,
A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, CVPR, 2017. ,
Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition, ECCV, 2018. ,
Visual Relationship Detection With Internal and External Linguistic Knowledge Distillation, 2017. ,
Neural Motifs: Scene Graph Parsing with Global Context, CVPR, 2018. ,
Visual Translation Embedding Network for Visual Relation Detection, 2017. ,
, Relationship Proposal Networks. In CVPR, 2017.
Ahmed Elgammal, and Mohamed Elhoseiny. Large-Scale Visual Relationship Understanding, AAAI, 2019. ,
Graphical Contrastive Losses for Scene Graph Parsing, CVPR, 2019. ,
Colorful Image Colorization, ECCV, 2016. ,
Open Vocabulary Scene Parsing, ICCV, 2017. ,
Pyramid scene parsing network, CVPR, 2017. ISBN 9781538604571 ,
Visual relationship detection with object spatial distribution, ICME, 2017. ,
Reasoning about Object Affordances in a Knowledge Base Representation, ECCV, 2014. ,