[...] of model; VI - Modeling from cyclic graphs; VI.1 - Transformation of arbitrary graphs into trees

Methodology in Q[...]; I - Construction of the graph machines associated with molecules ([...] possible descriptor)

II - Selection of the training set examples; III.1 - Structure of the node function; III.2 - Special case: graph machines for classification; III.3 - Model selection; CHAPTER 4 - Examples of modeling molecular properties and activities with graph machines; I - Prediction of molecular properties; I.2 - Prediction of molecular descriptors

II - Prediction of molecular activities; II.1 - Toxicity of various molecules to a living organism, Pimephales promelas; II.2 - Prediction of the agonist activity of ecdysteroid derivatives

A. Crum-Brown and T. Fraser, On the connection between chemical constitution and physiological action, Transactions of the Royal Society of Edinburgh, vol.25, 1868-69.

C. Hansch, A. Leo, and D. Hoekman, Exploring QSAR: hydrophobic, electronic and steric constants, 1995.

H. Wiener, Structural Determination of Paraffin Boiling Points, Journal of the American Chemical Society, vol.69, issue.1, pp.17-20, 1947.
DOI : 10.1021/ja01193a005

M. Randić, Characterization of molecular branching, Journal of the American Chemical Society, vol.97, issue.23, pp.6609-6614, 1975.
DOI : 10.1021/ja00856a001

L. B. Kier and L. H. Hall, Molecular connectivity in chemistry and drug research, 1976.

A. Balaban, Highly discriminating distance-based topological index, Chemical Physics Letters, vol.89, issue.5, pp.399-404, 1982.
DOI : 10.1016/0009-2614(82)80009-2

T. W. Heritage, EVA: A novel theoretical descriptor for QSAR studies, Perspectives in Drug Discovery and Design, vol.9-11, 1998.

J. H. Schuur, P. Selzer, and J. Gasteiger, The Coding of the Three-Dimensional Structure of Molecules by Molecular Transforms and Its Application to Structure-Spectra Correlations and Studies of Biological Activity, Journal of Chemical Information and Computer Sciences, vol.36, issue.2, pp.334-344, 1996.
DOI : 10.1021/ci950164c

I. Jolliffe, Principal Component Analysis.

H. Martens and T. Naes, Multivariate calibration.

Estimation of principal components and related models by iterative least squares, in Multivariate Analysis, Krishnaiah (ed.), pp.391-420, 1966.

A. Höskuldsson, PLS regression methods, pp.211-228.

Ranking a random feature for variable and feature selection.

H. Stoppiglia, Méthodes statistiques de sélection de modèles neuronaux ; applications financières et bancaires [online thesis].

Réseaux de neurones, méthodologie et applications, Paris: Eyrolles, 2nd edition.

W. S. McCulloch and W. Pitts, A logical calculus of ideas immanent in nervous activity, 1943.

S. Geman, E. Bienenstock, and R. Doursat, Neural Networks and the Bias/Variance Dilemma, Neural Computation, vol.4, issue.1, pp.1-58, 1992.
DOI : 10.1162/neco.1992.4.1.1

Y. Bengio and Y. Grandvalet, No unbiased estimator of the variance of K-fold cross-validation, Journal of Machine Learning Research, vol.5, pp.1089-1105, 2004.

V. N. Vapnik, The nature of statistical learning theory, 1995.

G. Monari, Sélection de modèles non linéaires par leave-one-out : étude théorique et application des réseaux de neurones au procédé de soudage par points [online thesis], 1999.

G. Monari and G. Dreyfus, Local Overfitting Control via Leverages, Neural Computation, vol.14, issue.6, pp.1481-1506, 2002.
DOI : 10.1162/089976698300017610

URL : https://hal.archives-ouvertes.fr/hal-00922198

V. N. Vapnik, Estimation of dependences based on empirical data.

V. Vapnik and A. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and its Applications, vol.16, p.264, 1971.

B. E. Boser, I. M. Guyon, and V. N. Vapnik, A training algorithm for optimal margin classifiers, Proceedings of the fifth annual workshop on Computational learning theory, COLT '92, pp.144-152, 1992.
DOI : 10.1145/130385.130401

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.1189

H. W. Kuhn and A. W. Tucker, Nonlinear programming, Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pp.481-492, 1951.

O. Chapelle, Choosing multiple parameters for support vector machines, Machine Learning, pp.131-159, 2002.

R. Burbidge, Drug design by machine learning: support vector machines for pharmaceutical data analysis, Computers & Chemistry, vol.26, issue.1, pp.5-14, 2001.
DOI : 10.1016/S0097-8485(01)00094-8

A. Micheli, F. Portera, and A. Sperduti, QSAR/QSPR Studies by Kernel Machines, Recursive Neural Networks and Their Integration, 14th Italian Workshop on Neural Nets, WIRN, pp.308-315, 2003.
DOI : 10.1007/978-3-540-45216-4_35

E. Byvatov, Comparison of support vector machine and artificial neural network systems for drug/nondrug classification, Journal of Chemical Information and Computer Sciences, vol.43, issue.6, p.1882, 2003.

K. R. Müller, Classifying 'Drug-likeness' with Kernel-Based Learning Methods, Journal of Chemical Information and Modeling, vol.45, issue.2, pp.249-253, 2005.
DOI : 10.1021/ci049737o

X. J. Yao, Comparative Study of QSAR/QSPR Correlations Using Support Vector Machines, Radial Basis Function Neural Networks, and Multiple Linear Regression, Journal of Chemical Information and Computer Sciences, vol.44, issue.4, pp.1257-1266, 2004.
DOI : 10.1021/ci049965i

P. Lind and T. Maltseva, Support Vector Machines for the Estimation of Aqueous Solubility, Journal of Chemical Information and Computer Sciences, vol.43, issue.6, p.1855, 2003.
DOI : 10.1021/ci034107s

T. Gärtner, A survey of kernels for structured data, ACM SIGKDD Explorations Newsletter, vol.5, issue.1, pp.49-58, 2003.
DOI : 10.1145/959242.959248

P. Mahé, Graph Kernels for Molecular Structure-Activity Relationship Analysis with Support Vector Machines, Journal of Chemical Information and Modeling, vol.45, issue.4, pp.939-951, 2005.
DOI : 10.1021/ci050039t

H. Kashima, K. Tsuda, and A. Inokuchi, Marginalized kernels between labeled graphs, Twentieth International Conference on Machine Learning, pp.321-328, 2003.

T. Jaakkola, M. Diekhans, and D. Haussler, A Discriminative Framework for Detecting Remote Protein Homologies, Journal of Computational Biology, vol.7, issue.1-2, pp.95-114, 2000.
DOI : 10.1089/10665270050081405

C. Leslie, E. Eskin, and W. S. Noble, The Spectrum Kernel: A String Kernel for SVM Protein Classification, Biocomputing 2002, pp.564-575, 2002.
DOI : 10.1142/9789812799623_0053

C. Ding and I. Dubchak, Multi-class protein fold recognition using support vector machines and neural networks, Bioinformatics, vol.17, issue.4, pp.349-358, 2001.
DOI : 10.1093/bioinformatics/17.4.349

J. Vert, H. Saigo, and T. Akutsu, Local alignment kernels for biological sequences, Kernel Methods in Computational Biology, pp.131-154, 2004.

D. Haussler, Convolution kernels on discrete structures, Technical report UCSC-CRL-99-10, 1999.

K. Tsuda, T. Kin, and K. Asai, Marginalized kernels for biological sequences, Bioinformatics, vol.18, p.268.

P. Mahé, Extensions of marginalized graph kernels, Twenty-first International Conference on Machine Learning, 2004.

From Hopfield nets to recursive networks to graph machines: numerical machine learning for structured data, pp.298-334.

A. Leo, Calculation of hydrophobic constant (log P) from π and f constants, Journal of Medicinal Chemistry, vol.18, issue.9, pp.865-910, 1975.
DOI : 10.1021/jm00243a001

J. W. Jalowka and T. Daubert, Group contribution method to predict critical temperature and pressure of hydrocarbons, Industrial & Engineering Chemistry Process Design and Development, vol.25, issue.1, p.139, 1986.
DOI : 10.1021/i200032a021

T. E. Daubert and R. Bartakovits, Prediction of critical temperature and pressure of organic compounds by group contribution, Industrial & Engineering Chemistry Research, vol.28, issue.5, p.638, 1989.
DOI : 10.1021/ie00089a023

R. D. Cramer, D. E. Patterson, and J. D. Bunce, Comparative Molecular Field Analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins, vol.110, p.5959.

J. Sadowski and J. Gasteiger, From atoms and bonds to three-dimensional atomic coordinates: automatic model builders.

T. Kohonen, Self-organization and associative memory, 1984.

A general framework for adaptive processing of data structures, IEEE Transactions on Neural Networks.

A. Goulon, Predicting activities without computing descriptors: graph machines for QSAR, SAR and QSAR in Environmental Research, vol.18, issue.1-2, pp.141-153, 2007.

A. Goulon, A. Duprat, and G. Dreyfus, Graph Machines and Their Applications to Computer-Aided Drug Design: A New Approach to Learning from Structured Data, Lecture Notes in Computer Science, pp.1-19, 2006.
DOI : 10.1007/11839132_1

C. Berge, Graphes, Paris: Bordas, 3rd edition, 1983.

Adaptive graphical pattern recognition for the classification of company logos, vol.34, p.2049.

M. Gori, M. Maggini, and L. Sarti, A recursive neural network model for processing directed acyclic graphs with labeled edges, Proceedings of the International Joint Conference on Neural Networks, pp.1351-1355, 2003.
DOI : 10.1109/IJCNN.2003.1223892

T. Joachims, Text categorization with Support Vector Machines: Learning with many relevant features, ECML-98, 10th European Conference on Machine Learning, 1998.
DOI : 10.1007/BFb0026683

M. Collins and N. Duffy, New ranking algorithms for parsing and tagging, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL '02, pp.263-270, 2002.
DOI : 10.3115/1073083.1073128

S. Menchetti, Wide coverage natural language processing using kernel methods and neural networks for structured data, Pattern Recognition Letters, vol.26, issue.12, p.1896, 2005.
DOI : 10.1016/j.patrec.2005.03.011

L. Denoyer and P. Gallinari, Bayesian network model for semi-structured document classification, Information Processing & Management, vol.40, issue.5, pp.807-827, 2004.
DOI : 10.1016/j.ipm.2004.04.009

URL : https://hal.archives-ouvertes.fr/hal-01172241

B. Piwowarski and P. Gallinari, A Machine Learning Model for Information Retrieval with Structured Documents, Machine Learning and Data Mining in Pattern Recognition, pp.425-438, 2003.
DOI : 10.1007/3-540-45065-3_37

J. Pollack, Recursive distributed representations, Artificial Intelligence, vol.46, issue.1-2, pp.77-106, 1990.
DOI : 10.1016/0004-3702(90)90005-K

A. Sperduti, Labelling Recursive Auto-associative Memory, Connection Science, vol.6, issue.4, pp.429-459, 1994.

C. Goller and A. Küchler, Learning task-dependent distributed structure representations by backpropagation through structure, IEEE International Conference on Neural Networks, pp.347-352, 1996.

B. Hammer, On the approximation capability of recurrent neural networks.

B. Hammer, Recurrent networks for structured data - A unifying approach and its properties.

B. Hammer, Learning with recurrent neural networks, in Lecture Notes in Control and Information Sciences.

C. Jochum and J. Gasteiger, Canonical Numbering and Constitutional Symmetry, Journal of Chemical Information and Modeling, vol.17, issue.2, pp.113-117, 1977.
DOI : 10.1021/ci60010a014

D. Weininger, SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules, Journal of Chemical Information and Modeling, vol.28, issue.1, pp.31-36, 1988.
DOI : 10.1021/ci00057a005

D. Weininger, A. Weininger, and J. L. Weininger, SMILES. 2. Algorithm for generation of unique SMILES notation, Journal of Chemical Information and Modeling, vol.29, issue.2, pp.97-101, 1989.
DOI : 10.1021/ci00062a008

M. T. Cronin, Comparative assessment of methods to develop QSARs for the prediction of the toxicity of phenols to Tetrahymena pyriformis, Chemosphere, vol.49, issue.10, 2002.
DOI : 10.1016/S0045-6535(02)00508-8

ACD/LogP, Advanced Chemistry Development.

W. M. Meylan and P. Howard, Atom/fragment contribution method for estimating octanol-water partition coefficients, Journal of Pharmaceutical Sciences, vol.84, issue.1, p.83, 1995.

KOWWIN software, SRC-LOGKOW version, Syracuse Research Corporation.

C. Hansch and A. J. Leo, Substituent constants for correlation analysis in chemistry and biology, 1979.

Reliable assessment of logP of compounds of pharmaceutical relevance, SAR and QSAR in Environmental Research.

J. Devillers, Simulating lipophilicity of organic molecules with a backpropagation neural network, Journal of Pharmaceutical Sciences.

A. Breindl, B. Beck, and T. Clark, Prediction of the n-Octanol/Water Partition Coefficient, logP, Using a Combination of Semiempirical MO-Calculations and a Neural Network, Journal of Molecular Modeling, vol.3, issue.3, pp.142-155, 1997.
DOI : 10.1007/s008940050027

A. F. Duprat, T. Huynh, and G. Dreyfus, Toward a Principled Methodology for Neural Network Design and Performance Evaluation in QSAR. Application to the Prediction of LogP, Journal of Chemical Information and Computer Sciences, vol.38, issue.4, p.586, 1998.
DOI : 10.1021/ci980042v

N. Bodor, Z. Gabanyi, and C. Wong, A new method for the estimation of partition coefficient, Journal of the American Chemical Society, vol.111, issue.11, pp.3783-3786, 1989.
DOI : 10.1021/ja00193a003

S. Cabani, Group contributions to the thermodynamic properties of non-ionic organic solutes in dilute aqueous solution, Journal of Solution Chemistry, vol.10, issue.8, p.563, 1981.
DOI : 10.1007/BF00646936

L. Bernazzani, Predicting Physical-Chemical Properties of Compounds from Molecular Structures by Recursive Neural Networks, pp.2030-2042, 2006.
DOI : 10.1021/ci060104e

Physical Properties Database [online], Syracuse Research Corporation (SRC).

A. Micheli, A novel approach to QSPR/QSAR based on neural networks for structures, in Soft Computing Approaches in Chemistry, H. Cartwright and Sztandera (eds.), pp.265-296, 2003.

C. Rücker, M. Meringer, and A. Kerber, QSPR Using MOLGEN-QSPR: The Example of Haloalkane Boiling Points, Journal of Chemical Information and Computer Sciences, vol.44, issue.6, pp.2070-2076, 2004.
DOI : 10.1021/ci049802u

D. T. Stanton, Development of a Quantitative Structure-Property Relationship Model for Estimating Normal Boiling Points of Small Multifunctional Organic Molecules, Journal of Chemical Information and Computer Sciences, vol.40, issue.1, pp.81-90, 2000.
DOI : 10.1021/ci990311x

M. Shamsipur, Highly correlating distance/connectivity-based topological indices, Journal of Molecular Graphics and Modelling, vol.27, issue.4, pp.882-910, 2005.
DOI : 10.1016/j.jmgm.2008.09.005

X. Chen, Prediction of aqueous solubility of organic compounds using a quantitative structure-property relationship, Journal of Pharmaceutical Sciences, vol.91, issue.8, p.1838, 2002.
DOI : 10.1002/jps.10178

P. D. Mosier and P. C. Jurs, QSAR/QSPR Studies Using Probabilistic Neural Networks and Generalized Regression Neural Networks, Journal of Chemical Information and Computer Sciences, vol.42, issue.6, p.1460, 2002.
DOI : 10.1021/ci020039i

D. Mackay, W. Y. Shiu, and K. C. Ma, Illustrated handbook of physical-chemical properties and environmental fate for organic chemicals, Boca Raton, 1992.

A. R. Katritzky, QSPR studies on vapor pressure, aqueous solubility, and the prediction of water-air partition coefficients, Journal of Chemical Information and Computer Sciences, vol.38, p.720.

H. E. McClelland and P. C. Jurs, Quantitative structure-property relationships for the prediction of vapor pressures of organic compounds from molecular structures, Journal of Chemical Information and Computer Sciences, vol.40, p.967, 2000.

C. K. Liang, D. A. Gallagher, and R. G. Clements, QSPR Prediction of Vapor Pressure from Solely Theoretically-Derived Descriptors, Journal of Chemical Information and Computer Sciences, vol.38, issue.2, pp.321-324, 1998.
DOI : 10.1021/ci970289c

M. Pintore, Predicting Toxicity against the Fathead Minnow by Adaptive Fuzzy Partition, QSAR & Combinatorial Science, vol.22, issue.2, pp.210-219, 2003.
DOI : 10.1002/qsar.200390014

K. L. Kaiser and S. P. Niculescu, Using probabilistic neural networks to model the toxicity of chemicals to the fathead minnow (Pimephales promelas): A study based on 865 compounds, Chemosphere, vol.38, issue.14, p.3231, 1999.
DOI : 10.1016/S0045-6535(99)00553-6

C. L. Russom, Predicting modes of toxic action from chemical structure: acute toxicity in the fathead minnow (Pimephales promelas), Environmental Toxicology and Chemistry, vol.16, issue.5, p.948, 1997.

L. Dinan, R. E. Hormann, and T. Fujimoto, An extensive ecdysteroid CoMFA, Journal of Computer-Aided Molecular Design, vol.13, issue.2, pp.185-207, 1999.
DOI : 10.1023/A:1008052320014

M. Ravi, 4D-QSAR Analysis of a Set of Ecdysteroids and a Comparison to CoMFA Modeling, Journal of Chemical Information and Computer Sciences, vol.41, issue.6, p.1587, 2001.
DOI : 10.1021/ci010076u

R. E. Hormann, L. Dinan, and P. Whiting, Superimposition evaluation of ecdysteroid agonist chemotypes through multidimensional QSAR, Journal of Computer-Aided Molecular Design, vol.17, p.135, 2003.

S. J. Swamidass, Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity, Bioinformatics, vol.21, issue.Suppl 1, pp.359-368, 2005.
DOI : 10.1093/bioinformatics/bti1055

K. C. Nicolaou, Molecular Design and Chemical Synthesis of a Highly Potent Epothilone, ChemMedChem, vol.1, issue.1, pp.41-44, 2006.
DOI : 10.1002/cmdc.200500056

Thus, for the molecule 3-methylbut-2-en-1-ol, the equivalence classes are as follows.

Atoms of the class [...]: none of the criteria can tell them apart, so the numbers are assigned at random. This gives the following numbering: N(b) = 1; N(a) = 2; N(f) = 3; N(e) = [...]

A. Goulon, A. Duprat, and G. Dreyfus, From Hopfield nets to recursive networks to graph machines: numerical machine learning for structured data, Theoretical Computer Science, vol.344, issue.2-3, pp.298-334, 2005.

A. Goulon, A. Duprat, and G. Dreyfus, Learning numbers from graphs, International Symposium on Applied Stochastic Models and Data Analysis (ASMDA), 2005.

A. Goulon, A. Duprat, and G. Dreyfus, Graph Machines and Their Applications to Computer-Aided Drug Design: A New Approach to Learning from Structured Data, Lecture Notes in Computer Science, pp.1-19, 2006.
DOI : 10.1007/11839132_1

A. Goulon, T. Picot, A. Duprat, and G. Dreyfus, Predicting activities without computing descriptors: graph machines for QSAR, SAR and QSAR in Environmental Research, vol.18, issue.1-2, pp.141-153, 2007.