Neural Networks

Abstract

Artificial neural networks are a form of machine learning, inspired by biological neural networks, that can be trained on examples to recognise patterns, classify data and make predictions. They have been applied to a wide variety of problems in the life sciences, such as predicting protein secondary structure, classifying protein families and identifying single nucleotide polymorphisms. Problems in the life sciences in general, and in bioinformatics in particular, pose challenges for the design and training of neural networks and for the interpretation of network outputs; these challenges include complex structured data and limited numbers of training examples. Significant progress has been made in addressing these issues, including ways to design neural networks that incorporate prior knowledge, to automate the formulation of network architecture for optimal performance and to construct hybrid systems that leverage the advantages of various learning methods.

Key Concepts:

  • Artificial neural networks are a means of mapping input data to output data so as to perform complex functions, such as pattern recognition, classification, prediction and knowledge extraction, that are increasingly useful in bioinformatics applications.

  • There are numerous kinds of neural networks, classified by their architecture (number of layers, number of processing units, interconnections, etc.) and how they are trained to perform.

  • Neural networks have been successfully applied to numerous areas of genomic and proteomic research; the number of such applications has grown explosively in the past few years.

  • One of the most difficult challenges in constructing and training neural networks is often determining the content and features of the data to be used for training, and ultimately for running, the networks; that is, feature extraction and preprocessing.

  • Hybrid systems, which combine neural networks with other artificial intelligence and statistical techniques, are a promising area of research.

Keywords: artificial intelligence; classification; encoding; sequence analysis; gene recognition

Figure 1.

A simple, multilayer neural network. The units marked f perform weighted sums and threshold functions. Input units simply pass the input data to the hidden layer. Desired output results (target values) are supplied during training for comparison with the actual network output. Weights, partially shown as W11, W12, etc., are adjusted to minimise the error function. Once trained, a network no longer requires the target values and may be used to process new information.
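
The behaviour described in this caption can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the sigmoid is assumed as a smooth threshold function, the error function is assumed to be the squared difference between output and target, and the weights are adjusted by plain gradient descent (backpropagation). The names `make_net`, `forward` and `train_step` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Smooth threshold function (an assumption; the caption names no specific one)."""
    return 1.0 / (1.0 + math.exp(-x))


def unit(inputs, weights, bias):
    """One 'f' unit: weighted sum of its inputs followed by the threshold function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def make_net(n_in, n_hidden, seed=0):
    """Create a network with one hidden layer, one output unit and random weights."""
    rng = random.Random(seed)
    return {
        "W_hidden": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "W_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(x, net):
    """Input units pass x on unchanged; hidden and output units apply `unit`."""
    hidden = [unit(x, w, b) for w, b in zip(net["W_hidden"], net["b_hidden"])]
    output = unit(hidden, net["W_out"], net["b_out"])
    return hidden, output


def train_step(x, target, net, rate=0.5):
    """One gradient-descent step on the assumed error 0.5 * (output - target)**2."""
    hidden, output = forward(x, net)
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden-layer deltas are computed from the output weights before they change.
    delta_hidden = [delta_out * net["W_out"][j] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
    for j, h in enumerate(hidden):
        net["W_out"][j] -= rate * delta_out * h
    net["b_out"] -= rate * delta_out
    for j, d in enumerate(delta_hidden):
        for i, xi in enumerate(x):
            net["W_hidden"][j][i] -= rate * d * xi
        net["b_hidden"][j] -= rate * d
    return 0.5 * (output - target) ** 2
```

The target value is needed only by `train_step`. Once training is finished, `forward(x, net)` alone processes new input, which matches the caption's remark that a trained network no longer requires target values.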

Figure 2.

Schema for applying neural networks to genome informatics: important features are extracted from DNA or protein sequences, encoded into a vector of real numbers and then presented to a neural network for training or prediction. The network outputs may be post-processed before the results are evaluated or used in a given application.
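
As a rough illustration of the encoding step in this schema, the sketch below turns a DNA sequence into a vector of real numbers in two common ways. Both encodings are assumptions chosen for illustration, since the caption does not name a specific scheme, and the function names `one_hot` and `kmer_frequencies` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"


def one_hot(sequence):
    """Encode each base as four 0/1 values, concatenated into one flat vector."""
    vector = []
    for base in sequence.upper():
        # Bases outside the alphabet (for example N) become four zeros.
        vector.extend(1.0 if base == letter else 0.0 for letter in ALPHABET)
    return vector


def kmer_frequencies(sequence, k=2):
    """Encode a sequence of any length as a fixed-length vector of k-mer frequencies."""
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(sequence) - k + 1, 0)):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows that contain ambiguous bases
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]
```

The one-hot vector grows with sequence length (four values per base), so it suits fixed-width sequence windows. The k-mer vector always has 4**k entries, which makes it usable as network input for sequences of varying length.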


Further Reading

Wu CH and McLarty JW (2000) Neural Networks and Genome Informatics. Included in series: Methods in Computational Biology and Biochemistry. Volume 1, Series Editor Konopka AK. Amsterdam: Elsevier Science.


How to Cite
Wu, Cathy H, McLarty, Jerry W, and Liao, Li (Sep 2010) Neural Networks. In: eLS. John Wiley & Sons Ltd, Chichester. http://www.els.net [doi: 10.1002/9780470015902.a0005268.pub2]