Author(s): S. Diplaris, G. Tsoumakas, P. Mitkas, I. Vlahavas.
Title: “Protein Classification with Multiple Algorithms”.
10th Panhellenic Conference on Informatics (PCI 2005), P. Bozanis and E.N. Houstis (Eds.), Springer-Verlag, LNCS 3746, pp. 448-456, Volos, Greece, 11-13 November, 2005.
Abstract: The number of protein sequences deposited in central
protein databases by labs all over the world is constantly increasing. Of
these proteins, only a fraction has been experimentally analyzed to determine
their structure and hence their function in the corresponding organism. The
reason is that experimental determination of structure is labor-intensive and
quite time-consuming. Therefore, there is a need for automated tools that can
classify new proteins into structural families. This paper presents a comparative
evaluation of several algorithms that learn such classification models from data
concerning patterns of proteins with known structure. In addition, several approaches
that combine multiple learning algorithms to increase the accuracy of
predictions are evaluated. The results of the experiments provide insights that
can help biologists and computer scientists design high-performance protein
classification systems.