Dynamic protein classification: Adaptive models based on incremental learning strategies



dc.contributor.author Mohamed, Shakir
dc.date.accessioned 2008-03-18T09:27:41Z
dc.date.available 2008-03-18T09:27:41Z
dc.date.issued 2008-03-18T09:27:41Z
dc.identifier.uri http://hdl.handle.net/10539/4678
dc.description.abstract One of the major problems in computational biology is the inability of existing classification models to incorporate expanding and new domain knowledge. This thesis addresses the problem of static classification models by introducing incremental learning for problems in bioinformatics. The tools developed are applied to the problem of classifying proteins into a number of primary and putative families. This type of classification is particularly relevant because of its role in drug discovery programs, where it saves both cost and time. As a secondary problem, multi-class classification is also addressed. The standard approach to protein family classification is based on the creation of committees of binary classifiers. This one-vs-all approach is not ideal, and the classification systems presented here consist of classifiers that are able to do all-vs-all classification. Two incremental learning techniques are presented. The first is a novel algorithm based on the fuzzy ARTMAP classifier and an evolutionary strategy. The second technique applies the incremental learning algorithm Learn++. The two systems are tested using three datasets: data from the Structural Classification of Proteins (SCOP) database, the G-Protein Coupled Receptors (GPCR) database, and enzymes from the Protein Data Bank. The results show that the two techniques perform comparably to each other and achieve classification abilities comparable to those of single batch-trained classifiers, with the added ability of incremental learning.
Both techniques are shown to be useful for the problem of protein family classification, and they are also applicable to problems outside this area: applications in proteomics include the prediction of function and of secondary and tertiary structure, and applications in genomics include promoter and splice-site prediction and the classification of gene microarrays. en
dc.format.extent 13384281 bytes
dc.format.mimetype application/pdf
dc.language.iso en en
dc.subject bioinformatics en
dc.subject protein classification en
dc.subject neural networks en
dc.subject fuzzy ARTMAP en
dc.subject incremental learning en
dc.title Dynamic protein classification: Adaptive models based on incremental learning strategies en
dc.type Thesis en

