SCOP Family Fingerprints: An Information Theoretic Approach to Structural Classification of Protein Domains

Abstract

Protein domain classification is a useful instrument for deducing the functional properties of proteins. Several databases collect domains of known structure, and SCOP is probably the most widely used. It classifies domains in a four-level hierarchy and groups sequences according to both structural similarity and phylogenetic relation. Many automatic tools have been proposed to classify domains according to the available databases. In this paper we introduce the notion of a "fingerprint" as a simple, readable digest of the similarities between a sequence and an entire set of sequences. This concept gives us a rationale for building an automatic SCOP classifier that assigns a query sequence to its most likely family. Fingerprint-based analysis has been implemented in a software tool, and we report experimental validation of it.

Publication
PROCEEDINGS IEEE INTERNATIONAL CONFERENCE OF BIOINFORMATICS AND BIOMEDICINE (WORKSHOPS)
