A New Similarity Measure among Protein Sequences

Kuen-Pin Wu; Hsin-Nan Lin; Ting-Yi Sung; Wen-Lian Hsu

doi:10.1109/CSB.2003.1227335

Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003

Year: 2003, Pages: 347

Authors

Kuen-Pin Wu, Academia Sinica
Hsin-Nan Lin, Academia Sinica
Ting-Yi Sung, Academia Sinica
Wen-Lian Hsu, Academia Sinica

Abstract

Protein sequence analysis is an important tool for decoding the logic of life. One of the most important similarity measures in this area is the edit distance between the amino acids of two sequences. We believe this criterion should be reconsidered because protein features are probably associated more with small peptide fragments than with individual amino acids. In this paper, we design small patterns that are associated with highly conserved regions among a set of protein sequences. These patterns are used analogously to index terms in information retrieval; therefore, we do not consider gaps within patterns. This new similarity measure has been applied to phylogenetic tree construction, protein clustering, and protein secondary structure prediction, and has produced promising results.
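
The abstract does not specify how the patterns are designed or scored, so the following is only a minimal sketch of the general idea of treating gapless fragments as index terms. It assumes, purely for illustration, that every contiguous fragment of length k stands in for a "pattern" and that two sequences are compared by the cosine similarity of their fragment-count vectors, as documents are in a vector-space retrieval model. The function names, the choice of k = 3, and the toy sequences are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def fragment_counts(sequence: str, k: int = 3) -> Counter:
    """Count every contiguous (gapless) length-k fragment of a sequence.

    Assumption: all k-length fragments serve as stand-in "patterns";
    the paper instead designs patterns from conserved regions.
    """
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has no fragments
    return dot / (norm_a * norm_b)


def fragment_similarity(seq1: str, seq2: str, k: int = 3) -> float:
    """Similarity of two sequences based on their shared gapless fragments."""
    return cosine_similarity(fragment_counts(seq1, k), fragment_counts(seq2, k))


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(fragment_similarity("MKVLAAGIVG", "MKVLSAGIVG"))
    print(fragment_similarity("MKVLAAGIVG", "PPPPPPPPPP"))
```

Unlike an edit distance, this comparison never aligns individual residues or opens gaps; a score above zero only means the two sequences share at least one identical fragment.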
