Protein Function Motif Extraction Based on Single-Function-Category Sequence Alignment in Yeast

Document Type : Original Article

Author

College of Engineering, Modern University for Technology and Information, Cairo, Egypt.

Abstract

Protein function prediction is one of the most important problems in proteomics because it helps determine cell functions and identify diseases and their effects. Since the proteome is divided into clusters, the proteins in each cluster (group of proteins) should share common characteristics, one of which is having the same function. In this study, we attempt to extract motifs for each sub-function category of yeast proteins. The technique applies multiple sequence alignment (MSA) to all yeast protein function categories. The protein sequences are collected from several data sources, namely DIP, PIR, and SWISS-PROT, and the CLC program is used to perform the sequence alignment. The technique is applied only to proteins with a single function, because multi-function proteins can be affected by correlations between their functions. A threshold is determined for every protein function category to identify the most frequent amino acids, which serve as a feature of that category. After the algorithm is implemented, the extracted sequences are verified against proteins known to have the corresponding functions, and the results obtained are good. The technique can be regarded as a verification method for protein function prediction. A reference database table is also constructed from the extracted motifs.
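
As an illustration of the thresholding idea described in the abstract, the following minimal Python sketch derives a motif string from sequences that have already been aligned. It is not the authors' implementation: the function name `extract_motif`, the per-column frequency rule, the default threshold of 0.8, and the use of `x` for non-conserved positions are all assumptions made for this example, and the toy sequences are invented.

```python
from collections import Counter


def extract_motif(aligned, threshold=0.8, gap="-", wildcard="x"):
    """Return a consensus motif from pre-aligned sequences (hypothetical helper).

    A column keeps its most frequent residue when that residue occurs in at
    least `threshold` of the sequences; otherwise the column becomes `wildcard`.
    The thresholding rule is an assumption, not the rule used in the study.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    motif = []
    for column in zip(*aligned):
        # Count residues in this alignment column, ignoring gap characters.
        counts = Counter(res for res in column if res != gap)
        if not counts:
            motif.append(wildcard)
            continue
        residue, count = counts.most_common(1)[0]
        # Gaps still count toward the column size, so gappy columns score lower.
        motif.append(residue if count / len(column) >= threshold else wildcard)
    return "".join(motif)


# Toy alignment for one function category (invented sequences).
category_alignment = ["MKTAY", "MKSAY", "MKTAF", "MK-AY"]
motif = extract_motif(category_alignment, threshold=0.75)
```

With a threshold of 0.75, only columns where at least three of the four toy sequences agree are kept in the motif. A lower threshold would yield longer but less specific motifs, which is why a per-category threshold is a natural tuning knob.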

Keywords