|Research Area:||Protein Structure||Year:||2009|
|Type of Publication:||In Proceedings||Keywords:||remote homologs, sequence search, protein homology|
|Book title:||The 2009 International Conference on Bioinformatics and Computational Biology (BIOCOMP 09)|
|Journal's acceptance rate (%):||27%|
Las Vegas, NV
Current biological sequence comparison tools frequently fail to recognize homologs whose sequence similarity falls in the twilight zone, below 25% sequence identity. Combining sequence properties with position-specific scoring matrices improves the accuracy of remote homology detection. This paper extends Propsearch, a sequence-property-based approach to sequence searching, by incorporating a population-adaptive genetic algorithm that uses position-specific scoring matrices in feature calculation. The genetic algorithm is trained to obtain optimized feature weights, which are then used to find homologs of a query sequence. The remote homology detector is tested on databases with less than 10%, 20%, and 30% sequence similarity, and it is compared with other sequence similarity programs in both accuracy and time complexity. Future considerations for position-specific scoring matrices based on the original genetic algorithm are also proposed.
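To make the idea of genetic-algorithm-optimized feature weights concrete, the following is a minimal sketch and not the paper's implementation. Everything in it is an assumption made for illustration: the four composition features, the function names (`features`, `weighted_distance`, `fitness`, `evolve_weights`), the fitness definition, and the fixed-size population with one-point crossover and Gaussian mutation. It does not reproduce the population-adaptive scheme, the Propsearch property set, or the features derived from position-specific scoring matrices.

```python
import random

# Hypothetical residue groupings used only for this illustration.
HYDROPHOBIC = set("AVLIMFWC")
CHARGED = set("DEKR")
POLAR = set("STNQYH")

def features(seq):
    """Map a protein sequence to a small vector of composition properties."""
    n = len(seq)
    return [
        sum(c in HYDROPHOBIC for c in seq) / n,  # hydrophobic fraction
        sum(c in CHARGED for c in seq) / n,      # charged fraction
        sum(c in POLAR for c in seq) / n,        # polar fraction
        min(n, 1000) / 1000.0,                   # length, capped and scaled
    ]

def weighted_distance(fa, fb, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, fa, fb)) ** 0.5

def fitness(weights, homolog_pairs, unrelated_pairs):
    """Mean distance of unrelated pairs minus mean distance of homolog pairs."""
    total = sum(weights) or 1.0
    w = [x / total for x in weights]  # normalize so only relative weights matter
    near = sum(weighted_distance(a, b, w) for a, b in homolog_pairs) / len(homolog_pairs)
    far = sum(weighted_distance(a, b, w) for a, b in unrelated_pairs) / len(unrelated_pairs)
    return far - near

def evolve_weights(homolog_pairs, unrelated_pairs, n_features,
                   pop_size=20, generations=40, seed=0):
    """Simple elitist genetic algorithm over weight vectors in [0, 1]."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]

    def score(w):
        return fitness(w, homolog_pairs, unrelated_pairs)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p[:cut] + q[cut:]
            i = rng.randrange(n_features)       # mutate one weight, clamp to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Under these assumptions, a search would compute `features` for the query and for every database sequence, then rank database entries by `weighted_distance` to the query using the evolved weights, with the smallest distances reported as candidate homologs.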