Stretch Profile: A pruning technique to accelerate DNA sequence search
Document Title
Stretch Profile: A pruning technique to accelerate DNA sequence search
Author
Khitmoh N., Smanchat S., Tongsima S.
Affiliations
Faculty of Information Technology and Digital Innovation, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand; National Biobank of Thailand, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathum Thani, Thailand
Type
Article
Source Title
Informatics in Medicine Unlocked
ISSN
2352-9148
Year
2020
Volume
19
Open Access
Gold
Publisher
Elsevier Ltd
DOI
10.1016/j.imu.2020.100323
Format
Abstract
DNA sequence similarity search is widely used by scientists to support biological research. As more sequences are added over the years, sequence databases keep growing. Existing sequence search techniques usually focus on improving the search algorithms themselves to make the search faster, overlooking the possibility of excluding unrelated sequences from the search. This paper presents a pruning technique that accelerates DNA sequence search using a novel Stretch Profile created from stretches of consecutive base characters: A-Stretch, C-Stretch, G-Stretch, and T-Stretch. The Stretch Profile is pre-generated for each sequence in a sequence database. During the search, the Stretch Profile of the query sequence is generated for comparison. Database sequences whose profiles do not match the Stretch Profile of the query sequence are excluded from the search, which reduces the search space and, consequently, the search time. For evaluation, we compare the sequences retrieved from the Greengenes database, and the processing time, when using BLAST alone and when using BLAST with the proposed pruning technique. The results show that the proposed pruning technique can reduce the search time by 30.43% to 63.74%, depending on the length of the input query, while maintaining a sensitivity of 1.00 relative to the results of the original BLAST search. © 2020 The Authors
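The pruning idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the Stretch Profile or the matching rule in detail, so everything below is an assumption made for illustration: the profile is taken to be the longest run of each base (A, C, G, T), and a database sequence is kept only if each of its longest runs is at least as long as the corresponding run in the query. The function names `stretch_profile`, `build_index`, and `prune` are hypothetical and do not come from the paper.

```python
from itertools import groupby

BASES = "ACGT"


def stretch_profile(seq):
    """Longest run of each base in seq (assumed profile definition)."""
    profile = {base: 0 for base in BASES}
    for base, run in groupby(seq.upper()):
        if base in profile:  # ignore ambiguity codes such as N
            profile[base] = max(profile[base], len(list(run)))
    return profile


def build_index(sequences):
    """Pre-generate a profile for every database sequence (id -> profile)."""
    return {seq_id: stretch_profile(seq) for seq_id, seq in sequences.items()}


def prune(index, query):
    """Return ids whose profile could accommodate the query's stretches.

    Assumed matching rule: every longest run in the database sequence must
    be at least as long as the query's longest run of the same base.
    """
    query_profile = stretch_profile(query)
    return [
        seq_id
        for seq_id, profile in index.items()
        if all(profile[base] >= query_profile[base] for base in BASES)
    ]


if __name__ == "__main__":
    database = {"s1": "AAACGGTT", "s2": "ACGTACGT"}
    index = build_index(database)
    # Only the surviving ids would then be passed to the actual search tool.
    print(prune(index, "AAACG"))
```

Under this assumed rule, the surviving identifiers would be handed to the search tool (BLAST in the paper's evaluation), so the expensive alignment runs on a smaller candidate set. A rule this strict only suits exact containment; a similarity search that tolerates mismatches would need a looser comparison, and the paper's actual matching criterion may differ.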
Keyword
Pruning | Sequence profiling | Sequence retrieval | Sequence search
License
CC BY-NC-ND
Rights
Author
Publication Source
Scopus