Two Efficient Techniques to Find Approximate Overlaps between Sequences
Author
Haj Rachid, Maan
Source
BioMed Research International
Issue
Vol. 2017, no. 2017 (31 December 2017), pp. 1-8, 8 p.
Publisher
Hindawi Publishing Corporation
Publication Date
2017-02-15
Country of Publication
Egypt
Number of Pages
8
Main Subjects
Abstract (EN)
The next-generation sequencing (NGS) technology outputs a huge number of sequences (reads) that require further processing.
After applying prefiltering techniques in order to eliminate redundancy and to correct erroneous reads, an overlap-based assembler typically finds the longest exact suffix-prefix match between each ordered pair of the input reads.
However, another trend has been evolving for the purpose of solving an approximate version of the overlap problem.
The main benefit of this direction is the ability to skip time-consuming error-detecting techniques which are applied in the prefiltering stage.
In this work, we present and compare two techniques to solve the approximate overlap problem.
The first adapts a compact prefix tree to efficiently solve the approximate all-pairs suffix-prefix problem, while the other utilizes a well-known principle, namely, the pigeonhole principle, to identify a potential overlap match in order to ultimately solve the same problem.
Our results show that our solution using the pigeonhole principle consumes less space and time than an FM-based solution, while our prefix-tree-based solution has the lowest space consumption among all three solutions.
The number of mismatches (Hamming distance) is used to define approximate matching between strings in our work.
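The approximate suffix-prefix overlap described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the paper's techniques are built on a compact prefix tree and on indexed lookups, whereas the sketch below uses plain string comparison on a single ordered pair of reads. The function names (`hamming`, `longest_approx_overlap`, `pigeonhole_overlap`) and the parameters `k` (allowed mismatches) and `min_len` (shortest overlap considered) are hypothetical names chosen for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def longest_approx_overlap(s: str, t: str, k: int, min_len: int = 1) -> int:
    """Brute force: longest length L such that the last L characters of s
    and the first L characters of t differ in at most k positions."""
    for length in range(min(len(s), len(t)), min_len - 1, -1):
        if hamming(s[len(s) - length:], t[:length]) <= k:
            return length
    return 0


def pigeonhole_overlap(s: str, t: str, k: int, min_len: int = 1) -> int:
    """Same result as the brute force, with a pigeonhole filter.

    If two strings of equal length differ in at most k positions and are
    cut into k + 1 aligned pieces, at least one piece must match exactly.
    A candidate length is verified only when such a piece exists.
    """
    for length in range(min(len(s), len(t)), min_len - 1, -1):
        suffix, prefix = s[len(s) - length:], t[:length]
        # Boundaries of k + 1 near-equal pieces covering the whole overlap.
        bounds = [i * length // (k + 1) for i in range(k + 2)]
        has_exact_piece = any(
            suffix[a:b] == prefix[a:b] for a, b in zip(bounds, bounds[1:])
        )
        if has_exact_piece and hamming(suffix, prefix) <= k:
            return length
    return 0
```

For the reads `"AAACGTAC"` and `"CGTTCGGG"`, both functions return 5 when `k = 1`, because the suffix `CGTAC` and the prefix `CGTTC` differ in one position, and 1 when `k = 0`. In a practical setting the exact-piece test would be answered by an index over all reads rather than by direct comparison, which is where the filter saves time.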
American Psychological Association (APA) citation style
Haj Rachid, Maan. 2017. Two Efficient Techniques to Find Approximate Overlaps between Sequences. BioMed Research International, Vol. 2017, no. 2017, pp.1-8.
https://search.emarefa.net/detail/BIM-1135093
Modern Language Association (MLA) citation style
Haj Rachid, Maan. Two Efficient Techniques to Find Approximate Overlaps between Sequences. BioMed Research International No. 2017 (2017), pp.1-8.
https://search.emarefa.net/detail/BIM-1135093
American Medical Association (AMA) citation style
Haj Rachid, Maan. Two Efficient Techniques to Find Approximate Overlaps between Sequences. BioMed Research International. 2017. Vol. 2017, no. 2017, pp.1-8.
https://search.emarefa.net/detail/BIM-1135093
Data Type
Articles
Text Language
English
Notes
Includes bibliographical references
Record Number
BIM-1135093