Algorithms on Strings, Trees, and Sequences [Gusfield 1997-05-28].pdf

(4725 KB) Pobierz
Algorithms
on
Strings, Trees,
and
Sequences
COMPUTER SCIENCE AND COMPUTATIONAL
BIOLOGY
Dan
Gusfield
University of
California,
Davis
C
AMBRIDGE
UNIVERSITY
PRESS
Contents
Preface
xiii
I
Exact String Matching: The Fundamental String Problem
1
Exact Matching: Fundamental Preprocessing and First Algorithms
1.1
1.2
1.3
1.4
1.5
1.6
The naive method
The preprocessing approach
Fundamental preprocessing of the pattern
Fundamental preprocessing in linear time
The simplest linear-time exact matching algorithm
Exercises
2
Exact Matching: Classical Comparison-Based Methods
2.1
2.2
2.3
2.4
2.5
Introduction
The Boyer-Moore Algorithm
The Knuth-Morris-Pratt algorithm
Real- time string matching
Exercises
3
Exact Matching: A Deeper Look at Classical Methods
3.1
3.2
3.3
3.4
3.5
3.6
3.7
A Boyer-Moore variant with a "simple" linear time bound
Cole's linear worst-case bound for Boyer-Moore
The original preprocessing for Knuth-Momis-Pratt
Exact matching with
a
set of patterns
Three applications of exact set matching
Regular expression pattern matching
Exercises
4
Seminumerical String Matching
4.1
4.2
4.3
4.4
4.5
Arithmetic versus comparison-based methods
The
Shift-And
method
The match-count problem and Fast Fourier Transform
Karp-Rabin fingerprint methods for exact match
Exercises
CONTENTS
8.10
8.11
For the purists:
how
to avoid bit-level operations
Exercises
9
More Applications of Suffix Trees
Longest common extension: a bridge to inexact matching
Finding all maximal palindromes in linear time
Exact matching with wild cards
The k-mismatch problem
Approximate palindromes and repeats
Faster methods for tandem repeats
A linear-time solution to the multiple common substring-problem
Exercises
III
Inexact Matching, Sequence Alignment, Dynamic Programming
10 The Importance of (Sub)sequence Comparison in Molecular Biology
11
Core String Edits, Alignments, and Dynamic Programming
Introduction
The edit distance between two strings
Dynamic programming calculation of edit distance
Edit graphs
Weighted edit distance
String similarity
Local alignment: finding substrings of high similarity
Gaps
Exercises
12
Refining Core String Edits and Alignments
12.1
12.2
12.3
12.4
12.5
12.6
12.7
12.8
Computing alignments in only linear space
Faster algorithms when the number of differences are bounded
Exclusion methods: fast expected running time
Yet more suffix trees and more hybrid dynamic programming
A faster
(combinatorial)
algorithm for longest common subsequence
Convex gap weights
The Four-Russians speedup
Exercises
13
Extending the Core Problems
13.1
13.2
13.3
13.4
Parametric sequence alignment
Computing suboptimal alignments
Chaining diverse local alignments
Exercises
14
Multiple String Comparison
-
The Holy Grail
14.1
14.2
14.3
Why multiple string comparison?
Three "big-picture" biological uses for
multiple
string comparison
Family and superfamily representation
Zgłoś jeśli naruszono regulamin