Abstract
Nonlinear similarity functions are often better than linear functions at distinguishing interesting subalignments from those due to chance. Nonlinear similarity functions useful for comparing biological sequences are developed. Several new algorithms are presented for finding locally optimal subalignments of two sequences. Unlike previous algorithms, they may use any reasonable similarity function as a selection criterion. Among these algorithms are VV-1, which finds all and only the locally optimal subalignments of two sequences, and CC-1, which finds all and only the weakly locally optimal subalignments of two sequences. The VV-1 algorithm is slow and interesting only for theoretical reasons. In contrast, the CC-1 algorithm has average time complexityO(MN) when used to find only very good subalignments.
Similar content being viewed by others
Literature
Altschul, S. F. and B. W. Erickson. 1986a. “A Nonlinear Measure of Subalignment Similarity and its Significance Levels.”Bull. math. Biol. 48, 617–632.
— 1986b. “Optimal Sequence Alignment Using Affine Gap Costs.”Bull. math. Biol. 48, 603–616.
Erickson, B. W. and P. H. Sellers, 1983. “Recognition of Patterns in Genetic Sequences.” InTime Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sankoff and J. B. Kruskal (Eds), pp. 55–91. Reading, MA: Addison-Wesley.
—, L. T. May and P. B. Sehgal. 1984. “Internal Duplication in Human Alpha-1 and Beta-Interferon.”Proc. natn. Acad. Sci. U.S.A. 81, 7171–7175.
Fitch, W. M. and T. F. Smith. 1983. “Optimal Sequence Alignments.”Proc. natn. Acad. Sci. U.S.A. 80, 1382–1386.
Goad, W. B. and M. I. Kanehisa. 1982. “Pattern Recognition in Nucleic Acid Sequences. I. A General Method for Finding Local Homologies and Symmetries.”Nucl. Acids Res 10, 247–263.
Gotoh, O. 1982. “An Improved Algorithm for Matching Biological Sequences.”J. molec. Biol. 162, 705–708.
Needleman, S. B. and C. D. Wunsch. 1970. “A General Method Applicable to the Search for Similarities in the Amino Acid Sequences of Two Proteins.”J. molec. Biol. 48, 443–453.
Reich, J. G., H. Drabsch and A. Daumler. 1984. “On the Statistical Assessment of Similarities in DNA Sequences.”Nucl. Acids Res. 12, 5529–5543.
Schwartz, R. M. and M. O. Dayhoff. 1978 “Matrices for Detecting Distant Relationships.” InAtlas of Protein Sequence and Structure, Vol. 5, Suppl. 3, M. O. Dayhoff (Ed.), pp. 345–358. Washington, DC: National Biomedical Research Foundation.
Sellers, P. H. 1974. “On the Theory and Computation of Evolutionary Distances.”SIAM J. appl. Math. 26, 787–793.
—. 1984. “Pattern Recognition in Genetic Sequences by Mismatch Density.”Bull. math. Biol. 46, 501–514.
Sellers, P. H. 1986. Lecture presented at the 827th Meeting of the American Mathematical Society, Baltimore, May 1986.
Smith, T. F. and M. S. Waterman. 1981. “Comparison of Biosequences.”Adv. Appl. Math. 2, 482–489.
Taylor, P. 1984. “A Fast Homology Program for Aligning Biological Sequences.”Nucl. Acids Res.12, 447–455.
Waterman, M. S. 1984a. “General Methods of Sequence Comparison.”Bull. math. Biol. 46, 473–500.
— 1984b. “Efficient Sequence Alignment Algorithms.”J. theor. Biol. 108, 333–337.
—, T. F. Smith and W. A. Beyer. 1976. “Some Biological Sequence Metrics.”Adv. Math. 20, 367–387.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Altschul, S.F., Erickson, B.W. Locally optimal subalignments using nonlinear similarity functions. Bltn Mathcal Biology 48, 633–660 (1986). https://doi.org/10.1007/BF02462328
Accepted:
Issue Date:
DOI: https://doi.org/10.1007/BF02462328