GLORIA — GEOMAR Library Ocean Research Information Access

hits 1 - 5 | 5 hits

Electronic Resource

Extending the trend vector: The trend matrix and sample-based partial least squares (1994)

Sheridan, Robert P. ; Nachbar, Robert B. ; Bush, Bruce L.

Springer

Journal of computer aided molecular design 8 (1994), pp. 323-340

ISSN: 1573-4951

Keywords: Atom pairs ; PLS ; SAMPLS ; Topological descriptors ; QSAR

Source: Springer Online Journal Archives 1860-2000

Topics: Chemistry and Pharmacology

Notes: Trend vector analysis [Carhart, R.E. et al., J. Chem. Inf. Comput. Sci., 25 (1985) 64], in combination with topological descriptors such as atom pairs, has proved useful in drug discovery for ranking large collections of chemical compounds in order of predicted biological activity. The compounds with the highest predicted activities, upon being tested, often show a several-fold increase in the fraction of active compounds relative to a randomly selected set. A trend vector is simply the one-dimensional array of correlations between the biological activity of interest and a set of properties or ‘descriptors’ of compounds in a training set. This paper examines two methods for generalizing the trend vector to improve the predicted rank order. The trend matrix method finds the correlations between the residuals and the simultaneous occurrence of descriptors, which are stored in a two-dimensional analog of the trend vector. The SAMPLS method derives a linear model by partial least squares (PLS), using the ‘sample-based’ formulation of PLS [Bush, B.L. and Nachbar, R.B., J. Comput.-Aided Mol. Design, 7 (1993) 587] for efficiency in treating the large number of descriptors. PLS accumulates a predictive model as a sum of linear components. Expressed as a vector of prediction coefficients on properties, the first PLS component is proportional to the trend vector. Subsequent components adjust the model toward full least squares. For both methods the residuals decrease, while the risk of overfitting the training set increases. We therefore also describe statistical checks to prevent overfitting. These methods are applied to two data sets, a small homologous series of disubstituted piperidines, tested on the dopamine receptor, and a large set of diverse chemical structures, some of which are active at the muscarinic receptor. Each data set is split into a training set and a test set, and the activities in the test set are predicted from a fit on the training set. Both the trend matrix and the SAMPLS approach improve the predictions over the simple trend vector. The SAMPLS approach is superior to the trend matrix in that it requires much less storage and CPU time. It also provides a useful set of axes for visualizing properties of the compounds. We describe a randomization method to determine the optimum number of PLS components that is very much faster for large training sets than leave-one-out cross-validation.
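
Illustration (not part of the catalog record): the abstract describes a trend vector as the array of correlations between activity and each descriptor. The sketch below follows that description only; it assumes Pearson correlation and a dot-product score for ranking, which may differ from the normalization used by Carhart et al. The function names `trend_vector` and `rank_compounds` and the toy data are hypothetical.

```python
import numpy as np

def trend_vector(X, y):
    """Correlation of activity y with each descriptor column of X.

    Assumption: plain Pearson correlation, per the abstract's wording.
    """
    Xc = X - X.mean(axis=0)          # center each descriptor
    yc = y - y.mean()                # center the activities
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # A descriptor that never varies carries no information: weight 0.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, (Xc.T @ yc) / safe, 0.0)

def rank_compounds(X_new, tv):
    """Indices of new compounds, highest predicted score first."""
    scores = X_new @ tv
    return np.argsort(-scores)

# Toy training set: descriptor 0 occurs in the active compounds,
# descriptor 1 in the inactive ones (made-up numbers).
X_train = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
y_train = np.array([1.0, 0.9, 0.1, 0.0])
tv = trend_vector(X_train, y_train)

X_test = np.array([[0, 1], [1, 0]], dtype=float)
order = rank_compounds(X_test, tv)
```

Here the second test compound, which carries the descriptor associated with activity, is ranked first.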

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00126749


Electronic Resource

A systematic search for protein signature sequences (1992)

Sheridan, Robert P. ; Venkataraghavan, R.

New York, NY : Wiley-Blackwell

Proteins: Structure, Function, and Genetics 14 (1992), pp. 16-28

ISSN: 0887-3585

Keywords: BLAST3 ; protein sequence ; sequence motif ; protein sequence database ; Chemistry ; Biochemistry and Biotechnology

Source: Wiley InterScience Backfile Collection 1832-2000

Topics: Medicine

Notes: Signature sequences are contiguous patterns of amino acids 10-50 residues long that are associated with a particular structure or function in proteins. These may be of three types (by our nomenclature): superfamily signatures, remnant homologies, and motifs. We have performed a systematic search through a database of protein sequences to automatically and preferentially find remnant homologies and motifs. This was accomplished in three steps: (1) We generated a nonredundant sequence database. (2) We used BLAST3 (Altschul and Lipman, Proc. Natl. Acad. Sci. U.S.A. 87:5509-5513, 1990) to generate local pairwise and triplet sequence alignments for every protein in the database vs. every other. (3) We selected “interesting” alignments and grouped them into clusters. We find that most of the clusters contain segments from proteins which share a common structure or function. Many of them correspond to signatures previously noted in the literature. We discuss three previously recognized motifs in detail (FAD/NAD-binding, ATP/GTP-binding, and cytochrome b5-like domains) to demonstrate how the alignments generated by our procedure are consistent with previous work and make structural and functional sense. We also discuss two signatures (for N-acetyltransferases and glycerol-phosphate binding) which to our knowledge have not been previously recognized. © 1992 Wiley-Liss, Inc.

Additional Material: 6 Ill.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1002/prot.340140105


Electronic Resource

Amino acid composition and hydrophobicity patterns of protein domains correlate with their structures (1985)

Sheridan, Robert P. ; Dixon, J. Scott ; Venkataraghavan, R. ; [et al.]

New York : Wiley-Blackwell

Biopolymers 24 (1985), pp. 1995-2023

ISSN: 0006-3525

Keywords: Chemistry ; Polymer and Materials Science

Source: Wiley InterScience Backfile Collection 1832-2000

Topics: Chemistry and Pharmacology

Notes: We examine the correlation between the sequence and tertiary structure for 212 domains from globular proteins and polypeptides. The sequence of each domain is described as a set of 25 features: the mole percent of 20 amino acids, the number of residues in the domain, and the abundance of four simple patterns in the hydrophobicity profile of the sequence. Each domain, then, is described as a location in 25-dimensional sequence-feature space. We use pattern-recognition methods to find the two axes through the 25-dimensional sequence-feature space that best discriminate, respectively, predominantly α-helix domains from predominantly β-strand domains (the “secondary structure vector,” SV) and parallel α/β domains from other domains (the “parallel vector,” PV). When we divide the domains into two categories based on whether the cysteine content is above (CYS-RICH) or below (NORMAL) 4.5%, we find the secondary structure vector for the subset of CYS-RICH domains points in a significantly different direction than the equivalent vector for the NORMAL domains. Thus, CYS-RICH and NORMAL domains are best treated separately. The secondary structure vector and the parallel vector for NORMAL domains describe statistically meaningful information, but the secondary structure vector for CYS-RICH domains may not be as reliable. We show how the secondary structure content of a NORMAL domain can be predicted by projecting the domain in the feature space onto the secondary structure vector. We subdivide the domains into five structural classes based on whether there is a parallel or mixed β-sheet in the domain and whether there are more helix or strand residues: NORMAL ALPHA, NORMAL BETA, NORMAL PARALLEL, CYS-RICH ALPHA, and CYS-RICH BETA. When we project the NORMAL domains onto the plane containing the origin of the feature space and SV and PV, we see that ALPHA, BETA, and PARALLEL domains cluster in the plane, with the BETA cluster partially overlapping the PARALLEL cluster. The separations between the clusters are such that, by looking at the location of any given NORMAL domain in the plane, we can correctly predict its structural class with 83% accuracy. CYS-RICH ALPHA and BETA domains cluster when projected onto the CYS-RICH SV vector, and the classes can be predicted with 83% accuracy, but this accuracy for CYS-RICH domains may not be statistically meaningful.

Additional Material: 2 Ill.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1002/bip.360241011


Electronic Resource

The effect of deuterium substitution on hydrogen bonds in redox proteins (1984)

Sheridan, Robert P. ; Knight, Eugene T. ; Allen, Leland C.

New York : Wiley-Blackwell

Biopolymers 23 (1984), pp. 195-200

ISSN: 0006-3525

Keywords: Chemistry ; Polymer and Materials Science

Source: Wiley InterScience Backfile Collection 1832-2000

Topics: Chemistry and Pharmacology

Additional Material: 2 Ill.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1002/bip.360230203


Electronic Resource

Hydrogen-bond cooperativity in protein secondary structure (1979)

Sheridan, Robert P. ; Lee, Richard H. ; Peters, Nancy ; [et al.]

New York : Wiley-Blackwell

Biopolymers 18 (1979), pp. 2451-2458

ISSN: 0006-3525

Keywords: Chemistry ; Polymer and Materials Science

Source: Wiley InterScience Backfile Collection 1832-2000

Topics: Chemistry and Pharmacology

Notes: Hydrogen bonding in the α-helix and β-sheet has been studied by ab initio molecular orbital calculations carried out on complexes of formamide. Hydrogen-bond geometries were taken from x-ray crystallography of polypeptides. Positive cooperativity is found in all cases. The limiting value for infinite chains is obtained by use of a double-reciprocal plot and indicates an increase in the effective bond strength of 25% over that of a single isolated bond. Parallel calculations based on a classical electrostatic model yield qualitatively similar trends but underestimate the cooperativity by half. Charge redistribution accompanying cooperativity is characterized by a new type of charge-density difference plot, the cooperativity map. The magnitude and distance over which cooperativity acts suggest several significant biological consequences. Thus the average length of α-helices and the number of β-sheet strands found in proteins may be influenced by cooperativity. Cooperativity in the interpeptide hydrogen bond may also be partly responsible for the rapid formation of secondary structure in renaturing proteins and help stabilize secondary structure relative to the random-coil conformation.
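
Illustration (not part of the catalog record): a double-reciprocal plot extrapolates to an infinite chain by fitting 1/E against 1/n and reading off the intercept at 1/n = 0. The sketch below shows that general technique; the exact quantities plotted in the paper are not stated in the abstract, so the function name `double_reciprocal_limit` and the synthetic values are assumptions chosen only to make the arithmetic checkable.

```python
import numpy as np

def double_reciprocal_limit(n, e):
    """Extrapolate e(n) to n -> infinity from a straight-line fit
    of 1/e against 1/n; the intercept equals 1/e_infinity."""
    slope, intercept = np.polyfit(1.0 / n, 1.0 / e, 1)
    return 1.0 / intercept

# Synthetic values (not from the paper), constructed to be exactly
# linear in double-reciprocal form: 1/e = 0.5 + 0.5/n, so e -> 2.
n = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
e = 1.0 / (0.5 + 0.5 / n)
limit = double_reciprocal_limit(n, e)
```

With real data the points scatter around the line, so the intercept carries a fitting error that should be reported alongside the limit.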

Additional Material: 3 Ill.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1002/bip.1979.360181006
