In:
Cancer Research, American Association for Cancer Research (AACR), Vol. 71, No. 8_Supplement (2011-04-15), p. 2742-2742
Abstract:
Several platforms for genome sequencing and methods for exon enrichment are now available and continue to evolve quickly. In this study, we compared the performance of whole exome sequencing on two instruments, Illumina's GAIIx and HiSeq 2000, and of single versus duplicate runs. Eight runs of eight samples each were performed with genomic DNA derived from whole blood and sample libraries prepared with the Agilent SureSelect capture array. One run was conducted in duplicate on the GAIIx, and, for some analyses, the combined data were used. We used TREAT (Targeted RE-sequencing and Annotation Tool), developed in-house, for data analysis, including sequence alignment (MAQ, BWA), local re-alignment (GATK), variant calling (utilizing MAQ), annotation (SIFT and Seattle Seq), and visualization. Overall, duplicate runs on the GAIIx platform (compared to a single run) increased the number of total reads by approximately two-fold: from ∼66 M to ∼130 M. For both single and duplicate runs, ∼90% of reads mapped to the reference sequence, and ∼55% of reads mapped on-target. Of importance, the percent coverage of the target region increased substantially with duplicate runs; ∼72% of the target region was covered at 40-fold or more when run in duplicate compared to only ∼52% when run once. At 30x, 20x, and 10x, there was a 1.3- (62% v. 78%), 1.2- (72% v. 84%), and 1.1-fold (84% v. 91%) increase, respectively, in the percent coverage of the target region for samples run twice. A similar number of filtered, on-target SNPs per sample (∼24 K) was found for both single and duplicate run analyses. However, a 1.3-fold increase in the number of on-target indels was seen in the duplicate run (∼1,200) compared to the single run (∼900). When the combined data for samples run twice on a GAIIx were compared to the samples run using a HiSeq, the total number of reads was similar (∼130 M), although several of the samples had substantially more reads on the HiSeq platform.
Samples run on a HiSeq had an increase in the percent coverage of the target region at 40x (72% v. 81%), 30x (78% v. 86%), 20x (84% v. 91%) and 10x (91% v. 95%) compared to the two-run approach on a GAIIx. A similar number of filtered on-target SNPs (∼24 K/sample) and indels (∼1,200) was found with the HiSeq compared to two runs on a GAIIx. In summary, our results demonstrate increases in the total number of reads and in overall coverage with the HiSeq 2000 instrument. The total number of SNPs and indels that mapped on target for the version 1 Agilent SureSelect capture array is ∼25 K and 1,200 per sample, respectively. As capture is of variable efficiency for individual runs, higher average coverage of the target region is necessary for sufficient coverage of poorly captured regions. Additional analyses comparing Agilent SureSelect versions 1 and 2 arrays are underway.
Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2-6; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2011;71(8 Suppl):Abstract nr 2742. doi:10.1158/1538-7445.AM2011-2742
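The fold increases quoted in the abstract follow from the reported percentages and counts. The short sketch below reproduces that arithmetic as a reading aid. The variable and function names are illustrative assumptions and are not part of TREAT or of the authors' analysis.

```python
# Values copied from the abstract: (single GAIIx run, duplicate GAIIx runs).
# Coverage entries are percent of the target region covered at that depth;
# the indel entry is the approximate number of on-target indels.
reported = {
    "coverage_30x": (62, 78),
    "coverage_20x": (72, 84),
    "coverage_10x": (84, 91),
    "indels": (900, 1200),
}


def fold_increase(single, duplicate):
    """Ratio of the duplicate-run value to the single-run value."""
    return duplicate / single


# Round to one decimal place, matching the precision used in the abstract.
folds = {name: round(fold_increase(single, duplicate), 1)
         for name, (single, duplicate) in reported.items()}

for name, fold in folds.items():
    print(f"{name}: {fold}-fold")
```

Rounded to one decimal place, the ratios match the 1.3-, 1.2-, and 1.1-fold coverage increases and the 1.3-fold indel increase stated in the abstract.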
Type of Medium:
Online Resource
ISSN:
0008-5472, 1538-7445
DOI:
10.1158/1538-7445.AM2011-2742
Language:
English
Publisher:
American Association for Cancer Research (AACR)
Publication Date:
2011
ZDB-ID:
2036785-5, 1432-1, 410466-3