Noninvasive Cancer Classification Using Diverse Genomic Features in Circulating Tumor DNA

Circulating tumor DNA (ctDNA) has the potential to revolutionize cancer care by enabling detection of somatic lesions over time, which is relevant for therapy selection, response monitoring, and early detection. Our group previously described cancer personalized profiling by deep sequencing (CAPP-Seq) [2], a ctDNA detection method targeting recurrent point mutations and structural variations within a given tumor. However, two unmet challenges for ctDNA are noninvasive histology classification and detection of copy number variations (CNVs) at low ctDNA levels. Here, we tackle both challenges as related problems. We first describe a simple CNV detection algorithm for ctDNA. To correct for systematic and biological noise, we map sequencing depth data into corresponding z-statistics derived from healthy subjects. We then estimate the parameters of a multivariate Gaussian governing the z-statistics of the regions contributing to each gene. Finally, we use the polished signal to assign a score to each gene, and CNVs are called based on predetermined performance metrics. We benchmarked performance on synthetic and empirical CAPP-Seq data, achieving 95% sensitivity and specificity at ctDNA levels of 3-5% for actionable CNVs involving ERBB2, MET, and EGFR. We also successfully detected CNVs at ctDNA levels as low as 3% in patients with non-small cell lung cancer or diffuse large B-cell lymphoma (DLBCL). Separately, we describe a Bayesian tumor histology classifier that uses prior probabilities derived from existing knowledge of CNVs, single nucleotide variants, and gene fusions in public tumor sequencing data. The prior probabilities were constructed within a maximum entropy optimization framework. When applied to noninvasive classification of DLBCL subtypes (i.e., germinal center B-cell-like (GCB) and activated B-cell-like (ABC) [1]) using pretreatment plasma, this method showed ~80% concordance with routine clinical classification (Hans algorithm).
We conclude that tumor subtype classification and CNV detection from ctDNA are feasible and robust using CAPP-Seq.
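
As an illustration of the CNV scoring steps described above, the following Python sketch converts per-region depth into z-statistics against healthy controls, estimates the covariance of those z-statistics under a zero-mean multivariate Gaussian, and combines them into a single score per gene. The function names, the shrinkage parameter, and the particular score (the sum of region z-statistics divided by its standard deviation under the fitted Gaussian) are assumptions made for illustration only; the abstract does not specify how the gene-level score is computed, and this is not the authors' implementation.

```python
import numpy as np


def region_z_scores(depth, control_depths):
    """Map a sample's per-region depth to z-statistics against healthy controls.

    depth:          shape (n_regions,), normalized depth for one gene's regions
    control_depths: shape (n_controls, n_regions), same regions in healthy subjects
    """
    mu = control_depths.mean(axis=0)
    sd = control_depths.std(axis=0, ddof=1)
    return (depth - mu) / sd


def gene_score(depth, control_depths, shrink=0.1):
    """Hypothetical gene-level score under a multivariate Gaussian null.

    The covariance of region z-statistics is estimated from controls and
    shrunk toward the identity (an assumption, to stabilize small cohorts).
    Under the null, sum(z) has variance 1' Sigma 1, so the returned score
    is approximately standard normal for a copy-neutral gene.
    """
    mu = control_depths.mean(axis=0)
    sd = control_depths.std(axis=0, ddof=1)
    control_z = (control_depths - mu) / sd

    n_regions = control_depths.shape[1]
    sigma = np.atleast_2d(np.cov(control_z, rowvar=False))
    sigma = (1.0 - shrink) * sigma + shrink * np.eye(n_regions)

    z = region_z_scores(depth, control_depths)
    ones = np.ones(n_regions)
    return z.sum() / np.sqrt(ones @ sigma @ ones)


if __name__ == "__main__":
    # Synthetic demonstration data (not from the study).
    rng = np.random.default_rng(0)
    controls = rng.normal(100.0, 5.0, size=(40, 20))
    neutral = rng.normal(100.0, 5.0, size=20)
    amplified = rng.normal(110.0, 5.0, size=20)  # 10% depth gain
    print("neutral score:  ", gene_score(neutral, controls))
    print("amplified score:", gene_score(amplified, controls))
```

In this sketch a gene would be called amplified or deleted when its score crosses a threshold chosen to meet a target sensitivity and specificity on benchmark data, which is one way to read "predetermined performance metrics".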