To systematically investigate the causal relationship between mLOY and lung cancer, we conducted a standard mendelian randomization study using the novel mLOY-associated variants reported by Wright et al.5 as instrumental variables. We first evaluated the association between genetically predicted mLOY and lung cancer risk in 3797 males (1711 case patients and 2086 controls) from our previous Nanjing Medical University lung cancer GWAS data10 and then directly investigated the association of individual copy numbers of chromosome Y with lung cancer risk in 2255 males (1445 case patients and 810 controls). Kaplan-Meier and Cox regression analyses were performed to evaluate the effect of mLOY on lung cancer prognosis in 309 patients with lung cancer treated without surgery.
Materials and Methods
We conducted a mendelian randomization study to evaluate the association between mLOY and lung cancer risk. The study population was derived from our previous lung cancer GWAS.10 This study obtained informed consent from all of the included subjects and was approved by the ethics and human subject committee of Nanjing Medical University. We included data on 3797 males (1711 case patients and 2086 controls) to examine the association between genetically predicted mLOY and lung cancer risk; the demographics of this population are described in Supplementary Table 1. First, we estimated the median of the log R ratio (LRR) of probes in the Y chromosome (mLRR-Y) in 2255 males (1445 case patients and 810 controls), drawn from the aforementioned 3797 males for whom raw array data were available, and validated the association between mLOY-related single-nucleotide polymorphisms (SNPs) and mLRR-Y in the 810 control subjects. We further studied the association between mLRR-Y and lung cancer risk directly in the 2255 males (Fig. 1). Then, we derived a weighted genetic risk score (wGRS) to predict mLOY and evaluated its association with lung cancer risk in the 3797 males. Survival analysis was conducted in 309 nonoperatively managed male patients from among the aforementioned 1711 case patients for whom follow-up and clinical information were available (see Supplementary Table 1). All case patients were males who were histopathologically or cytologically confirmed to have lung cancer by at least two local pathologists. Cancer-free control subjects were randomly selected from those receiving routine physical examinations in local hospitals or those participating in a community-based screening program for noninfectious diseases in Jiangsu Province. The details have been described in our previous work.10
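The two per-subject quantities named above, mLRR-Y and the wGRS, can be illustrated with a minimal Python sketch. It assumes the common form of a weighted score, in which each variant's effect-allele dosage (0 to 2) is multiplied by its reported effect size and the products are summed; the exact weighting used in this study is not stated in this section. The SNP identifiers, weights, and LRR values below are hypothetical placeholders, not data from the study.

```python
from statistics import median


def mlrr_y(lrr_values):
    """Median log R ratio (LRR) across Y-chromosome probes for one subject."""
    if not lrr_values:
        raise ValueError("at least one probe LRR value is required")
    return median(lrr_values)


def weighted_grs(dosages, weights):
    """Weighted genetic risk score for one subject.

    dosages: SNP id -> effect-allele dosage (0..2; imputed values may be fractional)
    weights: SNP id -> per-allele effect size on mLOY (assumed weighting scheme)
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no dosage for SNPs: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical example values (not taken from the study).
example_weights = {"snp_a": 0.5, "snp_b": 0.25, "snp_c": 1.0}
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_score = weighted_grs(example_dosages, example_weights)
example_mlrr = mlrr_y([-0.3, 0.1, -0.1])
```

In a mendelian randomization setting, a score of this kind serves as the genetic proxy for the exposure (here, mLOY) and is then tested for association with the outcome.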
[Figure 1 panel labels: 14 genetic variants associated with mLOY; weighted genetic risk score (wGRS); NJMU male control population; NJMU male case-control population]
Figure 1. Workflow of the mendelian randomization design. Abbreviations: mLOY, mosaic loss of chromosome Y; NJMU, Nanjing Medical University.
Quality Control and Imputation
All subjects included in this study were genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA), and we obtained genotype data by either direct genotyping or imputation. To obtain high-quality genotype data, we performed a standard quality control procedure with PLINK software (version 1.07, http://zzz.bwh.harvard.edu/plink/) to exclude unqualified SNPs and samples. Unqualified SNPs were those that (1) did not map to autosomal chromosomes, (2) had a call rate lower than 95%, (3) had a minor allele frequency (MAF) less than 0.05, or (4) deviated from Hardy-Weinberg equilibrium in all samples (p < 1 × 10⁻⁵). The unqualified samples (1) had a call rate lower than 95%, (2) had sex discordance, (3) were duplicates or probable relatives, (4) had an extreme heterozygosity rate, or (5) were outliers according to principal component (PC) analysis. The imputation was performed with Shapeit software (version 2, http://www.shapeit.fr/, phasing step) and IMPUTE2 software (http://mathgen.stats.ox.ac.uk/impute/impute_v2.html, imputation step), with the 1000 Genomes Project (phase III integrated variant set release, across 2504 samples [http://www.internationalgenome.org/category/phase-3/, hg19]) as the reference. A total of 3797 males (1711 case patients and 2086 controls) were used for further analysis. Poorly imputed SNPs, defined by an information measure (Is) less than 0.80 with IMPUTE2, were excluded from the analysis.
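The SNP-level exclusion criteria above can be sketched in Python as follows. This is an illustration only, not the PLINK implementation: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 copies of one allele with None for a missing call, and the Hardy-Weinberg check uses a simple 1-degree-of-freedom chi-square test, which may differ from the test PLINK applies.

```python
import math

AUTOSOMES = {str(c) for c in range(1, 23)}


def call_rate(genotypes):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(called):
    """MAF from called genotypes coded as 0, 1, or 2 allele copies."""
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def hwe_chi2_pvalue(called):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = len(called)
    observed = (called.count(0), called.count(1), called.count(2))
    q = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    p = 1 - q
    if p == 0 or q == 0:
        return 1.0  # monomorphic site: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(chrom, genotypes, min_call_rate=0.95, min_maf=0.05, hwe_p=1e-5):
    """Apply the four SNP filters: autosomal, call rate, MAF, and HWE."""
    if str(chrom) not in AUTOSOMES:
        return False
    if call_rate(genotypes) < min_call_rate:
        return False
    called = [g for g in genotypes if g is not None]
    if minor_allele_frequency(called) < min_maf:
        return False
    return hwe_chi2_pvalue(called) >= hwe_p
```

The thresholds default to the values stated in the text (95% call rate, MAF of 0.05, p of 1 × 10⁻⁵) but are left as parameters so that other cutoffs can be tried.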
Selection of mLOY-Related SNPs
Estimating Mosaic Y Chromosome Loss
We estimated the level of mLOY according to the LRR of probes in the male-specific region of chromosome Y, which is located in the 56-Mb region between pseudoautosomal regions 1 and 2 (PAR1 and PAR2, Y