TABLE OF CONTENTS
Acknowledgements... VI
Declaration... VII
List of abbreviations... VIII
1. INTRODUCTION... 1
2. BASIC PRINCIPLES... 3
2.1. Hereditary Colorectal Cancer... 3
2.1.1. Differential diagnosis... 4
2.1.2. Adenoma-carcinoma sequence... 5
2.1.3. Knudson two-hit hypothesis... 6
2.2. Genetics of adenomatous polyposis syndromes... 7
2.2.1. Familial adenomatous polyposis (FAP)... 7
2.2.2. MUTYH-associated polyposis (MAP)... 9
2.2.3. Mutation negative adenomatous polyposis... 10
2.3. Human genome variation... 11
2.3.1. Single nucleotide polymorphisms (SNPs)... 12
2.3.2. Tandem repeats... 12
2.3.3. Structural variations... 12
2.4. Identification of causative genes in human disease... 18
2.4.1. Homozygosity mapping... 19
2.4.2. Loss of heterozygosity (LOH) analysis... 20
2.4.3. Linkage analysis... 21
2.4.4. Genome-wide association study (GWAS)... 22
2.4.5. Copy number variation (CNV) analysis... 23
2.4.6. High-throughput sequencing... 26
2.5. Validation of functionally relevant candidate genes... 27
2.5.1. Recurrent findings... 27
2.5.2. Segregation analysis... 27
2.5.3. Gene expression in relevant target tissues... 28
2.5.4. Candidate gene approach... 28
2.5.5. Pathway enrichment analysis/network analysis... 28
2.6. Scope of the thesis... 30
3. MATERIALS AND METHODS... 31
3.1. Databases... 31
3.2. Devices... 32
3.3. Software... 33
3.4. Commercial reagents... 34
3.5. Study samples... 35
3.5.1. Initial patient cohort... 35
3.5.2. NGS validation cohort... 36
3.5.3. Heinz Nixdorf RECALL (HNR) study controls... 36
3.5.4. GWAS replication study... 37
3.6. DNA and RNA preparations... 37
3.6.1. DNA extraction using desalting method... 37
3.6.2. Formalin fixed paraffin embedded (FFPE) tissue DNA isolation... 38
3.6.3. RNA extraction using the PAXgene kit... 38
3.6.4. Determination of concentration and quality... 39
3.6.5. First-strand cDNA synthesis... 39
3.7. Polymerase Chain Reaction (PCR)... 40
3.7.1. Basic principle... 40
3.7.2. Primer design... 40
3.7.3. PCR reaction components... 41
3.7.4. Cycling step... 42
3.7.5. Agarose gel electrophoresis... 42
3.7.6. PCR product purification... 42
3.8. Sanger sequencing... 43
3.8.1. Basic principle... 43
3.8.2. Reaction components... 44
3.8.3. Cycling step... 44
3.8.4. Cycle sequencing product cleaning... 44
3.8.5. Capillary electrophoresis... 45
3.9. APC transcript analysis... 45
3.9.1. Primer design... 45
3.9.2. cDNA analysis... 46
3.9.3. Sanger sequencing... 46
3.9.4. Data analysis... 46
3.9.5. Genomic DNA analysis... 47
3.9.6. Haplotype analysis... 47
3.9.7. In-silico analysis... 47
3.10. Genome-wide SNP array hybridization... 47
3.10.1. Genotyping based on BeadArray Technology (Illumina)... 47
3.10.2. Protocol... 48
3.10.3. Bead decoding... 49
3.10.4. Quality control of raw data... 49
3.11. Identification of putative CNVs... 50
3.11.1. Final reports... 50
3.11.2. CNV calling... 50
3.12. CNV analysis... 52
3.12.1. Known candidate gene survey... 52
3.12.2. Filtering CNVs... 52
3.13. CNV validation... 57
3.13.1. Quantitative PCR (qPCR)... 57
3.13.2. CNV validation by qPCR using SYBR Green I... 57
3.13.3. Data analysis... 59
3.13.4. Copy number calculation (2^-ΔΔCT method)... 59
3.14. Co-segregation analysis... 60
3.15. Gene expression analysis... 60
3.15.1. Gene expression in human colon cDNA... 60
3.15.2. PCR and agarose gel electrophoresis... 61
3.16. Network analysis... 62
3.17. Candidate gene prioritization... 62
3.17.1. Frequency of finding... 62
3.17.2. Segregation analysis... 62
3.17.3. Data mining... 63
3.18. TaqMan gene expression analysis... 63
3.18.1. Basic principle... 63
3.18.2. Relative quantitative PCR (RT-PCR)... 64
3.18.3. Reaction components... 64
3.18.4. Cycling step... 65
3.18.5. Data analysis... 65
3.19. Targeted next generation sequencing... 66
3.19.1. Basic principle... 66
3.19.2. Library preparation, target enrichment, and sequencing... 67
3.19.3. Alignment, genotype calling, and variant annotation... 67
3.19.4. Data analysis and filter... 67
3.19.5. Validation of results... 68
3.20. Genotyping based on MassExtend Reaction (Sequenom)... 68
3.20.1. Basic principle... 68
3.20.2. Selection of the genotyped SNPs... 68
3.20.3. DNA preparation... 70
3.20.4. Assay and primer design... 70
3.20.5. PCR step... 70
3.20.6. Digestion step... 71
3.20.7. Extension primer adjustment... 71
3.20.8. Extension step... 71
3.20.9. Clean up reaction... 72
3.20.10. Dispensing DNA on a chip... 73
3.20.11. Mass spectrometry... 73
3.20.12. Data analysis... 73
3.21. TaqMan SNP genotyping/allelic discrimination... 73
3.21.1. Basic principle... 73
3.21.2. DNA preparation... 74
3.21.3. Primer and probe design... 74
3.21.4. PCR step... 74
3.21.5. Allelic discrimination data analysis... 75
4. RESULTS... 76
4.1. Transcript analysis of the APC gene... 76
4.1.1. Agarose gel electrophoresis... 76
4.1.2. Sanger sequencing of aberrant transcripts and genomic DNAs... 77
4.1.3. In-silico analysis... 80
4.1.4. Haplotype analysis... 80
4.2. CNV analysis... 82
4.2.1. Quality control of SNP array hybridization... 82
4.2.2. CNV calling... 82
4.2.3. Survey of CNV in known candidate genes... 84
4.2.4. CNV filtering... 86
4.2.5. CNV validation by qPCR using SYBR Green I... 88
4.2.6. Co-segregation analysis... 88
4.3. Candidate gene prioritization... 91
4.3.1. Genes covered by the validated CNVs... 91
4.3.2. Gene expression in human colon cDNA... 92
4.3.3. Network and pathway analysis... 93
4.3.4. Data mining... 100
4.4. TaqMan gene expression analysis of CTNNB1 and MUTYH... 104
4.5. Resequencing candidate genes... 105
4.5.1. Sanger sequencing of LZTFL1... 105
4.5.2. High throughput resequencing of remaining candidate genes... 106
4.6. Screening for somatic point mutation... 114
4.7. Replication of the GWAS performed in adenomatous polyposis... 115
5. DISCUSSION... 116
5.1. Transcript analysis of the APC gene... 116
5.2. Novel causative gene identification... 117
5.2.1. CNV analysis... 117
5.2.2. Candidate gene prioritization... 122
5.2.3. Validation of the clinical relevance of the candidate genes... 127
5.3. Limitations of the study... 130
6. SUMMARY... 131
7. OUTLOOK/PERSPECTIVE... 133
8. REFERENCES... 135
List of publications... 152
Appendices... 153