CDCB SNP Array Validation Process

  1. Submit a completed SNP array validation form
    • All requested information MUST be provided.
       
  2. Provide the SNP Array Full Name following this format:
    [Technology (e.g. Illumina or Affymetrix)] - [Company/Collaborator name] - [SNP array name] - [description / version # (if applicable)]
     
  3. Submit payment of the CDCB SNP array validation fees.
     
  4. Submit SNP array data that meet the following requirements:
     
    1. SNP Content:
      1. The chip must include all 195 SNPs used for parentage verification and discovery (See ICAR documentation).
      2. For data to qualify for ICAR parentage certification, at least 95% of the ICAR 195 SNPs must have genotype calls (minimum 185/195 non-missing).
      3. The chip must contain at least 3480 of the 3552 "fast discovery" SNPs in the attached list, including all of the first 96, which are labeled as critical.
      4. The chip must contain at least 350 of the 550 SNPs used for Quick Discovery Service (QDisc).
      5. The chip must include at least 10 Y SNPs for gender verification (examples provided in the attached file).
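The SNP content thresholds above can be checked before submission. The following is a minimal sketch, assuming SNPs are matched by name and that the required lists from the attached files have already been loaded into Python lists. The function name, dictionary keys, and matching-by-name approach are illustrative assumptions, not CDCB's own procedure.

```python
# Minimum number of SNPs the chip must contain from each list (from the requirements above).
REQUIREMENTS = {
    "icar_parentage": 195,   # all 195 ICAR parentage SNPs
    "fast_discovery": 3480,  # of the 3552 "fast discovery" SNPs
    "qdisc": 350,            # of the 550 QDisc SNPs
    "y_snps": 10,            # Y SNPs
}


def check_snp_content(chip_snps, required_lists, critical_snps):
    """Return {requirement: (n_found, n_needed, passed)} for a chip.

    chip_snps      -- SNP names on the new chip
    required_lists -- dict with the same keys as REQUIREMENTS, each a list of SNP names
    critical_snps  -- the first 96 "fast discovery" SNPs, all of which must be present
    """
    chip = set(chip_snps)
    report = {}
    for name, needed in REQUIREMENTS.items():
        found = len(chip & set(required_lists[name]))
        report[name] = (found, needed, found >= needed)
    critical = set(critical_snps)
    found_critical = len(chip & critical)
    report["fast_discovery_critical"] = (
        found_critical, len(critical), found_critical == len(critical)
    )
    return report
```

Each entry in the returned dictionary shows how many SNPs were found, how many are needed, and whether the requirement is met, so a failing list is easy to spot.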
         
    2. Genotype Concordance (to assess genotype call accuracy, including both male and female samples)
      Provide one of the following:
      • At least 51 genotypes from the new chip for animals already genotyped with one of the Chips Used in CDCB Evaluation that includes most of the SNPs on the new array; or
      • Genotypes from at least 51 animals for which at least one parent has been genotyped with one of the Chips Used in CDCB Evaluation that has substantial SNP overlap with the new array.
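For the first option (the same animal genotyped on both chips), concordance can be estimated as the share of identical calls over SNPs present and called on both chips. This is only a sketch of that idea: the no-call codes and the function name are assumptions, and the computation CDCB actually applies is not described here. The parent-based option would need a different check and is not covered.

```python
# Assumed no-call codes; adjust to whatever the genotype files actually use.
MISSING = {"--", "NC", ""}


def concordance(new_calls, ref_calls):
    """Fraction of shared, called SNPs with identical genotypes for one animal.

    new_calls, ref_calls -- dicts mapping SNP name to a genotype such as "AA", "AB", "BB".
    Returns None when no SNP is called on both chips.
    """
    compared = 0
    matched = 0
    for snp, g_new in new_calls.items():
        g_ref = ref_calls.get(snp)
        if g_ref is None or g_new in MISSING or g_ref in MISSING:
            continue  # SNP absent from the reference chip, or a no-call on either chip
        compared += 1
        # Sort alleles so that "AB" and "BA" count as the same genotype.
        if sorted(g_new) == sorted(g_ref):
            matched += 1
    return matched / compared if compared else None
```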
         
    3. SNP Coordinates
    4. Manifest File
      • The SNP manifest file must include flanking sequences for all SNPs to support verification and alignment of SNP positions.
      • At least 50 bp of sequence upstream and downstream of the SNP is preferred.
         
    • Submit the above data to CDCB in three files:
       
      1. Description of the SNP (SNP manifest/map file):
        Name,Index,Chr,Position,FlankingSequence
        SNP_Name_1,1,19,123456789,GCAGTGGCACCTGCTCCCTTCTTCCTAGGTGCGCTTCTGTACGCTTACTA[A/T]ATCTCGGCTACATCGGCTACAATTGCGTGTTATGCTCGAGGCTTACACCT
        SNP_Name_2,2,30,11223344,CGAGTGGAAATTGCTCACTTATGGCTAGGTGAGATTCTCTAGCCTTAGTA[C/G]CGCCTGGCTAGACTGCATAACCGGTGCGTGTTACGGTCCATTCATAGACA
        SNP_Name_3,3,5,98765432,CTTGAGCATGTCGCGAACCTCAGGCAATGTGTGACTCTTTAGTCTGTGTA[C/A]AATCTTACTAGAGGGCATAGTCGATGCGAGTCACTGTACATGCAGAGATT
        
        • All of the above columns are REQUIRED; the column order shown is preferred
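A manifest in the layout shown above can be sanity-checked with a short script. This sketch assumes a comma-separated file with the header row shown, and flanking sequences written as upstream[A/B]downstream with single-base alleles, as in the example rows. The function name is hypothetical, and the 50 bp figure is the stated preference rather than a hard requirement.

```python
import csv
import io
import re

REQUIRED_COLUMNS = ["Name", "Index", "Chr", "Position", "FlankingSequence"]
# upstream[A/B]downstream with single-base alleles, as in the example rows.
FLANK_RE = re.compile(r"^([ACGTN]+)\[([ACGT])/([ACGT])\]([ACGTN]+)$", re.IGNORECASE)


def check_manifest(text, min_flank=50):
    """Return a list of problems found in manifest CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing required columns: {', '.join(missing)}"]
    problems = []
    for row in reader:
        match = FLANK_RE.match((row["FlankingSequence"] or "").strip())
        if match is None:
            problems.append(
                f"{row['Name']}: flanking sequence not in upstream[A/B]downstream form"
            )
        elif min(len(match.group(1)), len(match.group(4))) < min_flank:
            problems.append(f"{row['Name']}: flank shorter than {min_flank} bp")
    return problems
```

Insertion/deletion markers or multi-base alleles would be reported as malformed by this pattern, so the regular expression would need widening for chips that include them.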
           
      2. Genotypes (Final Report file):
        [Header]
        Version    1.1.1
        Processing Date 14-Mar-2023 03:44:44 PM
        Content Test
        Num SNPs        4
        Total SNPs      4
        Num Samples     3
        Total Samples   3
        [Data]
                AB01234567        AB01234568        AB01234569
        SNP_Name_1        BB        AA        AB
        SNP_Name_2        AB        AB        AB
        SNP_Name_3        AA        AB        AA
        SNP_Name_4        BB        BB        BB
        
        • Matrix format is preferred (rows are SNPs, columns are samples)
          • Standard format (all data in rows) can be submitted if matrix format is not available
        • Each Sample_ID in the genotype file must match a Sample_ID in the sample sheet file
        • SNP names may be at most 44 characters long
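The matrix layout shown above can be read with a few lines of code, which also makes it easy to enforce the 44-character limit on SNP names. This is a sketch under stated assumptions: fields are separated by whitespace (tabs or spaces), names and calls contain no spaces, and the function name is hypothetical.

```python
MAX_SNP_NAME_LEN = 44  # limit stated above


def parse_matrix_report(text):
    """Parse a matrix-format Final Report into (sample_ids, {snp_name: [calls]})."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    start = lines.index("[Data]") + 1  # raises ValueError if the section is absent
    sample_ids = lines[start].split()  # first row after [Data] holds the Sample_IDs
    genotypes = {}
    for ln in lines[start + 1:]:
        snp, *calls = ln.split()
        if len(snp) > MAX_SNP_NAME_LEN:
            raise ValueError(f"{snp}: SNP name longer than {MAX_SNP_NAME_LEN} characters")
        if len(calls) != len(sample_ids):
            raise ValueError(
                f"{snp}: expected {len(sample_ids)} calls, got {len(calls)}"
            )
        genotypes[snp] = calls
    return sample_ids, genotypes
```

The returned sample IDs can then be compared against the Sample_ID column of the sample sheet.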
           
      3. Information about the submitted samples (Sample sheet file):
        [Header],,,,,,,,,,,,,,,,,,,,,,,
        Investigator Name,"Doe, John",,,,,,,,,,,,,,,,,,,,,,
        Project Name,2023031411_ABC_AB,,,,,,,,,,,,,,,,,,,,,,
        Experiment Name,,,,,,,,,,,,,,,,,,,,,,,
        Date,44916,,,,,,,,,,,,,,,,,,,,,,
        [Manifests],,,,,,,,,,,,,,,,,,,,,,,
        A,A_Dairy_Chip,,,,,,,,,,,,,,,,,,,,,,
        [Data],,,,,,,,,,,,,,,,,,,,,,,
        Sample_ID,Sample_Plate,Sample_Name,Project,AMP_Plate,Sample_Well,SentrixBarcode_A,SentrixPosition_A,Scanner,Date_Scan,Replicate,Parent1,Parent2,Gender,Sample Type
        AB01234567,123456,HOUSA000000000001,Project1,222222,E1,200000000000,R01C01,,,,,,,Tissue
        AB01234568,123456,HOUSA000000000002,Project1,222222,B2,200000000001,R01C02,,,,,,,Tissue
        AB01234569,123456,HOUSA000000000003,Project1,222222,C3,200000000002,R01C03,,,,,,,Tissue
        
        • At minimum, the Sample sheet file must contain columns for Sample_ID, Sample_Name, Barcode, and Position (SentrixBarcode_A and SentrixPosition_A in the example above)
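The sample sheet requirements can be checked in the same way, along with the rule that Sample_IDs must agree between the genotype file and the sample sheet. This sketch assumes the "Barcode" and "Position" columns are the SentrixBarcode_A and SentrixPosition_A columns of the example; both function names are hypothetical.

```python
import csv
import io

# Assumption: "Barcode" and "Position" correspond to these example column names.
MIN_COLUMNS = ["Sample_ID", "Sample_Name", "SentrixBarcode_A", "SentrixPosition_A"]


def read_sample_sheet(text):
    """Return the rows of the [Data] section as dicts, checking the minimum columns."""
    lines = [ln.strip() for ln in text.splitlines()]
    starts = [i for i, ln in enumerate(lines) if ln.startswith("[Data]")]
    if not starts:
        raise ValueError("no [Data] section found")
    data = "\n".join(ln for ln in lines[starts[0] + 1:] if ln)
    reader = csv.DictReader(io.StringIO(data))
    missing = [c for c in MIN_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    return list(reader)


def compare_sample_ids(sheet_rows, genotype_sample_ids):
    """Return (IDs only in the genotype file, IDs only in the sample sheet)."""
    sheet_ids = {row["Sample_ID"] for row in sheet_rows}
    geno_ids = set(genotype_sample_ids)
    return sorted(geno_ids - sheet_ids), sorted(sheet_ids - geno_ids)
```

Both returned lists should be empty when the two files describe the same set of samples.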

For additional details on the file formats, please see:
https://redmine.uscdcb.com/projects/cdcb-customer-service/wiki/CDCB_Accepted_genotype_file_formats

For lists of required SNPs, see the attached files in the "Files" section below.
 
