Genomic evaluations format (CSV)

Introduction

Effective March 2019, all genomic evaluations, both public and private, across CDCB systems will officially be distributed in this new pipe-delimited format.
The "old" CSV and XML formats will be discontinued in September 2019. Between March and August 2019, CDCB will release both formats.

In short:
  • Genomic evaluations will be distributed in two files named "infoANIM" and "infoEVAL"
    • "infoEVAL" contains all the trait genomic evaluations (e.g. HO_young_Pub_infoEVAL_1903.csv)
    • "infoANIM" contains information related to the animal that is not a trait genomic evaluation. (e.g. HO_young_Pub_infoANIM_1903.csv)
  • Character encoding of these files is UTF-8.

Public files content information

While private files contain all male and female animals each nominator is entitled to receive, public files include subsets of male animals with specific characteristics.

  • Weekly evaluations
    • [BB]_young_Pub_[YYYYMMDD].zip (where [BB]=Breed code and [YYYYMMDD] is date format): Male non-reference (for yield) animals considered “publishable”. Excludes embryos.
  • Monthly evaluations
    • [BB]_New_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals receiving evaluations for the first time, that are considered “publishable”. New animals are identified by comparing against the previous run, by breed of evaluation. Excludes embryos.
  • Triannual evaluations
    • IMPORTANT: During triannual evaluations the files are released compressed, encrypted and password-protected. The password to open the files is supplied at 7 a.m. on release day, in the file ftp://ftp.uscdcb.com/pub/password
    • [BB]_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals considered “publishable”. Excludes embryos.
    • [BB]_New_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals receiving evaluations for the first time, that are considered “publishable”. New animals are identified by comparing against the previous run, by breed of evaluation. Excludes embryos.
    • [BB]_all_evaluated_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male reference (for yield) animals considered “publishable”.
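
The naming templates above can be checked mechanically. The following is a minimal sketch, not an official CDCB tool: the function name `parse_public_zip_name` is hypothetical, and the pattern assumes that [BB] is always two uppercase letters (as in "HO"), which the templates suggest but do not state.

```python
import re

# Assumption: breed codes are exactly two uppercase letters (e.g. "HO").
# The three file kinds and the two date widths come from the templates above.
PUBLIC_ZIP = re.compile(
    r"^(?P<breed>[A-Z]{2})_"
    r"(?P<kind>New_young_Pub|young_Pub|all_evaluated)_"
    r"(?P<date>\d{8}|\d{4})\.zip$"
)


def parse_public_zip_name(name):
    """Split a public archive name into breed, kind and date parts."""
    match = PUBLIC_ZIP.match(name)
    if match is None:
        raise ValueError(f"unrecognized public file name: {name!r}")
    date = match.group("date")
    return {
        "breed": match.group("breed"),
        "kind": match.group("kind"),
        "date": date,
        # Eight digits are YYYYMMDD (weekly); four digits are YYMM.
        "date_format": "YYYYMMDD" if len(date) == 8 else "YYMM",
    }


print(parse_public_zip_name("HO_young_Pub_1903.zip"))
```

Note that a YYMM date alone does not distinguish a monthly release from a triannual one, so the sketch reports only the date format and not the run type.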

InfoEVAL files

Examples

  • Example filenames (BB = Breed code, XXX = stud number or file type, [YYYYMMDD/YYMM] = date format: YYYYMMDD for weekly, YYMM for monthly)
    • BB_studXXX_infoEVAL_[YYYYMMDD/YYMM].csv [contains only evaluation data, pipe delimited, multiple rows per animal]
    • BB_young_Pub_infoEVAL_[YYYYMMDD/YYMM].csv [contains only evaluation data, pipe delimited, multiple rows per animal]
  • Example file format (NOTE: Extra spaces are added here for clarity. The actual file does not contain any blank spaces.)
    #TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:408|FIELDS:45
    ID17             |TRAIT |GPTA |GREL|GSONS|DGV |PA
    HOUSA000000000001|NM    |-161 |64  |-146 |-166|-599
    HOUSA000000000001|MILK  |-358 |71  |-327 |-341|-574
    HOUSA000000000001|FAT   |-17  |69  |-17  |-18 |-30
    HOUSA000000000001|PRO   |-3   |73  |-4   |-5  |-14
    HOUSA000000000001|FATPCT|-0.02|71  |-0.02|    |-0.04
    [...]
    

Format description

  • Separator is pipe "|"
  • All animals have the same # of rows and columns.
  • Missing values are empty fields (no blanks)
  • ROW #1 is the “validation row”, containing information about the file: type (EVAL), run (monthly/weekly), date (YYMM or YYYYMMDD), anim (# of animals in file), fields (# of rows per animal)
    Example:
    #TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:408|FIELDS:45
    • The above example shows the first line of a typical infoEVAL file.
      • TYPE is EVAL (meaning this is an infoEVAL file),
      • RUN is [monthly/weekly] (meaning this is a monthly or weekly evaluation file)
      • DATE is the evaluation date (format YYMM for monthly evaluations, YYYYMMDD for weekly evaluations)
      • ANIM is the number of animals in the file (for validation purposes)
      • FIELDS is the number of fields (rows) per animal.
  • ROW #2 is the header:
    Field name  Description                                                 Reference
    ID17        ID of the animal in 17 bytes                                4, 119
    TRAIT       Trait name list (see reference for full names and formats)  350
    GPTA        Genomic PTA (previously “GenPTA”)                           351
    GREL        Genomic Reliability (previously “GenRel”)                   350
    GSONS       Genomic PTA for sons (previously “GenSons”)                 350
    DGV         Direct Genomic Value                                        350
    PA          Parent Average                                              350, 352
  • ROW #3 onwards is the animal specific information.
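
The layout described above (validation row, header row, then one row per animal and trait) can be read and cross-checked with the standard library. This is a minimal sketch under stated assumptions: the function names `parse_validation_row` and `read_info_eval` are hypothetical, empty fields are mapped to `None`, and the sample reuses two rows from the example above with ANIM and FIELDS reduced to match.

```python
import csv


def parse_validation_row(line):
    """Turn '#TYPE:EVAL|RUN:monthly|...' into a dict of strings."""
    if not line.startswith("#"):
        raise ValueError("validation row must start with '#'")
    return dict(item.split(":", 1) for item in line[1:].strip().split("|"))


def read_info_eval(text):
    """Return (meta, records) and verify the ANIM and FIELDS counts."""
    lines = text.splitlines()
    meta = parse_validation_row(lines[0])
    # Row 2 is the header; csv.DictReader uses it for the field names.
    reader = csv.DictReader(lines[1:], delimiter="|", quoting=csv.QUOTE_NONE)
    records = []
    rows_per_animal = {}
    for row in reader:
        # Missing values are empty fields; represent them as None.
        record = {key: (value if value != "" else None)
                  for key, value in row.items()}
        records.append(record)
        animal = record["ID17"]
        rows_per_animal[animal] = rows_per_animal.get(animal, 0) + 1
    if len(rows_per_animal) != int(meta["ANIM"]):
        raise ValueError("animal count does not match ANIM")
    expected = int(meta["FIELDS"])
    if any(count != expected for count in rows_per_animal.values()):
        raise ValueError("rows per animal do not match FIELDS")
    return meta, records


# Sample shortened for illustration: one animal, two trait rows.
sample = (
    "#TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:1|FIELDS:2\n"
    "ID17|TRAIT|GPTA|GREL|GSONS|DGV|PA\n"
    "HOUSA000000000001|NM|-161|64|-146|-166|-599\n"
    "HOUSA000000000001|FATPCT|-0.02|71|-0.02||-0.04\n"
)
meta, records = read_info_eval(sample)
print(meta["RUN"], records[1]["DGV"])
```

Values are kept as strings here; converting them to numbers is left to the caller, since the trait formats are defined in the referenced trait list rather than on this page.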

InfoANIM files

Examples

  • Example filenames (BB = Breed code, XXX = stud number or file type, [YYYYMMDD/YYMM] = date format: YYYYMMDD for weekly, YYMM for monthly)
    • BB_studXXX_infoANIM_[YYYYMMDD/YYMM].csv [contains only animal information, pipe delimited, multiple rows per animal]
    • BB_young_Pub_infoANIM_[YYYYMMDD/YYMM].csv [contains only animal information, pipe delimited, multiple rows per animal]
  • Example file format (NOTE: Extra spaces provided for clarity. The file provided does not contain any blank spaces)
    #TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:408|FIELDS:49
    ID17|INFORMATION|VALUE
    HOUSA000000000001|EVAL_BREED|HO
    HOUSA000000000001|SEX|F
    HOUSA000000000001|SIRE17|HOUSA000000000002
    HOUSA000000000001|DAM17|HOUSA000000000003
    [...]
    

Format description

  • Separator is pipe "|"
  • All animals have the same # of rows and columns.
  • Missing values are empty fields (no blanks)
  • ROW #1 is the “validation row”, containing information about the file: type (ANIM), run (monthly/weekly), date (YYMM or YYYYMMDD), anim (# of animals in file), fields (# of rows per animal)
    Example:
    #TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:408|FIELDS:49
    • The above example shows the first line of a typical infoANIM file.
      • TYPE is ANIM (meaning this is an infoANIM file),
      • RUN is [monthly/weekly] (meaning this is a monthly or weekly evaluation file)
      • DATE is the evaluation date (format YYMM for monthly evaluations, YYYYMMDD for weekly evaluations)
      • ANIM is the number of animals in the file (for validation purposes)
      • FIELDS is the number of fields (rows) per animal.
  • ROW #2 is the header:
    Field name   Description                                Reference
    ID17         ID of the animal in 17 bytes               4, 119
    INFORMATION  Information provided in the field          353
    VALUE        Value related to the information provided  353
  • ROW #3 onwards is the animal specific information.
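
Because an infoANIM file stores one INFORMATION/VALUE pair per row, consumers will often want to regroup the rows into one record per animal. The following is a minimal sketch of that regrouping, assuming the header row is exactly `ID17|INFORMATION|VALUE` as shown above; the function name `read_info_anim` is hypothetical, and the sample reuses the example rows with ANIM and FIELDS reduced to match.

```python
import csv


def read_info_anim(text):
    """Group infoANIM rows into {ID17: {INFORMATION: VALUE}}."""
    lines = text.splitlines()
    if not lines[0].startswith("#TYPE:ANIM"):
        raise ValueError("not an infoANIM validation row")
    if lines[1] != "ID17|INFORMATION|VALUE":
        raise ValueError("unexpected header row")
    animals = {}
    reader = csv.reader(lines[2:], delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # tolerate a blank line
        # Each data row is expected to have exactly three fields;
        # unpacking raises ValueError otherwise.
        id17, information, value = row
        # Missing values are empty fields; represent them as None.
        animals.setdefault(id17, {})[information] = value or None
    return animals


# Sample shortened for illustration: one animal, four information rows.
sample = (
    "#TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:1|FIELDS:4\n"
    "ID17|INFORMATION|VALUE\n"
    "HOUSA000000000001|EVAL_BREED|HO\n"
    "HOUSA000000000001|SEX|F\n"
    "HOUSA000000000001|SIRE17|HOUSA000000000002\n"
    "HOUSA000000000001|DAM17|HOUSA000000000003\n"
)
animals = read_info_anim(sample)
print(animals["HOUSA000000000001"]["SIRE17"])
```

A nested dictionary keeps the sketch dependency-free; the same regrouping could be done with a data-frame pivot if such a library is already in use.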
