Genomic evaluations format (CSV)¶
Introduction¶
Effective March 2019, all genomic evaluations, both public and private, across CDCB systems will officially be distributed in this new pipe-delimited format.
The "old" CSV and XML formats will be discontinued in September 2019. Between March and August 2019 both formats will be released by CDCB.
- Genomic evaluations will be distributed in two files named "infoANIM" and "infoEVAL"
- "infoEVAL" contains all the trait genomic evaluations (e.g. HO_young_Pub_infoEVAL_1903.csv)
- "infoANIM" contains information related to the animal that is not a trait genomic evaluation. (e.g. HO_young_Pub_infoANIM_1903.csv)
- Character encoding of these files is UTF-8.
Public files content information¶
While private files contain all male and female animals each nominator is entitled to receive, public files include subsets of male animals with specific characteristics.
- Weekly evaluations
- [BB]_young_Pub_[YYYYMMDD].zip (where [BB]=Breed code and [YYYYMMDD] is date format): Male non-reference (for yield) animals considered “publishable”. Excludes embryos.
- Monthly evaluations
- [BB]_New_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals receiving evaluations for the first time, that are considered “publishable”. New animals are identified by comparing with the previous run, by breed of evaluation. Excludes embryos.
- Triannual evaluations
- IMPORTANT: During triannual evaluations the files are released compressed, encrypted and password-protected. The password to open the files is supplied at 7 a.m. on release day, in the file ftp://ftp.uscdcb.com/pub/password
- [BB]_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals considered “publishable”. Excludes embryos.
- [BB]_New_young_Pub_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male non-reference (for yield) animals receiving evaluations for the first time, that are considered “publishable”. New animals are identified by comparing with the previous run, by breed of evaluation. Excludes embryos.
- [BB]_all_evaluated_[YYMM].zip (where [BB]=Breed code and [YYMM] is date format): Male reference (for yield) animals considered “publishable”.
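The naming templates above can be recognised with a regular expression. The following is a minimal sketch, not an official CDCB tool: it assumes the breed code is two uppercase letters (as in `HO`), and the names `PUBLIC_FILE_RE` and `describe_public_file` are hypothetical.

```python
import re

# Assumption: [BB] is two uppercase letters, as in the "HO" examples.
# The three file kinds are the ones listed in the templates above.
PUBLIC_FILE_RE = re.compile(
    r"^(?P<breed>[A-Z]{2})_"
    r"(?P<kind>New_young_Pub|young_Pub|all_evaluated)_"
    r"(?P<date>\d{8}|\d{4})\.zip$"
)


def describe_public_file(filename):
    """Return breed, kind and date parts, or None if the name does not match."""
    m = PUBLIC_FILE_RE.match(filename)
    if m is None:
        return None
    date = m.group("date")
    return {
        "breed": m.group("breed"),
        "kind": m.group("kind"),
        "date": date,
        # 8 digits = YYYYMMDD (weekly); 4 digits = YYMM (monthly or triannual)
        "date_format": "YYYYMMDD" if len(date) == 8 else "YYMM",
    }
```

A name that does not follow any of the templates returns `None`, so callers can skip unrelated files found in the same directory.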
InfoEVAL files¶
Examples¶
- Example filename (BB = breed code, XXX is the stud number or file type, [YYYYMMDD/YYMM] is the date format: YYYYMMDD = weekly, YYMM = monthly)
- BB_studXXX_infoEVAL_[YYYYMMDD/YYMM].csv [contains only evaluation data, pipe delimited, multiple rows per animal]
- BB_young_Pub_infoEVAL_[YYYYMMDD/YYMM].csv [contains only evaluation data, pipe delimited, multiple rows per animal]
- Example file format (NOTE: Extra spaces provided for clarity. The file provided does not contain any blank spaces.)
    #TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:408|FIELDS:45
    ID17             |TRAIT |GPTA |GREL|GSONS|DGV |PA
    HOUSA000000000001|NM    |-161 |64  |-146 |-166|-599
    HOUSA000000000001|MILK  |-358 |71  |-327 |-341|-574
    HOUSA000000000001|FAT   |-17  |69  |-17  |-18 |-30
    HOUSA000000000001|PRO   |-3   |73  |-4   |-5  |-14
    HOUSA000000000001|FATPCT|-0.02|71  |-0.02|    |-0.04
    [...]
Format description¶
- Separator is pipe "|"
- All animals have the same # of rows and columns.
- Missing values are empty fields (no blanks)
- ROW #1 is the “validation row”, containing information about the file: type (EVAL), run (monthly/weekly), date (YYMM or YYYYMMDD), anim (# of animals in file), fields (# of rows per animal)
Example: #TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:408|FIELDS:45
- The above example shows the first line of a typical infoEVAL file.
- TYPE is EVAL (meaning this is an infoEVAL file),
- RUN is [monthly/weekly] (meaning this is a monthly or weekly evaluation file)
- DATE is the evaluation date (format YYMM for monthly evaluations, YYYYMMDD for weekly evaluations)
- ANIM is the number of animals in the file (for validation purposes)
- FIELDS is the number of fields (rows) per animal.
- ROW #2 is the header:
    Field name  Description                                                  Reference
    ID17        ID of the animal in 17 bytes                                 4, 119
    TRAIT       Trait name list (see reference for full names and formats)  350
    GPTA        Genomic PTA (previously “GenPTA”)                            351
    GREL        Genomic Reliability (previously “GenRel”)                    350
    GSONS       Genomic PTA for sons (previously “GenSons”)                  350
    DGV         Direct Genomic Value                                         350
    PA          Parent Average                                               350, 352
- ROW #3 onwards is the animal specific information.
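The layout described above (validation row, header row, then one row per animal and trait) can be read and checked with a few lines of code. This is a hedged sketch rather than a reference implementation: the function names `parse_validation_row` and `read_info_file` are hypothetical, and the sample uses ANIM:1 and FIELDS:2 only to stay short.

```python
from collections import Counter


def parse_validation_row(line):
    """Turn '#TYPE:EVAL|RUN:monthly|...' into a dict of strings."""
    if not line.startswith("#"):
        raise ValueError("validation row must start with '#'")
    return dict(item.split(":", 1) for item in line[1:].strip().split("|"))


def read_info_file(text):
    """Return (validation dict, header list, row dicts) after checking counts."""
    lines = text.splitlines()
    info = parse_validation_row(lines[0])   # ROW #1: validation row
    header = lines[1].split("|")            # ROW #2: header
    rows = []
    for line in lines[2:]:                  # ROW #3 onwards: animal data
        if not line:
            continue
        values = line.split("|")
        # Missing values are empty fields; represent them as None.
        rows.append(dict(zip(header, (v or None for v in values))))
    per_animal = Counter(row["ID17"] for row in rows)
    if len(per_animal) != int(info["ANIM"]):
        raise ValueError("animal count does not match ANIM")
    if any(n != int(info["FIELDS"]) for n in per_animal.values()):
        raise ValueError("rows per animal do not match FIELDS")
    return info, header, rows


# Shortened sample in the documented layout (real files have no extra spaces).
sample = (
    "#TYPE:EVAL|RUN:monthly|DATE:1812|ANIM:1|FIELDS:2\n"
    "ID17|TRAIT|GPTA|GREL|GSONS|DGV|PA\n"
    "HOUSA000000000001|NM|-161|64|-146|-166|-599\n"
    "HOUSA000000000001|FATPCT|-0.02|71|-0.02||-0.04\n"
)
info, header, rows = read_info_file(sample)
```

Values are kept as strings here; converting them to numbers is left to the caller, since the formats differ by trait.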
InfoANIM files¶
Examples¶
- Example filename (BB = breed code, XXX is the stud number or file type, [YYYYMMDD/YYMM] is the date format: YYYYMMDD = weekly, YYMM = monthly)
- BB_studXXX_infoANIM_[YYYYMMDD/YYMM].csv [contains only animal information, pipe delimited, multiple rows per animal]
- BB_young_Pub_infoANIM_[YYYYMMDD/YYMM].csv [contains only animal information, pipe delimited, multiple rows per animal]
- Example file format (NOTE: Extra spaces provided for clarity. The file provided does not contain any blank spaces)
    #TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:408|FIELDS:49
    ID17|INFORMATION|VALUE
    HOUSA000000000001|EVAL_BREED|HO
    HOUSA000000000001|SEX|F
    HOUSA000000000001|SIRE17|HOUSA000000000002
    HOUSA000000000001|DAM17|HOUSA000000000003
    [...]
Format description¶
- Separator is pipe "|"
- All animals have the same # of rows and columns.
- Missing values are empty fields (no blanks)
- ROW #1 is the “validation row”, containing information about the file: type (ANIM), run (monthly/weekly), date (YYMM or YYYYMMDD), anim (# of animals in file), fields (# of rows per animal)
Example: #TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:408|FIELDS:49
- The above example shows the first line of a typical infoANIM file.
- TYPE is ANIM (meaning this is an infoANIM file),
- RUN is [monthly/weekly] (meaning this is a monthly or weekly evaluation file)
- DATE is the evaluation date (format YYMM for monthly evaluations, YYYYMMDD for weekly evaluations)
- ANIM is the number of animals in the file (for validation purposes)
- FIELDS is the number of fields (rows) per animal.
- ROW #2 is the header:
    Field name   Description                                Reference
    ID17         ID of the animal in 17 bytes               4, 119
    INFORMATION  Information provided in the field          353
    VALUE        Value related to the information provided  353
- ROW #3 onwards is the animal specific information.
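Because an infoANIM file stores one INFORMATION/VALUE pair per row, it is often convenient to gather the rows into one record per animal. The sketch below illustrates one way to do that; `pivot_info_anim` is a hypothetical name, and the sample is shortened to four rows for a single animal.

```python
from collections import defaultdict


def pivot_info_anim(text):
    """Collect INFORMATION/VALUE pairs into one dict per animal (keyed by ID17)."""
    lines = text.splitlines()
    header = lines[1].split("|")  # expected: ID17, INFORMATION, VALUE
    animals = defaultdict(dict)
    for line in lines[2:]:        # lines[0] is the validation row
        if not line:
            continue
        rec = dict(zip(header, line.split("|")))
        # Missing values are empty fields; represent them as None.
        animals[rec["ID17"]][rec["INFORMATION"]] = rec["VALUE"] or None
    return dict(animals)


# Shortened sample in the documented layout.
sample_anim = (
    "#TYPE:ANIM|RUN:monthly|DATE:1812|ANIM:1|FIELDS:4\n"
    "ID17|INFORMATION|VALUE\n"
    "HOUSA000000000001|EVAL_BREED|HO\n"
    "HOUSA000000000001|SEX|F\n"
    "HOUSA000000000001|SIRE17|HOUSA000000000002\n"
    "HOUSA000000000001|DAM17|HOUSA000000000003\n"
)
animals = pivot_info_anim(sample_anim)
```

After the call, `animals["HOUSA000000000001"]["SIRE17"]` holds the sire ID, which makes it straightforward to join the record with the same animal's infoEVAL rows on ID17.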