Functional Gene Discovery Platform for Sorghum

Tremendous gaps remain in our understanding of the valuable traits contained in sorghum genetic resources. Advances in genomics, targeted mutagenesis, reverse genetics and whole-genome DNA sequencing can enable efficient gene discovery and germplasm mining for crop improvement. With the support of the Bill & Melinda Gates Foundation, we are developing genetic and genomic resources that can be used to leverage the phenotypic variation in sorghum. By developing tools in the genome-sequenced variety BTx623 and elite germplasm adapted for Africa, this project accelerates the ability of sorghum researchers to translate knowledge into practical applications in sorghum improvement.

This site enables you to search for sorghum lines containing variations, both natural and EMS-induced, in coding sequences.

To use this site:

  1. Enter search criteria on the 'Search' tab and press the 'Get Variants' button.
  2. View the list of variants in the 'Table' tab. By default, 500 records are returned at a time; you can set a lower value if you have a slow internet connection. If more records match your criteria than the number you chose to return, double-arrow buttons (<< and >>) appear at the bottom corners of the 'Table' page so you can page through the sets of records.
  3. For each record, the transcript name is a live link to a Phytozome page with information about the transcript.
  4. Press the red button at the right of a row on the 'Table' tab to view additional details of that variant on the 'Details' tab.

This project was made possible, in part, by the support of the Bill & Melinda Gates Foundation.

Variation Source (ems or nat):

Sample_ID:

Transcript:

Chromosome (as Chr01, Chr02, etc.):

Minimum Position on Chromosome:

Maximum Position on Chromosome:

Sorghum Annotation (will match partial words and phrases):

Maize Annotation (will match partial words and phrases):

Arabidopsis Annotation (will match partial words and phrases):

Number of records to fetch:

If you are on a slow internet connection, you may want to change from the default value of 500 to a smaller value, e.g. 50 or 100.
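The search criteria above can be read as a conjunction of optional filters. The following is a minimal sketch of that logic, not the site's actual implementation: the `Variant` class and `search` function are hypothetical, the field names are borrowed from the columns of the 'Table' tab, and the case-insensitive treatment of partial annotation matches is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Variant:
    # Field names mirror the columns of the 'Table' tab; the class is hypothetical.
    row_id: int
    Variation_Source: str    # 'ems' or 'nat'
    Sample_ID: str
    Chrom: str               # e.g. 'Chr01'
    Posn: int
    Gene: str                # transcript name
    Sorghum_Annotation: str


def search(
    variants: Iterable[Variant],
    source: Optional[str] = None,
    chrom: Optional[str] = None,
    min_posn: Optional[int] = None,
    max_posn: Optional[int] = None,
    annotation: Optional[str] = None,
    limit: int = 500,
) -> List[Variant]:
    """Return at most `limit` variants that satisfy every criterion that is set."""
    hits: List[Variant] = []
    for v in variants:
        if len(hits) >= limit:
            break
        if source is not None and v.Variation_Source != source:
            continue
        if chrom is not None and v.Chrom != chrom:
            continue
        if min_posn is not None and v.Posn < min_posn:
            continue
        if max_posn is not None and v.Posn > max_posn:
            continue
        # Partial match on the annotation text; ignoring case is an assumption.
        if annotation is not None and annotation.lower() not in v.Sorghum_Annotation.lower():
            continue
        hits.append(v)
    return hits
```

Leaving a criterion unset (`None`) corresponds to leaving the matching form field blank, and `limit` plays the role of 'Number of records to fetch'.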


row_id Variation Source Sample Chrom Posn Transcript Sorghum Annotation View Details
{{row.row_id}} {{row.Variation_Source}} {{row.Sample_ID}} {{row.Chrom}} {{row.Posn | number}} {{row.Gene}} {{row.Sorghum_Annotation}}

Sample_ID:

Variation_Source:

row_id:

GRIN_ID:

Sequencing_ID:

{{selectedRow.Sample_ID}}
{{selectedRow.Variation_Source}}
{{selectedRow.row_id}}
{{selectedRow.GRIN_ID}}
{{selectedRow.Sequencing_ID}}

Transcript:

Chromosome:

Position:

Phytozome_PACID:

Genotype_Info:

{{selectedRow.Gene}}
{{selectedRow.Chrom}}
{{selectedRow.Posn}}
{{selectedRow.Phytozome_PACID}}
{{selectedRow.Genotype_Information}}

Reference:

Allele:

Codon_Change:

Amino_Acid_Change:

Tag:

{{selectedRow.Ref}}
{{selectedRow.Allele}}
{{selectedRow.Codon_Change}}
{{selectedRow.Amino_Acid_Change}}
{{selectedRow.Tag}}

Impact:

Impact Type:

Quality:

{{selectedRow.Impact}}
{{selectedRow.Impact_Type}}
{{selectedRow.Qual}}

Sorghum Annotation:

{{selectedRow.Sorghum_Annotation}}

Maize Annotation:

Maize BLAST E Value

{{selectedRow.Maize_Annotation}}
{{selectedRow.Maize_BLAST_E_value}}

Arabidopsis Annotation:

Arabidopsis BLAST E Value

{{selectedRow.Arabidopsis_Annotation}}
{{selectedRow.Arabidopsis_BLAST_E_value}}

Germplasm

  • ems mutants: Seed of most of the ems mutant lines is available through the U.S. National Plant Germplasm System. Available lines are identified with a Plant Introduction identifier in the 'GRIN_ID' field of the 'Details' tab. The Plant Introduction identifier (e.g. 'PI 123456') can be used as a search criterion for ordering seed from the Germplasm Resources Information Network (GRIN) web portal: GRIN-Global search. For a complete list of the available ems mutants, use the 'Advanced Search Criteria' feature of the portal, select 'Sorghum EMS Mutants (Purdue)' as the accession group name and 'Equal To' as the criterion.
  • Natural (nat) variants:
    1. Sorghum lines selected for whole-genome sequencing, including diverse varieties from Africa, Striga-resistant lines from West Africa, and elite sorghum parent lines.

      Entry  Pedigree          Comments
      1      Tx430             Elite yellow seed male
      2      Tx2752            Elite red seed female
      3      Tx631             Elite food-grade female
      4      TxARG1            Elite food-grade female
      5      Tx436             Elite food-grade male
      6      B N223            Elite food-grade female – Niger
      7      Kuyuma            Food-grade – Zambia
      8      Sepon82           Food-grade – Niger
      9      SK 5912           Food-type – Nigeria
      10     Ajabsido          Drought tolerance – Sudan
      11     CE-151-262-A1     Food-grade – Senegal
      12     CSM-63            Guinea – Mali
      13     Mota Maradi       Preflowering drought tolerance – Niger
      14     Koro Kollo        Preflowering drought tolerance – Sudan
      15     Feterita Gishesh  Preflowering drought tolerance – Sudan
      16     Segeolane         Preflowering drought tolerance – Botswana
      17     SC35              Postflowering drought tolerance – East Africa
      18     PI609567          Erect-head dhurra – N. Mali
      19     MR732             Elite food-grade male – Niger
      20     Wassa             Food-grade guinea – Mali
      21     Seguetana         Food-grade guinea – Mali
      22     El Mota           Preflowering drought tolerance – Niger
      23     Honey Drip        Sweet
      24     Theis             Sweet
      25     SC599             Postflowering drought tolerance – Converted Rio
      26     Framida           Striga resistant – Burkina Faso
      27     ICSV1049          Striga resistant – Burkina Faso
      28     Sariaso 14        Striga resistant – LGS – Burkina Faso
      29     Grinkan           Food-grade guinea – Mali
      30     Mace Da Kunya     Late-maturing dune sorghum – Niger
    2. Variants in the following lines were identified in sequence data provided in conjunction with this publication: Mace, E. S. et al. Whole-genome sequencing reveals untapped genetic potential in Africa’s indigenous cereal crop sorghum. Nat. Commun. 4:2320 doi: 10.1038/ncomms3320 (2013):

      • Ai4, B35, B923296, B963676, Cherekit-IBC-E-460, Early_Hegari, Greenleaf, IBC-E-38432, ICSV745, IS3614-2, IS8525, IS9710, Karper_669, Kilo_IBC-E-382, KS115, LR9198, M35-1, Macia, Malisor_84-7, PI525695, PI563516, PI585749, PI586430, QL12, R931945-2-2, Rio, RTx7000, Sb_drummondii-PI330272, Sb_PI226096, Sb_verticilliflorum-AusTRCF_317961, Sb_verticilliflorum-PI300119, SC103-14E, SC108C, SC170-6-8, SC23, SC237-14E, SC326-6, SC35C, SC56-14E, SC62C, S_propinquum-369-1, S_propinquum-369-2, Yik_solate-IBC-E-339, Zengada-IBC-E-308

EMS Mutagenized Population

We have developed an ethyl methanesulfonate (EMS) mutagenized population of 12,000 families in the BTx623 genetic background. We are using this population for fundamental genetic research and sorghum improvement in two ways:

  1. forward genetic screens for new mutant phenotypes
  2. reverse genetic screens to identify useful phenotypes among mutants affected at candidate genes

Preliminary genetic analyses of the population demonstrated a very high mutation rate. Biochemical screens identified five dhurrin metabolism mutants. Whole-genome sequence analysis of one of these mutants revealed approximately 7,000 G to A or C to T single nucleotide polymorphisms (SNPs). Phenotypic characterization of 4,800 M4 families from the population identified 24 brown midrib (bmr) mutants. Taken together, these results demonstrate that our population is highly suitable for new gene and allele discovery.
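As a rough illustration of the G to A and C to T signature mentioned above, the sketch below checks whether a reference/allele pair (the 'Reference' and 'Allele' fields of the 'Details' tab) is one of those two transitions. The function names are hypothetical and the example is not part of the project's pipeline.

```python
from typing import Iterable, Tuple

# The two substitution types named in the text: G to A and C to T.
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def is_ems_transition(ref: str, allele: str) -> bool:
    """True if the reference base -> variant allele change is G>A or C>T."""
    return (ref.upper(), allele.upper()) in EMS_TRANSITIONS


def ems_fraction(snps: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (ref, allele) pairs that are G>A or C>T; 0.0 for empty input."""
    pairs = list(snps)
    if not pairs:
        return 0.0
    return sum(is_ems_transition(r, a) for r, a in pairs) / len(pairs)
```

A high `ems_fraction` for a mutant line would be consistent with EMS-induced mutations, whereas natural variation would be expected to include other substitution types as well.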

Genome sequences were generated for 600 EMS lines. Bioinformatic processing was used to identify all mutations that impact predicted protein sequences. The database available at this website can be used to identify all detected mutations in EMS lines that have specific phenotypes or to identify EMS lines that contain mutations in specified genes of interest.

SNP Detection and Annotation Pipeline

A total of 29,972,216,082 and 3,408,790,376 NGS paired-end reads were generated using Illumina HiSeq 2500 technology for the 586 EMS-mutagenized and 30 natural variation sorghum lines, respectively. For the 44 sorghum natural variation lines [1], a total of 7,915,312,158 post-filtered NGS paired-end reads (generated using Illumina HiSeq 2000 technology) were analyzed. The sequenced reads were mapped to the sorghum reference genome [2] assembly (version 2.1) using the BWA short-read aligner (version 0.6.2) [3]. The SAMtools software package (version 0.1.18) [4] was used for in silico SNP detection in all projects except the 44 natural variation lines, for which version 0.1.19 was used. In the initial variant calling, only DNA bases with base-calling qualities of at least 20 in reads mapping to a candidate genomic position were considered. Variant calls with read coverage depths greater than 100 or less than two were then excluded. Next, variant calls not supported by mapped NGS reads from both strands of the genome at the corresponding genomic position were removed. Finally, only homozygous SNPs with a SNP quality of at least 20 were retained. For the EMS mutagenesis project, the final SNPs were further filtered to remove shared SNPs, that is, SNPs detected in multiple individuals at the same genomic position. The impact of each SNP on gene function was predicted using the snpEff software suite (version 3.4) [5]. All sorghum genes with predicted high or medium SNP impact were further annotated using the available sorghum gene descriptions. Additionally, Arabidopsis thaliana (TAIR10) and maize B73 (version 5b.60) homologs of the sorghum genes were detected using the BLAST software package (version 2.2.30+) [6].
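The post-calling filters described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `Call` record and its field names are hypothetical, the base-quality cutoff of 20 is assumed to have been applied already when `depth` and the strand counts were computed, and shared SNPs are removed after the per-site filters (the text does not specify the order).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Call:
    # Hypothetical record for one variant call in one sequenced line.
    sample: str
    chrom: str
    pos: int
    depth: int        # read coverage depth at the site
    fwd_reads: int    # mapped reads from the forward strand
    rev_reads: int    # mapped reads from the reverse strand
    homozygous: bool
    qual: float       # SNP quality


def passes_site_filters(c: Call) -> bool:
    """Apply the depth, strand, zygosity, and quality filters from the text."""
    if c.depth > 100 or c.depth < 2:
        return False
    if c.fwd_reads == 0 or c.rev_reads == 0:
        return False
    if not c.homozygous:
        return False
    return c.qual >= 20


def filter_calls(calls: Iterable[Call], drop_shared: bool = False) -> List[Call]:
    """Keep calls passing the site filters; optionally drop shared positions.

    `drop_shared=True` mimics the extra EMS step: any genomic position called
    in more than one sample is removed from every sample.
    """
    kept = [c for c in calls if passes_site_filters(c)]
    if drop_shared:
        samples_at_site = defaultdict(set)
        for c in kept:
            samples_at_site[(c.chrom, c.pos)].add(c.sample)
        kept = [c for c in kept if len(samples_at_site[(c.chrom, c.pos)]) == 1]
    return kept
```

Dropping shared positions reflects the idea that a true EMS-induced mutation should be private to one line, while a variant seen in several lines is more likely a pre-existing difference from the reference or a systematic artifact.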

Citations:

  1. Mace, E. S. et al. Whole-genome sequencing reveals untapped genetic potential in Africa’s indigenous cereal crop sorghum. Nat. Commun. 4:2320 doi: 10.1038/ncomms3320 (2013)
  2. Paterson AH, Bowers JE, Bruggmann R, et al. The Sorghum bicolor genome and the diversification of grasses. Nature. 2009;457(7229):551–6. doi:10.1038/nature07723.
  3. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi:10.1093/bioinformatics/btp324.
  4. Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. doi:10.1093/bioinformatics/btp352.
  5. Cingolani P, Platts A, Wang LL, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92. doi:10.4161/fly.19695.
  6. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10. doi:10.1016/S0022-2836(05)80360-2.

Contact Information

Principal Investigators

Project team members

  • Dr. Charles Addo Quaye caddoqua at purdue dot edu
  • Moriah M Massafaro mmassafa at purdue dot edu
  • Molly M McKneight mmckneig at purdue dot edu

Web site questions, issues, and comments

  • Jan Erik Backlund jbacklun at purdue dot edu

Purdue University, 610 Purdue Mall, West Lafayette, IN 47907, (765) 494-4600

2014 Purdue University