Representative Genomes
- HRGMv2_Rep_Genome/
: Genome sequences (FASTA format) for 4,824 HRGMv2 representative genomes (one per species)
- HRGMv2_Rep_Genome.tar.gz
: Compressed archive of the HRGMv2_Rep_Genome/ folder
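Once the archive is unpacked, a quick sanity check is to count the sequence records in each genome file. The following is a minimal sketch, not part of the HRGMv2 distribution. It assumes the folder holds one FASTA file per genome (plain or gzip-compressed), and the function names are hypothetical.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def count_fasta_records(path):
    """Count sequence records (header lines starting with '>') in a FASTA file."""
    with open_text(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize_genome_dir(genome_dir):
    """Map each file name in the directory to its number of FASTA records.

    Assumption: every regular file in the directory is a FASTA file.
    """
    return {
        p.name: count_fasta_records(p)
        for p in sorted(Path(genome_dir).iterdir())
        if p.is_file()
    }
```

If the one-file-per-genome assumption holds, `len(summarize_genome_dir("HRGMv2_Rep_Genome"))` should equal 4,824.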
Pangenomes
- HRGMv2_Pangenomes/
: Pangenome data for all 4,824 species, structured as follows:
For multi-genome species (2,639 species): species with more than one non-redundant genome. Each species folder contains the full output of Panaroo v1.3.0, plus eggNOG-mapper and RGI results.
(Refer to the Panaroo GitHub for detailed file descriptions.)
- combined_DNA_CDS.fasta.gz
- combined_protein_CDS.fasta.gz
- combined_protein_cdhit_out.txt
- combined_protein_cdhit_out.txt.clstr
- final_graph.gml
- gene_data.csv.gz
- gene_presence_absence.csv
- gene_presence_absence.Rtab
- gene_presence_absence_roary.csv
- pan_genome_reference.fa – nucleotide sequences
- pan_genome_reference.faa – amino acid sequences
- pre_filt_graph.gml
- struct_presence_absence.Rtab
- summary_statistics.txt
- emapper_results/ – eggNOG-mapper results for pan_genome_reference.fa
- rgi_results/ – RGI results for pan_genome_reference.fa
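As an illustration of working with the multi-genome output, the sketch below reads gene_presence_absence.Rtab and splits genes into core and accessory sets. It assumes the usual Roary-style layout that Panaroo writes: tab-separated, a header row, gene names in the first column, and one 0/1 column per genome. The 0.95 core threshold and the function names are illustrative assumptions, not values taken from HRGMv2.

```python
import csv


def gene_frequencies(rtab_path):
    """Return {gene: fraction of genomes carrying it} from a presence/absence Rtab."""
    freqs = {}
    with open(rtab_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)          # first column is the gene name
        n_genomes = len(header) - 1    # remaining columns are genomes
        for row in reader:
            if not row:
                continue
            present = sum(int(value) for value in row[1:])
            freqs[row[0]] = present / n_genomes
    return freqs


def split_core_accessory(freqs, core_threshold=0.95):
    """Split genes by frequency; core_threshold is an assumed, adjustable cutoff."""
    core = sorted(g for g, f in freqs.items() if f >= core_threshold)
    accessory = sorted(g for g, f in freqs.items() if f < core_threshold)
    return core, accessory
```

The cutoff is a parameter because definitions of "core" vary between studies; pick the one that matches your analysis.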
For single-genome species (2,185 species): species with only one non-redundant genome. The pangenome was generated directly from the representative genome, and each species folder contains:
- pan_genome_reference.fa – nucleotide sequences
- pan_genome_reference.faa – amino acid sequences
- emapper_results/ – eggNOG-mapper results for pan_genome_reference.fa
- rgi_results/ – RGI results for pan_genome_reference.fa
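Because the two folder types differ only in which files are present, a script can tell them apart by checking for a Panaroo-only file. This is a rough sketch under the assumption that each species has its own subfolder containing the file names listed above. The helper name and the choice of marker file are hypothetical.

```python
from pathlib import Path

# A file listed only for multi-genome species (Panaroo output).
PANAROO_MARKER = "gene_presence_absence.Rtab"
# A file listed for both multi-genome and single-genome species.
SHARED_REFERENCE = "pan_genome_reference.fa"


def classify_species_dir(species_dir):
    """Return 'multi-genome', 'single-genome', or 'unknown' for a species folder."""
    species_dir = Path(species_dir)
    if (species_dir / PANAROO_MARKER).is_file():
        return "multi-genome"
    if (species_dir / SHARED_REFERENCE).is_file():
        return "single-genome"
    return "unknown"
```

Tallying the results over every subfolder of HRGMv2_Pangenomes/ should give 2,639 multi-genome and 2,185 single-genome species if the layout matches these assumptions.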
- For bulk download:
1. HRGMv2_Pangenomes.tar.gz – Archive of the entire HRGMv2_Pangenomes/ folder
2. Pangenome_download_link_info.tsv – Table with full download URLs for each species
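To fetch only a subset of species, the link table can be loaded and indexed by species. The sketch below assumes the TSV has a header row. The column names are not documented here, so the code returns the header for inspection and takes the key and URL column names as arguments instead of guessing them. All function names are hypothetical.

```python
import csv


def read_link_table(tsv_path):
    """Return (column_names, rows) from a tab-separated file with a header row."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        return list(reader.fieldnames or []), rows


def links_by_key(rows, key_column, url_column):
    """Build {key: url}; both column names must be taken from the actual header."""
    return {row[key_column]: row[url_column] for row in rows}
```

A typical use would be to call `read_link_table("Pangenome_download_link_info.tsv")`, print the returned column names, and then pass the appropriate two names to `links_by_key`.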