1. Description: KIJ_CD-HIT-100_Proteins and UHGP-100_unique proteins are merged, and identical proteins are de-replicated
2. Number of proteins: 107 million
3. Protein fasta file: KIJ-UHGP_unique_Proteins.faa
4. Cluster info file: KIJ-UHGP_unique_Proteins.cluster_info.tsv
>format: 1st column - representative
2nd column - member proteins (separated by ';')
>Representative protein is the longest sequence of the cluster.
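As a minimal sketch of reading the cluster info file described above, the following assumes the file is tab-separated with no header row and that the second column never contains a tab. The function name `read_cluster_info` is hypothetical, and the readme does not say whether the member list also includes the representative itself, so the code makes no assumption about that.

```python
import gzip
from typing import Iterator, List, Tuple


def read_cluster_info(path: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (representative, members) pairs from a cluster info TSV.

    Assumed layout (per the readme): column 1 is the representative,
    column 2 is the member proteins joined by ';'. No header row is assumed.
    """
    # Read the file as text whether it is gzip-compressed (.gz) or plain.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split on the first tab only: representative, then the member field.
            representative, _, member_field = line.partition("\t")
            # Drop empty entries such as those left by a trailing ';'.
            members = [m for m in member_field.split(";") if m]
            yield representative, members
```

The function is a generator so that the 3 GB compressed file can be streamed line by line rather than loaded into memory; for example, `for rep, members in read_cluster_info("KIJ-UHGP_unique_Proteins.cluster_info.tsv.gz"): ...`.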
Present directory - data/protein_catalog/3.KIJ-UHGP_unique_Proteins
Name | Last modified | Size |
---|---|---|
KIJ-UHGP_unique_Proteins.cluster_info.tsv.gz | 2020-06-17 11:34:58 | 3 GB |
KIJ-UHGP_unique_Proteins.faa.gz | 2020-06-17 11:35:50 | 23 GB |
readme.txt | 2020-11-04 20:02:40 | 445 B |