The database is built from flat-file versions of IEDB, McPAS-TCR, and VdjDB slim, supplied in that order. The build script generates an RDA file containing all T-cell and B-cell records with reported antigenic specificity in these three databases. The script is for internal use and is run once every six months to update the database.
Format
antigen_db
A tibble with 320,786 rows and 16 columns:
- tra_cdr3_aa
T-cell receptor alpha chain CDR3 amino acid sequence
- gene
Antigen gene/protein name
- epitope
Antigen epitope sequence/target
- pathology
Pathology associated with epitope
- antigen
Name of antigen
- tra_v_call
T-cell receptor alpha chain V gene
- tra_j_call
T-cell receptor alpha chain J gene
- mhc_allele
MHC allele associated with the epitope
- reference
Reference for known antigenic specificity
- score
Confidence of the antigenic specificity
- cell_type
Immune cell type
- source
Name of the database the record was taken from
- trb_cdr3_aa
T-cell receptor beta chain CDR3 amino acid sequence
- trb_v_call
T-cell receptor beta chain V gene
- trb_j_call
T-cell receptor beta chain J gene
- Species
Species
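The column layout above lends itself to simple subsetting by epitope and chain. The following is a minimal sketch in base R, not the package's own workflow: `toy_db` is a hypothetical stand-in that reuses a few of the documented column names, and every value in it (sequences, epitope labels, cell type and source strings) is a placeholder rather than real data.

```r
# Toy stand-in for antigen_db. Column names follow the Format section;
# all values are placeholders invented for illustration.
toy_db <- data.frame(
  tra_cdr3_aa = c("CAVSAAAF", NA, "CAVTTTTF"),
  trb_cdr3_aa = c("CASSAAAF", "CASSBBBF", NA),
  epitope     = c("EPITOPEA", "EPITOPEA", "EPITOPEB"),
  cell_type   = c("T-cell", "T-cell", "T-cell"),
  source      = c("IEDB", "McPAS-TCR", "VdjDB"),
  stringsAsFactors = FALSE
)

# Keep records for one epitope that have a beta-chain CDR3 recorded.
keep <- toy_db$epitope == "EPITOPEA" & !is.na(toy_db$trb_cdr3_aa)
hits <- toy_db[keep, ]

# Count the matching records per source database.
hits_per_source <- table(hits$source)
print(hits_per_source)
```

With the real data set loaded, the same subsetting should apply to `antigen_db` directly, assuming the columns are stored as described above.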
Source
The flat files for this function can be downloaded at the following links:
IEDB: https://www.iedb.org/downloader.php?file_name=doc/receptor_full_v3.zip
McPAS-TCR: http://friedmanlab.weizmann.ac.il/McPAS-TCR/
VdjDB: https://github.com/antigenomics/vdjdb-db/releases
Note: The vdjdb.slim.txt file is used as the input.
Note: The order of the inputs matters. Provide the flat-file paths in this order: IEDB, McPAS-TCR, VdjDB.
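Because the input order matters, it can help to check the paths before running the build. The sketch below is one possible guard and is not part of the package: `check_input_order()` and the file paths are hypothetical, and the only detail taken from this page is the required order and the vdjdb.slim.txt file name.

```r
# Required order of the three source databases, as stated in the notes above.
expected_order <- c("IEDB", "McPAS-TCR", "VdjDB")

# Hypothetical helper: stops if the named paths are not in the required order.
check_input_order <- function(paths, expected = expected_order) {
  if (!identical(names(paths), expected)) {
    stop("Provide paths in this order: ", paste(expected, collapse = ", "))
  }
  invisible(paths)
}

# Hypothetical local paths; replace with the locations of the downloaded files.
flat_files <- c(
  "IEDB"      = "path/to/iedb_flat_file",
  "McPAS-TCR" = "path/to/mcpas_flat_file",
  "VdjDB"     = "path/to/vdjdb.slim.txt"
)

check_input_order(flat_files)
```

Naming each path after its database makes an accidental swap fail early instead of silently mislabeling records.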