ProteInfer
Predicting the functional properties of protein sequences using deep neural networks.
Important
This file type requires the `parsomics-plugin-proteinfer` plugin.
File naming
The file names must adhere to one of the following patterns:
- `<MAG-name>_ProteInfer_out.tsv`
- `<MAG-name>.tsv`
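As an illustration of the naming rule, the sketch below extracts the MAG name from a file name. It covers only the two patterns listed above. The function name `mag_name_from_filename` and the regular expression are assumptions made for this example, not part of the plugin's API.

```python
import re
from typing import Optional

# Hypothetical pattern covering the two documented forms:
#   <MAG-name>_ProteInfer_out.tsv  and  <MAG-name>.tsv
# The non-greedy group lets the optional suffix be stripped when present.
_FILENAME_RE = re.compile(r"(?P<mag>.+?)(?:_ProteInfer_out)?\.tsv")


def mag_name_from_filename(filename: str) -> Optional[str]:
    """Return the MAG name embedded in a file name, or None if it does not match."""
    match = _FILENAME_RE.fullmatch(filename)
    return match.group("mag") if match else None
```

With this sketch, both `bin1_ProteInfer_out.tsv` and `bin1.tsv` yield `bin1`, while a name such as `bin1.txt` yields `None`.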
File format
The file must include a header row, that is, column names on the first line. The following columns are recognized:
Column name | Column obligatoriness | Data type | Data nullability |
---|---|---|---|
sequence_name | Mandatory | String | Not nullable |
predicted_label | Mandatory | String | Nullable |
confidence | Mandatory | String | Nullable |
description | Optional | String | Nullable |
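The column rules in the table can be checked with a short reader like the one below. This is a minimal sketch under stated assumptions: the function name `read_proteinfer_tsv` is hypothetical, and treating an empty cell as a null value is an assumption, since this page does not say how nulls are written in the TSV.

```python
import csv
import io
from typing import Dict, List, Optional

MANDATORY_COLUMNS = ("sequence_name", "predicted_label", "confidence")
OPTIONAL_COLUMNS = ("description",)


def read_proteinfer_tsv(text: str) -> List[Dict[str, Optional[str]]]:
    """Parse TSV text, enforcing the mandatory columns and nullability rules."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in MANDATORY_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing mandatory columns: {missing}")

    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, raw in enumerate(reader, start=2):
        row = {}
        for col in MANDATORY_COLUMNS + OPTIONAL_COLUMNS:
            value = raw.get(col)
            # Assumption: an empty or absent cell represents a null value.
            row[col] = value if value else None
        if row["sequence_name"] is None:
            raise ValueError(f"line {line_no}: sequence_name must not be empty")
        rows.append(row)
    return rows
```

Every value is kept as a string, matching the data types listed in the table. A file without a `description` column is accepted, and each row then carries `None` for that field.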
Mapping to database
ProteinAnnotationFile
Original data | ProteinAnnotationFile field |
---|---|
ProteInfer TSV file path | path |
ProteinAnnotationEntry
Original data | ProteinAnnotationEntry field |
---|---|
sequence_name | protein_key ¹ |
predicted_label | accession and annotation_type ² |
confidence | score |
description | description |
Footnotes
1. The protein name in the `sequence_name` column of the ProteInfer TSV file is used to query the primary key of the corresponding protein in the database.
2. The `predicted_label` column in the ProteInfer TSV files is formatted as `<Annotation-type>:<Accession>`. For example, given `Pfam:CL0023`, this plugin sets `annotation_type` to "PFAM" and `accession` to "CL0023".
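The label splitting described in the second footnote could look roughly like the sketch below. The function name `split_predicted_label` is hypothetical, and the plugin's actual code may differ. Upper-casing the annotation type is an assumption generalized from the single `Pfam` to "PFAM" example, and returning a pair of `None` values for a null label is likewise an assumption based on the column being nullable.

```python
from typing import Optional, Tuple


def split_predicted_label(label: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split '<Annotation-type>:<Accession>' into (annotation_type, accession)."""
    if not label:
        # Assumption: a null label yields no annotation type and no accession.
        return None, None
    # Split on the first colon only, so the accession may itself contain colons.
    annotation_type, sep, accession = label.partition(":")
    if not sep or not annotation_type or not accession:
        raise ValueError(f"unexpected predicted_label format: {label!r}")
    # Assumption: the type is upper-cased, as in the Pfam -> "PFAM" example.
    return annotation_type.upper(), accession
```

For the documented example, `split_predicted_label("Pfam:CL0023")` returns `("PFAM", "CL0023")`.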