Using FASTA Files in ProSightPC and ProSightPD

High-throughput searching is only possible if software can expect input files that are formatted in a predictable pattern. As the leader in high-throughput top-down searching, Proteinaceous has developed its software to expect files that match the UniProt format.

Proteinaceous generally recommends the use of UniProt text files or XML files because they contain known PTM information. Using this known PTM information is what allows Proteinaceous software to perform the most sophisticated searching currently available.

Even though they do not contain nearly as much information, some users prefer to use FASTA files, normally because they are easy to create and modify. Unfortunately, if a newly created or modified FASTA file does not match the UniProt format, it might not work properly with Proteinaceous software.

The most common problem occurs when the description line does not follow the UniProt format. Fortunately, there is a simple fix: reformat the description line to more closely match the UniProt format. A UniProt description line begins with >sp or >tr and contains two pipes (|), one on each side of the accession. If the user changes the description line to include >sp or >tr, a unique accession for every entry, and two pipes, the FASTA file should work with both ProSightPC and ProSightPD.
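Before editing a file by hand, it can help to list which description lines fall short of the rules above. The following Python sketch is one way to do that. The regular expression and the function name nonconforming_headers are assumptions made for illustration; they encode only the three requirements stated here (>sp or >tr, an accession between two pipes, and no repeated accession) and are not the parser used by ProSightPC or ProSightPD.

```python
import re

# Assumed pattern: ">sp" or ">tr", a pipe, a non-empty accession, then a second pipe.
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|\s]+)\|")


def nonconforming_headers(lines):
    """Return (line_number, header) pairs for description lines that
    do not match HEADER_RE or that reuse an earlier accession."""
    seen = set()
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith(">"):
            continue  # sequence line, nothing to check
        match = HEADER_RE.match(line)
        if match is None or match.group(2) in seen:
            problems.append((number, line.rstrip()))
        else:
            seen.add(match.group(2))
    return problems
```

To check a file, pass it an open file handle, for example nonconforming_headers(open("my_proteins.fasta")), where my_proteins.fasta is a placeholder name. An empty list means every description line met the three requirements.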

UniProt entry:
>sp|P02144|MYG_HUMAN Myoglobin OS=Homo sapiens GN=MB PE=1 SV=2
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE
DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH
PGDFGADAQGAMNKALELFRKDMASNYKELGFQG

Example of non-conforming entry:
>MYG_HUMAN
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE
DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH
PGDFGADAQGAMNKALELFRKDMASNYKELGFQG

Example of fixed description line:
>sp|P02144|MYG_HUMAN
MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE
DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH
PGDFGADAQGAMNKALELFRKDMASNYKELGFQG
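For files with many entries, the same fix can be scripted. The Python sketch below rewrites each non-conforming description line as >sp|accession|original text, as in the fixed example above. The function name fix_headers, the accessions lookup, and the CUSTOM0001-style placeholder accessions are all assumptions made for illustration; whether ProSightPC or ProSightPD accepts a particular placeholder accession has not been verified here, so real accessions are preferable when they are known.

```python
import re

# Assumed test for an already conforming line: >sp or >tr, accession, two pipes.
CONFORMING = re.compile(r"^>(sp|tr)\|[^|\s]+\|")


def fix_headers(lines, accessions=None, prefix="sp"):
    """Rewrite non-conforming description lines as >prefix|accession|text.

    accessions: optional dict mapping the original description text
    (without the leading ">") to a known accession. Entries missing from
    it receive a placeholder (CUSTOM0001, CUSTOM0002, ...), which is an
    invention of this sketch and not a UniProt accession.
    """
    accessions = accessions or {}
    fixed = []
    counter = 0
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith(">") and not CONFORMING.match(text):
            name = text[1:].strip()
            accession = accessions.get(name)
            if accession is None:
                counter += 1
                accession = f"CUSTOM{counter:04d}"
            text = f">{prefix}|{accession}|{name}"
        fixed.append(text)  # sequence lines pass through unchanged
    return fixed
```

Calling fix_headers(open("my_proteins.fasta"), accessions={"MYG_HUMAN": "P02144"}) would turn the non-conforming entry above into >sp|P02144|MYG_HUMAN; the file name is again a placeholder. Writing the result to a new file rather than overwriting the original keeps the unmodified data available if the search still fails.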

Fixing FASTA files can be challenging. If this solution does not solve the issue for you, or if you are experiencing other issues, please contact the development team.
