Frequently Asked Questions about the TargetIdentifier Server
Who are we?
The TargetIdentifier Server was implemented by Dr. Min (see Min et al. 2005). This site is supported by Youngstown State University.
Our motivation
Generating expressed sequence tags (ESTs) remains a primary method for gene discovery in most organisms. Identifying full-length cDNA clones with a specific function for downstream characterization in the laboratory creates a critical need to automate the process. Our server aims to automate the identification of full-length cDNAs among EST sequences.
How does it work?
TargetIdentifier uses the reading frames predicted in a pre-run BLASTX search. BLASTX from the NCBI 'blastall' package is run with the parameters "-v 1 -b 1 -e 1e-5" (note: we used version 2.2.19; earlier or later versions may not work properly). The BLASTX results are then used to identify full-length EST-derived sequences according to the definitions described below. TargetIdentifier integrates (i) a prediction of whether an EST or cDNA sequence is full-length (i.e. whether it contains a putative translation initiation codon), (ii) a prediction of whether the open reading frame (ORF) has been completely sequenced, and (iii) functional annotation based on the BLASTX output.
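As a rough illustration of the pre-run BLASTX step, the sketch below assembles a 'blastall' command line with the parameters quoted above. The "-v 1 -b 1 -e 1e-5" settings come from this page; the file names and the protein database name are placeholders, and the helper `build_blastx_command` is hypothetical rather than part of TargetIdentifier.

```python
import shlex


def build_blastx_command(query_fasta, database, output_path):
    """Assemble a legacy NCBI blastall command that runs BLASTX.

    The -v, -b and -e values are the ones quoted in this FAQ.
    The query, database and output names are supplied by the caller.
    """
    return [
        "blastall",
        "-p", "blastx",      # program: translated query vs. protein database
        "-i", query_fasta,   # input FASTA file of EST/cDNA sequences
        "-d", database,      # formatted protein database (placeholder name)
        "-o", output_path,   # file that receives the BLASTX report
        "-v", "1",           # keep one one-line description per query
        "-b", "1",           # keep one alignment per query
        "-e", "1e-5",        # E-value cutoff
    ]


# Placeholder names, used only to show the resulting command line.
command = build_blastx_command("ests.fasta", "protein_db", "ests.blastx")
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` on a machine where blastall 2.2.19 is installed; the report written to the output file is what TargetIdentifier then reads.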
Definition of each category of prediction
Input
Output
TargetIdentifier: The default output is an MS Excel (.xls) file in which each field (column) is tab-delimited. The fields are: (i) the name/function of the subject protein in the best high-scoring segment pair (HSP) in BLASTX, (ii) query identifier, (iii) E-value, (iv) a prediction of whether the query is full-length, short full-length, possible full-length, ambiguous or partial, (v) start codon position, or 'NO' if no start codon is predicted, (vi) the sequence status of the query, i.e. whether or not the ORF has been completely sequenced, and (vii) BLASTX output for the HSP, which includes the subject definition line, length, score, E-value, identities, positives, and reading frame. A sample of TargetIdentifier output was generated using our Aspergillus niger data.
Annotator: A program implemented using the TargetIdentifier algorithm. Its output is slightly different. The default output is an MS Excel (.xls) file in which each field (column) is tab-delimited. The fields are: (i) query identifier, (ii) provisional function with or without a qualifier (Table 1), (iii) a prediction of whether the query sequence is full-length, short full-length, possible full-length, ambiguous or partial, (iv) start codon position, or 'NO' if no start codon is predicted, and (v) the sequence status of the query, i.e. whether or not the ORF has been completely sequenced. A sample of Annotator output was generated using our Aspergillus niger data.
E-value | Qualifier |
---|---|
E <= 1e-100 | No qualifier |
E <= 1e-50 | Homologous to |
E <= 1e-30 | Highly similar to |
E <= 1e-10 | Similar to |
E <= 1e-5 | Weakly similar to |
E <= 0.1 | Very weakly similar to |
E > 0.1 | Extremely weakly similar to |
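
The lookup in Table 1 can be sketched as a short function. This is a minimal illustration, not the Annotator source: it assumes the rows are tested from the strictest threshold down, so each E-value receives the first qualifier whose threshold it meets, and it assumes the qualifier is simply placed in front of the subject name. The names `qualifier_for` and `provisional_function` are hypothetical.

```python
# Thresholds and qualifiers copied from Table 1, strictest first.
QUALIFIER_THRESHOLDS = [
    (1e-100, ""),  # no qualifier
    (1e-50, "Homologous to"),
    (1e-30, "Highly similar to"),
    (1e-10, "Similar to"),
    (1e-5, "Weakly similar to"),
    (0.1, "Very weakly similar to"),
]


def qualifier_for(e_value):
    """Return the Table 1 qualifier for a BLASTX E-value."""
    for threshold, qualifier in QUALIFIER_THRESHOLDS:
        if e_value <= threshold:
            return qualifier
    return "Extremely weakly similar to"  # E > 0.1


def provisional_function(e_value, subject_name):
    """Prefix the subject name with its qualifier (assumed layout)."""
    return f"{qualifier_for(e_value)} {subject_name}".strip()


print(provisional_function(1e-20, "alpha-amylase"))
```

An E-value of 1e-20 misses the 1e-30 threshold but meets 1e-10, so the example prints the subject name with the "Similar to" qualifier.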
Accuracy evaluation
The accuracy of the algorithm was evaluated using the human UniGene dataset and our in-house generated A. niger EST sequences. The overall accuracy was > 90%. See details.
Security of user submitted data
The data submitted to our server will be automatically deleted after they are processed. We do not keep data submitted by a user.
How to obtain user's results
The results can be downloaded from the server web site. The output file(s) are kept for only 2 days after they are generated and are then deleted. The results are saved in Microsoft Excel (.xls) format; because each field (column) is tab-delimited, they can also be opened or saved as plain text.
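Because the fields are tab-delimited, a downloaded result file can also be read as plain text without a spreadsheet program. The sketch below assumes a TargetIdentifier result file with the seven fields listed under Output, in that order, and no header row. The column labels, the helper `read_results`, and the sample row are all made up for illustration.

```python
import csv
import io

# Hypothetical labels for the seven TargetIdentifier output fields.
FIELDS = [
    "subject", "query_id", "e_value", "prediction",
    "start_codon", "orf_status", "blastx_hsp",
]


def read_results(handle):
    """Parse tab-delimited result lines into one dictionary per query."""
    # QUOTE_NONE keeps quote characters in definition lines untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(FIELDS, row)) for row in reader if row]


# Invented placeholder row, used only to demonstrate the parsing.
sample = io.StringIO(
    "example protein\tEST0001\t1e-42\tfull-length\t25\t"
    "placeholder status\tplaceholder HSP text\n"
)
rows = read_results(sample)
print(rows[0]["query_id"], rows[0]["prediction"])
```

For a real download, replace the `io.StringIO` object with a file opened in text mode, for example `open("results.xls", newline="")`, where the file name is again a placeholder.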
How to cite us
Min, X.J., Butler, G., Storms, R. and Tsang, A. TargetIdentifier: a web server for identifying full-length cDNAs from EST sequences. Nucleic Acids Res., 2005, Vol. 33, Web Server Issue, W669-W672. Our server URL (http://bioinformatics.ysu.edu/tools/TargetIdentifier.html) can also be used as your reference.
Standalone TargetIdentifier Software availability
The standalone version of the TargetIdentifier software is available free of charge for academic use only. It is written in Perl, so it is easy to run on any OS. Please download it from the following site.
Comments and suggestions
Please contact Dr. Min at the YSU Bioinformatics Lab.