E-mail questions/comments: deblasio@cs.arizona.edu
About
The accuracy of a protein multiple sequence alignment is measured with respect to a known reference alignment. Without such a reference alignment, we are left to estimate the accuracy of a computed alignment. A good accuracy estimator has broad utility, from building a meta-aligner that selects the best output of a collection of aligners (see Ensemble Alignment), to boosting the accuracy of a single aligner by choosing the best values for alignment parameters (see Parameter Advising), to finding misaligned regions in a given alignment and correcting them (see Adaptive Local Realignment). To estimate the accuracy of a computed multiple sequence alignment, we have developed Facet (short for feature-based accuracy estimator), which computes a single estimate of accuracy as a linear combination of efficiently computable feature functions (see Accuracy Estimator).
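As a rough illustration of the idea described above, the following Python sketch scores an alignment as a weighted sum of feature functions and then uses that score to pick among candidate alignments, as a parameter advisor would. It is not Facet's implementation: the two feature functions, the weights, and the names `estimate_accuracy` and `advise` are hypothetical stand-ins chosen only to keep the example small. Facet's actual features and learned coefficients are described in the publications listed below.

```python
from typing import Callable, Dict, Sequence

Alignment = Sequence[str]  # equal-length rows; '-' marks a gap
Feature = Callable[[Alignment], float]


def gap_free_column_fraction(alignment: Alignment) -> float:
    """Toy feature (an assumption, not a Facet feature): fraction of columns without gaps."""
    columns = list(zip(*alignment))
    if not columns:
        return 0.0
    return sum('-' not in col for col in columns) / len(columns)


def identical_column_fraction(alignment: Alignment) -> float:
    """Toy feature (an assumption, not a Facet feature): fraction of gap-free, fully identical columns."""
    columns = list(zip(*alignment))
    if not columns:
        return 0.0
    identical = sum('-' not in col and len(set(col)) == 1 for col in columns)
    return identical / len(columns)


def estimate_accuracy(alignment: Alignment,
                      features: Sequence[Feature],
                      weights: Sequence[float]) -> float:
    """Linear combination of feature values: sum of weight_i * feature_i(alignment)."""
    return sum(w * f(alignment) for f, w in zip(features, weights))


def advise(candidates: Dict[str, Alignment],
           features: Sequence[Feature],
           weights: Sequence[float]) -> str:
    """Return the name of the candidate alignment with the highest estimated accuracy."""
    return max(candidates,
               key=lambda name: estimate_accuracy(candidates[name], features, weights))


# Two made-up alignments of the same three sequences, e.g. from two parameter choices.
candidates = {
    "param_set_A": ["ACD-F", "ACDEF", "AC-EF"],
    "param_set_B": ["ACDF-", "ACDEF", "ACEF-"],
}
features = [gap_free_column_fraction, identical_column_fraction]
weights = [0.3, 0.7]  # illustrative values only; real coefficients are learned from benchmarks

best = advise(candidates, features, weights)
print(best)  # prints "param_set_A"
```

No reference alignment appears anywhere in the sketch; the advisor ranks candidates using only the estimate, which is what makes the approach usable when the true accuracy cannot be measured.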
On this website we provide our accuracy estimator to the public, free for non-commercial use. In addition, the benchmarks and advisor sets used in various publications are available for download. We encourage those interested to read the related publications on the pages listed above. If you would like more information or are having problems installing and running Facet, let us know by sending an email to deblasio at cs dot arizona dot edu.
Publications
Please cite:
For accuracy estimation and oracle advisor sets:
Accuracy Estimation and Parameter Advising for Protein Multiple Sequence Alignment
John Kececioglu and Dan DeBlasio
Journal of Computational Biology 20:(4), 259-279, 2013.
doi:10.1089/cmb.2013.0007 (pdf)
For greedy advisor sets:
Learning Parameter-Advising Sets for Multiple Sequence Alignment
Dan DeBlasio and John Kececioglu (a)
IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2015
doi:10.1109/TCBB.2015.2430323 (pdf)
For ensemble alignment:
Ensemble Multiple Sequence Alignment via Advising
Dan DeBlasio and John Kececioglu (b)
In Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '15).
ACM, New York, NY, USA, pp. 452-461.
doi:10.1145/2808719.2808766 (pdf) (talk)
Previous Publications
Estimating the Accuracy of Multiple Alignments and its Use in Parameter Advising
Dan DeBlasio, Travis J. Wheeler, John Kececioglu
Proceedings of the 16th Conference on Research in Computational Molecular Biology (RECOMB),
Springer-Verlag Lecture Notes in Bioinformatics 7262, 45-59, 2012. (pdf) (talk)
For set finding:
Learning Parameter Sets for Alignment Advising
Dan DeBlasio and John Kececioglu
In Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '14).
ACM, New York, NY, USA, pp. 230-239.
doi:10.1145/2649387.2649448 (pdf) (talk)
Download
The Facet distribution includes the accuracy estimator (written in Java), as well as a driver script, a wrapper for the PSIPRED secondary structure predictor, and scripts for using Facet for aligner advising.
FACET v1.4 (tgz) (6 Aug 2015)
Previous Versions
The development version can be found on GitHub (http://git.io/Facet)
The development version of Opal which includes adaptive local realignment can be found on GitHub (http://git.io/Opal)
Note that this application requires a working copy of PSIPRED as well as BLAST. PSIPRED v3.2 can be downloaded here: (link).
Acknowledgements
Research supported by the NSF IGERT Grant in Comparative Genomics DGE-0654435 and NSF Grant IIS-1217886.