Comparing the success of different prediction software in sequence analysis: a review.

Vladimir Bajic*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

The abundance of computer software for different types of prediction in DNA and protein sequence analyses raises the problem of adequate ranking of prediction program quality. A single measure of success of predictor software, which adequately ranks the predictors, does not exist. A typical example of such an incomplete measure is the so-called correlation coefficient. This paper provides an overview and short analysis of several different measures of prediction quality. Frequently, some of these measures give results contradictory to each other even when they relate to the same prediction scores. This may lead to confusion. In order to overcome some of the problems, a few new measures are proposed including some variants of a 'generalised distance from the ideal predictor score'; these are based on topological properties, rather than on statistics. In order to provide a sort of a balanced ranking, the averaged score measure (ASM) is introduced. The ASM provides a possibility for the selection of the predictor that probably has the best overall performance. The method presented in the paper applies to the ranking problem of any prediction software whose results can be properly represented in a true positive-false positive framework, thus providing a natural set-up for linear biological sequence analysis.
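
The idea of ranking predictors by several measures at once can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the correlation coefficient is taken to be the Matthews-type form commonly used in sequence analysis, the 'distance from the ideal predictor score' is approximated by the Euclidean distance from the point (sensitivity = 1, specificity = 1), and the averaged ranking is a plain mean of per-measure ranks. It is not the ASM as defined in the paper. The function names `confusion_measures` and `average_rank` are hypothetical.

```python
import math


def confusion_measures(tp, fp, tn, fn):
    """Compute a few quality measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews-type correlation coefficient (assumed form, see lead-in).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Assumed stand-in for a distance from the ideal predictor:
    # Euclidean distance from the point (sensitivity=1, specificity=1).
    distance = math.hypot(1.0 - sensitivity, 1.0 - specificity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "cc": cc,
        "distance": distance,
    }


def average_rank(predictors):
    """Average the per-measure ranks of each predictor (rank 1 = best).

    `predictors` maps a predictor name to its (tp, fp, tn, fn) counts.
    This is an illustrative averaging scheme, not the paper's ASM.
    Ties are broken by input order.
    """
    scores = {name: confusion_measures(*counts)
              for name, counts in predictors.items()}

    # For each measure, state whether a higher value is better.
    higher_is_better = {"cc": True, "distance": False}

    rank_sums = {name: 0 for name in predictors}
    for measure, higher in higher_is_better.items():
        ordered = sorted(scores, key=lambda n: scores[n][measure],
                         reverse=higher)
        for rank, name in enumerate(ordered, start=1):
            rank_sums[name] += rank

    return {name: total / len(higher_is_better)
            for name, total in rank_sums.items()}


if __name__ == "__main__":
    # Hypothetical counts for two predictors, for illustration only.
    example = {"A": (10, 0, 10, 0), "B": (5, 5, 5, 5)}
    for name, rank in average_rank(example).items():
        print(name, rank)
```

Averaging ranks rather than raw values avoids mixing measures with different scales and directions, which is one simple way to reconcile measures that disagree on the same prediction scores.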

Original language: English (US)
Pages (from-to): 214-228
Number of pages: 15
Journal: Briefings in Bioinformatics
Volume: 1
Issue number: 3
State: Published - Jan 1 2000

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology
