compstruct - calculate accuracy of RNA secondary structure predictions
compstruct [options] trusted_file test_file
compstruct evaluates the accuracy of RNA secondary structure predictions on a per-base-pair
basis. The trusted_file contains one or more sequences with trusted (known) RNA
secondary structure annotation. The test_file contains the same sequences, in the same
order, with predicted RNA secondary structure annotation. compstruct reads the structures
and compares them, and calculates both the sensitivity (the fraction of true base pairs that
are correctly predicted) and the specificity (positive predictive value, the fraction of
predicted base pairs that are true). Results are reported for each individual sequence,
and in summary for all sequences together.
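The two measures can be illustrated with a minimal Python sketch. This is not compstruct's own code; the function name accuracy and the representation of a structure as a set of (i,j) position tuples are assumptions made for illustration only.

```python
def accuracy(trusted_pairs, predicted_pairs):
    """Return (sensitivity, ppv) under the exact-match rule.

    Both arguments are iterables of (i, j) base-pair tuples.
    Hypothetical helper, not part of compstruct.
    """
    trusted = set(trusted_pairs)
    predicted = set(predicted_pairs)
    correct = trusted & predicted  # pairs present in both structures

    # Sensitivity: share of true pairs that were predicted.
    sensitivity = len(correct) / len(trusted) if trusted else 0.0
    # Specificity in the sense used here (PPV): share of predicted pairs that are true.
    ppv = len(correct) / len(predicted) if predicted else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    trusted = {(1, 10), (2, 9), (3, 8)}
    predicted = {(1, 10), (2, 9), (4, 7), (5, 6)}
    print(accuracy(trusted, predicted))
```

In this example two of the three trusted pairs are recovered, and two of the four predicted pairs are true.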
Both files must contain secondary structure annotation in WUSS notation. Only SELEX and
Stockholm formats support structure markup at present.
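As a rough illustration of how a WUSS structure line maps to base pairs, the following Python sketch converts a structure string into a set of (i,j) pairs. It is a simplified reading of the notation, not compstruct's parser: it assumes that the bracket characters <>, (), [] and {} mark nested pairs and are interchangeable for nesting purposes, that upper-case/lower-case letter pairs such as A...a mark pseudoknots, and that every other character is unpaired. The function name wuss_to_pairs and its include_pseudoknots parameter are hypothetical.

```python
OPENERS = set("<([{")
CLOSERS = set(">)]}")


def wuss_to_pairs(structure, include_pseudoknots=False):
    """Convert a WUSS-like structure string to a set of 1-based (i, j) pairs.

    Simplifying assumptions (see the text above): all bracket types share
    one nesting stack; each pseudoknot letter gets its own stack.
    """
    nested_stack = []  # open positions of nested (bracket) pairs
    knot_stacks = {}   # one stack per pseudoknot letter, keyed by upper case
    pairs = set()

    for pos, ch in enumerate(structure, start=1):
        if ch in OPENERS:
            nested_stack.append(pos)
        elif ch in CLOSERS:
            if not nested_stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.add((nested_stack.pop(), pos))
        elif ch.isalpha() and include_pseudoknots:
            if ch.isupper():
                knot_stacks.setdefault(ch, []).append(pos)
            else:
                stack = knot_stacks.get(ch.upper())
                if not stack:
                    raise ValueError(f"unmatched '{ch}' at position {pos}")
                pairs.add((stack.pop(), pos))
        # Any other character is treated as an unpaired position.

    if nested_stack or any(knot_stacks.values()):
        raise ValueError("structure has unclosed base pairs")
    return pairs


if __name__ == "__main__":
    print(sorted(wuss_to_pairs("<<<___>>>")))
    print(sorted(wuss_to_pairs("<<AA>>aa", include_pseudoknots=True)))
```

Keeping pseudoknot letters on separate stacks lets crossing pairs be recorded without disturbing the nested structure.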
The default definition of a correctly predicted base pair is that a true pair (i,j) must
exactly match a predicted pair (i,j).
Mathews, Zuker, Turner and colleagues (see: Mathews et al., JMB 288:911-940, 1999) use a
more relaxed definition. Mathews defines "correct" as follows: a true pair (i,j) is
correctly predicted if any of the following pairs are predicted: (i,j), (i+1,j), (i-1,j),
(i,j+1), or (i,j-1). This rule allows for "slipped helices" off by one base. The -m
option activates this rule for both sensitivity and for specificity. For specificity, the
rule is reversed: predicted pair (i,j) is considered to be true if the true structure
contains one of the five pairs (i,j), (i+1,j), (i-1,j), (i,j+1), or (i,j-1).
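The relaxed rule can be sketched in Python as follows. This is an illustrative sketch based only on the description above, not the program's actual implementation; the names mathews_neighbors and relaxed_accuracy are hypothetical, and structures are assumed to be sets of (i,j) tuples.

```python
def mathews_neighbors(i, j):
    """The five pairs that count as a match for (i, j) under the relaxed rule."""
    return {(i, j), (i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)}


def relaxed_accuracy(trusted_pairs, predicted_pairs):
    """Return (sensitivity, ppv) using the relaxed rule in both directions."""
    trusted = set(trusted_pairs)
    predicted = set(predicted_pairs)

    # Sensitivity: a true pair is a hit if any of its five neighbors is predicted.
    sens_hits = sum(1 for (i, j) in trusted if mathews_neighbors(i, j) & predicted)
    # Specificity: the rule is reversed, so a predicted pair is a hit if any
    # of its five neighbors appears in the trusted structure.
    ppv_hits = sum(1 for (i, j) in predicted if mathews_neighbors(i, j) & trusted)

    sensitivity = sens_hits / len(trusted) if trusted else 0.0
    ppv = ppv_hits / len(predicted) if predicted else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    trusted = {(1, 10), (2, 9)}
    predicted = {(1, 11), (2, 9), (4, 7)}  # (1, 11) is a helix slipped by one base
    print(relaxed_accuracy(trusted, predicted))
```

Here the slipped pair (1,11) is accepted as a match for the true pair (1,10), whereas under the default exact-match rule it would count as an error for both measures.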
-h Print brief help; includes version number and summary of all options.
-m Use the Mathews relaxed accuracy rule (see above), instead of requiring exact
prediction of base pairs.
-p Count pseudoknotted base pairs towards the accuracy, in either trusted or predicted
structures. By default, pseudoknots are ignored.
Normally, only the trusted_file would have pseudoknot annotation, since most RNA
secondary structure prediction programs do not predict pseudoknots. Using the -p
option allows you to penalize the prediction program for not predicting known
pseudoknots. In a case where both the trusted_file and the test_file have
pseudoknot annotation, the -p option lets you count pseudoknots in evaluating the
prediction accuracy. Beware of the case where you use a pseudoknot-capable
prediction program to generate the test_file, but the trusted_file does not have
pseudoknot annotation. In this case, -p will penalize any predicted pseudoknots
when it calculates specificity, even if they are right, because they do not appear in
the trusted annotation. This is probably not what you want.
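The caveat about specificity can be made concrete with a small Python sketch. The pair coordinates below are invented for illustration, and the helper ppv is a hypothetical function, not part of compstruct. The sketch assumes a trusted structure without pseudoknot annotation and a prediction that includes two pseudoknotted pairs.

```python
def ppv(trusted_pairs, predicted_pairs):
    """Share of predicted pairs that appear in the trusted structure."""
    predicted = set(predicted_pairs)
    if not predicted:
        return 0.0
    return len(set(trusted_pairs) & predicted) / len(predicted)


# Illustrative data only: the trusted annotation has no pseudoknots.
trusted_nested = {(1, 20), (2, 19)}
trusted_knots = set()

# The prediction gets the nested pairs right and adds two pseudoknotted pairs.
predicted_nested = {(1, 20), (2, 19)}
predicted_knots = {(5, 30), (6, 29)}

# Default behaviour: pseudoknots are ignored on both sides.
ppv_default = ppv(trusted_nested, predicted_nested)

# With pseudoknots counted: the predicted knots have no trusted counterpart,
# so they are scored as false positives.
ppv_counting_knots = ppv(trusted_nested | trusted_knots,
                         predicted_nested | predicted_knots)

if __name__ == "__main__":
    print(ppv_default, ppv_counting_knots)
```

With these made-up pairs the measure drops from 1.0 to 0.5 once pseudoknots are counted, even though the predicted pseudoknot might be correct.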
Specify that the two sequence files are in format <s>. In this case, both files
must be in the same format. The default is to autodetect the file formats, in which
case they could be different (one SELEX, one Stockholm).
Don't print any verbose header information.