Step 17: "DIS" output validation
It is easy to sort the A_vs_BC_substitutions.good file, for example by the third
column, using the UNIX sort command:
$ sort -n -r -k 3 A_vs_BC_substitutions.good > A_vs_BC_substitutions.good.sorted
where -n selects numeric comparison, -k 3 starts the sort key at the third column
(the obsolete form +2, which counts fields from zero, is rejected by current GNU sort),
and -r reverses the order (descending).
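On a system without a UNIX sort command, the same ordering can be approximated in Python. This is only a minimal sketch: the function name sort_by_column and the sample rows are hypothetical, and the input is assumed to be whitespace-delimited with a number in the third column. Rows with equal values keep their input order here, which may differ from the tie order produced by sort.

```python
def sort_by_column(lines, column=2, descending=True):
    """Sort whitespace-delimited lines numerically by a zero-based column."""
    def key(line):
        fields = line.split()
        try:
            return float(fields[column])
        except (IndexError, ValueError):
            # Like sort -n, treat a missing or non-numeric field as zero.
            return 0.0
    return sorted(lines, key=key, reverse=descending)


if __name__ == "__main__":
    # Hypothetical rows; the real column layout of the .good file may differ.
    rows = ["contigA 120 1", "contigB 45 3", "contigC 300 2"]
    for row in sort_by_column(rows):
        print(row)
```

To apply it to the real file, read the lines with open("A_vs_BC_substitutions.good") and write the returned list to a new file.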
The sorted file contains data for more than 800 SNP candidates where the polymorphism occurs
two or more times at the same position of the consensus (high priority). About 2000 candidates
are of the lowest priority. Analysis of all output files shows that our method revealed
more than 1000 SNP/INDEL candidates of high priority (high level of confidence) on the example
It is possible to view and validate the assembly and the polymorphic sites using ContigViewer.
The viewer is written in Python and should work on any computer platform with a Python
interpreter. Input files for ContigViewer are "Alignment Files" generated by
You can check several examples on the "step 14" web page.
Note that the first two contigs, numbered 2491 and 2942, belong to the "high priority" group.
The third contig, 415, appears to have been assembled with paralogs. The last contig, number 9,
belongs to the low priority group.