giira - Gene Identification Incorporating RNA-Seq data and Ambiguous reads


giira -iG genomeFile.fasta -iR rnaFile.fastq -libPath


GIIRA (Gene Identification Incorporating RNA-Seq data and Ambiguous reads) is a method to
identify potential gene regions in a genome based on an RNA-Seq mapping and incorporating
ambiguously mapped reads.


-h : print help text and exit

-iG [pathToGenomes] : specify path to directory with genome files in fasta format

-iR [pathToRna] : specify path to directory with RNA read files in fastq format

-scripts [absolutePath] : specify the absolute path to the directory containing the
required helper scripts, DEFAULT: directory of GIIRA.jar

-out [pathToResults] : specify the directory that shall contain the results files

-outName [outputName] : specify desired name for output files, DEFAULT: genes

-haveSam [samfileName]: if a SAM file already exists, provide its name; otherwise a mapping is
performed. NOTE: the SAM file has to be sorted by read name!

-nT [numberThreads] : specify the maximal number of threads that are allowed to be used

-mT [tophat/bwa/bwasw] : specify desired tool for the read mapping, DEFAULT: tophat

-mem [int] : specify the amount of memory that CPLEX is allowed to use

-maxReportedHits [int] : if using BWA as mapping tool, specify the maximal number of
reported hits, DEFAULT: 2

-prokaryote : if specified, genome is treated as prokaryotic, no spliced reads are
accepted, and structural genes are resolved. DEFAULT: n

-minCov [double] : specify the minimum required coverage of the gene candidate extraction,
DEFAULT: -1 (is estimated from mapping)

-maxCov [double] : optional maximal coverage threshold, can also be estimated from mapping

-endCov [double] : if the coverage falls below this value, the currently open candidate
gene is closed. If set to -1, this value is estimated from the minimum coverage.

-dispCov [0/1] : if 1, the coverage histogram of the read mapping is estimated, DEFAULT: 0

-interval [int] : specify the minimal size of an interval between near candidate genes, if
"-1" it equals the read length. DEFAULT: -1

-splLim [double] : specify the minimal coverage that is required to accept a splice site,
if (-1) the threshold is equal to minCov, DEFAULT: -1

-rL [int] : specify read length, otherwise this information is extracted from SAM file

-samForSequential [pathToSamFile] : to analyse chromosomes in a sequential manner,
provide a chromosome-sorted SAM file in addition to the one
sorted by read names, DEFAULT: noSequential

-noAmbiOpti : if specified, ambiguous hits are not included in the analysis

-settingMapper [(list of parameters)] : a comma-separated list of the desired parameters
for TopHat or BWA. For each parameter, provide a pair of indicator and value, separated
by an equals sign. Note that parameters intended for the 3 different parts (indexing,
aln, sam) of BWA have to be separated by an underscore.

Example: -settingMapper [-a=is_-t=5,-N_-n=5]
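
The -settingMapper value can be tedious to assemble by hand, so the following is a
minimal Python sketch of how such a string could be built. The helper name
build_setting_mapper is hypothetical and not part of GIIRA. The sketch assumes that
the square brackets shown in the example above are part of the value and that a flag
without a value (such as -N) is written without an equals sign, as in that example.

```python
def build_setting_mapper(*parts):
    """Assemble a -settingMapper value (hypothetical helper, not part of GIIRA).

    Each positional argument is one parameter group, given as a list of
    (indicator, value) pairs. Use None as the value for a bare flag.
    For BWA, pass three groups in the order indexing, aln, sam.
    """
    groups = []
    for part in parts:
        # Inside a group, parameters are comma-separated "indicator=value" pairs.
        items = [ind if val is None else f"{ind}={val}" for ind, val in part]
        groups.append(",".join(items))
    # Groups are separated by an underscore; brackets follow the help-text example.
    return "[" + "_".join(groups) + "]"


# Reproduces the example above: one group each for BWA indexing, aln and sam.
bwa_value = build_setting_mapper(
    [("-a", "is")],             # indexing
    [("-t", 5), ("-N", None)],  # aln
    [("-n", 5)],                # sam
)
print(bwa_value)  # [-a=is_-t=5,-N_-n=5]
```

For TopHat, which has no separate parts, a single group would be passed.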

