The Configuration file

Some global parameters of Eoulsan, such as the path of the temporary directory, can be set in a configuration file. There are two ways to use a configuration file:

  • Create a ~/.eoulsan file.
  • Use the -conf <file> option on the Eoulsan command line. In this case, the ~/.eoulsan file is not read.

A single setting can also be defined with the -s option of Eoulsan. See the command line section for more information.

The following table summarizes the available parameters of the configuration file:

| Parameter | Type | Default value | Description |
|---|---|---|---|
| main.tmp.dir | string | Usually /tmp | Path to the temporary directory |
| main.executables.tmp.dir | string | The value of the main.tmp.dir parameter | Path to the temporary directory for binaries extracted from the Eoulsan JAR |
| main.debug | boolean | false | Enable debugging information |
| main.printstacktrace | boolean | false | Print the stack trace when an error occurs |
| main.ui.name | string | basic | User interface to use. Three UIs are currently available: "basic" (the default UI), "no" (which does nothing) and the experimental "lanterna" |
| main.local.threads | integer | 0 | Number of threads to use in local mode |
| main.output.tree.type | string | step | Organization of the output files. With "flat", all the output files are written to the execution directory; with "step", the output files of each step are gathered in a dedicated directory |
| main.format.path | string | Not set | Paths of the formats. Multiple paths can be separated by a space character |
| main.galaxy.tool.path | string | Not set | Paths of the Galaxy tool files. Multiple paths can be separated by a space character |
| main.default.fastq.format | string | fastq-sanger | Default FASTQ format: fastq-sanger, fastq-solexa, fastq-illumina or fastq-illumina-1.5 |
| main.design.obfuscate | boolean | true | Obfuscate the design file when uploading it to AWS |
| main.design.remove.replicate.info | boolean | true | Remove replicate information from the design when uploading it to AWS |
| main.old.result.format | boolean | false | Save step result files in the Eoulsan version 1 format |
| main.rserve.enable | boolean | false | Enable the Rserve server for R computations |
| main.rserve.servername | string | Not set | Name of the Rserve server |
| main.rserve.keep.files | boolean | false | Keep files on the Rserve server |
| main.save.r.scripts | boolean | false | Whether to save the R scripts |
| main.genome.storage.path | string | Not set | Path to the genomes repository |
| main.gff.storage.path | string | Not set | Path to the GFF annotations repository |
| main.gtf.storage.path | string | Not set | Path to the GTF annotations repository |
| main.additional.annotation.storage.path | string | Not set | Path to the additional annotations repository |
| main.genome.mapper.index.storage.path | string | Not set | Path to the genome indexes repository (cannot be a URL) |
| main.genome.desc.storage.path | string | Not set | Path to the genome descriptions repository (cannot be a URL) |
| main.additional.annotation.hypertext.links.path | string | Not set | Path to the additional annotation hypertext links info file (cannot be a URL) |
| main.docker.uri | string | Not set | Docker server URI. The usual value is unix:///var/run/docker.sock |
| main.docker.mount.nfs.roots | boolean | false | If enabled, when data mounted in a Docker container is stored on an NFS volume, the root of the NFS volume is mounted instead of the data path. This option avoids some permission issues with NFS root squash |
| main.mail.send.result.mail | boolean | false | Send a mail to the user at the end of the analysis |
| main.mail.send.result.mail.to | string | Not set | Mail address to which the result message is sent |
| main.mail.smtp.host | string | Not set | SMTP server used to send mails. See the SMTP section for more information |
| main.hadoop.log.level | string | INFO | Hadoop Log4J log level |
| zookeeper.connect.string | string | Not set | ZooKeeper connect string. If not set, the server used is the job tracker node and the port is the default port set by zookeeper.default.port |
| zookeeper.default.port | integer | 2181 | ZooKeeper default port |
| zookeeper.session.timeout | integer | 10000 | ZooKeeper session timeout |
| aws.access.key | string | Not set | AWS access key, a 20-character alphanumeric string |
| aws.secret.key | string | Not set | AWS secret key, a 40-character string |
| aws.ec2.key.name | string | Not set | Name of the EC2 key pair that allows SSH connections to the remote cluster |
| aws.mapreduce.hadoop.version | string | 1.0.3 | Hadoop version to use with AWS MapReduce |
| aws.mapreduce.instances.number | string | Not set | Number of instances in the cluster |
| aws.mapreduce.instances.type | string | m1.xlarge | Instance type |
| aws.mapreduce.endpoint | string | Europe (see below for the real value) | AWS endpoint to use (the European endpoint by default) |
| aws.mapreduce.log.path | string | Not set | Log path. If not set, no log is generated |
| aws.mapreduce.task.tracker.mapper.max.tasks | integer | 0 | Maximum number of mapper tasks that can be created on a task tracker (0 for no limit) |
| aws.mapreduce.enable.debugging | boolean | false | If set to true, the AWS Elastic MapReduce debug mode is enabled (requires AWS SimpleDB) |
| aws.mapreduce.wait.job | boolean | false | Wait for the end of the job on AWS MapReduce |
| main.cluster.scheduler.name | string | Not set | Name of the cluster scheduler to use |
| main.cluster.default.required.memory | integer | Not set | Default amount of memory, in MB, required to launch a step on the cluster |
| htcondor.concurrency.limits | value | Not set | HTCondor concurrency limits to use when the user wants to limit the number of simultaneously running jobs (e.g. eoulsan:2500) |

These values are overridden by the values of the global section of the XML workflow file. Setting the AWS keys in a configuration file is useful because it keeps these entries out of the workflow file, which is sent over the network when using Amazon Web Services. Developers can also use additional parameters that are not listed in the previous table.
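As an illustration of keeping the AWS keys out of the workflow file, a configuration file could contain entries like the following. This is only a sketch that uses the parameter names from the table above; the values are placeholders to replace with your own.

```properties
# AWS credentials stored in the configuration file rather than in the
# workflow file. All values below are placeholders, not real keys.
aws.access.key=<your 20-character access key>
aws.secret.key=<your 40-character secret key>
aws.ec2.key.name=<name of your EC2 key pair>
```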

The following table lists the endpoints for the AWS regions where EMR is available:

| Region | Endpoint |
|---|---|
| US East (Northern Virginia) | us-east-1.elasticmapreduce.amazonaws.com |
| US West (Oregon) | elasticmapreduce.us-west-2.amazonaws.com |
| US West (Northern California) | us-west-1.elasticmapreduce.amazonaws.com |
| EU (Ireland) | eu-west-1.elasticmapreduce.amazonaws.com |
| Asia Pacific (Singapore) | elasticmapreduce.ap-southeast-1.amazonaws.com |
| Asia Pacific (Sydney) | elasticmapreduce.ap-southeast-2.amazonaws.com |
| Asia Pacific (Tokyo) | elasticmapreduce.ap-northeast-1.amazonaws.com |
| South America (Sao Paulo) | elasticmapreduce.sa-east-1.amazonaws.com |

The latest list of available EMR endpoints can be found in the AWS documentation.
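For example, to select a region explicitly, the aws.mapreduce.endpoint parameter can be set to one of the endpoint values listed above. The entry below is a sketch that uses the EU (Ireland) endpoint from that table; replace it with the endpoint of the region you want to use.

```properties
# Endpoint taken from the region table above (EU, Ireland).
aws.mapreduce.endpoint=eu-west-1.elasticmapreduce.amazonaws.com
```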

SMTP configuration (Mail service)

Eoulsan uses the JavaMail library to send a message that informs the user of the end of the analysis. To do this, Eoulsan needs an SMTP server. If your SMTP server can be used without authentication and with an unencrypted connection on the default port, you only have to set the main.mail.smtp.host parameter in your Eoulsan configuration. Otherwise, you need to add the appropriate JavaMail SMTP properties with a "main." prefix to your Eoulsan configuration (e.g. the JavaMail mail.smtp.port property becomes main.mail.smtp.port in the Eoulsan configuration).
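The following sketch shows what such a mail configuration might look like. The host name, recipient address and port are hypothetical values. The mail.smtp.port and mail.smtp.starttls.enable properties are standard JavaMail SMTP properties, written here with the "main." prefix described above. If your server also requires authentication, further JavaMail properties are needed; this section does not describe how credentials are supplied, so they are left out.

```properties
# Send a message at the end of the analysis.
main.mail.send.result.mail=true
# Hypothetical recipient address.
main.mail.send.result.mail.to=user@example.org
# Hypothetical SMTP server.
main.mail.smtp.host=smtp.example.org
# JavaMail SMTP properties with the "main." prefix.
# The values are assumptions to adapt to your server.
main.mail.smtp.port=587
main.mail.smtp.starttls.enable=true
```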

Eoulsan configuration file sample

# This is an example configuration file for Eoulsan.
# To enable it, use the -conf parameter or rename this file to
# $HOME/.eoulsan.

# Temporary directory.
# By default, Eoulsan uses the temporary directory of your platform.
main.tmp.dir=/tmp

# Debug mode.
# By default, the debug mode of Eoulsan is disabled.
main.debug=false
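
Other parameters from the table can be added to such a file in the same way. The lines below are a sketch of possible additional entries; the thread count and the repository paths are hypothetical values chosen for illustration.

```properties
# Number of threads to use in local mode (hypothetical value).
main.local.threads=4

# Write all the output files in the execution directory.
main.output.tree.type=flat

# Repositories (hypothetical paths).
main.genome.storage.path=/path/to/genomes
main.gff.storage.path=/path/to/gff
main.genome.mapper.index.storage.path=/path/to/indexes
```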