How to run a design with Teolenn

Requirements

To run Teolenn you need:

  • A computer that can execute Teolenn (see installation requirements for more information).
  • A downloaded and installed copy of Teolenn.
  • The sequence of the genome you want to use for the design.
    • This genome must be in a single file in FASTA format.
    • The header of each chromosome or scaffold must be simple (as short as possible, without punctuation marks or other symbols), for example:
      >chromosome_1
      TATATAAAAACCTTTACTACTTTTACTATTATTATTACCTTATTATATAGTTATAATTAACTTCCTTTTA
      GCACTACTATTAATAAATAATAAATATAATATACTACTAATTACTATAAATAAATTTAGTAAAAAGGTAA
      TTCTAAAACTAGTTAAAAAAACTAATATAGCCTTAAAAATAGCTAATAAGCTAGTAGCAAGACTTTTAAA
      ...
  • The sequence of the masked genome, if you want to use the complexity measurement.
    • This genome must be in a single file in FASTA format.
    • The header of each chromosome or scaffold must be simple (as short as possible, without punctuation marks or other symbols).
    • The header of each chromosome must be identical in the genome file and the masked genome file.
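The header rules above can be checked before starting a design. The following Python sketch is not part of Teolenn; it is a minimal illustration that assumes "simple" means letters, digits and underscores only, and that the genome and masked genome files must contain the same set of header names. The function names `read_headers` and `check_headers` are hypothetical.

```python
import re

# Assumption: a "simple" header contains only letters, digits and underscores.
SIMPLE_HEADER = re.compile(r"^[A-Za-z0-9_]+$")


def read_headers(lines):
    """Return the header names (text after '>') found in FASTA lines, in order."""
    return [line[1:].strip() for line in lines if line.startswith(">")]


def check_headers(genome_lines, masked_lines=None):
    """Return a list of problems; an empty list means the headers look fine."""
    problems = []
    headers = read_headers(genome_lines)
    for name in headers:
        if not SIMPLE_HEADER.match(name):
            problems.append(f"header not simple: {name!r}")
    if masked_lines is not None:
        # The masked genome must use the same header names as the genome.
        if sorted(headers) != sorted(read_headers(masked_lines)):
            problems.append("genome and masked genome headers differ")
    return problems
```

To use it on real files, pass open file objects, for example `check_headers(open("genome.fasta"), open("genome_masked.fasta"))`, where both file names are placeholders.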

Non-standard installation

Soap and genometools can be installed manually. For more information about non-standard installation, see the installing section.

The design file

The design file is the core element of a design using Teolenn. It contains the filters to apply to the generated sequences, the measurements to compute, the filters on the measurements and the selection parameters. The design file is an XML file (see the Wikipedia article for more information about XML). This file is what makes Teolenn a very flexible design tool. Because the design file is the central part of Teolenn, you should read the design file section of the documentation carefully.

You can also use the Trichoderma reesei design file as a model for your design. The genome sequence and the masked genome sequence of Trichoderma reesei are available on the JGI website. Keep in mind that the parameters in this design file are specific to Trichoderma reesei and need to be adapted for your own design.

The Teolenn process

There are 4 steps in the Teolenn process to select probes:

  1. Generate all oligonucleotide sequences for the genome and the masked genome (one file per chromosome, with the extensions ".oligo" and ".masked"), then filter these sequences (one file per chromosome, with the extensions ".oligo.filtered" and ".masked.filtered").
  2. Compute the measurements (creates the oligo.mes file).
  3. Filter the measurements (creates the filtered.mes and filtered.stat files).
  4. Compute the selection of the oligonucleotides (creates the select.mes file). In this final step:
    1. For each measurement and each oligonucleotide, Teolenn computes a score from the measurement value and, in some cases, one or more statistical parameters (defined in the parameter values of the measurements in the select section of the design file). A score is a float value between 0 and 1, whereas a measurement value can have any type and value (negative number, string...).
    2. Apply a weight to each measurement score to get a global score for the oligonucleotide.
    3. Choose the oligonucleotide with the best score in each selection window.
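
The selection step can be illustrated with a short Python sketch. It is not Teolenn's implementation. It assumes that the global score is a plain weighted sum of the per-measurement scores and that selection windows are fixed-size, non-overlapping ranges of positions. The measurement names "tm" and "complexity", the weights and the function names are hypothetical.

```python
def global_score(scores, weights):
    """Weighted sum of per-measurement scores (each assumed to be in [0, 1])."""
    return sum(scores[name] * weights[name] for name in scores)


def select_best_per_window(oligos, weights, window_size):
    """Keep the best-scoring oligonucleotide of each window.

    oligos: list of (position, {measurement_name: score}) pairs.
    Returns a list of (position, global_score), one per non-empty window,
    ordered by window.
    """
    best = {}
    for position, scores in oligos:
        # Assumption: window k covers [k * window_size, (k + 1) * window_size).
        window = position // window_size
        score = global_score(scores, weights)
        if window not in best or score > best[window][1]:
            best[window] = (position, score)
    return [best[window] for window in sorted(best)]


# Hypothetical example data: two oligonucleotides in window 0, one in window 1.
weights = {"tm": 0.75, "complexity": 0.25}
oligos = [
    (0, {"tm": 1.0, "complexity": 0.0}),
    (10, {"tm": 0.0, "complexity": 1.0}),
    (120, {"tm": 0.5, "complexity": 0.5}),
]
selected = select_best_per_window(oligos, weights, window_size=100)
```

In this example the oligonucleotide at position 0 wins the first window because the heavier "tm" weight dominates its global score.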

Because the last step of the process needs statistical data, Teolenn must be launched twice: once to get the statistical data and once to get the selected oligonucleotides. In the second run, you can skip the steps that precede the selection by using the skip attribute in the design file (see the design file section of the documentation for more information).