Learn Motifs from Unaligned Sequences

This tool finds motifs that are discriminatively enriched in a positive sequence set relative to a negative sequence set. Its main novelty is that it can learn a "Feature Motif Model" (FMM) representation of the motif, which captures dependencies between positions through di-nucleotide features (for example, "G at position 3 and T at position 9"). Mono-nucleotide features are used as well, so the FMM formalism contains the PSSM formalism as a special case. FMMs are displayed as a clear, intuitive logo that highlights the important di-nucleotide features. The height of a feature in the logo is linear in its expected occurrence. For a comprehensive description of the FMM and of this tool (the FMM Motif Finder), see Sharon & Lubliner et al., A Feature-Based Approach to Modeling Protein-DNA Interactions, PLoS Comput Biol, Aug. 2008. Note that the tool can also learn standard PSSM models, without using the FMM formalism.
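To make the idea of mono- and di-nucleotide features concrete, here is a minimal Python sketch. It is not the tool's implementation. It assumes a log-linear model in which each feature that matches a sequence adds its weight to the score, and it normalizes by brute-force enumeration, which is only feasible for very short motifs. All names (`feature_score`, `distribution`, `expected_occurrence`) and all weights are hypothetical, and positions are 0-based.

```python
import itertools
import math

ALPHABET = "ACGT"

def feature_score(seq, mono, pair):
    """Sum the weights of all features that match seq (unnormalized log-score)."""
    score = 0.0
    for (pos, base), weight in mono.items():
        if seq[pos] == base:
            score += weight
    for (pos1, base1, pos2, base2), weight in pair.items():
        if seq[pos1] == base1 and seq[pos2] == base2:
            score += weight
    return score

def distribution(length, mono, pair):
    """Normalize exp(score) over all 4**length sequences (toy sizes only)."""
    seqs = ["".join(s) for s in itertools.product(ALPHABET, repeat=length)]
    weights = [math.exp(feature_score(s, mono, pair)) for s in seqs]
    z = sum(weights)
    return {s: w / z for s, w in zip(seqs, weights)}

def expected_occurrence(dist, predicate):
    """Probability, under dist, that a sequence satisfies predicate."""
    return sum(p for s, p in dist.items() if predicate(s))

# Hypothetical weights for a motif of length 3 (illustration only).
MONO = {(0, "G"): 1.0, (2, "T"): 0.5}   # mono-nucleotide features
PAIR = {(0, "G", 2, "T"): 1.5}          # "G at position 0 and T at position 2"

pssm_like = distribution(3, MONO, {})    # mono features only: positions independent
fmm_like = distribution(3, MONO, PAIR)   # the pair feature couples positions 0 and 2

def has_g(s):
    return s[0] == "G"

def has_t(s):
    return s[2] == "T"

def has_both(s):
    return has_g(s) and has_t(s)

for name, dist in (("mono only", pssm_like), ("mono + pair", fmm_like)):
    joint = expected_occurrence(dist, has_both)
    product = expected_occurrence(dist, has_g) * expected_occurrence(dist, has_t)
    print(f"{name}: P(G0 and T2) = {joint:.3f}, P(G0) * P(T2) = {product:.3f}")
```

With mono-nucleotide features only, the joint probability equals the product of the per-position probabilities, which is the independence assumption of a PSSM. Adding the di-nucleotide feature breaks that equality, which is the kind of dependency the FMM is meant to capture.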

Online usage of this tool is limited to a data size of up to 500 kb. To run on larger sets, you can download our standalone executable and run it on your own machine. The standalone version allows customization of various run parameters.

The software uses WebLogo to generate the PSSM logos, under this license.

