Tag: Software

Sequence Assembly Software

– April 21, 2010

Hi there! Today I would like to share some information and links about the tools generally used for sequence assembly. Soon I am going to post a new article about the current state of the art in visualization, focusing on NGS.

AMOS is a collection of tools and class interfaces for the assembly of DNA sequencing reads.

Bowtie is an ultrafast, memory-efficient short read aligner.

CAP3 is a DNA sequence assembly program.

Consed is a tool for viewing, editing, and finishing sequence assemblies created with phrap.

EagleView is an information-rich genome assembly viewer with data integration capability.

Gap5 is a program from the Staden Package for viewing and editing DNA sequence assemblies, including those built from next-generation sequencing data.

GigaBayes is a short-read SNP and short-INDEL discovery program.

Maq is software that builds mapping assemblies from short reads generated by next-generation sequencing machines.

Maqview is a graphical read alignment viewer.

MOSAIK is a suite of modular programs, including MosaikBuild, MosaikAligner, and MosaikAssembler.

Phrap is a program for assembling shotgun DNA sequence data.

Phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.

PolyPhred is a program that compares fluorescence-based sequences across traces obtained from different individuals to identify heterozygous sites for single nucleotide substitutions.

VAAL is a polymorphism discovery algorithm for short reads.

Velvet is a sequence assembler for very short reads.

Mining Tools

– March 25, 2010

Here you will find some widely used and free (or shareware) data mining tools. Help yourself!

ADaM, Algorithm Development and Mining version 4.0 toolkit

AlphaMiner, open source data mining platform that offers various data mining model building and data cleansing functionality.

Databionic ESOM Tools, a suite of programs for clustering, visualization, and classification with Emergent Self-Organizing Maps (ESOM).

Gnome Data Mining Tools, including apriori, decision trees, and Bayes classifiers.

IBM Intelligent Miner. University scholars can now receive free copies of DB2 UDB and Intelligent Miner for educational or research purposes.

KEEL, includes knowledge extraction algorithms, preprocessing techniques, evolutionary rule learning, genetic fuzzy systems, and more.

KNIME, an extensible open source data mining platform implementing the data pipelining paradigm (based on Eclipse).

Machine Learning in Java (MLJ), an open-source suite of Java tools for research in machine learning.

MiningMart, a graphical tool for data preprocessing and mining on relational databases; supports development, documentation, re-use and exchange of complete KDD processes. Free for non-commercial purposes.

MLC++, a machine learning library in C++.

Kansas State U. port of MLC++: binary (tar.gz) and Linux source.

Orange, C++ components for data mining; includes preprocessing, modeling, and data exploration techniques.

RapidMiner, a leading open-source system for knowledge discovery and data mining.

Rattle, a data mining suite based on the open source statistical language R; includes graphics, clustering, modeling, and more.

StarProbe, Web-based multi-user server available for academic institutions.

TANAGRA, offers a GUI and methods for data access, statistics, feature selection, classification, clustering, visualization, association, and more.

Weka, collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform.

Enjoy!