Recently, I was introduced to an amazing framework for writing pipelines: Nextflow. It has the following features:
- It abstracts a pipeline whose steps can be written in any language, and the pipeline can run on many computing platforms, such as Linux clusters managed by Slurm or PBS, and AWS.
- A pipeline is composed of many processes, whose execution order is determined by the dependencies between each process's input and output channels.
- The Nextflow language is an extension of Groovy, a programming language for the Java virtual machine. Its scripting style reads somewhat like Python, although Groovy's syntax derives from Java.
- It supports parallel processing and caching, so a pipeline can resume from an intermediate process instead of starting over.
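To illustrate the caching-and-resume idea from the last bullet, here is a minimal Python sketch. It is not Nextflow's API or its actual implementation: `task_key` and `CachedRunner` are hypothetical names, and the in-memory dictionary stands in for whatever on-disk storage a real tool would use. The assumption is simply that a task is identified by a hash of its process name and inputs, and a task whose hash has been seen before is skipped.

```python
import hashlib
import json


def task_key(name, inputs):
    """Build a stable hash from the process name and its inputs (hypothetical helper)."""
    payload = json.dumps({"process": name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedRunner:
    """Runs tasks, reusing stored results when the same task is requested again."""

    def __init__(self):
        self.cache = {}      # in-memory stand-in for an on-disk work directory
        self.executed = []   # names of the processes that actually ran

    def run(self, name, func, inputs):
        key = task_key(name, inputs)
        if key not in self.cache:
            # Cache miss: execute the process and remember its result.
            self.cache[key] = func(inputs)
            self.executed.append(name)
        return self.cache[key]


def double_all(values):
    return [v * 2 for v in values]


runner = CachedRunner()
first = runner.run("double", double_all, [1, 2])
second = runner.run("double", double_all, [1, 2])  # same task: served from the cache
total = runner.run("sum", sum, first)
```

After the second call, `runner.executed` still lists `double` only once, which is the behaviour that makes resuming from a middle step cheap: only tasks whose inputs changed need to run again.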
The key idea is that processes are connected through channels, so that the output of one process is handed directly to the input of the next.
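The channel idea can be sketched in plain Python with queues and threads. This is only an analogy under my own assumptions, not Nextflow syntax: `process`, `drain`, `run_pipeline`, and the `DONE` sentinel are hypothetical names. Each "process" waits on its input channel and writes to its output channel, so the order of execution follows from the data dependencies rather than from the order in which the processes are declared.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a channel


def process(func, inbox, outbox):
    """Start a worker that applies func to every item from inbox and puts the result on outbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is DONE:
                outbox.put(DONE)  # propagate end-of-stream downstream
                return
            outbox.put(func(item))

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def drain(channel):
    """Collect items from a channel until the sentinel arrives."""
    items = []
    while True:
        item = channel.get()
        if item is DONE:
            return items
        items.append(item)


def run_pipeline(samples):
    raw, trimmed, counted = queue.Queue(), queue.Queue(), queue.Queue()
    # The downstream step is started first on purpose: it simply blocks
    # until the upstream step puts something on the `trimmed` channel.
    threads = [
        process(len, trimmed, counted),
        process(str.strip, raw, trimmed),
    ]
    for sample in samples:
        raw.put(sample)
    raw.put(DONE)
    result = drain(counted)
    for thread in threads:
        thread.join()
    return result
```

Calling `run_pipeline([" ACGT ", "AC\n"])` strips each string and then counts its length. With one worker per step and first-in-first-out queues, the results come out in the same order as the inputs.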
This seems to be a promising tool for writing pipelines, but learning its language takes some time.
Last modified on 2019-09-19