# The FlowProject

This chapter describes how to set up a complete workflow by implementing a FlowProject.

## Setup and Interface

To implement a more automated workflow, we can subclass a FlowProject:

# project.py
from flow import FlowProject

class Project(FlowProject):
    pass

if __name__ == '__main__':
    Project().main()

Tip

You can generate boilerplate templates like the one above with the $ flow init command. Several different templates are available via the -t/--template option.

Executing this script on the command line gives us access to this project's specific command line interface:

~/my_project $ python project.py
usage: project.py [-h] [-d] {status,next,run,script,submit,exec} ...

Note

You can have multiple implementations of FlowProject that all operate on the same signac project! This may be useful, for example, if you want to implement two very distinct workflows that operate on the same data space. Simply put those in different modules, e.g., project_a.py and project_b.py.

## Defining a workflow

We will reproduce the simple workflow introduced in the previous section by first copying both the greeted() condition function and the hello() operation function into the project.py module. We then use the operation() and the post() decorator functions to specify that the hello() operation function is part of our workflow and that it should only be executed if the greeted() condition is not met.

# project.py
from flow import FlowProject

class Project(FlowProject):
    pass

# Condition function: True once the operation has produced its output file.
def greeted(job):
    return job.isfile('hello.txt')

@Project.operation
@Project.post(greeted)
def hello(job):
    with job:
        with open('hello.txt', 'w') as file:
            file.write('world!\n')

if __name__ == '__main__':
    Project().main()

We can define both pre- and post-conditions, which allow us to define arbitrary workflows as an acyclic graph. An operation is only executed if all pre-conditions are met and at least one post-condition is not met.
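The eligibility rule above can be sketched in plain Python. This is only an illustration of the stated rule, not the library's implementation: the helper `is_eligible`, the `initialized` condition, and the dictionaries standing in for jobs are all hypothetical, and the sketch assumes that at least one post-condition is defined.

```python
def is_eligible(job, pre_conditions, post_conditions):
    """Return True if an operation should run for the given job.

    Hypothetical helper for illustration only: all pre-conditions must be
    met, and at least one post-condition must still be unmet.
    """
    all_pre_met = all(cond(job) for cond in pre_conditions)
    any_post_unmet = any(not cond(job) for cond in post_conditions)
    return all_pre_met and any_post_unmet


# Plain dicts stand in for jobs here (an assumption made for this sketch).
def initialized(job):
    return job.get("initialized", False)


def greeted(job):
    return job.get("greeted", False)


fresh_job = {"initialized": True, "greeted": False}
done_job = {"initialized": True, "greeted": True}

print(is_eligible(fresh_job, [initialized], [greeted]))  # True
print(is_eligible(done_job, [initialized], [greeted]))   # False
```

Once the post-condition holds for a job, the operation is no longer eligible, which is what prevents it from being executed twice.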

Tip

Place cheap conditions before expensive ones, because conditions are evaluated lazily. For example, given two pre-conditions, the following order of definition is preferable:

@Project.operation
@Project.pre(cheap_condition)
@Project.pre(expensive_condition)
def hello(job):
    pass

The same holds for post-conditions.
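To see why the order matters, here is a small plain-Python sketch of lazy evaluation. It assumes that conditions are checked in order and that checking stops at the first unmet one; the helper `all_met` and the `calls` list are hypothetical and exist only to record which conditions were actually evaluated.

```python
calls = []  # records which condition functions were actually called


def cheap_condition(job):
    calls.append("cheap")
    return False  # not met, so later conditions need not be checked


def expensive_condition(job):
    calls.append("expensive")
    return True


def all_met(job, conditions):
    # all() consumes the generator lazily and stops at the first False.
    return all(cond(job) for cond in conditions)


result = all_met({}, [cheap_condition, expensive_condition])
print(result)  # False
print(calls)   # ['cheap']
```

Because the cheap condition fails first, the expensive one is never called. With the order reversed, both would be evaluated for the same outcome.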

We can then execute this workflow with:

~/my_project $ python project.py run

## Generating Execution Scripts

Instead of executing operations directly, we can also create a script for execution. If we have any pending operations, such a script might look like this:

~/my_project $ python project.py script

set -e
set -u

# Operation 'hello' for job '14fb5d016557165019abaac200785048':
# Operation 'hello' for job '2af7905ebe91ada597a8d4bb91a1c0fc':
# Operation 'hello' for job '42b7b4f2921788ea14dac5566e6f06d0':
# Operation 'hello' for job '9bfd29df07674bc4aa960cf661b5acd2':
# Operation 'hello' for job '9f8a8e5ba8c70c774d410a9107e2a32b':

These scripts can be used for the execution of operations directly, or they could be submitted to a cluster environment for remote execution. For more information about how to submit operations for execution to a cluster environment, see the Cluster Submission chapter.

This script is generated from a default jinja2 template, which is shipped with the package. We can extend this default template or write our own to customize the script generation process.

Here is an example of such a template, which would generate essentially the same output:

cd {{ project.config.project_dir }}

{% for operation in operations %}
{{ operation.cmd }}
{% endfor %}

Note

Unlike the default template, this example template would not allow for parallel execution.
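The loop in the example template amounts to writing one command per line, in order. A rough plain-Python equivalent is sketched below. The function `render_script`, the project path, and the `echo` commands are placeholders invented for this illustration; they do not reflect the commands that the package actually generates.

```python
def render_script(project_dir, operations):
    """Build a script that changes directory, then lists one command per line.

    Hypothetical stand-in for rendering the example template.
    """
    lines = [f"cd {project_dir}", ""]
    lines.extend(op["cmd"] for op in operations)
    return "\n".join(lines) + "\n"


# Placeholder operations; real command strings would come from the project.
operations = [
    {"cmd": "echo 'first placeholder command'"},
    {"cmd": "echo 'second placeholder command'"},
]

script = render_script("/path/to/my_project", operations)
print(script)
```

Since a shell runs these lines one after another, each command starts only after the previous one has finished, which is why this form does not run operations in parallel.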

Check out the next section for a guide on how to submit operations to a cluster environment.