Manage Environments

The signac-flow package uses environment profiles to adapt the submission process to the local environment. This is necessary because different environments provide different resources and different options for submitting operations to those resources. The basic options are always the same, but there may be subtle differences depending on where you submit your operations.

Tip

If you are running on a high-performance supercomputer, add the following line to your project.py module to import packaged profiles:

import flow.environments

Please see Supported Environments for more information.

How to Use Environments

Environments are defined by subclassing the ComputeEnvironment class. ComputeEnvironment uses a metaclass that automatically registers every subclass globally when it is defined. This makes it possible to use an environment simply by defining it or by importing it from a different module. The flow.get_environment() function goes through all defined ComputeEnvironment classes and returns the one whose is_present() class method returns True.
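The registration mechanism described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the package's code: the names EnvironmentType, Environment, LaptopEnvironment and ClusterEnvironment are hypothetical, and the lookup order (definition order, first match wins) is an assumption made for the example.

```python
from collections import OrderedDict


class EnvironmentType(type):
    """Metaclass that records every subclass created with it (hypothetical name)."""

    registry = OrderedDict()

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # The base class is declared without explicit bases, so it is skipped;
        # every subclass is registered at the moment it is defined.
        if bases:
            EnvironmentType.registry[name] = cls


class Environment(metaclass=EnvironmentType):
    """Base class; subclasses override is_present()."""

    @classmethod
    def is_present(cls):
        return False


class LaptopEnvironment(Environment):
    @classmethod
    def is_present(cls):
        return True


class ClusterEnvironment(Environment):
    pass  # inherits is_present(), which returns False


def get_environment():
    """Return the first registered environment that reports itself present."""
    for env in EnvironmentType.registry.values():
        if env.is_present():
            return env
    return None


print(list(EnvironmentType.registry))
print(get_environment().__name__)
```

Because registration happens when the class statement runs, importing a module that defines an environment is enough to make that environment discoverable.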

Packaged Environments

The package comes with a few default environments that are always available and are designed for specific schedulers. These include the DefaultTorqueEnvironment and the DefaultSlurmEnvironment. This means that if you are working in an environment with a TORQUE or SLURM scheduler, you should be able to submit to the cluster immediately.

In addition, signac-flow comes with some environments tailored to specific compute clusters, which are defined in the flow.environments module. These environments are not automatically available. Instead, you need to explicitly import the flow.environments module.

For a full list of all packaged environments, please see Supported Environments.

Defining New Environments

To implement a new environment, create a new class that inherits from flow.ComputeEnvironment. You will need to define a detection algorithm for your environment. By default, we use a regular expression that matches the return value of socket.getfqdn().

These are the steps usually required to define a new environment:

  1. Subclass flow.ComputeEnvironment.
  2. Determine a regular expression that matches the output of socket.getfqdn().
  3. Create a template and specify the template name as the template class variable.
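Before committing to a pattern for step 2, it can help to try it against a few host names. The sketch below is only an illustration: the helper matches_host and the pattern are hypothetical, and it assumes the pattern is applied with re.match, that is, anchored at the start of the host name.

```python
import re
import socket

# Hypothetical pattern for illustration; replace it with one for your cluster.
HOSTNAME_PATTERN = r'.*\.mycluster\.university\.edu$'


def matches_host(pattern, hostname=None):
    """Return True if the pattern matches the given (or the local) host name."""
    if hostname is None:
        # Fall back to the fully qualified name of the current machine.
        hostname = socket.getfqdn()
    return re.match(pattern, hostname) is not None


print(matches_host(HOSTNAME_PATTERN, 'login.mycluster.university.edu'))  # True
print(matches_host(HOSTNAME_PATTERN, 'laptop.local'))                    # False
```

Calling matches_host(HOSTNAME_PATTERN) without a host name on a login node of the target cluster shows whether the pattern would detect that machine.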

This is an example of a typical environment class definition:

import flow


class MyUniversityCluster(flow.DefaultTorqueEnvironment):

    # Matches names like login.mycluster.university.edu
    hostname_pattern = r'.*\.mycluster\.university\.edu$'
    template = 'mycluster.myuniversity.sh'

Then, add the mycluster.myuniversity.sh template script to the templates/ directory within your project root directory.

Important

The new environment will be automatically registered and used as long as it is either defined in the same module as your FlowProject class or defined in a module that is imported into that module.

As an example of how to write a submission script template, the following would be a viable template to define the header for a SLURM scheduler:

{% extends "base_script.sh" %}
{% block header %}
#!/bin/bash
#SBATCH --job-name="{{ id }}"
#SBATCH --partition={{ partition }}
#SBATCH -t {{ walltime|format_timedelta }}
{% block tasks %}
#SBATCH --ntasks={{ np_global }}
{% endblock %}
{% endblock %}

All templates shipped with the package are located in the flow/templates/ directory of the package source code.
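The walltime|format_timedelta expression in the SLURM header above turns a duration into a string the scheduler can read. The following sketch is a stand-in, not the package's filter: format_walltime is a hypothetical name, and it assumes the hours:minutes:seconds form, which is one of the time formats SLURM accepts for the -t option.

```python
import datetime


def format_walltime(delta):
    """Format a timedelta as HH:MM:SS (hypothetical stand-in for the filter)."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Hours are not wrapped at 24, so two days become 48:00:00.
    return '{:02d}:{:02d}:{:02d}'.format(hours, minutes, seconds)


print(format_walltime(datetime.timedelta(hours=12)))             # 12:00:00
print(format_walltime(datetime.timedelta(hours=1, minutes=30)))  # 01:30:00
print(format_walltime(datetime.timedelta(days=2)))               # 48:00:00
```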

Contributing Environments to the Package

Users are highly encouraged to contribute environment profiles that they have developed for their local environments. To contribute an environment, either email it to the package maintainers (see the README for contact information) or create a pull request.