Changelog

The signac-flow package follows semantic versioning. The numbers in brackets denote the related GitHub issue and/or pull request.

Version 0.15

[0.15.0] – 2021-xx-xx

Added

  • Add support for aggregation (operations acting on multiple jobs) via flow.aggregator (#464, #516, #542).
  • Add official support for Andes cluster (#500).
  • Decorator for setting directives while registering operation function FlowProject.operation.with_directives (#309, #502).
  • Add new flow command flow template create for automatic creation of custom templates (#520, #534).
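To illustrate what "operations acting on multiple jobs" means, the following sketch groups job identifiers into fixed-size aggregates and calls one operation per aggregate. It uses only the standard library. The helper groups_of, the operation average_length, and the job ids are hypothetical and do not reproduce the flow.aggregator implementation.

```python
from itertools import islice


def groups_of(items, size):
    """Yield consecutive tuples of at most `size` items (hypothetical helper)."""
    iterator = iter(items)
    while True:
        chunk = tuple(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def average_length(*jobs):
    """Stand-in for an operation that receives several jobs at once."""
    return sum(len(job) for job in jobs) / len(jobs)


job_ids = ["a1", "b22", "c333", "d4444", "e55555"]  # made-up job ids
aggregates = list(groups_of(job_ids, 2))

# One operation call per aggregate instead of one call per job.
results = [average_length(*aggregate) for aggregate in aggregates]
```

The last aggregate is smaller than the others because five jobs do not divide evenly into groups of two.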

Changed

  • Jinja templates are indented for easier reading (#461, #495).
  • flow.directives is deprecated in favor of flow.FlowProject.operation.with_directives (#309, #502).
  • All environments require a scheduler in order to submit, even in pretend mode (#533).
  • Submitting in pretend mode will show additional scheduler command information (#533).

Fixed

  • Errors raised during submission were not being shown to users (#517, #518).
  • Fixed dependency flag for SLURM submissions (#530).

Version 0.14

[0.14.0] – 2021-04-27

Added

  • Documentation for all directives (#480).
  • Defined validators for the fork directive (#480).
  • Submission summary now appears in FlowProject status output, showing the number of queued, running, and unknown statuses (#472, #488).
  • Status overview now shows the number of jobs with incomplete operations and totals for the label overviews (#481, #501).

Changed

  • Renamed TorqueEnvironment and related classes to PBSEnvironment (#388, #491).
  • LSF and SLURM schedulers will appear to be present if the respective commands bjobs -V or sbatch --version exit with a non-zero error code (#498).
  • Only known JobStatus values will be written to the project document, to save space and writing time (#503).

Fixed

  • Strictly enforce that operation functions cannot be used as condition functions (and vice-versa) and prevent the registration of two operations with the same name (#496).
  • Changed default value of status_parallelization to none, to avoid bugs in user code caused by thread parallelism and overhead in process parallelism (#486).
  • Memory directives are converted to an integer number of gigabytes or megabytes in submission scripts (#482, #484).
  • Fixed behavior of --only-incomplete-operations (#481, #501).
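As a rough illustration of converting a memory request to an integer number of gigabytes or megabytes, consider the sketch below. The function name format_memory, the "G"/"M" suffixes, and the factor of 1024 are assumptions made for the example, not the behavior of any particular scheduler template.

```python
def format_memory(gigabytes):
    """Return an integer memory string such as '4G' or '512M' (illustrative only)."""
    if gigabytes <= 0:
        raise ValueError("memory must be positive")
    if float(gigabytes).is_integer():
        return f"{int(gigabytes)}G"
    # Fall back to whole megabytes for fractional gigabyte requests.
    # The 1024 MB-per-GB factor is an assumption of this sketch.
    return f"{round(gigabytes * 1024)}M"


whole = format_memory(4)
fractional = format_memory(0.5)
```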

Removed

  • Removed FlowProject.add_operation (#479, #487).
  • Removed deprecated --walltime argument (#478).
  • Removed deprecated flow.run interface (#478).
  • Removed deprecated FlowProject.export_job_statuses (#478).
  • Removed deprecated script feature, use submit --pretend instead (#478).
  • Removed deprecated CPUEnvironment, GPUEnvironment classes (#478).

Version 0.13

[0.13.0] – 2021-03-16

Added

  • Add official support for Bridges-2 cluster (#441).
  • Add support for memory requests via directives (#258, #466).
  • Add support for walltime requests via directives, deprecated --walltime argument to submit (#240, #476).

Fixed

  • Support for multi-line @flow.cmd operations (#451, #453).
  • FlowProject status shows labels and correct number of jobs for projects with zero operations (#454, #460).

Removed

  • Removed public API of deprecated class JobOperation (#445).
  • Removed public API of deprecated methods eligible and complete of BaseFlowOperation and FlowGroup (#445).
  • Removed configuration option use_buffered_mode (#445).
  • Removed public API of script, next_operations and submit_operations of FlowProject (#445).
  • Removed support for decommissioned Bridges cluster (#441).
  • Removed support for memory command line argument in submit (#466).

Version 0.12

[0.12.0] – 2021-01-30

Added

  • Code is formatted with black and isort pre-commit hooks (#365).
  • Add official support for Python version 3.9 (#365).
  • Documentation has been added for all public classes and methods (#387, #389).
  • Added internal support for aggregates of jobs (#334, #348, #351, #364, #383, #390, #415, #422, #430).
  • Added code coverage to continuous integration (#405).

Changed

  • Command line interface always uses --job-id instead of --jobid (#363, #386).
  • CPUEnvironment and GPUEnvironment classes are deprecated (#381).
  • Docstrings are now written in numpydoc style (#392).
  • Default environment for the University of Minnesota Mangi cluster changed from Torque to SLURM (#393).
  • Run commands are evaluated lazily (#70, #396).
  • Deprecated method export_job_statuses (#402).
  • Improved internal caching of scheduler status (#410).
  • Refactored status fetching code (#368, #417).
  • Optimization: Directives are no longer deep-copied (#420, #421).
  • The use_buffered_mode config option is deprecated. Buffering is always internally enabled (#425).
  • Evaluate directives when called instead of when defined (#398, #402).
  • Various internal refactorings and optimizations (#371, #373, #374, #375, #376, #377, #378, #379, #380, #400, #410, #416, #423, #426).
  • Scheduler is now an abstract base class (#426).
  • flow.scheduling.fakescheduler has been renamed to flow.scheduling.fake_scheduler (#426).
  • Arguments to submit have been changed for all scheduler classes (#426).
  • Python 3.6 is only tested with oldest dependencies (#436).
  • Drop support for tqdm versions older than 4.48.1 (#436, #440).
  • Drop support for Jinja2 versions older than 2.10.0 (#436).
  • Deprecated operations.py method flow.run (#221, #463, #467).
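The idea of evaluating directives when called instead of when defined can be sketched in a few lines. The function evaluate_directives and the dictionaries standing in for jobs are hypothetical; the sketch only shows that callable values are resolved per job at call time.

```python
def evaluate_directives(directives, job):
    """Resolve callable directive values for one job at call time."""
    return {
        key: value(job) if callable(value) else value
        for key, value in directives.items()
    }


# "np" depends on the job, "walltime" is a plain constant.
directives = {"np": lambda job: job["n_particles"] // 1000, "walltime": 2}

small_job = {"n_particles": 4000}   # plain dicts standing in for jobs
large_job = {"n_particles": 16000}

small = evaluate_directives(directives, small_job)
large = evaluate_directives(directives, large_job)
```

Because nothing is computed when the directives dictionary is defined, the same definition yields different values for different jobs.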

Fixed

  • Ensure that directives are always evaluated before running or submitting (#408, #409).
  • Cache the fully qualified domain name during environment detection to fix a performance issue on macOS (#339, #394).
  • Ensure that next CLI command displays eligible jobs for the exact operation name provided (#443).
  • Display warning when a non-existing operation/group is passed to run, submit, or next commands (#291, #442).

Removed

  • Removed the deprecated method flow.util.misc.write_human_readable_statepoints (#397).
  • Removed the deprecated argument --no-parallelize (#424).
  • Removed the deprecated env argument from submission methods (#424).
  • flow.render_status.Renderer class has been removed. FlowProject.print_status no longer returns the renderer (#426).
  • Removed deprecated status.py module (#426).
  • Removed the --test argument from FlowProject.submit (#439).

Version 0.11

[0.11.0] – 2020-10-09

Added

  • Added classes _Directives and _Directive that serve as a smart mapping for directives specified by the environment or user (#265, #283).
  • Added support for pre-commit hooks (#333).
  • Add environment profile for University of Minnesota, Minnesota Supercomputing Institute, Mangi supercomputer (#353).

Changed

  • Make FlowCondition class private (#307, #315).
  • Deprecate JobOperation class, make SubmissionJobOperation a private class and deprecate the following methods of FlowProject: script, run_operations, submit_operations, next_operations (#313).
  • Deprecate the following methods: FlowGroup.eligible, FlowGroup.complete, BaseFlowOperation.eligible, BaseFlowOperation.complete (#337).

Fixed

  • Serial execution on Summit correctly counts total node requirements (#342).
  • Fixed performance regression in job submission in large workspaces (#354).

Removed

  • Drop support for Python 3.5 (#305). The signac project will follow the NEP 29 deprecation policy going forward.
  • Remove the deprecated methods always, make_bundles, and JobOperation.get_id (#312).

Version 0.10

[0.10.1] – 2020-08-20

Fixed

  • Fix issue with the submission of bundled operations on cluster environments that do not allow slashes (‘/’) in cluster scheduler job names (#343).

[0.10.0] – 2020-06-27

Added

  • Add FlowGroup (one or more operations can be grouped within an execution environment) (#114).
  • Add official support for University of Michigan Great Lakes cluster (#185).
  • Add official support for Bridges AI cluster (#222).
  • Add IgnoreConditions option for submit(), run() and script() (#38, #209).
  • Add pytest support for testing framework (#227, #232).
  • Add markdown and html format support for print_status() (#113, #163).
  • Add memory flag option for default Slurm scheduler (#256).
  • Add optional environment variable to specify submission script separator (#262).
  • Add status_parallelization configuration to specify the parallelization used for fetching status (#264, #271).

Changed

  • Raises ValueError when an operation function is passed to FlowProject.pre() and FlowProject.post(), or a non-operation function passed to FlowProject.pre.after() (#248, #249).
  • The option to provide the env argument to submit and submit_operations has been deprecated (#245).
  • The command line option --cmd for script has been deprecated and will trigger a DeprecationWarning upon use until removed (#243, #218).
  • Raises ValueError when --job-name is passed by the user because that interferes with status checking (#164, #241).
  • Submitting with --memory no longer assumes a unit of gigabytes on Bridges and Comet clusters (#257).
  • Buffering is enabled by default, improving the performance of status checks (#273).
  • Deprecate the use of no_parallelize argument while printing status (#264, #271).
  • Submission via the command-line interface now calls the FlowProject.submit function instead of bypassing it for FlowProject.submit_operations (#238, #286).
  • Updated Great Lakes GPU request syntax (#299).

Fixed

  • Ensure that label names are used when displaying status (#263).
  • Fix node counting for large resource sets on Summit (#294).

Removed

  • Removed ENVIRONMENT global variable in the flow.environment module (#245).
  • Removed vendored tqdm module and replaced it with a requirement (#247).

Version 0.9

[0.9.0] – 2020-01-09

Added

  • Add official support for Python version 3.8 (#190, #210).
  • Add descriptive error message when tag is not set and cannot be autogenerated for conditions (#195).
  • Add “fork” directive to enforce the execution of operations within a subprocess (#159).
  • Operation graph detection based on function comparison (#178).
  • Exceptions raised during operations always show tracebacks of user code (#169, #171).

Changed

  • Raise a warning when a condition’s tag is not set and raise an error if this occurs during graph detection (#195).
  • Raise errors if a forked process or @cmd operation returns a non-zero exit code (#170, #172).
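A minimal sketch of turning a non-zero exit code of a child process into an error, using only the standard library. The helper run_forked is hypothetical and is not how signac-flow itself launches forked operations.

```python
import subprocess
import sys


def run_forked(code):
    """Run Python source in a child process; raise if it exits non-zero."""
    # check=True makes subprocess.run raise CalledProcessError on failure.
    subprocess.run([sys.executable, "-c", code], check=True)


try:
    run_forked("import sys; sys.exit(3)")
except subprocess.CalledProcessError as error:
    failed_with = error.returncode
else:
    failed_with = 0
```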

Removed

  • Drop support for Python version 2.7 (#157, #158, #201).
  • The “always” condition has been deprecated and will trigger a DeprecationWarning upon use until removed (#179).
  • Removed deprecated UnknownEnvironment in favor of StandardEnvironment (#204).
  • Removed support for decommissioned INCITE Titan and Eos computers (#204).
  • Removed support for the legacy Python-based submission script generation (#200).
  • Removed legacy compatibility layers for Python 2, signac < 1.0, and soft dependencies (#205).
  • Removed deprecated support for implied operation names with the run command (#205).

Version 0.8

[0.8.0] – 2019-09-01

Added

  • Add feature for integrated profiling of status updates (status --profile) to aid with the optimization of a FlowProject implementation (#107, #110).
  • The status view is generated with Jinja2 templates and thus more easily customizable (#67, #111).
  • Automatically show an overview of the number of eligible jobs for each operation in status view (#134).
  • Allow the provision of multiple operation-functions to the pre.after and *.copy_from conditions (#120).
  • Add option to specify the operation execution order (#121).
  • Add a testing module to easily initialize a test project (#130).
  • Enable option to always show the full traceback with show_traceback = on within the [flow] section of the signac configuration (#61, #144).
  • Add full launcher support for job submission on XSEDE Stampede2 for large parallel single processor jobs (#85, #91).

Fixed

  • Both the nranks and omp_num_threads directives properly support callables (#118).
  • Show submission error messages in combination with a TORQUE scheduler (#103, #104).
  • Fix issue that caused the “Fetching operation status” progress bar to be inaccurate (#108).
  • Fix erroneous line in the torque submission template (#126).
  • Ensure default parameter range detection in status printing succeeds for nested state points (#154).
  • Fix issue with the resource set calculation on INCITE Summit (#101).

Changed

  • Packaged environments are now available by default. Set import_packaged_environments = off within the [flow] section of the signac configuration to revert to previous behavior.

  • The following methods of the FlowProject class have been deprecated and will trigger a DeprecationWarning upon use until their removal:

    • classify (use labels() instead)
    • next_operation (use next_operations() instead)
    • export_job_stati (replaced by export_job_statuses)
    • eligible_for_submission (removed without replacement)
    • update_aliases (removed without replacement)
  • The support for Python version 2.7 is deprecated.

Removed

  • The support for Python version 3.4 has been dropped.
  • Support for signac version 0.9 has been dropped.

Version 0.7

[0.7.1] – 2019-03-25

Added

  • Add function to automatically print all varying state point parameters in the detailed status view triggered by providing option -p/--parameters without arguments (#19, #87).
  • Add clear environment notification when submitting job scripts (#43, #88).

Fixed

  • Fix issue where the scheduler status of job-operations would not be properly updated for ineligible operations (#96).

Fixed (compute environments)

  • Fix issue with the TORQUE scheduler that occurred when there was no job scheduled at all on the system (for any user) (#92, #93).

Changed

  • The performance of status updates has been significantly improved (up to a factor of 1,000 for large data spaces) by applying a more efficient caching strategy (#94).

[0.7.0] – 2019-03-14

Added

  • Add legend explaining the scheduler-related symbols to the detailed status view (#68).
  • Allow the specification of the number of tasks per resource set and additional jsrun arguments for Summit scripts.

Fixed (general)

  • Fixes issue where callable cmd-directives were not evaluated (#47).
  • Fixes issue where the source file of wrapped functions was not determined correctly (#55).
  • Fix a Python 2.7 incompatibility and another unrelated issue with the TORQUE scheduler driver (#54, #81).
  • Fixes issue where providing the wrong argument type to Project.submit() would go undetected and lead to unexpected behavior (#58).
  • Fixes issue where using the buffered mode would lead to confusing error messages when condition-functions would raise an AttributeError exception.
  • Fixes issue with erroneous unused-directive-keys-warning.

Fixed (compute environments)

  • Fixes issues with the Summit environment resource set calculation for parallel operations under specific conditions (#63).
  • Fix the node size specified in the template for the ORNL Eos system (#77).
  • Fixes issue with a missing --gres directive when using the GPU-shared partition on the XSEDE Bridges system (#59).
  • Fixed University of Michigan Flux hostname pattern to ignore the Flux Hadoop cluster (#82).
  • Remove the Ascent environment (host decommissioned).

Note: The official support for Python 3.4 will be dropped beginning with version 0.8.0.

Version 0.6

Major changes

  1. The generation of execution and submission scripts is now based on the jinja2 templating system.
  2. The new decorator API for the definition of a FlowProject class greatly reduces the amount of boilerplate code needed to implement FlowProjects. It also removes the necessity to have at least two modules, e.g., one project.py and one operations.py module.
  3. Serial execution is now the default for all project sub commands, that includes run, script, and submit. Parallel execution must be explicitly enabled with the --parallel option.
  4. The run command executes all eligible operations, that means you don’t have to run the command multiple times to “cycle” through all pending operations. Accidental infinite loops are automatically avoided.
  5. Execution scripts generated with the script option are always bundled. The previous behavior, where the script command would print multiple scripts to screen unless the --bundle option was provided did not make much sense.
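The "run until nothing is pending" behavior described in item 4 can be sketched as a loop over preconditions and postconditions. Everything here is hypothetical: the tuple layout, the function run_until_done, and in particular the max_passes cap, which is only one simple way to guard against infinite loops and is not necessarily the mechanism flow uses.

```python
def run_until_done(job, operations, max_passes=10):
    """Repeatedly execute eligible operations until none remain.

    Each operation is a (name, pre, post, func) tuple; an operation is
    eligible when pre(job) is true and post(job) is still false.
    """
    executed = []
    for _ in range(max_passes):
        eligible = [op for op in operations if op[1](job) and not op[2](job)]
        if not eligible:
            return executed
        for name, _pre, _post, func in eligible:
            func(job)
            executed.append(name)
    raise RuntimeError("operations did not converge; possible infinite loop")


job = {}  # plain dict standing in for a job
operations = [
    ("analyze", lambda j: "data" in j, lambda j: "result" in j,
     lambda j: j.update(result=sum(j["data"]))),
    ("simulate", lambda j: True, lambda j: "data" in j,
     lambda j: j.update(data=[1, 2, 3])),
]
order = run_until_done(job, operations)
```

A single call runs simulate first and analyze second, even though analyze is listed first, because analyze only becomes eligible once its precondition is met.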

See the full changelog below for detailed information on all changes.

How to migrate existing projects

If your project runs with flow 0.5 without any DeprecationWarnings (that means no messages when running Python with the -W flag), then you don’t have to do anything. Version 0.6 is mostly backwards compatible with 0.5, with the exception of custom script templating.

Since 0.6 uses jinja2 to generate execution and submission scripts, the previous method of generating custom scripts by overloading the FlowProject.write_script*() methods is now deprecated. If you overloaded any of these functions, the new templating system is disabled: flow will fall back to the old templating system, and you won’t be able to use jinja2 templates.

If you decide to migrate to the new API, these are the steps you will most likely have to take:

  1. Remove all write_script*() methods and replace them with a custom template script, e.g., templates/script.sh within your project root directory.
  2. Optionally, use the new decorator API instead of FlowProject.add_operation to add operations to your FlowProject.
  3. Optionally, use the new decorator API to define label functions for your FlowProject.
  4. Execute your project with the Python -W option to make DeprecationWarnings visible and address all issues.
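DeprecationWarnings can also be surfaced programmatically with the standard warnings module, which may be convenient in a test suite. In the sketch below, legacy_call and its message are hypothetical stand-ins for project code that still uses a deprecated API.

```python
import warnings


def legacy_call():
    """Hypothetical function that still relies on a deprecated API."""
    warnings.warn("write_script() is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


# Record every warning, including DeprecationWarning, regardless of the
# interpreter's default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_call()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
```

On the command line, a comparable effect is obtained with an invocation such as python -W default project.py run.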

We recommend going through the tutorial on signac-docs to learn how to best take advantage of flow 0.6.

[0.6.4] – 2018-12-28

  • Add the @with_job decorator that allows the definition of operations to take place within the job context. Works with @cmd.
  • Add the not_ condition prefix to negate condition functions.
  • Add the false condition prefix as analogue to the true condition prefix.
  • Add support for the Summit supercomputer (U.S. DOE, Oak Ridge National Laboratory) and Ascent testing cluster.
  • Add support for the IBM LSF scheduler.
  • Add warning about explicitly set, but unused directives during submission.
  • Add official support for Python version 3.7.
  • Fix issue where the status sub-command ignored the --show-traceback option.
  • Fix SLURM scheduler driver to show full error message in case that submission with squeue failed.
  • Better specification of (optional) dependencies in setup.py and requirements.txt.
  • Overall revision of all cluster submission templates; improved structure and abstraction of logic.
  • The serialization of operations was improved to optimize execution speed for local runs.
  • The evaluation of preconditions and postconditions was optimized for optimally lazy evaluation: cheaper conditions should be placed above more expensive conditions for maximal performance.
  • When gathering operations, signac will automatically use the buffered mode when config value ‘flow.use_buffered_mode’ is set to True (requires signac >= 0.9.3).
  • Improved documentation for developers and contributors.
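The effect of a negating condition prefix can be sketched with a small wrapper. The standalone function not_ below and the condition converged are illustrative assumptions; they show the concept only and are not the library's implementation.

```python
def not_(condition):
    """Return a condition that is true exactly when `condition` is false."""
    def negated(job):
        return not condition(job)
    return negated


def converged(job):
    """Hypothetical condition on a dict standing in for a job."""
    return job.get("residual", 1.0) < 1e-6


needs_more_steps = not_(converged)

finished_job = {"residual": 1e-9}
fresh_job = {}
```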

[0.6.3] – 2018-08-22

  • Fix issue related to dynamic data spaces, that means data spaces where jobs are either added, removed, or changed during the execution of the workflow. Specifically, flow will now execute operations also for jobs that were added during execution.
  • Fix issue where command line options would be ignored when provided before the sub-command.
  • Fix issue where the table symbols in the --stack --pretty view were swapped.

[0.6.2] – 2018-08-11

  • Increase performance of condition evaluation (switch from eager to lazy evaluation). Speeds up detailed status update and run/script/submit sub commands.
  • Fix issue with the detailed status update failing on older Python versions (#29).
  • Fix issue with the XSEDE Bridges template in combination with GPU operations.
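The difference between eager and lazy condition evaluation can be demonstrated by counting which conditions are actually called. The helper make_condition and the condition names are hypothetical; the sketch relies only on the short-circuiting behavior of the built-in all().

```python
calls = []


def make_condition(name, value):
    """Build a condition that records each time it is evaluated."""
    def condition(job):
        calls.append(name)
        return value
    return condition


conditions = [
    make_condition("cheap", False),
    make_condition("expensive", True),
]

# Eager: the list comprehension evaluates every condition first.
eager_result = all([condition({}) for condition in conditions])
eager_calls = list(calls)

calls.clear()

# Lazy: the generator lets all() stop at the first false condition.
lazy_result = all(condition({}) for condition in conditions)
lazy_calls = list(calls)
```

Both variants return the same result, but the lazy one never calls the expensive condition once the cheap one has failed.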

[0.6.1] – 2018-07-01

  • Add the -v/--verbosity and --show-traceback option to the project interface, which allows for more fine-grained control over the message verbosity. The --debug option is now equivalent to -vv --show-traceback.
  • The message verbosity of the project class was overall reduced.
  • Global options (including --debug and --verbose) can be used at any place within the project command and no longer need to be placed before the sub-command, e.g., the following commands are equivalent: $ python project.py run --debug and $ python project.py --debug run.
  • Implement the -p/--parallel option for the project run command.
  • Use cloudpickle when encountering pickling issues during parallel execution (when installed).
  • Implement the status --ignore-errors option.
  • Handle changes to the project data space during running execution, e.g., removed jobs.
  • Print the operation.cmd attribute to screen in run --pretend mode, not repr(operation).
  • Show progressbar while gathering pending operations.
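One way to accept a global option both before and after a sub-command with the standard argparse module is sketched below. This is an assumption about how such behavior could be built, not a description of the project interface's actual parser; build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser whose --debug flag works before or after the sub-command."""
    # Shared options for sub-commands. SUPPRESS keeps a sub-command from
    # resetting a value that was already given before the sub-command.
    shared = argparse.ArgumentParser(add_help=False)
    shared.add_argument("--debug", action="store_true", default=argparse.SUPPRESS)

    parser = argparse.ArgumentParser(prog="project.py")
    parser.add_argument("--debug", action="store_true")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("run", parents=[shared])
    return parser


parser = build_parser()
before = parser.parse_args(["--debug", "run"])
after = parser.parse_args(["run", "--debug"])
neither = parser.parse_args(["run"])
```

Without default=argparse.SUPPRESS on the shared option, the sub-command's default could overwrite a --debug flag given before the sub-command.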

[0.6.0] – 2018-05-24

Major updates and new features

  • Use jinja2 as templating engine for the generation of execution and submission scripts.
  • Add decorator API for the definition of FlowProject operations and label functions.
  • Revise the status view to render on a per job-operation basis, not on a per job basis.
  • The <project> run function executes all pending operations, not just the next pending ones.
  • The <project> script function no longer supports explicit bundling, all operations are bundled by default.
  • The default execution mode for script and submission script bundling is serial, not parallel.
  • Add the operations directive parameter, which provides a more generalized interface to specify required resources or other execution related metadata.
  • Add support for XSEDE Stampede2.
  • Add simple-scheduler, for local testing of scheduled workflows.
  • Allow the override of the detected environment with the SIGNAC_FLOW_ENVIRONMENT variable.
  • The $ flow init command initializes the signac project if no project is found.

API changes (breaking)

  • The FlowProject.run() method arguments were changed [1]; the old API is better supported by the new FlowProject.run_operations() function.
  • The FlowProject.submit() and .submit_operations() method arguments were changed [1].
  • The JobOperation constructor arguments were changed; the old API is still supported.

API changes (non-breaking)

  • Unify the job and operation selection API for the run/script/submit commands.
  • Add FlowProject.operation() decorator function.
  • Add FlowProject.label() decorator function.
  • The FlowProject.write_human_readable_statepoints() method is deprecated.
  • All FlowProject methods relating to the old templating system are deprecated, that includes all write_script*() methods.
  • Add flow.cmd decorator function.
  • Add flow.directives decorator function.

[1] A reasonable attempt to support legacy API use is made, but may fail under some circumstances.

Version 0.5

[0.5.6] – 2018-02-22

  • Fix issue where operations with spaces in their name would not be accepted by the SLURM scheduler.
  • Add environment profile for XSEDE Bridges.
  • Update the environment profile for XSEDE comet to use the shared queue by default and provide options to specify the memory allocation.
  • Improve performance of the project update status function.

[0.5.5] – 2017-10-05

  • Fix issue with the SLURM scheduler, where the queue status could not be parsed.

[0.5.4] – 2017-08-01

  • Fix issue with <project> run, where operation commands consist of multiple shell commands.
  • Fix issue where the <project> status output showed negative values for the number of lines omitted (issue #12).
  • Raise error when trying to provide a timeout for <project> run in serial execution in combination with Python 2.7; this mode of execution is not supported for Python versions 2.7.x.
  • Enforce that the <project> status --overview-max-lines (-m) argument is positive.

[0.5.3] – 2017-07-18

  • Fix issue where the return value of FlowProject.next_operation() is ignored in combination with the <project> submit / run / script interface.

[0.5.2] – 2017-07-12

  • Fix bug in detailed status output in combination with unhashable types.
  • Do not fork when executing only a single operation with flow.run().
  • Run all next operations for each job with flow.run() instead of only one of the next operations.
  • Gather all next operations when submitting, instead of only one of the next operations for each job.

[0.5.1] – 2017-06-08

  • Exclude private functions, that means functions with a name that starts with an underscore, from the operations listing when using flow.run().
  • Forward all extra submit arguments into the write_script() methods.
  • Fix an issue with $ flow init/flow.init() in combination with Python 2.7.
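Excluding private functions from an operations listing amounts to filtering names by their leading underscore. The sketch below is an illustration with hypothetical names (public_operations and a throwaway operations module), not the code used by flow.run().

```python
import inspect
import types


def public_operations(module):
    """Return the names of functions in `module` that do not start with '_'."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if inspect.isfunction(obj) and not name.startswith("_")
    )


# Build a throwaway module standing in for a user's operations module.
operations = types.ModuleType("operations")
exec(
    "def initialize(job): pass\n"
    "def _helper(job): pass\n"
    "def analyze(job): pass\n",
    operations.__dict__,
)
listed = public_operations(operations)
```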

[0.5.0] – 2017-05-24

Major updates and new features

  • The documentation has been completely revised; most components are now covered by a reference documentation, which also serves as a basic tutorial.
  • The signac-flow package now officially supports Python version 2.7 in addition to versions 3.4+; the support for version 3.3 has been dropped.
  • Add command line interface for instances of FlowProject, to be accessed via the FlowProject.main() function. This makes it easier to interact with specific workflow implementations on the command line, for example to view the project’s status, execute operations or submit them to a scheduler.
  • The $ flow init command initializes a new very lean workflow module that replaces the need to use project templates. Setting up a workflow with signac-flow is now much easier; template projects are no longer needed. The $ flow init command can be invoked with an optional -t/--template argument to initialize project modules with example code.
  • Add the flow.run() function to turn python modules that implement functions to be used as data space operations into executables. Executing the flow.run() function opens a command line interface that can be used to execute operations defined within the same module directly from the command line in serial and parallel.
  • The definition of operations on the project level is now possible via the FlowProject.operations dictionary; operations can either be added directly or via the FlowProject.add_operation() function.
  • Environments with a TORQUE or SLURM scheduler are now immediately supported via a default environment profile.
  • The submission process is generally streamlined and it is easier to forward arguments to the underlying scheduler; this is supposed to enable the user to directly submit scripts and operations without the need to set up a custom environment profile.
  • Some environment profiles for large cluster environments are bundled with the package; it is no longer necessary to install external packages to be able to use profiles on some HPC resources.

API changes (breaking)

  • The use of JobScript.write_cmd() with an np argument is pending deprecation; the adjustment of commands to the local environment is moved to an earlier stage (for instance, during project instance construction).
  • The official project template is still functional via a legacy API layer; however, it is recommended that users update projects for use with this version. The update process is described in the README document.
  • Most of the environment-specific command line arguments are now directly provided by the environment profile via profile-specific add_parser_args() functions; that means that existing environments might require some tweaking to work with this version.

Version 0.4

[0.4.2] – 2017-02-28

  • Fix issue in the submit legacy mode where the write_header() method was ignored.

[0.4.1] – 2017-02-24

  • Fix ppn issue when submitting in legacy mode.
  • Enable optional parallelization during submit.

[0.4.0] – 2017-02-23

Major revision to the job-operation submission function.

  • The write_user() function has been replaced by submit_user() with slightly adjusted API.
  • The header and environment module have been merged into a single environment module.
  • All submit logic has been removed from the scheduler drivers.
  • Any submit logic implemented as part of the environment module has been reduced to the bare minimum.
  • The submission flow has been refactored to be based on JobOperations.
  • An attempt is made to detect the use of the deprecated API which will trigger the use of a legacy code path and the emission of warnings.
  • Improved testing capabilities for unknown environments.
  • The determination of present environments is deterministic and based on reversed definition order.
  • Add the label decorators, allowing a more concise definition of methods which are to be used for classification.
  • Add the FlowGraph, which allows the user to implement workflows in form of a graph.
  • Implement unit tests for all core functionalities.
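Deterministic environment detection based on reversed definition order can be sketched with a simple class registry. All names below (Environment, registry, is_present, detect_environment, and the example host) are hypothetical and chosen for illustration; they are not the package's actual classes.

```python
class Environment:
    """Minimal base class that records subclasses in definition order."""

    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Environment.registry.append(cls)

    @classmethod
    def is_present(cls):
        return False


class DefaultEnvironment(Environment):
    @classmethod
    def is_present(cls):
        return True  # generic fallback, always matches


class ClusterEnvironment(Environment):
    hostname = "login1.cluster.example"  # made-up host for the demo

    @classmethod
    def is_present(cls):
        return cls.hostname.endswith("cluster.example")


def detect_environment():
    """Return the most recently defined environment that reports itself present."""
    for env in reversed(Environment.registry):
        if env.is_present():
            return env
    return Environment


detected = detect_environment()
```

Checking in reversed definition order means that a later, more specific environment takes precedence over an earlier generic one, and the outcome does not depend on dictionary or set ordering.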