API Reference

This is the API for the signac (core) application.

The Project

Attributes

The JobsCursor class

Attributes

The Job class

Attributes

The JSONDict

This class implements the interface for the job’s statepoint and document attributes, but can also be used on its own.

The H5Store

This class implements the interface to the job’s data attribute, but can also be used on its own.

The H5StoreManager

This class implements the interface to the job’s stores attribute, but can also be used on its own.

Top-level functions

Submodules

signac.sync module

Synchronization of jobs and projects.

Jobs may be synchronized by copying all data from the source job to the destination job. This means all files are copied and the documents are synchronized. Conflicts, which occur when both jobs contain differing data, may be resolved with a user-defined strategy.

The synchronization of projects is in essence the synchronization of all jobs in the destination project with the ones in the source project, plus the synchronization of the project document. If a specific job does not yet exist at the destination, it is simply cloned; otherwise it is synchronized.

A sync strategy is a function (or functor) that takes the source job, the destination job, and the name of the file generating the conflict as arguments and returns a Boolean that decides whether to overwrite the file. This module defines the following default strategies as part of the FileSync class:

  1. always – Always overwrite on conflict.

  2. never – Never overwrite on conflict.

  3. update – Overwrite when the modification time of the source file is newer.

  4. Ask – Ask the user interactively about each conflicting filename.

For example, to synchronize two projects resolving conflicts by modification time, use:

dest_project.sync(source_project, strategy=sync.FileSync.update)
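A user-defined strategy only needs to follow the (source job, destination job, filename) signature described above. The following is a minimal sketch, not part of signac: the name prefer_larger is hypothetical, and the function assumes that both job arguments provide fn(filename), which returns the path of a file inside the job's workspace.

```python
import os


def prefer_larger(src, dst, fn):
    """Overwrite only when the source copy of the file is larger.

    Hypothetical example strategy. src and dst are assumed to provide
    fn(filename), returning the path of that file in the job workspace.
    """
    src_size = os.path.getsize(src.fn(fn))
    dst_size = os.path.getsize(dst.fn(fn))
    # Returning True means "overwrite the destination file".
    return src_size > dst_size
```

Under these assumptions, such a function could be passed in place of a default strategy, for example as strategy=prefer_larger.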

Unlike files, which are always either overwritten as a whole or not, documents can be synchronized at a finer granularity with a sync function. Such a function (or functor) takes the source and the destination document as arguments and performs the synchronization. The user is encouraged to implement their own sync functions, but there are a few default functions implemented as part of the DocSync class:

  1. NO_SYNC – Do not perform any synchronization.

  2. COPY – Apply the same strategy used to resolve file conflicts.

  3. update – Equivalent to dst.update(src).

  4. ByKey – Synchronize the source document key by key; see below for details.

This is how we could synchronize two jobs, where the documents are synchronized with a simple update function:

dst_job.sync(src_job, doc_sync=sync.DocSync.update)
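A custom sync function follows the same (source document, destination document) signature as DocSync.update. The sketch below is an illustration only: fill_missing is a hypothetical name, and the documents are assumed to behave like ordinary mutable mappings.

```python
def fill_missing(src, dst):
    """Copy only those top-level keys that dst does not contain yet.

    Hypothetical example sync function. Existing values in dst are
    never overwritten, so no conflict can arise.
    """
    for key in src:
        if key not in dst:
            dst[key] = src[key]
```

Under these assumptions, the function could be passed as doc_sync=fill_missing. Unlike a plain update, it leaves every value already present in the destination untouched.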

The DocSync.ByKey functor attempts to synchronize the destination document with the source document without overwriting any data. This function behaves similarly to update() for a non-intersecting set of keys, but in addition preserves nested mappings without overwriting values. Any key conflict, meaning a key that is present in both documents with differing data, raises a DocumentSyncConflict exception. The user may explicitly decide to overwrite certain keys by providing a “key-strategy”, which is a function that takes the conflicting key as its argument and returns a Boolean that decides whether to overwrite that specific key. For example, to sync two jobs, where conflicting keys should only be overwritten if they contain the term ‘foo’, we could execute:

dst_job.sync(src_job, doc_sync=sync.DocSync.ByKey(lambda key: 'foo' in key))

This means that all documents are synchronized ‘key-by-key’ and only conflicting keys that contain the word “foo” are overwritten; any other conflict raises a DocumentSyncConflict exception. A key-strategy may also be a regular expression, so the synchronization above could also be achieved with:

dst_job.sync(src_job, doc_sync=sync.DocSync.ByKey('foo'))
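To make the key-by-key behavior concrete, here is a simplified pure-Python sketch of the idea. It is not signac's implementation: merge_by_key and KeyConflict are hypothetical names, documents are plain dictionaries, only callable key strategies are handled, and the dotted naming of nested keys is an assumption of this sketch.

```python
from collections.abc import Mapping
from copy import deepcopy


class KeyConflict(RuntimeError):
    """Hypothetical stand-in for a document conflict error."""

    def __init__(self, keys):
        super().__init__(f"conflicting keys: {keys}")
        self.keys = keys


def merge_by_key(src, dst, key_strategy=None, _root=""):
    """Merge src into dst key by key without silently overwriting data."""
    conflicts = []
    for key, value in src.items():
        full_key = _root + key
        if key not in dst:
            # Non-intersecting keys are simply copied.
            dst[key] = deepcopy(value)
        elif dst[key] == value:
            # Identical values are not a conflict.
            continue
        elif isinstance(value, Mapping) and isinstance(dst[key], Mapping):
            # Nested mappings are merged recursively instead of replaced.
            try:
                merge_by_key(value, dst[key], key_strategy, full_key + ".")
            except KeyConflict as error:
                conflicts.extend(error.keys)
        elif key_strategy is not None and key_strategy(full_key):
            # The key strategy explicitly allows overwriting this key.
            dst[key] = deepcopy(value)
        else:
            conflicts.append(full_key)
    if conflicts:
        raise KeyConflict(conflicts)
    return dst
```

In this sketch, conflicts are collected while all non-conflicting keys are still merged, and the error is raised at the end; whether the library merges partially before reporting a conflict is not stated here.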

class signac.sync.DocSync

Bases: object

Collection of document synchronization functions.

class ByKey(key_strategy=None)

Bases: object

Synchronize documents key by key.

COPY = 'copy'

Copy (and potentially overwrite) documents like any other file.

NO_SYNC = False

Do not synchronize documents.

static update(src, dst)

Perform a simple update.

class signac.sync.FileSync

Bases: object

Collection of file synchronization strategies.

class Ask

Bases: object

Resolve sync conflicts by interactively asking whether a file should be overwritten.

static always(src, dst, fn)

Resolve sync conflicts by always overwriting.

classmethod keys()

Return keys.

static never(src, dst, fn)

Resolve sync conflicts by never overwriting.

static update(src, dst, fn)

Resolve sync conflicts based on newest modified timestamp.

signac.sync.sync_jobs(src, dst, strategy=None, exclude=None, doc_sync=None, recursive=False, follow_symlinks=True, preserve_permissions=False, preserve_times=False, preserve_owner=False, preserve_group=False, deep=False, dry_run=False)

Synchronize the dst job with the src job.

By default, this method will synchronize all files and document data of dst job with the src job until a synchronization conflict occurs. There are two different kinds of synchronization conflicts:

  1. The two jobs have files with the same name, but different content.

  2. The two jobs have documents that share keys, but those keys are mapped to different values.

A file conflict can be resolved by providing a ‘FileSync’ strategy or by excluding files from the synchronization. An unresolvable conflict raises a FileSyncConflict exception.

A document synchronization conflict can be resolved by providing a doc_sync function that takes the source and the destination document as first and second argument.

Parameters
  • src (Job) – The src job, data will be copied from this job’s workspace.

  • dst (Job) – The dst job, data will be copied to this job’s workspace.

  • strategy (callable, optional) – A synchronization strategy for file conflicts. The strategy should be a callable with signature strategy(src, dst, filepath) where src and dst are the source and destination instances of Project and filepath is the filepath relative to the project path. If no strategy is provided, an errors.SyncConflict exception will be raised upon conflict. (Default value = None)

  • exclude (str, optional) – A filename exclusion pattern. All files matching this pattern will be excluded from the synchronization process. (Default value = None)

  • doc_sync (attribute or callable from DocSync, optional) – A synchronization strategy for document keys. The default is to use a safe key-by-key strategy that will not overwrite any values on conflict, but instead raises a DocumentSyncConflict exception.

  • recursive (bool, optional) – Recursively synchronize sub-directories encountered within the job workspace directories. (Default value = False)

  • follow_symlinks (bool, optional) – Follow and copy the target of symbolic links. (Default value = True)

  • preserve_permissions (bool, optional) – Preserve file permissions (Default value = False)

  • preserve_times (bool, optional) – Preserve file modification times (Default value = False)

  • preserve_owner (bool, optional) – Preserve file owner (Default value = False)

  • preserve_group (bool, optional) – Preserve file group ownership (Default value = False)

  • dry_run (bool, optional) – If True, do not actually perform any synchronization operations. (Default value = False)

  • deep (bool, optional) – (Default value = False)

signac.sync.sync_projects(source, destination, strategy=None, exclude=None, doc_sync=None, selection=None, check_schema=True, recursive=False, follow_symlinks=True, preserve_permissions=False, preserve_times=False, preserve_owner=False, preserve_group=False, deep=False, dry_run=False, parallel=False, collect_stats=False)

Synchronize the destination project with the source project.

Try to clone all jobs from the source to the destination. If the destination job already exists, try to synchronize the job using the optionally specified strategy.

Parameters
  • source (Project) – The project presenting the source for synchronization.

  • destination (Project) – The project that is modified for synchronization.

  • strategy (callable, optional) – A synchronization strategy for file conflicts. The strategy should be a callable with signature strategy(src, dst, filepath) where src and dst are the source and destination instances of Project and filepath is the filepath relative to the project path. If no strategy is provided, an errors.SyncConflict exception will be raised upon conflict. (Default value = None)

  • exclude (str, optional) – A filename exclusion pattern. All files matching this pattern will be excluded from the synchronization process. (Default value = None)

  • doc_sync (attribute or callable from DocSync) – A synchronization strategy for document keys. The default is to use a safe key-by-key strategy that will not overwrite any values on conflict, but instead raises a DocumentSyncConflict exception.

  • selection (sequence of Job or job ids (str), optional) – Only synchronize the given selection of jobs. (Default value = None)

  • check_schema (bool, optional) – If True, only synchronize if this and the other project have a matching state point schema. See also: detect_schema(). (Default value = True)

  • recursive (bool, optional) – Recursively synchronize sub-directories encountered within the job workspace directories. (Default value = False)

  • follow_symlinks (bool, optional) – Follow and copy the target of symbolic links. (Default value = True)

  • preserve_permissions (bool, optional) – Preserve file permissions (Default value = False)

  • preserve_times (bool, optional) – Preserve file modification times (Default value = False)

  • preserve_owner (bool, optional) – Preserve file owner (Default value = False)

  • preserve_group (bool, optional) – Preserve file group ownership (Default value = False)

  • dry_run (bool, optional) – If True, do not actually perform the synchronization operation, just log what would happen theoretically. Useful to test synchronization strategies without the risk of data loss. (Default value = False)

  • deep (bool, optional) – (Default value = False)

  • parallel (bool, optional) – (Default value = False)

  • collect_stats (bool, optional) – (Default value = False)

Returns

Returns stats if collect_stats is True, else None.

Return type

NoneType or FileTransferStats

Raises
  • DocumentSyncConflict – If there are conflicting keys within the project or job documents that cannot be resolved with the given strategy or if there is no strategy provided.

  • FileSyncConflict – If there are differing files that cannot be resolved with the given strategy or if no strategy is provided.

  • SchemaSyncConflict – If the check_schema argument is True and the detected state point schemas of this and the other project differ.

signac.warnings module

signac.errors module

Errors raised by signac.

exception signac.errors.ConfigError

Bases: Error, RuntimeError

Error with parsing or reading a configuration file.

exception signac.errors.DestinationExistsError(destination)

Bases: Error, RuntimeError

The destination for a move or copy operation already exists.

Parameters

destination (str) – The destination causing the error.

exception signac.errors.DocumentSyncConflict(keys)

Bases: SyncConflict

Raised when a synchronization operation fails due to a document conflict.

keys

The keys that caused the conflict.

exception signac.errors.Error

Bases: Exception

Base class used for signac Errors.

exception signac.errors.FileSyncConflict(filename)

Bases: SyncConflict

Raised when a synchronization operation fails due to a file conflict.

filename

The filename of the file that caused the conflict.

exception signac.errors.H5StoreAlreadyOpenError

Bases: Error, OSError

Indicates that the underlying HDF5 file is already open.

exception signac.errors.H5StoreClosedError

Bases: Error, RuntimeError

Raised when trying to access a closed HDF5 file.

exception signac.errors.IncompatibleSchemaVersion

Bases: Error

The project’s schema version is incompatible with this version of signac.

exception signac.errors.InvalidKeyError

Bases: ValueError

Raised when a user uses a non-conforming key.

exception signac.errors.JobsCorruptedError(job_ids)

Bases: Error, RuntimeError

The state point file of one or more jobs cannot be opened or is corrupted.

Parameters

job_ids – The job id(s) of the corrupted job(s).

exception signac.errors.KeyTypeError

Bases: TypeError

Raised when a user uses a key of invalid type.

exception signac.errors.SchemaSyncConflict(schema_src, schema_dst)

Bases: SyncConflict

Raised when a synchronization operation fails due to schema differences.

exception signac.errors.StatepointParsingError

Bases: Error, RuntimeError

Indicates an error that occurred while trying to identify a state point.

exception signac.errors.SyncConflict

Bases: Error, RuntimeError

Raised when a synchronization operation fails.

exception signac.errors.WorkspaceError(error)

Bases: Error, OSError

Raised when there is an issue creating or accessing the workspace.

Parameters

error – The underlying error causing this issue.