Changelog

The signac package follows semantic versioning.

Version 0.9

Highlights

  • Adds persistent state point index caching, which speeds up all functions that require indexing, for example the $ signac find command.
  • Adds the $ signac sync tool for synchronization of multiple signac projects.
  • Adds the $ signac schema function for the automatic detection of the implicit schema of a signac project.
  • Adds the $near operator to match numbers up to a specific precision.
  • Adds functions for the import and export of data spaces.
  • Adds functions for the management of data on the project level, as opposed to the job level.

[0.9.5] – 2019-01-31

Fixed

  • Ensure that the next() function can be called for a JobsIterator, e.g., project.find().
  • Pickling issue that occurs when a _SyncedDict (job.statepoint, job.document, etc.) contains a list.
  • Issue with the readline module that would cause signac shell to fail on Windows operating systems.

[0.9.4] – 2018-10-24

Added

  • Adds the $ signac import command and the Project.import_from() method for the import of data spaces into a project workspace, such as a directory, a tarball, or a zip-file.
  • Adds the $ signac export command and the Project.export_to() method for the export of project workspaces to an external location, such as a directory, a tarball, or a zip-file.
  • Adds functionality for the rapid initialization of temporary projects with the signac.TemporaryProject context manager.
  • Adds the signac.Project.temporary_project() context manager which creates a temporary project within the root project workspace.
  • Add signac to the default namespace when invoking signac shell.
  • Add option to specify a custom view path for the $ signac view command and the Project.create_linked_view() method.
  • Iterables of documents used to construct a Collection no longer require an _id field.

Changed

  • The default path for linked views has been adjusted to match the one used for data exports.

Fixed

  • Fix issue where differently typed integer values stored within a Collection under the same key would not be indexed correctly. This issue broke the $type operator in those cases and led to incorrect types for integer values in the Project schema detection algorithm.
  • Fix issue where jobs that are migrated (state point change), but are not initialized, were not properly updated.
  • Fix issue where changes to lists that are part of a synchronized dictionary, for example a state point or document, would not be saved.
  • Fix non-deterministic issue occurring on network file systems when trying to open jobs where the user has no write access to the job workspace directory.

[0.9.3] – 2018-06-14

Added

  • Add $near operator to express queries for numerical values that match up to a certain precision.
  • Add the $ signac shell sub command to directly launch a Python interpreter within a project directory.
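
The idea behind the $near operator can be illustrated in plain Python. This is a minimal sketch, assuming tolerance semantics like those of math.isclose; the helper name near() and its default tolerances are illustrative assumptions, not signac's implementation or defaults.

```python
import math


def near(value, target, rel_tol=1e-9, abs_tol=0.0):
    """Return True if value equals target within the given tolerances."""
    return math.isclose(value, target, rel_tol=rel_tol, abs_tol=abs_tol)


# 0.1 + 0.2 is not exactly 0.3 in floating point, but it is "near" 0.3.
docs = [{"a": 0.1 + 0.2}, {"a": 0.31}]
matches = [d for d in docs if near(d["a"], 0.3)]
```

An exact equality test would reject the first document, which is the kind of case a precision-based match is meant to handle.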

Fixed

  • Fix issue where a job instance would not be properly updated after more than one state point reset.

[0.9.2] – 2017-12-18

Added

  • Add provisional feature (persistent state point caching); calling the Project.update_cache() method will generate and store a persistent state point cache in the project root directory, which will increase the speed of many project iteration, search, and selection operations.
  • Add Project.check() method which checks for workspace corruption, but does not make any attempt to repair it.
  • The Project.repair() method will attempt to repair jobs that have been corrupted by manually renaming the job’s workspace directory.
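
Why a persistent state point cache speeds up searches can be sketched as follows. Everything here is an assumption made for illustration: the helper names, the cache file name, and the id scheme (a hash of the canonical JSON encoding) are hypothetical and do not describe signac's internal format.

```python
import hashlib
import json
import os
import tempfile


def calc_id(statepoint):
    """Hypothetical id: hash of the canonical JSON encoding."""
    blob = json.dumps(statepoint, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


def update_cache(statepoints, cache_path):
    """Store an id -> state point mapping in a single file."""
    cache = {calc_id(sp): sp for sp in statepoints}
    with open(cache_path, "w") as file:
        json.dump(cache, file)
    return cache


def find(cache_path, **criteria):
    """Select ids from the cache file alone, without reading job directories."""
    with open(cache_path) as file:
        cache = json.load(file)
    return sorted(
        job_id
        for job_id, sp in cache.items()
        if all(sp.get(key) == value for key, value in criteria.items())
    )


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "sp_cache.json")
    written = update_cache([{"a": 1, "b": 2}, {"a": 2, "b": 2}], cache_path)
    hits = find(cache_path, a=1)
```

The search reads one file instead of one file per job, which is where the speed-up in this sketch comes from.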

Changed

  • Enable the write_concern flag for the job.document.
  • Allow omitting the specification of an authentication mechanism in the MongoDB host configuration.

Fixed

  • Fix critical issue in the JSONDict implementation that would previously overwrite the underlying file when attempting to store values that are not JSON serializable.
  • Fix issue where the Project.export() function would ignore the update argument when the index to export to would be a MongoDB collection.
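
One common way to avoid the kind of file corruption described in the JSONDict fix is to encode the data before touching the file. This is a minimal sketch of that general technique, not the actual fix; save_json() and the temporary file suffix are hypothetical.

```python
import json
import os
import tempfile


def save_json(path, data):
    """Write data to path, leaving the file untouched if encoding fails."""
    blob = json.dumps(data)  # raises TypeError before any file I/O happens
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as file:
        file.write(blob)
    os.replace(tmp_path, path)  # swap in the fully written file


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.json")
    save_json(path, {"a": 1})
    try:
        save_json(path, {"a": object()})  # not JSON serializable
        failed = False
    except TypeError:
        failed = True
    with open(path) as file:
        content = json.load(file)
```

After the failed second call, the file still holds the data from the first call.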

[0.9.1] – 2017-11-07

Fixed

  • Fix critical issue in the SyncedAttrDict implementation that would previously overwrite the underlying file if the first operation was a __setitem__() operation.

[0.9.0] – 2017-10-28

Added

  • Introduction of $ signac sync, Project.sync(), and Job.sync() for the simplified and fine-grained synchronization of multiple project data spaces.
  • Introduction of $ signac schema and Project.detect_schema() for the automatic detection of the implicit and semi-structured state point schema of a project data space.
  • Simplified aggregation of jobs over projects and Project.find_jobs() results with the Project.groupby() function.
  • Support for project-centralized data with the Project.document attribute and the Project.fn() method for the wrapping of filenames within the project root directory.
  • Added the Job.clear() and the Job.reset() methods to clear or reset a job’s workspace data.

Changed

  • Both Job.statepoint and Job.document now use the same underlying data structure and provide the exact same API (copy with () and access of keys as attributes).
  • The Collection class uses an internal counter instead of UUIDs for the generation of primary keys (resulting in improved performance).
  • Major performance improvements (faster Collection, improved caching).
  • Overhaul of the reference documentation.

Version 0.8

Highlights

  • Adds boolean and arithmetic operators to search queries.
  • Major revision of the indexing system.
  • Adds $ signac document command line function.
  • Add the signac.Collection class for the management of persistent document collections.

[0.8.7] – 2017-10-05

Fixed

  • Fix an issue where the creation of linked views was non-deterministic in some cases.
  • Fix an issue where the creation of linked views would fail when the project contains jobs with state points that have lists as values.

[0.8.6] – 2017-08-25

Fixed

  • Fix Collection append truncation issue (see issue #66).

[0.8.5] – 2017-06-07

Changed

  • The signac ids in the signac find --show view are no longer enclosed by quotation marks.

Fixed

  • Fix compatibility issue that broke the signac find --view and all --pretty commands on Python 2.7.
  • Fix issue where view directories would be incomplete in combination with heterogeneous state point schemas.

[0.8.4] – 2017-05-19

Added

  • All search queries on project and collection objects support various operators including: $and, $or, $gt, $gte, $lt, $lte, $eq, $ne, $exists, $regex, $where, $in, $nin, and $type.
  • The $ signac find command supports a simple filter syntax, where key value pairs can be provided as individual arguments.
  • The $ signac find command is extended by a --show option, to display the state point and the document contents directly. The contents are truncated to an adjustable depth to reduce output noise.
  • The $ signac view command has an additional filter option to select a sub data space directly without needing to pipe job ids.
  • The new $ signac document command can be used to display a job’s document directly.
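
How operator-based queries select documents can be sketched with a small matcher for flat documents. This is an illustrative assumption covering only a subset of the operators listed above; the function matches() is hypothetical and is not the Collection implementation.

```python
import operator

# Comparison operators, a subset of those listed above.
_COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(doc, query):
    """Return True if the flat document satisfies the query."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(doc, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            if key not in doc:
                return False
            for op, operand in condition.items():
                if not _COMPARISONS[op](doc[key], operand):
                    return False
        elif doc.get(key) != condition:
            return False
    return True


docs = [{"a": 1, "b": "x"}, {"a": 5, "b": "y"}, {"a": 9, "b": "x"}]
query = {"$or": [{"a": {"$lt": 2}}, {"$and": [{"a": {"$gte": 5}}, {"b": "x"}]}]}
selected = [d for d in docs if matches(d, query)]
```

The query selects documents where a is below 2, or where a is at least 5 and b equals "x".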

Changed

  • Minor performance improvements.

[0.8.3] – 2017-05-10

Changed

  • Raise ExportError when updating with an empty index.

Fixed

  • Fix command line logic issue with $ signac config host.
  • Fix bug, where Collection.replace_one() would ignore the upsert argument under specific conditions.

[0.8.2] – 2017-04-19

Fixed

  • Fixes a TypeError which occurred under specific conditions when calling Collection.find() with nested filter arguments.

[0.8.1] – 2017-04-17

Fixed

  • Fixes wide-spread typo (indeces -> indexes).

[0.8.0] – 2017-04-16

Overall major simplification of the generation of indexes and of the management and searching of index collections without an external database.

Added

  • Introduction of the Collection class for the management of document collections, such as indexes in memory and on disk.
  • Generation of file indexes directly via the signac.index_files() function.
  • Generation of master indexes directly via the signac.index() function and the $ signac index command.
  • The API of signac_access.py files has been simplified, including the possibility to use a blank file for a minimal configuration.
  • Use the $ signac project --access command to create a minimal access module in addition to Project.create_access_module().
  • The update of existing index collections has been simplified by using the export() function with the update=True argument, which means that stale documents (the associated file or state point no longer exists) are automatically identified and removed.
  • Added the Job.ws attribute, as short-cut for Job.workspace().
  • The Job.sp interface has a get() function which can be used to specify a default value in case that the requested key is not part of the state point.

Changed (breaking API)

  • The $ signac index command generates a master index instead of a project index. To generate a project index from the command line use $ signac project --index instead.
  • The SignacProjectCrawler class expects the project’s root directory as first argument, not the workspace directory.
  • The get_crawlers() function defined within a signac_access.py access module is expected to yield crawler instances directly, not a mapping of crawler ids and instances.
  • The simplification of the signac_access.py module API is reflected in a reduction of arguments to the Project.create_access_module() method.

Changed (non-breaking)

  • The RegexFileCrawler, SignacProjectCrawler and MasterCrawler classes were moved into the root namespace.
  • If a MasterCrawler object is instantiated with the raise_on_error argument set to True, any errors encountered during crawling are raised instead of ignored and skipped; this simplifies the debugging of erroneous access modules.
  • Improved error message for invalid configuration files.
  • Better error messages for invalid $ signac find queries.
  • Check a host configuration on the command line via $ signac host --test.
  • A MongoDB database host configuration defaults to none when no authentication method is explicitly specified.
  • Using the --debug option in combination with $ signac index will show the traceback of errors encountered during indexing instead of ignoring them.
  • Instances of Job are hashable, making it possible to use them as dict keys for instance.
  • The representation of Job instances via repr() can serve as a copy constructor command.
  • The project interface implementation performs all non-trivial search operations on an internally managed index collection, which improves performance and simplifies the code base.
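
What hashable instances and a repr() that doubles as a copy constructor look like in practice can be shown with a small stand-in class. Record is a hypothetical class used only for illustration; it is not the Job class.

```python
class Record:
    """Hypothetical stand-in for an object identified by an id string."""

    def __init__(self, id_):
        self.id_ = id_

    def __eq__(self, other):
        return isinstance(other, Record) and self.id_ == other.id_

    def __hash__(self):
        # Equal objects must hash equally, so hash the identifying field.
        return hash(self.id_)

    def __repr__(self):
        # Evaluating this string reconstructs an equal object.
        return f"Record({self.id_!r})"


a = Record("abc123")
timings = {a: 1.5}    # usable as a dict key
copy = eval(repr(a))  # repr() output acts as a copy constructor
```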

Deprecated

  • The DocumentSearchEngine class has been deprecated, its functionality is now provided by the Collection class.

Fixed

  • An issue related to exporting documents to MongoDB collections via pymongo in combination with Python 2.7 has been fixed.

Version 0.7

Highlights

  • Add support for Python 3.6, PyPy and PyPy3.
  • Make any instance of Project behave like an iterable (for job in project).
  • Introduction of the Job.sp attribute to access state point variables.
  • Revision of the linked view function, which now allows the update of previous views.
  • Support for searching by job document keys on the command line.
  • Add functions for moving and cloning jobs.
  • Add functions for changing a job’s state point.
  • Enable opening of jobs by abbreviated id.

[0.7.1] – 2017-01-09

Added

  • When the python-rapidjson package is installed, it will be used for JSON encoding/decoding (experimental).

Changed

  • All job move-related methods raise DestinationExistsError in case of destination conflicts.
  • Optimized $ signac find command.

Fixed

  • Fixed bug in $ signac statepoint.
  • Suppress ‘broken pipe error’ message when using $ signac find, for example, in combination with $ head.

[0.7.0] – 2017-01-04

Added

  • Add support for Python version 3.6.
  • Add support for PyPy and PyPy3.
  • Simplified iteration over project data spaces.
  • An existing linked view can be updated by executing the view command again.
  • Add attribute interface for the access and modification of job state points: Job.sp.
  • Add function for moving and copying of jobs between projects.
  • All project related iterators support the len-operator.
  • Enable iteration over all jobs with: for job in project:.
  • Make len(project) an alias for project.num_jobs().
  • Add in-operator to determine whether a job is initialized within a project.
  • Add Job.sp attribute to access and modify a job’s state point.
  • The Project.open_job() method accepts abbreviated job ids.
  • Add Project.min_len_unique_id() method to determine the minimum length of job ids to be unique within the project’s data space.
  • Add Job.move() method to move jobs between projects.
  • Add Project.clone() method to copy jobs between projects.
  • Add $ signac move and $ signac clone command line functions.
  • Add Job.reset_statepoint() method to reset a job’s state point.
  • Add Job.update_statepoint() method to update a job’s state point.
  • Add a Job.FN_DOCUMENT constant which defines the default filename of the job document file.
  • The $ signac find command accepts a -d/--doc-filter option to filter by job document contents.
  • Add the Project.create_linked_view() method as replacement for the previously deprecated Project.create_view() method.
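
The relation between abbreviated job ids and the minimum unique id length can be sketched in plain Python. Both helpers below are hypothetical illustrations that assume distinct ids; they are not the Project methods themselves, and the example ids are made up.

```python
def min_len_unique_id(job_ids):
    """Return the shortest prefix length that keeps all ids distinct."""
    job_ids = list(job_ids)
    longest = max((len(job_id) for job_id in job_ids), default=0)
    for length in range(1, longest + 1):
        prefixes = {job_id[:length] for job_id in job_ids}
        if len(prefixes) == len(job_ids):
            return length
    return longest


def open_by_abbreviation(job_ids, prefix):
    """Resolve an abbreviated id; fail if it is unknown or ambiguous."""
    candidates = [job_id for job_id in job_ids if job_id.startswith(prefix)]
    if len(candidates) != 1:
        raise LookupError(f"{prefix!r} matches {len(candidates)} ids")
    return candidates[0]


ids = ["9bfd29df", "9bab17c2", "4f7e1a90"]
shortest = min_len_unique_id(ids)
resolved = open_by_abbreviation(ids, "9bf")
```

Any prefix of at least the minimum length resolves to exactly one id, while a shorter one such as "9b" is ambiguous in this example.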

Changed

  • Linked views use relative paths.
  • The Guide documentation chapter has been renamed to Reference and generally overhauled.
  • The Quick Reference documentation chapter has been extended.

Fixed

  • Fix error when using an instance of Job after calling Job.remove().
  • A project created in one of the standard config directories (such as the home directory) does not take precedence over project configurations in or above the current working directory.

Removed

  • The signac-gui component has been removed.
  • The Project.create_linked_view() force argument is removed.
  • The Project.find_variable_parameters() method has been removed.

Version 0.6

Highlights

  • General revision of the indexing and export system.
  • General consolidation including the removal of the conversion framework.

[0.6.2] – 2016-12-15

Added

  • Add instructions on how to acknowledge signac in publications to documentation.
  • Add cite module for the auto-generation of formatted references and BibTeX entries.

Removed

  • Remove SSL authentication support.

[0.6.1] – 2016-11-26

Changed

  • The Project.create_view() method triggers a DeprecationWarning instead of a PendingDeprecationWarning.
  • The Project.find_variable_parameters() method triggers a DeprecationWarning instead of a PendingDeprecationWarning.

Fixed

  • Make package more robust against PySide import errors.
  • Fix Project.__repr__ method.
  • Fix critical bug in fs.GridFS class, which rendered it unusable.
  • Fix issue in indexing.fetch() previously resulting in local paths being ignored.
  • Fix error in the signac.__all__ namespace directive.

[0.6.0] – 2016-11-18

Added

  • Add the export_to_mirror() function for mirroring files.
  • Introduction of the signac.fs namespace to simplify the configuration of mirror filesystems.
  • Add errors module to root namespace. Many exceptions raised inherit from the base exception types defined within that module, making it easier to catch signac related errors.
  • Add the export_one() function for the export of a single index document; simplifies the implementation of custom export functions.
  • Opening an instance of Job with the open_job() method multiple times and entering a job context recursively is now well-defined behavior: Opening a job now adds the current working directory onto a stack, closing it switches into the directory on top of the stack.
  • The return type of Project.open_job() can be configured to make it easier to specialize projects with custom job types.

Changed

  • The MasterCrawler logic has been simplified; its primary function is the compilation of index documents from slave crawlers. All export logic, including data mirroring, is now provided by the signac.export() function.
  • Each index document is now uniquely coupled with only one file or data object, which is why signac.fetch() replaces signac.fetch_one() and the latter one has been deprecated and is currently an alias of the former one.
  • The signac.fetch() function always returns a file-like object, regardless of format definition.
  • The format argument in the crawler define() function is now optional and now has well-defined behavior for str types. It is encouraged to define a format with a str constant rather than a file-like object type.
  • The TextFile file-like object class definition in the formats module has been replaced with a constant of type str.
  • The signac.export() function automatically delegates to specialized implementations such as export_pymongo() and is more robust against errors, such as broken connections.
  • The export_pymongo() function makes multiple automatic restart attempts when encountering errors.
  • Documentation: The tutorial is now based on signac-examples jupyter notebooks.
  • The contrib.crawler module has been renamed to contrib.indexing to better reflect the semantic context.
  • The signac.export() function now implements the logic for data linking and mirroring.
  • Provide default argument for the ‘--indent’ option of the $ signac statepoint command.
  • Log, but do not reraise exceptions during MasterCrawler execution, making the compilation of master indexes more robust against errors.
  • The object representation of Job and Project instances is simplified.
  • The warning verbosity has been reduced when importing modules with optional dependencies.

Removed

  • All modules related to the stale conversion framework feature have been removed resulting in a removal of the optional networkx dependency.
  • Multiple modules related to the conversion framework feature have been removed, including: contrib.formats_network, contrib.conversion, and contrib.adapters.

Fixed

  • Opening instances of Job with the Job.open() method multiple times, equivalently entering the job context recursively, does not cause an error anymore, but instead the behavior is well-defined.

Version 0.5

[0.5.0] – 2016-08-31

Added

  • New function: signac.init_project() simplifies project initialization within Python.
  • Added optional root argument to signac.get_project() to simplify getting a project handle outside of the current working directory.
  • Added two class factory methods to Project: get_project() and init_project().

Changed

  • The performance of project indexing and crawling has been improved.

Version 0.4

[0.4.0] – 2016-08-05

Added

  • The performance of find operations can be greatly improved by using pre-generated job indexes.
  • New top-level commands: $ signac find, $ signac index, $ signac statepoint, and $ signac view.
  • New method: Project.create_linked_view()
  • New method: Project.build_job_statepoint_index()
  • New method: Project.build_job_search_index()
  • The Project.find_jobs() method allows filtering by job document.

Changed

  • The SignacProjectCrawler indexes all jobs, not only those with non-empty job documents.
  • The signac.fetch_one() function returns None if no associated object can be fetched.
  • The tutorial is restructured into multiple parts.
  • Instructions for installation are separated from the guide.

Removed

  • Remove previously deprecated crawl keyword argument in index export functions.
  • Remove previously deprecated function common.config.write_config().

Version 0.3

[0.3.0] – 2016-06-23

Added

  • Add contributing agreement and guidelines.

Changed

  • Change license from MIT to BSD 3-clause license.

Version 0.2

[0.2.9] – 2016-06-06

Added

  • Addition of the signac config command line API.
  • Password updates are encrypted with bcrypt when passlib is installed.
  • The user is prompted to enter missing credentials (username/password) in case that they are not stored in the configuration.
  • The $ signac config tool provides the --update-pw argument, which allows users to update their own password.
  • Added MIT license. In addition, all source code files contain a short licensing header.

Changed

  • Improved documentation on how to configure signac.
  • The OSI classifiers are updated, including an upgrade of the development status to ‘4 - beta’.

Fixed

  • Nested job state points can no longer get corrupted. This bug occurred when trying to operate on nested state point mappings.

Deprecated

  • Deprecated pymongo versions 2.x are no longer supported.

[0.2.8] – 2016-04-18

Added

  • Project is now in the root namespace.
  • Add index() method to Project.
  • Add create_access_module() method to Project.
  • Add find_variable_parameters() method to Project.
  • Add fn() method to Job, which prepends the job’s workspace path to a filename.
  • The documentation contains a comprehensive tutorial.

Changed

  • The crawl() function yields only the index documents and not a tuple of (_id, doc).
  • export() and export_pymongo() expect the index documents as first argument, not a crawler instance. The old API is still supported, but will trigger a DeprecationWarning.

[0.2.7] – 2016-02-29

Added

  • Add job.isfile() method

Changed

  • Optimize project.find_statepoints() and project.repair() functions.

[0.2.6] – 2016-02-20

Added

  • Add job.reset_statepoint() and job.update_statepoint()
  • Add job.remove() function

Changed

  • Sanitize filter argument in all project.find_*() methods.
  • The job.statepoint() function accurately represents saved statepoints, e.g. tuples are represented as lists, as there is no difference between tuples and lists in JSON.
  • signac-gui does not block on database operations.
  • signac-gui allows reload of databases and collections of connected hosts.

Fixed

  • RegexFileCrawler define() class function only acts upon the actual specialization and not globally on all RegexFileCrawler classes.
  • signac-gui does not crash when replica sets are configured.

[0.2.5] – 2016-02-10

Added

  • Added signac.get_project(), signac.get_database(), signac.fetch() and signac.fetch_one() to top-level namespace.
  • Added basic shell commands, see $ signac --help.
  • Allow opening of jobs by id: project.open_job(id='abc123...').
  • Mirror data while crawling.
  • Use extra sources for fetch() and fetch_one().
  • Add file system handler: LocalFS, handler for local file system.
  • Add file system handler: GridFS, handler for MongoDB GridFS file system.
  • Crawler tags, to control which crawlers are used for a specific index.
  • Allow explicit job workspace creation with job.init().
  • Forwarding of pymongo host configuration via signac configuration.

Changed

  • Major reorganization of the documentation, split into: Overview, Guide, Quick Reference and API.
  • Documentation: Add notes for system administrators about advanced indexing.
  • Warn about outdated pymongo versions.
  • Set zip_safe flag to true in setup.py.
  • Remove dependency on six module, by adding it to the common subpackage.

Fixed

  • Fixed hard import of pymongo bug (issue #24).
  • Crawler issues with malformed documents.

[0.2.4] – 2016-01-11

Added

  • Implement Project.repair() function for projects with corrupted workspaces.
  • Allow environment variables in workspace path definition.
  • Check and fix config permission errors.
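
Environment variables in a path definition are typically resolved with os.path.expandvars. This is a sketch of that general mechanism only; the variable name and path are hypothetical, and whether signac resolves the workspace path exactly this way is an assumption.

```python
import os

os.environ["SCRATCH_DIR"] = "/scratch/alice"  # hypothetical variable

# A workspace path as it might be written in a configuration file.
configured = "$SCRATCH_DIR/my_project/workspace"
workspace = os.path.expandvars(configured)
```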

Changed

  • Increase robustness of job manifest file creation.

Fixed

  • Fix project crawler deep directory issue (hotfix).

[0.2.3] – 2015-12-09

Fixed

  • Fix a few bugs related to project views.

[0.2.2] – 2015-11-30

Fixed

  • Fix SignacProjectCrawler super() bug.

[0.2.1] – 2015-11-29

Added

  • Add support for Python 2.7
  • Add signac-gui (early alpha)
  • Allow specification of relative and default workspace paths
  • Add the ability to create project views
  • Add Project.find_*() functions to search the workspace
  • Add function to write and read state point hash tables

[0.2.0] – 2015-11-05

  • Major consolidation of the package.
  • Remove all hard dependencies, but six.