Methods
activegit defines a single class with methods and properties that wrap git features, such as tags and push/pull. Wrapping these features allows them to be cast to an active learning context.
class activegit.ActiveGit(repopath, bare=False, shared='group')
Uses a git repo to keep track of active learning data and classifier.
The standard set of files is ‘training.pkl’, ‘testing.pkl’, and ‘classifier.pkl’. The first two each contain a dictionary with features as keys and target labels (e.g., 0/1) as values. The third file contains the classifier (e.g., from sklearn).
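The file layout described above can be sketched with the standard library alone. This is a minimal illustration rather than activegit's own code: the helpers `write_data` and `read_data` are hypothetical, and the sketch assumes each feature set is stored as a tuple so that it can serve as a dictionary key.

```python
import os
import pickle
import tempfile


def write_data(path, features, targets):
    """Pickle a {feature_tuple: target} dictionary (hypothetical helper)."""
    # Dictionary keys must be hashable, so each feature list becomes a tuple.
    data = {tuple(f): t for f, t in zip(features, targets)}
    with open(path, "wb") as fh:
        pickle.dump(data, fh)


def read_data(path):
    """Load the dictionary back from the pickle file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Round trip through a temporary directory standing in for the repo.
repo_dir = tempfile.mkdtemp()
training_path = os.path.join(repo_dir, "training.pkl")
write_data(training_path, [[0.1, 2.0], [0.7, 1.5]], [0, 1])
training = read_data(training_path)
```

With real data, the same dictionary shape would presumably be what training_data and testing_data return.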
Tags are central to tracking the classifier and data. A new repo starts with empty files and the tag “initial”. Branch ‘master’ keeps the latest version, and branch ‘working’ is used for the active session. After a new version is committed, the working branch is merged into master and deleted, and a new working branch is checked out.
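The branch cycle described above might correspond to a sequence of plain git commands like the following. This is a sketch under stated assumptions: the function `commit_version_commands` is hypothetical, and both the exact ordering and the point at which the tag is applied are guesses, not a description of what activegit runs internally.

```python
def commit_version_commands(version, msg=None):
    """Return git commands for one versioning cycle (assumed order)."""
    msg = msg or "Version {0}".format(version)
    return [
        ["git", "commit", "-a", "-m", msg],    # commit changes on 'working'
        ["git", "checkout", "master"],         # switch to the latest branch
        ["git", "merge", "working"],           # bring the session into master
        ["git", "tag", version],               # tag the merged state (assumed)
        ["git", "branch", "-d", "working"],    # delete the merged branch
        ["git", "checkout", "-b", "working"],  # start a fresh session branch
    ]


commands = commit_version_commands("v1")
for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.check_call` with `cwd` set to the repo path. Building the commands separately from running them keeps the sequence easy to inspect.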
Setting bare=True creates a bare git repo that can be shared (cloned) by a group locally or via git daemon sharing.
- classifier
  Returns classifier from classifier.pkl
- commit_version(version, msg=None)
  Add tag, commit, and push changes
- initializerepo()
  Fill empty directory with products and make first commit
- isvalid
  Checks whether contents of repo are consistent with standard set.
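A consistency check of this kind could be as simple as confirming that the three standard files are present. The sketch below is an assumption about what "consistent with standard set" means; the name `has_standard_files` is hypothetical, and the real isvalid property may check more than this.

```python
import os
import tempfile

STANDARD_FILES = ("training.pkl", "testing.pkl", "classifier.pkl")


def has_standard_files(repopath):
    """Return True if every standard file exists in repopath (hypothetical)."""
    return all(
        os.path.isfile(os.path.join(repopath, name)) for name in STANDARD_FILES
    )


# An empty directory fails the check; creating the three files makes it pass.
repo_dir = tempfile.mkdtemp()
before = has_standard_files(repo_dir)
for name in STANDARD_FILES:
    open(os.path.join(repo_dir, name), "wb").close()
after = has_standard_files(repo_dir)
```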
- set_version(version, force=True)
  Sets the version name for the current state of repo
- show_version_info(version)
  Summarizes info of a particular version (a la “git show version”)
- testing_data
  Returns data dictionary from testing.pkl
- training_data
  Returns data dictionary from training.pkl
- update()
  Pull latest versions/tags, if linked to a remote (e.g., github).
- version
  Current version checked out.
- versions
  Sorted list of versions committed thus far.
- write_classifier(clf)
  Writes classifier object to pickle file
- write_testing_data(features, targets)
  Writes the testing data dictionary, built from features and targets, to testing.pkl
- write_training_data(features, targets)
  Writes the training data dictionary, built from features and targets, to training.pkl