subject¶
Classes for managing data and protocol access and storage.
Currently named subject, but will likely be refactored to include other data models should the need arise.
Classes:
Subject - Class for managing one subject’s data and protocol.
class Subject(name: Optional[str] = None, dir: Optional[str] = None, file: Optional[str] = None, new: bool = False, biography: Optional[dict] = None)[source]¶
Bases: object
Class for managing one subject’s data and protocol.
Creates a tables hdf5 file in prefs.get(‘DATADIR’) with the general structure:

/ root
|--- current (tables.filenode) storing the current task as serialized JSON
|--- data (group)
|    |--- task_name (group)
|    |    |--- S##_step_name
|    |         |--- trial_data
|    |         |--- continuous_data
|    |--- ...
|--- history (group)
|    |--- hashes - history of git commit hashes
|    |--- history - history of changes: protocols assigned, params changed, etc.
|    |--- weights - history of pre- and post-task weights
|    |--- past_protocols (group) - stash past protocol params on reassign
|         |--- date_protocol_name - tables.filenode of a previous protocol’s params.
|         |--- ...
|--- info - group with biographical information as attributes
- Variables
lock (threading.Lock) – manages access to the hdf5 file
name (str) – Subject ID
file (str) – Path to hdf5 file - usually {prefs.get(‘DATADIR’)}/{self.name}.h5
current (dict) – current task parameters, loaded from the ‘current’ filenode of the h5 file
step (int) – current step
protocol_name (str) – name of currently assigned protocol
current_trial (int) – number of current trial
running (bool) – Flag that signals whether the subject is currently running a task or not.
data_queue (queue.Queue) – Queue to dump data into while running a task
thread (threading.Thread) – thread used to keep the file open while running a task
did_graduate (threading.Event) – Event used to signal if the subject has graduated the current step
STRUCTURE (list) – list of tuples with order:
full path, eg. ‘/history/weights’
relative path, eg. ‘/history’
name, eg. ‘weights’
type, eg. Subject.Weight_Table or ‘group’
locations (node) – tables.IsDescription for tables.
- Parameters
name (str) – subject ID
dir (str) – path where the .h5 file is located, if None, prefs.get(‘DATADIR’) is used
file (str) – load a subject from a filename. if None, ignored.
new (bool) – if True, a new file is made (a new file is made if one does not exist anyway)
biography (dict) – If making a new subject file, a dictionary with biographical data can be passed
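As a rough illustration of how these parameters interact, the path to the subject’s .h5 file can be resolved from name, dir, and file. The helper below is a hypothetical sketch of the documented behavior, not part of the real API; resolve_subject_file and DEFAULT_DATADIR (a stand-in for prefs.get(‘DATADIR’)) are illustrative names:

```python
from typing import Optional
import os

DEFAULT_DATADIR = "/tmp/autopilot_data"  # stand-in for prefs.get('DATADIR')

def resolve_subject_file(name: Optional[str] = None,
                         dir: Optional[str] = None,
                         file: Optional[str] = None) -> str:
    """Hypothetical sketch of the documented path logic: an explicit
    `file` wins; otherwise join `dir` (or the data directory) with
    `{name}.h5`."""
    if file is not None:
        return file                      # load subject from an explicit filename
    if name is None:
        raise ValueError("need either a name or a file")
    base = dir if dir is not None else DEFAULT_DATADIR
    return os.path.join(base, name + ".h5")

print(resolve_subject_file(name="mouse_01"))                   # /tmp/autopilot_data/mouse_01.h5
print(resolve_subject_file(name="mouse_01", dir="/data"))      # /data/mouse_01.h5
print(resolve_subject_file(file="/elsewhere/old_subject.h5"))  # /elsewhere/old_subject.h5
```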
Methods:
open_hdf([mode]) - Opens the hdf5 file.
close_hdf(h5f) - Flushes & closes the open hdf file.
new_subject_file(biography) - Create a new subject file and make the general filestructure.
ensure_structure() - Ensure that our h5f has the appropriate baseline structure as defined in self.STRUCTURE.
update_biography(params) - Change or make a new biographical attribute, stored as attributes of the info group.
update_history(type, name, value[, step]) - Update the history table when changes are made to the subject’s protocol.
assign_protocol(protocol[, step_n]) - Assign a protocol to the subject.
flush_current() - Flushes the ‘current’ attribute in the subject object to the current filenode in the .h5.
stash_current() - Save the current protocol in the history group and delete the node.
prepare_run() - Prepares the Subject object to receive data while running the task.
data_thread(queue) - Thread that keeps the hdf file open and receives data while the task is running.
save_data(data) - Alternate and equivalent method of putting data in the queue, as Subject.data_queue.put(data).
stop_run() - Puts ‘END’ in the data_queue, which causes data_thread() to end.
to_csv(path[, task, step]) - Export trial data to .csv.
get_trial_data([step, what]) - Get trial data from the current task.
apply_along([along, step])
get_step_history([use_history]) - Gets a dataframe of step numbers, timestamps, and step names as a coarse view of training status.
get_timestamp([simple]) - Makes a timestamp.
get_weight([which, include_baseline]) - Gets start and stop weights.
set_weight(date, col_name, new_value) - Updates an existing weight in the weight table.
update_weights([start, stop]) - Store either a starting or stopping mass.
graduate() - Increase the current step by one, unless it is the last step.
Classes:
History_Table - Class to describe parameter and protocol change history.
Weight_Table - Class to describe table for weight history.
Hash_Table - Class to describe table for hash history.
-
open_hdf(mode='r+')[source]¶
Opens the hdf5 file.
This should be called at the start of every method that accesses the h5 file, and close_hdf() should be called at the end. Otherwise the file will close and we risk file corruption. See the pytables documentation for details.
- Parameters
mode (str) – a file access mode, can be:
‘r’: Read-only - no data can be modified.
‘w’: Write - a new file is created (an existing file with the same name would be deleted).
‘a’: Append - an existing file is opened for reading and writing, and if the file does not exist it is created.
‘r+’ (default): Similar to ‘a’, but the file must already exist.
- Returns
Opened hdf file.
- Return type
tables.File
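Because every open_hdf() must be paired with close_hdf(), wrapping the pair in a context manager is a convenient way to enforce the discipline. The sketch below uses a trivial stand-in class rather than a real Subject or tables.File; only the open_hdf/close_hdf method names are taken from this documentation:

```python
from contextlib import contextmanager

class FakeSubject:
    """Stand-in exposing the documented open_hdf/close_hdf pair."""
    def __init__(self):
        self.open_count = 0
    def open_hdf(self, mode='r+'):
        self.open_count += 1
        return object()          # stands in for a tables.File
    def close_hdf(self, h5f):
        self.open_count -= 1     # the real method flushes & closes

@contextmanager
def hdf(subject, mode='r+'):
    h5f = subject.open_hdf(mode)
    try:
        yield h5f
    finally:
        subject.close_hdf(h5f)   # closed even if the body raises

s = FakeSubject()
with hdf(s) as h5f:
    pass                         # read/write via h5f here
assert s.open_count == 0         # file was closed on exit
```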
-
close_hdf(h5f)[source]¶
Flushes & closes the open hdf file. Must be called whenever open_hdf() is used.
- Parameters
h5f (tables.File) – the hdf file opened by open_hdf()
-
new_subject_file(biography)[source]¶
Create a new subject file and make the general filestructure.
If a file already exists, open it in append mode, otherwise create it.
- Parameters
biography (dict) – Biographical details like DOB, mass, etc. Typically created by Biography_Tab.
-
ensure_structure()[source]¶
Ensure that our h5f has the appropriate baseline structure as defined in self.STRUCTURE.
Checks that all groups and tables are made, and makes them if not.
-
update_biography(params)[source]¶
Change or make a new biographical attribute, stored as attributes of the info group.
- Parameters
params (dict) – biographical attributes to be updated.
-
update_history(type, name, value, step=None)[source]¶
Update the history table when changes are made to the subject’s protocol.
The current protocol is flushed to the past_protocols group and an updated filenode is created.
Note
This only updates the history table, and does not make the changes itself.
- Parameters
type (str) – What type of change is being made? Can be one of
‘param’ - a parameter of one task stage
‘step’ - the step of the current protocol
‘protocol’ - the whole protocol is being updated.
name (str) – the name of either the parameter being changed or the new protocol
value (str) – the value that the parameter or step is being changed to, or the protocol dictionary flattened to a string.
step (int) – When type is ‘param’, changes the parameter at a particular step, otherwise the current step is used.
-
assign_protocol(protocol, step_n=0)[source]¶
Assign a protocol to the subject.
If the subject has a currently assigned task, stashes it with stash_current().
Creates groups and tables according to the data descriptions in the task class being assigned, eg. as described in Task.TrialData.
Updates the history table.
- Parameters
protocol (str) – the protocol to be assigned. Can be one of
the name of the protocol (its filename minus .json) if it is in prefs.get(‘PROTOCOLDIR’)
the filename of the protocol (its filename with .json) if it is in prefs.get(‘PROTOCOLDIR’)
the full path and filename of the protocol.
step_n (int) – Which step is being assigned?
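The three accepted forms of the protocol argument can be resolved to a single path. This is a hypothetical sketch of that resolution, not the library’s actual implementation; resolve_protocol is an illustrative name and the temporary directory stands in for prefs.get(‘PROTOCOLDIR’):

```python
import os
import tempfile

def resolve_protocol(protocol: str, protocoldir: str) -> str:
    """Sketch of the three documented forms: a full existing path,
    a filename ending in .json, or a bare protocol name."""
    if os.path.isabs(protocol) and os.path.exists(protocol):
        return protocol                                   # full path and filename
    if protocol.endswith('.json'):
        return os.path.join(protocoldir, protocol)        # filename with .json
    return os.path.join(protocoldir, protocol + '.json')  # bare protocol name

protocoldir = tempfile.mkdtemp()  # stand-in for prefs.get('PROTOCOLDIR')
print(resolve_protocol('2afc', protocoldir))       # .../2afc.json
print(resolve_protocol('2afc.json', protocoldir))  # .../2afc.json
```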
-
flush_current()[source]¶
Flushes the ‘current’ attribute in the subject object to the current filenode in the .h5.
Used to make sure the stored .json representation of the current task stays up to date with the params set in the subject object.
-
stash_current()[source]¶
Save the current protocol in the history group and delete the node.
Typically this is called when assigning a new protocol.
Stored as the date that it was changed, followed by its name if it has one.
-
prepare_run()[source]¶
Prepares the Subject object to receive data while running the task.
Gets information about the current task and trial number, spawns the Graduation object, spawns data_queue, and calls data_thread().
- Returns
the parameters for the current step, with subject id, step number, current trial, and session number included.
- Return type
dict
-
data_thread(queue)[source]¶
Thread that keeps the hdf file open and receives data while the task is running.
Receives data through queue as dictionaries. Data can be partial-trial data (eg. each phase of a trial) as long as the task returns a dict with ‘TRIAL_END’ as a key at the end of each trial.
Each dict given to the queue should have the trial_num, and this method can properly store data without passing TRIAL_END if so. I recommend being explicit, however.
Checks graduation state at the end of each trial.
- Parameters
queue (queue.Queue) – passed by prepare_run() and used by other objects to pass data to be stored.
-
save_data(data)[source]¶
Alternate and equivalent method of putting data in the queue, as Subject.data_queue.put(data).
- Parameters
data (dict) – trial data. each should have a ‘trial_num’, and a dictionary with key ‘TRIAL_END’ should be passed at the end of each trial.
-
stop_run()[source]¶
Puts ‘END’ in the data_queue, which causes data_thread() to end.
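The prepare_run() / save_data() / stop_run() cycle described above reduces to a consumer thread draining a queue until an ‘END’ sentinel arrives. A stdlib-only sketch of that pattern, assuming a plain list as a stand-in for writes to the trial_data table:

```python
import queue
import threading

data_queue = queue.Queue()
stored_trials = []               # stands in for the hdf5 trial_data table

def data_thread(q):
    """Drain the queue until the 'END' sentinel, grouping rows by trial."""
    row = {}
    while True:
        data = q.get()
        if data == 'END':        # stop_run() puts 'END' in the queue
            break
        row.update(data)
        if 'TRIAL_END' in data:  # the task marks the end of each trial
            row.pop('TRIAL_END')
            stored_trials.append(row)
            row = {}             # graduation would be checked here

thread = threading.Thread(target=data_thread, args=(data_queue,))
thread.start()

# save_data(data) is equivalent to data_queue.put(data)
data_queue.put({'trial_num': 0, 'response': 'L'})
data_queue.put({'trial_num': 0, 'correct': 1, 'TRIAL_END': True})
data_queue.put('END')            # what stop_run() does
thread.join()
print(stored_trials)             # [{'trial_num': 0, 'response': 'L', 'correct': 1}]
```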
-
to_csv(path, task='current', step='all')[source]¶
Export trial data to .csv.
- Parameters
path (str) – output path of the .csv
task (str, int) – not implemented, but in the future pull data from ‘current’ or another named task
step (str, int, list, tuple) – Step to select, see Subject.get_trial_data()
-
get_trial_data(step: Union[int, list, str] = -1, what: str = 'data')[source]¶
Get trial data from the current task.
- Parameters
step (int, list, ‘all’) – Step that should be returned, can be one of
-1: most recent step
int: a single step
list of two integers eg. [0, 5], an inclusive range of steps.
string: the name of a step (excluding S##_)
‘all’: all steps.
what (str) – What should be returned?
‘data’ : Dataframe of requested steps’ trial data
‘variables’: dict of variables without loading data into memory
- Returns
DataFrame of requested steps’ trial data.
- Return type
pandas.DataFrame
-
get_step_history(use_history=True)[source]¶
Gets a dataframe of step numbers, timestamps, and step names as a coarse view of training status.
- Parameters
use_history (bool) – whether to use the history table or to reconstruct steps and dates from the trial table itself. compatibility fix for old versions that didn’t stash step changes when the whole protocol was updated.
- Returns
DataFrame of step numbers, timestamps, and step names.
-
get_timestamp(simple=False)[source]¶
Makes a timestamp.
- Parameters
simple (bool) –
if True: returns as format ‘%y%m%d-%H%M%S’, eg. ‘190201-170811’
if False: returns in isoformat, eg. ‘2019-02-01T17:08:02.058808’
- Returns
str
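Both formats can be reproduced with the stdlib directly; a sketch of what get_timestamp() presumably wraps:

```python
from datetime import datetime

now = datetime.now()
simple_ts = now.strftime('%y%m%d-%H%M%S')  # 'simple' format, eg. '190201-170811'
iso_ts = now.isoformat()                   # isoformat, eg. '2019-02-01T17:08:02.058808'
print(simple_ts, iso_ts)
```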
-
get_weight(which='last', include_baseline=False)[source]¶
Gets start and stop weights.
Todo
add ability to get weights by session number, dates, and ranges.
- Parameters
which (str) – if ‘last’, gets most recent weights. Otherwise returns all weights.
include_baseline (bool) – if True, includes baseline and minimum mass.
- Returns
dict
-
set_weight(date, col_name, new_value)[source]¶
Updates an existing weight in the weight table.
Todo
Yes, I know this is bad. Merge with update_weights.
- Parameters
date (str) – date in the ‘simple’ format, %y%m%d-%H%M%S
col_name (‘start’, ‘stop’) – are we updating a pre-task or post-task weight?
new_value (float) – New mass.
-
update_weights(start=None, stop=None)[source]¶
Store either a starting or stopping mass.
start and stop can be passed simultaneously; start can be given in one call and stop in a later call, but stop should not be given before start.
- Parameters
start (float) – Mass before running task in grams
stop (float) – Mass after running task in grams.
-
class History_Table¶
Bases: tables.description.IsDescription
Class to describe parameter and protocol change history
Attributes:
-
columns
= { 'name': StringCol(itemsize=256, shape=(), dflt=b'', pos=None), 'time': StringCol(itemsize=256, shape=(), dflt=b'', pos=None), 'type': StringCol(itemsize=256, shape=(), dflt=b'', pos=None), 'value': StringCol(itemsize=4028, shape=(), dflt=b'', pos=None)}¶
-
class Weight_Table¶
Bases: tables.description.IsDescription
Class to describe table for weight history
Attributes:
-
columns
= { 'date': StringCol(itemsize=256, shape=(), dflt=b'', pos=None), 'session': Int32Col(shape=(), dflt=0, pos=None), 'start': Float32Col(shape=(), dflt=0.0, pos=None), 'stop': Float32Col(shape=(), dflt=0.0, pos=None)}¶