uval.utils package
Submodules
uval.utils.hdf5_format module
This module provides functions to read and write uval HDF5 files. The supported HDF5 fields are listed below as YAML:
File name suggestion (tools should stay functional even if file names do not comply):
- NAME.det.h5 (does not contain volume_data, groundtruth)
- NAME.gt.h5 (does not contain detections, volume_data)
- NAME.volcache.h5 (only volume_meta including volume projection cache)
- NAME.voldata.h5 (only volume_data)
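The naming suggestion above can be turned into a small lookup. The sketch below is illustrative only: `SUGGESTED_SUFFIXES` and `describe_by_suffix` are hypothetical names that are not part of uval, and non-complying names simply yield `None` so that tools keep working.

```python
from typing import Optional

# Suffixes taken from the naming suggestion; the descriptions paraphrase
# what each file type is expected to contain.
SUGGESTED_SUFFIXES = {
    ".det.h5": "detections (no volume_data, no groundtruth)",
    ".gt.h5": "groundtruth (no detections, no volume_data)",
    ".volcache.h5": "volume_meta including the projection cache",
    ".voldata.h5": "volume_data only",
}


def describe_by_suffix(file_name: str) -> Optional[str]:
    """Return the expected content for a suggested suffix, else None.

    Hypothetical helper for illustration; it is not part of uval.
    """
    for suffix, description in SUGGESTED_SUFFIXES.items():
        if file_name.endswith(suffix):
            return description
    # A non-complying name is not an error: tools should keep working.
    return None
```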
# X, Y and Z axes
# Z-axis is the belt direction (direction of motion)
# Y-axis is vertical, pointing up
# X-axis points left when looking in the belt motion direction

# Character set is always UTF-8
- file_meta:                        # Always required
    host_name: "H5T_STRING"         # Host name of computer that generated the h5 file e.g. philscomputer
    user_name: "H5T_STRING"         # User name of user that generated the h5 file
    dt_generated: "H5T_STRING"      # ISO 8601 time and date of file creation (with timezone!)
- volume_meta:                      # Always available, not optional! (also for dets and gt)
    id: "H5T_STRING"                # e.g. BAGGAGE_20181122_081331_126018
    file_md5: "H5T_STRING"          # The checksum of the original ct volume file e.g. 686593fa1f05f610066129b72c62bfdd
    full_shape: INT (3)             # Shape of full volume
    is_cropped: INT                 # if 1, the data only contains the voxels within the roi
    roi_start: INT (3)
    roi_shape: INT (3)              # Matches size of data if "is_cropped" is True
    cache:                          # Optional
        projection_x: UINT8 RGB IMAGE   # Colored Matlum if possible, otherwise grayscale (R=G=B)
        projection_y: UINT8 RGB IMAGE
        projection_z: UINT8 RGB IMAGE
volume_data: "H5T_STD_U16LE"        # Optional (to save space, not contained in .dets.h5 and .gt.h5)
- detections:                       # list int-indexed as strings for each member (e.g. "0", "1", ..)
    class_name: "H5T_STRING"
    roi_start: INT (3)
    roi_shape: INT (3)              # Same as size of mask if mask is available
    mask: "H5T_STD_U8LE"            # Optional (e.g. only bounding boxes)
    score: FLOAT
    cache:                          # Optional
        density: FLOAT
        mass: FLOAT
        num_voxels: INT
        projection_x: UINT16        # Taking 3D mask with 1s and 0s, adding up along x axis (only y and z axis remain)
        projection_y: UINT16        # Taking 3D mask with 1s and 0s, adding up along y axis (only x and z axis remain)
        projection_z: UINT16        # Taking 3D mask with 1s and 0s, adding up along z axis (only x and y axis remain)
- groundtruth:                      # dict indexed by label id for each member
    class_name: "H5T_STRING"
    target_id: "H5T_STRING"         # Formerly known as threat id
    roi_start: INT (3)
    roi_shape: INT (3)              # Same as size of mask if mask is available
    mask: "H5T_STD_U8LE"            # Optional (e.g. only bounding boxes)
    cache:                          # Optional
        projection_x:               # As for detections
        projection_y:
        projection_z:
# 2D MASK projection to 1D
#   X ---->
#   0 0 0 0 0 0 0    ^
#   0 1 0 1 0 0 0    |
#   0 1 1 1 1 0 0    |
#   0 0 1 1 0 0 0    Y
#   0 0 0 1 0 0 0
#
#   0 2 2 4 1 0 0    Projection (adding up)
#   0 1 1 1 1 0 0    Binary mask
#
#   Proj along Y
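The "2D MASK projection to 1D" diagram can be reproduced in a few lines. This is a minimal sketch using plain Python lists; `project_along_y` is a hypothetical helper name, not a uval function.

```python
from typing import List


def project_along_y(mask: List[List[int]]) -> List[int]:
    """Add up a 2D mask of 0s and 1s along Y, leaving one value per X column."""
    return [sum(column) for column in zip(*mask)]


# Rows run along Y, columns along X (same layout as the diagram).
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]

projection = project_along_y(mask)
# The binary mask marks every column that contains at least one mask voxel.
binary = [1 if count > 0 else 0 for count in projection]
print(projection)  # [0, 2, 2, 4, 1, 0, 0]
print(binary)      # [0, 1, 1, 1, 1, 0, 0]
```

The cached projections of a 3D mask follow the same idea, adding up along one axis so that the other two remain. How the array axes map to X, Y and Z in memory is not stated here, so the sketch stays in 2D.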
- class uval.utils.hdf5_format.ArrayReqs(shape=None, dtype=None)
Bases: object
Used to represent requirements on an np.ndarray, very similar to a type hint. For example, ArrayReqs(shape=(3,3,-1)) corresponds to a type hint like np.ndarray[shape=(3,3,-1)], where -1 indicates any size along that dimension. You can also use ArrayReqs(shape=3) to indicate that the number of dimensions should be 3.
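The two ways of stating a shape requirement can be illustrated with a small matcher. This is only a sketch of the semantics described above: `shape_matches` is a hypothetical helper, not the way ArrayReqs is implemented.

```python
from typing import Sequence, Union


def shape_matches(actual: Sequence[int], required: Union[int, Sequence[int]]) -> bool:
    """Check an array shape against a requirement.

    An int requirement fixes only the number of dimensions; a tuple
    requirement fixes each dimension, with -1 accepting any size.
    Hypothetical helper, not the ArrayReqs implementation.
    """
    if isinstance(required, int):
        return len(actual) == required
    if len(actual) != len(required):
        return False
    return all(want == -1 or want == got for got, want in zip(actual, required))
```

For instance, `shape_matches((3, 3, 10), (3, 3, -1))` accepts any length along the last dimension, while `shape_matches((5, 6, 7), 3)` only checks that the array is three-dimensional.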
- class uval.utils.hdf5_format.FieldRequired(value)
Bases: enum.Enum
An enumeration that specifies whether a field is required or optional.
- Optional = 2
- Required = 1
- uval.utils.hdf5_format.check_dataset_type(dataset: object, type_descriptor) -> bool
Checks whether a single dataset (a Python value or an H5Dataset) matches the type requirement.
- uval.utils.hdf5_format.check_dictgroup_fields(dictgroup: Dict[str, dict], requirements: Union[dict, object], base_name: str = '') -> None
Checks a list-like dict of instances against the requirements. Almost like check_listgroup_fields but the elements are indexed by keys. The requirements apply to each item of the dict, not to the dict as a whole.
- uval.utils.hdf5_format.check_fields(to_check: Union[dict, h5py._hl.group.Group], requirements: Union[Dict[Any, Any], Any], base_name: str = '')
Checks a dict (group) or value (dataset) against requirements. Fields may be required or optional. Additional unknown fields will also result in a failed check.
The requirements dict has to be defined as follows: every entry maps a field name to a 2-tuple (data_type, is_required), where data_type can be a Python type like str, int or np.ndarray, or an ArrayReqs instance to specify the data shape. The element is_required is an enum value (see FieldRequired) that specifies whether the field is required or optional.
The requirements can be nested to represent groups. For this purpose, use a requirements dictionary as the data_type of one of the fields.
- Parameters
to_check – Nested dictionary to check against requirements
requirements – As explained above
base_name – String describing the location of to_check within the file (for better error messages only)
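The requirements layout described for check_fields can be sketched with a self-contained checker. Everything below is an assumption made for illustration: `Need` stands in for FieldRequired, `find_problems` is a hypothetical function that returns a list of messages (how check_fields itself reports a failed check is not documented here), and `DETECTION_REQS` covers only a subset of the detection fields.

```python
from enum import Enum
from typing import Any, Dict, List


class Need(Enum):
    """Local stand-in for FieldRequired (hypothetical name)."""
    REQUIRED = 1
    OPTIONAL = 2


def find_problems(to_check: Dict[str, Any], requirements: Dict[str, tuple],
                  base_name: str = "") -> List[str]:
    """Collect problem messages for a nested dict checked against requirements."""
    problems = []
    for name, (data_type, need) in requirements.items():
        path = f"{base_name}/{name}"
        if name not in to_check:
            if need is Need.REQUIRED:
                problems.append(f"missing required field {path}")
            continue
        value = to_check[name]
        if isinstance(data_type, dict):
            # A nested requirements dict describes a group.
            if isinstance(value, dict):
                problems += find_problems(value, data_type, path)
            else:
                problems.append(f"{path} should be a group")
        elif not isinstance(value, data_type):
            problems.append(f"{path} has type {type(value).__name__}")
    # Unknown fields also fail the check.
    for name in to_check:
        if name not in requirements:
            problems.append(f"unknown field {base_name}/{name}")
    return problems


# Illustrative subset of the detection fields, with a nested optional group.
DETECTION_REQS = {
    "class_name": (str, Need.REQUIRED),
    "score": (float, Need.REQUIRED),
    "cache": ({"num_voxels": (int, Need.OPTIONAL)}, Need.OPTIONAL),
}
```

Passing `base_name` down while recursing is what makes a message such as `missing required field detections/0/class_name` point at the exact location inside the file.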
- uval.utils.hdf5_format.check_listgroup_fields(listgroup: List[dict], requirements: Union[dict, Any], base_name: str = '') -> None
Checks a list of instances against the requirements. The requirements apply to each item of the list, not to the list as a whole. Inside the HDF5 file, lists are represented as groups containing multiple groups that have integers as names (stored as strings, because names have to be strings). In the native Python dict format, simple lists are used.
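The mapping between a Python list and a group with integer names can be sketched as follows. `list_to_group` and `group_to_list` are hypothetical helpers that use a plain dict in place of an HDF5 group.

```python
from typing import Any, Dict, List


def list_to_group(items: List[Any]) -> Dict[str, Any]:
    """Name each item by its index, as a string, like HDF5 group names."""
    return {str(index): item for index, item in enumerate(items)}


def group_to_list(group: Dict[str, Any]) -> List[Any]:
    """Restore list order by sorting names numerically, not alphabetically."""
    return [group[name] for name in sorted(group, key=int)]
```

Sorting with `key=int` matters once a list has more than ten members, because "10" sorts before "2" when the names are compared as strings.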
- uval.utils.hdf5_format.check_volume_meta_fields(volume_meta: dict, volume: Optional[numpy.ndarray] = None) -> None
- uval.utils.hdf5_format.h5_check_dictgroup_fields(dictgroup: h5py._hl.group.Group, requirements: Union[dict, object], base_name: str = '') -> None
Checks an H5Group of similar sub-items against the requirements. Almost like check_listgroup_fields but the elements are indexed by keys. The requirements apply to each item of the group, not to the group as a whole.
- uval.utils.hdf5_format.h5_check_listgroup_fields(listgroup: h5py._hl.group.Group, requirements: Union[dict, object], base_name: str = '') -> None
Checks an H5Group that represents a list of instances against the requirements. The requirements apply to each item of the list, not to the list as a whole. Inside the HDF5 file, lists are represented as groups containing multiple groups that have integers as names (stored as strings, because names have to be strings).
uval.utils.hdf5_io module
This module provides functions to read and write uval HDF5 files. For the format specification, see hdf5_format.py.
- class uval.utils.hdf5_io.UvalHdfFileInput(filepath: str)
Bases: uval.utils.hdf5_io.UvalHdfFile
A class to manage and read from a single uval-specific HDF5 file. The file on disk is not held open by default; every operation opens and closes it. To keep the file open across several consecutive operations, use a with-context on the UvalHdfFile object.
- detections(include_masks=False, include_caches=False)
A list of detected areas in the CT image. Please refer to the UVal HDF5 format.
- Parameters
include_masks – Whether to include 3D masks while reading
include_caches – Whether to include cached data while reading
- Returns
None
- ground_truth(include_masks=False, include_caches=False)
A list of 3D groundtruth data belonging to a CT image. Please refer to the UVal HDF5 format.
- Parameters
include_masks – Whether to include 3D masks while reading
include_caches – Whether to include cached data while reading
- Returns
None
- read_all_fields() -> dict
Reads all existing fields in the h5 file.
- Returns
All the groups in a dictionary
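The open-per-operation behaviour described for this class, together with the with-context that keeps the file open, can be illustrated with an ordinary binary file. `LazyFile` is a hypothetical stand-in written for this sketch; it does not use h5py and is not how UvalHdfFileInput is implemented.

```python
class LazyFile:
    """Hypothetical stand-in showing the open-per-operation pattern."""

    def __init__(self, filepath: str):
        self.filepath = filepath
        self._handle = None   # Set only while a with-context holds the file open
        self.open_count = 0   # Counts how often the file was opened (for the demo)

    def __enter__(self):
        self._handle = open(self.filepath, "rb")
        self.open_count += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()
        self._handle = None

    def read_bytes(self, count: int) -> bytes:
        if self._handle is not None:
            # Inside a with-context: reuse the handle that is already open.
            self._handle.seek(0)
            return self._handle.read(count)
        # Outside a with-context: open and close for this single operation.
        with open(self.filepath, "rb") as handle:
            self.open_count += 1
            return handle.read(count)
```

Two calls to `read_bytes` outside a with-context open the file twice, whereas any number of calls inside `with LazyFile(path) as f:` share a single open handle.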
- class uval.utils.hdf5_io.UvalHdfFileOutput(filepath: str, copy_from_input: Optional[uval.utils.hdf5_io.UvalHdfFileInput] = None)
Bases: uval.utils.hdf5_io.UvalHdfFile
A class to manage and write to a single uval-specific HDF5 file. The file on disk is not held open by default; every operation opens and closes it. To keep the file open across several consecutive operations, use a with-context on the UvalHdfFile object.
- property detections
- property file_meta
- property groundtruth
- read_all_from(input_file: uval.utils.hdf5_io.UvalHdfFileInput) -> None
Reads all the metadata included in another HDF5 file.
- Parameters
input_file – HDF file to read from
- Returns
None
- property volume
- property volume_meta
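The file_meta group in the format specification holds three strings: host_name, user_name and dt_generated. A minimal sketch of assembling such a dictionary with the standard library follows. `build_file_meta` is a hypothetical helper, and whether the file_meta property accepts a plain dict like this is an assumption that the documentation here does not confirm.

```python
import getpass
import socket
from datetime import datetime
from typing import Dict


def build_file_meta() -> Dict[str, str]:
    """Assemble host_name, user_name and dt_generated as strings."""
    try:
        user_name = getpass.getuser()
    except Exception:
        # Some environments have no resolvable user; fall back to a placeholder.
        user_name = "unknown"
    return {
        "host_name": socket.gethostname(),
        "user_name": user_name,
        # astimezone() attaches the local UTC offset, so the ISO 8601
        # string carries a timezone as the format requires.
        "dt_generated": datetime.now().astimezone().isoformat(),
    }
```

Calling `isoformat()` on a naive `datetime.now()` would omit the offset, which is why the sketch converts to an aware datetime first.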
uval.utils.hdf5_verification module
This module provides functions to verify uval HDF5 files. It reports any problems that a set of existing HDF5 files may have. Any missing required field in an HDF5 file is reported as a problem.
- uval.utils.hdf5_verification.verify_hdf5_files(folder_path: str, recursive: bool = False, file_filter: str = '*.h5', print_problems: bool = True) -> Dict[str, List[str]]
Given a folder, finds and verifies all HDF5 files inside. This can be recursive or filtered if desired.
Returns a dictionary with all the problems found for each file.
- Parameters
folder_path – The path to the folder containing HDF5 files
recursive – Whether to parse the folder recursively
file_filter – The wildcard used to include HDF5 files by name
print_problems – Whether to print the detected problems to standard output
- Returns
A dictionary containing all the detected problems including the file name and problem description
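The folder traversal and the shape of the returned dictionary can be sketched with pathlib. This is not the code of verify_hdf5_files: `collect_problems` and its `check` callback are hypothetical, and listing only files that have at least one problem is an assumption.

```python
from pathlib import Path
from typing import Callable, Dict, List


def collect_problems(folder_path: str, check: Callable[[Path], List[str]],
                     recursive: bool = False,
                     file_filter: str = "*.h5") -> Dict[str, List[str]]:
    """Run `check` on every matching file and keep only files with problems.

    `check` is a hypothetical callback returning problem descriptions.
    """
    folder = Path(folder_path)
    # rglob descends into subfolders, glob stays in the given folder.
    matches = folder.rglob(file_filter) if recursive else folder.glob(file_filter)
    problems: Dict[str, List[str]] = {}
    for path in sorted(matches):
        found = check(path)
        if found:
            problems[str(path)] = found
    return problems
```

A caller would pass a function that opens one file and returns its list of problem strings, for example the names of missing required fields.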
uval.utils.hdf5_virtual module
- class uval.utils.hdf5_virtual.Hdf5Virtual
Bases: object
This module provides the class Hdf5Virtual, which represents a reference to an HDF5 file that actually resides on disk somewhere. It keeps metadata and other low-storage information in memory. When larger parts of the file are accessed, such as the volume, masks or projections, it reads from the actual file.
uval.utils.label_naming module
We defined a ‘short’ form of label names. That means, if we have a bag with volume id BAGGAGE_20171205_081937_012345 and a label called BAGGAGE_20171205_081937_012345_label_2, we can write the label as ‘%_label_2’.
To escape a ‘%’ as actual character, we use ‘%%’ in the short form.
This module provides functions to convert between the two forms.
- uval.utils.label_naming.label_long_to_short(volume_id: str, long_label_id: str) -> str
This function will shorten the full label_id if possible. For a definition of the short and long form, see this module’s docstring.
- Parameters
volume_id – The ID which represents an image file containing the volume
long_label_id – Full id of label with no ‘%’ inside
- Returns
Shortened label_id, in which the volume_id is replaced by ‘%’
- uval.utils.label_naming.label_short_to_long(volume_id: str, short_label_id: str) -> str
This function will restore the full label_id if it is a short form. For a definition of the short and long form, see this module’s docstring.
- Parameters
volume_id – The ID which represents an image file containing the volume
short_label_id – A shortened version of label id containing ‘%’ character
- Returns
Extended label_id, in which ‘%’ is replaced by the volume_id
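The short and long forms can be illustrated with a small re-implementation. The functions `to_short` and `to_long` below are hypothetical sketches, not the module's code. They assume that the volume id is only replaced when it is a prefix of the label, and that the long label contains no '%' character, as the parameter description states.

```python
def to_short(volume_id: str, long_label_id: str) -> str:
    """Replace a leading volume_id by '%'; assumes no '%' in long_label_id."""
    if long_label_id.startswith(volume_id):
        return "%" + long_label_id[len(volume_id):]
    # Nothing to shorten: the label does not start with the volume id.
    return long_label_id


def to_long(volume_id: str, short_label_id: str) -> str:
    """Expand a single '%' to volume_id and unescape '%%' to '%'."""
    result = []
    index = 0
    while index < len(short_label_id):
        char = short_label_id[index]
        if char == "%":
            if short_label_id[index + 1:index + 2] == "%":
                result.append("%")    # escaped percent sign
                index += 2
                continue
            result.append(volume_id)  # placeholder for the volume id
        else:
            result.append(char)
        index += 1
    return "".join(result)
```

With volume id BAGGAGE_20171205_081937_012345, `to_short` turns BAGGAGE_20171205_081937_012345_label_2 into %_label_2, and `to_long` reverses that step.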
uval.utils.log module
This utils module currently only provides logging functionality. Once it grows too much, we will need to split it.
uval.utils.yaml_io module
This module provides simple functions to read and write YAML files. The data structure used for pure YAML IO operations is the dictionary.
- uval.utils.yaml_io.load_yaml_data(yaml_file_path: str) -> Dict
Loads data from a YAML file into a dictionary structure and returns it as a dict. If the file does not exist or cannot be read properly, returns None.
- Parameters
yaml_file_path – The path to the YAML file
- Returns
Data read from YAML file in dictionary format
- uval.utils.yaml_io.store_yaml_data(data_dict: dict, yaml_file_path: str) -> bool
Stores dictionary data in a YAML file and returns True. If the folder is not writable or there is a problem with writing, it returns False.
- Parameters
data_dict – The dictionary to store
yaml_file_path – The path to the YAML file to write
- Returns
True if the YAML file was stored successfully, otherwise False