uval.utils package

Submodules

uval.utils.hdf5_format module

This module provides functions to read and write uval HDF5 files. The supported HDF5 fields are listed below as YAML:

File name suggestion (tools should stay functional even if names do not comply):

  • NAME.det.h5 (does not contain volume_data, groundtruth)

  • NAME.gt.h5 (does not contain detections, volume_data)

  • NAME.volcache.h5 (only volume_meta including volume projection cache)

  • NAME.voldata.h5 (only volume_data)

# X, Y and Z axis
# Z-axis is belt direction (dir. of motion)
# Y-axis is vertical pointing up
# X-axis points left when looking in belt motion direction

# Character set always UTF-8

file_meta: # Always required
  host_name: "H5T_STRING" # Host name of computer that generated the h5 file e.g. philscomputer
  user_name: "H5T_STRING" # User name of user that generated the h5 file
  dt_generated: "H5T_STRING" # ISO 8601 time and date of file creation (with timezone!)

volume_meta: # Always available, not optional! (also for dets and gt)
  id: "H5T_STRING" # e.g. BAGGAGE_20181122_081331_126018
  file_md5: "H5T_STRING" # The checksum of the original ct volume file e.g. 686593fa1f05f610066129b72c62bfdd
  full_shape: INT (3) # Shape of full volume
  is_cropped: INT # if 1, the data only contains the voxels within the roi
  roi_start: INT (3)
  roi_shape: INT (3) # Matches size of data if "is_cropped" is True
  cache: # Optional
    projection_x: UINT8 RGB IMAGE # Colored Matlum if possible, otherwise grayscale (R=G=B)
    projection_y: UINT8 RGB IMAGE
    projection_z: UINT8 RGB IMAGE

volume_data: "H5T_STD_U16LE" # Optional (to save space, not contained in .det.h5 and .gt.h5)

detections: # list int-indexed as strings for each member (e.g. "0", "1", ..)
  class_name: "H5T_STRING"
  roi_start: INT (3)
  roi_shape: INT (3) # Same as size of mask if mask is available
  mask: "H5T_STD_U8LE" # Optional (e.g. only bounding boxes)
  score: FLOAT
  cache: # Optional
    density: FLOAT
    mass: FLOAT
    num_voxels: INT
    projection_x: UINT16 # Taking 3D mask with 1s and 0s, adding up along x axis (only y and z axis remain)
    projection_y: UINT16 # Taking 3D mask with 1s and 0s, adding up along y axis (only x and z axis remain)
    projection_z: UINT16 # Taking 3D mask with 1s and 0s, adding up along z axis (only x and y axis remain)

groundtruth: # dict indexed by label id for each member
  class_name: "H5T_STRING"
  target_id: "H5T_STRING" # Formerly known as threat id
  roi_start: INT (3)
  roi_shape: INT (3) # Same as size of mask if mask is available
  mask: "H5T_STD_U8LE" # Optional (e.g. only bounding boxes)
  cache: # Optional
    projection_x: # As for detections
    projection_y:
    projection_z:

# 2D MASK projection to 1D
#   X ---->
# 0 0 0 0 0 0 0   ^
# 0 1 0 1 0 0 0   |
# 0 1 1 1 1 0 0   |
# 0 0 1 1 0 0 0   Y
# 0 0 0 1 0 0 0
#
# 0 2 2 4 1 0 0   Projection (adding up)
# 0 1 1 1 1 0 0   Binary mask
#
# Proj along Y
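The 2D projection drawn in the comment above can be reproduced in a few lines. This is a minimal pure-Python sketch; the helper name project_along_y is hypothetical and not part of uval. For the 3D masks in the format, the same kind of sum would be taken along one axis of the volume.

```python
# Binary mask from the diagram: rows run along Y, columns along X.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]


def project_along_y(mask_2d):
    """Add up the mask along Y so that only the X axis remains (hypothetical helper)."""
    return [sum(column) for column in zip(*mask_2d)]


projection = project_along_y(mask)
# Collapsing the counts back to 0/1 gives the binary mask of the projection.
binary = [1 if count > 0 else 0 for count in projection]

print(projection)  # [0, 2, 2, 4, 1, 0, 0]
print(binary)      # [0, 1, 1, 1, 1, 0, 0]
```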

class uval.utils.hdf5_format.ArrayReqs(shape=None, dtype=None)[source]

Bases: object

Represents requirements on an np.ndarray, much like a type hint. For example, ArrayReqs(shape=(3,3,-1)) corresponds to a type hint like np.ndarray[shape=(3,3,-1)], where -1 indicates any size along that dimension. You can also use ArrayReqs(shape=3) to indicate that the number of dimensions should be 3.

check(array: Union[numpy.ndarray, h5py._hl.dataset.Dataset]) bool[source]
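The shape rule described for ArrayReqs can be restated as a small standalone function. This is only a sketch of the documented semantics, not the code behind ArrayReqs.check; the name shape_matches is hypothetical, and treating shape=None as "no requirement" is an assumption.

```python
from typing import Sequence, Tuple, Union


def shape_matches(actual: Sequence[int],
                  required: Union[int, Tuple[int, ...], None]) -> bool:
    """Hypothetical restatement of the documented shape rule (not uval code)."""
    if required is None:
        return True  # assumption: no shape requirement given
    if isinstance(required, int):
        return len(actual) == required  # only the number of dimensions is fixed
    if len(actual) != len(required):
        return False
    # -1 in the requirement accepts any size along that dimension.
    return all(want == -1 or want == got for got, want in zip(actual, required))


print(shape_matches((3, 3, 17), (3, 3, -1)))  # True
print(shape_matches((3, 4, 17), (3, 3, -1)))  # False
print(shape_matches((5, 6, 7), 3))            # True
```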
class uval.utils.hdf5_format.FieldRequired(value)[source]

Bases: enum.Enum

An enumeration.

Optional = 2
Required = 1
uval.utils.hdf5_format.check_dataset_type(dataset: object, type_descriptor) bool[source]

Check for a single dataset (a python value or H5Dataset) if it matches the type requirement.

uval.utils.hdf5_format.check_detection_fields(detections: list) None[source]
uval.utils.hdf5_format.check_dictgroup_fields(dictgroup: Dict[str, dict], requirements: Union[dict, object], base_name: str = '') None[source]

Checks a list-like dict of instances against the requirements. Almost like check_listgroup_fields but the elements are indexed by keys. The requirements apply to each item of the dict, not to the dict as a whole.

uval.utils.hdf5_format.check_fields(to_check: Union[dict, h5py._hl.group.Group], requirements: Union[Dict[Any, Any], Any], base_name: str = '')[source]

Checks a dict (group) or value (dataset) against requirements. Fields may be required or optional. Additional unknown fields will also result in a failed check.

The requirements dict has to be defined as follows: Every entry in the dict maps from the field name to a 2-tuple (data_type, is_required), where data_type can be a python type like str, int, np.ndarray or an ArrayReqs instance to specify the data shape. The element is_required is an enum (see FieldRequired) which specifies whether this field is required or optional.

The requirements can be nested to represent groups. For this purpose, use a requirements dictionary as the data_type of one of the fields.

Parameters
  • to_check – Nested dictionary to check against requirements

  • requirements – As explained above

  • base_name – String describing the location of to_check within the file (for better error messages only)
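To make the requirements layout concrete, here is a simplified sketch. The FieldRequired enum is redefined locally with the documented values so the example runs without uval; the field names in requirements and the collect_problems helper are hypothetical, and the real check_fields may report failures differently (for example by raising).

```python
from enum import Enum


class FieldRequired(Enum):
    # Mirrors the documented values; redefined here so the sketch is standalone.
    Required = 1
    Optional = 2


# Field name -> (data_type, is_required). A nested dict as data_type is a sub-group.
requirements = {
    "id": (str, FieldRequired.Required),
    "is_cropped": (int, FieldRequired.Required),
    "cache": ({"note": (str, FieldRequired.Optional)}, FieldRequired.Optional),
}


def collect_problems(to_check, reqs, base_name=""):
    """Simplified checker returning problem strings (assumed behaviour, not uval's)."""
    problems = []
    for name, (data_type, is_required) in reqs.items():
        path = f"{base_name}/{name}"
        if name not in to_check:
            if is_required is FieldRequired.Required:
                problems.append(f"missing required field {path}")
            continue
        value = to_check[name]
        if isinstance(data_type, dict):
            if isinstance(value, dict):
                problems += collect_problems(value, data_type, path)
            else:
                problems.append(f"{path} should be a group")
        elif not isinstance(value, data_type):
            problems.append(f"{path} has type {type(value).__name__}")
    # Unknown fields also fail the check.
    for name in to_check:
        if name not in reqs:
            problems.append(f"unknown field {base_name}/{name}")
    return problems


good = {"id": "BAGGAGE_20181122_081331_126018", "is_cropped": 0}
bad = {"id": 42, "extra": "x"}
good_problems = collect_problems(good, requirements, "volume_meta")
bad_problems = collect_problems(bad, requirements, "volume_meta")
print(good_problems)
print(bad_problems)
```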

uval.utils.hdf5_format.check_file_meta_fields(file_meta: dict) None[source]
uval.utils.hdf5_format.check_groundtruth_fields(groundtruth: dict) None[source]
uval.utils.hdf5_format.check_listgroup_fields(listgroup: List[dict], requirements: Union[dict, Any], base_name: str = '') None[source]

Checks a list of instances against the requirements. The requirements apply to each item of the list, not to the list as a whole. Inside the HDF5 file, lists are represented as groups containing multiple groups that have integers as names (stored as strings, because names have to be strings). In the native python dict format, simple lists are used.
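The mapping between a python list and the integer-named groups can be sketched with plain dicts. Both helper names below are hypothetical; the point is that names must be sorted by their integer value when turning a group back into a list.

```python
def list_to_group(items):
    """Python list -> dict keyed by "0", "1", ... (the HDF5-side naming)."""
    return {str(index): item for index, item in enumerate(items)}


def group_to_list(group):
    """Inverse mapping: order by the integer value of each name, not by the string."""
    return [group[key] for key in sorted(group, key=int)]


detections = [{"class_name": f"item_{i}"} for i in range(12)]
group = list_to_group(detections)

string_order = sorted(group)[:3]
print(string_order)                        # ['0', '1', '10']
print(group_to_list(group) == detections)  # True
```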

uval.utils.hdf5_format.check_volume_meta_fields(volume_meta: dict, volume: Optional[numpy.ndarray] = None) None[source]
uval.utils.hdf5_format.h5_check_detection_fields(h5: h5py._hl.files.File) None[source]
uval.utils.hdf5_format.h5_check_dictgroup_fields(dictgroup: h5py._hl.group.Group, requirements: Union[dict, object], base_name: str = '') None[source]

Checks an H5Group of similar sub-items against the requirements. Almost like check_listgroup_fields but the elements are indexed by keys. The requirements apply to each item of the group, not to the group as a whole.

uval.utils.hdf5_format.h5_check_file_meta_fields(h5: h5py._hl.files.File) None[source]
uval.utils.hdf5_format.h5_check_groundtruth_fields(h5: h5py._hl.files.File) None[source]
uval.utils.hdf5_format.h5_check_listgroup_fields(listgroup: h5py._hl.group.Group, requirements: Union[dict, object], base_name: str = '') None[source]

Checks an H5Group that represents a list of instances against the requirements. The requirements apply to each item of the list, not to the list as a whole. Inside the HDF5 file, lists are represented as groups containing multiple groups that have integers as names (stored as strings, because names have to be strings).

uval.utils.hdf5_format.h5_check_volcache(h5: h5py._hl.files.File) None[source]

Checks the volume cache, which is usually optional. Here it is required.

uval.utils.hdf5_format.h5_check_volume_meta_fields(h5: h5py._hl.files.File) None[source]

uval.utils.hdf5_io module

This module provides functions to read and write uval HDF5 files. For the format specification, see hdf5_format.py.

class uval.utils.hdf5_io.UvalHdfFile(filepath: str, mode: str = 'r')[source]

Bases: abc.ABC

class uval.utils.hdf5_io.UvalHdfFileInput(filepath: str)[source]

Bases: uval.utils.hdf5_io.UvalHdfFile

A class to manage and read from a single uval-specific HDF5 file. The file on disk is not held open by default: every operation opens and closes the file. If several operations are executed in a row and the file should stay open, use a with-context on the UvalHdfFile object.
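The open-per-operation behaviour and the effect of a with-context can be pictured with a small stand-in class. LazyTextFile below is purely hypothetical and works on a text file; it does not reproduce the UvalHdfFile implementation, it only counts how often the file is opened in each usage style.

```python
import os
import tempfile


class LazyTextFile:
    """Hypothetical stand-in: opens the file per operation unless held open by `with`."""

    def __init__(self, path):
        self.path = path
        self._handle = None
        self.open_count = 0

    def __enter__(self):
        self._handle = open(self.path, "r")
        self.open_count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._handle.close()
        self._handle = None
        return False

    def is_closed(self):
        return self._handle is None

    def read_text(self):
        if self._handle is not None:
            # Inside a with-context: reuse the handle that is already open.
            self._handle.seek(0)
            return self._handle.read()
        # Outside a with-context: open and close for this single operation.
        with open(self.path, "r") as handle:
            self.open_count += 1
            return handle.read()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")
    with open(path, "w") as handle:
        handle.write("payload")

    lazy = LazyTextFile(path)
    lazy.read_text()
    lazy.read_text()
    opens_without_context = lazy.open_count

    with lazy:
        lazy.read_text()
        lazy.read_text()
    opens_with_context = lazy.open_count - opens_without_context

print(opens_without_context, opens_with_context)  # 2 1
```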

detections(include_masks=False, include_caches=False)[source]

A list of detected areas in the CT image. Please refer to the UVal HDF5 format.

Parameters
  • include_masks – To include 3d masks while reading or not

  • include_caches – To include cached data while reading or not

Returns

None

file_meta()[source]
ground_truth(include_masks=False, include_caches=False)[source]

A list of 3D groundtruth data belonging to a CT image. Please refer to the UVal HDF5 format.

Parameters
  • include_masks – To include 3d masks while reading or not

  • include_caches – To include cached data while reading or not

Returns

None

is_closed()[source]

Checks if the file is closed or not initialized

read_all_fields() dict[source]

Reads all the existing fields in h5 file

Returns

All the groups in a dictionary

volume()[source]

The 3D CT volume stored in the HDF5 file. Please refer to the UVal HDF5 format.

Returns

None

volume_meta(include_caches=False)[source]

The metadata information regarding the ct 3d image. Please refer to UVal hdf5 format.

Parameters

include_caches – Whether to include the cached data or not

Returns

None

class uval.utils.hdf5_io.UvalHdfFileOutput(filepath: str, copy_from_input: Optional[uval.utils.hdf5_io.UvalHdfFileInput] = None)[source]

Bases: uval.utils.hdf5_io.UvalHdfFile

A class to manage and write to a single uval-specific HDF5 file. The file on disk is not held open by default: every operation opens and closes the file. If several operations are executed in a row and the file should stay open, use a with-context on the UvalHdfFile object.

property detections
property file_meta
property groundtruth
is_closed()[source]

Checks if the file is closed or not initialized

read_all_from(input_file: uval.utils.hdf5_io.UvalHdfFileInput) None[source]

Reads all the metadata included in another HDF5 file.

Parameters

input_file – HDF5 file to read from

Returns

None

property volume
property volume_meta
write()[source]

Explicitly call this method to write the file. However, changes will automatically be written when the file is closed.

Returns

None

uval.utils.hdf5_verification module

This module provides functions to verify uval HDF5 files. It reports any problems found in a set of existing HDF5 files. Any required field missing from an HDF5 file is reported as a problem.

uval.utils.hdf5_verification.verify_hdf5_files(folder_path: str, recursive: bool = False, file_filter: str = '*.h5', print_problems: bool = True) Dict[str, List[str]][source]

Given a folder, finds and verifies all HDF5 files inside. This can be recursive or filtered if desired. Returns a dictionary with all the problems found for each file.

Parameters
  • folder_path – The path to the folder containing HDF5 files

  • recursive – Whether to parse the folder recursively

  • file_filter – The wildcard used to select HDF5 files by name

  • print_problems – Whether to print the detected problems to standard output

Returns

A dictionary containing all the detected problems including the file name and problem description
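The folder scan and the shape of the returned dictionary can be sketched with pathlib alone. Everything named here is hypothetical: verify_folder stands in for the real function, toy_check replaces the actual HDF5 verification with a trivial "empty file" test, and whether files without problems appear in the real result is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def verify_folder(folder_path: str,
                  check_file: Callable[[Path], List[str]],
                  recursive: bool = False,
                  file_filter: str = "*.h5") -> Dict[str, List[str]]:
    """Map each matching file path to its list of problem descriptions."""
    folder = Path(folder_path)
    # rglob descends into subfolders, glob stays in the given folder.
    matches = folder.rglob(file_filter) if recursive else folder.glob(file_filter)
    return {str(path): check_file(path) for path in sorted(matches) if path.is_file()}


def toy_check(path: Path) -> List[str]:
    # Placeholder for a real per-file verification: flags empty files only.
    return ["file is empty"] if path.stat().st_size == 0 else []


with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "a.h5").write_bytes(b"")
    (Path(folder) / "sub").mkdir()
    (Path(folder) / "sub" / "b.h5").write_bytes(b"x")
    (Path(folder) / "notes.txt").write_text("ignored")

    flat = verify_folder(folder, toy_check)
    deep = verify_folder(folder, toy_check, recursive=True)

print(len(flat), len(deep))  # 1 2
```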

uval.utils.hdf5_verification.verify_single_hdf5_file(file_path: str) list[source]

Checks a single hdf5 file and returns a list of error descriptions.

Parameters

file_path – The HDF5 file path to be verified

Returns

A list of problems detected in HDF5 file

uval.utils.hdf5_virtual module

class uval.utils.hdf5_virtual.Hdf5Virtual[source]

Bases: object

This module provides the class Hdf5Virtual which represents a reference to an Hdf5 file that actually resides on disk somewhere. It keeps meta data and other low-storage information in memory. When trying to access larger parts of the file, like volume, masks or projections, it accesses the actual file.
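The idea of keeping small metadata in memory while deferring large reads to the file on disk can be illustrated as follows. VirtualRecord and fake_loader are hypothetical names used only for this sketch; they do not reflect the actual Hdf5Virtual interface.

```python
class VirtualRecord:
    """Hypothetical illustration of a lazy file reference (not the Hdf5Virtual API)."""

    def __init__(self, path, meta, load_volume):
        self.path = path
        self.meta = meta                 # small data, kept in memory
        self._load_volume = load_volume  # callable that reads the large data from disk
        self.disk_reads = 0

    @property
    def volume(self):
        # Large data is fetched only when it is actually requested.
        self.disk_reads += 1
        return self._load_volume(self.path)


def fake_loader(path):
    # Stand-in for reading volume_data from the file at `path`.
    return [[0] * 4 for _ in range(4)]


record = VirtualRecord(
    "NAME.voldata.h5",
    {"id": "BAGGAGE_20181122_081331_126018"},
    fake_loader,
)

volume_id = record.meta["id"]       # served from memory
reads_after_meta = record.disk_reads
volume = record.volume              # triggers the (simulated) file access
reads_after_volume = record.disk_reads
print(reads_after_meta, reads_after_volume)  # 0 1
```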

uval.utils.label_naming module

We defined a ‘short’ form of label names. That means, if we have a bag with volume id BAGGAGE_20171205_081937_012345 and a label called BAGGAGE_20171205_081937_012345_label_2, we can write the label as ‘%_label_2’.

To escape a ‘%’ as actual character, we use ‘%%’ in the short form.

This module provides functions to convert between the two forms.
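The conversion rules above can be sketched as two small functions. These are hypothetical re-implementations (long_to_short and short_to_long are not the library's functions), and they assume that only a leading volume id is abbreviated. A literal '%' placed directly after the volume id would be ambiguous in this sketch.

```python
def long_to_short(volume_id: str, long_label_id: str) -> str:
    """Replace a leading volume id by '%' and escape literal '%' as '%%' (sketch)."""
    if long_label_id.startswith(volume_id):
        rest = long_label_id[len(volume_id):]
        return "%" + rest.replace("%", "%%")
    return long_label_id.replace("%", "%%")


def short_to_long(volume_id: str, short_label_id: str) -> str:
    """Expand a single '%' to the volume id and '%%' to a literal '%' (sketch)."""
    out = []
    i = 0
    while i < len(short_label_id):
        ch = short_label_id[i]
        if ch == "%":
            if short_label_id[i + 1:i + 2] == "%":
                out.append("%")  # escaped percent sign
                i += 2
                continue
            out.append(volume_id)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


vol = "BAGGAGE_20171205_081937_012345"
short = long_to_short(vol, vol + "_label_2")
restored = short_to_long(vol, short)
print(short)     # %_label_2
print(restored)  # BAGGAGE_20171205_081937_012345_label_2
```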

uval.utils.label_naming.label_long_to_short(volume_id: str, long_label_id: str) str[source]

This function will shorten the full label_id if possible. For a definition of the short and long form, see this module’s docstring.

Parameters
  • volume_id – The ID which represents an image file containing the volume

  • long_label_id – Full id of label with no ‘%’ inside

Returns

Shortened label_id, in which the volume_id is replaced by ‘%’

uval.utils.label_naming.label_short_to_long(volume_id: str, short_label_id: str) str[source]

This function will restore the full label_id if it is a short form. For a definition of the short and long form, see this module’s docstring.

Parameters
  • volume_id – The ID which represents an image file containing the volume

  • short_label_id – A shortened version of label id containing ‘%’ character

Returns

Extended label_id, in which ‘%’ is replaced by the volume_id

uval.utils.log module

This utils module currently only provides logging functionality. Once it grows too much, it will need to be split.

class uval.utils.log.RootLogger(logging_level=10)[source]

Bases: object

All the child logger messages will propagate through this root logger

set_up_handlers()[source]

Sets up Stream Handlers
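How child logger messages reach a handler on a parent logger can be shown with the standard logging module. This sketch does not use RootLogger itself; the logger names are hypothetical, and a StringIO stream replaces the console so the output can be inspected. The level 10 matches the default logging_level in the signature above and equals logging.DEBUG.

```python
import io
import logging

stream = io.StringIO()

parent = logging.getLogger("uval_sketch")  # hypothetical logger name
parent.setLevel(10)                        # 10 == logging.DEBUG
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
parent.addHandler(handler)
parent.propagate = False  # keep the demo output away from the process-wide root logger

# The child has no handler of its own; its records propagate up to the parent.
child = logging.getLogger("uval_sketch.hdf5_io")
child.debug("opened file")

output = stream.getvalue().strip()
print(output)  # uval_sketch.hdf5_io DEBUG opened file
```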

uval.utils.yaml_io module

This module provides simple functions to read and write YAML files. The data structure for pure YAML IO operations would be the dictionary

uval.utils.yaml_io.load_yaml_data(yaml_file_path: str) Dict[source]

Loads data from a YAML file into a dictionary structure and returns it as a dict. If the file does not exist or cannot be read properly, returns None.

Parameters

yaml_file_path – The path to the YAML file

Returns

Data read from YAML file in dictionary format

uval.utils.yaml_io.store_yaml_data(data_dict: dict, yaml_file_path: str) bool[source]

Stores dictionary data in a YAML file and returns True. If the folder is not writable or there is a problem with the writing, it returns False.

Parameters
  • data_dict

  • yaml_file_path

Returns

Returns true if YAML stored successfully, otherwise returns false

Module contents