mendevi.database.meta.merge_extractors
- mendevi.database.meta.merge_extractors(labels: set[str], alias: dict[str, str] | None = None, select: str | None = None, return_callable: bool = False) → tuple[set[str]]
Return the source code of the function that extracts all variables.
Parameters
- labels : set[str]
The returned variable names. These are the keys of the output dictionary.
- alias : dict[str, str], optional
By default, the extraction method for each label is defined by the function
get_extractor(). This mapping of aliases lets any otherwise unknown key define a customised access method.
- select : str, optional
A Python Boolean expression evaluated for each line. The generated function raises a RejectError exception when the expression evaluates to False.
- return_callable : boolean, default=False
By default, returns the source code of the function. If this option is set to True, an executable function is returned.
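The relationship between the two output forms (source lines versus an executable function) can be illustrated with a minimal sketch. It does not use mendevi and is not necessarily how the library builds its callable: `compile_source`, the stand-in `RejectError` class, and the hand-written source lines are all assumptions made for illustration.

```python
class RejectError(Exception):
    """Stand-in for the exception raised when a line is filtered out."""


def compile_source(lines: list[str], name: str = "line_extractor"):
    """Turn a list of source lines into a callable (hypothetical helper)."""
    namespace = {"RejectError": RejectError}
    # Executing the joined source defines the function inside `namespace`.
    exec("\n".join(lines), namespace)
    return namespace[name]


# Hypothetical source lines, shaped like the list[str] form of the output.
source = [
    "def line_extractor(raw):",
    "    hostname = raw['hostname']",
    "    reject = not ('yeti' in hostname)",
    "    if reject:",
    "        raise RejectError('this line must be filtered')",
    "    return {'hostname': hostname, 'reject': reject}",
]

line_extractor = compile_source(source)
print(line_extractor({"hostname": "yeti-1"}))
```

Keeping the source form around is handy for inspection and debugging, while the callable form is what a processing loop would actually invoke.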
Returns
- lbls_atom : set[str]
The names of the primary values to be extracted by the SQL query.
- func : list[str] or callable
The function that consumes a line from the SQL query and returns the dictionary of extracted values. It is given as a list of source code lines by default, or as a callable if return_callable is True.
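As a rough sketch of how a caller might consume the returned function, the loop below applies an extractor to each line and skips the lines rejected by the selection. The `RejectError` class, the `line_extractor` stand-in, and the `extract_all` helper are hypothetical and written only for this example; they are not part of mendevi.

```python
class RejectError(Exception):
    """Stand-in for the exception raised when a line is filtered out."""


def line_extractor(raw: dict) -> dict:
    """Stand-in extractor: keep only lines whose hostname contains 'yeti'."""
    hostname = raw["hostname"]
    if "yeti" not in hostname:
        raise RejectError("this line must be filtered")
    return {"hostname": hostname, "reject": False}


def extract_all(rows, func):
    """Apply `func` to every line, dropping the rejected ones."""
    values = []
    for raw in rows:
        try:
            values.append(func(raw))
        except RejectError:
            continue  # the selection expression was False for this line
    return values


rows = [{"hostname": "yeti-1"}, {"hostname": "other"}, {"hostname": "yeti-2"}]
print(extract_all(rows, line_extractor))
```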
Examples
>>> from mendevi.database.meta import merge_extractors
>>> print("\n".join(merge_extractors({"rate", "enc_scenario"}, select="'yeti' in hostname")[1]))
def line_extractor(raw: dict[str]) -> dict[str]:
    """Get the labels: enc_scenario, rate, reject."""
    hostname = extract.extract_hostname(raw)
    reject = not ('yeti' in hostname)
    if reject:
        raise RejectError("this line must be filtered")
    encode_cmd = extract.extract_encode_cmd(raw)
    size = extract.extract_video_size(raw)
    video_name = extract.extract_video_name(raw)
    video_duration = extract.extract_video_duration(raw)
    enc_scenario = f"cmd: {encode_cmd}, video_name: {video_name}, hostname: {hostname}"
    rate = None if size is None or video_duration is None else 8.0 * size / video_duration
    return {
        'enc_scenario': enc_scenario,
        'rate': rate,
        'reject': reject,
    }
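The rate expression in the generated code can be checked in isolation. The sketch below copies that expression into a standalone function; `compute_rate` is a hypothetical name, and the interpretation of `size` as bytes and `video_duration` as seconds (giving bits per second) is an assumption suggested by the factor 8.0, not something stated above.

```python
def compute_rate(size, video_duration):
    """Same expression as in the generated extractor; None if an input is missing."""
    # Assumed units: size in bytes, duration in seconds, result in bits per second.
    return None if size is None or video_duration is None else 8.0 * size / video_duration


print(compute_rate(1_000_000, 10.0))  # 800000.0
print(compute_rate(None, 10.0))       # None
```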