taps.filter

Filter

Bases: Protocol

Filter protocol.

A Filter is a callable object (e.g., a function) used by the Engine. It takes an object as input and returns a boolean indicating whether that object should be transformed by the Engine's data Transformer.
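
As a minimal sketch of the protocol described above, any callable that accepts one object and returns a bool should qualify as a filter. The `FilterLike` protocol and the `is_long_string` function below are hypothetical stand-ins written for illustration; they are not part of taps, and the library's actual `Filter` signature may differ in its details.

```python
from typing import Any, Protocol


class FilterLike(Protocol):
    """Hypothetical stand-in for taps.filter.Filter (an assumption)."""

    def __call__(self, obj: Any) -> bool: ...


def is_long_string(obj: Any) -> bool:
    """Pass only strings longer than 10 characters."""
    return isinstance(obj, str) and len(obj) > 10


# A plain function satisfies the callable protocol structurally,
# so no subclassing is needed.
filter_: FilterLike = is_long_string

assert filter_('a' * 11)
assert not filter_('short')
assert not filter_(12345678901)  # not a string, so it is rejected
```

Objects for which the filter returns True would be the ones handed to the Transformer; everything else would be left untouched.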

FilterConfig

Bases: BaseModel, ABC

Abstract Filter plugin configuration.

get_filter() abstractmethod

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_protocol.py
@abc.abstractmethod
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    ...

AllFilter

Filter that lets all objects pass through.

from taps.filter import AllFilter

filter_ = AllFilter()
assert filter_('value')  # always true

AllFilterConfig

Bases: FilterConfig

AllFilter plugin configuration.

get_filter()

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_simple.py
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    return AllFilter()

NullFilter

Filter that lets no objects pass through.

from taps.filter import NullFilter

filter_ = NullFilter()
assert not filter_('value')  # always false

NullFilterConfig

Bases: FilterConfig

NullFilter plugin configuration.

get_filter()

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_simple.py
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    return NullFilter()

ObjectSizeFilter

ObjectSizeFilter(
    *, min_bytes: int = 0, max_bytes: float = math.inf
)

Object size filter.

Checks whether the size of an object (computed using sys.getsizeof()) falls between a minimum and a maximum size, both inclusive.

Warning

sys.getsizeof() does not count the size of objects referred to by the main object.
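
The effect of this warning can be illustrated with the standard library alone. The sketch below builds a hypothetical `payload` list and compares the container's own size with a naive one-level sum over its items; the one-level sum is only for illustration and is not something the filter is stated to do.

```python
import sys

# Ten distinct strings of 10,000 characters each.
payload = [str(i) * 10_000 for i in range(10)]

# Size of the list object alone: it stores references, not the strings.
shallow = sys.getsizeof(payload)

# Naive one-level total: the list plus each string it refers to.
one_level = shallow + sum(sys.getsizeof(item) for item in payload)

assert shallow < 1_000
assert one_level > 100_000
```

A size threshold chosen with the full data volume in mind may therefore reject a container like this one, because only the container's own size is measured.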

Example
from taps.filter import ObjectSizeFilter

filter_ = ObjectSizeFilter(min_bytes=100)
assert not filter_('small')
assert filter_('large' * 100)

Parameters:

  • min_bytes (int, default: 0 ) –

    Minimum size threshold (inclusive) to pass through the filter.

  • max_bytes (float, default: inf ) –

    Maximum size threshold (inclusive) to pass through the filter.

Source code in taps/filter/_object.py
def __init__(
    self,
    *,
    min_bytes: int = 0,
    max_bytes: float = math.inf,
) -> None:
    self.min_bytes = min_bytes
    self.max_bytes = max_bytes

ObjectSizeFilterConfig

Bases: FilterConfig

ObjectSizeFilter plugin configuration.

get_filter()

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_object.py
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    return ObjectSizeFilter(
        min_bytes=self.min_size,
        max_bytes=self.max_size,
    )

ObjectTypeFilter

ObjectTypeFilter(
    *types: type, patterns: Sequence[str] | None = None
)

Object type filter.

Checks if an object is of a certain type using isinstance() or by pattern matching against the name of the type.

Example
from taps.filter import ObjectTypeFilter

filter_ = ObjectTypeFilter(int, str)
assert filter_(42)
assert filter_('value')
assert not filter_(3.14)

Parameters:

  • types (type, default: () ) –

    Types to check.

  • patterns (Sequence[str] | None, default: None ) –

    Regex-compatible patterns to compare against the name of the object's type.
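
The two checks described above (an isinstance() test and a pattern match on the type name) could be combined roughly as follows. `matches_type` is a hypothetical helper, not the library's implementation; in particular, the use of `re.match` (anchored at the start of the name) and of `type(obj).__name__` are assumptions.

```python
import re
from typing import Any, Sequence


def matches_type(
    obj: Any,
    *types: type,
    patterns: Sequence[str] = (),
) -> bool:
    """Hypothetical sketch of a type-or-pattern check."""
    # isinstance() accepts a tuple of types; an empty tuple never matches.
    if isinstance(obj, types):
        return True
    name = type(obj).__name__
    # Assumption: each pattern is tried with re.match against the type name.
    return any(re.match(p, name) is not None for p in patterns)


assert matches_type(42, int, str)
assert not matches_type(3.14, int, str)
assert matches_type({'a': 1}, patterns=['dict|list'])
assert matches_type(3.14, patterns=['^fl'])
assert not matches_type(3.14, patterns=['int$'])
```

Patterns are useful when the type cannot be imported where the filter is configured, for example when filters are specified as strings in a configuration file.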

Source code in taps/filter/_object.py
def __init__(
    self,
    *types: type,
    patterns: Sequence[str] | None = None,
) -> None:
    self.types = types
    self.patterns = tuple(patterns if patterns is not None else [])

ObjectTypeFilterConfig

Bases: FilterConfig

ObjectTypeFilter plugin configuration.

get_filter()

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_object.py
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    return ObjectTypeFilter(patterns=self.patterns)

PickleSizeFilter

PickleSizeFilter(
    *, min_bytes: int = 0, max_bytes: float = math.inf
)

Pickled object size filter.

Checks whether the size of an object (computed as the size of the pickled object) falls between a minimum and a maximum size, both inclusive.

Warning

Pickling large objects can take significant time, so this filter type is only recommended when the cost of the data transformation (e.g., communication or storage) is significantly greater than the cost of serializing the objects.
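
To show what measuring the pickled size buys in exchange for that cost, the sketch below compares `len(pickle.dumps(obj))` with `sys.getsizeof(obj)` for a nested object. The `payload` list is hypothetical, and calling `pickle.dumps` with the default protocol is an assumption about how the size is computed.

```python
import pickle
import sys

# Ten distinct strings of 10,000 characters each.
payload = [str(i) * 10_000 for i in range(10)]

# The pickled form contains the strings themselves, not just references.
pickled_size = len(pickle.dumps(payload))

# sys.getsizeof() reports only the list object.
shallow_size = sys.getsizeof(payload)

assert pickled_size > 100_000
assert shallow_size < 1_000
```

The pickled size reflects the referenced data, but every call serializes the whole object, which is the cost the warning above refers to.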

Example
from taps.filter import PickleSizeFilter

filter_ = PickleSizeFilter(min_bytes=100)
assert not filter_('small')
assert filter_('large' * 100)

Parameters:

  • min_bytes (int, default: 0 ) –

    Minimum size threshold (inclusive) to pass through the filter.

  • max_bytes (float, default: inf ) –

    Maximum size threshold (inclusive) to pass through the filter.

Source code in taps/filter/_object.py
def __init__(
    self,
    *,
    min_bytes: int = 0,
    max_bytes: float = math.inf,
) -> None:
    self.min_bytes = min_bytes
    self.max_bytes = max_bytes

PickleSizeFilterConfig

Bases: FilterConfig

PickleSizeFilter plugin configuration.

get_filter()

get_filter() -> Filter

Create a filter from the configuration.

Source code in taps/filter/_object.py
def get_filter(self) -> Filter:
    """Create a filter from the configuration."""
    return PickleSizeFilter(
        min_bytes=self.min_size,
        max_bytes=self.max_size,
    )