query - Data Querying

The query module provides SQL-like interfaces for filtering and selecting records from data collections. This is Layer 2 of the architecture.

Key Concepts

  • Query: A filter specification with parameters

  • Param: A single filter condition (Equals, OneOf, etc.)

  • Collect: Fields to return from matching records
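
To make the three concepts concrete, here is a minimal pure-Python sketch that does not use pyswark. The helpers `equals`, `one_of`, and `run_all` are hypothetical stand-ins written for illustration, and the result shape (a list of dicts) is an assumption; the library's implementation and return type may differ.

```python
# Hypothetical stand-ins, not pyswark's API.
# A "param" is modelled as a predicate applied to one field's value.
def equals(expected):
    return lambda value: value == expected


def one_of(options):
    return lambda value: value in options


def run_all(records, params, collect):
    """Keep records that satisfy every (field, predicate) pair,
    then return only the fields named in `collect`."""
    matches = [
        record
        for record in records
        if all(field in record and check(record[field]) for field, check in params)
    ]
    return [{name: record[name] for name in collect} for record in matches]


records = [
    {"name": "a", "status": "active", "category": "A", "value": 1},
    {"name": "b", "status": "inactive", "category": "B", "value": 2},
    {"name": "c", "status": "active", "category": "Z", "value": 3},
]

selected = run_all(
    records,
    params=[("status", equals("active")), ("category", one_of(["A", "B"]))],
    collect=["name", "value"],
)
print(selected)  # [{'name': 'a', 'value': 1}]
```

Only the first record passes both conditions, and only its `name` and `value` fields are kept.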

Classes

Query Interface

Query

Base query class with parameter filtering.

Param

Base parameter class for query conditions.

Equals

Parameter for equality comparisons.

OneOf

Parameter for “in list” comparisons.

Example

>>> from pyswark.query.interface import Query, Equals, OneOf
>>>
>>> query = Query(
...     params=[
...         ('status', Equals('active')),
...         ('category', OneOf(['A', 'B']))
...     ],
...     collect=['name', 'value']
... )

class pyswark.query.interface.Equals

Bases: Param

Parameter for equality comparison (value == other).

Example

>>> Equals('active')  # matches records where field == 'active'

inputs: bool | str | int | float
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

class pyswark.query.interface.OneOf

Bases: Param

Parameter for “in list” comparison (value in [values]).

Example

>>> OneOf(['A', 'B', 'C'])  # matches records where field in ['A', 'B', 'C']

inputs: list
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

class pyswark.query.interface.Param

Bases: BaseModel

Base class for query parameters.

Parameters:

inputs (Any) – The parameter value(s) for comparison.

inputs: Any
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

class pyswark.query.interface.Query

Bases: BaseModel

SQL-like query for filtering records.

Parameters:
  • params (list or dict) – Query parameters, given either as a list of (field_name, Param) pairs or as a dict mapping field names to Param instances.

  • collect (str or list, optional) – Field names to return in results.

Example

>>> query = Query(
...     params={'status': Equals('active')},
...     collect=['name', 'email']
... )
>>> results = query.runAll(records)

all(records: list[dict], params)

Records must meet all of the param criteria.

any(records: list[dict], params)

Records must meet at least one of the param criteria.

collect: str | list[str] | tuple[str]
classmethod collectResults(records, indices, collect)
classmethod extractRecords(records: list[Any])

Optional extraction step, for example to convert the records to a list of dicts.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

params: dict[str, Union[pyswark.query.interface.Param, dict]] | list[tuple[str, Union[pyswark.query.interface.Param, dict]]]
runAll(records: list[Any])
runAny(records: list[Any])

Usage Examples

Basic Query

from pyswark.query.interface import Query, Equals, OneOf

# Create a query with parameters
query = Query(
    params=[
        ('status', Equals('active')),
        ('category', OneOf(['A', 'B', 'C']))
    ],
    collect=['name', 'value']
)

# Run against records
results = query.runAll(records)

Query with Dict Parameters

from pyswark.query.interface import Query, Equals

# Parameters can also be specified as a dict
query = Query(
    params={
        'status': Equals('active'),
        'type': Equals('premium')
    }
)

results = query.runAll(records)

Query Methods

# runAll - records must match ALL parameters
results = query.runAll(records)

# runAny - records must match ANY parameter
results = query.runAny(records)
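
The difference between the two modes mirrors Python's built-in `all()` and `any()`. The sketch below illustrates that distinction with plain functions and sample data; it is not pyswark code, and the names `conditions` and `matches` are hypothetical.

```python
# Illustrative data and conditions; none of these names come from pyswark.
records = [
    {"id": 1, "status": "active", "type": "premium"},
    {"id": 2, "status": "active", "type": "basic"},
    {"id": 3, "status": "inactive", "type": "basic"},
]

conditions = {
    "status": lambda value: value == "active",
    "type": lambda value: value == "premium",
}


def matches(record, combine):
    """Apply every condition to its field and merge the results
    with `combine`, which is either the built-in all or any."""
    return combine(check(record.get(field)) for field, check in conditions.items())


match_all = [r["id"] for r in records if matches(r, all)]
match_any = [r["id"] for r in records if matches(r, any)]

print(match_all)  # [1]
print(match_any)  # [1, 2]
```

Record 2 is active but not premium, so it appears only in the "any" result; record 3 satisfies neither condition and is excluded from both.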

Custom Parameters

You can create custom parameter types by extending Param:

from pyswark.query.interface import Param, Query

class GreaterThan(Param):
    """value > threshold"""
    inputs: float

    def __call__(self, value, records=None):
        return value > self.inputs

# Use in query
query = Query(params=[
    ('price', GreaterThan(100.0))
])
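
How the library invokes a parameter is not documented on this page. The sketch below assumes, based only on the `__call__(self, value, records=None)` signature above, that a parameter is called once per record with that record's field value. `GreaterThanSketch` is a hypothetical plain-Python stand-in (no pydantic, no pyswark) used to show the comparison in isolation.

```python
class GreaterThanSketch:
    """Plain-Python stand-in mirroring the GreaterThan example's call signature."""

    def __init__(self, inputs: float):
        self.inputs = inputs

    def __call__(self, value, records=None):
        # Strict comparison: a value equal to the threshold does not match.
        return value > self.inputs


records = [
    {"item": "pen", "price": 2.5},
    {"item": "desk", "price": 250.0},
    {"item": "lamp", "price": 100.0},
]

check = GreaterThanSketch(100.0)

# Assumed calling convention: one call per record, passing the field's value.
expensive = [r["item"] for r in records if check(r["price"], records)]
print(expensive)  # ['desk']
```

Note that the lamp, priced exactly at the threshold, is excluded because the comparison is strict.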

Serialization

Queries are fully serializable:

from pyswark.query.interface import Query, Equals
from pyswark.lib.pydantic import ser_des

query = Query(params={'status': Equals('active')})

# Serialize
json_str = ser_des.toJson(query)

# Deserialize
restored = ser_des.fromJson(json_str)
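
`fromJson` is called above without naming the target class, which suggests the serialized form carries enough type information to rebuild the object. How `ser_des` actually does this is not shown here. As a rough sketch of the general idea only, using the standard library and hypothetical names (`EqualsSketch`, `REGISTRY`, `to_json`, `from_json`), a type tag can be stored alongside the data:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EqualsSketch:
    """Hypothetical stand-in for a serializable parameter."""
    inputs: str


# Maps the type tag stored in the JSON back to a class.
REGISTRY = {"EqualsSketch": EqualsSketch}


def to_json(obj) -> str:
    # Store the class name next to the field data.
    return json.dumps({"type": type(obj).__name__, "data": asdict(obj)})


def from_json(text: str):
    # Look up the class by its tag and rebuild the instance.
    payload = json.loads(text)
    return REGISTRY[payload["type"]](**payload["data"])


original = EqualsSketch("active")
restored_sketch = from_json(to_json(original))
print(restored_sketch == original)  # True
```

The round trip returns an equal object without the caller specifying its class, which is the property the example above relies on.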