[SQL] Use a single WINDOW operator for temporal filters instead of two when possible#5551
[SQL] Use a single WINDOW operator for temporal filters instead of two when possible#5551mihaibudiu merged 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves how temporal filters involving NOW() are compiled by merging compatible filter predicates so they can be implemented with a single WINDOW operator when possible, which can reduce memory usage. It also refactors temporal-filter analysis logic into reusable top-level classes and adds tests to validate the new behavior.
Changes:
- Extracts temporal filter decomposition and representation (
BooleanExpression,TemporalFilter,TemporalFilterList, etc.) fromRewriteNowinto dedicated top-level classes. - Adds a
MergeFilterspass to recombine consecutiveFILTERoperators with compatible temporal predicates into a single filter, before rewriting to window operators. - Updates and adds tests (including a regression test ensuring only one window is generated for certain
NOW()-based predicates) and introduces a small simplification inSimplifyforANDofwrap_boolexpressions, with an added type-consistency assertion.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
TemporalFilterTests.java |
Adjusts tests to reference the new top-level BooleanExpression interface used for temporal filter decomposition. |
StreamingTests.java |
Updates expected counts in streaming tests to reflect that some scenarios now compile to a single window instead of multiple windows. |
Regression2Tests.java |
Adds a regression test verifying that two compatible temporal predicates are implemented using a single DBSPWindowOperator. |
WindowBounds.java |
Introduces a top-level record to encapsulate computed window bounds and generate window closures. |
WindowBound.java |
Introduces a top-level record representing a single window bound and logic to combine bounds. |
TemporalFilterList.java |
New top-level record representing a compatible list of TemporalFilter instances and logic to derive WindowBounds. |
TemporalFilter.java |
New top-level record modeling a temporal comparison suitable for implementation as a temporal filter. |
RewriteNow.java |
Refactors to use the new temporal helper classes and centralizes temporal filter decomposition via FindComparisons. |
NonTemporalFilter.java |
New top-level record representing non-temporal boolean expressions within filter decompositions. |
NoNow.java |
New top-level record for boolean expressions that do not involve NOW(). |
MergeFilters.java |
New visitor that merges consecutive FILTER operators with compatible temporal predicates into a single filter. |
ImplementNow.java |
Wires the new MergeFilters pass into the temporal rewrite pipeline after breaking filters. |
FindComparisons.java |
New helper class that decomposes filter closures into temporal and non-temporal boolean components. |
BooleanExpression.java |
Defines a shared interface for boolean expression fragments (temporal and non-temporal) used in temporal analysis. |
Simplify.java |
Adds a simplification for AND of wrap_bool expressions and enforces type consistency between original and simplified expressions. |
| * | ||
| * @param parameter Parameter of the original filter comparison function | ||
| * @param noNow left expression, which does not involve now. | ||
| * @param withNow right expression, which must ba a monotone expression of now(), |
There was a problem hiding this comment.
The Javadoc for withNow has a typo: "must ba a monotone expression" should be "must be a monotone expression".
| * @param withNow right expression, which must ba a monotone expression of now(), | |
| * @param withNow right expression, which must be a monotone expression of now(), |
|
|
||
| import org.dbsp.util.ICastable; | ||
|
|
||
| /** Boolean expression that appear as conjuncts in a filter expression */ |
There was a problem hiding this comment.
The Javadoc has a small grammatical issue; "Boolean expression that appear" should be "Boolean expressions that appear".
| /** Boolean expression that appear as conjuncts in a filter expression */ | |
| /** Boolean expressions that appear as conjuncts in a filter expression */ |
| import java.util.Objects; | ||
|
|
||
| /** | ||
| * A List of NowComparison objects that are "compatible" with each other. |
There was a problem hiding this comment.
The Javadoc still refers to "NowComparison objects", but the implementation uses TemporalFilter instances; updating the comment to reference TemporalFilter (or the current abstraction) would make the documentation accurate and reduce confusion.
| * A List of NowComparison objects that are "compatible" with each other. | |
| * A list of {@link TemporalFilter} instances that are "compatible" with each other. |
…o when possible Signed-off-by: Mihai Budiu <mbudiu@feldera.com>
be61ba0 to
9e4bc20
Compare
|
|
||
| /** Visitor which finds consecutive filters that contain NOW expressions that can be implemented in a single WINDOW | ||
| * operator and merges them into a single filter operation. */ | ||
| public class MergeFilters extends CircuitCloneVisitor { |
There was a problem hiding this comment.
why is this specific to NOW(). Shouldn't any filter of the form expr1 <= ts <= expr2 benefit?
There was a problem hiding this comment.
yes, expr1 and expr2 must involve NOW to use a window; otherwise they are just regular filters.
There's a general form recognized by the FindFilters visitor.
In the past we used to implement filters such as
WHERE x < NOW() - expr1 AND x > NOW() - expr2using a single DBSP WINDOW primitive with bounds for left and right. Then we added support for disjunctive queries, using OR, and the algorithm became much simpler, decomposing predicates using NOW into complex networks of filters and sums using de Morgan's rules. In the process we started compiling such filters into a pair of windows, one with a left bound and one with a right bound.This PR adds a little step which will merge back two filters into one if their predicates are "compatible" in this way. This is not necessarily optimal, since it may not recognize more complex patterns where the predicate terms can be reordered, but it regained the ability to compile some filters into a single WINDOW, which can be potentially much more effective in reducing memory consumption.
The PR looks big because I moved (using automatic refactoring tools, none of that AI crap) many inner classes to the toplevel to enable both analyses to reuse them. The only new stuff is really in
MergeFilters.