# Hour Categorization Algorithm

## Overview

The hour categorization system determines how many of an employee's worked hours in a day are assigned to a **category** (e.g. "Extra Hours", "Night Hours"). Each category defines one or more **rules**, and each rule contains one or more **conditions** combined with AND.

The algorithm operates on **time intervals** (Luxon `Interval`), which allows it to precisely represent which portions of the worked day qualify.

> See also: `GET /time-tracking/categorized-hours/explain?userId=&dateString=` runs this same algorithm on demand and returns the executed rule ASTs (`RUN → CATEGORY → RULE → condition chain`), including non-matching categories. Built directly by `business/utils/ruleAstBuilder.ts` + `ruleAstExecutor.ts` (no event-stream reconstruction); entry point at `TimeTrackingHourCategorizationService.explainCategorization`.

---

## Model structure

```
Category
├── Rule 1 (conditions combined with AND)
│   ├── Condition A
│   └── Condition B
├── Rule 2 (conditions combined with AND)
│   └── Condition C
└── ...
```

- **Category**: groups rules. Rules are combined with **union** (OR).
- **Rule**: groups conditions. Conditions are combined with **AND** (intersection + thresholds).
- **Condition**: individual criterion (time range, day of week, hour threshold, etc.).

---

## Condition types

Conditions are classified into two types:

### Filter conditions (spatial)

Produce **intervals** that represent "where" the hours count.

| Field | ValueType | Example | Result |
|-------|-----------|---------|--------|
| `TIME_RANGE` | `TIME_RANGE` | `10:00-15:00` | Intersection of worked hours with the range |
| `WORKED_HOURS` | `DAY_LIST` | `MONDAY` | All worked intervals if the day is Monday, empty otherwise |
| `WORKED_HOURS` | `DAY_TYPE` | `WORKDAY` | All worked intervals if it's a workday, empty otherwise |

### Threshold conditions (scalar)

Produce a **value in minutes** that trims the resulting intervals.

| Field | Operator | ValueType | Example | Effect |
|-------|----------|-----------|---------|--------|
| `WORKED_HOURS` | `GREATER_THAN` | `NUMBER_OF_HOURS` | `> 3h` | Removes the first 3h (only the excess counts) |
| `WORKED_HOURS` | `LESS_THAN` | `NUMBER_OF_HOURS` | `< 5h` | Keeps at most the first 5h |
| `WORKED_HOURS` | `GREATER_THAN` | `SCHEDULED_HOURS` | `> scheduled` | Removes scheduled hours from the start |
| `WORKED_HOURS` | `LESS_THAN` | `SCHEDULED_HOURS` | `< scheduled` | Keeps at most the scheduled hours |

---

## Per-rule algorithm: literal-order chain (`buildRuleAst` + `executeRuleAst`)

A rule's conditions are evaluated **in the literal order of its `conditions` array**. `buildRuleAst(rule)` turns them into a linear chain (one node per condition, in order); `executeRuleAst` seeds the chain with the day's worked intervals and applies each condition as a transform to the running result.

**There is no fixed FILTER→GT→LT pipeline and no aggregation across conditions**: the configured order is the order that runs. This is the point of SQEG-2727: for example, `> scheduled` placed *before* a `TIME_RANGE` removes scheduled hours first, then keeps the windowed part. A canonical `[filter, GREATER_THAN, LESS_THAN]` rule computes exactly as the old fixed pipeline did.

The three transform types, applied wherever each condition sits in the array:

### Filter conditions: intersect / gate

All filter conditions produce intervals that are **intersected** with each other (logical AND). If there are multiple filters, the result is the portion of time that satisfies **all of them** simultaneously.
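
The intersection described above can be sketched in a few lines. This is a minimal illustration, not the production code: it assumes intervals are modelled as minutes since midnight rather than Luxon `Interval`, and the names `Span`, `intersectAll`, `timeRange`, and `dayGate` are hypothetical.

```typescript
// Simplified model: an interval is [start, end) in minutes since midnight.
// The real system uses Luxon `Interval`; `Span` is a stand-in for illustration.
type Span = { start: number; end: number };

// Pairwise intersection of two interval lists; empty overlaps are dropped.
function intersectAll(a: Span[], b: Span[]): Span[] {
  const out: Span[] = [];
  for (const x of a) {
    for (const y of b) {
      const start = Math.max(x.start, y.start);
      const end = Math.min(x.end, y.end);
      if (start < end) out.push({ start, end });
    }
  }
  return out.sort((p, q) => p.start - q.start);
}

// TIME_RANGE filter: the configured window itself, as a one-element list.
const timeRange = (from: number, to: number): Span[] => [{ start: from, end: to }];

// DAY_LIST / DAY_TYPE filter: a gate that passes all worked intervals or none.
const dayGate = (worked: Span[], matches: boolean): Span[] => (matches ? worked : []);

const h = (hours: number): number => hours * 60;
const totalHours = (spans: Span[]): number =>
  spans.reduce((sum, s) => sum + (s.end - s.start), 0) / 60;

const worked: Span[] = [{ start: h(8), end: h(18) }];

// Filter A (TIME_RANGE 10:00-15:00) AND Filter B (DAY=MONDAY), on a Monday.
const onMonday = intersectAll(
  intersectAll(worked, timeRange(h(10), h(15))),
  dayGate(worked, true),
);

// Same rule on a Tuesday: the day gate is empty, so nothing survives.
const onTuesday = intersectAll(
  intersectAll(worked, timeRange(h(10), h(15))),
  dayGate(worked, false),
);

console.log(totalHours(onMonday)); // 5
console.log(totalHours(onTuesday)); // 0
```

Modelling the day filters as an all-or-nothing gate keeps every filter in the same shape (a list of intervals), so AND reduces to repeated intersection.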

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00

Filter A      ·········|██████████████████|···········  (TIME_RANGE)
                       10:00          15:00

Filter B      |████████████████████████████████████████|  (DAY=MONDAY)
              08:00                                18:00  (full day if Monday)

Intersection  ·········|██████████████████|···········  (A ∩ B)
                       10:00          15:00
```

If there are no filters but thresholds exist, the full worked intervals are used as the starting point.

### GREATER_THAN threshold: trim (`trimFromStart`)

Removes that many minutes from the **start** of the running intervals, so only the excess beyond the threshold counts. Each `GREATER_THAN` condition trims **independently, in place** (in its array position).

```
Filtered:     ·········|██████████████████|···········
                       10:00          15:00  (5h)

GREATER_THAN 3h → remove 3h from start:
              ·········|░░░░░░░░░░░░|█████|···········
                       10:00     13:00 15:00
                        ^^^^^^^^^^^^ 3h removed

Result:       ······················|█████|···········
                                 13:00 15:00  (2h)
```

⚠️ Multiple `GREATER_THAN` conditions compose **additively** (they trim in sequence), NOT as `max`:

- `> 3h` then `> 5h` → 3h + 5h = **8h removed** (not 5h).

A rule meant as "overtime beyond the greater of two thresholds" must not be written as two GTs; remediation of existing such rules is tracked in SQEG-2800.

### LESS_THAN threshold: cap (`keepFromStart`)

Keeps only that many minutes from the **start** of the running intervals: "at most N hours".

```
Previous result:  ·|████████████████|·
                   09:00        15:00  (6h)

LESS_THAN 4h → keep first 4h:
                  ·|████████████|░░░░|·
                   09:00    13:00 15:00
                                 ^^^^ discarded

Result:           ·|████████████|·····
                   09:00    13:00  (4h)
```

Multiple `LESS_THAN` conditions compose to the **minimum** naturally: `keep(5h)` then `keep(3h)` = keep 3h (most restrictive). This matches the AND meaning, so no special handling is needed.

### Diagram: illustrative (one of each, canonical order)

> Conditions actually run in **literal array order** with no aggregation (see above).
> The boxes below show the common canonical single-threshold case; with multiple same-type thresholds the GREATER_THAN step is **additive**, not `max`.

```
┌─────────────────────────────────────────────────────┐
│  Day's worked intervals                             │
│  [08:00-12:00, 13:00-18:00]                         │
└───────────────────────┬─────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│  STEP 1: Filter intersection                        │
│                                                     │
│  TIME_RANGE, DAY_LIST, DAY_TYPE                     │
│  Multiple filters → intersection (AND)              │
│                                                     │
│  If no filters → use full worked intervals          │
└───────────────────────┬─────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│  STEP 2: trimFromStart( max(GREATER_THAN) )         │
│                                                     │
│  Removes the first N hours from the start           │
│  Only the excess beyond the threshold remains       │
│                                                     │
│  If no GREATER_THAN → no change                     │
└───────────────────────┬─────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│  STEP 3: keepFromStart( min(LESS_THAN) )            │
│                                                     │
│  Keeps only the first N hours                       │
│  Discards the rest                                  │
│                                                     │
│  If no LESS_THAN → no change                        │
└───────────────────────┬─────────────────────────────┘
                        │
                        ▼
                 Rule's intervals
```

---

## Rule combination: union

Rules within a category are combined with **union** (interval merge). This means that if two rules produce overlapping intervals, the hours are not double-counted.

```
calculateCategoryHours(category):
  allIntervals = []
  for each rule in category.rules:
    allIntervals.push(...executeRuleAst(buildRuleAst(rule)).intervals)
  return totalDurationInHours( Interval.merge(allIntervals) )
```

### Rule union diagram

```
Worked:  |████████████████████████████████████████|
         08:00                                18:00

Rule 1   ·········|██████████████████|···········  (TIME_RANGE 10:00-15:00)
                  10:00          15:00  → 5h

Rule 2   |████████████████████████████████████████|  (DAY=MONDAY)
         08:00                                18:00  → 10h

Union    |████████████████████████████████████████|  (merge)
         08:00                                18:00  → 10h (not 15h)
```

The union subsumes overlapping intervals.
The result is 10h (the wider range), not the arithmetic sum 5h + 10h = 15h.

---

## Examples

### Example 1: Simple filter, TIME_RANGE without thresholds

**Setup:**
- Rule: `TIME_RANGE = 10:00-15:00`
- Work: 08:00-18:00 (10h)

**Calculation:**

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00

TIME_RANGE:   ·········|██████████████████|···········
                       10:00          15:00

Intersection: ·········|██████████████████|···········
                       10:00          15:00
```

- Step 1: intersect([08:00-18:00], [10:00-15:00]) = [10:00-15:00]
- Step 2: no GREATER_THAN → no change
- Step 3: no LESS_THAN → no change
- **Result: 5h**

---

### Example 2: Filter + DAY_LIST (AND)

**Setup:**
- Rule: `TIME_RANGE = 10:00-15:00 AND DAY_LIST = MONDAY`
- Work: 08:00-18:00 (10h) on Monday

**Calculation:**

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00

TIME_RANGE:   ·········|██████████████████|···········
                       10:00          15:00

DAY=MONDAY:   |████████████████████████████████████████|
              08:00                                18:00

Intersection: ·········|██████████████████|···········
                       10:00          15:00
```

- Step 1: intersect([10:00-15:00], [08:00-18:00]) = [10:00-15:00]
- **Result: 5h**

If the day were Tuesday, DAY_LIST = MONDAY returns empty → **0h**.

---

### Example 3: Threshold only, GREATER_THAN without filters

**Setup:**
- Rule: `WORKED_HOURS > 2h`
- Work: 08:00-13:00 (5h)

**Calculation:**

```
Worked:   |████████████████████|····
          08:00            13:00  (5h)

No filters → starts from full worked intervals.

GREATER_THAN 2h → trimFromStart(2h):
          |░░░░░░░░|██████████████|····
          08:00  10:00        13:00
           ^^^^^^^^ 2h removed

Result:   ··········|██████████████|····
                    10:00      13:00  (3h)
```

- Step 1: no filters → [08:00-13:00]
- Step 2: trim 2h → [10:00-13:00]
- **Result: 3h**

---

### Example 4: Threshold only, LESS_THAN without filters

**Setup:**
- Rule: `WORKED_HOURS < 3h`
- Work: 08:00-13:00 (5h)

**Calculation:**

```
Worked:   |████████████████████|····
          08:00            13:00  (5h)

LESS_THAN 3h → keepFromStart(3h):
          |████████████|░░░░░░░|····
          08:00      11:00 13:00
                        ^^^^^^ discarded

Result:   |████████████|·············
          08:00      11:00  (3h)
```

- Step 1: no filters → [08:00-13:00]
- Step 3: keep 3h → [08:00-11:00]
- **Result: 3h**

---

### Example 5: Filter + LESS_THAN (AND)

**Setup:**
- Rule: `TIME_RANGE = 10:00-15:00 AND WORKED_HOURS < 3h`
- Work: 08:00-18:00 (10h)

**Calculation:**

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00

Step 1 - Filter TIME_RANGE:
              ·········|██████████████████|···········
                       10:00          15:00  (5h)

Step 3 - LESS_THAN 3h → keepFromStart(3h):
              ·········|████████████|░░░░░|···········
                       10:00     13:00 15:00
                                     ^^^^ discarded

Result:       ·········|████████████|·················
                       10:00     13:00  (3h)
```

- **Result: 3h** (of the 5h in range, keeps the first 3h)

---

### Example 6: Filter + GREATER_THAN, full overlap

**Setup:**
- Rule: `TIME_RANGE = 10:00-15:00 AND WORKED_HOURS > 3h`
- Work: 08:00-18:00 (10h)

**Calculation:**

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00

Step 1 - Filter TIME_RANGE:
              ·········|██████████████████|···········
                       10:00          15:00  (5h)

Step 2 - GREATER_THAN 3h → trimFromStart(3h):
              ·········|░░░░░░░░░░░░|█████|···········
                       10:00     13:00 15:00
                        ^^^^^^^^^^^^ 3h removed

Result:       ······················|█████|···········
                                 13:00 15:00  (2h)
```

- **Result: 2h** (of the 5h in range, the first 3h are "normal"; only the 2h excess is categorized)

---

### Example 7: Filter + GREATER_THAN, partial overlap

**Setup:**
- Rule: `TIME_RANGE = 10:00-15:00 AND WORKED_HOURS > 3h`
- Work: 14:00-20:00 (6h)

**Calculation:**

```
Worked:       ·····|████████████████████|
                   14:00            20:00  (6h)

Step 1 - Filter TIME_RANGE:
              ·········|██████████████████|···········
                       10:00          15:00

Intersection: ·····|████████|············
                   14:00 15:00  (1h)

Step 2 - GREATER_THAN 3h → trimFromStart(3h):
              ····|░|····················
                  14:00 15:00
                   ^ 1h < 3h → entirely removed

Result:       empty → 0h
```

- **Result: 0h** (only 1h falls within the range, which does not exceed the 3h threshold within that range)

---

### Example 8: Union of two rules (one subsumes the other)

**Setup:**
- Rule 1: `TIME_RANGE = 10:00-15:00`
- Rule 2: `DAY_LIST = MONDAY`
- Work: 08:00-18:00 (10h) on Monday

**Calculation:**

```
Rule 1:  ·········|██████████████████|···········
                  10:00          15:00  → [10:00-15:00]

Rule 2:  |████████████████████████████████████████|
         08:00                                18:00  → [08:00-18:00]

Union:   |████████████████████████████████████████|  (merge)
         08:00                                18:00  → 10h
```

- **Result: 10h** (Rule 2 subsumes Rule 1; they are not summed)

---

### Example 9: Union of two rules with thresholds

**Setup:**
- Rule 1: `DAY_LIST = MONDAY` (all hours)
- Rule 2: `TIME_RANGE = 10:00-15:00 AND WORKED_HOURS < 3h`
- Work: 08:00-18:00 (10h) on Monday

**Calculation:**

```
Rule 1:  |████████████████████████████████████████|  (MONDAY)
         08:00                                18:00  → [08:00-18:00]

Rule 2:  ·········|████████████|··················  (TIME_RANGE + <3h)
                  10:00    13:00  → [10:00-13:00]  (5h filtered, keep 3h)

Union:   |████████████████████████████████████████|  (merge)
         08:00                                18:00  → 10h
```

- **Result: 10h** (Rule 1 already covers the entire day; Rule 2 is subsumed)

On Tuesday (Rule 1 does not match):
- Rule 1: DAY_LIST = MONDAY → empty
- Rule 2: [10:00-13:00] (3h)
- **Result: 3h**

---

### Example 10: GREATER_THAN + LESS_THAN combined

**Setup:**
- Rule: `WORKED_HOURS > 2h AND WORKED_HOURS < 5h`
- Work: 08:00-18:00 (10h)

**Calculation:**

```
Worked:       |████████████████████████████████████████|
              08:00                                18:00  (10h)

Step 2 - GREATER_THAN 2h → trimFromStart(2h):
              |░░░░░░░░|██████████████████████████████|
              08:00    10:00                      18:00
              → [10:00-18:00] (8h)

Step 3 - LESS_THAN 5h → keepFromStart(5h):
              ·········|██████████████████|░░░░░░░░░░░|
                       10:00          15:00       18:00
                                           ^^^^^ discarded
              → [10:00-15:00] (5h)
```

- **Result: 5h** (the first 2h are discarded, then only the next 5h are kept)

---

### Example 11: Night shift with cross-midnight TIME_RANGE

**Setup:**
- Rule 1: `TIME_RANGE = 14:00-00:00 AND DAY_LIST = SATURDAY`
- Rule 2: `DAY_LIST = SUNDAY`
- Work on Saturday: 21:00-03:00 (night shift, 6h total)

**Calculation for Saturday's day summary** (includes: 00:00-03:00 from Friday→Saturday + 21:00-00:00 from Saturday→Sunday):

```
Worked Saturday: |███|·················|███|
                 00:00 03:00       21:00 00:00

Rule 1:
  TIME_RANGE 14:00-00:00 (cross-midnight → [14:00, next-day 00:00]):
                 ··················|██████████|
                                   14:00  00:00
  intersect with DAY=SATURDAY (all worked):
                 ··················|██████████|
                                   14:00  00:00
  intersect with worked:
                 ··················|···|███|··
                                     21:00 00:00  → 3h

Rule 2:
  DAY=SUNDAY → Saturday is not Sunday → empty

Union: [21:00-00:00] → 3h
```

- **Saturday result: 3h**

**Calculation for Sunday's day summary** (includes: 00:00-03:00 + 21:00-00:00):

- Rule 1: DAY_LIST = SATURDAY → Sunday is not Saturday → empty
- Rule 2: DAY_LIST = SUNDAY → [00:00-03:00, 21:00-00:00] → 6h
- **Sunday result: 6h**
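
To tie the examples together, the sketch below replays a few of them through a literal-order chain and a union. It is an illustration under stated assumptions, not the code in `ruleAstExecutor.ts`: intervals are minutes since midnight instead of Luxon `Interval`, and `Iv`, `Step`, `runChain`, `mergeAll`, and the bodies of `trimFromStart` and `keepFromStart` are hypothetical stand-ins.

```typescript
// Simplified model: [start, end) in minutes since midnight.
// Stand-in for Luxon `Interval`; every name in this block is hypothetical.
type Iv = { start: number; end: number };

type Step =
  | { kind: "TIME_RANGE"; from: number; to: number }
  | { kind: "GREATER_THAN"; minutes: number }
  | { kind: "LESS_THAN"; minutes: number };

// Drop `minutes` of duration from the start of the list (GREATER_THAN).
function trimFromStart(ivs: Iv[], minutes: number): Iv[] {
  const out: Iv[] = [];
  let toSkip = minutes;
  for (const iv of ivs) {
    const len = iv.end - iv.start;
    if (toSkip >= len) {
      toSkip -= len; // interval entirely consumed by the threshold
    } else {
      out.push({ start: iv.start + toSkip, end: iv.end });
      toSkip = 0;
    }
  }
  return out;
}

// Keep at most `minutes` of duration from the start of the list (LESS_THAN).
function keepFromStart(ivs: Iv[], minutes: number): Iv[] {
  const out: Iv[] = [];
  let budget = minutes;
  for (const iv of ivs) {
    if (budget <= 0) break;
    const len = Math.min(budget, iv.end - iv.start);
    out.push({ start: iv.start, end: iv.start + len });
    budget -= len;
  }
  return out;
}

// Apply the steps strictly in array order to the running result.
function runChain(workedIvs: Iv[], steps: Step[]): Iv[] {
  let running = workedIvs;
  for (const step of steps) {
    if (step.kind === "TIME_RANGE") {
      const { from, to } = step;
      running = running
        .map((iv) => ({ start: Math.max(iv.start, from), end: Math.min(iv.end, to) }))
        .filter((iv) => iv.start < iv.end);
    } else if (step.kind === "GREATER_THAN") {
      running = trimFromStart(running, step.minutes);
    } else {
      running = keepFromStart(running, step.minutes);
    }
  }
  return running;
}

// Union of rule results: merge overlapping or touching intervals.
function mergeAll(ivs: Iv[]): Iv[] {
  const sorted = [...ivs].sort((a, b) => a.start - b.start);
  const out: Iv[] = [];
  for (const iv of sorted) {
    const last = out[out.length - 1];
    if (last && iv.start <= last.end) last.end = Math.max(last.end, iv.end);
    else out.push({ ...iv });
  }
  return out;
}

const gt = (minutes: number): Step => ({ kind: "GREATER_THAN", minutes });
const lt = (minutes: number): Step => ({ kind: "LESS_THAN", minutes });
const range = (from: number, to: number): Step => ({ kind: "TIME_RANGE", from, to });
const hoursOf = (ivs: Iv[]): number =>
  ivs.reduce((sum, iv) => sum + (iv.end - iv.start), 0) / 60;

const day: Iv[] = [{ start: 8 * 60, end: 18 * 60 }]; // worked 08:00-18:00

const ex10 = runChain(day, [gt(120), lt(300)]); // Example 10: > 2h then < 5h
const twoGts = runChain(day, [gt(180), gt(300)]); // additive: 3h + 5h removed
const rangeThenGt = runChain(day, [range(600, 900), gt(180)]); // Example 6
const gtThenRange = runChain(day, [gt(180), range(600, 900)]); // same conditions, swapped
// Category total: union of two rule results, so the overlap is counted once.
const categoryHours = hoursOf(mergeAll([...ex10, ...rangeThenGt]));

console.log(hoursOf(ex10), hoursOf(twoGts), hoursOf(rangeThenGt), hoursOf(gtThenRange), categoryHours);
// prints: 5 2 2 4 5
```

The last two chains use the same two conditions in opposite order and yield 2h versus 4h, which is the order sensitivity that the literal-order chain makes explicit.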