# Livestream Stats: How the Calculation Works

## Overview

`GET /posts/:id/livestream_stats` (and the groups variant) returns aggregated viewer statistics for a completed livestream.

| Field | Source |
|-------|--------|
| `totalViewers` | Stream.io report: `participants.unique - publishers.unique` |
| `totalUniqueViewers` | Same as `totalViewers` (Stream deduplicates by `user_id`) |
| `maximumConcurrentViewers` | Stream.io report: `participants.max_concurrent - publishers.unique` |
| `averageViewingTimeSeconds` | Stream.io report: `subscribers.total_subscribed_duration_seconds / subscribers.total` |
| `viewers.mobilePercentage` | Stream.io report: `by_operating_system` (case-insensitive mobile filter) |
| `viewers.webPercentage` | Stream.io report: `by_operating_system` (everything that is not mobile) |
| `amountOfViewersTimeline` | Stream.io report: `count_over_time.by_minute` with ghost subtraction |

The mapping lives in `infrastructure/streamIOProviderAdapter.ts` → `getLivestreamStats`.

---

## The Ghost Participant Problem

Stream.io counts every connection event in `participants.subscribers.*` and `count_over_time.by_minute`. This includes system actors that are not human viewers. These ghosts inflate `subscribers.total`, `subscribers.unique`, and the per-minute `max` values. A stream with zero human viewers still shows 2–4 subscribers in the raw report.

### What is NOT inflated

`participants.unique` and `participants.max_concurrent` count **distinct authenticated users** who joined the call. Stream's internal system accounts are separate from human user accounts, so these two fields are ghost-free.

---

## Field-by-field Derivation

### `totalHosts`

```
totalHosts = participants.publishers.unique
```

`publishers.unique` is used, **not** `publishers.total`. When a host uses OBS (external RTMP) and reconnects the stream multiple times, each reconnection creates a new publisher slot, inflating `publishers.total`.
`publishers.unique` stays at 2 (RTMP ingest user + WebRTC host control client) regardless of how many times the host reconnected.

### `totalViewers` / `totalUniqueViewers`

```
totalViewers = participants.unique - totalHosts
```

`participants.unique` is the count of distinct users who joined. Subtracting `totalHosts` removes the host (and the OBS RTMP user for external streams) to get the human viewer count. Stream deduplicates by `user_id`, so a viewer who rejoined still counts as 1.

If `totalViewers <= 0` (host-only stream), all stats return zero immediately.

### `maximumConcurrentViewers`

```
maximumConcurrentViewers = participants.max_concurrent - totalHosts
```

`max_concurrent` is the peak number of simultaneous participants. Subtracting hosts gives the peak concurrent viewer count.

### `averageViewingTimeSeconds`

```
averageViewingTimeSeconds = round(subscribers.total_subscribed_duration_seconds / subscribers.total)
```

Stream is the only source of viewing duration, so `subscribers.*` is used here despite ghost inflation. Ghosts contribute duration too, but this is the best available approximation. The calculation guards against division by zero when `subscribers.total = 0`.

### Ghost offset for the timeline

```
ghosts = max(0, subscribers.unique - totalViewers)
```

`subscribers.unique` includes ghost accounts. `totalViewers` is the known human count. The difference is the per-stream ghost offset applied to each timeline bucket. `Math.max(0, ...)` prevents a negative offset on unusual payloads.

### `amountOfViewersTimeline`

Stream reports `count_over_time.by_minute`: a sparse list of `{ start_ts, max }` entries, emitted only when the participant count changes within that minute.

The helper `getViewersTimeline` reconstructs a **fixed 5-minute-bucket** timeline:

1. Groups consecutive per-minute samples into 5-minute windows (first sample's timestamp anchors the first window).
2. Fills gaps between windows by repeating the last known bucket (Stream omits windows with no change).
3. Takes the `max` within each bucket, subtracts the ghost offset, and clamps to 0.

**Known ±1 artifact**: when two viewers join/leave within the same minute but in different seconds, Stream's per-minute `max` can count them as simultaneous even if they never overlapped. This is an inherent limitation of the 1-minute granularity of the Stream report and is accepted as-is.

### OS breakdown

```
mobileOsNames = { 'android', 'ios' }   // matched case-insensitively

mobileViewers = by_operating_system.filter(os => isMobile(os.name)).sum(os.unique)
webViewers    = by_operating_system.filter(os => !isMobile(os.name)).sum(os.unique)

mobilePercentage = round(mobileViewers / (mobileViewers + webViewers) * 100)
webPercentage    = round(webViewers / (mobileViewers + webViewers) * 100)
```

Stream returns mixed-case OS names (`Android`, `android`, `iOS`, `Linux`, `macOS`, `Mac OS`). The filter lowercases before matching. If no device entries exist, both percentages return 0 (avoids NaN).

Note: ghost devices (`stream-react`, `stream-go`) appear in `by_device` but also map to OS entries (`Linux`, `linux`, `macOS`). They are currently counted as web viewers, introducing a small constant bias.
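
The scalar derivations in this section can be combined into one small function. The following is a minimal TypeScript sketch under stated assumptions, not the code in `getLivestreamStats`: the `ParticipantsReport` and `ViewerStats` types, the `deriveViewerStats` name, and the `0` fallback for an empty `subscribers.total` are all hypothetical and exist only to illustrate the formulas.

```typescript
// Hypothetical, trimmed-down shape: only the report fields this document uses.
interface ParticipantsReport {
  unique: number;
  max_concurrent: number;
  publishers: { unique: number };
  subscribers: {
    total: number;
    unique: number;
    total_subscribed_duration_seconds: number;
  };
  by_operating_system: { name: string; unique: number }[];
}

// Illustrative result type, not the real response DTO.
interface ViewerStats {
  totalViewers: number;
  maximumConcurrentViewers: number;
  averageViewingTimeSeconds: number;
  ghosts: number; // timeline offset only, not an API field
  mobilePercentage: number;
  webPercentage: number;
}

const MOBILE_OS_NAMES = ["android", "ios"];

// Lowercase before matching, because Stream returns mixed-case OS names.
function isMobile(osName: string): boolean {
  return MOBILE_OS_NAMES.indexOf(osName.toLowerCase()) !== -1;
}

function deriveViewerStats(p: ParticipantsReport): ViewerStats {
  const totalHosts = p.publishers.unique;
  const totalViewers = p.unique - totalHosts;

  // Host-only stream: every stat is zero.
  if (totalViewers <= 0) {
    return {
      totalViewers: 0,
      maximumConcurrentViewers: 0,
      averageViewingTimeSeconds: 0,
      ghosts: 0,
      mobilePercentage: 0,
      webPercentage: 0,
    };
  }

  const subs = p.subscribers;
  // Assumption: fall back to 0 when there are no subscription events.
  const averageViewingTimeSeconds =
    subs.total > 0
      ? Math.round(subs.total_subscribed_duration_seconds / subs.total)
      : 0;

  // Sum `unique` per OS entry into mobile and web (everything non-mobile).
  let mobile = 0;
  let web = 0;
  for (const os of p.by_operating_system) {
    if (isMobile(os.name)) {
      mobile += os.unique;
    } else {
      web += os.unique;
    }
  }
  const devices = mobile + web;

  return {
    totalViewers,
    maximumConcurrentViewers: p.max_concurrent - totalHosts,
    averageViewingTimeSeconds,
    ghosts: Math.max(0, subs.unique - totalViewers),
    // Both percentages are 0 when no OS entries exist (avoids NaN).
    mobilePercentage: devices > 0 ? Math.round((mobile / devices) * 100) : 0,
    webPercentage: devices > 0 ? Math.round((web / devices) * 100) : 0,
  };
}
```

With made-up input of `unique = 4`, `publishers.unique = 1`, `max_concurrent = 3`, `subscribers.unique = 5`, and OS entries `Android: 1`, `Mac OS: 2`, `linux: 1`, the sketch yields 3 viewers, a peak of 2, a ghost offset of 2, and a 25/75 mobile/web split.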

---

## Stream.io Report Shape (relevant fields)

```
report.participants
├── unique                      # distinct users, hosts included (ghost-free)
├── max_concurrent              # peak simultaneous participants (ghost-free)
├── publishers
│   ├── total                   # total publish events (inflated by OBS reconnects)
│   └── unique                  # distinct publisher identities (safe to use)
├── subscribers
│   ├── total                   # total subscription events (inflated by ghosts)
│   ├── unique                  # distinct subscriber identities (includes ghosts)
│   └── total_subscribed_duration_seconds
├── by_operating_system[]       # { name, unique }, mixed case
└── count_over_time
    └── by_minute[]             # { start_ts (nanoseconds), max }, sparse, changes only
```

---

## Empirical Test Battery

The unit tests in `test/modules/livestream/adapters/streamIOProviderAdapter.test.ts` are driven by typed fixtures in `test/modules/livestream/fixtures/streamIOReport.ts`, extracted from real Stream.io reports captured during manual testing:

| Fixture | Scenario |
|---------|----------|
| `webcamOneViewer` | 1 viewer, webcam stream |
| `webcamOneViewerRejoin` | Same viewer rejoined; Stream dedupes by `user_id` |
| `webcamThreeConcurrentViewers` | 3 simultaneous viewers at peak |
| `webcamThreeSequentialViewers` | 3 sequential viewers; ±1 sub-minute artifact in peak bucket |
| `webcamZeroViewers` | Host only; ghost inflation visible in `subscribers.*` |
| `webcamOneUserTwoDevices` | Same user on 2 devices: `unique` = 1, `max_concurrent` = 2 |
| `obsZeroViewers` | OBS host only: 2 publishers (RTMP + WebRTC control) |
| `obsOneViewer` | OBS + 1 viewer; ghost variance clamps timeline bucket to 0 |
| `obsReconnectOneViewer` | OBS host reconnected multiple times: `publishers.total` = 5, `publishers.unique` = 2 |

---

## Known Limitations

- **Timeline granularity**: 5-minute buckets derived from 1-minute Stream samples. Join/leave events within the same minute can cause ±1 overcounting in the peak bucket.
- **Ghost OS bias**: system participants (`stream-react`, `stream-go`) show up in `by_operating_system` as Linux/macOS entries and are counted as web viewers, slightly inflating `webPercentage`.
- **Average viewing time includes ghosts**: `averageViewingTimeSeconds` uses `subscribers.*`, which includes ghost duration. No alternative source exists in the Stream report.
- **OBS timeline undercount**: when ghost variance exceeds the viewer count in a time bucket, the bucket clamps to 0 (observable in the `obsOneViewer` fixture).
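
The timeline reconstruction, and the clamping behind the last limitation, can be illustrated with a short TypeScript sketch. It is not the real `getViewersTimeline`: the `sketchViewersTimeline` name, the `MinuteSample` type, and the plain `number[]` return value are hypothetical, and the sketch assumes `start_ts` has already been converted from nanoseconds to milliseconds (`startMs`).

```typescript
// Hypothetical sample shape: `startMs` is `start_ts` converted to milliseconds.
interface MinuteSample {
  startMs: number;
  max: number;
}

const BUCKET_MS = 5 * 60 * 1000; // fixed 5-minute buckets

function sketchViewersTimeline(samples: MinuteSample[], ghosts: number): number[] {
  const sorted = samples.slice().sort((a, b) => a.startMs - b.startMs);
  const first = sorted[0];
  if (first === undefined) {
    return [];
  }

  // Step 1: group samples into 5-minute windows anchored at the first sample,
  // keeping the raw max seen in each window.
  const rawMax: (number | undefined)[] = [];
  for (const sample of sorted) {
    const index = Math.floor((sample.startMs - first.startMs) / BUCKET_MS);
    const current = rawMax[index];
    rawMax[index] =
      current === undefined ? sample.max : Math.max(current, sample.max);
  }

  // Steps 2 and 3: repeat the last known bucket across gaps, then subtract
  // the ghost offset and clamp to 0.
  const timeline: number[] = [];
  let lastKnown = 0;
  for (let i = 0; i < rawMax.length; i++) {
    const raw = rawMax[i];
    if (raw !== undefined) {
      lastKnown = raw;
    }
    timeline.push(Math.max(0, lastKnown - ghosts));
  }
  return timeline;
}
```

With made-up samples at minutes 0, 2, and 12 carrying `max` values 3, 5, and 4, a ghost offset of 2 gives three buckets `[3, 3, 2]`: the second bucket has no samples and repeats the first. Raising the offset to 6 clamps every bucket to 0, which is the undercount described in the last limitation.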