# Ingestion and processing

> See how provider payloads become harmonized records, scores, and AI outputs.

`health-api` is built around one stable internal model. Inputs from many sources (provider clouds, mobile SDKs, customer-supplied data) are normalized into a small set of harmonized records, which in turn power daily records, scores, facts, and AI summaries. Your integration reads against that stable model, not against per-provider formats.

## Two integration paths

Data enters the platform through one of two paths, and both converge on the same downstream model:

| Path            | How data arrives                                        | Typical source                                            |
| --------------- | ------------------------------------------------------- | --------------------------------------------------------- |
| OAuth providers | Provider cloud → ONVY ingestion webhook → harmonization | `oura`, `fitbit`, `garmin`, `withings`, `strava`, `polar` |
| SDK providers   | Mobile app → upload endpoint → harmonization            | `healthkit`, `health_connect`, `samsung_health`           |

Whichever path the data takes, it lands in the same harmonized resources you read through `daily_records`, `activities`, `facts`, `meals`, and `ai_summaries`.

## Raw, harmonized, merged

Three layers describe how provider input becomes the data your integration reads:

* `raw`: provider- or SDK-specific input as it enters the platform
* `harmonized`: the same input mapped into ONVY's common schema, primarily around `daily`, `activity`, and `sleep`
* `merged`: overlapping inputs combined into the higher-level user view that powers records, scores, facts, and downstream delivery

Customer integrations should target the harmonized and merged layers. The raw layer is internal.

## High-level flow

1. ONVY receives provider or application data.
2. Raw payloads are stored and queued for processing.
3. Calculation workers harmonize data into stable record families.
4. Downstream services generate summaries, scores, and webhook events.

This processing is asynchronous. Your integration can write data, then read the resulting harmonized resources once processing completes.
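Because processing is asynchronous, a read issued right after a write may need to be retried. The following is a minimal sketch of polling with exponential backoff. `wait_for_resource` and its parameters are hypothetical names, and `fetch` stands in for whatever client call your integration uses to read a harmonized resource (assumed here to return `None` until the resource exists).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_resource(
    fetch: Callable[[], Optional[T]],
    timeout_s: float = 60.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Poll `fetch` with exponential backoff.

    Returns the first non-None result, or None once `timeout_s` has elapsed.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        result = fetch()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

Where webhook events are available (step 4 above), reacting to them is likely a better trigger than polling; a loop like this is mainly useful as a fallback.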

## Source provenance

Every harmonized record carries source information so you can tell where a data point came from:

* `provider`, for example `GARMIN`, `APPLE`, `OURA`
* `app_id`, for example `com.apple.Health`

For SDK-based integrations such as HealthKit, multiple apps can write to the same on-device store. Provenance is preserved per record.
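As an illustration of using provenance, the sketch below groups records by their source. It assumes each record is a dictionary with top-level `provider` and `app_id` keys; the actual response layout may nest these differently. `group_by_source` and the app ID `com.example.runtracker` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping, Tuple

SourceKey = Tuple[str, str]  # (provider, app_id)


def group_by_source(
    records: Iterable[Mapping[str, Any]],
) -> Dict[SourceKey, List[Mapping[str, Any]]]:
    """Group records by (provider, app_id) so each writing app can be inspected separately."""
    grouped: Dict[SourceKey, List[Mapping[str, Any]]] = defaultdict(list)
    for record in records:
        # Assumption: provenance fields sit at the top level of each record.
        grouped[(record["provider"], record["app_id"])].append(record)
    return dict(grouped)


# Two apps writing to the same HealthKit store, plus one OAuth provider.
sample_records = [
    {"id": "daily-2024-02-04-APPLE-com.apple.Health",
     "provider": "APPLE", "app_id": "com.apple.Health"},
    {"id": "daily-2024-02-04-APPLE-com.example.runtracker",
     "provider": "APPLE", "app_id": "com.example.runtracker"},
    {"id": "sleep-2024-02-03T22:30:00+00:00-OURA-oura",
     "provider": "OURA", "app_id": "oura"},
]

by_source = group_by_source(sample_records)
```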

## Deterministic record IDs

Harmonized records use deterministic IDs so the same logical event keeps the same identity even if the same payload is delivered more than once.

* `daily` records use the user's local calendar day. Example: `daily-2024-02-04-APPLE-com.apple.Health`
* `activity` and `sleep` records use a normalized start timestamp. Example: `activity-2024-02-04T06:00:00+00:00-GARMIN-garmin`, `sleep-2024-02-03T22:30:00+00:00-OURA-oura`

Use these IDs as the basis for idempotency on your side.
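One way to apply this on the client side is an upsert keyed by record ID, sketched below. `RecordStore` is a hypothetical in-memory stand-in for your own database, the `id` key is an assumed field name for the deterministic ID, and `value` is a placeholder field used only for illustration.

```python
from typing import Any, Dict, Mapping


class RecordStore:
    """In-memory store that treats the deterministic record ID as the primary key."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Dict[str, Any]] = {}

    def upsert(self, record: Mapping[str, Any]) -> bool:
        """Insert or overwrite by ID. Returns True only if the ID was not seen before."""
        record_id = record["id"]  # assumption: the ID is exposed under "id"
        is_new = record_id not in self._by_id
        self._by_id[record_id] = dict(record)
        return is_new

    def get(self, record_id: str) -> Dict[str, Any]:
        return self._by_id[record_id]

    def __len__(self) -> int:
        return len(self._by_id)


def record_kind(record_id: str) -> str:
    """Return the leading segment of an ID, e.g. 'daily', 'activity', or 'sleep'."""
    return record_id.split("-", 1)[0]


store = RecordStore()
first = {"id": "activity-2024-02-04T06:00:00+00:00-GARMIN-garmin", "value": 1}
redelivered = {"id": "activity-2024-02-04T06:00:00+00:00-GARMIN-garmin", "value": 2}

store.upsert(first)        # new ID: stored
store.upsert(redelivered)  # same ID: overwrites instead of duplicating
```

Overwriting on a repeated ID keeps the latest version of a record while guaranteeing that a redelivered payload never creates a second row.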

## Terra ingestion pattern

The Terra pipeline is the main example of delayed ingestion:

1. A Terra webhook writes raw data and schedules work in DynamoDB.
2. A delayed SQS message gives related payloads time to accumulate.
3. The calculation Lambda processes scheduled items in batches by user.
4. Harmonized outputs become available through routes such as `daily_records`, `activities`, and `baselines`.

That short delay reduces duplicate work when several provider payloads arrive back-to-back, while still producing a single consistent set of downstream records for your app to consume.
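The effect of the delay can be illustrated with a small in-memory sketch. It is an analogy only: it uses no DynamoDB, SQS, or Lambda, `DelayedBatcher` and its methods are hypothetical names, and the 30-second delay is an assumed value rather than ONVY's actual setting.

```python
from typing import Dict, List


class DelayedBatcher:
    """Collect payloads per user and release them as one batch after a fixed delay."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s
        self._pending: Dict[str, List[dict]] = {}
        self._due_at: Dict[str, float] = {}

    def schedule(self, user_id: str, payload: dict, now: float) -> None:
        """Queue a payload. Only the first payload in a window sets the due time."""
        self._pending.setdefault(user_id, []).append(payload)
        self._due_at.setdefault(user_id, now + self.delay_s)

    def drain_due(self, now: float) -> Dict[str, List[dict]]:
        """Return and clear the batches for every user whose delay has elapsed."""
        due_users = [user for user, due in self._due_at.items() if due <= now]
        batches: Dict[str, List[dict]] = {}
        for user in due_users:
            batches[user] = self._pending.pop(user)
            del self._due_at[user]
        return batches


batcher = DelayedBatcher(delay_s=30.0)  # assumed delay, for illustration only
# Three payloads for the same user arrive back-to-back.
batcher.schedule("user-a", {"kind": "daily"}, now=0.0)
batcher.schedule("user-a", {"kind": "sleep"}, now=1.0)
batcher.schedule("user-a", {"kind": "activity"}, now=2.0)

too_early = batcher.drain_due(now=10.0)  # nothing is due yet
one_batch = batcher.drain_due(now=30.0)  # all three payloads, processed once
```

Without the delay, each of the three payloads would trigger its own processing run for the same user; with it, they are handled together.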

## What processing produces

* `daily_records` for user-facing scores, zones, and logs
* `facts` for durable AI personalization context
* `activities` and `meals` for structured event history
* `ai_summaries` covering meal nutrition analysis, sleep insights, workouts, and daily, weekly, nutrition, trend, and impact summaries

See `/ai-capabilities` for the full summary type catalog and how each summary is generated.

## Sync state and historical loads

Historical sync is a first-class workflow: provider, sync type, and sync status are tracked under each user, and every change emits a `users.data_syncs:updated` webhook. ONVY, not your integration, handles provider-specific constraints such as a single deep-history request per user for some providers.
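A client might mirror sync state locally by applying each `users.data_syncs:updated` event as it arrives. The sketch below is built on assumptions: the event keys `type` and `data`, the payload fields `user_id`, `provider`, `sync_type`, and `status`, and the sample values are all hypothetical. Consult `/webhooks` for the real payload shape.

```python
from typing import Any, Dict, Mapping, Tuple

SYNC_EVENT = "users.data_syncs:updated"

SyncKey = Tuple[str, str, str]  # (user_id, provider, sync_type)


def apply_sync_event(state: Dict[SyncKey, str], event: Mapping[str, Any]) -> bool:
    """Record the latest status for a sync. Returns False for unrelated event types."""
    if event.get("type") != SYNC_EVENT:
        return False
    data = event["data"]  # assumption: event body is nested under "data"
    key = (data["user_id"], data["provider"], data["sync_type"])
    state[key] = data["status"]  # last write wins
    return True


sync_state: Dict[SyncKey, str] = {}
apply_sync_event(sync_state, {
    "type": SYNC_EVENT,
    # All field names and values below are illustrative placeholders.
    "data": {"user_id": "u1", "provider": "garmin",
             "sync_type": "historical", "status": "in_progress"},
})
apply_sync_event(sync_state, {
    "type": SYNC_EVENT,
    "data": {"user_id": "u1", "provider": "garmin",
             "sync_type": "historical", "status": "completed"},
})
```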

## Schema versioning

Schema evolution within `v1` is additive. Breaking changes require a new major version. Track changes through release notes and adjust client parsing accordingly.
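Additive evolution means new fields can appear in responses at any time, so client parsers should ignore fields they do not recognize. The following is a minimal sketch of such a tolerant reader. `SourceInfo` is a hypothetical client-side type modelling only the provenance fields named earlier, and `confidence` is an invented example of a field added later.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class SourceInfo:
    provider: str
    app_id: Optional[str] = None

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "SourceInfo":
        """Build from a payload, silently dropping keys this client does not know."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})


# A payload carrying a field this client version has never seen.
source = SourceInfo.from_payload(
    {"provider": "OURA", "app_id": "oura", "confidence": 0.9}
)
```

A strict parser that rejects unknown keys would break on the first additive change, which is exactly the kind of change `v1` permits.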

## AI surfaces in the pipeline

AI features consume the same underlying user context:

* Chat completions enrich prompts with ONVY context unless the request opts out of selected sections
* AI summaries persist generated results so you can list, fetch, and audit them later

<Info>
  Internal EventBridge events use a flat `detail.data` shape. The batched `events[]` envelope described in `/webhooks` is only for external webhook delivery.
</Info>
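On the receiving side, an external webhook body therefore holds a list of events rather than a single one. The sketch below routes each item in `events[]` to a handler. Only the `events` key comes from this page; the per-event `type` key, the `some.other:event` name, and the `dispatch_batch` function are assumptions, so check `/webhooks` for the actual item shape.

```python
from typing import Any, Callable, Dict, List, Mapping, Optional

Handler = Callable[[Mapping[str, Any]], None]


def dispatch_batch(
    body: Mapping[str, Any], handlers: Dict[str, Handler]
) -> List[Optional[str]]:
    """Call the matching handler for each event; return the types left unhandled."""
    unhandled: List[Optional[str]] = []
    for event in body.get("events", []):
        event_type = event.get("type")  # assumption: each item names its type here
        handler = handlers.get(event_type)
        if handler is None:
            unhandled.append(event_type)
            continue
        handler(event)
    return unhandled


seen: List[str] = []
leftover = dispatch_batch(
    {"events": [
        {"type": "users.data_syncs:updated"},
        {"type": "some.other:event"},  # hypothetical type with no handler
    ]},
    {"users.data_syncs:updated": lambda e: seen.append(e["type"])},
)
```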
