docs/adr/ADR-0012-label-view.md
Lines changed: 101 additions & 8 deletions
@@ -36,7 +36,7 @@ Today, Feast treats all data as immutable, append-only feature data. This works
 | **LabelView** | A new first-class Feast primitive (subclass of `BaseFeatureView`) that manages mutable labels keyed by entities. Stored in its own registry table/proto section. |
 | **ConflictPolicy** | An enum (`LAST_WRITE_WINS`, `LABELER_PRIORITY`, `MAJORITY_VOTE`) controlling how conflicting labels from different labelers are resolved. Enforced for offline store reads; online store uses `LAST_WRITE_WINS`. |
 | **labeler_field** | A designated schema field (default: `"labeler"`) that identifies which source wrote each label. Enables multi-labeler provenance tracking. |
-| **retain_history** | A boolean flag indicating whether full write history should be kept per entity key. Inherent to the offline store (all writes are appended); online store keeps latest only. |
+
 | **reference_feature_view** | Optional link to the `FeatureView` whose entities this label view annotates, for documentation and lineage. |
 | **PushSource integration** | Labels are ingested via `FeatureStore.push()` through a `PushSource`, writing to both online and offline stores in real time. |
@@ -66,7 +66,7 @@ BaseFeatureView (abstract)
 └── LabelView ← new
 ```

-LabelView inherits from `BaseFeatureView`, gaining the standard name, features, projection, `proto_class`, versioning (`version`, `current_version_number`), and schema infrastructure. It adds label-specific fields: `labeler_field`, `conflict_policy`, `retain_history`, and `reference_feature_view`.
+LabelView inherits from `BaseFeatureView`, gaining the standard name, features, projection, `proto_class`, versioning (`version`, `current_version_number`), and schema infrastructure. It adds label-specific fields: `labeler_field`, `conflict_policy`, `reference_feature_view`, and annotation profile metadata (via `tags`).

 ## Protobuf Schema
@@ -98,7 +98,7 @@ message LabelViewSpec {
   repeated FeatureSpecV2 entity_columns = 11;
   string labeler_field = 12;
   ConflictResolutionPolicy conflict_policy = 13;
-  bool retain_history = 14;
+  reserved 14;  // was retain_history (removed: offline store always retains history)
+LabelView supports **annotation profiles**: metadata that tells the Feast UI *how* labels should be created and edited. Profiles are declared in the existing `tags` dict using the `feast.io/` namespace, requiring no schema or proto changes.
+
+## Design Rationale
+
+Different labeling tasks require fundamentally different UX:
+
+| Task | Interaction | Example |
+|------|-------------|---------|
+| RAG retrieval evaluation | Highlight text spans in a document | Mark chunk relevance for retrieval QA |
+| RLHF reward labeling | Fill structured form per entity | Rate response quality, flag safety issues |
+| Bulk correction | Edit cells in a table | Fix automated labeler mistakes |
+| Active learning | Label model-selected high-value items | Annotate uncertain predictions first |
+
+Rather than building a single generic table, the UI reads annotation metadata from tags and selects the appropriate annotation component.
+feast.io/label-widget:<field_name> → widget type (enum | binary | text | number)
+```
+
+These are parsed by `LabelView.annotation_config` (a Python property) and served via the `/annotation-config/{name}` REST endpoint. The UI calls this endpoint to configure the Annotate tab dynamically.
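To make the tag convention concrete, here is a minimal sketch of how widget tags could be turned into a per-field mapping. It is not the actual `LabelView.annotation_config` implementation, whose return shape is not shown in this diff. The function name `parse_label_widgets` and the validation behavior are assumptions for illustration; only the `feast.io/label-widget:<field_name>` key format and the four widget types come from the text above.

```python
from typing import Dict

# Key prefix and widget types as described in the ADR text.
WIDGET_PREFIX = "feast.io/label-widget:"
ALLOWED_WIDGETS = {"enum", "binary", "text", "number"}


def parse_label_widgets(tags: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: map each label field to its widget type.

    Tags that do not use the widget prefix are ignored.
    """
    widgets: Dict[str, str] = {}
    for key, value in tags.items():
        if not key.startswith(WIDGET_PREFIX):
            continue
        field_name = key[len(WIDGET_PREFIX):]
        if not field_name:
            raise ValueError(f"missing field name in tag {key!r}")
        if value not in ALLOWED_WIDGETS:
            raise ValueError(f"unknown widget type {value!r} for field {field_name!r}")
        widgets[field_name] = value
    return widgets


tags = {
    "feast.io/label-widget:relevance": "enum",
    "feast.io/label-widget:is_safe": "binary",
    "team": "ml-platform",  # unrelated tag, left untouched
}
widget_by_field = parse_label_widgets(tags)
```

Rejecting unknown widget types up front is one possible design choice; a UI could instead fall back to a plain text widget.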
 # Why a Separate Primitive Instead of Extending FeatureView?

 A natural question is: **why introduce a new type rather than adding optional label fields to `FeatureView`?**

-Structurally, a LabelView today is a schema + entities + PushSource, similar to a `FeatureView` backed by a `PushSource`. The runtime code paths (push, online read, historical join) are identical. One could argue that `labeler_field`, `conflict_policy`, and `retain_history` could be optional fields on `FeatureView` instead of a new type.
+Structurally, a LabelView today is a schema + entities + PushSource, similar to a `FeatureView` backed by a `PushSource`. The runtime code paths (push, online read, historical join) are identical. One could argue that `labeler_field` and `conflict_policy` could be optional fields on `FeatureView` instead of a new type.

 We chose a separate primitive for the following reasons:
@@ -355,12 +448,12 @@ The design follows the principle that **it is easier to merge two types later th

 ---

-# Alpha Limitations & Future Work
+# Limitations & Future Work

 | Limitation | Current Behavior | Future Direction |
 |---|---|---|
 | Conflict policy enforcement | `conflict_policy` is enforced for **offline store reads** (training data, UI, batch pipelines). Online store uses `LAST_WRITE_WINS`. | Optional online store enforcement for SQL-capable backends. |
-| History retention | `retain_history` is inherent to the offline store: all writes are appended. Online store keeps only the latest value per entity. | Optional online store multi-row retention for SQL-capable backends. |
+| History retention | The offline store always retains full write history (all writes are appended). Online store keeps only the latest value per entity. | Optional online store multi-row retention for SQL-capable backends. |
 | Labeler priority configuration | `LABELER_PRIORITY` accepts a `labeler_priorities` list via the conflict resolver. Not yet persisted in proto. | Add a `labeler_priorities` field to `LabelViewSpec`. |
 | Batch materialization | `batch_source` is implemented. LabelViews backed by a direct `DataSource` support `feast materialize`. LabelViews with only a `PushSource` (no `batch_source`) remain push-only. | N/A (supported in this release). |
 | Cross-version label joins | No special handling for joining labels across versions in historical retrieval. | Version-aware label joins for reproducible training. |
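The three conflict policies named in this ADR can be illustrated with a small sketch of offline resolution for a single entity key. This is not Feast's conflict resolver. The names `LabelWrite` and `resolve_labels`, the enum string values, the one-vote-per-labeler rule, and the tie-breaking toward the most recent write are all assumptions made for the example; only the policy names and the `labeler_priorities` list come from the text.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Dict, Optional, Sequence


class ConflictPolicy(Enum):
    # Member names follow the ADR; the string values are placeholders.
    LAST_WRITE_WINS = "last_write_wins"
    LABELER_PRIORITY = "labeler_priority"
    MAJORITY_VOTE = "majority_vote"


@dataclass(frozen=True)
class LabelWrite:
    """Hypothetical record of one appended label write for an entity key."""
    value: str
    labeler: str
    event_timestamp: datetime


def resolve_labels(
    writes: Sequence[LabelWrite],
    policy: ConflictPolicy,
    labeler_priorities: Optional[Sequence[str]] = None,
) -> str:
    if not writes:
        raise ValueError("no label writes to resolve")
    latest_first = sorted(writes, key=lambda w: w.event_timestamp, reverse=True)
    # Keep only each labeler's most recent write (assumption: one vote per labeler).
    latest_per_labeler: Dict[str, LabelWrite] = {}
    for write in latest_first:
        latest_per_labeler.setdefault(write.labeler, write)

    if policy is ConflictPolicy.LAST_WRITE_WINS:
        return latest_first[0].value
    if policy is ConflictPolicy.LABELER_PRIORITY:
        for labeler in labeler_priorities or ():
            if labeler in latest_per_labeler:
                return latest_per_labeler[labeler].value
        # Assumption: fall back to the latest write if no listed labeler wrote.
        return latest_first[0].value
    if policy is ConflictPolicy.MAJORITY_VOTE:
        counts = Counter(w.value for w in latest_per_labeler.values())
        top = max(counts.values())
        # Ties go to the most recently written value among the leaders.
        for write in latest_per_labeler.values():
            if counts[write.value] == top:
                return write.value
    raise ValueError(f"unsupported policy: {policy}")


writes = [
    LabelWrite("spam", "alice", datetime(2024, 1, 1, 10)),
    LabelWrite("spam", "carol", datetime(2024, 1, 1, 11)),
    LabelWrite("ham", "bob", datetime(2024, 1, 1, 12)),
]
```

With these writes, the latest write and the majority disagree, which is the situation where the choice of policy changes the training label.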
@@ -372,7 +465,7 @@ The design follows the principle that **it is easier to merge two types later th

 1. **Should conflict policy enforcement extend to the online store?** Currently enforced only for offline reads (training-first design). SQL-capable online stores could implement `MAJORITY_VOTE` natively; Redis would need application-level resolution. Most labeling use cases only need offline enforcement.

-2. **Should retain_history have a configurable retention window?** The offline store currently keeps unbounded history. A `max_history_entries` or `history_ttl` config could bound storage while preserving auditability.
+2. **Should history have a configurable retention window?** The offline store currently keeps unbounded history. A `max_history_entries` or `history_ttl` config could bound storage while preserving auditability.

 3. **Should FeatureService distinguish features from labels?** Today, FeatureService treats LabelViews and FeatureViews uniformly. A future enhancement could tag which projections are "labels" for downstream frameworks that need this distinction (e.g., auto-splitting X/y in training).
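As an illustration of open question 2, the sketch below shows how a retention window might bound one entity's write history. Neither option exists in Feast today. The option names `max_history_entries` and `history_ttl` come from the question above; the function `bound_history`, the `(timestamp, value)` tuple shape, and the rule of applying the TTL before the entry cap are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

HistoryEntry = Tuple[datetime, str]  # (event timestamp, label value), hypothetical shape


def bound_history(
    entries: List[HistoryEntry],
    now: datetime,
    max_history_entries: Optional[int] = None,
    history_ttl: Optional[timedelta] = None,
) -> List[HistoryEntry]:
    """Return the retained history for one entity key, newest first."""
    if max_history_entries is not None and max_history_entries < 0:
        raise ValueError("max_history_entries must be non-negative")
    kept = sorted(entries, key=lambda entry: entry[0], reverse=True)
    if history_ttl is not None:
        cutoff = now - history_ttl
        kept = [entry for entry in kept if entry[0] >= cutoff]  # drop expired writes
    if max_history_entries is not None:
        kept = kept[:max_history_entries]  # keep only the newest N writes
    return kept


now = datetime(2024, 1, 10)
history = [(datetime(2024, 1, day), f"v{day}") for day in (1, 5, 8, 9)]
recent = bound_history(history, now, history_ttl=timedelta(days=3))
```

Leaving both options unset returns the full history, which matches the unbounded behavior described for the offline store.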