iceberg_partitions_table: last_updated_at uses manifest-level snapshot, last_updated_snapshot_id column is missing #73308

### Steps to reproduce the behavior (Required)

Against any external Iceberg catalog with a partitioned table that has undergone `rewrite_manifests` / `compactManifests` (any normal housekeeping flow):

```sql
-- Iceberg table with partitions p_a, p_b. Files of p_a were added in snapshot S1,
-- files of p_b in S2. Later, a maintenance snapshot S_M rewrites both manifests
-- (no data change).

SELECT partition_value, last_updated_at, last_updated_snapshot_id
FROM iceberg_cat.db.tbl$iceberg_partitions_table;
```

### Expected behavior (Required)

Per Iceberg's `PARTITIONS` metadata schema, these columns should be _"commit time / id of snapshot that last updated this partition"_ — i.e. the snapshot in which a data or delete file was added/removed for that partition. The same query against Spark/Flink returns:

```
p_a | <ts of S1> | <S1>
p_b | <ts of S2> | <S2>
```

`last_updated_snapshot_id` has existed in Apache Iceberg's `PARTITIONS` metadata table since Iceberg 1.4.0 (apache/iceberg#7581). StarRocks runs Iceberg 1.10.0, so the column should be exposed in `iceberg_partitions_table`.

### Real behavior (Required)

Two bugs:

1. **`last_updated_at` granularity is wrong.** `IcebergPartitionsTableScanner` resolves `last_updated_at` from `ManifestFile.snapshotId()` (the snapshot that wrote the manifest), not from `ManifestEntry.snapshotId()` (the snapshot that added the file). After a manifest rewrite, every partition that lived in a rewritten manifest reports the maintenance snapshot's timestamp:
   ```
   p_a | <ts of S_M> | (missing column)
   p_b | <ts of S_M> | (missing column)
   ```
   Information about when each partition's data actually last changed is lost.

2. **`last_updated_snapshot_id` is missing entirely** from the `iceberg_partitions_table` schema, even though:
   - the Iceberg `PARTITIONS` metadata table has exposed it since Iceberg 1.4.0;
   - `IcebergPartitionsTableScanner` already reads via `ManifestReader`, so the per-entry snapshot id is available.

   A user writing portable queries against `$iceberg_partitions_table` cannot get the same answer they would get from Spark/Flink.

The root cause for both is that `IcebergPartitionsTableScanner` iterates `ContentFile` rather than `ManifestEntry`. Iceberg's own `org.apache.iceberg.PartitionsTable.update()` iterates entries and resolves the snapshot per-entry, so manifest rewrites do not lose history.
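The difference between the two resolution strategies can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Entry`, `Manifest`, and `resolve` are hypothetical stand-ins invented for this example, not StarRocks or Iceberg classes, and the snapshot ids and commit times are made up. It assumes, as described above, that a rewritten manifest carries the maintenance snapshot's id while each entry keeps the id of the snapshot that added its file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionLastUpdatedSketch {
    // Hypothetical stand-in for a manifest entry: its partition and the
    // snapshot that added the file.
    record Entry(String partition, long snapshotId) {}

    // Hypothetical stand-in for a manifest: the snapshot that wrote the
    // manifest, plus the entries it contains.
    record Manifest(long snapshotId, List<Entry> entries) {}

    // Returns partition -> id of the snapshot that "last updated" it.
    // perEntry=false models the manifest-level lookup described in this issue;
    // perEntry=true models the per-entry lookup.
    static Map<String, Long> resolve(List<Manifest> manifests,
                                     Map<Long, Long> commitMillis,
                                     boolean perEntry) {
        Map<String, Long> lastUpdated = new LinkedHashMap<>();
        for (Manifest manifest : manifests) {
            for (Entry entry : manifest.entries()) {
                long candidate = perEntry ? entry.snapshotId() : manifest.snapshotId();
                // Keep whichever snapshot has the later commit time.
                lastUpdated.merge(entry.partition(), candidate,
                        (old, cur) -> commitMillis.get(cur) > commitMillis.get(old) ? cur : old);
            }
        }
        return lastUpdated;
    }

    public static void main(String[] args) {
        long s1 = 1L, s2 = 2L, sM = 3L; // made-up ids for S1, S2, S_M
        Map<Long, Long> commitMillis = Map.of(s1, 1000L, s2, 2000L, sM, 3000L);

        // State after the maintenance snapshot S_M: one rewritten manifest
        // holds both partitions' entries.
        List<Manifest> manifests = List.of(
                new Manifest(sM, List.of(new Entry("p_a", s1), new Entry("p_b", s2))));

        System.out.println(resolve(manifests, commitMillis, false)); // {p_a=3, p_b=3}
        System.out.println(resolve(manifests, commitMillis, true));  // {p_a=1, p_b=2}
    }
}
```

With the manifest-level lookup both partitions collapse onto S_M, which matches the reported output; with the per-entry lookup each partition keeps the snapshot that actually added its files.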

### StarRocks version (Required)

Reproducible on current `main` (verified against `b91aafbcc5f`). Iceberg dependency: 1.10.0 (`fe/pom.xml:82`, `java-extensions/pom.xml:38`).

---

Fix proposed in #73307.
