Commit 42e1abc

Copilot and hzxa21 authored
docs: enforce merge-on-read for append-only Iceberg sinks (#929)
* Initial plan

* docs: document append-only Iceberg sinks require merge-on-read write mode

Co-authored-by: hzxa21 <5518566+hzxa21@users.noreply.github.com>

* docs: address code review feedback on write mode docs

Co-authored-by: hzxa21 <5518566+hzxa21@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: hzxa21 <5518566+hzxa21@users.noreply.github.com>
1 parent 52a801b commit 42e1abc

2 files changed: +14 -2 lines changed

iceberg/deliver-to-iceberg.mdx

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ WITH (
 | `database.name` | Yes | The name of the target Iceberg database. |
 | `table.name`| Yes| The name of the target Iceberg table.|
 | `primary_key`| Yes, if `type` is `upsert` | A comma-separated list of columns that form the primary key. |
+| `write_mode` | No | Write mode for the sink. Options: `'merge-on-read'` (default) or `'copy-on-write'`. **Important:** Copy-on-write is only supported for upsert sinks. Append-only sinks must use merge-on-read. See [Write modes](/iceberg/write-modes) for details. |
 | `force_append_only`| No | If `true`, converts an `upsert` stream to `append-only`. Updates become inserts and deletes are ignored. Default: `false`. |
 | `is_exactly_once`| No | Set to `true` to enable exactly-once delivery semantics. This provides stronger consistency but may impact performance. Default: `true`. <br/><br/> Exactly-once delivery requires [sink decoupling](/delivery/overview#sink-decoupling) to be enabled (the default behavior). If you `SET sink_decouple = false;`, exactly-once semantics will be automatically disabled for the sink.|
 | `commit_checkpoint_interval`| No | Controls how often RisingWave commits to Iceberg. Default: `60` (about every 60 seconds in the default configuration). |
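
The constraint documented in the new `write_mode` row can be sketched as a small check. This is an illustration only, not RisingWave's implementation: the function name `validate_write_mode` and the error messages are hypothetical, and only the option values (`merge-on-read`, `copy-on-write`, `append-only`, `upsert`) come from the table above.

```python
from typing import Optional

# Option values taken from the parameter table; everything else is hypothetical.
VALID_WRITE_MODES = {"merge-on-read", "copy-on-write"}
DEFAULT_WRITE_MODE = "merge-on-read"  # documented default


def validate_write_mode(sink_type: str, write_mode: Optional[str] = None) -> str:
    """Return the effective write mode, or raise ValueError if the combination is rejected."""
    # Fall back to the documented default when the option is omitted.
    mode = DEFAULT_WRITE_MODE if write_mode is None else write_mode
    if mode not in VALID_WRITE_MODES:
        raise ValueError(f"unknown write_mode: {mode!r}")
    # Copy-on-write is documented as supported for upsert sinks only.
    if sink_type == "append-only" and mode == "copy-on-write":
        raise ValueError("append-only sinks must use merge-on-read")
    return mode
```

Under these assumptions, an append-only sink that omits `write_mode` resolves to `merge-on-read`, while an append-only sink that requests `copy-on-write` is rejected.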

iceberg/write-modes.mdx

Lines changed: 13 additions & 2 deletions
@@ -13,7 +13,7 @@ RisingWave supports two write modes for Iceberg sinks and tables, allowing you t

 In merge-on-read mode, updates and deletes are written to separate delta files (delete files) instead of rewriting existing data files. When the data is queried, the engine merges the base data files with the delete files on the fly to produce the latest view.

-This is the default write mode in RisingWave.
+This is the default write mode in RisingWave and is **required** for append-only sinks and tables.

 ### How it works

@@ -59,6 +59,10 @@ WITH (

 ## Copy-on-write (CoW)

+<Warning>
+Copy-on-write mode is **only supported for upsert sinks and tables**. Append-only sinks must use merge-on-read mode. Attempting to create an append-only sink with `write_mode = 'copy-on-write'` will result in an error.
+</Warning>
+
 In copy-on-write mode, updates and deletes are handled by rewriting the data files that contain the affected rows. This ensures that every snapshot presents a clean, delete-free view of the data, optimizing read performance for external consumers.

 ### How it works
@@ -114,12 +118,18 @@ WITH (

 Choose the write mode that best fits your workload and query patterns.

+<Note>
+**For append-only workloads**: Merge-on-read is required as it is strictly better than copy-on-write for append-only data. Copy-on-write provides no benefit when there are no updates or deletes to eagerly compact, and has worse write performance.
+</Note>
+
 * **Use Merge-on-Read (MoR) if**:
   * Your primary concern is write performance and low ingestion latency.
   * Downstream query engines can efficiently process delete files.
   * Workloads are write-heavy with frequent updates or deletes.
+  * Your sink is append-only (required for append-only sinks).

 * **Use Copy-on-Write (CoW) if**:
+  * Your sink or table is upsert (not available for append-only).
   * Your primary concern is read performance.
   * Downstream consumers do not efficiently handle delete files.
   * You can tolerate higher write amplification and ingestion latency.
@@ -129,10 +139,11 @@ Choose the write mode that best fits your workload and query patterns.

 | Feature | Merge-on-Read (MoR) | Copy-on-Write (CoW) |
 |---|---|---|
+| **Supported for** | Append-only and upsert sinks/tables | Upsert sinks/tables only |
 | **Primary goal** | Optimize write performance | Optimize read performance |
 | **Write amplification** | Low (writes delta files) | High (data files are rewritten) |
 | **Read performance** | Slower (requires merging data and delete files) | Faster (no merge needed at read time) |
 | **Ingestion latency** | Lower (writes are faster) | Higher (due to compaction) |
 | **Storage overhead** | Higher (stores base and delta files) | Lower (no separate delete files) |
 | **Default mode** | Yes | No |
-| **Ideal for** | Write-heavy workloads, real-time ingestion | Read-heavy workloads, BI dashboards |
+| **Ideal for** | Write-heavy workloads, real-time ingestion, append-only data | Read-heavy upsert workloads, BI dashboards |
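
The selection guidance in `iceberg/write-modes.mdx` can be read as a short decision procedure. The sketch below is a deliberate simplification for illustration: `recommend_write_mode` and its three boolean inputs are hypothetical names, and real workloads may weigh factors (storage overhead, ingestion latency) that this function ignores.

```python
def recommend_write_mode(
    append_only: bool,
    read_performance_first: bool,
    consumers_handle_delete_files: bool,
) -> str:
    """Suggest a write mode following the documented guidance (simplified)."""
    # Append-only sinks and tables have no choice: merge-on-read is required.
    if append_only:
        return "merge-on-read"
    # For upsert workloads, prefer copy-on-write when reads dominate or when
    # downstream consumers cannot efficiently process delete files.
    if read_performance_first or not consumers_handle_delete_files:
        return "copy-on-write"
    # Otherwise keep the default, which favors write performance.
    return "merge-on-read"
```

For example, an upsert sink feeding a query engine that handles delete files poorly would get `copy-on-write`, while any append-only sink gets `merge-on-read` regardless of the other inputs.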
