Conversation
- Fix JSON-LD in FAQ pages: convert JSX {JSON.stringify()} to static JSON for correct Mintlify rendering
- Add RisingWave vs Materialize comparison page with FAQPage JSON-LD
- Add RisingWave vs ksqlDB comparison page with FAQPage JSON-LD
- Add RisingWave vs Kafka Streams comparison page with FAQPage JSON-LD
- Add all comparison pages to docs.json navigation
- Split key concepts into independent AEO-optimized pages:
  - What is a Streaming Database?
  - What is a Materialized View in RisingWave?
  - What is a Source in RisingWave?
  - What is a Sink in RisingWave?
  - What is CDC in RisingWave?
- Add concept pages to docs.json navigation
- Add cross-links from glossary to new concept pages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "See also" sections to 7 high-traffic pages:
  - get-started/intro.mdx
  - processing/overview.mdx
  - ingestion/overview.mdx
  - delivery/overview.mdx
  - cloud/intro.mdx
  - deploy/deployment-modes-overview.mdx
- Add HowTo JSON-LD schema markup to 3 tutorial pages:
  - get-started/quickstart.mdx
  - deploy/install-psql-without-postgresql.mdx
  - deploy/risingwave-docker-compose.mdx
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
yingjunwu
left a comment
Tech Writer Review: SEO Phase 2 — Comparison & Concept Pages
This PR adds 8 new pages (5 concept pages, 3 comparison pages), cross-links via "See also" sections, JSON-LD structured data, and fixes the {JSON.stringify()} issue from PR #1074. The new content is well-structured and fills an important SEO gap. Detailed findings below.
🔴 Issues to Fix
1. PR description is empty
The template fields (Summary, Related code PR, Related doc issue, Checklist) are all unfilled. Please complete the description for reviewability and future reference.
2. Materialize comparison — specific claims need verification against current docs
reference/risingwave-materialize-comparison.mdx makes several specific claims that may become outdated quickly:
- "Community Edition is free but capped at 24 GiB memory and 48 GiB disk" — Materialize iterates on pricing/packaging frequently. Verify these exact numbers against Materialize's current self-managed docs.
- "No native PostgreSQL or JDBC sink — only community workarounds via SUBSCRIBE" — Verify this is still true. If Materialize has added new sinks since this was written, the claim becomes misleading.
- "Materialize sinks: Kafka, S3, Iceberg (via S3 Tables)" — Verify completeness against Materialize's current sink list.
Since comparison pages are high-stakes (competitors may read them), factual errors here can damage credibility. Recommend adding a "Last updated: YYYY-MM" note to each comparison page so readers know the freshness.
3. ksqlDB comparison — deadlock claim needs citation
"the ksqlDB documentation warns of potential deadlocks under concurrent pull query load"
This is a specific claim about ksqlDB's documentation. Either:
- Link to the specific ksqlDB docs page that mentions this, or
- Soften to something like "pull queries have known limitations under concurrent load"
Unsubstantiated negative claims about competitors can backfire.
4. Kafka sink SQL example in what-is-sink.mdx — redundant parameters
```sql
CREATE SINK order_metrics_sink FROM order_metrics
WITH (
    connector = 'kafka',
    topic = 'order-metrics',
    properties.bootstrap.server = 'broker:9092',
    type = 'upsert', -- redundant with FORMAT below
    primary_key = 'product_id'
) FORMAT UPSERT ENCODE JSON;
```

Having both `type = 'upsert'` in the WITH clause AND `FORMAT UPSERT ENCODE JSON` is redundant. While the existing Kafka docs have 2 examples with both, the majority (10 out of 12) use only `FORMAT ... ENCODE ...`. For a concept page aimed at newcomers, pick one approach — recommend removing `type = 'upsert'` and keeping only `FORMAT UPSERT ENCODE JSON`, which is the modern/preferred pattern.
🟡 Suggestions (Non-blocking)
5. <head> tag for JSON-LD — verify in deploy preview
The fix from {JSON.stringify({...})} to raw JSON inside <head><script> is the right approach. Please verify in the Mintlify deploy preview that:
- The `<head>` tag at the bottom of MDX files actually injects into the page `<head>`
- The JSON-LD is valid (test with Google Rich Results Test)
This applies to all 8 new pages plus the 2 FAQ pages and 2 HowTo pages.
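As a supplementary local check before running Google's Rich Results Test, the JSON-LD payload can be sanity-checked with any JSON parser. This sketch uses Python's standard library with an illustrative FAQPage excerpt, not the actual page content:

```python
import json

# Illustrative FAQPage JSON-LD excerpt (not the real page payload).
jsonld = """
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is a sink in RisingWave?",
      "acceptedAnswer": {"@type": "Answer", "text": "A sink delivers processed data downstream."}
    }
  ]
}
"""

# json.loads raises an error on any malformed JSON, catching issues
# like trailing commas or misplaced braces before deploy.
data = json.loads(jsonld)
assert data["@type"] == "FAQPage"
assert all(q["@type"] == "Question" for q in data["mainEntity"])
print("JSON-LD parses and has FAQPage shape")
```

This only verifies JSON syntax and basic shape; the Rich Results Test is still needed to confirm Google accepts the schema.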
6. Materialize comparison — nuance on replicas
"Materialize replicas are for fault tolerance only — each replica performs identical work, so adding replicas does not increase parallel processing throughput."
This is technically accurate but could benefit from a brief clarification that Materialize scales by increasing cluster size (e.g., from 25cc to 50cc), not by adding replicas. Without this context, readers might think Materialize can't scale at all.
7. ksqlDB — Confluent/Flink strategy note
"Confluent has shifted its strategic focus toward Apache Flink as its primary stream processing solution."
This is accurate, but consider adding a date reference (e.g., "announced in 2024") so the claim stays verifiable as time passes.
8. Concept page SQL examples — simplified but accurate
The CDC shared source example omits optional parameters like slot.name and publication.name. This is fine for a concept page — the "Related topics" links already point to the full connector docs with complete parameter lists.
✅ What works well
- Consistent structure across all new pages: every concept page follows (What is X → How it works → Comparison table → When to use → Related topics), and every comparison page follows (Summary table → section-by-section → How to choose)
- JSON-LD fix: raw JSON in `<head>` is much more reliable than `{JSON.stringify()}` in MDX
- "See also" cross-links on existing pages (intro, quickstart, ingestion, processing, delivery, deploy, cloud, key-concepts) form a good internal linking mesh
- docs.json updated with all 8 new pages in the correct Reference section
- HowTo structured data on `install-psql-without-postgresql` and `risingwave-docker-compose` — good candidates for rich snippets
- All internal links verified — every link in the new and modified pages points to an existing page
- Glossary (`key-concepts.mdx`) deep-dive links — nice touch connecting brief glossary entries to full concept pages
Summary: Main action items are (1) fill out the PR description, (2) verify Materialize comparison claims against their current docs, (3) cite or soften the ksqlDB deadlock claim, and (4) clean up the redundant Kafka sink example. Everything else is polish.
Pull request overview
This PR adds a set of new “What is …” concept pages and product comparison pages under Reference, and improves internal linking/SEO by adding structured data (JSON-LD) and “See also” sections across key docs.
Changes:
- Add new Reference concept pages (streaming database, materialized views, sources, sinks, CDC) and new comparison pages (Materialize, ksqlDB, Kafka Streams).
- Add cross-links from existing docs (Key concepts + multiple overview pages) to the new Reference pages via “See also” sections.
- Add/normalize JSON-LD structured data blocks for SEO on several high-traffic pages.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| reference/what-is-streaming-database.mdx | New concept page explaining streaming databases + JSON-LD FAQ. |
| reference/what-is-source.mdx | New concept page for sources + examples + JSON-LD FAQ. |
| reference/what-is-sink.mdx | New concept page for sinks + examples + JSON-LD FAQ. |
| reference/what-is-materialized-view.mdx | New concept page for materialized views + examples + JSON-LD FAQ. |
| reference/what-is-cdc.mdx | New concept page for CDC + examples + JSON-LD FAQ. |
| reference/risingwave-materialize-comparison.mdx | New comparison page (RisingWave vs Materialize) + JSON-LD FAQ. |
| reference/risingwave-ksqldb-comparison.mdx | New comparison page (RisingWave vs ksqlDB) + JSON-LD FAQ. |
| reference/risingwave-kafka-streams-comparison.mdx | New comparison page (RisingWave vs Kafka Streams) + JSON-LD FAQ. |
| reference/key-concepts.mdx | Add “deep dive” links to the new Reference concept pages. |
| processing/overview.mdx | Add “See also” section linking to MV + related overviews. |
| ingestion/overview.mdx | Add “See also” section linking to Source/CDC + related overviews. |
| delivery/overview.mdx | Add “See also” section linking to Sink + related overviews. |
| get-started/quickstart.mdx | Add JSON-LD HowTo structured data. |
| get-started/intro.mdx | Add “See also” section for key onboarding routes. |
| faq/faq-when-to-use-risingwave.mdx | Replace JSON-LD injection with explicit `<head><script type="application/ld+json">`…. |
| faq/faq-using-risingwave.mdx | Replace JSON-LD injection with explicit `<head><script type="application/ld+json">`…. |
| deploy/risingwave-docker-compose.mdx | Add JSON-LD HowTo structured data. |
| deploy/install-psql-without-postgresql.mdx | Add JSON-LD HowTo structured data. |
| deploy/deployment-modes-overview.mdx | Add “See also” section for deployment-related navigation. |
| cloud/intro.mdx | Add “See also” section for cloud-related navigation. |
| docs.json | Add new pages to the Reference navigation group. |
```json
  "@type": "Answer",
  "text": "A sink in RisingWave is a connection that continuously delivers processed streaming data to an external downstream system such as Kafka, PostgreSQL, Iceberg, Snowflake, or Elasticsearch. Sinks push data out of RisingWave and support both append-only and upsert modes with exactly-once delivery semantics."
}
```

The JSON-LD answer text claims sinks have exactly-once delivery semantics. This should match the docs' actual guarantees (generally at-least-once for most sinks; exactly-once only for specific connectors/settings, e.g., Iceberg with is_exactly_once=true). Please adjust the structured data text accordingly to avoid publishing incorrect guarantees to search engines.
_Both systems support exactly-once processing, but with different mechanisms._

Kafka Streams achieves exactly-once semantics (EOS) by using Kafka transactions. This requires all input and output to be Kafka topics and enabling `processing.guarantee=exactly_once_v2`. EOS adds latency and reduces throughput due to transactional overhead on the Kafka brokers.

RisingWave provides exactly-once semantics through its barrier-based checkpoint mechanism, which does not depend on Kafka transactions. This works across all sources and sinks, not just Kafka.

This section claims RisingWave provides exactly-once semantics across all sources and sinks. The docs currently describe sink delivery guarantees as connector-dependent (most sinks are at-least-once; exactly-once is available only for specific sinks/settings such as Iceberg with is_exactly_once=true). Please reword this comparison to avoid over-claiming delivery semantics and link to /delivery/overview#delivery-semantics for the precise guarantees.
Suggested change:

_Both systems can provide exactly-once processing under certain conditions, but with different mechanisms and scopes._

Kafka Streams achieves exactly-once semantics (EOS) by using Kafka transactions. This requires all input and output to be Kafka topics and enabling `processing.guarantee=exactly_once_v2`. EOS adds latency and reduces throughput due to transactional overhead on the Kafka brokers.

RisingWave uses a barrier-based checkpoint mechanism to provide exactly-once state updates for internal processing without relying on Kafka transactions. End-to-end delivery guarantees for sources and sinks are connector-dependent (many sinks are at-least-once). For the precise semantics by connector, see [Delivery semantics](/delivery/overview#delivery-semantics).
reference/what-is-cdc.mdx
Outdated
```
hostname = 'db.example.com',
port = '5432',
username = 'repl_user',
password = 'secret',
```

Avoid using a literal password value like 'secret' in docs examples. Please switch this to a placeholder (for example, <your_password>) and/or reference the secrets guidance so readers don’t copy-paste insecure credentials.

Suggested change:
password = '<your_password>',
reference/what-is-cdc.mdx
Outdated
```
hostname = 'db.example.com',
port = '5432',
username = 'repl_user',
password = 'secret',
```

Avoid using a literal password value like 'secret' in docs examples. Please switch this to a placeholder (for example, <your_password>) and/or reference the secrets guidance so readers don’t copy-paste insecure credentials.
reference/what-is-sink.mdx
Outdated
```
## Exactly-once delivery

RisingWave provides exactly-once semantics for sinks through its barrier-based checkpoint mechanism. When a checkpoint completes, all data up to that point is guaranteed to have been delivered to the sink exactly once. This works across all sink connectors, not just Kafka.
```

The page states that sinks are delivered exactly once across all connectors via checkpoints. This contradicts the documented delivery semantics in /delivery/overview where most sinks are at-least-once (and exactly-once is connector/setting-dependent, e.g., Iceberg when is_exactly_once=true and sink decoupling is enabled). Please update this section to reflect per-connector semantics and point readers to the Delivery semantics section for details.
- Fix over-claimed exactly-once delivery semantics in what-is-sink.mdx and risingwave-kafka-streams-comparison.mdx — most sinks are at-least-once; exactly-once is connector-dependent
- Update JSON-LD structured data to reflect accurate delivery guarantees
- Replace literal password values with <your_password> placeholder in what-is-cdc.mdx examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cements
- Add FAQPage JSON-LD to existing Flink and Kafka comparison pages
- Add SoftwareApplication JSON-LD to intro.mdx
- Create "What is Stream Processing?" concept page with FAQPage JSON-LD
- Create "What is Streaming ETL?" concept page with FAQPage JSON-LD
- Create "RisingWave vs ClickHouse" comparison page with FAQPage JSON-LD
- Add all new pages to docs.json navigation
- Add cross-link from glossary to stream processing concept page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Why is RisingWave memory usage so high?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "This is by design. RisingWave uses memory for in-memory caching of streaming query state to optimize performance. By default, it utilizes all available memory unless configured via RW_TOTAL_MEMORY_BYTES. Setting memory limits is required in Kubernetes and Docker deployments."
      }
    },
    {
      "@type": "Question",
      "name": "Why is the memory for a RisingWave Serving or Streaming Node not fully utilized?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave reserves a portion of total memory for system usage. Starting from v1.10, it uses a gradient formula: 30% of the first 16GB plus 20% of the remaining memory. You can override this with the RW_RESERVED_MEMORY_BYTES environment variable or --reserved-memory-bytes startup option (minimum 512MB)."
      }
    },
    {
      "@type": "Question",
      "name": "Why does CREATE MATERIALIZED VIEW take a long time in RisingWave?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "CREATE MATERIALIZED VIEW backfills all historical data from referenced tables to ensure a consistent snapshot. Use SHOW JOBS to check progress. To run non-blocking, set BACKGROUND_DDL=true before the CREATE statement. If progress stays at 0%, the cluster may be experiencing high latency."
      }
    },
    {
      "@type": "Question",
      "name": "What does RisingWave memory usage consist of?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave memory is divided into storage memory (block cache, meta cache, shared buffer), compute memory (streaming computation and queries), and reserved memory (system overhead). For example, an 8GB node uses approximately 2.13GB for storage, 3.47GB for compute, and 2.40GB reserved."
      }
    }
  ]
}
</script>
</head>
```
The JSON-LD block is written as raw JSON inside a <script> tag. In MDX (Mintlify), { ... } in JSX children is parsed as an expression, and a JSON object literal with quoted keys is not valid JavaScript syntax, which can break the docs build. Wrap the JSON-LD as a string expression (for example via JSON.stringify or a template literal) so it’s emitted as text inside the script tag.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Can RisingWave replace Flink SQL?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. RisingWave is a superset of Flink SQL in terms of capabilities. Users of Flink SQL can easily migrate to RisingWave. RisingWave also offers additional features not present in Flink SQL, such as cascading materialized views. RisingWave uses PostgreSQL syntax, which lowers the learning curve compared to Flink SQL."
      }
    },
    {
      "@type": "Question",
      "name": "Is RisingWave a unified batch and streaming system?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, RisingWave fully supports both stream processing (continuous incremental computation) and batch processing (computation on stored data). RisingWave shines in stream processing and uses row-based storage optimized for point queries."
      }
    },
    {
      "@type": "Question",
      "name": "Does RisingWave support transaction processing?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave supports read-only transactions but does not support read-write transaction processing. It is designed to be positioned downstream from transactional databases, using CDC to replicate data for real-time stream processing."
      }
    },
    {
      "@type": "Question",
      "name": "Why does RisingWave use row-based storage for tables?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave uses row-based storage (Hummock) because the same storage engine serves both internal state management for streaming queries and data storage for serving queries. Row-based storage is well-suited for streaming state management and point queries."
      }
    },
    {
      "@type": "Question",
      "name": "What are the differences between streaming databases and real-time OLAP databases?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Streaming databases like RisingWave optimize for result freshness with real-time materialized views, monitoring, and alerting. OLAP databases like ClickHouse optimize for ad-hoc analytical query performance. Streaming databases use row-based storage while OLAP databases use columnar storage."
      }
    },
    {
      "@type": "Question",
      "name": "How do materialized views in RisingWave differ from those in OLAP databases?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Materialized views in RisingWave are updated synchronously in real time, strongly consistent across cascading views, and support full stream processing semantics. OLAP databases typically use best-effort periodic refresh and do not support cascading materialized views."
      }
    }
  ]
}
</script>
</head>
```
The JSON-LD is embedded as raw { ... } inside <script> in an MDX file. MDX treats {...} in JSX children as an expression; a JSON object literal with quoted keys isn’t valid JS there, so this can cause MDX parse/build failures. Emit the JSON-LD as a string (e.g., JSON.stringify/template literal) instead of raw JSON.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
```
This JSON-LD <script> contains raw JSON ({ ... }) in an MDX file. MDX may parse {...} as an inline JS expression; a JSON object literal with quoted keys is not valid JS in that position and can break the build. Wrap the JSON-LD payload as a string (for example JSON.stringify or a template literal) so it renders as script text.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to deploy RisingWave with Docker Compose",
```
This JSON-LD <script> uses raw JSON ({ ... }) inside an MDX file. MDX treats {...} in JSX children as an expression, and a JSON object literal with quoted keys will not parse as valid JS there, potentially breaking the docs build. Wrap the JSON-LD payload as a string (JSON.stringify/template literal) so it’s emitted as script content.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD payload is included as raw { ... } inside <script> in an MDX page. MDX can parse {...} as an inline expression, and a JSON object literal with quoted keys is not valid JS syntax there, which may break the docs build. Serialize the JSON-LD as a string (e.g., JSON.stringify/template literal) so it’s emitted as script text.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
This newly added JSON-LD <script> block uses raw JSON inside an MDX file. MDX may interpret {...} in JSX children as an expression, and the JSON object literal (with quoted keys) won’t parse as valid JS in that context, risking a docs build failure. Wrap/serialize the JSON-LD content as a string (JSON.stringify or template literal).
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
```
The JSON-LD is embedded as raw JSON inside <script> in an MDX page. MDX may parse {...} as an inline expression; a JSON object literal with quoted keys is not valid JS in that context and can cause MDX parse/build errors. Emit the JSON-LD as a string (e.g., JSON.stringify/template literal) instead.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD payload is written as raw JSON inside <script> in an MDX page. MDX can interpret {...} as an expression; a JSON object literal with quoted keys is not valid JS there and can break MDX parsing/build. Wrap/serialize the JSON-LD as a string (e.g., JSON.stringify or template literal) so it renders as literal script contents.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
This JSON-LD <script> is raw JSON inside an MDX page. MDX can parse {...} in JSX children as a JS expression, and a JSON object literal with quoted keys won’t parse as valid JS there, potentially breaking the docs build. Wrap/serialize the JSON-LD as a string (e.g., JSON.stringify or template literal).
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD section is embedded as raw JSON inside <script> in an MDX file. MDX treats {...} in JSX children as an expression; a JSON object literal with quoted keys is not valid JS syntax in that position and can fail parsing/build. Serialize the JSON-LD payload to a string (JSON.stringify or template literal) before emitting it inside the script tag.
yingjunwu
left a comment
Updated Review (after commits 3 & 4)
The PR now has 26 files (+1909/-95), adding 11 new pages total. Good progress on addressing my earlier feedback — exactly-once semantics corrected, password placeholders sanitized. Here are remaining and new findings.
🔴 Issues to Fix
1. ClickHouse comparison: "no cascading MVs" claim is overstated
"There is no support for cascading materialized views (views on views)."
ClickHouse does support MV chaining through target tables (MV1 writes to table1, MV2 reads from table1). The semantics are different (INSERT-only trigger, no update/delete propagation, no cross-view consistency), but the blanket "no support" statement is technically inaccurate and will be called out by ClickHouse users.
Suggest: "ClickHouse supports chaining materialized views through intermediate target tables, but each view only processes new INSERT blocks — updates and deletes are not propagated, and there is no consistency guarantee across the chain."
2. SoftwareApplication JSON-LD: operatingSystem includes non-OS values
In get-started/intro.mdx:
```json
"operatingSystem": "Linux, macOS, Docker, Kubernetes"
```

Docker and Kubernetes are not operating systems. Google's structured data guidelines expect actual OS names here. Suggest: `"operatingSystem": "Linux, macOS"` or omit the field entirely.
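A compliant version of the field could look like the following sketch (the surrounding `@type` and `name` properties are shown only for context and assumed from the review; only `operatingSystem` changes):

```json
{
  "@type": "SoftwareApplication",
  "name": "RisingWave",
  "operatingSystem": "Linux, macOS"
}
```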
3. ClickHouse sink SQL example missing required parameters
```sql
CREATE SINK enriched_events_to_clickhouse FROM enriched_events
WITH (
    connector = 'clickhouse',
    type = 'append-only',
    clickhouse.url = 'http://clickhouse:8123',
    clickhouse.database = 'analytics',
    clickhouse.table = 'enriched_events'
);
```

Per the ClickHouse sink docs, `clickhouse.user` and `clickhouse.password` are required parameters. The CDC examples in other new pages include auth params — this should too, for consistency.
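A corrected sketch of the same example, adding the auth parameters the review calls out (credential values are placeholders, matching the convention used in the CDC examples):

```sql
CREATE SINK enriched_events_to_clickhouse FROM enriched_events
WITH (
    connector = 'clickhouse',
    type = 'append-only',
    clickhouse.url = 'http://clickhouse:8123',
    clickhouse.user = '<your_user>',         -- required per the ClickHouse sink docs
    clickhouse.password = '<your_password>', -- placeholder, never a literal secret
    clickhouse.database = 'analytics',
    clickhouse.table = 'enriched_events'
);
```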
4. Still present from first review — ksqlDB deadlock claim needs citation
"the ksqlDB documentation warns of potential deadlocks under concurrent pull query load"
This specific claim still has no citation. Either link to the ksqlDB doc page, or soften the language.
5. Still present — Materialize resource caps need verification
The "24 GiB memory and 48 GiB disk" self-managed cap claim should be verified against current Materialize docs. Materialize iterates on packaging frequently.
🟡 Suggestions (Non-blocking)
6. Kafka sink example in what-is-sink.mdx still has redundant type + FORMAT
Noted in first review. Both type = 'upsert' in WITH and FORMAT UPSERT ENCODE JSON are present. Not wrong (2/12 examples in the official Kafka docs do this), but the majority pattern is FORMAT-only. Consider removing type = 'upsert' for clarity in a concept page aimed at newcomers.
7. ClickHouse comparison: p99 latency claim
"serving latency is consistently low (10-20ms p99)"
This is the same unverified claim from PR #1074. If no benchmark backs this up, soften or remove.
8. what-is-streaming-etl.mdx Iceberg sink example
The Iceberg sink uses type = 'upsert' with primary_key = 'product_name, category, order_date' — verified this is the correct pattern for Iceberg sinks (they use type in WITH, not FORMAT/ENCODE).
✅ What's improved since first review
- Exactly-once semantics corrected: `what-is-sink.mdx` now says "connector-dependent — most sinks provide at-least-once delivery". `risingwave-kafka-streams-comparison.mdx` now properly scopes EOS to internal processing. Both accurate.
- Password placeholders sanitized: CDC examples now use `'<your_password>'` instead of `'secret'`.
- 3 new well-structured pages: `what-is-stream-processing`, `what-is-streaming-etl`, `risingwave-clickhouse-comparison` — all follow the established structure pattern.
- JSON-LD added to existing comparison pages (Flink, Kafka) — good, catches up pages that were missed in phase 1.
- SoftwareApplication schema on intro page — useful for Google Knowledge Panel (aside from the OS field issue).
- All internal links verified — 27 unique links, all resolve to existing pages or new pages created in this PR.
- TUMBLE syntax verified — `FROM TUMBLE(orders, order_time, INTERVAL '5 minutes')` matches the documented `FROM TUMBLE(table, time_col, window_size)` pattern.
- docs.json correctly updated with all 11 new pages.
Summary: Fix the ClickHouse cascading MV claim (factual error), the operatingSystem JSON-LD field, and the missing ClickHouse sink auth params. The ksqlDB citation and Materialize caps from the first review still need attention. Everything else is solid.
MDX parser (acorn) interprets raw {…} inside <script> tags as JSX
expressions, causing build failures. Wrap all JSON-LD content in
template literals ({`…`}) across 19 files so the JSON is emitted
as literal script text.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
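The template-literal wrap described in the commit above can be sketched as a before/after (the schema content is abbreviated for illustration; real pages carry the full FAQPage/HowTo payloads):

```mdx
{/* Before — acorn parses the bare braces as a JSX expression,
    and a JSON object literal with quoted keys is not valid JS there: */}
<head>
  <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "FAQPage" }
  </script>
</head>

{/* After — wrapping the payload in a template literal makes it a string
    expression, emitted verbatim as the script's text content: */}
<head>
  <script type="application/ld+json">
    {`{ "@context": "https://schema.org", "@type": "FAQPage" }`}
  </script>
</head>
```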
Description
[ Please provide a brief summary of the documentation changes and purpose. ]
Related code PR
[ Link to the related code pull request (if any). ]
Related doc issue
[ Link to the related documentation issue or task (if any). ]
Fix [ Provide the link to the doc issue here. ]
Checklist
Update `mint.json` to include the page in the table of contents.