Conversation
- Fix JSON-LD in FAQ pages: convert JSX {JSON.stringify()} to static JSON for correct Mintlify rendering
- Add RisingWave vs Materialize comparison page with FAQPage JSON-LD
- Add RisingWave vs ksqlDB comparison page with FAQPage JSON-LD
- Add RisingWave vs Kafka Streams comparison page with FAQPage JSON-LD
- Add all comparison pages to docs.json navigation
- Split key concepts into independent AEO-optimized pages:
  - What is a Streaming Database?
  - What is a Materialized View in RisingWave?
  - What is a Source in RisingWave?
  - What is a Sink in RisingWave?
  - What is CDC in RisingWave?
- Add concept pages to docs.json navigation
- Add cross-links from glossary to new concept pages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "See also" sections to 7 high-traffic pages:
  - get-started/intro.mdx
  - processing/overview.mdx
  - ingestion/overview.mdx
  - delivery/overview.mdx
  - cloud/intro.mdx
  - deploy/deployment-modes-overview.mdx
- Add HowTo JSON-LD schema markup to 3 tutorial pages:
  - get-started/quickstart.mdx
  - deploy/install-psql-without-postgresql.mdx
  - deploy/risingwave-docker-compose.mdx
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
yingjunwu
left a comment
Tech Writer Review: SEO Phase 2 — Comparison & Concept Pages
This PR adds 8 new pages (5 concept pages, 3 comparison pages), cross-links via "See also" sections, JSON-LD structured data, and fixes the {JSON.stringify()} issue from PR #1074. The new content is well-structured and fills an important SEO gap. Detailed findings below.
🔴 Issues to Fix
1. PR description is empty
The template fields (Summary, Related code PR, Related doc issue, Checklist) are all unfilled. Please complete the description for reviewability and future reference.
2. Materialize comparison — specific claims need verification against current docs
reference/risingwave-materialize-comparison.mdx makes several specific claims that may become outdated quickly:
- "Community Edition is free but capped at 24 GiB memory and 48 GiB disk" — Materialize iterates on pricing/packaging frequently. Verify these exact numbers against Materialize's current self-managed docs.
- "No native PostgreSQL or JDBC sink — only community workarounds via SUBSCRIBE" — Verify this is still true. If Materialize has added new sinks since this was written, the claim becomes misleading.
- "Materialize sinks: Kafka, S3, Iceberg (via S3 Tables)" — Verify completeness against Materialize's current sink list.
Since comparison pages are high-stakes (competitors may read them), factual errors here can damage credibility. Recommend adding a "Last updated: YYYY-MM" note to each comparison page so readers know the freshness.
3. ksqlDB comparison — deadlock claim needs citation
"the ksqlDB documentation warns of potential deadlocks under concurrent pull query load"
This is a specific claim about ksqlDB's documentation. Either:
- Link to the specific ksqlDB docs page that mentions this, or
- Soften to something like "pull queries have known limitations under concurrent load"
Unsubstantiated negative claims about competitors can backfire.
4. Kafka sink SQL example in what-is-sink.mdx — redundant parameters
```sql
CREATE SINK order_metrics_sink FROM order_metrics
WITH (
    connector = 'kafka',
    topic = 'order-metrics',
    properties.bootstrap.server = 'broker:9092',
    type = 'upsert', -- redundant with FORMAT below
    primary_key = 'product_id'
) FORMAT UPSERT ENCODE JSON;
```

Having both `type = 'upsert'` in the WITH clause AND `FORMAT UPSERT ENCODE JSON` is redundant. While the existing Kafka docs have 2 examples with both, the majority (10 out of 12) use only `FORMAT ... ENCODE ...`. For a concept page aimed at newcomers, pick one approach — recommend removing `type = 'upsert'` and keeping only `FORMAT UPSERT ENCODE JSON`, which is the modern/preferred pattern.
🟡 Suggestions (Non-blocking)
5. <head> tag for JSON-LD — verify in deploy preview
The fix from {JSON.stringify({...})} to raw JSON inside <head><script> is the right approach. Please verify in the Mintlify deploy preview that:
- The `<head>` tag at the bottom of MDX files actually injects into the page `<head>`
- The JSON-LD is valid (test with Google Rich Results Test)
This applies to all 8 new pages plus the 2 FAQ pages and 2 HowTo pages.
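As a supplementary local check before running Google's Rich Results Test, the JSON-LD payload can be sanity-checked with any JSON parser. This sketch uses Python's standard library with an illustrative FAQPage excerpt, not the actual page content:

```python
import json

# Illustrative FAQPage JSON-LD excerpt (not the real page payload).
jsonld = """
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is a sink in RisingWave?",
      "acceptedAnswer": {"@type": "Answer", "text": "A sink delivers processed data downstream."}
    }
  ]
}
"""

# json.loads raises an error on any malformed JSON, catching issues
# like trailing commas or misplaced braces before deploy.
data = json.loads(jsonld)
assert data["@type"] == "FAQPage"
assert all(q["@type"] == "Question" for q in data["mainEntity"])
print("JSON-LD parses and has FAQPage shape")
```

This only verifies JSON syntax and basic shape; the Rich Results Test is still needed to confirm Google accepts the schema.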
6. Materialize comparison — nuance on replicas
"Materialize replicas are for fault tolerance only — each replica performs identical work, so adding replicas does not increase parallel processing throughput."
This is technically accurate but could benefit from a brief clarification that Materialize scales by increasing cluster size (e.g., from 25cc to 50cc), not by adding replicas. Without this context, readers might think Materialize can't scale at all.
7. ksqlDB — Confluent/Flink strategy note
"Confluent has shifted its strategic focus toward Apache Flink as its primary stream processing solution."
This is accurate, but consider adding a date reference (e.g., "announced in 2024") so the claim stays verifiable as time passes.
8. Concept page SQL examples — simplified but accurate
The CDC shared source example omits optional parameters like slot.name and publication.name. This is fine for a concept page — the "Related topics" links already point to the full connector docs with complete parameter lists.
✅ What works well
- Consistent structure across all new pages: every concept page follows (What is X → How it works → Comparison table → When to use → Related topics), and every comparison page follows (Summary table → section-by-section → How to choose)
- JSON-LD fix: raw JSON in `<head>` is much more reliable than `{JSON.stringify()}` in MDX
- "See also" cross-links on existing pages (intro, quickstart, ingestion, processing, delivery, deploy, cloud, key-concepts) form a good internal linking mesh
- docs.json updated with all 8 new pages in the correct Reference section
- HowTo structured data on `install-psql-without-postgresql` and `risingwave-docker-compose` — good candidates for rich snippets
- All internal links verified — every link in the new and modified pages points to an existing page
- Glossary (`key-concepts.mdx`) deep-dive links — nice touch connecting brief glossary entries to full concept pages
Summary: Main action items are (1) fill out the PR description, (2) verify Materialize comparison claims against their current docs, (3) cite or soften the ksqlDB deadlock claim, and (4) clean up the redundant Kafka sink example. Everything else is polish.
Pull request overview
This PR adds a set of new “What is …” concept pages and product comparison pages under Reference, and improves internal linking/SEO by adding structured data (JSON-LD) and “See also” sections across key docs.
Changes:
- Add new Reference concept pages (streaming database, materialized views, sources, sinks, CDC) and new comparison pages (Materialize, ksqlDB, Kafka Streams).
- Add cross-links from existing docs (Key concepts + multiple overview pages) to the new Reference pages via “See also” sections.
- Add/normalize JSON-LD structured data blocks for SEO on several high-traffic pages.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| reference/what-is-streaming-database.mdx | New concept page explaining streaming databases + JSON-LD FAQ. |
| reference/what-is-source.mdx | New concept page for sources + examples + JSON-LD FAQ. |
| reference/what-is-sink.mdx | New concept page for sinks + examples + JSON-LD FAQ. |
| reference/what-is-materialized-view.mdx | New concept page for materialized views + examples + JSON-LD FAQ. |
| reference/what-is-cdc.mdx | New concept page for CDC + examples + JSON-LD FAQ. |
| reference/risingwave-materialize-comparison.mdx | New comparison page (RisingWave vs Materialize) + JSON-LD FAQ. |
| reference/risingwave-ksqldb-comparison.mdx | New comparison page (RisingWave vs ksqlDB) + JSON-LD FAQ. |
| reference/risingwave-kafka-streams-comparison.mdx | New comparison page (RisingWave vs Kafka Streams) + JSON-LD FAQ. |
| reference/key-concepts.mdx | Add “deep dive” links to the new Reference concept pages. |
| processing/overview.mdx | Add “See also” section linking to MV + related overviews. |
| ingestion/overview.mdx | Add “See also” section linking to Source/CDC + related overviews. |
| delivery/overview.mdx | Add “See also” section linking to Sink + related overviews. |
| get-started/quickstart.mdx | Add JSON-LD HowTo structured data. |
| get-started/intro.mdx | Add “See also” section for key onboarding routes. |
| faq/faq-when-to-use-risingwave.mdx | Replace JSON-LD injection with explicit `<head><script type="application/ld+json">`…. |
| faq/faq-using-risingwave.mdx | Replace JSON-LD injection with explicit `<head><script type="application/ld+json">`…. |
| deploy/risingwave-docker-compose.mdx | Add JSON-LD HowTo structured data. |
| deploy/install-psql-without-postgresql.mdx | Add JSON-LD HowTo structured data. |
| deploy/deployment-modes-overview.mdx | Add “See also” section for deployment-related navigation. |
| cloud/intro.mdx | Add “See also” section for cloud-related navigation. |
| docs.json | Add new pages to the Reference navigation group. |
```json
  "@type": "Answer",
  "text": "A sink in RisingWave is a connection that continuously delivers processed streaming data to an external downstream system such as Kafka, PostgreSQL, Iceberg, Snowflake, or Elasticsearch. Sinks push data out of RisingWave and support both append-only and upsert modes with exactly-once delivery semantics."
}
```

The JSON-LD answer text claims sinks have exactly-once delivery semantics. This should match the docs' actual guarantees (generally at-least-once for most sinks; exactly-once only for specific connectors/settings, e.g., Iceberg with is_exactly_once=true). Please adjust the structured data text accordingly to avoid publishing incorrect guarantees to search engines.
_Both systems support exactly-once processing, but with different mechanisms._

Kafka Streams achieves exactly-once semantics (EOS) by using Kafka transactions. This requires all input and output to be Kafka topics and enabling `processing.guarantee=exactly_once_v2`. EOS adds latency and reduces throughput due to transactional overhead on the Kafka brokers.

RisingWave provides exactly-once semantics through its barrier-based checkpoint mechanism, which does not depend on Kafka transactions. This works across all sources and sinks, not just Kafka.

This section claims RisingWave provides exactly-once semantics across all sources and sinks. The docs currently describe sink delivery guarantees as connector-dependent (most sinks are at-least-once; exactly-once is available only for specific sinks/settings such as Iceberg with is_exactly_once=true). Please reword this comparison to avoid over-claiming delivery semantics and link to /delivery/overview#delivery-semantics for the precise guarantees.
Suggested change:

_Both systems can provide exactly-once processing under certain conditions, but with different mechanisms and scopes._

Kafka Streams achieves exactly-once semantics (EOS) by using Kafka transactions. This requires all input and output to be Kafka topics and enabling `processing.guarantee=exactly_once_v2`. EOS adds latency and reduces throughput due to transactional overhead on the Kafka brokers.

RisingWave uses a barrier-based checkpoint mechanism to provide exactly-once state updates for internal processing without relying on Kafka transactions. End-to-end delivery guarantees for sources and sinks are connector-dependent (many sinks are at-least-once). For the precise semantics by connector, see [Delivery semantics](/delivery/overview#delivery-semantics).
reference/what-is-cdc.mdx
Outdated
```
hostname = 'db.example.com',
port = '5432',
username = 'repl_user',
password = 'secret',
```

Avoid using a literal password value like 'secret' in docs examples. Please switch this to a placeholder (for example, <your_password>) and/or reference the secrets guidance so readers don’t copy-paste insecure credentials.

Suggested change:
password = '<your_password>',
reference/what-is-cdc.mdx
Outdated
```
hostname = 'db.example.com',
port = '5432',
username = 'repl_user',
password = 'secret',
```

Avoid using a literal password value like 'secret' in docs examples. Please switch this to a placeholder (for example, <your_password>) and/or reference the secrets guidance so readers don’t copy-paste insecure credentials.
reference/what-is-sink.mdx
Outdated
```
## Exactly-once delivery

RisingWave provides exactly-once semantics for sinks through its barrier-based checkpoint mechanism. When a checkpoint completes, all data up to that point is guaranteed to have been delivered to the sink exactly once. This works across all sink connectors, not just Kafka.
```

The page states that sinks are delivered exactly once across all connectors via checkpoints. This contradicts the documented delivery semantics in /delivery/overview where most sinks are at-least-once (and exactly-once is connector/setting-dependent, e.g., Iceberg when is_exactly_once=true and sink decoupling is enabled). Please update this section to reflect per-connector semantics and point readers to the Delivery semantics section for details.
- Fix over-claimed exactly-once delivery semantics in what-is-sink.mdx and risingwave-kafka-streams-comparison.mdx — most sinks are at-least-once; exactly-once is connector-dependent
- Update JSON-LD structured data to reflect accurate delivery guarantees
- Replace literal password values with <your_password> placeholder in what-is-cdc.mdx examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cements
- Add FAQPage JSON-LD to existing Flink and Kafka comparison pages
- Add SoftwareApplication JSON-LD to intro.mdx
- Create "What is Stream Processing?" concept page with FAQPage JSON-LD
- Create "What is Streaming ETL?" concept page with FAQPage JSON-LD
- Create "RisingWave vs ClickHouse" comparison page with FAQPage JSON-LD
- Add all new pages to docs.json navigation
- Add cross-link from glossary to stream processing concept page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Why is RisingWave memory usage so high?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "This is by design. RisingWave uses memory for in-memory caching of streaming query state to optimize performance. By default, it utilizes all available memory unless configured via RW_TOTAL_MEMORY_BYTES. Setting memory limits is required in Kubernetes and Docker deployments."
      }
    },
    {
      "@type": "Question",
      "name": "Why is the memory for a RisingWave Serving or Streaming Node not fully utilized?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave reserves a portion of total memory for system usage. Starting from v1.10, it uses a gradient formula: 30% of the first 16GB plus 20% of the remaining memory. You can override this with the RW_RESERVED_MEMORY_BYTES environment variable or --reserved-memory-bytes startup option (minimum 512MB)."
      }
    },
    {
      "@type": "Question",
      "name": "Why does CREATE MATERIALIZED VIEW take a long time in RisingWave?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "CREATE MATERIALIZED VIEW backfills all historical data from referenced tables to ensure a consistent snapshot. Use SHOW JOBS to check progress. To run non-blocking, set BACKGROUND_DDL=true before the CREATE statement. If progress stays at 0%, the cluster may be experiencing high latency."
      }
    },
    {
      "@type": "Question",
      "name": "What does RisingWave memory usage consist of?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave memory is divided into storage memory (block cache, meta cache, shared buffer), compute memory (streaming computation and queries), and reserved memory (system overhead). For example, an 8GB node uses approximately 2.13GB for storage, 3.47GB for compute, and 2.40GB reserved."
      }
    }
  ]
}
</script>
</head>
```
The JSON-LD block is written as raw JSON inside a <script> tag. In MDX (Mintlify), { ... } in JSX children is parsed as an expression, and a JSON object literal with quoted keys is not valid JavaScript syntax, which can break the docs build. Wrap the JSON-LD as a string expression (for example via JSON.stringify or a template literal) so it’s emitted as text inside the script tag.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Can RisingWave replace Flink SQL?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. RisingWave is a superset of Flink SQL in terms of capabilities. Users of Flink SQL can easily migrate to RisingWave. RisingWave also offers additional features not present in Flink SQL, such as cascading materialized views. RisingWave uses PostgreSQL syntax, which lowers the learning curve compared to Flink SQL."
      }
    },
    {
      "@type": "Question",
      "name": "Is RisingWave a unified batch and streaming system?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, RisingWave fully supports both stream processing (continuous incremental computation) and batch processing (computation on stored data). RisingWave shines in stream processing and uses row-based storage optimized for point queries."
      }
    },
    {
      "@type": "Question",
      "name": "Does RisingWave support transaction processing?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave supports read-only transactions but does not support read-write transaction processing. It is designed to be positioned downstream from transactional databases, using CDC to replicate data for real-time stream processing."
      }
    },
    {
      "@type": "Question",
      "name": "Why does RisingWave use row-based storage for tables?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "RisingWave uses row-based storage (Hummock) because the same storage engine serves both internal state management for streaming queries and data storage for serving queries. Row-based storage is well-suited for streaming state management and point queries."
      }
    },
    {
      "@type": "Question",
      "name": "What are the differences between streaming databases and real-time OLAP databases?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Streaming databases like RisingWave optimize for result freshness with real-time materialized views, monitoring, and alerting. OLAP databases like ClickHouse optimize for ad-hoc analytical query performance. Streaming databases use row-based storage while OLAP databases use columnar storage."
      }
    },
    {
      "@type": "Question",
      "name": "How do materialized views in RisingWave differ from those in OLAP databases?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Materialized views in RisingWave are updated synchronously in real time, strongly consistent across cascading views, and support full stream processing semantics. OLAP databases typically use best-effort periodic refresh and do not support cascading materialized views."
      }
    }
  ]
}
</script>
</head>
```
The JSON-LD is embedded as raw { ... } inside <script> in an MDX file. MDX treats {...} in JSX children as an expression; a JSON object literal with quoted keys isn’t valid JS there, so this can cause MDX parse/build failures. Emit the JSON-LD as a string (e.g., JSON.stringify/template literal) instead of raw JSON.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
```
This JSON-LD <script> contains raw JSON ({ ... }) in an MDX file. MDX may parse {...} as an inline JS expression; a JSON object literal with quoted keys is not valid JS in that position and can break the build. Wrap the JSON-LD payload as a string (for example JSON.stringify or a template literal) so it renders as script text.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to deploy RisingWave with Docker Compose",
```
This JSON-LD <script> uses raw JSON ({ ... }) inside an MDX file. MDX treats {...} in JSX children as an expression, and a JSON object literal with quoted keys will not parse as valid JS there, potentially breaking the docs build. Wrap the JSON-LD payload as a string (JSON.stringify/template literal) so it’s emitted as script content.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD payload is included as raw { ... } inside <script> in an MDX page. MDX can parse {...} as an inline expression, and a JSON object literal with quoted keys is not valid JS syntax there, which may break the docs build. Serialize the JSON-LD as a string (e.g., JSON.stringify/template literal) so it’s emitted as script text.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
This newly added JSON-LD <script> block uses raw JSON inside an MDX file. MDX may interpret {...} in JSX children as an expression, and the JSON object literal (with quoted keys) won’t parse as valid JS in that context, risking a docs build failure. Wrap/serialize the JSON-LD content as a string (JSON.stringify or template literal).
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
```
The JSON-LD is embedded as raw JSON inside <script> in an MDX page. MDX may parse {...} as an inline expression; a JSON object literal with quoted keys is not valid JS in that context and can cause MDX parse/build errors. Emit the JSON-LD as a string (e.g., JSON.stringify/template literal) instead.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD payload is written as raw JSON inside <script> in an MDX page. MDX can interpret {...} as an expression; a JSON object literal with quoted keys is not valid JS there and can break MDX parsing/build. Wrap/serialize the JSON-LD as a string (e.g., JSON.stringify or template literal) so it renders as literal script contents.
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
This JSON-LD <script> is raw JSON inside an MDX page. MDX can parse {...} in JSX children as a JS expression, and a JSON object literal with quoted keys won’t parse as valid JS there, potentially breaking the docs build. Wrap/serialize the JSON-LD as a string (e.g., JSON.stringify or template literal).
```html
<head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
```
The JSON-LD section is embedded as raw JSON inside <script> in an MDX file. MDX treats {...} in JSX children as an expression; a JSON object literal with quoted keys is not valid JS syntax in that position and can fail parsing/build. Serialize the JSON-LD payload to a string (JSON.stringify or template literal) before emitting it inside the script tag.
yingjunwu
left a comment
Updated Review (after commits 3 & 4)
The PR now has 26 files (+1909/-95), adding 11 new pages total. Good progress on addressing my earlier feedback — exactly-once semantics corrected, password placeholders sanitized. Here are remaining and new findings.
🔴 Issues to Fix
1. ClickHouse comparison: "no cascading MVs" claim is overstated
"There is no support for cascading materialized views (views on views)."
ClickHouse does support MV chaining through target tables (MV1 writes to table1, MV2 reads from table1). The semantics are different (INSERT-only trigger, no update/delete propagation, no cross-view consistency), but the blanket "no support" statement is technically inaccurate and will be called out by ClickHouse users.
Suggest: "ClickHouse supports chaining materialized views through intermediate target tables, but each view only processes new INSERT blocks — updates and deletes are not propagated, and there is no consistency guarantee across the chain."
2. SoftwareApplication JSON-LD: operatingSystem includes non-OS values
In get-started/intro.mdx:
```json
"operatingSystem": "Linux, macOS, Docker, Kubernetes"
```

Docker and Kubernetes are not operating systems. Google's structured data guidelines expect actual OS names here. Suggest: `"operatingSystem": "Linux, macOS"` or omit the field entirely.
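A compliant version of the field could look like the following sketch (the surrounding `@type` and `name` properties are shown only for context and assumed from the review; only `operatingSystem` changes):

```json
{
  "@type": "SoftwareApplication",
  "name": "RisingWave",
  "operatingSystem": "Linux, macOS"
}
```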
3. ClickHouse sink SQL example missing required parameters
```sql
CREATE SINK enriched_events_to_clickhouse FROM enriched_events
WITH (
    connector = 'clickhouse',
    type = 'append-only',
    clickhouse.url = 'http://clickhouse:8123',
    clickhouse.database = 'analytics',
    clickhouse.table = 'enriched_events'
);
```

Per the ClickHouse sink docs, `clickhouse.user` and `clickhouse.password` are required parameters. The CDC examples in other new pages include auth params — this should too, for consistency.
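A corrected sketch of the same example, adding the auth parameters the review calls out (credential values are placeholders, matching the convention used in the CDC examples):

```sql
CREATE SINK enriched_events_to_clickhouse FROM enriched_events
WITH (
    connector = 'clickhouse',
    type = 'append-only',
    clickhouse.url = 'http://clickhouse:8123',
    clickhouse.user = '<your_user>',         -- required per the ClickHouse sink docs
    clickhouse.password = '<your_password>', -- placeholder, never a literal secret
    clickhouse.database = 'analytics',
    clickhouse.table = 'enriched_events'
);
```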
4. Still present from first review — ksqlDB deadlock claim needs citation
"the ksqlDB documentation warns of potential deadlocks under concurrent pull query load"
This specific claim still has no citation. Either link to the ksqlDB doc page, or soften the language.
5. Still present — Materialize resource caps need verification
The "24 GiB memory and 48 GiB disk" self-managed cap claim should be verified against current Materialize docs. Materialize iterates on packaging frequently.
🟡 Suggestions (Non-blocking)
6. Kafka sink example in what-is-sink.mdx still has redundant type + FORMAT
Noted in first review. Both type = 'upsert' in WITH and FORMAT UPSERT ENCODE JSON are present. Not wrong (2/12 examples in the official Kafka docs do this), but the majority pattern is FORMAT-only. Consider removing type = 'upsert' for clarity in a concept page aimed at newcomers.
7. ClickHouse comparison: p99 latency claim
"serving latency is consistently low (10-20ms p99)"
This is the same unverified claim from PR #1074. If no benchmark backs this up, soften or remove.
8. what-is-streaming-etl.mdx Iceberg sink example
The Iceberg sink uses type = 'upsert' with primary_key = 'product_name, category, order_date' — verified this is the correct pattern for Iceberg sinks (they use type in WITH, not FORMAT/ENCODE).
✅ What's improved since first review
- Exactly-once semantics corrected: `what-is-sink.mdx` now says "connector-dependent — most sinks provide at-least-once delivery". `risingwave-kafka-streams-comparison.mdx` now properly scopes EOS to internal processing. Both accurate.
- Password placeholders sanitized: CDC examples now use `'<your_password>'` instead of `'secret'`.
- 3 new well-structured pages: `what-is-stream-processing`, `what-is-streaming-etl`, `risingwave-clickhouse-comparison` — all follow the established structure pattern.
- JSON-LD added to existing comparison pages (Flink, Kafka) — good, catches up pages that were missed in phase 1.
- SoftwareApplication schema on intro page — useful for Google Knowledge Panel (aside from the OS field issue).
- All internal links verified — 27 unique links, all resolve to existing pages or new pages created in this PR.
- TUMBLE syntax verified — `FROM TUMBLE(orders, order_time, INTERVAL '5 minutes')` matches the documented `FROM TUMBLE(table, time_col, window_size)` pattern.
- docs.json correctly updated with all 11 new pages.
Summary: Fix the ClickHouse cascading MV claim (factual error), the operatingSystem JSON-LD field, and the missing ClickHouse sink auth params. The ksqlDB citation and Materialize caps from the first review still need attention. Everything else is solid.
MDX parser (acorn) interprets raw {…} inside <script> tags as JSX
expressions, causing build failures. Wrap all JSON-LD content in
template literals ({`…`}) across 19 files so the JSON is emitted
as literal script text.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
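The template-literal wrap described in the commit above can be sketched as a before/after (the schema content is abbreviated for illustration; real pages carry the full FAQPage/HowTo payloads):

```mdx
{/* Before — acorn parses the bare braces as a JSX expression,
    and a JSON object literal with quoted keys is not valid JS there: */}
<head>
  <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "FAQPage" }
  </script>
</head>

{/* After — wrapping the payload in a template literal makes it a string
    expression, emitted verbatim as the script's text content: */}
<head>
  <script type="application/ld+json">
    {`{ "@context": "https://schema.org", "@type": "FAQPage" }`}
  </script>
</head>
```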
Description
[ Please provide a brief summary of the documentation changes and purpose. ]
Related code PR
[ Link to the related code pull request (if any). ]
Related doc issue
[ Link to the related documentation issue or task (if any). ]
Fix [ Provide the link to the doc issue here. ]
Checklist
Update `mint.json` to include the page in the table of contents.