Docs for the rafted status check procedure. #1823
Merged
tselmegbaasan
merged 14 commits into
neo4j:dev
from
tselmegbaasan:add-status-check-docs
Sep 25, 2024
Changes from 1 commit
Commits
7296e3b
Create a page for the rafted status check.
tselmegbaasan 9c978ee
Add detailed information about fault-tolerance.
tselmegbaasan d93280b
Update the TOC to include the new page
NataliaIvakina b07a7b5
Addressing review comments
tselmegbaasan af39c66
Fix headings and their levels
NataliaIvakina cc15def
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan f86df3b
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan b13f78c
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan 7d13333
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan 671e6cf
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan 32618f5
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan 367035f
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan b9ef49a
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan d45422c
Update modules/ROOT/pages/clustering/monitoring/status-check.adoc
tselmegbaasan
46 changes: 46 additions & 0 deletions
modules/ROOT/pages/clustering/monitoring/status-check.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
:description: This section describes how to monitor a database's availability with the help of the rafted status check | ||
[role=label--new-5.24] | ||
== Rafted Status Check | ||
|
||
Neo4j 5.24 introduces the xref:reference/procedures.adoc#procedure_dbms_cluster_statusCheck[`dbms.cluster.statusCheck()`] procedure, which monitors whether a rafted database can replicate. In most cases, this means that the database can be written to. | ||
The procedure also shows which members are up to date and can participate in a successful replication, which makes it useful for determining the fault tolerance of a rafted database. Finally, it can be used to determine the leader of the raft group. | ||
|
||
[NOTE] | ||
==== | ||
The member on which the procedure is called replicates a `status check entry` in the same raft group as the transactions, and verifies that the entry can be replicated and applied. | ||
|
||
Because the entry is not applied to the transaction state machine, a status check that reports successful replication does not guarantee that the database is available for writes. | ||
However, it does indicate that the raft group is healthy, which in most cases means that the database is available for writes. | ||
==== | ||
|
||
=== Syntax | ||
|
||
[source, shell] | ||
---- | ||
CALL dbms.cluster.statusCheck(databases :: LIST<STRING>, timeoutMilliseconds = null :: INTEGER) | ||
---- | ||
|
||
* *databases:* the list of databases for which the status check should run. An empty list runs the | ||
status check for all *rafted* databases on that server. | ||
* *timeoutMilliseconds:* the maximum time, in milliseconds, that the replication may take. The default value is 1000 milliseconds. If replication takes longer than this timeout, the procedure reports | ||
the replication as unsuccessful. | ||
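The two parameters above can be supplied as query parameters from client code. The following is a minimal Python sketch, not part of the documented API: the helper name `build_status_check_call` and the parameter names `$databases` and `$timeoutMilliseconds` are illustrative assumptions, and the sketch assumes that the second argument can be left out so that the procedure falls back to its default timeout.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_status_check_call(
    databases: List[str], timeout_ms: Optional[int] = None
) -> Tuple[str, Dict[str, Any]]:
    """Return a parameterized CALL statement and its parameters.

    Hypothetical helper: it only builds the query text and does not
    contact a server.
    """
    # An empty list asks for all rafted databases on the server.
    params: Dict[str, Any] = {"databases": list(databases)}
    if timeout_ms is None:
        # Assumption: omitting the argument uses the default timeout.
        return "CALL dbms.cluster.statusCheck($databases)", params
    params["timeoutMilliseconds"] = timeout_ms
    return (
        "CALL dbms.cluster.statusCheck($databases, $timeoutMilliseconds)",
        params,
    )
```

The returned query text and parameter map can then be passed to whichever client library is used to run Cypher against the server.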
|
||
|
||
The procedure returns one row for each raft group member of each requested database. Each row consists of: | ||
|
||
* *database:* the database for which the `status check entry` was replicated. | ||
* *serverId:* the server ID of the raft group member, regardless of whether it participated in a successful replication of the `status check entry`. | ||
* *serverName:* the server name of each raft group member. | ||
* *address:* the bolt address of each raft group member. | ||
* *replicationSuccessful:* indicates whether the server on which the procedure is run can replicate an entry in raft. It is `TRUE` if this server managed to replicate the `status check entry` to a majority of raft members within the given timeout, and `FALSE` | ||
if it failed to replicate within the timeout. The value is the same in every row for a given database. A failed replication | ||
can mean either that there is a real issue in the cluster (e.g. no leader) or simply that this server is too far behind in raft and therefore cannot replicate. | ||
* *memberStatus:* shows the status of each raft group member. It can be `APPLYING`, `REPLICATING`, or `UNAVAILABLE`. `APPLYING` means that the raft group member has raft running and is actively applying entries, including transactions. | ||
`REPLICATING` means that the member can participate in replication but cannot apply entries. This state is uncommon, but may occur while waiting for the database to start and accept transactions. | ||
* *recognisedLeader:* shows the server id of the perceived leader of each raft group member. | ||
* *recognisedLeaderTerm:* shows the term of the perceived leader of each raft group member. If the raft group members report different leaders, the one with the highest term should be trusted. | ||
* *requester:* is `TRUE` for the server on which the procedure is run, and `FALSE` on the remaining servers. | ||
* *error:* contains the error message if there is one. An example of an error is that one or more of the requested databases do not exist on the requester. | ||
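The rule for `recognisedLeaderTerm` in the list above (trust the leader with the highest term) can be sketched in Python as follows. This is an illustrative sketch only: it assumes the rows have already been fetched into dictionaries keyed by the column names above, the function name `trusted_leader` is hypothetical, and skipping rows with a missing leader or term is an assumption about how unavailable members might be reported.

```python
from typing import Any, Dict, Iterable, Optional


def trusted_leader(rows: Iterable[Dict[str, Any]], database: str) -> Optional[str]:
    """Return the recognised leader with the highest term for one database.

    Returns None if no member of that database reports a leader.
    """
    best_term: Optional[int] = None
    best_leader: Optional[str] = None
    for row in rows:
        if row.get("database") != database:
            continue
        leader = row.get("recognisedLeader")
        term = row.get("recognisedLeaderTerm")
        # Assumption: members that report no leader carry null values.
        if leader is None or term is None:
            continue
        if best_term is None or term > best_term:
            best_term, best_leader = term, leader
    return best_leader
```

If all members agree on the leader, the function simply returns that server ID; the term comparison only matters when the members report different leaders.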
|
||
In general, the `replicationSuccessful` field can be used to determine overall write availability, whereas the `memberStatus` field shows whether the database is fault-tolerant. | ||
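One possible way to combine the two fields is sketched below in Python. It is an interpretation rather than a documented algorithm: it assumes the rows are available as dictionaries keyed by the column names above, it counts both `APPLYING` and `REPLICATING` members as able to take part in replication, and it estimates fault tolerance as the number of such members beyond the raft majority. The function name `summarize_status` and the keys of its result are hypothetical.

```python
from typing import Any, Dict, List

# Assumption: these two statuses mean the member can take part in replication.
PARTICIPATING = {"APPLYING", "REPLICATING"}


def summarize_status(rows: List[Dict[str, Any]], database: str) -> Dict[str, Any]:
    """Summarize replication success and spare members for one database."""
    members = [row for row in rows if row.get("database") == database]
    if not members:
        raise ValueError(f"no status check rows for database {database!r}")
    # A raft group needs more than half of its members to replicate.
    majority = len(members) // 2 + 1
    participating = sum(
        1 for row in members if row.get("memberStatus") in PARTICIPATING
    )
    return {
        "replication_successful": all(
            row.get("replicationSuccessful") is True for row in members
        ),
        "members": len(members),
        "participating": participating,
        "majority": majority,
        # Members that could be lost while a majority can still replicate.
        "tolerable_failures": max(participating - majority, 0),
    }
```

Under these assumptions, a three-member group with every member `APPLYING` has one spare member, while the same group with one `UNAVAILABLE` member has none, even though replication can still succeed.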