Commit 739e352

persistence: apply review suggestions
1 parent fef7728 commit 739e352

7 files changed: 160 additions and 103 deletions


doc/code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml

Lines changed: 3 additions & 2 deletions
@@ -6,9 +6,10 @@ groups:
 instance001:
   snapshot:
     dir: 'var/lib/{{ instance_name }}/snapshots'
-    count: 10
+    count: 3
     by:
-      interval: 60
+      interval: 7200
+      wal_size: 1000000000000000000
   iproto:
     listen:
     - uri: '127.0.0.1:3301'
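
For orientation, the ``snapshot`` section of ``instance001`` after this change might look like the sketch below. The indentation is reconstructed (the diff view drops it), the enclosing ``groups``/``replicasets``/``instances`` levels are omitted, and the comments are added for illustration based on the documentation text in this commit.

```yaml
# Sketch only: enclosing levels omitted, indentation reconstructed.
snapshot:
  dir: 'var/lib/{{ instance_name }}/snapshots'
  count: 3                         # keep at most three snapshots
  by:
    interval: 7200                 # take a snapshot every two hours (7200 seconds)
    wal_size: 1000000000000000000  # limit in bytes for WAL files created since the last snapshot
```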

doc/concepts/configuration/configuration_persistence.rst

Lines changed: 75 additions & 32 deletions
@@ -26,32 +26,32 @@ This section describes how to define snapshot settings in the :ref:`snapshot <co
2626

2727
.. _configuration_persistence_snapshot_creation:
2828

29-
Configure the automatic snapshot creation
30-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29+
Set up automatic snapshot creation
30+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3131

32-
In Tarantool, it is possible to set automatic :ref:`snapshot creation </reference/reference_lua/box_snapshot>`.
33-
To enable it, the :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>` option is used.
34-
The option sets up the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>` that takes new snapshots
35-
every ``snapshot.by.interval`` seconds.
36-
When the number of snapshots reaches the limit of :ref:`snapshot.count <configuration_reference_snapshot_count>` size,
37-
the daemon activates Tarantool garbage collector after the new snapshot is taken.
38-
Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
32+
In Tarantool, you can automate :ref:`snapshot creation </reference/reference_lua/box_snapshot>`.
33+
Automatic creation is enabled by default and can be configured in two ways:
3934

40-
The :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>` option defines the maximum size in bytes
41-
for of all WAL files created since the last snapshot taken.
42-
Once this size is exceeded, the checkpoint daemon takes a snapshot and deletes the old WAL files.
35+
* A new snapshot is taken once in a given period (see :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>`).
36+
* A new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit
37+
(see :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>`).
4338

44-
The configuration of the checkpoint daemon might look as follows:
39+
The ``snapshot.by.interval`` option sets up the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`
40+
that takes a new snapshot every ``snapshot.by.interval`` seconds.
41+
If the ``snapshot.by.interval`` option is set to zero, the checkpoint daemon is disabled.
42+
43+
The ``snapshot.by.wal_size`` option defines the maximum size in bytes of all WAL files created since the last snapshot was taken.
44+
Once this size is exceeded, the checkpoint daemon takes a snapshot. Then, :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>`
45+
deletes the old WAL files.
46+
47+
The example shows how to specify the ``snapshot.by.interval`` and the ``snapshot.by.wal_size`` options:
4548

4649
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml
4750
:language: yaml
48-
:start-at: count:
49-
:end-at: 60
51+
:start-at: by:
52+
:end-at: 1000000000000000000
5053
:dedent:
5154

52-
If the ``snapshot.by.interval`` option is set to zero, the checkpoint daemon is disabled.
53-
If the ``snapshot.count`` option is set to zero, the checkpoint daemon does not delete old snapshots.
54-
5555
.. _configuration_persistence_snapshot_dir:
5656

5757
Specify a directory for snapshot files
@@ -79,6 +79,28 @@ For example, you can place snapshots and write-ahead logs on different hard driv
7979
wal:
8080
dir: '/media/drive2/wals'
8181
82+
.. _configuration_persistence_snapshot_count:
83+
84+
Configure a maximum number of stored snapshots
85+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
86+
87+
You can set a limit on the number of snapshots stored in the :ref:`snapshot.dir <configuration_reference_snapshot_dir>`
88+
directory using the :ref:`snapshot.count <configuration_reference_snapshot_count>` option.
89+
Once the number of snapshots reaches the given limit, :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>`
90+
deletes the oldest snapshot file and any associated WAL files after the new snapshot is taken.
91+
92+
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_snapshot/config.yaml
93+
:language: yaml
94+
:start-at: count:
95+
:end-at: 7200
96+
:dedent:
97+
98+
In the example, the snapshot is created every two hours (every 7200 seconds) until there are three snapshots in the
99+
``snapshot.dir`` directory.
100+
After creating a new snapshot (the fourth one), the oldest snapshot and the corresponding WALs are deleted.
101+
102+
If the ``snapshot.count`` option is set to zero, the garbage collector does not delete old snapshots.
103+
82104
.. _configuration_persistence_wal:
83105

84106
Configure the write-ahead log
@@ -93,8 +115,15 @@ This section describes how to define WAL settings in the :ref:`wal <configuratio
93115
Set the WAL mode
94116
~~~~~~~~~~~~~~~~
95117

96-
To be able to recover data in case of a possible instance restart, enable recording to the write-ahead log.
97-
To do it, set the :ref:`wal.mode <configuration_reference_wal_mode>` configuration option to ``write`` or ``fsync``.
118+
The recording to the write-ahead log is enabled by default.
119+
It means that if an instance restart occurs, the data will be recovered.
120+
The recording to the WAL can be configured using the :ref:`wal.mode <configuration_reference_wal_mode>` configuration option.
121+
122+
There are two modes that enable writing to the WAL:
123+
124+
* ``write`` (default) -- enable WAL and write the data without waiting for the data to be flushed to the storage device.
125+
* ``fsync`` -- enable WAL and ensure that the record is written to the storage device.
126+
98127
The example below shows how to specify the ``write`` WAL mode for ``instance001``:
99128

100129
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
@@ -103,9 +132,6 @@ The example below shows how to specify the ``write`` WAL mode for ``instance001`
103132
:end-at: 'write'
104133
:dedent:
105134

106-
The ``write`` mode enables WAL and writes the data without waiting the data to be flushed to the storage device.
107-
The ``fsync`` mode enables WAL and ensures that the record is written to the storage device.
108-
109135
To turn the WAL writer off, set the ``wal.mode`` option to ``none``.
110136

111137
.. _configuration_persistence_wal_dir:
@@ -159,7 +185,7 @@ Set a delay for the garbage collector
159185

160186
In Tarantool, the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`
161187
takes new snapshots at the given interval (see :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>`).
162-
After an instance restart, the daemon activates the Tarantool garbage collector that deletes the old WAL files.
188+
After an instance restart, the Tarantool garbage collector deletes the old WAL files.
163189

164190
To delay the immediate deletion of WAL files, use the :ref:`wal.cleanup_delay <configuration_reference_wal_cleanup_delay>`
165191
configuration option. The delay eliminates possible erroneous situations when the master deletes WALs
@@ -180,7 +206,7 @@ In the example, the delay is set to 5 hours (18000 seconds):
180206
Specify the WAL extensions
181207
~~~~~~~~~~~~~~~~~~~~~~~~~~
182208

183-
In Tarantool Enterprise, you can store an old and new tuple for each crud operation performed.
209+
In Tarantool Enterprise, you can store an old and new tuple for each CRUD operation performed.
184210
The detailed description and examples of the WAL extensions are provided in the :ref:`WAL extensions <wal_extensions>` section.
185211

186212
See also: :ref:`wal.ext.* <configuration_reference_wal_ext>` configuration options.
@@ -191,32 +217,49 @@ Checkpoint daemon
191217
-----------------
192218

193219
The checkpoint daemon (snapshot daemon) is a constantly running :ref:`fiber <app-fibers>`.
194-
If the checkpoint daemon is enabled, it takes new :ref:`snapshot (.snap) files <index-box_persistence>` at the
195-
:ref:`given interval <configuration_reference_snapshot_by_interval>` automatically.
196-
If necessary, the checkpoint daemon also activates the Tarantool garbage collector that deletes old snapshot and WAL files.
220+
The checkpoint daemon creates a schedule for the periodic snapshot creation based on
221+
the :ref:`configuration options <configuration_reference_snapshot_by>` and the speed of file size growth.
222+
If enabled, the daemon creates new snapshot (``.snap``) files according to this schedule.
223+
224+
The work of the checkpoint daemon is based on the following configuration options:
225+
226+
* :ref:`snapshot.by.interval <configuration_reference_snapshot_by_interval>` -- a new snapshot is taken once in a given period.
227+
* :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>` -- a new snapshot is taken once the size
228+
of all WAL files created since the last snapshot exceeds a given limit.
229+
230+
If necessary, the checkpoint daemon also activates the :ref:`Tarantool garbage collector <configuration_persistence_garbage_collector>` that deletes old snapshot and WAL files.
231+
232+
.. _configuration_persistence_garbage_collector:
233+
234+
Tarantool garbage collector
235+
---------------------------
236+
237+
Tarantool garbage collector can be activated by the :ref:`checkpoint daemon <configuration_persistence_checkpoint_daemon>`.
238+
The garbage collector tracks the snapshots that are to be :ref:`relayed to a replica <memtx-replication>` or needed
239+
by other consumers. When the files are no longer needed, Tarantool garbage collector deletes them.
197240

198241
.. NOTE::
199242

200-
This garbage collector is distinct from the `Lua garbage collector <https://www.lua.org/manual/5.1/manual.html#2.10>`_
243+
The garbage collector called by the checkpoint daemon is distinct from the `Lua garbage collector <https://www.lua.org/manual/5.1/manual.html#2.10>`_
201244
which is for Lua objects, and distinct from the Tarantool garbage collector that specializes in :ref:`handling shard buckets <vshard-gc>`.
202245

203246
This garbage collector is called as follows:
204247

205248
* When the number of snapshots reaches the limit of :ref:`snapshot.count <configuration_reference_snapshot_count>` size.
206249
After a new snapshot is taken, Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
207250

208-
* When the size all WAL files created since the last snapshot reaches the limit of :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>`.
251+
* When the size of all WAL files created since the last snapshot reaches the limit of :ref:`snapshot.by.wal_size <configuration_reference_snapshot_by_wal_size>`.
209252
Once this size is exceeded, the checkpoint daemon takes a snapshot, then the garbage collector deletes the old WAL files.
210253

211-
If the checkpoint daemon deletes an old snapshot file, the Tarantool garbage collector also deletes
254+
If an old snapshot file is deleted, the Tarantool garbage collector also deletes
212255
any :ref:`write-ahead log (.xlog) <internals-wal>` files that meet the following conditions:
213256

214257
* The WAL files are older than the snapshot file.
215258
* The WAL files contain information present in the snapshot file.
216259

217260
Tarantool garbage collector also deletes obsolete vinyl ``.run`` files.
218261

219-
The checkpoint daemon and the Tarantool garbage collector don't delete a file in the following cases:
262+
Tarantool garbage collector doesn't delete a file in the following cases:
220263

221264
* A **backup** is running, and the file has not been backed up
222265
(see :ref:`"Hot backup" <admin-backups-hot_backup_vinyl_memtx>`).
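
The special values described in this file's diff can be summarized in one fragment. This is an illustrative sketch, not a recommended setup: it assumes the options are placed under an instance's configuration as in ``persistence_snapshot/config.yaml``, and the comments restate the behavior described in the documentation text above.

```yaml
# Sketch only: combines the edge values mentioned in the text for illustration.
snapshot:
  count: 0        # the garbage collector does not delete old snapshots
  by:
    interval: 0   # the checkpoint daemon is disabled
wal:
  mode: 'fsync'   # WAL enabled, record written to the storage device
                  # ('write' is the default; 'none' turns the WAL writer off)
```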

doc/dev_guide/internals/file_formats.rst

Lines changed: 22 additions & 16 deletions
@@ -5,22 +5,26 @@ File formats
55

66
.. _internals-wal:
77

8-
Data persistence and the WAL file format
9-
----------------------------------------
8+
The WAL file format
9+
-------------------
1010

11-
To maintain data persistence, Tarantool writes each data change request (insert,
12-
update, delete, replace, upsert) into a write-ahead log (WAL) file in the
11+
To maintain :ref:`data persistence <concepts-data_model-persistence>`, Tarantool writes each data change request (insert,
12+
update, delete, replace, upsert) to a write-ahead log (WAL) file in the
1313
:ref:`wal.dir <configuration_reference_wal_dir>` directory.
14-
Each data change request gets assigned a continuously growing 64-bit log sequence
14+
Each data change request is assigned a continuously growing 64-bit log sequence
1515
number. The name of the WAL file is based on the log sequence number of the first
1616
record in the file, plus an extension ``.xlog``.
1717
A new WAL file is created
1818
when the current one reaches the :ref:`wal_max_size <cfg_binary_logging_snapshots-wal_max_size>` size.
1919

20-
Apart from a log sequence number and the data change request (formatted as in
21-
:ref:`Tarantool's binary protocol <internals-box_protocol>`),
22-
each WAL record contains a header, some metadata, and then the data formatted
23-
according to `msgpack <https://en.wikipedia.org/wiki/MessagePack>`_ rules.
20+
Each WAL record contains:
21+
22+
* a log sequence number
23+
* a data change request (formatted as in :ref:`Tarantool's binary protocol <internals-box_protocol>`)
24+
25+
* a header
26+
* some metadata
27+
* the data formatted according to `msgpack <https://en.wikipedia.org/wiki/MessagePack>`_ rules.
2428

2529
To see the hexadecimal bytes of the given WAL file, use the ``hexdump`` command:
2630

@@ -102,7 +106,7 @@ It is possible to turn the write-ahead log completely off, by setting the ``wal_
102106
Even without the write-ahead log it's still possible to take a persistent copy of the
103107
entire data set with the :ref:`box.snapshot() <box-snapshot>` request.
104108

105-
An .xlog file always contains changes based on the primary key.
109+
An ``.xlog`` file always contains changes based on the primary key.
106110
Even if the client requested an update or delete using
107111
a secondary key, the record in the .xlog file contains the primary key.
108112

@@ -111,12 +115,14 @@ a secondary key, the record in the .xlog file contains the primary key.
111115
The snapshot file format
112116
------------------------
113117

114-
The format of a snapshot .snap file is similar to the format of a WAL .xlog file, except for the header and content.
115-
The snapshot header contains the instance's global unique identifier
116-
and the snapshot file's position in history, relative to earlier snapshot files.
117-
Also, the content differs: an .xlog file may contain records for any data-change
118-
requests (inserts, updates, upserts, and deletes), a .snap file may only contain records
119-
of inserts to memtx spaces.
118+
The format of a snapshot (``.snap``) file is the following:
119+
120+
* The snapshot header contains the instance's global unique identifier
121+
and the snapshot file's position in history, relative to earlier snapshot files.
122+
123+
* The snapshot content contains the records of inserts to memtx spaces.
124+
That differs from the content of an ``.xlog`` file that may contain records for any data-change requests
125+
(inserts, updates, upserts, and deletes).
120126

121127
Primarily, the records in the snapshot file have the following order:
122128

doc/reference/configuration/cfg_basic.rst

Lines changed: 12 additions & 8 deletions
@@ -100,9 +100,12 @@
100100

101101
Since version 1.7.4.
102102

103-
A directory where memtx stores snapshot (.snap) files. Can be relative to
104-
:ref:`work_dir <cfg_basic-work_dir>`. If not specified, defaults to
105-
``work_dir``. See also :ref:`wal_dir <cfg_basic-wal_dir>`.
103+
A directory where memtx stores snapshot (.snap) files.
104+
A relative path in this option is interpreted as relative to :ref:`work_dir <cfg_basic-work_dir>`.
105+
106+
By default, snapshots and WAL files are stored in the same directory.
107+
However, you can set different values for the ``memtx_dir`` and :ref:`wal_dir <cfg_basic-wal_dir>` options
108+
to store them on different physical disks for performance reasons.
106109

107110
| Type: string
108111
| Default: "."
@@ -230,11 +233,12 @@
230233

231234
Since version 1.6.2.
232235

233-
A directory where write-ahead log (.xlog) files are stored. Can be relative
234-
to :ref:`work_dir <cfg_basic-work_dir>`. Sometimes ``wal_dir`` and
235-
:ref:`memtx_dir <cfg_basic-memtx_dir>` are specified with different values, so
236-
that write-ahead log files and snapshot files can be stored on different
237-
disks. If not specified, defaults to ``work_dir``.
236+
A directory where write-ahead log (.xlog) files are stored.
237+
A relative path in this option is interpreted as relative to :ref:`work_dir <cfg_basic-work_dir>`.
238+
239+
By default, WAL files and snapshots are stored in the same directory.
240+
However, you can set different values for the ``wal_dir`` and :ref:`memtx_dir <cfg_basic-memtx_dir>` options
241+
to store them on different physical disks for performance reasons.
238242

239243
| Type: string
240244
| Default: "."

doc/reference/configuration/cfg_binary_logging_snapshots.rst

Lines changed: 3 additions & 4 deletions
@@ -130,15 +130,14 @@
130130

131131
Since version :doc:`2.6.3 </release/2.6.3>`.
132132

133-
The delay (in seconds) used to prevent the :ref:`Tarantool garbage collector <cfg_checkpoint_daemon-garbage-collector>`
133+
The delay in seconds used to prevent the :ref:`Tarantool garbage collector <cfg_checkpoint_daemon-garbage-collector>`
134134
from immediately removing :ref:`write-ahead log <internals-wal>` files after a node restart.
135135
This delay eliminates possible erroneous situations when the master deletes WALs
136136
needed by :ref:`replicas <replication-roles>` after restart.
137137
As a consequence, replicas sync with the master faster after its restart and
138138
don't need to download all the data again.
139-
140-
Once all the nodes in the replica set are up and running,
141-
automatic cleanup is started again even if ``wal_cleanup_delay`` has not expired.
139+
Once all the nodes in the replica set are up and running, a scheduled garbage collection is started again
140+
even if ``wal_cleanup_delay`` has not expired.
142141

143142
.. NOTE::
144143
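
In the YAML configuration covered by ``configuration_persistence.rst``, the counterpart of this delay is the ``wal.cleanup_delay`` option. A minimal sketch, assuming the option sits under an instance's ``wal`` section and using the 5-hour value from that document's example:

```yaml
# Sketch only: enclosing groups/replicasets/instances levels omitted.
wal:
  cleanup_delay: 18000   # 5 hours, in seconds
```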
