[Config] Document 'memtx' configuration settings #4038

Merged 9 commits on Mar 13, 2024.

2 changes: 1 addition & 1 deletion doc/book/box/limitations.rst
@@ -77,7 +77,7 @@ Limitations
**Space size**

The total maximum size for all spaces is in effect set by
:ref:`memtx_memory <cfg_storage-memtx_memory>`, which in turn
:ref:`memtx.memory <configuration_reference_memtx_memory>`, which in turn
is limited by the total available memory.

.. _limitations_update_ops:
@@ -0,0 +1,14 @@
memtx:
  memory: 1073741824
  min_tuple_size: 8
  max_tuple_size: 5242880

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
@@ -0,0 +1 @@
instance001:
27 changes: 15 additions & 12 deletions doc/concepts/atomic/thread_model.rst
@@ -64,19 +64,22 @@ Supplementary threads

There are also several supplementary threads that serve additional capabilities:

* For :ref:`replication <replication-architecture>`, Tarantool creates a separate thread for each connected replica.
This thread reads a write-ahead log and sends it to the replica, following its position in the log.
Separate threads are required because each replica can point to a different position in the log and can run at different speeds.
* For :ref:`replication <replication-architecture>`, Tarantool creates a separate thread for each connected replica.
This thread reads a write-ahead log and sends it to the replica, following its position in the log.
Separate threads are required because each replica can point to a different position in the log and can run at different speeds.

* There is a thread pool for ad hoc asynchronous tasks,
such as a DNS resolver or :ref:`fsync <configuration_reference_wal_mode>`.
* There is a thread pool for ad hoc asynchronous tasks, such as a DNS resolver or :ref:`fsync <configuration_reference_wal_mode>`.

* There are OpenMP threads used to parallelize sorting
(hence, to parallelize building :ref:`indexes <concepts-data_model_indexes>`).
For example, this is applicable when Tarantool is restoring from a
:ref:`snapshot <internals-snapshot>` with a large amount of data
and needs to sort a secondary index if it is ordered by something other than the primary order.
* There is a thread pool that can be used for parallel sorting (hence, to parallelize building :ref:`indexes <concepts-data_model_indexes>`).
To configure it, use the :ref:`memtx.sort_threads <configuration_reference_memtx_sort_threads>` configuration option.
The option sets the number of threads used to sort keys of secondary indexes on loading a ``memtx`` database.

.. note::
.. note_drop_openmp_start

The maximum number of OpenMP threads can be controlled by the ``OMP_NUM_THREADS`` environment variable.
.. NOTE::

Since :doc:`3.0.0 </release/3.0.0>`, this option replaces the earlier approach, in which OpenMP threads were used to parallelize sorting.
For backward compatibility, the ``OMP_NUM_THREADS`` environment variable is still taken into account when
setting the number of sorting threads.

.. note_drop_openmp_end
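
A minimal sketch of how the ``memtx.sort_threads`` option might look in a YAML configuration.
The value ``4`` is an illustrative assumption, not a recommendation; choose a number that fits the CPU cores available to the instance:

.. code-block:: yaml

   memtx:
     # Hypothetical value: use four threads to sort keys of secondary
     # indexes when loading a memtx database.
     sort_threads: 4
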
22 changes: 12 additions & 10 deletions doc/concepts/configuration.rst
@@ -393,7 +393,7 @@ The example below shows how to set a listening IP address for ``instance001`` to
You can learn more from the :ref:`configuration_connections` topic.


.. _configuration_options_access_control:
.. _configuration_options_access_control:

Access control
~~~~~~~~~~~~~~
@@ -410,22 +410,25 @@ In the example below, a ``dbadmin`` user with the specified password is created:
To learn more, see the :ref:`configuration_credentials` section.


.. _configuration_options_memory:
.. _configuration_options_memory:

Memory
~~~~~~

The ``memtx.memory`` option specifies how much :ref:`memory <engines-memtx>` Tarantool allocates to actually store data.
The :ref:`memtx.memory <configuration_reference_memtx_memory>` option specifies how much :ref:`memory <engines-memtx>`
Tarantool allocates to actually store data.

.. code-block:: yaml

memtx:
memory: 100000000
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/memtx/config.yaml
:language: yaml
:start-at: memtx:
:end-at: 1073741824
:dedent:

When the limit is reached, ``INSERT`` or ``UPDATE`` requests fail with :ref:`ER_MEMORY_ISSUE <admin-troubleshoot-memory-issues>`.

Learn more: :ref:`In-memory storage configuration <configuration_memtx>`.

.. _configuration_options_directories:
.. _configuration_options_directories:

Snapshots and write-ahead logs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -446,13 +449,12 @@ To learn more about the persistence mechanism in Tarantool, see the :ref:`Persis
Read more about snapshot and WAL configuration: :ref:`Persistence <configuration_persistence>`.




.. toctree::
:hidden:

configuration/configuration_etcd
configuration/configuration_code
configuration/configuration_memtx
configuration/configuration_persistence
configuration/configuration_connections
configuration/configuration_credentials
2 changes: 1 addition & 1 deletion doc/concepts/configuration/configuration_code.rst
@@ -356,7 +356,7 @@ In the :ref:`next section <configuration_code_iproto-encryption-config-sc>`, you

Example:

.. literalinclude:: /code_snippets/snippets/replication/instances.enabled/ssl/myapp.lua
.. literalinclude:: /code_snippets/snippets/replication/instances.enabled/ssl_with_ca/myapp.lua
:language: lua
:start-at: net.box
:end-before: return connection
65 changes: 65 additions & 0 deletions doc/concepts/configuration/configuration_memtx.rst
@@ -0,0 +1,65 @@
.. _configuration_memtx:

In-memory storage
=================

**Example on GitHub**: `memtx <https://github.com/tarantool/doc/tree/latest/doc/code_snippets/snippets/config/instances.enabled/memtx>`_

In Tarantool, all data is stored in random-access memory (RAM) by default.
For this purpose, the :ref:`memtx <engines-memtx>` storage engine is used.

This topic describes how to define basic settings related to in-memory storage in the
:ref:`memtx <configuration_reference_memtx>` section of a :ref:`YAML configuration <configuration>`
-- for example, :ref:`memory size <configuration_reference_memtx_memory>` and :ref:`maximum tuple size <configuration_reference_memtx_max_size>`.
For settings specific to the allocator or sorting threads,
check the corresponding ``memtx`` options in the :ref:`Configuration reference <configuration_reference_memtx>`.
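
For instance, an allocator setting could be sketched as shown below.
This fragment is an illustration only: it assumes that the ``memtx.allocator`` option accepts the ``system`` value,
so check the configuration reference for the exact set of values and defaults before using it:

.. code-block:: yaml

   memtx:
     # Assumed value: switch from the default allocator to the system one.
     allocator: 'system'
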

.. NOTE::

To estimate the required amount of memory, you can use the
`sizing calculator <https://www.tarantool.io/en/sizing_calculator/>`_.

.. _configuration_memtx-memory:

Memory size
-----------

In Tarantool, data is stored in spaces.
Each space consists of tuples -- the database records.
To specify the amount of memory that Tarantool allocates to store tuples, use the
:ref:`memtx.memory <configuration_reference_memtx_memory>` configuration option.

In the example below, the memory size is set to 1 GB (1073741824 bytes):

.. literalinclude:: /code_snippets/snippets/config/instances.enabled/memtx/config.yaml
:language: yaml
:start-at: memtx:
:end-at: 1073741824
:dedent:

The server does not exceed this limit when allocating tuples.
Indexes and connection information use additional memory.

When the ``memtx.memory`` limit is reached, ``INSERT`` or ``UPDATE`` requests fail with
:ref:`ER_MEMORY_ISSUE <admin-troubleshoot-memory-issues>`.
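
As an illustration of how a byte value is derived, the sketch below sets a hypothetical 2 GB limit
(2 * 1024 * 1024 * 1024 = 2147483648 bytes). The size is an assumed example, not a recommendation:

.. code-block:: yaml

   memtx:
     # Hypothetical limit: 2 GB = 2 * 1024 * 1024 * 1024 bytes
     memory: 2147483648
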

.. _configuration_memtx-tuple-size:

Tuple size
----------

You can configure the minimum and the maximum tuple sizes in bytes.

* If the tuples are small, you can decrease the minimum size.
* If the tuples are large, you can increase the maximum size.

To define the tuple size, use the :ref:`memtx.min_tuple_size <configuration_reference_memtx_min_size>` and
:ref:`memtx.max_tuple_size <configuration_reference_memtx_max_size>` configuration options.

In the example below, the minimum size is set to 8 bytes and the maximum size is set to 5 MB (5242880 bytes):

.. literalinclude:: /code_snippets/snippets/config/instances.enabled/memtx/config.yaml
:language: yaml
:start-at: memtx:
:end-at: 5242880
:dedent:
2 changes: 1 addition & 1 deletion doc/concepts/configuration/configuration_persistence.rst
@@ -185,7 +185,7 @@ size. The configuration for this option might look as follows:
:end-at: 268435456
:dedent:

.. _configuration_persistence_wal_rescan:
.. _configuration_persistence_cleanup_delay:

Set a delay for the garbage collector
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down
39 changes: 22 additions & 17 deletions doc/concepts/engines/memtx.rst
@@ -3,17 +3,22 @@
Storing data with memtx
=======================

The ``memtx`` storage engine is used in Tarantool by default. It keeps all data in random-access memory (RAM), and therefore has very low read latency.
The ``memtx`` storage engine is used in Tarantool by default.
The engine keeps all data in random-access memory (RAM), and therefore has a low read latency.

The obvious question here is:
if all the data is stored in memory, how can you prevent the data loss in case of emergency such as outage or Tarantool instance failure?
Tarantool prevents data loss in an emergency, such as an outage or a Tarantool instance failure, in the following ways:

First of all, Tarantool persists all data changes by writing requests to the write-ahead log (WAL) that is stored on disk.
Read more about that in the :ref:`memtx-persist` section.
In case of a distributed application, there is an option of synchronous replication that ensures keeping the data consistent on a quorum of replicas.
Although replication is not directly a storage engine topic, it is a part of the answer regarding data safety. Read more in the :ref:`memtx-replication` section.
* Tarantool persists all data changes by writing requests to the :ref:`write-ahead log <internals-wal>` (WAL)
that is stored on disk. Tarantool also periodically takes a
:doc:`snapshot </reference/reference_lua/box_snapshot>` of the entire database and saves it on disk.
Learn more: :ref:`Data persistence <memtx-persist>`.

In this chapter, the following topics are discussed in brief with the references to other chapters that explain the subject matter in details.
* In a distributed application, synchronous replication can be used to keep the data consistent on a quorum of replicas.
Although replication is not directly a storage engine topic, it is a part of the answer regarding data safety.
Learn more: :ref:`Replicating data <memtx-replication>`.

In this section, the following topics are discussed briefly, with references to other sections that explain the
subject matter in detail.

.. contents::
:local:
@@ -43,7 +48,7 @@ Within the TX thread, there is a memory area allocated for Tarantool to store da

.. image:: memtx/arena2.svg

Data is stored in :term:`spaces <space>`. Spaces contain database records:term:`tuples <tuple>`.
Data is stored in :term:`spaces <space>`. Spaces contain database records -- :term:`tuples <tuple>`.
To access and manipulate the data stored in spaces and tuples, Tarantool builds :doc:`indexes </concepts/data_model/indexes>`.

Special `allocators <https://github.com/tarantool/small>`__ manage memory allocations for spaces, tuples, and indexes within the Arena.
@@ -52,43 +57,43 @@ Tarantool has a built-in module called ``box.slab`` which provides the slab allo
that can be used to monitor the total memory usage and memory fragmentation.
For more details, see the ``box.slab`` module :doc:`reference </reference/reference_lua/box_slab>`.

.. image:: memtx/spaces_indexes.svg
.. image:: memtx/spaces_indexes.svg

Also inside the TX thread, there is an event loop. Within the event loop, there are a number of :ref:`fibers <fiber-fibers>`.
Fibers are cooperative primitives that allow interaction with spaces, that is, reading and writing the data.
Fibers can interact with the event loop and between each other directly or by using special primitives called channels.
Due to the usage of fibers and :ref:`cooperative multitasking <app-cooperative_multitasking>`, the ``memtx`` engine is lock-free in typical situations.

.. image:: memtx/fibers-channels.svg
.. image:: memtx/fibers-channels.svg

To interact with external users, there is a separate :ref:`network thread <thread_model>` also called the **iproto thread**.
The iproto thread receives a request from the network, parses and checks the statement,
and transforms it into a special structure—a message containing an executable statement and its options.
Then the iproto thread ships this message to the TX thread and runs the user's request in a separate fiber.

.. image:: memtx/iproto.svg
.. image:: memtx/iproto.svg

.. _memtx-persist:

Data persistence
----------------

To ensure :ref:`data persistence <index-box_persistence>`, Tarantool does two things.
Tarantool ensures :ref:`data persistence <index-box_persistence>` as follows:

* After executing data change requests in memory, Tarantool writes each such request to the :ref:`write-ahead log (WAL) <internals-wal>` files (``.xlog``)
that are stored on disk. Tarantool does this via a separate thread called the **WAL thread**.

.. image:: memtx/wal.svg
.. image:: memtx/wal.svg

* Tarantool periodically takes the entire :doc:`database snapshot </reference/reference_lua/box_snapshot>` and saves it on disk.
Snapshots speed up instance restart: when there are too many WAL files, it can be difficult for Tarantool to restart quickly.

To save a snapshot, there is a special fiber called the **snapshot daemon**.
To save a snapshot, there is a special fiber called the :ref:`snapshot daemon <configuration_persistence_checkpoint_daemon>`.
It reads the consistent content of the entire Arena and writes it on disk into a snapshot file (``.snap``).
Due to cooperative multitasking, Tarantool cannot write directly to disk because it is a locking operation.
That is why Tarantool interacts with disk via a separate pool of threads from the :doc:`fio </reference/reference_lua/fio>` library.

.. image:: memtx/snapshot03.svg
.. image:: memtx/snapshot03.svg

So, even in emergency situations such as an outage or a Tarantool instance failure,
when the in-memory database is lost, the data can be restored fully during Tarantool restart.
@@ -150,7 +155,7 @@ For more information on replication, refer to the :ref:`corresponding chapter <r
.. _memtx-summary:

Summary
--------
-------

The main key points describing how the in-memory storage engine works can be summarized in the following way:

4 changes: 2 additions & 2 deletions doc/enterprise/wal_extensions.rst
@@ -23,9 +23,9 @@ Inside the ``wal.ext`` block, you can enable storing old and new tuples as follo
:ref:`wal.ext.old <configuration_reference_wal_ext_old>` and :ref:`wal.ext.new <configuration_reference_wal_ext_new>`
options to ``true``:

.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence/config.yaml
.. literalinclude:: /code_snippets/snippets/config/instances.enabled/persistence_wal/config.yaml
:language: yaml
:start-at: wal:
:start-at: ext:
:end-at: old: true
:dedent:

2 changes: 1 addition & 1 deletion doc/how-to/sql/improving_mysql.rst
@@ -288,7 +288,7 @@ MySQL and Tarantool are now set up. You can proceed to configure the replicator.
- [8, 'Slim', 'Benny', 'snake']
- [9, 'Puffball', 'Diane', 'hamster']

.. _improving_mysql-replicator:
.. _improving_mysql-test-replication:

Testing the replication
-----------------------