Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
187 changes: 144 additions & 43 deletions docs/binlogging-replication-improvements.md
Original file line number Diff line number Diff line change
Expand Up @@ -118,27 +118,66 @@ SET binlog_ddl_skip_rewrite = ON;
/*comment at start*/DROP TABLE t /*comment at end*/;
```

## Binary log user defined functions
## Binary log user-defined functions

To implement Point in Time recovery, we have added the `binlog_utils_udf`. The following user-defined functions are included:
To implement Point in Time recovery, we have added the `binlog_utils_udf` plugin. These functions help you locate specific transactions in binary logs and determine which binary log files contain particular GTIDs, which is essential for precise point-in-time recovery operations.

| Name | Returns | Description |
| --- | --- | --- |
| get_binlog_by_gtid() | Binlog file name as STRING | Returns the binlog file name that contains the specified GTID |
| get_last_gtid_from_binlog() | GTID as STRING | Returns the last GTID found in the specified binlog |
| get_gtid_set_by_binlog() | GTID set as STRING | Returns all GTIDs found in the specified binlog |
| get_binlog_by_gtid_set() | Binlog file name as STRING | Returns the file name of the binlog which contains at least one GTID from the specified set. |
| get_first_record_timestamp_by_binlog() | Timestamp as INTEGER | Returns the timestamp of the first event in the specified binlog |
| get_last_record_timestamp_by_binlog() | Timestamp as INTEGER | Returns the timestamp of the last event in the specified binlog |
### Prerequisites

Before using these functions, ensure that:

* Binary logging is enabled on your MySQL server
* You have the `SYSTEM_VARIABLES_ADMIN` and `SERVICE_CONNECTION_ADMIN` privileges to install plugins
* You have read access to the binary log directory
* GTID-based replication is configured (for GTID-related functions)

### Installation

Before using the user-defined functions, you must install the plugin:

```{.bash data-prompt="mysql>"}
mysql> INSTALL PLUGIN binlog_utils_udf SONAME 'binlog_utils_udf.so';
```

After installation, you can verify the plugin is loaded by checking the `INFORMATION_SCHEMA.PLUGINS` table:

```{.bash data-prompt="mysql>"}
mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME = 'binlog_utils_udf';
```

### Understanding GTIDs

A Global Transaction Identifier (GTID) is a unique identifier for each transaction in a MySQL replication setup. GTIDs help ensure data consistency and enable precise point-in-time recovery. GTIDs follow the format `source_id:transaction_id`, where `source_id` is the server's UUID and `transaction_id` is a sequence number.

When using these user-defined functions, you must use CAST to return a result. For example:

```{.bash data-prompt="mysql>"}
mysql> SELECT CAST(get_last_gtid_from_binlog("binlog.0001") AS CHAR) as result;
```

### Available functions

The following user-defined functions are included:

| Name | Returns | Description | Use Case |
| --- | --- | --- | --- |
| [get_binlog_by_gtid()](#get_binlog_by_gtid) | Binlog file name as STRING | Returns the binlog file name that contains the specified GTID | Find which binary log file contains a specific transaction |
| [get_last_gtid_from_binlog()](#get_last_gtid_from_binlog) | GTID as STRING | Returns the last GTID found in the specified binlog | Determine the most recent transaction in a binary log file |
| [get_gtid_set_by_binlog()](#get_gtid_set_by_binlog) | GTID set as STRING | Returns all GTIDs found in the specified binlog | Get a complete list of transactions in a binary log file |
| [get_binlog_by_gtid_set()](#get_binlog_by_gtid_set) | Binlog file name as STRING | Returns the file name of the binlog which contains at least one GTID from the specified set | Find binary log files containing any of the specified transactions |
| [get_first_record_timestamp_by_binlog()](#get_first_record_timestamp_by_binlog) | Timestamp as INTEGER | Returns the timestamp of the first event in the specified binlog | Determine when a binary log file was first created |
| [get_last_record_timestamp_by_binlog()](#get_last_record_timestamp_by_binlog) | Timestamp as INTEGER | Returns the timestamp of the last event in the specified binlog | Determine when the last transaction was written to a binary log file |

!!! note

All functions returning timestamps return their values as microsecond precision UNIX time. In other words, they represent the number of microseconds since 1-JAN-1970.

All functions accepting a binlog name as the parameter accepts only short names, without a path component. If the path separator (‘/’) is found in the input, an error is returned. This serves the purpose of restricting the locations from where binlogs can be read. They are always read from the current binlog directory ([@@log_bin_basename system variable](https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_log_bin_basename)).
All functions accepting a binlog name as a parameter accept only short names, without a path component. If the path separator ('/') is found in the input, an error is returned. This restriction serves the purpose of limiting the locations from which binlogs can be read. They are always read from the current binlog directory ([@@log_bin_basename system variable](https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_log_bin_basename)).

All functions returning binlog file names return the name in short form, without a path component.

#### get_binlog_by_gtid

The basic syntax for `get_binlog_by_gtid()` is the following:

* get_binlog_by_gtid(string) [AS] alias
Expand All @@ -147,9 +186,9 @@ Usage: SELECT get_binlog_by_gtid(string) [AS] alias

Example:

```sql
CREATE FUNCTION get_binlog_by_gtid RETURNS STRING SONAME 'binlog_utils_udf.so';
SELECT get_binlog_by_gtid("F6F54186-8495-47B3-8D9F-011DDB1B65B3:1") AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_binlog_by_gtid RETURNS STRING SONAME 'binlog_utils_udf.so';
mysql> SELECT CAST(get_binlog_by_gtid("F6F54186-8495-47B3-8D9F-011DDB1B65B3:1") AS CHAR) AS result;
```
??? example "Expected output"

Expand All @@ -161,10 +200,12 @@ SELECT get_binlog_by_gtid("F6F54186-8495-47B3-8D9F-011DDB1B65B3:1") AS result;
+--------------+
```

```sql
DROP FUNCTION get_binlog_by_gtid;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_binlog_by_gtid;
```

#### get_last_gtid_from_binlog

The basic syntax for `get_last_gtid_from_binlog()` is the following:

* get_last_gtid_from_binlog(string) [AS] alias
Expand All @@ -173,9 +214,9 @@ Usage: SELECT get_last_gtid_from_binlog(string) [AS] alias

For example:

```sql
CREATE FUNCTION get_last_gtid_from_binlog RETURNS STRING SONAME 'binlog_utils_udf.so';
SELECT get_last_gtid_from_binlog("binlog.00001") AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_last_gtid_from_binlog RETURNS STRING SONAME 'binlog_utils_udf.so';
mysql> SELECT CAST(get_last_gtid_from_binlog("binlog.00001") AS CHAR) AS result;
```

??? example "Expected output"
Expand All @@ -188,10 +229,12 @@ SELECT get_last_gtid_from_binlog("binlog.00001") AS result;
+-----------------------------------------+
```

```sql
DROP FUNCTION get_last_gtid_from_binlog;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_last_gtid_from_binlog;
```

#### get_gtid_set_by_binlog

The basic syntax for `get_gtid_set_by_binlog()` is the following:

* get_gtid_set_by_binlog(string) [AS] alias
Expand All @@ -200,9 +243,9 @@ Usage: SELECT get_gtid_set_by_binlog(string) [AS] alias

For example:

```sql
CREATE FUNCTION get_gtid_set_by_binlog RETURNS STRING SONAME 'binlog_utils_udf.so';
SELECT get_gtid_set_by_binlog("binlog.00001") AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_gtid_set_by_binlog RETURNS STRING SONAME 'binlog_utils_udf.so';
mysql> SELECT CAST(get_gtid_set_by_binlog("binlog.00001") AS CHAR) AS result;
```
??? example "Expected output"

Expand All @@ -214,10 +257,12 @@ SELECT get_gtid_set_by_binlog("binlog.00001") AS result;
+-------------------------+
```

```sql
DROP FUNCTION get_gtid_set_by_binlog;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_gtid_set_by_binlog;
```

#### get_binlog_by_gtid_set

The basic syntax for `get_binlog_by_gtid_set()` is the following:


Expand All @@ -227,9 +272,9 @@ Usage: SELECT get_binlog_by_gtid_set(string) [AS] alias

Example:

```sql
CREATE FUNCTION get_binlog_by_gtid_set RETURNS STRING SONAME 'binlog_utils_udf.so';
SELECT get_binlog_by_gtid_set("11ea-b9a7:7,11ea-b9a7:8") AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_binlog_by_gtid_set RETURNS STRING SONAME 'binlog_utils_udf.so';
mysql> SELECT CAST(get_binlog_by_gtid_set("11ea-b9a7:7,11ea-b9a7:8") AS CHAR) AS result;
```
??? example "Expected output"

Expand All @@ -241,10 +286,12 @@ SELECT get_binlog_by_gtid_set("11ea-b9a7:7,11ea-b9a7:8") AS result;
+---------------------------------------------------------------+
```

```sql
DROP FUNCTION get_binlog_by_gtid_set;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_binlog_by_gtid_set;
```

#### get_first_record_timestamp_by_binlog

The basic syntax for `get_first_record_timestamp_by_binlog()` is the following:

* get_first_record_timestamp_by_binlog(TIMESTAMP) [AS] alias
Expand All @@ -253,9 +300,9 @@ Usage: SELECT get_first_record_timestamp_by_binlog(TIMESTAMP) [AS] alias

For example:

```sql
CREATE FUNCTION get_first_record_timestamp_by_binlog RETURNS INTEGER SONAME 'binlog_utils_udf.so';
SELECT FROM_UNIXTIME(get_first_record_timestamp_by_binlog("bin.00003") DIV 1000000) AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_first_record_timestamp_by_binlog RETURNS INTEGER SONAME 'binlog_utils_udf.so';
mysql> SELECT FROM_UNIXTIME(CAST(get_first_record_timestamp_by_binlog("bin.00003") AS UNSIGNED) DIV 1000000) AS result;
```

??? example "Expected output"
Expand All @@ -264,14 +311,16 @@ SELECT FROM_UNIXTIME(get_first_record_timestamp_by_binlog("bin.00003") DIV 10000
+---------------------+
| result |
+=====================+
| 2020-12-03 09:10:40 |
| 2024-12-03 09:10:40 |
+---------------------+
```

```sql
DROP FUNCTION get_first_record_timestamp_by_binlog;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_first_record_timestamp_by_binlog;
```

#### get_last_record_timestamp_by_binlog

The basic syntax for `get_last_record_timestamp_by_binlog()` is the following:

* get_last_record_timestamp_by_binlog(TIMESTAMP) [AS] alias
Expand All @@ -280,9 +329,9 @@ Usage: SELECT get_last_record_timestamp_by_binlog(TIMESTAMP) [AS] alias

For example:

```sql
CREATE FUNCTION get_last_record_timestamp_by_binlog RETURNS INTEGER SONAME 'binlog_utils_udf.so';
SELECT FROM_UNIXTIME(get_last_record_timestamp_by_binlog("bin.00003") DIV 1000000) AS result;
```{.bash data-prompt="mysql>"}
mysql> CREATE FUNCTION get_last_record_timestamp_by_binlog RETURNS INTEGER SONAME 'binlog_utils_udf.so';
mysql> SELECT FROM_UNIXTIME(CAST(get_last_record_timestamp_by_binlog("bin.00003") AS UNSIGNED) DIV 1000000) AS result;
```

??? example "Expected output"
Expand All @@ -291,14 +340,66 @@ SELECT FROM_UNIXTIME(get_last_record_timestamp_by_binlog("bin.00003") DIV 100000
+---------------------+
| result |
+=====================+
| 2020-12-04 04:18:56 |
| 2024-12-04 04:18:56 |
+---------------------+
```

```sql
DROP FUNCTION get_last_record_timestamp_by_binlog;
```{.bash data-prompt="mysql>"}
mysql> DROP FUNCTION get_last_record_timestamp_by_binlog;
```

### Troubleshooting

Common issues and solutions:

Function returns NULL or error:

* Ensure the binary log file exists in the current binlog directory

* Verify you have read permissions on the binary log files

* Check that the binary log file name is correct and does not include a path

Plugin installation fails:

* Verify you have the `SYSTEM_VARIABLES_ADMIN` and `SERVICE_CONNECTION_ADMIN` privileges

* Ensure the `binlog_utils_udf.so` file exists in the plugin directory

* Check that the plugin is compatible with your MySQL version

GTID format errors:

* Ensure GTIDs follow the correct format: `source_id:transaction_id`

* Verify that GTID-based replication is enabled on your server

Timestamp conversion issues:

* Remember that timestamps are returned in microseconds since Unix epoch

* Use `FROM_UNIXTIME()` with division by 1000000 to convert to readable format

Character set and display issues:

* UDFs developed before MySQL's UDF API character set enhancements used to not care about character set of STRING return values and it was always "binary"

* Oracle introduced the `--binary-as-hex` command line option and made it TRUE by default in interactive mode

* If you see unexpected hexadecimal output from UDF functions, start the mysql client with `--binary-as-hex=FALSE` or add this parameter to the client section of your MySQL config file

* The `--binary-as-hex=FALSE` option only affects how binary data is displayed in the output. You still need to use CAST operations (for example, `CAST(function_name() AS CHAR)`) for the UDF functions to work properly, regardless of this setting

### Uninstalling the plugin

To uninstall the `binlog_utils_udf` plugin, use the following command:

```{.bash data-prompt="mysql>"}
mysql> UNINSTALL PLUGIN binlog_utils_udf;
```

The plugin cannot be disabled without uninstalling. When uninstalled, all user-defined functions provided by the plugin are automatically removed and become unavailable.

## Limitations

For the following variables, do not define values with one or more dot (.) characters:
Expand Down
Loading