Make building of search results work for multi-byte encoded characters #3113

Kristian-Krastev · 2021-12-15T15:46:58Z

When a search is run, the text shown in each result snippet is taken from the database tables 'bookshelves', 'books', 'chapters' and 'pages', but the string data in them may be stored in an encoding that does not map one character to one byte. For example, in UTF-8 a Cyrillic character typically takes two bytes, and some characters take up to four.

The string operations used in this context are not multi-byte safe, so the returned snippets contain 'broken' characters.
I think the multibyte string functions are a good way to solve that problem.
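
As a rough illustration of the failure mode (a Python sketch, not the code changed in this PR; the sample string and variable names are chosen only for the example), assume the text is stored as UTF-8. Cutting at a byte offset can split a character in half, while cutting at a character offset cannot:

```python
# Illustration only: byte-based vs character-based truncation of UTF-8 text.
text = "Младо маняче"

raw = text.encode("utf-8")

# Each of these Cyrillic letters takes two bytes in UTF-8,
# so the byte length is larger than the character length.
char_count = len(text)
byte_count = len(raw)

# Cutting at an odd byte offset splits the second letter in half;
# the incomplete sequence decodes to the replacement character U+FFFD.
broken = raw[:3].decode("utf-8", errors="replace")

# Cutting by characters keeps every letter intact.
safe = text[:3]

print(char_count, byte_count)
print(repr(broken))
print(repr(safe))
```

The same contrast applies to any language whose default string functions count bytes rather than characters.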

ssddanbrown · 2021-12-15T15:55:18Z

Thanks @Kristian-Krastev for offering this PR. Could you provide a minimal example of content and search term that causes breakage? Would help so I can add a test case to prevent regression.

Kristian-Krastev · 2021-12-15T16:21:15Z

Sure,
I have created a page with the content:

На мен ми трябва нещо добро
Вкарай ги готовите в мойто число
Младо маняче за милион
Накрая да забравиме кво е било

and in the search input field I enter the third line of that content:

Младо маняче за милион
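
A regression check for this case could look roughly like the Python sketch below. The `build_snippet` helper, its `context` parameter and the `<strong>` highlight markup are assumptions made for illustration, not the project's implementation. It locates the term by character offset and checks that the highlighted snippet contains the full term with no replacement characters:

```python
# Hypothetical snippet builder; the name, parameters and <strong> markup
# are assumptions for illustration only.
def build_snippet(content: str, term: str, context: int = 10) -> str:
    """Return text around the first match of `term`, with the match highlighted."""
    pos = content.find(term)  # character offset, not byte offset
    if pos == -1:
        # No match: fall back to the start of the content.
        return content[: context * 2]
    start = max(0, pos - context)
    end = min(len(content), pos + len(term) + context)
    before = content[start:pos]
    after = content[pos + len(term):end]
    return f"{before}<strong>{term}</strong>{after}"


content = (
    "На мен ми трябва нещо добро\n"
    "Вкарай ги готовите в мойто число\n"
    "Младо маняче за милион\n"
    "Накрая да забравиме кво е било"
)
term = "Младо маняче за милион"

snippet = build_snippet(content, term)
print(snippet)
```

Because every offset here counts characters, the context on either side of the match always starts and ends on a whole letter.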

ssddanbrown · 2021-12-18T10:44:50Z

Thanks for confirming the content! PR now merged for next feature release.

Make building of search results work for multi-byte encoded characters

d0fd1b7

ssddanbrown added a commit that referenced this pull request Dec 18, 2021

Added test case for multibyte search highlighting

c6e3e85

Related to #3113

ssddanbrown merged commit 5c5a3de into BookStackApp:master Dec 18, 2021

ssddanbrown added this to the Next Feature Release milestone Dec 18, 2021

ssddanbrown added 🐛 Bug 🏭 Back-End 🛠️ Enhancement labels Dec 18, 2021
