Conversation

@pedr (Contributor) commented Mar 28, 2025

Resolves #11608

Summary

By lowering Tesseract's confidence threshold, we can increase the amount of text it outputs.

Here is a comparison of different outputs from the following image:

image

Confidence: 70% (before this PR)
Output:
    |! 1.751 eback Mountain (2005) | | |

Confidence: 60%
Output:
    |! 1.751 eback Mountain (2005) | |
    4 3.Z1i5 RZ) Love & Other Drugs (2010)

Confidence: 55% (this PR)
Output:
    |! 1.751 eback Mountain (2005) | |
    . 2.0 f Havoc (2005) i
    4 3.Z1i5 RZ) Love & Other Drugs (2010)
    _ 4.53% H9 \I The Last Thing He Wanted (2020) 1 |

Confidence: 50%
Output:
    |! 1.751 eback Mountain (2005) | |
    . 2.0 f Havoc (2005) i
    4 3.Z1i5 RZ) Love & Other Drugs (2010)
    _ 4.53% H9 I The Last Thing He Wanted (2020) 1 |
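
To illustrate why a lower threshold yields more lines, here is a minimal sketch of threshold filtering. It is not the actual Joplin code: the `RecognizedLine` type, the `filterByConfidence` helper, and the per-line confidence values are all assumptions made up for the example (the PR does not report per-line scores). The values are chosen only so that the line counts match the comparison above.

```typescript
// Hypothetical shape of one recognized line; not Joplin's or Tesseract's actual type.
interface RecognizedLine {
    text: string;
    confidence: number; // assumed to range from 0 to 100
}

// Keep only the lines whose confidence is at or above the threshold.
function filterByConfidence(lines: RecognizedLine[], minConfidence: number): RecognizedLine[] {
    return lines.filter(line => line.confidence >= minConfidence);
}

// Illustrative confidence values only; they are not taken from the PR.
const sampleLines: RecognizedLine[] = [
    { text: 'eback Mountain (2005)', confidence: 72 },
    { text: 'Love & Other Drugs (2010)', confidence: 63 },
    { text: 'Havoc (2005)', confidence: 57 },
    { text: 'The Last Thing He Wanted (2020)', confidence: 56 },
];

const keptAt70 = filterByConfidence(sampleLines, 70).length;
const keptAt55 = filterByConfidence(sampleLines, 55).length;
console.log(keptAt70, keptAt55); // prints: 1 4
```

With these assumed scores, a threshold of 70 keeps one line and a threshold of 55 keeps all four, which mirrors the trade-off in the comparison: more text is recovered, at the cost of accepting noisier lines.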

@pedr pedr added the desktop All desktop platforms label Mar 28, 2025
@pedr pedr requested a review from laurent22 March 28, 2025 19:33
@laurent22 (Owner) commented

We'd need a unit test, maybe with that one image. There are already some examples of OCR tests, so you can just add one.


// Empirically, it seems anything below 70 is not usable. Between 70 and 75 it's
// hit and miss, but often it's good enough that we should keep the result.
// Above this is usually reliable.
Owner

So how about updating the comment?

@laurent22 laurent22 merged commit 050871b into laurent22:dev Apr 3, 2025
11 checks passed
@pedr pedr deleted the decrease-tesseract-confidence branch April 7, 2025 15:56
Development

Successfully merging this pull request may close these issues.

Consider decreasing the confidence threshold of Tesseract

2 participants