-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
gh-108021: In sqlite3, use 64-bit APIs for binding and returning text and blobs #108075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@serhiy-storchaka: I did not add or adjust existing tests yet. Existing tests:
|
@serhiy-storchaka: I adjusted the tests in d669736; I'm not sure if we have buildbots that can test this, though. |
Modules/_sqlite/connection.c
Outdated
#if SIZEOF_SIZE_T > 8 | ||
if (sz > 0xffffffffffffffff) { | ||
PyErr_SetString(PyExc_OverflowError, | ||
"string is longer than INT_MAX bytes"); | ||
"Error setting result value: " | ||
"string is longer than 2**64-1 bytes"); | ||
return -1; | ||
} | ||
sqlite3_result_text(context, str, (int)sz, SQLITE_TRANSIENT); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with this code is that it will be excluded from compilation on all current platforms where Python can be build and on all future platforms for long time. Therefore this code will degrade.
I do not know how better to solve this issue. Try to introduce a macro PYSQLITE_MAX_SIZE
for unsigned 0xffffffffffffffffu
, remove conditional compilation and cast sz
to unsigned size_t
before comparison. Check if any compiler emit a warning about always true condition. Now somebody at least will be able to test the code by temporary reducing the limit. At least we can be sure that the code is still syntactically correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, the SQLite API consistently uses unsigned uint_64
values, and the Python C APIs (involved here) consistently used signed ssize_t
values.
- if
ssize_t
is less than 8 bytes, we're ok; it will fit inuint_64
anyway - if
ssize_t
is 8 bytes or larger, we must check if the value is larger than2**63-1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also the code in _pysqlite_long_as_int64
(Modules/_sqlite/util.c).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote it. 😉
Anyway, it exclude from compilation simpler code.
What I propose here, to write either:
if ((size_t)sz > PYSQLITE_MAX_SIZE) {
where PYSQLITE_MAX_SIZE is defined as 0x7fffffffu (the current limit) or 0x7fffffffffffffffu (theoretical maximum), or
if ((sqlite3_uint64)sz != (size_t)sz) {
which just test the safety of integer conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks; I didn't have time to look at this since yesterday. Maybe I'll find time to return to it later today. Thanks for chiming in; these C data type issues are hard to get straight!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to your former suggestion.
Lib/test/test_sqlite3/test_types.py
Outdated
@unittest.skipUnless(sys.maxsize > 2**32, 'requires 64bit platform') | ||
@support.bigmemtest(size=2**31, memuse=4, dry_run=False) | ||
@unittest.skipUnless(sys.maxsize > 2**64, 'requires (s)size_t > 64bit') | ||
@support.bigmemtest(size=2**64+1, memuse=4, dry_run=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is not the purpose of these tests. It does not test that the code raises an expected exception. It tests that the code does not corrupt the data, does not raise unexpected exception and does not crash. What would happen if remove the overflow check.
How do existing tests fail with these changes applied? Do they stop raising DataError? It is good, we should change tests for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not test that the code raises an expected exception.
The test executes assertRaises
; it does test that an exception is raised. Yes, that also implies that no corruption has happened and that sqlite3 did not segfault.
The current test code does this:
- assert that DataError is raised when binding very large strings or buffers
- select rows from the test table and assert that no row was inserted
IOW, it verifies that sqlite3 raised an exception for large input, and that no rows were added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, now the behavior is changed, it was not unintentional, and the tests should be changed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[...] the tests should be changed as well.
That is part of the aim of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if left these tests as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, on my machine, they actually succeed (though I have to remove the bigmem decorators in order to force them to run), so I guess they are already broken.
Doc/whatsnew/3.13.rst
Outdated
sqlite3 | ||
------- | ||
|
||
* :mod:`sqlite3` now supports strings and BLOB sizes up to ``2**64-1`` bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the hardcoded limit is still 2**31-1
, according to the sources. So this change does not change much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there are both compile time and runtime limits that are in effect. Perhaps we can just consider this an implementation detail and remove the NEWS/What's New entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the NEWS/What's New entries; I guess this can be considered an internal change.
Modules/_sqlite/connection.c
Outdated
#if SIZEOF_SIZE_T > 8 | ||
if (sz > 0xffffffffffffffff) { | ||
PyErr_SetString(PyExc_OverflowError, | ||
"string is longer than INT_MAX bytes"); | ||
"Error setting result value: " | ||
"string is longer than 2**64-1 bytes"); | ||
return -1; | ||
} | ||
sqlite3_result_text(context, str, (int)sz, SQLITE_TRANSIENT); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote it. 😉
Anyway, it exclude from compilation simpler code.
What I propose here, to write either:
if ((size_t)sz > PYSQLITE_MAX_SIZE) {
where PYSQLITE_MAX_SIZE is defined as 0x7fffffffu (the current limit) or 0x7fffffffffffffffu (theoretical maximum), or
if ((sqlite3_uint64)sz != (size_t)sz) {
which just test the safety of integer conversion.
Lib/test/test_sqlite3/test_types.py
Outdated
@unittest.skipUnless(sys.maxsize > 2**32, 'requires 64bit platform') | ||
@support.bigmemtest(size=2**31, memuse=4, dry_run=False) | ||
@unittest.skipUnless(sys.maxsize > 2**64, 'requires (s)size_t > 64bit') | ||
@support.bigmemtest(size=2**64+1, memuse=4, dry_run=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if left these tests as is?
I'm marking this as a draft; after all it's too experimental to consider for landing in its current form. |
I don't have the time nor interest to follow this through. Feel free to pick it up. |
Uh oh!
There was an error while loading. Please reload this page.