No double writing of lobs #2483

akostadinov · 2025-12-21T17:23:17Z

This is an important change, not only because of the improved performance but also because it is needed for compatibility with a potential improvement to ruby-oci8's handling of LOBs, see kubo/ruby-oci8#271. Please read that upstream pull request and the linked kubo/ruby-oci8#230 for an explanation of a huge memory bloat issue that occurs when many queries retrieve LOBs.

As for the immediate effect on this project, I think it is best to read the detailed commit messages of the two commits in this PR. In a nutshell:

write_lobs() has been updating the inserted LOBs a second time, after a perfectly valid insertion using OCI8::BLOB.new (or CLOB or NCLOB), so it was completely redundant.
The #insert_fixtures_set method had to be implemented using prepared statements, because the Rails default relies on @conn.quote("value") for everything, including LOBs, and that has a limitation of only 2k characters for BLOBs without MAX_STRING_SIZE=EXTENDED, although (N)CLOBs can still be quite large in either configuration with the implementation I propose here.
@conn.quote("value") has been updated to support LOBs instead of using empty_(bnc)lob(), and it works quite well now. It is questionable whether that method has valid use cases, but it is still better to have it working to avoid surprises in edge cases.

With regards to trying out all the improvements of kubo/ruby-oci8#271 and this PR with a monkey patch in your project, you can just add to your initializers something like:

    # disable the write_lobs callback
    ActiveRecord::Base.skip_callback(:update, :after, :enhanced_write_lobs)

    # retrieve LOBs as LONG and LONGRAW
    OCI8::BindType::Mapping[:clob] = OCI8::BindType::Long
    OCI8::BindType::Mapping[:blob] = OCI8::BindType::LongRaw

    # disable ruby-oci8 static array fetching when lobs are detected
    module OCI8DisableArrayFetch
      private
      def define_one_column(pos, param)
        @fetch_array_size = nil # disable memory array fetching anytime
        super # call original
      end
    end
    
    OCI8::Cursor.prepend(OCI8DisableArrayFetch)

Additionally, I'm planning to submit another PR that would clean up the cursor cache when connections are returned to the ActiveRecord pool, because even with the above improvements memory consumption is still high with certain hard-to-reproduce workloads. Reducing the cached statements limit also doesn't help consistently, so I will make it a configuration option. But to try it out:

    # clean-up prepared statements/cursors on connection return to pool
    module OracleStatementCleanup
      def self.included(base)
        base.set_callback :checkin, :after, :close_and_clear_statements
      end

      def close_and_clear_statements
        @statements&.clear
      end
    end

    ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.include OracleStatementCleanup

I will appreciate any feedback!

P.S. I filed this PR against the release80 branch because master is broken and tests cannot pass there, but it can be happily applied on top of #2480.

I couldn't find any normal code path where inline quoting of LOBs would take place. But it might be used by people who generate raw SQL queries, so I think it is worth having it work somehow instead of silently producing empty data.

For example, `#build_fixture_sql` and `#build_fixture_statements` in ActiveRecord::ConnectionAdapters::DatabaseStatements use it, but we override `#insert_fixture` and `#insert_fixtures_set`, which use these.

Actually, this `empty_nclob()` does not even work, so perhaps this code path has not been used for a very long time. Using it, I get an error:

> ORA-00904: "EMPTY_NCLOB": invalid identifier

In the past such a fix was rejected (rsim#1588), but I think this empty quoting was originally intended to work around writing the quoted value and then having the `write_lob()` method insert it again. Nowadays `write_lob()` is actually redundant and is removed in another commit.

So basically we have these as a last-resort effort to make things work for niche use cases, with a significant limitation on BLOB length in particular.

In the very distant past, it was not possible to insert LOBs together with the row, so this complicated callback method was necessary. Initially an `empty_clob()` was inserted, and then the callbacks would call `write_lobs()` to insert the actually desired LOBs. But nowadays it just performs a second, redundant insert.

With prepared statements, the `OCI8::CLOB/BLOB/NCLOB` objects are used, and the LOBs get inserted from the beginning, as they should be. This also supports large LOBs: presently up to 2GB - 8 bytes, but this can be improved even further by making ruby-oci8 use multiple `write` calls for the supplied argument instead of a single one, which fails for large values. Increasing the maximum supported string size is not a high priority though, given that the whole huge string needs to be in memory, possibly in multiple copies.

This is not a loss of functionality compared to `write_lob()` because:

* `write_lobs()` also used a single `write` call
* if `OCI8::BLOB` would crash, then the presence of `write_lobs()` wouldn't help
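
The idea of replacing one large `write` with several smaller ones can be sketched as follows. This is only an illustration, not ruby-oci8 code: `write_in_chunks` and its chunk size are hypothetical, and a `StringIO` stands in for the LOB object, which is assumed to respond to `write` and return the number of bytes written.

```ruby
require "stringio"

# Hypothetical helper: write +data+ to an IO-like target in fixed-size
# pieces instead of one large call. Returns the total bytes written.
def write_in_chunks(target, data, chunk_size: 1024 * 1024)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?

  written = 0
  while written < data.bytesize
    # byteslice cuts on byte offsets, which suits binary (BLOB) data
    written += target.write(data.byteslice(written, chunk_size))
  end
  written
end

# Stand-in for a LOB handle; no database is needed to see the idea.
lob = StringIO.new(String.new(encoding: Encoding::BINARY))
payload = "\x00\xFF".b * 5_000
bytes = write_in_chunks(lob, payload, chunk_size: 4096)
```

For character LOBs a byte-based split could cut a multibyte character in half, so a real implementation would have to slice on character boundaries instead.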

andynu · 2025-12-27T21:29:02Z

Hi @akostadinov,

Thank you for this PR and for identifying the double-write issue. You're right that when prepared_statements: true (the default), LOB data flows through type_cast() which creates temporary LOBs that Oracle handles directly during INSERT — making the subsequent write_lobs callback redundant.

However, this PR breaks functionality when prepared_statements: false.

The Issue

When prepared statements are disabled, the adapter can't bind LOB data as parameters. Instead, it must:

INSERT with empty_clob()/empty_blob() literals
Use the write_lobs callback to SELECT ... FOR UPDATE and write the LOB content

Without the callback, large LOBs fail with ORA-01704: string literal too long (Oracle's 4000-byte limit for literals).
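
The two steps above can be sketched in plain Ruby as the kind of SQL such a path would issue. The table, column, and helper names are hypothetical, and the statements the adapter actually generates may differ. The point is only that the INSERT carries an `empty_clob()` placeholder and the content arrives through a second statement.

```ruby
# Hypothetical illustration of the unprepared path: the LOB column gets an
# empty_clob() placeholder in the INSERT, and a follow-up
# SELECT ... FOR UPDATE locks the row so the LOB locator can be written to.
def unprepared_lob_statements(table:, id:, lob_column:)
  row_id = Integer(id) # only integers are interpolated into the SQL
  insert = "INSERT INTO #{table} (id, #{lob_column}) VALUES (#{row_id}, empty_clob())"
  select = "SELECT #{lob_column} FROM #{table} WHERE id = #{row_id} FOR UPDATE"
  [insert, select]
end

insert_sql, select_sql = unprepared_lob_statements(table: "posts", id: 1, lob_column: "body")
puts insert_sql
puts select_sql
```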

Test Branches

I've created some additional spec tests that demonstrate this behavior:

test-lob-prepared-vs-unprepared — baseline with the new spec (19/20 pass)
test-lob-with-pr2483-changes — same spec with your PR applied (17/20 pass)

The additional failures on the PR branch:

100KB BLOB creation → ORA-01704: string literal too long
SQL verification expecting empty_clob() + FOR UPDATE pattern

Path Forward

The double-write issue you've identified is real and worth fixing. Would you mind opening an issue to track it separately?

The challenge is finding a way to skip the write_lobs hooks when on the prepared statements path, while preserving them for the unprepared case.

When prepared_statements is enabled (the default), LOB data is already written during INSERT via temporary LOB binding in type_cast(). The write_lobs callback was redundantly writing the same data again via SELECT FOR UPDATE. This change skips the callback in the prepared statements path while preserving it for the unprepared path, where empty_clob()/empty_blob() literals require the callback to populate LOB data after INSERT. Fixes the double-write issue identified in rsim#2483.
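
The behaviour described above can be modelled without a database. The following is a minimal sketch with hypothetical stand-in classes (`FakeConnection`, `FakeRecord`); it is not the code from the referenced commit, only a picture of the intended outcome: one LOB write per save on either path.

```ruby
# Hypothetical stand-ins; no database or ActiveRecord is involved.
FakeConnection = Struct.new(:prepared_statements)

class FakeRecord
  attr_reader :lob_writes

  def initialize(connection)
    @connection = connection
    @lob_writes = 0
  end

  def save
    insert_row
    # Only the unprepared path needs the follow-up write, because its INSERT
    # contained an empty_clob()/empty_blob() placeholder.
    write_lobs unless @connection.prepared_statements
    self
  end

  private

  def insert_row
    # With prepared statements the LOB is bound and written as part of INSERT.
    @lob_writes += 1 if @connection.prepared_statements
  end

  def write_lobs
    @lob_writes += 1
  end
end

prepared   = FakeRecord.new(FakeConnection.new(true)).save
unprepared = FakeRecord.new(FakeConnection.new(false)).save
puts prepared.lob_writes   # number of LOB writes on the prepared path
puts unprepared.lob_writes # number of LOB writes on the unprepared path
```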

akostadinov · 2025-12-28T16:18:08Z

My line of thinking is that we should not be obliged to support just any code path, whether it makes sense or not.

If somebody wants to disable prepared statements, then they should live with the inherent limitations of that approach, which are 4000 chars for BLOBs, or 32k when MAX_STRING_SIZE=EXTENDED is set. For CLOBs and NCLOBs there is no such limitation with this implementation.

This write_lobs approach is still something that is not a normal SQL statement. If somebody wants pure inline SQL (is this the correct term?) for whatever reason, then write_lobs doesn't truly help.

We may somehow force LOBs to use prepared statements in all situations, to ensure things work with such a configuration, but then what is the point of disabling prepared statements in the first place?

In summary: pure Oracle inline SQL limits BLOBs (but not CLOBs) to 4k/32k. Disabling prepared statements means the user is requesting inline SQL, so they will be subject to this database server limitation. The normal code path of using prepared statements has no such limitation and has no drawbacks compared to inline SQL.
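
As a rough illustration of the limits mentioned here, the check below compares the length of an inline literal against the two thresholds. The helper name is hypothetical, the limits are assumed to be counted in bytes of the literal as written in the SQL text, and 32767 is used as the concrete value for the "32k" quoted in this thread.

```ruby
# Limits as quoted in this thread, assumed to apply to the literal's length
# in bytes: 4000 by default, 32767 with MAX_STRING_SIZE=EXTENDED.
INLINE_LITERAL_LIMITS = { standard: 4000, extended: 32_767 }.freeze

# Hypothetical helper: would a literal of this size fit inline in the SQL?
def fits_inline_literal?(literal_bytesize, max_string_size: :standard)
  literal_bytesize <= INLINE_LITERAL_LIMITS.fetch(max_string_size)
end

hundred_kb = 100 * 1024
small_ok    = fits_inline_literal?(3_000)
large_std   = fits_inline_literal?(hundred_kb)
large_ext   = fits_inline_literal?(hundred_kb, max_string_size: :extended)
puts [small_ok, large_std, large_ext].inspect
```

A 100KB value, like the one in the failing spec mentioned earlier, exceeds both thresholds, which is why that case needs either bind parameters or a separate LOB write.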

@andynu, thank you very much for reviewing this. Let me know if you find flaws in the thinking above. Maybe I'm missing some valid use cases?

P.S. Moreover, SELECT ... FOR UPDATE does not work when using the efficient LOB retrieval method described in kubo/ruby-oci8#271 (which can presently be configured easily in an initializer), so this is yet another limitation that makes keeping the write_lobs code path more of a nuisance and maintenance burden than an actual help for valid use cases.

andynu · 2025-12-28T17:07:39Z

I think we should continue supporting existing code paths. Disabling LOB management for non-prepared statements would be a significant change that I, a fellow contributor and not one of the primary maintainers of this library, am not prepared to make (🥁 ). I don't want to interfere with someone else's existing use. And the use or non-use of prepared statements is a decision that you make based on the homogeneity or heterogeneity of your SQL queries. The reason I added an additional write_lobs callback was because of a need I had in a production system. And the overall write_lobs hook predates my involvement here.

I think smoothing over Oracle's quirks, including working around server limitations, is part of what this library does. We're quite a ways away from 'Pure Oracle inline SQL' here. Removing that for non-prepared statements feels like a regression, even if the use case is niche.

I still agree that we need to address the double write. I've drafted a modest change that I think addresses that: andynu@151e629

You're right that this path may become untenable if the kubo/ruby-oci8#271 changes land. I'd rather address that when it happens than preemptively remove functionality.

akostadinov · 2025-12-30T16:27:13Z

@andynu, would you share your use cases that require write_lobs? Also, can you elaborate on "decision that you make based on the homogeneity or heterogeneity of your SQL queries"? As I wrote above, I don't see valid use cases for disabling prepared statements, especially when working with LOBs.

I tracked the write_lobs callbacks back to the very beginning, when there were no other options for adding LOBs. So to my thinking, the need for it is gone, and it is a rather ugly workaround involving a whole separate query and looping through all of the object's changed attributes. Few people contribute to oracle_enhanced and ruby-oci8, and keeping such stuff is IMO counter-productive... unless there are use cases you can't easily adapt (which may exist and maybe I just don't know about them).

And how about the quoting that inserts empty_clob()? I think this is a bug; people have complained in the past that they got bitten by it. Some outdated stuff just has to go if a project is to keep its sanity. That's only my take of course.

andynu · 2025-12-30T16:49:06Z

Thanks for the discussion, @akostadinov.

To provide some context: I encountered this exact issue in production (see #2477). Rails 8's query_log_tags_enabled = true causes Rails to automatically disable prepared statements, which means users can end up on the unprepared path without explicitly choosing it. This isn't just a niche configuration—it's the default development behavior in Rails 8.

I understand the appeal of simplifying the codebase by removing the write_lobs path, but I think we're solving different problems:

The double-write issue is real and worth fixing. I agree completely that when prepared_statements: true, the write_lobs callback is redundant and should be skipped.
The unprepared path is also real and serves actual users. Removing support for large LOBs on that path is a breaking change for existing deployments.

My approach in andynu@151e629 addresses both: skip the callback when prepared statements handle the LOB directly, preserve it when they don't.

On terminology: I'd prefer we avoid characterizing code as "ugly"—it's subjective and doesn't help us evaluate whether the code is solving a real problem. The question is whether this path serves legitimate use cases (it does, as #2477 demonstrates) and whether the maintenance cost is justified.

My position is that we fix the double-write issue without breaking the unprepared statements path. I've demonstrated one approach in andynu@151e629. I'm happy to discuss alternative implementations, but removing large LOB support for the unprepared statements path would be a breaking change and a regression—that's not a direction I'm willing to take.

akostadinov · 2025-12-30T21:24:49Z

I see. It is a valid point that the query_log_tags_enabled feature should be made to work. But it also goes to show that non-prepared statements did not actually work previously, and the new default query_log_tags_enabled setting only made that noticeable. Am I correct in this conclusion?

So if write_lobs is removed, this will not be a removal of functionality. That functionality was already not there.

I would rather look for options to make query log tags work with prepared statements before trying to fix write_lobs. I will look at that after the holidays.

write_lobs can be fixed: even with efficient piecewise retrieval of LOBs, we can always use an UPDATE prepared statement instead of SELECT ... FOR UPDATE. I just see it as a really important simplification if we get rid of it instead. It seems too easy to get things wrong with it, as is apparent from the current state of the code, i.e. two unnoticed severe issues: double writing and non-prepared statements not actually working.
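
The UPDATE-based alternative mentioned above could look roughly like this. It is a minimal sketch, not adapter code: the helper, the table and column names, and the `:a1`-style placeholder names are illustrative assumptions, and the returned bind values would still have to be passed to whatever executes the statement.

```ruby
# Hypothetical sketch: build an UPDATE with bind placeholders for the changed
# LOB columns, as an alternative to SELECT ... FOR UPDATE.
# Returns the SQL text and the bind values in placeholder order.
def lob_update_statement(table:, primary_key:, id:, lob_values:)
  raise ArgumentError, "no LOB columns to update" if lob_values.empty?

  assignments = lob_values.keys.each_with_index.map { |col, i| "#{col} = :a#{i + 1}" }
  sql = "UPDATE #{table} SET #{assignments.join(', ')} " \
        "WHERE #{primary_key} = :a#{lob_values.size + 1}"
  [sql, lob_values.values + [id]]
end

sql, binds = lob_update_statement(
  table: "posts", primary_key: "id", id: 42,
  lob_values: { "body" => "x" * 10_000, "attachment" => "\x00\x01".b }
)
puts sql
```

Because the LOB content travels as bind values, the statement text stays short regardless of how large the LOBs are.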

andynu · 2025-12-30T22:00:36Z

The name changed, but the pattern pre-existed for updates. My PR only expanded it for the create case.

oracle-enhanced/lib/active_record/connection_adapters/oracle_enhanced/lob.rb

Line 14 in 81a0f09

before_update :record_changed_lobs

The original pattern was introduced 8 years ago for the update case. c5c68fe

I'm open to exploring alternative solutions to make the query_log_tags_enabled feature work, but removing a feature that has been in place for 8 years (in the update condition) would be a regression and could break users relying on the non-truncated behavior. I'm very hesitant unless we can find a way that does it without the truncation.

…ents

This spec clearly demonstrates the critical difference between LOB handling with prepared statements (bind parameters) vs without (raw SQL with empty_clob()).

Key points demonstrated:
- With prepared_statements: true, LOB data flows through type_cast() which creates OCI8::CLOB temp LOBs. Data is populated BEFORE INSERT.
- With prepared_statements: false, SQL contains empty_clob() literals. The write_lobs callback is REQUIRED to populate LOB data after INSERT.

This test suite will FAIL if the lob.rb callbacks are removed, proving they are necessary for backwards compatibility with prepared_statements: false.

Related to: rsim#2483

akostadinov mentioned this pull request Dec 21, 2025

Rails 8.1 support #2480

Open

akostadinov force-pushed the no_write_lobs branch from c6336f4 to 5c6c9b9 Compare December 21, 2025 17:41

akostadinov mentioned this pull request Dec 21, 2025

Huge memory usage starting with version 2.2.7 kubo/ruby-oci8#230

Open
