-
Hey @afg1, great question. Substring can be a bit expensive, but I think we have a path forward for optimizing it. Out of curiosity, how large were the documents (rough # of words is fine, tokens even better) when you started to run into issues? As a very short-term recommendation, if you're OK with the supporting evidence falling on specific boundaries (e.g. a sentence or a set of sentences), you can get much better performance by just splitting the text up and doing a (recursive)
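The splitting step above can be sketched roughly as follows. This is a hypothetical helper (the name `sentence_chunks` and the naive regex split are my own, not part of guidance); the idea is that each chunk becomes one candidate you could hand to a select-style choice instead of running `substring` over the whole section:

```python
import re

def sentence_chunks(text, max_sentences=3):
    # Naive sentence split on whitespace that follows ., !, or ? ;
    # a real pipeline may want a proper sentence tokenizer.
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    # Group consecutive sentences into chunks of up to max_sentences;
    # each chunk is one candidate span of evidence for the model to pick.
    return [' '.join(sentences[i:i + max_sentences])
            for i in range(0, len(sentences), max_sentences)]
```

The trade-off is that the quoted evidence can now only start and end on the chunk boundaries you chose, rather than at an arbitrary character offset.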
-
This reliably falls over for me with a 14B model. Attached is an extract of the code I'm using, just with pieces stuck together from a few modules; hopefully that won't impact debugging. I hardcoded the hub cache path for this example, but I have slightly more sophisticated model-loading logic in the real thing; hopefully that doesn't matter either. text.txt is just the results section extracted from this paper, without tables or figure captions. It's 3200 tokens; some places in my workflow will load ~5k tokens for this step, sometimes with 5-7k tokens already in context.
Versions:
Also fails with
Hardware:
Sorry this turned into a wall of text, but hopefully it has all the information you need!
Interestingly, it works fine if I use a small model like Qwen 0.5B or Llama 3.2 1B. Maybe it's a memory thing? This example has both the paragraph selection and the substring extraction. I think it is initially falling over while selecting a paragraph, because I don't see any log output after loading the text into context. This time I got a traceback, which didn't happen when I first saw this:
-
The new substring Rust implementation by @hudson-ai can handle up to around 10k elements: that's 5k words when splitting on words (you'll get 10k elements because the spaces become separate elements), or 10k characters when splitting on characters. Tests were just added in Rust, but this is not yet exposed in Python in any way. As for large selects, we seem to be able to handle up to a few megabytes (this should already work in guidance).
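To make the element arithmetic above concrete, here is a small sketch (the function name `count_substring_elements` is illustrative, not part of guidance or the Rust implementation) of why word-mode splitting roughly doubles the element count relative to the word count:

```python
import re

def count_substring_elements(text, mode="words"):
    # In word mode, runs of whitespace survive as their own elements,
    # so N words separated by spaces yield about 2N - 1 elements.
    if mode == "words":
        return len([t for t in re.split(r"(\s+)", text) if t])
    # In character mode, every character is one element.
    return len(text)

sample = "one two three four five"
count_substring_elements(sample, "words")  # 5 words + 4 spaces = 9 elements
```

So a 5k-word document lands right around the ~10k-element ceiling mentioned above.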
-
I'm using guidance to provide a text snippet that supports an assertion made by the LLM about something said in a paper. To do that I'm using `substring` to extract the supporting evidence, since I want it to be a real quote from the paper. I was quite surprised that this 'just worked' when I stuffed a whole section of a paper into the function (e.g. the materials and methods of a paper like this).
However, sometimes the application hangs, and I think it hangs on the substring selection, based on the stacktrace I got from interrupting it. Additionally, with 0.2.0 I got an error about too many expressions being constructed (I forget the exact error; I downgraded back to 0.1.16, where it worked), which I think was related to `substring`.
So, is there a sensible upper limit on the amount of text I can expect `substring` to work with? And how would you suggest I extract a text snippet from a big chunk of text without using `substring`? Thanks!
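One alternative to constraining generation with `substring` is to let the model produce a candidate quote unconstrained and then verify it post hoc. This is a minimal sketch of that check (the helper `find_verbatim_quote` and its whitespace normalization are my own assumptions, not a guidance API); it only guarantees the quote appears verbatim up to whitespace, so it's weaker than the constrained approach:

```python
import re
from typing import Optional

def find_verbatim_quote(candidate: str, source: str) -> Optional[str]:
    # Normalize runs of whitespace so line-wrapping differences between the
    # model output and the source text don't cause false negatives.
    def norm(s):
        return re.sub(r"\s+", " ", s).strip()
    # Accept the candidate only if it appears verbatim in the source.
    return candidate if norm(candidate) in norm(source) else None
```

On a failed check you could re-prompt the model, or fall back to a cheaper boundary-based selection over pre-split sentences.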