
support async chunking func to improve processing performance when a heavy chunking_func is passed in by user#2336

Merged
danielaskdd merged 2 commits into HKUDS:main from tongda:main
Nov 13, 2025
Conversation

@tongda (Contributor) commented Nov 9, 2025

Description

The current implementation calls chunking_func as a synchronous function. If chunking_func performs heavy operations, such as calling LLMs, the call blocks the main event loop for a long time.

I add support for async implementations of chunking_func so that they do not block the main loop. If chunking_func is a normal function, the code runs as before.
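To illustrate the problem, here is a minimal, self-contained sketch (not LightRAG code; the chunker names and the single-argument signature are hypothetical). A heartbeat task counts how often the event loop gets control while chunking runs: a synchronous heavy chunker starves it completely, while an awaited async chunker lets it keep running.

```python
import asyncio
import time

def heavy_sync_chunking(text: str) -> list[str]:
    """Hypothetical stand-in for a heavy synchronous chunking_func."""
    time.sleep(0.2)  # blocks the whole event loop while it runs
    return text.split(".")

async def heavy_async_chunking(text: str) -> list[str]:
    """Async variant: the await yields control back to the loop."""
    await asyncio.sleep(0.2)  # stands in for an awaited LLM call
    return text.split(".")

async def count_heartbeats(stop: asyncio.Event) -> int:
    """Count how often another task gets scheduled while chunking runs."""
    beats = 0
    while not stop.is_set():
        beats += 1
        await asyncio.sleep(0.02)
    return beats

async def measure(chunker, *, is_async: bool) -> int:
    stop = asyncio.Event()
    heartbeat = asyncio.create_task(count_heartbeats(stop))
    if is_async:
        await chunker("a.b")
    else:
        chunker("a.b")  # never yields: the heartbeat task is starved
    stop.set()
    return await heartbeat

sync_beats = asyncio.run(measure(heavy_sync_chunking, is_async=False))
async_beats = asyncio.run(measure(heavy_async_chunking, is_async=True))
print(f"sync: {sync_beats} heartbeats, async: {async_beats} heartbeats")
```

With the sync chunker the heartbeat task never runs until chunking finishes; with the async one it runs many times during the same 0.2 s wait.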

Related Issues

Changes Made

I simply add a condition check before calling chunking_func: if it is an async function, await it; otherwise, call it as before.

lightrag.py:apipeline_process_enqueue_documents:1761

if iscoroutinefunction(self.chunking_func):
  chunks = await self.chunking_func(
      self.tokenizer,
      content,
      split_by_character,
      split_by_character_only,
      self.chunk_overlap_token_size,
      self.chunk_token_size,
  )
else:
  chunks = self.chunking_func(
      self.tokenizer,
      content,
      split_by_character,
      split_by_character_only,
      self.chunk_overlap_token_size,
      self.chunk_token_size,
  )
chunks: dict[str, Any] = {
  compute_mdhash_id(dp["content"], prefix="chunk-"): {
      **dp,
      "full_doc_id": doc_id,
      "file_path": file_path,  # Add file path to each chunk
      "llm_cache_list": [],  # Initialize empty LLM cache list for each chunk
  }
  for dp in chunks
}
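The dispatch above can be sketched in isolation as follows. This is a runnable simplification, not the LightRAG code: the chunkers take a single hypothetical text argument, whereas the real chunking_func also receives the tokenizer, split flags, and token-size settings shown in the diff.

```python
import asyncio
from inspect import iscoroutinefunction

# Hypothetical one-argument chunkers standing in for chunking_func.
def sync_chunking(text: str) -> list[dict]:
    return [{"content": part} for part in text.split("|")]

async def async_chunking(text: str) -> list[dict]:
    await asyncio.sleep(0)  # stands in for an awaited LLM call
    return [{"content": part} for part in text.split("|")]

async def run_chunking(chunking_func, text: str) -> list[dict]:
    # Same conditional as the diff: await only functions declared
    # with `async def`; call plain functions as before.
    if iscoroutinefunction(chunking_func):
        return await chunking_func(text)
    return chunking_func(text)

sync_result = asyncio.run(run_chunking(sync_chunking, "a|b"))
async_result = asyncio.run(run_chunking(async_chunking, "a|b"))
print(sync_result, async_result)
```

Both paths produce the same chunk list, so existing synchronous chunking_func implementations keep working unchanged.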

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)


@danielaskdd (Collaborator) commented:

@codex review

@chatgpt-codex-connector commented:

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@tongda (Contributor, Author) commented Nov 10, 2025

I changed the code to a simpler but more general implementation.

The previous code could not work with objects that implement an async __call__ method.
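One common way to generalize the check (the PR's final diff may differ) is to call chunking_func first and then await the result only if it is awaitable. inspect.iscoroutinefunction is False for an instance whose __call__ is async, so the earlier check would skip the needed await; inspect.isawaitable on the returned value covers both cases. The class and function names below are hypothetical.

```python
import asyncio
import inspect

class HeavyChunker:
    """Hypothetical callable object with an async __call__.
    iscoroutinefunction(HeavyChunker()) is False, so the earlier
    check would invoke it without awaiting the resulting coroutine."""
    async def __call__(self, text: str) -> list[str]:
        await asyncio.sleep(0)
        return text.split()

def plain_chunker(text: str) -> list[str]:
    return text.split()

async def run_chunking(chunking_func, text: str):
    # Call first, then await only if the result is awaitable; this
    # covers `async def` functions and async __call__ objects alike.
    result = chunking_func(text)
    if inspect.isawaitable(result):
        result = await result
    return result

obj_result = asyncio.run(run_chunking(HeavyChunker(), "a b"))
fn_result = asyncio.run(run_chunking(plain_chunker, "a b"))
print(obj_result, fn_result)
```

This keeps the call site to a single code path while supporting plain functions, coroutine functions, and callable objects.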

@tongda (Contributor, Author) commented Nov 10, 2025

@codex review

@chatgpt-codex-connector commented:

To use Codex here, create a Codex account and connect it to GitHub.

@danielaskdd (Collaborator) commented:

@codex review

@chatgpt-codex-connector commented:

Codex Review: Didn't find any major issues. Nice work!


@danielaskdd danielaskdd merged commit 245df75 into HKUDS:main Nov 13, 2025
1 check passed