
@nddipiazza
Contributor

Summary

Fixes runtime fetcher/emitter configuration sharing between TikaGrpcServerImpl and forked PipesServer workers.

Problem

When using tika-grpc, fetchers and emitters saved via the gRPC saveFetcher/saveEmitter endpoints were not available to the forked PipesServer workers, because TikaGrpcServerImpl and the workers each used their own ConfigStore instance. As a result, calling fetchAndParse after saveFetcher failed with FetcherNotFoundException.

Solution

Modified PipesServer.initializeResources() to:

  1. Create a ConfigStore from PipesConfig using the same factory mechanism as TikaGrpcServerImpl
  2. Pass this ConfigStore to FetcherManager and EmitterManager so runtime configurations are shared
  3. Enable runtime modification support by passing true to the load methods
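
The effect of the steps above can be illustrated with a toy model. This is only a sketch of the failure mode and the fix, not the code from this PR: InMemoryStore and ToyFetcherManager are hypothetical stand-ins and do not reflect the real ConfigStore, FetcherManager, or PipesConfig APIs. It also uses a single in-process object to represent a store that both sides can reach. Real forked workers run in separate JVMs, so the configured store would presumably need to be backed by something both processes can access.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class SharedStoreSketch {

    /** Hypothetical stand-in for a config store; NOT Tika's ConfigStore interface. */
    static final class InMemoryStore {
        private final Map<String, String> configs = new ConcurrentHashMap<>();

        void put(String id, String json) {
            configs.put(id, json);
        }

        Optional<String> get(String id) {
            return Optional.ofNullable(configs.get(id));
        }
    }

    /** Hypothetical stand-in for a manager that resolves fetchers by id. */
    static final class ToyFetcherManager {
        private final InMemoryStore store;

        ToyFetcherManager(InMemoryStore store) {
            this.store = store;
        }

        void save(String id, String json) {
            store.put(id, json);
        }

        boolean canResolve(String id) {
            return store.get(id).isPresent();
        }
    }

    /** Before: the server side and the worker side each build their own store. */
    static boolean workerResolvesWithSeparateStores() {
        ToyFetcherManager serverSide = new ToyFetcherManager(new InMemoryStore());
        ToyFetcherManager workerSide = new ToyFetcherManager(new InMemoryStore());
        serverSide.save("my-fetcher", "{\"basePath\":\"/data\"}");
        return workerSide.canResolve("my-fetcher");
    }

    /** After: both sides are handed the same store, so saved configs are visible. */
    static boolean workerResolvesWithSharedStore() {
        InMemoryStore shared = new InMemoryStore();
        ToyFetcherManager serverSide = new ToyFetcherManager(shared);
        ToyFetcherManager workerSide = new ToyFetcherManager(shared);
        serverSide.save("my-fetcher", "{\"basePath\":\"/data\"}");
        return workerSide.canResolve("my-fetcher");
    }

    public static void main(String[] args) {
        System.out.println("separate stores: " + workerResolvesWithSeparateStores());
        System.out.println("shared store:    " + workerResolvesWithSharedStore());
    }
}
```

In the separate-store case the worker-side lookup misses, which corresponds to the FetcherNotFoundException described under Problem. In the shared-store case the same lookup succeeds.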

Changes

  • Added ConfigStore and ConfigStoreFactory imports
  • Added createConfigStore() helper method to instantiate ConfigStore from PipesConfig
  • Updated initializeResources() to create and use ConfigStore for FetcherManager and EmitterManager

Testing

This change lets the tika-grpc-e2e-test suite exercise the full fetcher/emitter lifecycle through gRPC.

JIRA: https://issues.apache.org/jira/browse/TIKA-4594

PipesServer now creates and uses the same type of ConfigStore configured
in PipesConfig, allowing runtime fetcher/emitter configurations saved via
gRPC endpoints to be available to forked worker processes.

This fixes FetcherNotFoundException errors when using saveFetcher followed
by fetchAndParse in tika-grpc.

@nddipiazza closed this on Dec 29, 2025