I run TFX pipelines in Kubeflow on GCP, and recently one of them started failing with the following error in both the ResolverNode.latest_model_resolver and ResolverNode.latest_blessed_model_resolver nodes:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 165, in _call_method
    response.CopyFrom(grpc_method(request))
  File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 826, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.7/dist-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
    details = "Received message larger than max (4199881 vs. 4194304)"
    debug_error_string = "{"created":"@1603760693.874743930","description":"Received message larger than max (4199881 vs. 4194304)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":203,"grpc_status":8}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 197, in launch
    self._exec_properties)
  File "/tfx-src/tfx/orchestration/launcher/base_component_launcher.py", line 166, in _run_driver
    component_info=self._component_info)
  File "/tfx-src/tfx/components/common_nodes/resolver_node.py", line 73, in pre_execution
    source_channels=input_dict.copy())
  File "/tfx-src/tfx/dsl/experimental/latest_artifacts_resolver.py", line 56, in resolve
    output_key=c.output_key)
  File "/tfx-src/tfx/orchestration/metadata.py", line 323, in get_qualified_artifacts
    executions = self.store.get_executions_by_context(context.id)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 1080, in get_executions_by_context
    self._call('GetExecutionsByContext', request, response)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 140, in _call
    return self._call_method(method_name, request, response)
  File "/usr/local/lib/python3.7/dist-packages/ml_metadata/metadata_store/metadata_store.py", line 170, in _call_method
    raise _make_exception(e.details(), e.code().value[0]) # pytype: disable=attribute-error
ml_metadata.errors.ResourceExhaustedError: Received message larger than max (4199881 vs. 4194304)
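
To put the two numbers in the error in perspective, here is a minimal sketch that parses the message and compares it with a 4 MiB limit. The helper name `parse_size_error` is my own and is not part of TFX, ML Metadata, or gRPC; the only input is the error string copied from the traceback above.

```python
import re

# Error text copied verbatim from the traceback.
ERROR_DETAILS = "Received message larger than max (4199881 vs. 4194304)"


def parse_size_error(details):
    """Return (received_bytes, limit_bytes) from a gRPC message-size error.

    Hypothetical helper for illustration only; not a TFX/MLMD/gRPC API.
    """
    match = re.search(r"larger than max \((\d+) vs\. (\d+)\)", details)
    if match is None:
        raise ValueError("not a message-size error: %r" % details)
    return int(match.group(1)), int(match.group(2))


received, limit = parse_size_error(ERROR_DETAILS)
overshoot = received - limit

# The limit in the message is exactly 4 MiB.
limit_is_4_mib = limit == 4 * 1024 * 1024

print("received: %d bytes" % received)
print("limit:    %d bytes (exactly 4 MiB: %s)" % (limit, limit_is_4_mib))
print("over by:  %d bytes" % overshoot)
```

So the GetExecutionsByContext response is only about 5.5 KB over a 4 MiB cap, which would fit with the pipeline having worked until the number of executions recorded in that context grew slightly past the limit.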
Is there a way to fix this on my side?