glue-adaptor throws arbitrary table not found when source.yml is corrupted #610

@udaykirantippireddy

Description

Describe the bug

When the tables in source.yml are defined with a repeated key, the file is still syntactically valid YAML, but the corrupted definition leads to a "source not found" error that names an arbitrary source table.

Steps To Reproduce

Create a dbt-glue project whose source.yml defines a schema, tables, and aliases. Confirm that it works when tested. Then change source.yml as shown below to reproduce the error:

version: 2

sources:
  - name: "XYZ"
    schema: "MNB"
    tables:
      - name: abc
    schema: "DEF"
    tables:
      - name: hij

Expected behavior

The dbt-glue adapter should raise a YAML syntax error, or point to the line where the problem is, whenever a key is repeated under the "- name" key.
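To illustrate the kind of check requested above, here is a minimal sketch of a duplicate-key scan that reports line numbers. It is not how dbt or dbt-glue loads source.yml, and the names `find_duplicate_keys`, `KEY_RE`, and `SOURCE_YML` are hypothetical. It uses only the Python standard library and a simplified, indentation-based reading of block-style YAML (no flow style, multi-line scalars, anchors, or tabs), so treat it as a pre-commit style lint rather than a parser. YAML loaders commonly accept a repeated key and keep only one of the values, which would be consistent with the misleading error reported here, although this issue does not confirm that mechanism.

```python
import re

# "key:" at the start of a block-mapping line, optionally preceded by "- ".
KEY_RE = re.compile(r"""^(\s*)(-\s+)?([A-Za-z_][\w-]*|"[^"]*"|'[^']*')\s*:(?:\s|$)""")


def find_duplicate_keys(text):
    """Return (key, first_line, repeated_line) for keys repeated in one mapping.

    Simplified sketch: only block-style mappings and "- key:" list items
    indented with spaces are understood.
    """
    duplicates = []
    open_maps = []  # stack of (key_column, {key: first_line})
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if match is None:
            continue  # blank lines, comments, plain scalars
        indent, dash, key = match.groups()
        key = key.strip("\"'")
        if dash:
            # "- key:" starts a new mapping; close everything deeper than the dash.
            column = len(indent) + len(dash)
            close_deeper_than = len(indent)
        else:
            column = close_deeper_than = len(indent)
        while open_maps and open_maps[-1][0] > close_deeper_than:
            open_maps.pop()
        if dash or not open_maps or open_maps[-1][0] != column:
            open_maps.append((column, {}))
        seen = open_maps[-1][1]
        if key in seen:
            duplicates.append((key, seen[key], lineno))
        else:
            seen[key] = lineno
    return duplicates


# The source.yml from "Steps To Reproduce".
SOURCE_YML = '''\
version: 2

sources:
  - name: "XYZ"
    schema: "MNB"
    tables:
      - name: abc
    schema: "DEF"
    tables:
      - name: hij
'''

for key, first, again in find_duplicate_keys(SOURCE_YML):
    print(f"duplicate key '{key}': first on line {first}, repeated on line {again}")
```

Run against the file above, the scan flags the second `schema` and `tables` keys under the "XYZ" source, which is the location an error message would ideally point to.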

Screenshots and log output


============================== 11:41:14.751912 | 56be1f55-e094-4db2-b8ae-6da1e39c6436 ==============================
[0m11:41:14.751912 [info ] [MainThread]: Running with dbt=1.9.8
[0m11:41:14.751912 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'debug': 'False', 'fail_fast': 'False', 'log_path': 'C:\\workarea\\code\\project\\modules\\unified\\XX\\logs', 'version_check': 'True', 'profiles_dir': 'C:\\workarea\\code\\project\\modules\\unified\\XX', 'use_colors': 'True', 'use_experimental_parser': 'False', 'empty': 'False', 'quiet': 'False', 'no_print': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'introspect': 'True', 'invocation_command': 'dbt run --select XX --target prep', 'static_parser': 'True', 'target_path': 'None', 'log_format': 'default', 'send_anonymous_usage_stats': 'True'}
[0m11:41:21.515493 [debug] [MainThread]: Spark adapter: Setting pyhive.hive logging to ERROR
[0m11:41:21.517503 [debug] [MainThread]: Spark adapter: Setting thrift.transport logging to ERROR
[0m11:41:21.517503 [debug] [MainThread]: Spark adapter: Setting thrift.protocol logging to ERROR
[0m11:41:21.720394 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '56be1f55-e094-4db2-b8ae-6da1e39c6436', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB8CC2C10>]}
[0m11:41:21.768499 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '56be1f55-e094-4db2-b8ae-6da1e39c6436', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB6460C90>]}
[0m11:41:21.768499 [info ] [MainThread]: Registered adapter: glue=1.9.4
[0m11:41:22.726068 [debug] [MainThread]: checksum: 981edcdd0017ce61f4721754721476a69b2d9cdea708e7551cb25a33570725f7, vars: {}, profile: , target: prep, version: 1.9.8
[0m11:41:22.728230 [info ] [MainThread]: Unable to do partial parsing because saved manifest not found. Starting full parse.
[0m11:41:22.729367 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'partial_parser', 'label': '56be1f55-e094-4db2-b8ae-6da1e39c6436', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB87C7E10>]}
[0m11:41:26.934758 [error] [MainThread]: Encountered an error:
Compilation Error
  Model 'model.YY.XX' (models\XX\XX.sql) depends on a source named 'ZZ.JKL' which was not found
[0m11:41:26.934758 [debug] [MainThread]: Command `dbt run` failed at 11:41:26.934758 after 12.37 seconds
[0m11:41:26.947629 [debug] [MainThread]: Glue adapter: cleanup called
[0m11:41:26.947629 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB7292210>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB7293190>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000018CB7251110>]}
[0m11:41:26.951375 [debug] [MainThread]: Flushing usage events
[0m11:41:27.372730 [debug] [MainThread]: An error was encountered while trying to flush usage events
[0m11:46:11.273830 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09F0D3250>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09F0D0710>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09F0D0E90>]}


============================== 11:46:11.305178 | f70744ed-3d32-4b58-98ce-41da72abf25f ==============================
[0m11:46:11.305178 [info ] [MainThread]: Running with dbt=1.9.8
[0m11:46:11.307176 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'version_check': 'True', 'fail_fast': 'False', 'log_path': 'C:\\workarea\\code\\project\\modules\\unified\\XX\\logs', 'debug': 'False', 'profiles_dir': 'C:\\workarea\\code\\project\\modules\\unified\\XX', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'False', 'log_format': 'default', 'static_parser': 'True', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'introspect': 'True', 'target_path': 'None', 'invocation_command': 'dbt run --select XX --target prep', 'send_anonymous_usage_stats': 'True'}
[0m11:46:17.116583 [debug] [MainThread]: Spark adapter: Setting pyhive.hive logging to ERROR
[0m11:46:17.118613 [debug] [MainThread]: Spark adapter: Setting thrift.transport logging to ERROR
[0m11:46:17.120627 [debug] [MainThread]: Spark adapter: Setting thrift.protocol logging to ERROR
[0m11:46:17.578157 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'f70744ed-3d32-4b58-98ce-41da72abf25f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F0A0B33810>]}
[0m11:46:17.642844 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'f70744ed-3d32-4b58-98ce-41da72abf25f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09E2E4D90>]}
[0m11:46:17.646042 [info ] [MainThread]: Registered adapter: glue=1.9.4
[0m11:46:18.857994 [debug] [MainThread]: checksum: 981edcdd0017ce61f4721754721476a69b2d9cdea708e7551cb25a33570725f7, vars: {}, profile: , target: prep, version: 1.9.8
[0m11:46:18.861506 [info ] [MainThread]: Unable to do partial parsing because saved manifest not found. Starting full parse.
[0m11:46:18.864021 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'partial_parser', 'label': 'f70744ed-3d32-4b58-98ce-41da72abf25f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F0A0654810>]}
[0m11:46:23.653229 [error] [MainThread]: Encountered an error:
Compilation Error
  Model 'model.YY.XX' (models\XX\XX.sql) depends on a source named 'ZZ.FGH' which was not found
[0m11:46:23.655234 [debug] [MainThread]: Command `dbt run` failed at 11:46:23.655234 after 12.55 seconds
[0m11:46:23.657240 [debug] [MainThread]: Glue adapter: cleanup called
[0m11:46:23.657240 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09F0CAD10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F09F0CB550>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001F0A0FE9890>]}
[0m11:46:23.657240 [debug] [MainThread]: Flushing usage events
[0m11:46:24.073240 [debug] [MainThread]: An error was encountered while trying to flush usage events
[0m11:49:21.324955 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA7B2D10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA630550>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA7B3050>]}


============================== 11:49:21.346310 | 191502cd-6893-4ed9-8bb0-616929926e73 ==============================
[0m11:49:21.346310 [info ] [MainThread]: Running with dbt=1.9.8
[0m11:49:21.346310 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': 'C:\\workarea\\code\\project\\modules\\unified\\XX', 'fail_fast': 'False', 'debug': 'False', 'log_path': 'C:\\workarea\\code\\project\\modules\\unified\\XX\\logs', 'warn_error': 'None', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'empty': 'False', 'quiet': 'False', 'no_print': 'None', 'log_format': 'default', 'introspect': 'True', 'invocation_command': 'dbt run --select XX --target prep', 'static_parser': 'True', 'target_path': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'send_anonymous_usage_stats': 'True'}
[0m11:49:29.426984 [debug] [MainThread]: Spark adapter: Setting pyhive.hive logging to ERROR
[0m11:49:29.426984 [debug] [MainThread]: Spark adapter: Setting thrift.transport logging to ERROR
[0m11:49:29.432918 [debug] [MainThread]: Spark adapter: Setting thrift.protocol logging to ERROR
[0m11:49:29.655406 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '191502cd-6893-4ed9-8bb0-616929926e73', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEC1D6B90>]}
[0m11:49:29.702689 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '191502cd-6893-4ed9-8bb0-616929926e73', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AE99B7FD0>]}
[0m11:49:29.702689 [info ] [MainThread]: Registered adapter: glue=1.9.4
[0m11:49:30.791658 [debug] [MainThread]: checksum: 981edcdd0017ce61f4721754721476a69b2d9cdea708e7551cb25a33570725f7, vars: {}, profile: , target: prep, version: 1.9.8
[0m11:49:30.793663 [info ] [MainThread]: Unable to do partial parsing because saved manifest not found. Starting full parse.
[0m11:49:30.795668 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'partial_parser', 'label': '191502cd-6893-4ed9-8bb0-616929926e73', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEC4912D0>]}
[0m11:49:36.412695 [error] [MainThread]: Encountered an error:
Compilation Error
  Model 'model.YY.XX' (models\XX\XX.sql) depends on a source named 'ZZ.ABC' which was not found
[0m11:49:36.414703 [debug] [MainThread]: Command `dbt run` failed at 11:49:36.414703 after 15.38 seconds
[0m11:49:36.417224 [debug] [MainThread]: Glue adapter: cleanup called
[0m11:49:36.417896 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA7FF550>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA7FFB10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000019AEA7B1010>]}
[0m11:49:36.418732 [debug] [MainThread]: Flushing usage events
[0m11:49:36.517779 [debug] [MainThread]: An error was encountered while trying to flush usage events
[0m11:52:49.111413 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B99CCD0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B99DD90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B99D110>]}


============================== 11:52:49.119209 | 8a2ebbc8-c14b-4134-b9dd-51a02577c223 ==============================
[0m11:52:49.119209 [info ] [MainThread]: Running with dbt=1.9.8
[0m11:52:49.119209 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'debug': 'False', 'profiles_dir': 'C:\\workarea\\code\\project\\modules\\unified\\XX', 'log_path': 'C:\\workarea\\code\\project\\modules\\unified\\XX\\logs', 'fail_fast': 'False', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'False', 'log_format': 'default', 'static_parser': 'True', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'introspect': 'True', 'target_path': 'None', 'invocation_command': 'dbt run --select XX --target prep', 'send_anonymous_usage_stats': 'True'}
[0m11:52:49.531554 [debug] [MainThread]: Spark adapter: Setting pyhive.hive logging to ERROR
[0m11:52:49.531554 [debug] [MainThread]: Spark adapter: Setting thrift.transport logging to ERROR
[0m11:52:49.531554 [debug] [MainThread]: Spark adapter: Setting thrift.protocol logging to ERROR
[0m11:52:49.722681 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '8a2ebbc8-c14b-4134-b9dd-51a02577c223', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47CF30E50>]}
[0m11:52:49.771318 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '8a2ebbc8-c14b-4134-b9dd-51a02577c223', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B9D8350>]}
[0m11:52:49.771318 [info ] [MainThread]: Registered adapter: glue=1.9.4
[0m11:52:50.369092 [debug] [MainThread]: checksum: 981edcdd0017ce61f4721754721476a69b2d9cdea708e7551cb25a33570725f7, vars: {}, profile: , target: prep, version: 1.9.8
[0m11:52:50.369092 [info ] [MainThread]: Unable to do partial parsing because saved manifest not found. Starting full parse.
[0m11:52:50.372873 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'partial_parser', 'label': '8a2ebbc8-c14b-4134-b9dd-51a02577c223', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47D446F50>]}
[0m11:52:51.837514 [error] [MainThread]: Encountered an error:
Compilation Error
  Model 'model.YY.XX' (models\XX\XX.sql) depends on a source named 'XX.DEF' which was not found
[0m11:52:51.837514 [debug] [MainThread]: Command `dbt run` failed at 11:52:51.837514 after 2.85 seconds
[0m11:52:51.837514 [debug] [MainThread]: Glue adapter: cleanup called
[0m11:52:51.837514 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B9E6710>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47B9EC150>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001D47D70BDD0>]}
[0m11:52:51.845608 [debug] [MainThread]: Flushing usage events
[0m11:52:52.857007 [debug] [MainThread]: An error was encountered while trying to flush usage events
[0m11:59:44.498014 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000021DC55BC110>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000021DC55BCDD0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x0000021DC55BD290>]}

System information

The output of dbt --version:

$ dbt --version
Core:
  - installed: 1.9.8  
  - latest:    1.10.11 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - glue:  1.9.4 - Update available!
  - spark: 1.9.2 - Update available!

  At least one plugin is out of date with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

The operating system you're using:
Windows 11 24H2
The output of python --version:
Python 3.11.3

Additional context

The error above may look minor, but because the message itself is misleading, it costs developers a lot of time.
