Closed
Description
Is your feature request related to a problem? Please describe.
I tried to use the library to parse HQL DDLs, but for some of them I got the error below:
---------------------------------------------------------------------------
DDLParserError Traceback (most recent call last)
/var/folders/gv/rh_2w83x16s3t1bkll6gd9ym0000gq/T/ipykernel_7778/1646707155.py in <module>
2 print("="*40, tbl_name)
3 if tbl_file["parse"] != "PARSE" or tbl_name not in {"silver.hvc_general", "silver.hvc_relog", "silver.hvc_telemetry"}: continue
----> 4 contents = parse_from_file(tbl_file["path"], output_mode="hql")
5
6
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/ddl_parser.py in parse_from_file(file_path, **kwargs)
203 """get useful data from ddl"""
204 with open(file_path, "r") as df:
--> 205 return DDLParser(df.read()).run(file_path=file_path, **kwargs)
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/parser.py in run(self, dump, dump_path, file_path, output_mode, group_by_type, json_dump)
270 Dict == one entity from ddl - one table or sequence or type.
271 """
--> 272 self.tables = self.parse_data()
273 self.tables = result_format(self.tables, output_mode, group_by_type)
274 if dump:
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/parser.py in parse_data(self)
184
185 for num, self.line in enumerate(lines):
--> 186 self.process_line(num != len(lines) - 1)
187 if self.comments:
188 self.tables.append({"comments": self.comments})
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/parser.py in process_line(self, last_line)
216 self.set_default_flags_in_lexer()
217
--> 218 self.process_statement()
219
220 def process_statement(self):
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/parser.py in process_statement(self)
220 def process_statement(self):
221 if not self.set_line and self.statement:
--> 222 self.parse_statement()
223 if self.new_statement:
224 self.statement = self.line
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/parser.py in parse_statement(self)
227
228 def parse_statement(self) -> None:
--> 229 _parse_result = yacc.parse(self.statement)
230 if _parse_result:
231 self.tables.append(_parse_result)
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/ply/yacc.py in parse(self, input, lexer, debug, tracking, tokenfunc)
331 return self.parseopt(input, lexer, debug, tracking, tokenfunc)
332 else:
--> 333 return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
334
335
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/ply/yacc.py in parseopt_notrack(self, input, lexer, debug, tracking, tokenfunc)
1061 if not lookahead:
1062 if not lookaheadstack:
-> 1063 lookahead = get_token() # Get the next token
1064 else:
1065 lookahead = lookaheadstack.pop()
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/ply/lex.py in token(self)
384 tok.lexpos = lexpos
385 self.lexpos = lexpos
--> 386 newtok = self.lexerrorf(tok)
387 if lexpos == self.lexpos:
388 # Error method didn't change text position at all. This is an error.
~/.pyenv/versions/3.7.10/envs/lab/lib/python3.7/site-packages/simple_ddl_parser/ddl_parser.py in t_error(self, t)
193
194 def t_error(self, t):
--> 195 raise DDLParserError("Unknown symbol %r" % (t.value[0],))
196
197 def p_error(self, p):
DDLParserError: Unknown symbol "'"
It was hard to find the problem in a big DDL. After finding a shorter example, I was able to isolate the cause and figured out that the following comment was the issue.
column_name STRING COMMENT 'yada yada yada don’t bla bla bla', -- the problem was the single stylized quote (’ in HTML) in "don’t".
Hence, some debugging parameters to see which error the lexer/yacc parser hit would be helpful. For example, I could see that ply lets you do that.
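Until the parser reports the position itself, a stylized quote like this can be located by scanning the DDL text for non-ASCII characters before parsing. The following is only a minimal sketch: `find_non_ascii` is a hypothetical helper of my own, not part of simple-ddl-parser, and it assumes that anything outside ASCII is worth inspecting.

```python
import unicodedata


def find_non_ascii(ddl_text):
    """Yield (line_no, column, char, unicode_name) for each non-ASCII character.

    Hypothetical helper, not part of simple-ddl-parser.
    Line and column numbers are 1-based.
    """
    for line_no, line in enumerate(ddl_text.splitlines(), start=1):
        for column, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                yield line_no, column, ch, unicodedata.name(ch, "UNKNOWN")


# Small sample that contains the stylized quote (U+2019) from the comment above.
sample = "CREATE TABLE t (\n  column_name STRING COMMENT 'yada don\u2019t bla'\n);"

hits = list(find_non_ascii(sample))
for hit in hits:
    print(hit)
```

For a real file, the text could be read with `open(path, encoding="utf-8").read()` and passed to the same function.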
Describe the solution you'd like
parse_from_file(tbl_file["path"], output_mode="hql", debug=True)
Describe alternatives you've considered
Show the character that caused the problem
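To illustrate what such a message could look like, here is a rough sketch that turns an offset in the input text into a line, a column, and the Unicode name of the character. `describe_bad_char` is a hypothetical name, not an existing function of the library. My assumption is that a ply error handler could supply the offset from the token's `lexpos` attribute, and that the offset would be relative to whatever text was handed to the lexer (possibly a single statement rather than the whole file).

```python
import unicodedata


def describe_bad_char(text, pos):
    """Build an error message for the character at offset `pos` in `text`.

    Hypothetical helper, not part of simple-ddl-parser or ply.
    Line and column numbers are 1-based.
    """
    ch = text[pos]
    line_no = text.count("\n", 0, pos) + 1      # newlines before pos
    line_start = text.rfind("\n", 0, pos) + 1   # offset of the first char on that line
    column = pos - line_start + 1
    name = unicodedata.name(ch, "UNKNOWN")
    return (
        f"Unknown symbol {ch!r} (U+{ord(ch):04X} {name}) "
        f"at line {line_no}, column {column}"
    )


ddl = "CREATE TABLE t (\n  c STRING COMMENT 'don\u2019t'\n);"
message = describe_bad_char(ddl, ddl.index("\u2019"))
print(message)
```

Including the code point and name matters here because the stylized quote and the plain ASCII quote are hard to tell apart in a terminal.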