-
-
Notifications
You must be signed in to change notification settings - Fork 8
133 code parser #139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
133 code parser #139
Conversation
@@ -3,6 +3,7 @@ | |||
#' @name get_code | |||
#' @param object (`qenv`) | |||
#' @param deparse (`logical(1)`) if the returned code should be converted to character. | |||
#' @param names (`character(n)`) if provided, returns the code only for objects specified in `names`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Food for thought: is it a better API to have this argument or to have a separate function, say get_object_code
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also have get_code()
that extracts a list for all the objects, and you could call get_code()['object_name']
, unsure what is the best way in here yet
Hey @chlebowa for this
q <- eval_code(new_qenv(), "a <- 1")
get_code(q, deparse = FALSE, names = "a")
Yeah, it returns character for testthat::test_that(
"get_code returns the same class when names is specified and when not",
{
q <- eval_code(new_qenv(), "a <- 1")
testthat::expect_identical(
get_code(q, deparse = FALSE, names = "a"),
get_code(q, deparse = TRUE)
)
}
) |
…ineering/teal.code into 133_code_parser@main
…ineering/teal.code into 133_code_parser@main
"Objects not found in 'qenv' environment: ", | ||
paste(names[!(names %in% ls(qenv@env))], collapse = ", ") | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | |
return(character(0L)) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I allow the function to work further, because if someone asks for 3 objects and 1 of them does not exist, you at least get the code for other two. Maybe it's better if we put error in here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of code parser is smart and complex 💯
I've been testing and thinking about this and can't shake off the feeling of the existence of a bunch of exceptions that may exist outside the control of the insightsengineering
team
However, I'm not finding a lot of them and the ones I find are a bit specific 😁
Mostly when accessing data from packages or an initial assignment via assign('yada') # @effect yada
Minor edge cases
It's been hard to find situations where it fails, which is nice in something as complex as this!!
I guess both the examples below come from the initial object not being detected as "assigned"
Data from packages
I believe this case might be plausible
testthat::test_that("code_parser load data from package & effect hint", {
code <- 'data(iris) # @effect iris'
q1 <- teal.code::eval_code(teal.code:::new_qenv(), code = code)
# Makes sure the object is on qenv
q2 <- testthat::expect_output(
teal.code::eval_code(q1, code = "print(NROW(iris))"),
"150"
)
get_code(q1, deparse = FALSE, names = "iris") |>
length() |>
expect_gt(0)
parsed_code <- get_code(q2, deparse = FALSE, names = "iris")
expect_gt(length(parsed_code), 0)
expect_false(is.na(parsed_code))
})
assign
as first call
Related to the one above, although I guess it's not as plausible but exists nonetheless
testthat::test_that("code_parser with assign & effect hint", {
code <- 'assign("ADSL", iris) # @effect ADSL'
q1 <- teal.code::eval_code(teal.code:::new_qenv(), code = code)
# Makes sure the object is on qenv
q2 <- testthat::expect_output(
teal.code::eval_code(q1, code = "print(NROW(ADSL))"),
"150"
)
get_code(q1, deparse = FALSE, names = "ADSL") |>
length() |>
expect_gt(0)
parsed_code <- get_code(q2, deparse = FALSE, names = "iris")
expect_gt(length(parsed_code), 0)
expect_false(is.na(parsed_code))
})
Thanks @averissimo for kind words. This is a joint team effort, so there were multiple people involved in coming up with great ideas and suggestions. For the cases that you found with teal.code/R/utils-code-dependency.R Lines 4 to 5 in 76d1edd
as we are aware of our limitations. |
@m7pr I'm aware of that documentation 🙂 and the examples above have the hint, however, it's not catching it when getting the code. |
ah, got you! Alrighty then, thanks for pointing this up. I think I can have an extra look on this |
We had a call today with @gogonzo and @chlebowa where we decided to simplify the approach. |
Hey, working on a new approach on a separate branch so it's easier to track change of the final approach against the main Incorporated some of the feedback provided by @chlebowa and @averissimo but not all yet. Work in progress |
Hey @averissimo I incorporated your 2 examples in tests in other PR #146 |
closing in favour of #146 |
Fixes #133 Alternative to #139 --------- Signed-off-by: Marcin <[email protected]> Co-authored-by: go_gonzo <[email protected]> Co-authored-by: Aleksander Chlebowski <[email protected]> Co-authored-by: Dawid Kałędkowski <[email protected]> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aleksander Chlebowski <[email protected]>
Closes #133
This PR introduces a feature that can be utilized in a broad usage. Currently it only extends
qenv
class, but the big picture is that we will be able to change the way we providedata
inteal::init
.Current behavior
Currently
teal::init
takesteal.data::teal_data()
as an input which takes actual R objects as an input with an extra companion of code specification, which in all cases is the code used to create the R object that is being passed. This results in code duplication: we first create a code to create the object, and then we copy-paste this code into an object specification atteal.data::teal_data()
.Proposed alternative
The alternative to the above, proposed in this PR, is having a functionality called code parser. This functionality understands which parts of the code (passed as a character) is needed to create a specific object (with all it's dependent objects and dependent side-effects). Thanks to that, we don't need to pass object specifications and code separately - we can just pass the code, which will be evaluated and which will be parsed so that under the hood objects are created and their respective code is assigned to them automatically.
This PR only introduces changes to
qenv
object. Further changes to howteal.data::teal_data()
orteal::init
data
parameter work will be needed.qenv
received a new field calledcode_dependency
that is a list needed to restore the object and it's side effects. Below are a few examples on extraction of the code ofADSL
objectcode objects
Side effects
The functionality might be a bit complicated. The main reason for that is the handling of side effects. Often in a code there are side effects that can not be directly connected with specific objects. If you connect objects with assign operators (like
<-
,=
,->
) then it is easy to understand the dependency structure between objects and code lines. However if you have side effects, like the creation of a database connection, that influences all other operations in the code, it is not possible to be guessed just by the static code analysis. Hence we introduce a possibility to pass# @effect object_name
tag at the end of the line, to specify on which objects does this line has effects. The bottleneck of this solution is that, we operate on a parsed code that looses information about comments. The comments are stored in it'ssrcref
attribute that is put intoutils::getParseData()
function, which requires us to have some extra meta-information stored if we want to also restore lines that are side effects.Notes
The relation between objects is assumed to be passed by
<-
,=
or->
assignment operators. No other object creation methods (likeassign
, or<<-
or any non-standard-evaluation method) are supported. This is solved by# @effect
tag