Add OPTIONAL fast forward button after data import #233
Description
Is your feature request related to a problem? Please describe.
Sometimes users are simply exploring the WQP data (not using it for CWA assessments or other important analyses) and would like to fast-forward through the data cleaning and filtering process so they can jump straight into viewing the maps and figures on the exploration tab.
Describe the solution you'd like
In theory, we could add a 'Fast Forward' button after the import tab that would run a series of default data cleaning and filtering steps all at once. This would take some time to run, so users would need a pop-up indicating that the process is still running and that they need to wait. Once complete, however, they could jump right into data exploration.
I started exploring what the defaults might be (below). This needs further research and discussion with users to better understand what would be most useful.
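As a rough illustration of how the button's callback might batch the default steps below, here is a minimal sketch using shiny's built-in progress indicator. The wrapper name `TADA_FastForward()` and the step list are hypothetical (not existing TADA functions), and the sketch assumes the EPATADA and shiny packages are loaded:

```r
# SKETCH ONLY: wraps the default cleaning steps into one call so a
# "Fast Forward" button can run them in batch with a shiny progress modal.
# TADA_FastForward() is a hypothetical name; the individual TADA_* calls
# mirror the defaults sketched below and would need team review.
library(shiny)
library(EPATADA)

TADA_FastForward <- function(raw_df) {
  # Named list of steps so the progress modal can report which one is running
  steps <- list(
    "Filtering to surface water" = function(df) {
      TADA_AnalysisDataFilter(df,
        clean = FALSE, surface_water = TRUE,
        ground_water = FALSE, sediment = FALSE
      )
    },
    "Flagging single-org duplicates" = TADA_FindPotentialDuplicatesSingleOrg,
    "Handling censored results" = function(df) {
      TADA_SimpleCensoredMethods(df,
        nd_method = "multiplier", nd_multiplier = 0.5, od_method = "as-is"
      )
    },
    "Autofiltering" = TADA_AutoFilter,
    "Running key flag functions" = function(df) {
      TADA_RunKeyFlagFunctions(df, clean = FALSE)
    },
    "Harmonizing synonyms" = TADA_HarmonizeSynonyms
  )
  # withProgress()/incProgress() show a "still running, please wait" modal,
  # addressing the pop-up need described above (only works inside a shiny session)
  withProgress(message = "Fast forwarding...", value = 0, {
    for (i in seq_along(steps)) {
      incProgress(1 / length(steps), detail = names(steps)[i])
      raw_df <- steps[[i]](raw_df)
    }
  })
  raw_df
}
```

The button's observer would then simply call `TADA_FastForward()` on the imported raw data and store the result as the working dataframe.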
```r
## TADAShiny MODULE 1 DEFAULTS BUTTON ##
## this would come after the overview tab?

# Creates the "working" df, which includes everything in the raw data plus all
# the flags. The "clean" df used eventually for analysis is a subset generated
# from the "working" df.
# In the app, data is flagged and then removed at the end, when the Load Review
# Data button on the review tab is hit OR when the clean dataset is generated by
# a user selecting that button at the bottom of all tabs.
# Those buttons should still work if these functions are all run in batch...

# Get data. Not actually applicable to the button, but here just to start the example.
Data_WV_raw <- TADA_DataRetrieval(
  startDate = "2020-03-14",
  huc = "02070004",
  applyautoclean = TRUE,
  ask = FALSE # suggest adding this to the import tab to give shiny app users
  # counts before the actual download
)

### this is where the button would start ####

# Flags non-surface-water media for exclusion from the CWA assessment use case.
# NOTE: I think this function still needs to be added to the "flag" tab.
Data_WV <- TADA_AnalysisDataFilter(
  Data_WV_raw,
  clean = FALSE,
  surface_water = TRUE,
  ground_water = FALSE,
  sediment = FALSE
)

# Flags single-org duplicates.
# REQUIRED
Data_WV <- TADA_FindPotentialDuplicatesSingleOrg(Data_WV)
# Discussion point: retain only unique flagged results; this happens in the
# shiny app via the app's internal flag tables.
# GreenBay_FoxRiver <- dplyr::filter(GreenBay_FoxRiver, TADA.SingleOrgDup.Flag == "Unique")

# Prepare censored results.
# REQUIRED
# Discuss default method of 1/2 the detection limit.
Data_WV <- TADA_SimpleCensoredMethods(
  Data_WV,
  nd_method = "multiplier",
  nd_multiplier = 0.5,
  od_method = "as-is",
  od_multiplier = "null"
)

# Remove multiple-org duplicates.
# OPTIONAL - slow right now; I removed it from the flag tab.
# Let's add it back to the flag tab later once the TADA team speeds it up.
# Data_WV <- TADA_FindPotentialDuplicatesMultipleOrgs(
#   Data_WV
# )
# Data_WV <- dplyr::filter(
#   Data_WV,
#   TADA.ResultSelectedMultipleOrgs == "Y"
# )

# Filter out remaining irrelevant data, NAs, and empty columns.
# We are losing some observations here, but in the shiny app they need to be
# flagged, not removed, until the "clean" data is generated...
# Suggest the TADA team address this issue as an update to the TADA_AutoFilter
# function (add a clean = FALSE option).
# NOTE: this function also still needs to be added to the "flag" tab.
# REQUIRED
# unique(Data_WV$TADA.ResultMeasureValueDataTypes.Flag)
# sum(is.na(Data_WV$TADA.ResultMeasureValue))
Data_WV <- TADA_AutoFilter(Data_WV)
# unique(Data_WV$TADA.ResultMeasureValueDataTypes.Flag)
# sum(is.na(Data_WV$TADA.ResultMeasureValue))

# Flag results with QC issues.
# This function already runs, in batch, a lot of the functions that run
# individually on the flag tab.
# REQUIRED
Data_WV <- TADA_RunKeyFlagFunctions(
  Data_WV,
  clean = FALSE
)

# CM note for discussion: should results with NA units be dealt with now as well?
# We could consider adding this to TADA_AutoFilter. Not yet addressed...
Data_WV <- Data_WV[!is.na(Data_WV$TADA.ResultMeasure.MeasureUnitCode), ]

# Flag results above and below thresholds. Suggest we do not allow automated removal...
# OPTIONAL - needs team discussion
Data_WV <- TADA_FlagAboveThreshold(Data_WV, clean = FALSE, flaggedonly = FALSE)
Data_WV <- TADA_FlagBelowThreshold(Data_WV, clean = FALSE, flaggedonly = FALSE)

# Harmonize synonyms.
# Should this have manual review...?? Team discussion.
Data_WV <- TADA_HarmonizeSynonyms(Data_WV)

# Calculate TN and TP; ideally TADA_HarmonizeSynonyms should be run first.
# TADA team needs to address open metadata issue: https://github.com/USEPA/EPATADA/issues/588
Data_WV <- TADA_CalculateTotalNP(Data_WV, daily_agg = "max")

# Flag coordinate issues and change the coordinate sign if appropriate?
# Discuss where this belongs in the shiny apps (mod 1 or 2?).
Data_WV <- TADA_FlagCoordinates(Data_WV, clean_outsideUSA = "change sign", clean_imprecise = FALSE)

# Same as above: discuss where this belongs in the shiny apps (mod 1 or 2?).
# If the df has NA longitudes from USGS, those must be addressed before
# TADA_MakeSpatial can be run in module 2...
# Flag rows with NA longitudes for removal from the df.
# We are losing some observations here, but in the shiny app they need to be
# flagged, not removed, until the "clean" data is generated...
# The EPATADA team should consider adding this to TADA_FlagCoordinates. Not yet addressed...
Data_WV <- Data_WV[!is.na(Data_WV$LongitudeMeasure), ]
Data_WV <- Data_WV[!is.na(Data_WV$LatitudeMeasure), ]

# Let's discuss the clean vs. working dataset in relation to this "batch" option.
```
Describe alternatives you've considered
It may not be appropriate or useful to add this button, since users have different use cases and need more flexibility in their data cleaning and filtering processes. The defaults will likely not work for everyone and it might not be worth adding. It could also be confusing to users if they don't run through each data cleaning and filtering step themselves.
Additional context
This feature request originally came from Sarah Wheeler (EPA R8). We suggest following up with her to discuss the desired defaults.
Reminders for TADA contributors addressing this issue
New features and/or edits should include the following work:
- Create or edit the code.
- Document all code using line/inline and/or multi-line/block comments to describe what it does.
- Create or edit tests in the tests/testthat folder to help prevent and/or troubleshoot potential future issues.
- If your code edits impact other functionality in the shiny app, ensure those are updated as well.
- Run styler::style_pkg(), devtools::document(), and devtools::check() and address any new notes or issues before creating a pull request.