-
-
Notifications
You must be signed in to change notification settings - Fork 789
wip: use search_name and placex for imports on dropped #3900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
wip: use search_name and placex for imports on dropped #3900
Conversation
|
I assume you are attacking #2463 here. The initial idea was to change only the TIGER import code but I really like your approach of changing the partition functions, so that even an So here is what I would suggest trying out: instead of modifying the partition functions in place, create a new file in You'd also need to modifiy the tests for adding data and TIGER data to test against frozen and non-frozen databases. |
|
As I said, I find your approach very interesting, but it's not clear if it will work out in the end. So I wouldn't mind if you both keep working on your respective approaches and we see in the end what works better. We will merge only one of the two eventually, of course, so I'll understand if you don't feel like continuing. For me, it's more the working on a PR the things to be learned that counts. If it gets merged or not, doesn't matter so much. |
…d installs frozen-db-functions.sql
|
I still need to update tests but freeze now installs the modified partition functions. Currently, this does not handle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few preliminary review comments, I hope that is okay at this stage.
Here is an idea how you might have an easier time getting the SQL functions right: start on the master branch and replace the functions in lib-sql/functions/partition-functions,sql one by one with your implementation using placex. Every time you have rewritten a function, make sure that all tests still pass before going to the next. Once you have all replaced and all tests pass, move the function to your new file and reinstate the original implementation. That way you can be quite sure that the SQL code works correctly before working on the remaining Python code for replacing the functions.
| DROP FUNCTION IF EXISTS find_road_with_postcode(integer, text) CASCADE; | ||
| DROP FUNCTION IF EXISTS getNearestRoadPlaceId(integer, geometry) CASCADE; | ||
| DROP FUNCTION IF EXISTS getAddressName(integer, geometry) CASCADE; | ||
| DROP FUNCTION IF EXISTS getNearestParallelRoadFeature(integer, geometry) CASCADE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to explicitly drop the function, just always use 'CREATE OR REPLACE FUNCTION' below. Note that it will only work properly when the function signature is exactly the same. But that should be the case here.
| WHILE search_diameter < 0.01 LOOP | ||
| FOR r IN | ||
| SELECT place_id FROM placex | ||
| WHERE ST_DWithin(line, geometry, search_diameter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to restrict to road feature here. This is done with: rank_address between 26 and 27.
| AND class = 'highway' | ||
| ORDER BY ST_Distance(centroid, | ||
| (SELECT ST_Centroid(geometry) | ||
| FROM placex WHERE postcode = postcode LIMIT 1)) ASC LIMIT 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two issues with this query: there is no condition that goes to an index on placex, so it will scan over the entire 200mio entries. And postcodes are not in the placex table but in a separate table location_postcode.
So the better strategy here that solves both it to find the postcode in the location postcode table and then join on placex to find the nearest road for each result.
| FROM placex | ||
| WHERE centroid && ST_Expand(point, 0.015) | ||
| AND rank_address between 26 and 27 | ||
| AND class = 'highway' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional class check is not necessary here. Ranks 26 and 27 are guaranteed to be roads.
| def install_frozen_partition_functions(conn: Connection, config: Configuration) -> None: | ||
| """Frozen versions of partition-functions.sql""" | ||
|
|
||
| frozen_functions_file = config.lib_dir.sql / 'functions' / 'frozen-db-functions.sql' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply use SQLPreprocessor.run_sql_file() here.
| pass # No options | ||
|
|
||
| def run(self, args: NominatimArgs) -> int: | ||
| from ..tools import freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that function-level import are frowned upon but it is intentional here. If your Python IDE complains about it, please just silence the warning.
| END | ||
| $$ | ||
| LANGUAGE plpgsql STABLE PARALLEL SAFE; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the function in this file should also check that the partion ID in placex matches the input parameter.
search_name/placex.placexqueries are now filtered byrank_address26–27 to align with the street rank.is_frozen()checks and related imports.test_add_tiger_data_database_frozen()to test--dropimportstest_add_tiger_data_after_drop_updates()Testing
Verified
make lint,make pytest, and the recommended first steps in #2463:now gives: