Skip to content

Conversation

@Jamiller137
Copy link

  • Replaced partitioned tables with search_name / placex.
    • placex queries are now filtered by rank_address 26–27 to align with the street rank.
    • Partition arguments are now unused.
  • Commented out is_frozen() checks and related imports.
    • Dead code is not cleaned up yet.
  • Adapted test_add_tiger_data_database_frozen() to test --drop imports
    • Renamed to test_add_tiger_data_after_drop_updates()

Testing

Verified make lint, make pytest, and the recommended first steps in #2463:

./nominatim-cli.py import \
  --osm-file test/testdb/apidb-test-data.pbf \
  --no-updates

./nominatim-cli.py add-data \
  --tiger-data test/testdb/tiger/

now gives:

2025-12-05 23:21:43: Found 1 CSV files in path test/testdb/tiger/
......
2025-12-05 23:21:45: Creating indexes on Tiger data

@lonvia
Copy link
Member

lonvia commented Dec 8, 2025

I assume you are attacking #2463 here. The initial idea was to change only the TIGER import code but I really like your approach of changing the partition functions, so that even an add-data still works. The only drawback right now is that you are changing the functions globally for everything including the import. We don't want that. As long as the database isn't frozen, we'd want to use the partitioned helper tables.

So here is what I would suggest trying out: instead of modifying the partition functions in place, create a new file in lib-sql/functions that contains your modified partition functions. Then have the freeze function install your modified functions around here. Subsequent calls to add tiger and other data then will automatically use the modified functions.

You'd also need to modifiy the tests for adding data and TIGER data to test against frozen and non-frozen databases.

@Jamiller137
Copy link
Author

@lonvia Thank you for the feedback and direction! It appears that #3906 also addresses this issue. Should this be closed? Otherwise, I am happy to implement these or any other changes you'd suggest.

@lonvia
Copy link
Member

lonvia commented Dec 12, 2025

As I said, I find your approach very interesting, but it's not clear if it will work out in the end. So I wouldn't mind if you both keep working on your respective approaches and we see in the end what works better. We will merge only one of the two eventually, of course, so I'll understand if you don't feel like continuing. For me, it's more the working on a PR the things to be learned that counts. If it gets merged or not, doesn't matter so much.

@Jamiller137
Copy link
Author

I still need to update tests but freeze now installs the modified partition functions. Currently, this does not handle add-data for databases imported with --no-updates without first freezing.

Copy link
Member

@lonvia lonvia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few preliminary review comments, I hope that is okay at this stage.

Here is an idea how you might have an easier time getting the SQL functions right: start on the master branch and replace the functions in lib-sql/functions/partition-functions,sql one by one with your implementation using placex. Every time you have rewritten a function, make sure that all tests still pass before going to the next. Once you have all replaced and all tests pass, move the function to your new file and reinstate the original implementation. That way you can be quite sure that the SQL code works correctly before working on the remaining Python code for replacing the functions.

DROP FUNCTION IF EXISTS find_road_with_postcode(integer, text) CASCADE;
DROP FUNCTION IF EXISTS getNearestRoadPlaceId(integer, geometry) CASCADE;
DROP FUNCTION IF EXISTS getAddressName(integer, geometry) CASCADE;
DROP FUNCTION IF EXISTS getNearestParallelRoadFeature(integer, geometry) CASCADE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to explicitly drop the function, just always use 'CREATE OR REPLACE FUNCTION' below. Note that it will only work properly when the function signature is exactly the same. But that should be the case here.

WHILE search_diameter < 0.01 LOOP
FOR r IN
SELECT place_id FROM placex
WHERE ST_DWithin(line, geometry, search_diameter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to restrict to road feature here. This is done with: rank_address between 26 and 27.

AND class = 'highway'
ORDER BY ST_Distance(centroid,
(SELECT ST_Centroid(geometry)
FROM placex WHERE postcode = postcode LIMIT 1)) ASC LIMIT 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two issues with this query: there is no condition that goes to an index on placex, so it will scan over the entire 200mio entries. And postcodes are not in the placex table but in a separate table location_postcode.

So the better strategy here that solves both it to find the postcode in the location postcode table and then join on placex to find the nearest road for each result.

FROM placex
WHERE centroid && ST_Expand(point, 0.015)
AND rank_address between 26 and 27
AND class = 'highway'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional class check is not necessary here. Ranks 26 and 27 are guaranteed to be roads.

def install_frozen_partition_functions(conn: Connection, config: Configuration) -> None:
"""Frozen versions of partition-functions.sql"""

frozen_functions_file = config.lib_dir.sql / 'functions' / 'frozen-db-functions.sql'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass # No options

def run(self, args: NominatimArgs) -> int:
from ..tools import freeze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that function-level import are frowned upon but it is intentional here. If your Python IDE complains about it, please just silence the warning.

END
$$
LANGUAGE plpgsql STABLE PARALLEL SAFE;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the function in this file should also check that the partion ID in placex matches the input parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants