Skip to content

feat(FR): remap mainland cities region->department (#1352 PR-E)#1484

Merged
dr5hn merged 1 commit into
masterfrom
feat/issue-1352-france-cities-remap
Apr 27, 2026
Merged

feat(FR): remap mainland cities region->department (#1352 PR-E)#1484
dr5hn merged 1 commit into
masterfrom
feat/issue-1352-france-cities-remap

Conversation

@dr5hn
Copy link
Copy Markdown
Owner

@dr5hn dr5hn commented Apr 27, 2026

Refs #1352 — does NOT close. The FR equivalent of #1395 (Italy region→province remap). Sibling PRs #1394 (PR-A diff/additions), #1393 (PR-B), #1400 (PR-D), #1392 (PR-C) cover other facets of the same issue and do not remap existing cities.

Customer report (Allier): GET /v1/countries/FR/states/03/cities previously returned [] because every Allier commune sat under the parent region ARA. After this PR, Allier holds 59 cities (Vichy, Moulins, Montluçon, …). Same was true for every other metropolitan department.

What

bin/scripts/fixes/france_cities_remap.py — offline, dependency-free, idempotent. Reassigns 8,727 of 10,079 FR cities from the 12 metropolitan regions plus the Corsica collectivity to the correct INSEE department-level entity (0195, 2A, 2B, 75C). Only state_id and state_code are mutated; name, native, latitude, longitude, wikiDataId, translations etc. are preserved verbatim.

Before / after distribution

State_code level Before After
12 metropolitan regions (ARA, IDF, NOR, PDL, NAQ, BRE, OCC, GES, CVL, BFC, HDF, PAC) 8,699 0
Corsica collectivity (20R) 28 0
Metropolitan departments (0195, 2A, 2B, 75C) 1,351 10,078
Out of scope (NC) 1 1

Top 5 target departments after remap: 55 (500), 52 (428), 59 (345), 62 (238), 2B (237).

Per-resolution-path counts

Path Count
name_unique — single name match 7,441
name_region — one in-region candidate among multiple 418
name_region_multi — multi in-region, closest by coord 181
name_other_region — name match outside region within 25km 0 (cascade falls through to k-NN when far)
proximity_knn — 5-NN inverse-distance vote, capped at 25km 687
Total changed 8,727
Unmapped 0
Skipped (already at dept level / overseas) 1,352

Proximity-pass distance distribution: 499 of 687 within 3 km, max 8.56 km, none above 10 km.

Mapping source

https://geo.api.gouv.fr/communes (Licence Ouverte v2.0 / Etalab — ODbL-1.0 compatible). 34,969 communes bundled at bin/scripts/fixes/data/geo-api-gouv-communes.json. INSEE codeDepartement = our state.iso2 for every metropolitan dept, with one override: 7575C (Paris collectivity).

Notes / known limitations

  • Métropole de Lyon (69M): upstream codeDepartement is just 69 for every dept-69 commune, so all 142 Lyon-region rows go to state 69 (Rhône). Splitting 69M out is left as a follow-up — none of our region-coded rows currently distinguished it either.
  • Historical / merged communes (~660 rows): names like Le Pin-en-Mauges no longer correspond to a separate INSEE commune (merged in the past decade), so they take the dept of their administrative successor via the proximity pass. PR-A's extra_in_csc list (643 entries) provides the surface for a separate cleanup PR if maintainers want to drop the historical names.
  • Idempotent: re-running on the post-remap file produces 0 changes.
  • Customer scenario (Kevin / Allier) verified: state_code=03 now returns 59 cities.

Full methodology and edge cases in .github/fixes-docs/FIX_1352_PR_E_SUMMARY.md.

🤖 Generated with Claude Code

Reassigns 8,727 of 10,079 French cities from the 12 metropolitan regions
plus the Corsica collectivity (20R) to the correct INSEE department-level
state (01-95, 2A, 2B, 75C). Mirrors the IT remap shipped in #1395.

Endpoints like GET /v1/countries/FR/states/03/cities (Allier) used to
return [] because all of Allier's communes sat under the parent region
ARA. After this fix Allier holds 59 cities. Same was true for every
other metropolitan department.

Resolution cascade (offline, dependency-free, idempotent):
1. INSEE name match in current region (region tie-break + nearest coord)
2. INSEE name match anywhere within 25km
3. 5-NN proximity vote weighted by inverse distance, capped at 25km

Only state_id / state_code are mutated. name, native, latitude, longitude,
wikiDataId, translations, population, timezone are preserved verbatim.
0 unmapped, 0 deleted; re-run produces 0 changes.

Bundles the geo.api.gouv.fr commune dataset (Etalab Licence Ouverte v2.0,
ODbL-1.0 compatible) under bin/scripts/fixes/data/ for reproducibility.
Refs #1352 — does not close (sibling PRs A/B/C/D handle other facets).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 27, 2026 15:42
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 27, 2026
@dosubot dosubot Bot added the enhancement New feature or request label Apr 27, 2026
@dr5hn dr5hn merged commit 4560086 into master Apr 27, 2026
2 of 3 checks passed
@dr5hn dr5hn deleted the feat/issue-1352-france-cities-remap branch April 27, 2026 15:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the France mainland city “region → department” parent remap to address issue #1352, aligning contributions/cities/FR.json city state_id/state_code values with INSEE department-level subdivisions instead of metropolitan regions.

Changes:

  • Adds an offline/idempotent remap script (france_cities_remap.py) that resolves each city to an INSEE commune and maps it to the correct department (state.iso2), including the 75 → 75C override.
  • Commits a structured run report (france_cities_remap.report.json) summarizing remap counts and a sample of per-city annotations.
  • Adds fix documentation (FIX_1352_PR_E_SUMMARY.md) describing methodology, validation, and outcomes.

Reviewed changes

Copilot reviewed 1 out of 5 changed files in this pull request and generated 1 comment.

File Description
bin/scripts/fixes/france_cities_remap.py New remap script to reassign FR cities from region-level to department-level state codes/ids using geo.api.gouv.fr commune data.
bin/scripts/fixes/france_cities_remap.report.json Structured output report capturing totals, per-source/per-target distributions, and sample annotations.
.github/fixes-docs/FIX_1352_PR_E_SUMMARY.md Documentation of scope, approach, counts, and validation for the FR cities remap.

Comment on lines +14 to +19
| State_code level | Cities (before) | Cities (after) |
|------------------|----------------:|---------------:|
| Metropolitan region (ARA, IDF, NOR, PDL, NAQ, BRE, OCC, GES, CVL, BFC, HDF, PAC) | 8,699 | 0 |
| Corsica collectivity (`20R`) | 28 | 0 |
| Metropolitan department (01–95, `2A`, `2B`, `75C`) | 1,351 | 10,078 |
| Other (overseas: NC, etc.) | 1 | 1 |
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The before/after distribution table claims the only non-metropolitan bucket after the remap is a single overseas row (NC), but the committed remap report shows at least two cities remapped to overseas department codes (971 and 974). Please reconcile the summary numbers with the actual output (either update the table/wording to include these overseas department assignments, or adjust the script to exclude overseas departments so the table remains correct).

Copilot uses AI. Check for mistakes.
dr5hn added a commit that referenced this pull request Apr 27, 2026
…e_code

After cherry-picking PR-A onto post-PR-E master, ran
france_cities_remap.py (the script committed in PR-E #1484) to
remap the 455 newly-added communes from their authored region codes
(NOR, PDL, ARA, etc.) to the correct INSEE department codes
(50 for Manche, 14 for Calvados, etc.).

Verified: 0 region-coded rows remain, 0 invalid state_ids.
Allier (state_code=03) goes from 59 to 60 cities.

Refs: #1352

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dr5hn added a commit that referenced this pull request Apr 27, 2026
…communes (#1352 PR-A) (#1394)

* feat(FR): diff cities against data.gouv.fr, add missing metropolitan communes (#1352 PR-A)

Adds 455 metropolitan French communes (population ≥ 2,000) that were missing
from contributions/cities/FR.json relative to the canonical INSEE list at
data.gouv.fr. Includes large communes-nouvelles created since 2015 — e.g.,
Cherbourg-en-Cotentin (78K), Évry-Courcouronnes (66K), Saint-Ouen-sur-Seine
(53K), Oullins-Pierre-Bénite (38K).

The diff script (bin/scripts/fixes/france_cities_diff.py) produces a structured
report and a conservative merge proposal:
  - Matches by (state_code, normalised name); normalisation handles œ/æ
    ligatures and lès/lez preposition variants.
  - Department overrides for 2A/2B/48/52/55 follow existing FR.json
    convention.
  - 1,194 cross-region matches (cities under wrong state) are flagged for
    PR-B, not auto-moved.
  - 643 "extra" CSC records (obsolete/merged communes, quartiers, dept names)
    are flagged for PR-C.
  - Overseas territories excluded (PR-D).

Validation: 0 schema errors, 0 cross-reference errors, 0 coord-bounds
violations, 0 exact-name same-state duplicates. Full breakdown in
.github/fixes-docs/FIX_1352_PR_A_SUMMARY.md.

Refs #1352

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(FR): remap PR-A's 455 new communes from region to department state_code

After cherry-picking PR-A onto post-PR-E master, ran
france_cities_remap.py (the script committed in PR-E #1484) to
remap the 455 newly-added communes from their authored region codes
(NOR, PDL, ARA, etc.) to the correct INSEE department codes
(50 for Manche, 14 for Calvados, etc.).

Verified: 0 region-coded rows remain, 0 invalid state_ids.
Allier (state_code=03) goes from 59 to 60 cities.

Refs: #1352

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dr5hn added a commit that referenced this pull request Apr 27, 2026
…hy (#1489)

Customer-facing follow-up to #1349 (Italy) and #1352 (France). Cities
were re-parented onto departments (FR) and provinces (IT) by #1395 /
#1394 / #1393 / #1400 / #1484, but the state records themselves still
carried inconsistent 'level' values, blocking downstream filters like
"all departments == level=2" or "all regions == level=1".

bin/scripts/fixes/states_level_normalise.py drives the change:
  - FR: 29 region-tier rows None -> 1 (13 metro regions, 3 special
        metro collectivities incl. Corse + Alsace + Métropole de Lyon,
        13 overseas regions/collectivities/territories/dependency).
        95 metropolitan departments unchanged at level=2.
  - IT: 103 rows updated. Final state: 20 at level=1
        (15 region + 5 autonomous region) and 106 at level=2
        (80 province + 14 metropolitan city + 6 free municipal
        consortium + 4 decentralized regional entity + 2 autonomous
        province).

Only the 'level' field is touched; idempotent on re-run; non-FR/IT
states untouched.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data:cities enhancement New feature or request large-contribution size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants