feat(postcodes/CZ): 2,695 Česká pošta PSČ codes (#1039) #1512
Merged
Adds the full Czech 5-digit PSČ (poštovní směrovací číslo / postal code) list joined with okres (district) and obec (municipality) data from the 1nfinity84/PSC-Okres-Obec-OkresCZ mirror.

Why
---
Closes the CZ gap on issue #1039. The previously-tracked soit-sk/czech_republic_post_codes_2007 source shipped only a Perl scraper for the 2007 stamps DB and required Česká pošta b2b TLS access (blocked from this harness). 1nfinity84's mirror is a static JSON join requiring no scraping.

Coverage
--------
- 2,695 codes / 100% state FK
- 77 of 90 CSC CZ states covered (76 districts + Praha capital city)

State FK strategy
-----------------
Direct district-name match against CSC's 76 okres entries plus a single alias 'Praha' -> 'Praha, Hlavní město' (CSC iso2 '10', the capital city, which is administered separately from the surrounding Praha-východ/Praha-západ districts). For PSCs whose source value is an array (multiple districts share the same PSC), picks the first as primary state.

Locality
--------
Each record carries a locality_name derived from the source's psc_to_obec list. Parenthetical fragments like '(část)' (part of) or '(Praha 10)' are stripped for readability.

License
-------
1nfinity84/PSC-Okres-Obec-OkresCZ: no formal LICENSE file. Upstream chain: Česká pošta + ČSÚ open lookups -> rotten77's SQL dump -> 1nfinity84's static JSON join. Tier 5 per #1039 license-tier policy.

Each row:
  source: "ceska-posta-via-1nfinity84"

Validation
----------
- python3 -m py_compile passes
- 100% regex match (^\d{3}\s?\d{2}$)
- 100% state_id valid + state.country_id == 58 + state_code agrees
- No auto-managed fields (id, created_at, updated_at, flag)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The previously tracked source only ships a Perl scraper for the 2007 stamps DB; using it would mean running the scraper and fetching from Česká pošta b2b (per memory: TLS handshake fails).

Source

`1nfinity84/PSC-Okres-Obec-OkresCZ`: `mapping_data.json` (4 MB)

Why this source (not soit-sk)

The previously tracked `soit-sk/czech_republic_post_codes_2007` shipped only a Perl scraper requiring Česká pošta b2b TLS access (blocked from this harness per memory). 1nfinity84's mirror is a static JSON join: no scraping is needed, and it refreshes whenever upstream re-publishes.

State FK strategy
Direct district-name match against CSC's 76 okres entries plus a single alias:

- `'Praha'` → `'Praha, Hlavní město'` (CSC iso2 `10`, the capital city, which is administered separately from the surrounding `Praha-východ` / `Praha-západ` districts)

For PSČ codes whose source value is an array (multiple districts share the same PSČ), the importer picks the first as the primary state.
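
The matching rule above can be sketched roughly as follows. This is a minimal illustration, not the code in `import_czech_postcodes.py`: the function name `resolve_state_id`, the `DISTRICT_ALIASES` table, and the shape of the `state_ids_by_name` lookup are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Optional, Union

# Hypothetical alias table; the PR describes exactly one alias.
DISTRICT_ALIASES = {"Praha": "Praha, Hlavní město"}


def resolve_state_id(
    district: Union[str, list[str]],
    state_ids_by_name: dict[str, int],
) -> Optional[int]:
    """Return the state id for a source district value, or None if unmatched."""
    # An array means several districts share the code; take the first as primary.
    if isinstance(district, list):
        if not district:
            return None
        district = district[0]
    # Apply the alias if there is one, then do a direct name match.
    name = DISTRICT_ALIASES.get(district, district)
    return state_ids_by_name.get(name)
```

Returning `None` for an unmatched district, rather than raising, lets a caller count misses and confirm the 100% state FK figure; that choice is also an assumption of this sketch.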
Locality
Each record carries a `locality_name` derived from the source's `psc_to_obec` list. Parenthetical fragments like `(část)` (part of) or `(Praha 10)` are stripped for readability.

Distribution (top 5)
Test plan
- `python3 -m py_compile bin/scripts/sync/import_czech_postcodes.py` passes
- 100% regex match (`^\d{3}\s?\d{2}$`)
- No auto-managed fields (`id`, `created_at`, `updated_at`, `flag`)

🤖 Generated with Claude Code
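
The format and auto-managed-field checks from the test plan could look something like the sketch below. The `validate_record` helper and the `code` field name are assumptions for illustration; only the regex and the list of auto-managed fields come from this PR.

```python
import re

# Pattern quoted in the test plan: three digits, optional whitespace, two digits.
PSC_PATTERN = re.compile(r"^\d{3}\s?\d{2}$")

# Fields the PR says must not appear in contributed rows.
AUTO_MANAGED_FIELDS = {"id", "created_at", "updated_at", "flag"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one postcode record (empty if clean)."""
    problems = []
    code = record.get("code", "")  # "code" is an assumed field name
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if not isinstance(code, str) or not PSC_PATTERN.fullmatch(code):
        problems.append(f"bad PSČ format: {code!r}")
    forbidden = sorted(AUTO_MANAGED_FIELDS.intersection(record))
    if forbidden:
        problems.append(f"auto-managed fields present: {forbidden}")
    return problems
```

Running this over every row and asserting that each result is empty would correspond to the "100% regex match" and "no auto-managed fields" items.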