feat(postcodes/CN): 22,656 China Post codes (#1039) #1485
Merged
Adds the mumuy/data_post 6-digit postcode dataset (MIT-licensed, ~45⭐) covering all 31 mainland Chinese provinces, autonomous regions, and direct-administered municipalities.

Why
---
Closes the CN gap on issue #1039. mumuy/data_post is the largest mature MIT-licensed mirror; the official China Post (中国邮政) publishes its full postal-code list only via paid API.

Coverage
--------
- 22,656 codes / 100% state FK resolution
- 31 of 34 CSC CN states covered (HK / MO / TW handled as separate CSC countries)
- All 60 source 2-digit prefixes mapped via PREFIX_TO_ISO2 (derived from XX0000 trunk codes + per-prefix province-name vote count)

State FK strategy
-----------------
Source has no province column: the 22,656 values are district/town names in Chinese that would not name-match against states.json reliably. Hand-curated 60-entry 2-digit prefix table is the only reliable resolver and pulls 100% FK.

License
-------
MIT (clean redistribution). Each row carries `source: "china-post-via-mumuy"` for export-time attribution.

Validation
----------
- python3 -m py_compile passes
- 100% regex match (^\d{6}$)
- 100% state_id valid + state.country_id == 45 + state_code agrees
- No auto-managed fields (id, created_at, updated_at, flag)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Source
- mumuy/data_post: MIT, ~45⭐, the largest mature mirror
- `list.json` (dict keyed by 6-digit postcode → most-specific Chinese district/town)
State FK strategy
The source has no province column: its 22,656 values are Chinese district/town names that would not name-match against states.json reliably. The `PREFIX_TO_ISO2` table maps each 2-digit code prefix to one of the 31 CSC CN states; entries are derived from `XX0000` trunk codes + per-prefix province-name vote count.
China's 6-digit code structure (per source README):
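As a rough illustration of the prefix lookup described under State FK strategy, here is a minimal sketch. It is not the import script's code: `resolve_state_code` is a hypothetical helper, and the four table entries are only an illustrative subset (the municipality prefixes 10, 20, 30, 40) of the 60-entry `PREFIX_TO_ISO2` table.

```python
import re

# Illustrative subset only; the PR's real PREFIX_TO_ISO2 table has 60 entries.
PREFIX_TO_ISO2 = {
    "10": "BJ",  # Beijing
    "20": "SH",  # Shanghai
    "30": "TJ",  # Tianjin
    "40": "CQ",  # Chongqing
}

# Same intent as the PR's ^\d{6}$ check; fullmatch + ASCII avoids matching
# a trailing newline or non-ASCII digits.
POSTCODE_RE = re.compile(r"\d{6}", re.ASCII)


def resolve_state_code(postcode, table=PREFIX_TO_ISO2):
    """Return the state code for a 6-digit postcode, or None if unresolved."""
    if not POSTCODE_RE.fullmatch(postcode):
        return None
    # The first two digits select the province-level entry.
    return table.get(postcode[:2])


if __name__ == "__main__":
    for code in ("100000", "200120", "999999", "1000"):
        print(code, resolve_state_code(code))
```

Returning `None` for an unknown prefix, rather than raising, makes it easy to count unresolved rows and confirm the 100% FK resolution figure.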
Coverage notes
- flying-itmen-eagle/eagle-tw-open-data).
- `14` is unused by China Post (no 140000-149999 codes).
Test plan
- `python3 -m py_compile bin/scripts/sync/import_china_postcodes.py` passes
- 100% regex match (`^\d{6}$`)
- No auto-managed fields (`id`, `created_at`, `updated_at`, `flag`)

🤖 Generated with Claude Code
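The regex and auto-managed-field checks from the test plan could be approximated as below. This is a sketch under assumptions: `validate_rows` is a hypothetical helper, and the row shape (a dict with a `code` key) is assumed for illustration rather than taken from the repository.

```python
import re

# Fields the test plan says must not appear in contributed rows.
AUTO_MANAGED = {"id", "created_at", "updated_at", "flag"}

# Equivalent in intent to ^\d{6}$, applied with fullmatch.
CODE_RE = re.compile(r"\d{6}", re.ASCII)


def validate_rows(rows):
    """Return a list of human-readable problems; an empty list means all rows pass."""
    errors = []
    for i, row in enumerate(rows):
        code = str(row.get("code", ""))  # "code" is an assumed key name
        if not CODE_RE.fullmatch(code):
            errors.append(f"row {i}: bad postcode {code!r}")
        leaked = AUTO_MANAGED.intersection(row)
        if leaked:
            errors.append(f"row {i}: auto-managed fields present: {sorted(leaked)}")
    return errors


if __name__ == "__main__":
    sample = [
        {"code": "100000", "source": "china-post-via-mumuy"},
        {"code": "1000", "id": 7},
    ]
    for problem in validate_rows(sample):
        print(problem)
```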