feat(postcodes/TH): bulk-import 1,189 Thailand postcodes via Thai Post (#1039) #1454
Merged
#1039)
Source: Thai Post catalogue redistributed via the chawatvish/thailand_postcode_json mirror.
All 77 source provinces resolve at 100% via direct postcode-prefix lookup: Thai Post's 5-digit code carries the province in its first two digits, which exactly match CSC's 2-digit numeric state.iso2 (e.g. 10 = Bangkok, 50 = Chiang Mai).
Records dedupe at (postcode, district) granularity.
Refs #1039.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
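The (postcode, district) dedupe mentioned in this commit message could look roughly like the sketch below. The record shape (dicts with `postcode` and `district` keys), the function name `dedupe_records`, and the sample rows are assumptions for illustration, not the importer's actual code.

```python
def dedupe_records(records):
    """Keep the first record seen for each (postcode, district) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["postcode"], rec["district"])
        if key in seen:
            continue  # same postcode AND same district: treat as a duplicate
        seen.add(key)
        unique.append(rec)
    return unique


# Hypothetical rows: placeholder district names, not real source data.
sample = [
    {"postcode": "10200", "district": "District A"},
    {"postcode": "10200", "district": "District A"},  # exact repeat, dropped
    {"postcode": "10200", "district": "District B"},  # same code, other district, kept
]

deduped = dedupe_records(sample)
print(len(deduped))
```

Keying on the pair rather than on the postcode alone keeps postcodes that are shared by several districts.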
… postal prefix
The original importer assumed that the first two digits of Thai Post's
5-digit postcode match the ISO 3166-2:TH iso2 ("Thai Post's 5-digit code
carries the province in its first two digits, which exactly match CSC's
2-digit numeric state.iso2"). This is wrong for ~10 provinces:
- The postal scheme treats Bangkok and Samut Prakan as one zone
(both 10xxx), so the postal prefixes of the provinces that follow are
off by one from CSC's iso2.
- Concretely: 11xxx is Nonthaburi (not Samut Prakan), 12xxx is
Pathum Thani (not Nonthaburi), ..., 18xxx is Saraburi (not Chai Nat).
- ~10% of records (115+) were attributed to the wrong province; the
most visible miss was Saraburi (iso2=19) being entirely absent
while its 20 districts ended up under Chai Nat.
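The off-by-one can be made concrete with a small sketch. The two tables below contain only the prefixes named in the list above, the function names are hypothetical, and the sample postcodes are illustrative values for each prefix rather than records from the feed.

```python
# Province the "prefix == iso2" assumption selects for each postal prefix
# (only the cases listed in the commit message; not a full table).
ASSUMED_BY_PREFIX = {"11": "Samut Prakan", "12": "Nonthaburi", "18": "Chai Nat"}

# Province that the same postal prefix actually belongs to.
ACTUAL_BY_PREFIX = {"11": "Nonthaburi", "12": "Pathum Thani", "18": "Saraburi"}


def assumed_province(postcode):
    """Province picked when the first two digits are read as iso2."""
    return ASSUMED_BY_PREFIX.get(postcode[:2])


def actual_province(postcode):
    """Province the postal prefix really denotes."""
    return ACTUAL_BY_PREFIX.get(postcode[:2])


def is_misattributed(postcode):
    # Prefixes missing from both tables compare as None == None,
    # so this sketch makes no claim about them.
    return assumed_province(postcode) != actual_province(postcode)


for code in ("11000", "12000", "18000"):
    print(code, assumed_province(code), "->", actual_province(code))
```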
The source feed already nests every postcode under its real province
name in Thai script. Switching to native-name lookup against
states.json gives the correct mapping:
- 11xxx → Nonthaburi (12)
- 17xxx → Chai Nat (18)
- 18xxx → Saraburi (19)
- 77/78 provinces covered (only Pattaya special metro admin missing,
as expected — it's not a separate entry in the source's 77-province
list).
Two CSC-side typos required aliases:
- Bangkok native is "กรุงเทพฯ" in CSC vs "กรุงเทพมหานคร" (formal) in source
- Nan native is "แนน" in CSC (typo) vs "น่าน" (correct) in source
Both handled by NATIVE_ALIASES so the fix doesn't depend on a separate
states.json correction.
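A minimal sketch of the native-name lookup described above. Only the name NATIVE_ALIASES and the two alias pairs come from this commit message; the alias direction (source spelling mapped to CSC spelling), the helper names, and the assumed states.json row shape (dicts with `name`, `native`, and `iso2` keys) are illustrative guesses, not the merged implementation.

```python
# Source-feed spelling -> spelling currently stored in CSC's states.json.
NATIVE_ALIASES = {
    "กรุงเทพมหานคร": "กรุงเทพฯ",  # Bangkok: formal name vs CSC's short form
    "น่าน": "แนน",                # Nan: correct spelling vs CSC's typo
}


def build_native_index(states):
    """Index state rows by their native-script name."""
    return {s["native"]: s for s in states if s.get("native")}


def resolve_state(province_native, native_index):
    """Return the state row for a Thai-script province name, or None."""
    name = province_native.strip()
    name = NATIVE_ALIASES.get(name, name)  # apply alias only if one exists
    return native_index.get(name)


# Hypothetical excerpt standing in for the Thai rows of states.json.
states = [
    {"name": "Bangkok", "native": "กรุงเทพฯ", "iso2": "10"},
    {"name": "Nan", "native": "แนน"},
]
index = build_native_index(states)

print(resolve_state("กรุงเทพมหานคร", index)["name"])
print(resolve_state("น่าน", index)["name"])
```

Returning None for an unmatched name lets the caller count unresolved provinces instead of silently attaching records to the wrong state.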
Summary
Adds 1,189 Thailand postcodes spanning all 77 Thai provinces, sourced
from the
chawatvish/thailand_postcode_json mirror of Thai Post's catalogue.
Records dedupe per (postcode, district) pair.
How it works
Thai Post's 5-digit code carries the province administrative number in
its first two digits, and CSC's states.json already uses the same
2-digit numeric iso2 for Thai provinces (10 = Bangkok, 50 = Chiang Mai,
81 = Krabi, etc.). State resolution is a direct code[:2] lookup, with no
name normalisation needed.
Source & licence
by chawatvish.
"source": "thai-post-via-chawatvish"
Validation
python3 -m py_compile clean.
country.postal_code_regex (^(\d{5})$).
state_id whose country_id == 219 and whose
iso2 matches state_code.
id, created_at, updated_at, flag).
Test plan
Refs #1039.
🤖 Generated with Claude Code