Contribution: 671 ecoinvent→EF 3.1 biosphere links + 3 likely incorrect mappings #8
Summary
We (Eaternity) have been building a unified biosphere flow mapping that triangulates across three registries (ecoinvent biosphere3, BAFU/UVEK, and EF 3.1) using 8 evidence sources. We recently integrated the D-D-S consensus flows from randonneur_data as one of those evidence sources. This integration was very valuable — it confirmed 4,226 of our existing GLAD-based links and added 119 new unique cross-registry links.
As part of this integration, we identified:
- 671 ecoinvent biosphere3 → EF 3.1 links that we have but `randonneur_data` currently lacks
- 3 likely incorrect mappings in the current data that we think should be reviewed
Proposed New Links (671)
The full set of 671 links is available in randonneur format (ready for direct import) as a gist:
eaternity_biosphere_feedback.json (456 KB, CC-BY-4.0)
Methodology
These links were derived by triangulating biosphere3, BAFU/UVEK (32,525 flows), and EF 3.1 (93,815 flows) using 8 evidence sources:
| Evidence Source | Weight | Description |
|---|---|---|
| `exact_name_compartment` | 0.40 | Exact name match within same compartment |
| `cf_fingerprint_agreement` | 0.25 | Characterization factor agreement across ≥2 LCIA methods |
| `glad_uuid_bridge` | 0.20 | GLAD UUID-based authoritative link |
| `dds_consensus_bridge` | 0.18 | D-D-S consensus (this package!) |
| `existing_mapping` | 0.15 | Pre-existing cross-registry mapping |
| `base_name_compartment_strong` | 0.12 | Base name match (strong, same qualifier) |
| `base_name_compartment` | 0.08 | Base name match (weak) |
| `paren_comma_name_compartment` | 0.10 | Parenthetical/comma variant match |
The 671 proposed links all have a minimum combined confidence score of 0.30. Evidence breakdown:
- 651 have `exact_name_compartment` evidence
- 661 have `existing_mapping` evidence (from BAFU triangulation)
- 530 have `glad_uuid_bridge` evidence (GLAD confirms but D-D-S lacks the specific compartment variant)
- 40 have `cf_fingerprint_agreement` evidence (CF values match across registries)
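To make the threshold concrete, here is a minimal sketch of how evidence weights could be combined into a score. The weights and the 0.30 cutoff come from the text above; the combination rule (a simple sum capped at 1.0) and the function names are assumptions for illustration, not a description of our actual pipeline.

```python
# Evidence weights as listed in the methodology table.
EVIDENCE_WEIGHTS = {
    "exact_name_compartment": 0.40,
    "cf_fingerprint_agreement": 0.25,
    "glad_uuid_bridge": 0.20,
    "dds_consensus_bridge": 0.18,
    "existing_mapping": 0.15,
    "base_name_compartment_strong": 0.12,
    "base_name_compartment": 0.08,
    "paren_comma_name_compartment": 0.10,
}

MIN_CONFIDENCE = 0.30  # minimum score stated for the 671 proposed links


def combined_confidence(evidence):
    """Combine evidence into one score.

    ASSUMPTION: weights are summed once per distinct evidence source
    and the total is capped at 1.0. The real rule may differ.
    """
    total = sum(EVIDENCE_WEIGHTS[name] for name in set(evidence))
    return round(min(total, 1.0), 2)


def passes_threshold(evidence):
    """True if the combined score reaches the 0.30 minimum."""
    return combined_confidence(evidence) >= MIN_CONFIDENCE


score = combined_confidence(["exact_name_compartment", "existing_mapping"])
print(score, passes_threshold(["base_name_compartment"]))
```

Under this assumed rule, a weak base-name match alone (0.08) would not qualify, while an exact name match alone (0.40) would.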
Each entry in the gist follows the randonneur replace format with source UUID, target UUID, names, contexts, and a comment explaining the evidence.
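As a rough illustration of that structure, the sketch below checks that one replace entry carries the pieces listed above. The key names (`source`, `target`, `uuid`, `name`, `context`, `comment`) are assumptions about the layout, and the example values are placeholders rather than data from the gist.

```python
# ASSUMED key names for each side of a replace entry.
REQUIRED_SIDE_KEYS = ("uuid", "name", "context")


def check_replace_entry(entry):
    """Return a list of problems found in one entry (empty if none)."""
    problems = []
    for side in ("source", "target"):
        obj = entry.get(side)
        if not isinstance(obj, dict):
            problems.append(f"missing '{side}' object")
            continue
        for key in REQUIRED_SIDE_KEYS:
            if not obj.get(key):
                problems.append(f"{side} lacks '{key}'")
    if not entry.get("comment"):
        problems.append("missing 'comment'")
    return problems


# Placeholder entry: every value here is illustrative only.
example_entry = {
    "source": {
        "uuid": "<bio3-uuid>",
        "name": "<bio3 flow name>",
        "context": ["<compartment>", "<sub-compartment>"],
    },
    "target": {
        "uuid": "<ef31-uuid>",
        "name": "<ef31 flow name>",
        "context": ["<compartment>", "<sub-compartment>"],
    },
    "comment": "<evidence summary>",
}

print(check_replace_entry(example_entry))
```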
3 Likely Incorrect Mappings
We found 3 mappings in the current ecoinvent-3.x-biosphere-EF-3.1-biosphere datasets that appear to be errors. All originate from GLAD:
1. Methyl acrylate → methacrylate (10 links across compartments)
- bio3 name: Methyl acrylate
- bio3 CAS: 000096-33-3
- ef31 name: methacrylate
- D-D-S method: glad
- Problem: These are different chemical compounds:
  - Methyl acrylate (CAS 96-33-3) = methyl ester of acrylic acid (CH₂=CHCOOCH₃)
  - Methacrylate = methacrylic acid / methyl methacrylate (CAS 79-41-4 / 80-62-6)
  - Different molecular structure (methyl acrylate lacks the alpha-methyl group that defines methacrylate)
  - Different ecotoxicity CFs in USEtox: methyl acrylate has higher acute toxicity (LC50 freshwater fish 29 mg/L vs ~200+ mg/L for methyl methacrylate)
- Affected compartments: air (5 sub-compartments), water (4 sub-compartments), soil (1)
- Recommendation: Remove these 10 links. Methyl acrylate should map to "methyl acrylate" or "methyl propenoate" in EF 3.1, not "methacrylate".
2. Imazethapyr → pursuit (3 links)
- bio3 name: Imazethapyr
- bio3 CAS: 081335-77-5
- ef31 name: pursuit
- D-D-S method: glad (air), algorithmic (water, soil)
- Problem: "Pursuit" is a trade name (BASF herbicide brand) rather than a chemical name. While Pursuit does contain imazethapyr as the active ingredient, EF 3.1 should use standard chemical nomenclature, not brand names. This creates ambiguity:
  - A formulated product ("Pursuit") contains adjuvants and inert ingredients beyond the active substance
  - Trade names can refer to different formulations over time
  - Other databases won't recognize "pursuit" as imazethapyr
- Affected compartments: air/non-urban, soil/agricultural, water/ground-
- Recommendation: Rename the EF 3.1 target to "imazethapyr" (the ISO common name, as used in other pesticide databases such as PAN, PPDB, and PubChem).
3. HCFC-140 CAS number inconsistency (5 affected links)
- bio3 name: Ethane, 1,1,1-trichloro-, HCFC-140
- ef31 name: 1,1,1-trichloroethane
- Problem: The air emission entries carry CAS 000079-00-5, which is the CAS number for 1,1,2-trichloroethane (a different isomer). The correct CAS for 1,1,1-trichloroethane (HCFC-140) is 000071-55-6.
  - The same bio3 flow's water emission entries correctly use CAS 000071-55-6
  - So within the same dataset, HCFC-140 has two different CAS numbers depending on compartment
  - These isomers have very different properties:
    - 1,1,1-trichloroethane (CAS 71-55-6): ozone-depleting substance, ODP = 0.12
    - 1,1,2-trichloroethane (CAS 79-00-5): not ozone-depleting, but higher cancer risk (IARC Group 3)
  - Using the wrong CAS would pull incorrect characterization factors from USEtox
- Affected compartments: 5 air sub-compartments have the wrong CAS
- Recommendation: Correct CAS to 000071-55-6 for all air emission entries of "Ethane, 1,1,1-trichloro-, HCFC-140"
Additionally, we noticed that "Ethane, 1,1,2-trichloro-" (without the trifluoro suffix) maps to "Hydrocarbons, chlorinated" (a generic group flow) in 4 air compartments. This is a significant loss of specificity — EF 3.1 does have "1,1,2-trichloroethane" as a specific flow (used in the air/unspecified compartment). The other 4 compartments should also map to the specific flow rather than the generic group.
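The CAS inconsistency in item 3 is the kind of issue a simple automated check can surface. The sketch below normalizes zero-padded CAS numbers, validates the standard CAS check digit, and reports flow names that carry more than one CAS number. The flow records and their `name`/`cas`/`context` keys are a hypothetical shape for illustration, not the actual dataset schema.

```python
from collections import defaultdict


def normalize_cas(cas):
    """Strip zero padding, e.g. '000071-55-6' -> '71-55-6'."""
    head, middle, check = cas.strip().split("-")
    return f"{int(head)}-{middle}-{check}"


def cas_check_digit_ok(cas):
    """Validate the CAS check digit.

    Digits before the check digit are weighted 1, 2, 3, ... from the
    right; the weighted sum modulo 10 must equal the check digit.
    """
    head, middle, check = normalize_cas(cas).split("-")
    digits = (head + middle)[::-1]
    total = sum(int(d) * i for i, d in enumerate(digits, start=1))
    return total % 10 == int(check)


def flows_with_conflicting_cas(flows):
    """Return {flow name: sorted CAS list} for names with >1 distinct CAS."""
    seen = defaultdict(set)
    for flow in flows:
        seen[flow["name"]].add(normalize_cas(flow["cas"]))
    return {name: sorted(cas) for name, cas in seen.items() if len(cas) > 1}


# Hypothetical records mirroring the situation described in item 3.
flows = [
    {"name": "Ethane, 1,1,1-trichloro-, HCFC-140", "cas": "000079-00-5", "context": "air"},
    {"name": "Ethane, 1,1,1-trichloro-, HCFC-140", "cas": "000071-55-6", "context": "water"},
    {"name": "Methyl acrylate", "cas": "000096-33-3", "context": "air"},
]

print(flows_with_conflicting_cas(flows))
```

Note that both 71-55-6 and 79-00-5 are well-formed CAS numbers with valid check digits, so check-digit validation alone would not catch this error; the per-name comparison across compartments is what flags it.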
Our Integration Results
For reference, integrating the D-D-S consensus flows into our mapping yielded:
| Metric | Value |
|---|---|
| D-D-S links loaded | 4,345 |
| Confirms our GLAD links | 4,226 (97.3%) |
| New unique links from D-D-S | 119 |
| Clusters strengthened | 993 |
| Cross-block merges | 22 |
| CF conflicts introduced | 0 |
| Regressions | 0 |
The integration was smooth and highly valuable. The randonneur format and CC-BY-4.0 licensing made it straightforward to consume programmatically.
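As a quick consistency check on the table, the snippet below recomputes the confirmation share from the reported counts. It assumes that confirmed links and new unique links together make up all loaded D-D-S links, which matches the figures given.

```python
loaded = 4345      # D-D-S links loaded
confirmed = 4226   # links confirming our GLAD links
new_unique = 119   # new unique links from D-D-S

# ASSUMPTION: every loaded link is either confirming or new.
assert confirmed + new_unique == loaded

share = round(100 * confirmed / loaded, 1)
print(f"{share}% of loaded D-D-S links confirm existing GLAD links")
```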
Context
We maintain a unified biosphere flow mapping across 84,276 flow clusters (6,580 cross-registry) as part of our open LCA infrastructure. The mapping is used to bridge BAFU/UVEK (Swiss Federal Office for the Environment) and ecoinvent flows to the EF 3.1 framework for multi-impact assessment. Our pipeline runs weekly automated updates against randonneur_data to stay current.
Happy to provide more detail on any of these findings, or to submit the 671 links as a PR in whatever format works best for the project.
🤖 Generated with Claude Code