Skip to content

Migrate trade and prices#82

Open
lbm364dl wants to merge 22 commits intomainfrom
lbm364dl/migrate-trade-and-prices
Open

Migrate trade and prices#82
lbm364dl wants to merge 22 commits intomainfrom
lbm364dl/migrate-trade-and-prices

Conversation

@lbm364dl
Copy link
Copy Markdown
Collaborator

@lbm364dl lbm364dl commented Apr 8, 2026

Summary

Migrate trade and price computation from the Global repo into the WHEP package.

New exported functions:

  • build_detailed_trade() — bilateral trade matrix with country shares, polity mapping, and optional time extension via CBS
  • build_trade_prices() — global prices (KDollars/tonne) from FAOSTAT bilateral trade data
  • build_primary_prices() — primary item prices with export trade price preference and production value fallback
  • build_cbs_prices() — CBS item prices including residues, proxy estimates, and gap-filling

Bugs found and fixed during real-data testing:

  • build_trade_prices() was reading from faostat-trade-totals (quantities only) instead of faostat-trade-bilateral (has both values and quantities)
  • Cartesian join in .add_residue_prices() from duplicate Cat_1→Herb_Woody mappings in items_prod_full
  • Duplicate rows in extend_time from full outer merge introducing NA group keys that multiplied during tidyr::complete
  • Missing unit in fill_linear .by causing 595k+ duplicate warnings on real data
  • .clean_value_of_production() not handling space-separated FAOSTAT column names from pins, integer→character type mismatch, and setDT mutating caller's tibble by reference

Testing:

  • 113 new test expectations across test_build_trade.R (54) and test_prices.R (59)
  • All functions accept raw data injection for unit testing without pin access
  • Dependency injection via raw_trade parameter instead of mocking whep_read_file
  • Warnings when unmapped items or area codes are silently dropped

Other changes:

  • Register faostat-value-of-production pin
  • All price functions return tibbles (data.table used internally)
  • Disable commas_linter in CI (conflicts with air formatter, see Put a space after a comma? posit-dev/air#330)
  • Add functions to pkgdown reference

Test plan

  • devtools::test() — 795 pass, 2 pre-existing expected failures in CBS
  • rcmdcheck::rcmdcheck() — 0 errors, 0 warnings, 0 notes
  • lintr::lint_package() — no issues
  • Full pipeline verified with real data (inst/scripts/test.R)

🤖 Generated with Claude Code

lbm364dl and others added 18 commits April 6, 2026 21:17
Migrate bilateral_trade.r from Global repo into the WHEP package.
Builds the detailed trade matrix (DTM) from the bilateral_trade pin,
maps trade items to CBS items, aggregates to polity level, and
computes country shares. Optionally extends the time series using CBS
years and fill_linear() gap-filling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrate prices.r from Global repo into the WHEP package. Three
exported functions:
- build_trade_prices(): global prices from FAOSTAT trade data
- build_primary_prices(): primary item prices with export/production
  value fallback
- build_cbs_prices(): CBS item prices including processed products,
  crop residues, and proxy estimates for missing items

Inlines afsetools::residues_as_items() logic for residue
classification (Straw, Firewood, Other crop residues).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Toy examples for build_detailed_trade(), build_trade_prices(),
  build_primary_prices(), and build_cbs_prices().
- Test files: test_build_trade.R (8 tests) and test_prices.R (16).
- NSE global variable declarations in utils.R.
- NAMESPACE exports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
faostat-trade → faostat-trade-totals (aggregated quantities + values)
bilateral_trade → faostat-trade-bilateral (partner-level flows, new pin)
The old bilateral_trade pin used by bilateral_trade.R is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… trade data

- Rename FAOSTAT-style columns ('reporter country code', 'partner country
  code', 'item code') to internal names after tolower() in .read_and_clean_dtm()
- Replace inline CBS filtering in .extend_dtm_time() with .extract_cbs_ie_for_dtm()
  helper that handles both wide format (import/export columns, from get_wide_cbs())
  and long format (element column, from build_commodity_balances()), normalising
  column names via tolower() before format detection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…detailed_trade

Add raw_trade parameter to build_detailed_trade() for dependency injection,
replacing the need to mock whep_read_file in tests. Add cli warnings when
unmapped items or area codes are dropped. Expand test suite from 4 to 21
test_that blocks (52 expectations) covering the full pipeline, extend_time
path, edge cases, and error handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ice tests

build_trade_prices() was reading from "faostat-trade-totals" which only
has quantities, not monetary values. Switch to "faostat-trade-bilateral"
which has both "1000 US$" and "tonnes" rows needed for price computation.

Add raw_trade injection parameter, graceful handling when monetary data
is missing, and comprehensive tests for all three price functions
(trade, primary, CBS) covering price computation, fallback hierarchy,
gap-filling, residue pricing, and proxy prices. Also add pipeline
test script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The merge with category->Herb_Woody mapping produced duplicates because
categories like "Fibres" and "Fruits" have both Woody and Herbaceous
entries. Fix by joining Herb_Woody at the item_cbs_code level instead
of the Cat_1 level, deduplicating to one row per item.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…n prices

Tests that would have caught the Fibres/Fruits cartesian join bug,
plus coverage for multi-item CBS aggregation, original-vs-estimated
proxy preference, unit name variants, infinite price handling, and
year gap-filling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The full outer merge between DTM and CBS data introduced rows with NA
group keys (unit, area_code_partner) for CBS-only years. These NA rows
multiplied during tidyr::complete, producing 595k+ duplicate
year/group combinations. Fix by using CBS years only to extend the year
range in complete() instead of merging CBS rows into the data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Items with multiple units (tonnes, heads) had duplicate year values
within fill_linear groups because unit was missing from .by. This
caused 595k+ duplicate warnings on real data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Handle space-separated FAOSTAT column names ("Item Code" vs "Item.Code"),
convert item_prod_code to character to match items_prod_full, and
tighten element filter to US$ only (excluding international dollars).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setDT modifies the caller's tibble by reference, so a failed call
leaves column names permanently changed. Using as.data.table creates
a copy, preventing side effects on the original variable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verify that build_primary_prices does not modify the caller's
value_of_production tibble, and that passing the same tibble twice
produces identical results without errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
build_trade_prices and build_primary_prices now wrap their output in
tibble::as_tibble, matching build_cbs_prices and the rest of the
package API. Update tests to use dplyr instead of data.table syntax.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shorten .read_bilateral_trade_for_prices to .read_bilateral_for_prices
(30 char limit), remove commented code in test, and declare ..keep as
global variable for R CMD check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add VoP pin to whep_inputs.csv and rebuild .rda. Update test.R to
use extend_time = TRUE for trade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lbm364dl lbm364dl force-pushed the lbm364dl/migrate-trade-and-prices branch 2 times, most recently from ceaec36 to 406b9f4 Compare April 8, 2026 13:01
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lbm364dl lbm364dl force-pushed the lbm364dl/migrate-trade-and-prices branch from 406b9f4 to 43db075 Compare April 8, 2026 13:32
@lbm364dl lbm364dl requested a review from eduaguilera April 8, 2026 13:37
lbm364dl and others added 3 commits April 9, 2026 09:16
Since build_trade_prices now returns a tibble, internal helpers that
use data.table syntax on trade_prices need to convert it first.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant