Add new consignment configuration parameters (#140)
* Add new configuration parameters - generation_method and file_type - to improve clarity of consignment generation options. Rearrange config file hierarchy based on consignment generator options.

* Add new config parameters to documentation.

* Use simpler fstring syntax in consignments and inspections modules.

* Use more universal, Unix style line endings for notebooks and CSVs.

Authored-by: kellynm <kellynmontgomery@gmail.comgit>
kellynm authored Nov 16, 2021
1 parent b4a93dd commit 114eef8
Showing 32 changed files with 1,564 additions and 1,350 deletions.
72 changes: 37 additions & 35 deletions config.yml
@@ -1,45 +1,47 @@
consignment:
  boxes:
    min: 100
    max: 100
  generation_method: parameter_based
  items_per_box:
    default: 200
    air:
      default: 200
    maritime:
      default: 200
  origins:
    - Netherlands
    - Mexico
    - Israel
    - Japan
    - New Zealand
    - India
    - Tanzania
  flowers:
    - Hyacinthus
    - Rosa
    - Gerbera
    - Agapanthus
    - Aegilops
    - Protea
    - Liatris
    - Mokara
    - Anemone
    - Actinidia
  ports:
    - NY JFK CBP
    - FL Miami Air CBP
    - HI Honolulu CBP
    - AZ Phoenix CBP
    - VA Dulles CBP
    - CA San Francisco CBP
    - WA Seattle Air CBP
    - TX Brownsville CBP
    - WA Blaine CBP
  # f280_file: F280_sample.csv
  aqim_file: aqim_all.csv
  #aqim_file: blank_aqim_for_testing.csv
  parameter_based:
    boxes:
      min: 1
      max: 100
    origins:
      - Netherlands
      - Mexico
      - Israel
      - Japan
      - New Zealand
      - India
      - Tanzania
    flowers:
      - Hyacinthus
      - Rosa
      - Gerbera
      - Agapanthus
      - Aegilops
      - Protea
      - Liatris
      - Mokara
      - Anemone
      - Actinidia
    ports:
      - NY JFK CBP
      - FL Miami Air CBP
      - HI Honolulu CBP
      - AZ Phoenix CBP
      - VA Dulles CBP
      - CA San Francisco CBP
      - WA Seattle Air CBP
      - TX Brownsville CBP
      - WA Blaine CBP
  input_file:
    file_type: AQIM
    file_name: blank_aqim_for_testing.csv
contamination:
  contamination_unit: item
  contamination_rate:
161 changes: 97 additions & 64 deletions docs/consignments.md
@@ -1,33 +1,79 @@
# Consignment configuration

The consignments can be either purely synthetic or based on AQAS inspection
records (Form 280 or AQIM). Configuration for the consignments is under
the `consignment` key in the configuration file.
## Consignment generation method
The consignments can be either purely synthetic or based on inspection records
(e.g., Form 280 or AQIM). Configuration for the consignments is under the
`consignment` key in the configuration file.

The `generation_method` key can be either `parameter_based` to generate
synthetic consignments based on a user-provided list of parameter values, or
`input_file` to create consignments that match inspection records in a CSV
file.

```yaml
consignment:
  generation_method: parameter_based
```
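A minimal sketch of how a program might branch on `generation_method`. The function name `select_generator` and the returned labels are hypothetical and are not the simulation's actual API; the configuration is written as a plain dictionary, as it might look after the YAML file is loaded.

```python
def select_generator(config):
    """Return a label for the generator selected in the configuration.

    Hypothetical helper for illustration only.
    """
    method = config["consignment"]["generation_method"]
    if method == "parameter_based":
        # Synthetic consignments built from user-provided parameter values
        return "synthetic consignments from parameters"
    if method == "input_file":
        # Consignments that match inspection records in a CSV file
        return "consignments from input file"
    raise ValueError(f"Unknown generation_method: {method}")


config = {"consignment": {"generation_method": "parameter_based"}}
print(select_generator(config))
```

Rejecting unknown values early is one possible design choice; it surfaces typos in the configuration file before any consignments are generated.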
## Items per box
To create the boxes (of items) for each consignment, a value for `items_per_box`
needs to be specified.

```yaml
consignment:
  items_per_box:
    default: 200
```

Only the `default` value for `items_per_box` is required, but the user may also
vary the value by transport pathway. For example, F280 and AQIM inspection
records include information about the consignment pathway, which can be used to
vary the value of `items_per_box`. The following is a configuration for
generating consignments using a file called `AQIM_sample.csv` with
`items_per_box` values that vary by `air` and `maritime` pathways. If the
consignment arrives via an air pathway, one box will contain 200 items. If the
consignment arrives via a maritime pathway, one box will contain 700 items. If
the pathway is not `air` or `maritime`, a default value of 100 items per box is
used.

```yaml
consignment:
  generation_method: input_file
  items_per_box:
    default: 100
    air:
      default: 200
    maritime:
      default: 700
  input_file:
    file_type: AQIM
    file_name: AQIM_sample.csv
```

Notice that the values for `items_per_box` are under a `default` key. In the
future, the simulation may support other keys to vary the number of
`items_per_box` by commodity, origin, or port.
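The fallback described above can be sketched as a small lookup. This is an illustration under assumptions, not the simulation's code: the function name `items_per_box_for` is hypothetical, and lowercasing the pathway is assumed from the note that pathway matching is case insensitive.

```python
def items_per_box_for(config, pathway=None):
    """Look up items per box for a pathway, falling back to the default.

    Hypothetical helper; the configuration is a plain dictionary.
    """
    settings = config["consignment"]["items_per_box"]
    if pathway is not None:
        # Pathway-specific settings live under lowercase keys such as "air"
        by_pathway = settings.get(pathway.lower())
        if by_pathway is not None:
            return by_pathway["default"]
    # Any other pathway (or no pathway) uses the top-level default
    return settings["default"]


config = {
    "consignment": {
        "items_per_box": {
            "default": 100,
            "air": {"default": 200},
            "maritime": {"default": 700},
        }
    }
}
for pathway in ("Air", "Maritime", "Land"):
    print(pathway, items_per_box_for(config, pathway))
```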

## Synthetic consignments

The two main keys for configuration of the synthetic consignment generator
are `boxes` and `items_per_box`. The `min` and `max` values of `boxes`
determine the range of sizes of consignments within the simulation. The
`default` value of `items_per_box` determines how many items are in one
box. An example configuration with consignments with 10 to 100 boxes and
200 items per box, i.e., 2000 to 20,000 items per consignments follows.
The main keys for the `parameter_based` consignment generator are the `min` and
`max` values of `boxes`. These values determine the range of sizes of
consignments within the simulation. In the example configuration below,
each consignment will have 10 to 100 boxes.

```yaml
consignment:
  boxes:
    min: 10
    max: 100
  items_per_box:
    default: 200
```

Currently, no further settings for `items_per_box` is possible, but in
the future further settings might be added.

The generator adds origin, flower (commodity type), and port (where
consignment was received). These are randomly selected from the lists
specified in the configuration like so:
The generator adds origin, flower (commodity type), and port (where consignment
was received). These are randomly selected from the lists specified in the
configuration. These values are not currently used, but may be used to configure
other parameters in the future (e.g., variable contamination rates by origin or
inspection efficacy by commodity).

```yaml
origins:
@@ -61,10 +107,26 @@ specified in the configuration like so:
- WA Blaine CBP
```
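The random selection can be sketched as follows. The function name `draw_consignment`, the shortened example lists, and the uniform draw of the number of boxes between `min` and `max` are assumptions made for illustration; the actual generator may sample differently.

```python
import random


def draw_consignment(parameters, rng):
    """Draw one synthetic consignment from configured parameter lists.

    Hypothetical helper; assumes a uniform choice from each list and a
    uniform integer number of boxes between min and max (inclusive).
    """
    boxes = parameters["boxes"]
    return {
        "num_boxes": rng.randint(boxes["min"], boxes["max"]),
        "origin": rng.choice(parameters["origins"]),
        "flower": rng.choice(parameters["flowers"]),
        "port": rng.choice(parameters["ports"]),
    }


parameters = {
    "boxes": {"min": 10, "max": 100},
    "origins": ["Netherlands", "Mexico", "Israel"],
    "flowers": ["Hyacinthus", "Rosa", "Gerbera"],
    "ports": ["NY JFK CBP", "FL Miami Air CBP", "WA Blaine CBP"],
}
# A seeded generator makes the draw repeatable between runs
print(draw_consignment(parameters, random.Random(42)))
```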

## F280-based consignments
## Create consignments to match an input file

To use a file of inspection records, set the `generation_method` to `input_file`
and specify the `file_type` and `file_name`. Currently, the options for
`file_type` are `F280` and `AQIM`. Additional types of inspection data can be
supported upon request, or you can format the inspection records in the same way
as F280 or AQIM data, described below. The `file_name` is a path, either
absolute or relative to the place where the Python program is running.

```yaml
consignment:
  generation_method: input_file
  input_file:
    file_type: AQIM
    file_name: aqim_sample.csv
```

### F280-based consignments

Consignments in the simulation can be based on real F280 records. In that
case, a CSV file needs to be specified using the `f280_file` key.
Consignments in the simulation can be based on real F280 records. In that case,
a CSV file needs to be specified using the `file_name` key.

The CSV is expected to have the following columns:
* QUANTITY which will be used as number of items,
@@ -75,60 +137,31 @@ The CSV is expected to have the following columns:
* ORIGIN_NM as origin, and
* LOCATION as port (where consignment was received).

The CSV file should be comma-separated (`,`) using double quote for text
fields (`"`). The path is absolute or relative to the place where the
Python program is running.
The CSV file should be comma-separated (`,`) using double quote for text fields
(`"`). The path is absolute or relative to the place where the Python program is
running.


## AQIM-based consignments
### AQIM-based consignments

Consignments in the simulation can also be based on AQIM inspection
records. In that case, a CSV file needs to be specified using the
`aqim_file` key.
Consignments in the simulation can also be based on AQIM inspection records. In
that case, a CSV file needs to be specified using the `file_name` key.

The CSV is expected to have the following columns:
* UNIT which is used to specify the unit (must be items or boxes) used
in QUANTITY.
* QUANTITY which is used as number of items or number of boxes
depending on UNIT specified,
* CARGO_FORM which is used to determine the `items_per_box` value
similar to PATHWAY in F280 (case insensitive),
* UNIT which is used to specify the unit (must be items or boxes) used in
QUANTITY,
* QUANTITY which is used as number of items or number of boxes depending on
UNIT specified,
* CARGO_FORM which is used to determine the `items_per_box` value similar to
PATHWAY in F280 (case insensitive),
* CALENDAR_YR is used for date (YYYY only),
* COMMODITY_LIST is used as the flower (commodity type),
* ORIGIN as origin, and
* LOCATION as port of entry (where consignment was received).

The CSV file should be comma-separated (`,`) using double quote for text
fields (`"`). The path is absolute or relative to the place where the
Python program is running.

## Items per box
To create the boxes (of items) for the simulation, a value for
`items_per_box` needs to be specified. F280 and AQIM inspections records
include information about the consignment pathway, which can be used to
vary the value of `items_per_box`. For example, the following is a
configuration for generating consignments using a file called
`AQIM_sample.csv` with `items_per_box` values that vary by `air` and
`maritime` pathways. If the consignment arrives via an air pathway, one box
will contain 50 items. If the consignment arrives via a maritime pathway,
one box will contain 500 items. If the pathway is not `air` or
`maritime`, a default value of 100 items per box is used.

```yaml
consignment:
  aqim_file: aqim_sample.csv
  items_per_box:
    default: 100
    air:
      default: 50
    maritime:
      default: 700
```

Notice that the values for `items_per_box` are under additional key
`default`. In the future, the simulation may support other keys for
specific commodities, origins, or ports.

The CSV file should be comma-separated (`,`) using double quote for text fields
(`"`). The path is absolute or relative to the place where the Python program is
running.
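A minimal sketch of reading AQIM-style records and converting QUANTITY to a number of boxes based on UNIT. The function name `boxes_from_aqim_row`, the sample rows, the accepted spellings of the unit, and rounding partial boxes up are all assumptions for illustration; they are not taken from the simulation's code.

```python
import csv
import io
import math


def boxes_from_aqim_row(row, items_per_box):
    """Convert one AQIM-style record to a number of boxes (hypothetical helper)."""
    quantity = int(row["QUANTITY"])
    unit = row["UNIT"].strip().lower()
    if unit in ("box", "boxes"):
        # QUANTITY already counts boxes
        return quantity
    if unit in ("item", "items"):
        # Assumption: a partially filled box still counts as one box
        return math.ceil(quantity / items_per_box)
    raise ValueError(f"Unsupported UNIT: {row['UNIT']}")


# Made-up sample data: comma-separated with double quotes for text fields
sample = io.StringIO(
    "UNIT,QUANTITY,CARGO_FORM,CALENDAR_YR,COMMODITY_LIST,ORIGIN,LOCATION\n"
    '"Boxes",12,"Air",2020,"Rosa","Netherlands","NY JFK CBP"\n'
    '"Items",450,"Maritime",2020,"Gerbera","Mexico","FL Miami Air CBP"\n'
)
records = list(csv.DictReader(sample, delimiter=",", quotechar='"'))
box_counts = [boxes_from_aqim_row(row, items_per_box=200) for row in records]
print(box_counts)
```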

---

72 changes: 37 additions & 35 deletions examples/Montgomery_2021/data/config.yml
@@ -1,45 +1,47 @@
consignment:
  boxes:
    min: 100
    max: 100
  generation_method: parameter_based
  items_per_box:
    default: 200
    air:
      default: 200
    maritime:
      default: 200
  origins:
    - Netherlands
    - Mexico
    - Israel
    - Japan
    - New Zealand
    - India
    - Tanzania
  flowers:
    - Hyacinthus
    - Rosa
    - Gerbera
    - Agapanthus
    - Aegilops
    - Protea
    - Liatris
    - Mokara
    - Anemone
    - Actinidia
  ports:
    - NY JFK CBP
    - FL Miami Air CBP
    - HI Honolulu CBP
    - AZ Phoenix CBP
    - VA Dulles CBP
    - CA San Francisco CBP
    - WA Seattle Air CBP
    - TX Brownsville CBP
    - WA Blaine CBP
  # f280_file: F280_sample.csv
  #aqim_file: aqim_box_insp_unit.csv
  #aqim_file: blank_aqim_for_testing.csv
  parameter_based:
    boxes:
      min: 1
      max: 100
    origins:
      - Netherlands
      - Mexico
      - Israel
      - Japan
      - New Zealand
      - India
      - Tanzania
    flowers:
      - Hyacinthus
      - Rosa
      - Gerbera
      - Agapanthus
      - Aegilops
      - Protea
      - Liatris
      - Mokara
      - Anemone
      - Actinidia
    ports:
      - NY JFK CBP
      - FL Miami Air CBP
      - HI Honolulu CBP
      - AZ Phoenix CBP
      - VA Dulles CBP
      - CA San Francisco CBP
      - WA Seattle Air CBP
      - TX Brownsville CBP
      - WA Blaine CBP
  input_file:
    file_type: AQIM
    file_name: blank_aqim_for_testing.csv
contamination:
  contamination_unit: box
  contamination_rate: