Commit 8b65889

Author: Scott Havens
Merge pull request #184 from scotthavens/gridded_data
Input data overhaul
2 parents b12a27b + bdd04a5 commit 8b65889

46 files changed: +1410 -1889 lines changed

docs/api/smrf.data.rst

Lines changed: 28 additions & 12 deletions
@@ -4,34 +4,50 @@ smrf.data package
 Submodules
 ----------
 
-smrf.data.loadData module
--------------------------
+smrf.data.csv module
+--------------------
 
-.. automodule:: smrf.data.loadData
+.. automodule:: smrf.data.csv
     :members:
     :undoc-members:
     :show-inheritance:
 
-smrf.data.loadGrid module
--------------------------
+smrf.data.hrrr\_grib module
+---------------------------
 
-.. automodule:: smrf.data.loadGrid
+.. automodule:: smrf.data.hrrr_grib
     :members:
     :undoc-members:
     :show-inheritance:
 
-smrf.data.loadTopo module
--------------------------
+smrf.data.load\_data module
+---------------------------
 
-.. automodule:: smrf.data.loadTopo
+.. automodule:: smrf.data.load_data
     :members:
     :undoc-members:
     :show-inheritance:
 
-smrf.data.mysql\_data module
-----------------------------
+smrf.data.load\_topo module
+---------------------------
 
-.. automodule:: smrf.data.mysql_data
+.. automodule:: smrf.data.load_topo
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+smrf.data.netcdf module
+-----------------------
+
+.. automodule:: smrf.data.netcdf
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+smrf.data.wrf module
+--------------------
+
+.. automodule:: smrf.data.wrf
     :members:
     :undoc-members:
     :show-inheritance:

docs/getting_started/run_smrf.rst

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ below is the function :mod:`run_smrf <smrf.framework.model_framework.run_smrf>`.
     s.loadTopo()
 
     # initialize the distribution
-    s.initializeDistribution()
+    s.create_distribution()
 
     # initialize the outputs if desired
     s.initializeOutput()

docs/user_guide/auto_config.rst

Lines changed: 10 additions & 102 deletions
Original file line numberDiff line numberDiff line change
@@ -112,115 +112,15 @@ csv
112112
|
113113
114114

115-
mysql
116-
-----
117-
118-
| **air_temp**
119-
| name of the table column containing station air temperature
120-
| *Default: air_temp*
121-
| *Type: string*
122-
|
123-
124-
| **cloud_factor**
125-
| name of the table column containing station cloud factor
126-
| *Default: cloud_factor*
127-
| *Type: string*
128-
|
129-
130-
| **data_table**
131-
| name of the database table containing station data
132-
| *Default: tbl_level2*
133-
| *Type: string*
134-
|
135-
136-
| **database**
137-
| name of the database containing station data
138-
| *Default: weather_db*
139-
| *Type: string*
140-
|
141-
142-
| **host**
143-
| IP address to server.
144-
| *Default: None*
145-
| *Type: string*
146-
|
147-
148-
| **metadata**
149-
| name of the database table containing station metadata
150-
| *Default: tbl_metadata*
151-
| *Type: string*
152-
|
153-
154-
| **password**
155-
| password used for database login.
156-
| *Default: None*
157-
| *Type: password*
158-
|
159-
160-
| **port**
161-
| Port for MySQL database.
162-
| *Default: 3606*
163-
| *Type: int*
164-
|
165-
166-
| **precip**
167-
| name of the table column containing station precipitation
168-
| *Default: precip_accum*
169-
| *Type: string*
170-
|
171-
172-
| **solar**
173-
| name of the table column containing station solar radiation
174-
| *Default: solar_radiation*
175-
| *Type: string*
176-
|
177-
178-
| **station_table**
179-
| name of the database table containing client and source
180-
| *Default: tbl_stations*
181-
| *Type: string*
182-
|
183-
184-
| **stations**
185-
| List of station IDs to use for distributing any of the variables
186-
| *Default: None*
187-
| *Type: station*
188-
|
189-
190-
| **user**
191-
| username for database login.
192-
| *Default: None*
193-
| *Type: string*
194-
|
195-
196-
| **vapor_pressure**
197-
| name of the table column containing station vapor pressure
198-
| *Default: vapor_pressure*
199-
| *Type: string*
200-
|
201-
202-
| **wind_direction**
203-
| name of the table column containing station wind direction
204-
| *Default: wind_direction*
205-
| *Type: string*
206-
|
207-
208-
| **wind_speed**
209-
| name of the table column containing station wind speed
210-
| *Default: wind_speed*
211-
| *Type: string*
212-
|
213-
214-
215115
gridded
216116
-------
217117

218118
| **data_type**
219119
| Type of gridded input data
220-
| *Default: hrrr_netcdf*
120+
| *Default: hrrr_grib*
221121
| *Type: string*
222122
| *Options:*
223-
*wrf hrrr_grib netcdf hrrr_netcdf*
123+
*wrf hrrr_grib netcdf*
224124
|
225125
226126
| **hrrr_directory**
@@ -235,6 +135,14 @@ gridded
235135
| *Type: bool*
236136
|
237137
138+
| **hrrr_load_method**
139+
| Method to load the HRRR data either load all data first or for each timestep
140+
| *Default: first*
141+
| *Type: string*
142+
| *Options:*
143+
*first timestep*
144+
|
145+
238146
| **netcdf_file**
239147
| Path to the netCDF file containing weather data
240148
| *Default: None*
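
The `gridded` options documented in this hunk can be illustrated with a small validation sketch. The option names, allowed values, and defaults are taken from the diff above. The `read_gridded` helper and the use of the standard library `configparser` are assumptions made for illustration only; SMRF itself validates its configuration through `inicheck`, which is not used here.

```python
import configparser

# Allowed values copied from the documentation diff above.
DATA_TYPES = ("wrf", "hrrr_grib", "netcdf")
LOAD_METHODS = ("first", "timestep")


def read_gridded(text):
    """Parse a [gridded] section and check the two option-limited keys.

    Hypothetical helper, not part of SMRF.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["gridded"]

    # Fall back to the documented defaults when a key is absent.
    data_type = section.get("data_type", fallback="hrrr_grib")
    load_method = section.get("hrrr_load_method", fallback="first")

    if data_type not in DATA_TYPES:
        raise ValueError("data_type must be one of {}".format(DATA_TYPES))
    if load_method not in LOAD_METHODS:
        raise ValueError(
            "hrrr_load_method must be one of {}".format(LOAD_METHODS))

    return {"data_type": data_type, "hrrr_load_method": load_method}


EXAMPLE = """
[gridded]
data_type = hrrr_grib
hrrr_load_method = timestep
"""

settings = read_gridded(EXAMPLE)
```

Under this sketch, a configuration that still sets `data_type = hrrr_netcdf` is rejected, which matches the removal of that option in this commit.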

docs/user_guide/input_data.rst

Lines changed: 2 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -129,44 +129,8 @@ Example data files can be found in the ``tests`` directory for RME.
129129
MySQL Database
130130
``````````````
131131

132-
The MySQL database is more flexible than CSV files but requires more effort to setup. However,
133-
SMRF will only import the data and stations that were requested without loading in additional
134-
data that isn't required. See :mod:`smrf.data.mysql_data` for more information.
135-
136-
The data table contains all the measurement data with a single row representing a measurement
137-
time for a station. The date column (i.e. ``date_time``) must be a ``DATETIME`` data type with
138-
a unique constraint on the ``date_time`` column and ``primary_id`` column.
139-
140-
================ ========== ==== ==== === =====
141-
date_time primary_id var1 var2 ... varN
142-
================ ========== ==== ==== === =====
143-
10/01/2008 00:00 ID_1 5.2 13.2 ... -1.3
144-
10/01/2008 00:00 ID_2 1.1 0 ... -10.3
145-
10/01/2008 01:00 ID_1 6.3 NAN ... -2.5
146-
10/01/2008 01:00 ID_2 0.3 7.1 ... 9.4
147-
================ ========== ==== ==== === =====
148-
149-
The metadata table is the same format as the CSV files, with a primary_id, X, Y, and elevation
150-
column. A benefit to using MySQL is that we can use a ``client`` as a way to group multiple
151-
stations to be used for a given model run. For example, we can have a client named BRB, which
152-
will have all the station ID's for the stations that would be used to run SMRF. Then we can
153-
specify the client in the configuration file instead of listing out all the station ID's. To use
154-
this feature, a table must be created to hold this information. Then the station ID's matching
155-
the client will only be imported. The following is how the table should be setup. Source is used
156-
to track where the data is coming from.
157-
158-
========== ====== ======
159-
station_id client source
160-
========== ====== ======
161-
ID_1 BRB Mesowest
162-
ID_2 BRB Mesowest
163-
ID_3 TUOL CDEC
164-
... ... ...
165-
ID_N BRB Mesowest
166-
========== ====== ======
167-
168-
Visit the `Weather Database GitHub page <https://github.com/USDA-ARS-NWRC/weather_database>`_ if you'd
169-
like to use a MySQL database.
132+
The MySQL database has been deprecated as of SMRF v0.11.0. If that feature is needed,
133+
we recommend using v0.9.X or export the tables to csv format.
170134

171135

172136
Weather Research and Forecasting (WRF)
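
The deprecation note in this hunk suggests exporting the MySQL tables to CSV. One way that export might look is sketched below: the removed documentation describes a long table with one row per station per timestamp, while the CSV loader expects a `date_time` column plus one column per `primary_id`. The sample rows reuse values from the removed table, `air_temp` stands in for `var1`, and the `to_wide` helper is hypothetical. Reading the rows from an actual database is left out.

```python
import io

import pandas as pd

# Sample rows in the long layout of the removed MySQL data table:
# one row per station per measurement time. "air_temp" stands in for var1.
long_table = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2008-10-01 00:00", "2008-10-01 00:00",
        "2008-10-01 01:00", "2008-10-01 01:00"]),
    "primary_id": ["ID_1", "ID_2", "ID_1", "ID_2"],
    "air_temp": [5.2, 1.1, 6.3, 0.3],
})


def to_wide(table, variable):
    """Pivot one variable so each station becomes a column indexed by time.

    Hypothetical helper, not part of SMRF.
    """
    wide = table.pivot(
        index="date_time", columns="primary_id", values=variable)
    wide.columns.name = None  # drop the "primary_id" label from the header
    return wide


air_temp = to_wide(long_table, "air_temp")

# Written to an in-memory buffer here; a real export would pass a file path.
buffer = io.StringIO()
air_temp.to_csv(buffer)
csv_text = buffer.getvalue()
```

One file per variable would be produced this way, since the CSV configuration points each variable at its own file.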

requirements.txt

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,6 @@
11
coloredlogs
22
Cython>=0.28.4
33
inicheck>=0.9.0,<0.10.0
4-
mysql-connector-python-rf==2.2.2
54
netCDF4>=1.2.9
65
numpy>=1.14.0,<1.19.0
76
pandas>=0.23.0

smrf/__init__.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@
99
__version__ = get_distribution(__name__).version
1010
except DistributionNotFound:
1111
__version__ = 'unknown'
12+
1213
__core_config__ = os.path.abspath(
1314
os.path.dirname(__file__) + '/framework/CoreConfig.ini')
1415
__recipes__ = os.path.abspath(os.path.dirname(
@@ -21,7 +22,6 @@
2122
"time": "Dates to run model",
2223
"stations": "Stations to use",
2324
"csv": "CSV section configurations",
24-
"mysql": "MySQL database",
2525
"gridded": "Gridded datasets configurations",
2626
"air_temp": "Air temperature distribution",
2727
"vapor_pressure": "Vapor pressure distribution",
@@ -36,7 +36,7 @@
3636
"system": "System variables and Logging"
3737
}
3838

39-
# from . import data, distribute, envphys, framework, output, spatial, utils # isort:skip
39+
from . import utils, data, distribute, envphys, framework, output, spatial # isort:skip
4040

4141
__config_header__ = "Config File for SMRF {0}\n" \
4242
"For more SMRF related help see:\n" \

smrf/data/__init__.py

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,9 @@
11
# -*- coding: utf-8 -*-
22
# flake8: noqa
3-
from . import loadData, loadGrid, loadTopo, mysql_data
3+
from .csv import InputCSV
4+
from .hrrr_grib import InputGribHRRR
5+
from .load_topo import Topo
6+
from .netcdf import InputNetcdf
7+
from .wrf import InputWRF
8+
9+
from .load_data import InputData # isort:skip

smrf/data/csv.py

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,81 @@
1+
import logging
2+
3+
import pandas as pd
4+
5+
from smrf.utils.utils import check_station_colocation
6+
7+
8+
class InputCSV():
9+
10+
DATA_TYPE = 'csv'
11+
12+
def __init__(self, start_date, end_date, stations=None, config=None):
13+
14+
self.start_date = start_date
15+
self.end_date = end_date
16+
self.stations = stations
17+
self.config = config
18+
self.time_zone = start_date.tzinfo
19+
20+
self._logger = logging.getLogger(__name__)
21+
22+
if self.stations is not None:
23+
self._logger.debug('Using only stations {0}'.format(
24+
", ".join(self.stations)))
25+
26+
def load(self):
27+
"""
28+
Load the data from a csv file
29+
Fields that are operated on
30+
- metadata -> dictionary, one for each station,
31+
must have at least the following:
32+
primary_id, X, Y, elevation
33+
- csv data files -> dictionary, one for each time step,
34+
must have at least the following columns:
35+
date_time, column names matching metadata.primary_id
36+
"""
37+
38+
self._logger.info('Reading data coming from CSV files')
39+
40+
variable_list = list(self.config.keys())
41+
variable_list.remove('stations')
42+
43+
self._logger.debug('Reading {}...'.format(self.config['metadata']))
44+
metadata = pd.read_csv(
45+
self.config['metadata'],
46+
index_col='primary_id')
47+
# Ensure all stations are all caps.
48+
metadata.index = [s.upper() for s in metadata.index]
49+
self.metadata = metadata
50+
variable_list.remove('metadata')
51+
52+
for variable in variable_list:
53+
filename = self.config[variable]
54+
55+
self._logger.debug('Reading {}...'.format(filename))
56+
57+
df = pd.read_csv(
58+
filename,
59+
index_col='date_time',
60+
parse_dates=[0])
61+
df = df.tz_localize(self.time_zone)
62+
df.columns = [s.upper() for s in df.columns]
63+
64+
if self.stations is not None:
65+
df = df[df.columns[(df.columns).isin(self.stations)]]
66+
67+
# Only get the desired dates
68+
df = df[self.start_date:self.end_date]
69+
70+
if df.empty:
71+
raise Exception("No CSV data found for {0}"
72+
"".format(variable))
73+
74+
setattr(self, variable, df)
75+
76+
def check_colocation(self):
77+
# Check all sections for stations that are colocated
78+
colocated = check_station_colocation(metadata=self.metadata)
79+
if colocated is not None:
80+
self._logger.error(
81+
"Stations are colocated: {}".format(','.join(colocated[0])))

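The per-variable steps in `InputCSV.load` above (read, localize, upper-case the station columns, filter stations, slice by date) can be tried in isolation with the sketch below. It is a standalone approximation, not the class itself: `load_variable` is a hypothetical function, the CSV text and station IDs are made up, and `parse_dates` names the column explicitly rather than relying on `date_time` being the first column as `parse_dates=[0]` does in the commit.

```python
import io

import pandas as pd

# Made-up sample file: a date_time column plus one column per station.
CSV_TEXT = """date_time,id_1,id_2
2008-10-01 00:00,5.2,1.1
2008-10-01 01:00,6.3,0.3
2008-10-01 02:00,7.0,0.9
"""


def load_variable(source, start_date, end_date, stations=None):
    """Mirror the per-variable steps of the load method (hypothetical)."""
    df = pd.read_csv(
        source, index_col="date_time", parse_dates=["date_time"])

    # The file holds naive timestamps; attach the model time zone.
    df = df.tz_localize(start_date.tzinfo)

    # Station IDs are compared in upper case.
    df.columns = [s.upper() for s in df.columns]

    if stations is not None:
        df = df[df.columns[df.columns.isin(stations)]]

    # Label-based slice, inclusive of both end points.
    df = df.loc[start_date:end_date]

    if df.empty:
        raise ValueError("No CSV data found")
    return df


start = pd.Timestamp("2008-10-01 00:00", tz="UTC")
end = pd.Timestamp("2008-10-01 01:00", tz="UTC")
air_temp = load_variable(io.StringIO(CSV_TEXT), start, end, stations=["ID_1"])
```

Because the columns are upper-cased before filtering, the `stations` list needs to be upper case as well, otherwise the filter silently drops every column.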