-
Notifications
You must be signed in to change notification settings - Fork 2
/
FluxManual.txt
265 lines (219 loc) · 10.6 KB
/
FluxManual.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
--------------------------
UNM Flux Processing Manual
Timothy W. Hilton
hilton@unm.edu
July-August 2013
--------------------------
ABSTRACT
This document covers processing of Marcy Litvak's New Mexico eddy
covariance sites (Grassland (burned), Grassland (unburned), Shrubland,
Juniper Savanna, Pinon-Juniper woodland (control), Pinon-Juniper
woodland (girdled), Valles Caldera Ponderosa Pine, and Valles Caldera
Mixed Conifer. It also covers processing of three sites in Texas:
Freeman Ranch, Freeman (forest), and Freeman (grassland).
Timothy W. Hilton implemented an extensive overhaul of the processing
pipeline between 2011 and 2013. As of July 2013 the pipeline is very
different from the pipeline as of August 2011.
Processing eddy covariance fluxes from the New Mexico Elevation
Gradient takes place in Matlab and is organized under a handful of
"top-level" functions, described below. There are a number of
intermediate files produced between the raw data from the field and
the finished Ameriflux files.
--------------------------
TABLE OF CONTENTS
I. Environment
II. Suggested work flow
III. Top-Level Functions
IV. Helper functions (these may be sometimes independently useful)
V. Additional notes
VI. Wishlist
--------------------------
I. Environment
--------
1.a. Directory structure
The processing code manipulates data files in a directory with its
root in the environment variable FLUXROOT. On Jemez FLUXROOT is set
to C:\Research_Flux_Towers. If setting up a new machine, FLUXROOT
must be defined (and available to Matlab).
Within FLUXROOT, the processing code expects to find the following
structure:
$FLUXROOT/Flux_Tower_Data_by_Site
$FLUXROOT/AncillaryData/MetData/
$FLUXROOT/Tower_Information/UNM_flux_site_name_table.csv
$FLUXROOT/Gapfiller_Logs
$FLUXROOT/Card_Processing_Logs/
$FLUXROOT/Flux_Tower_Data_by_Site should contain a directory for each
site (e.g. GLand, SLand, etc.). Each site directory should contain:
FLUXALL files (xls or txt, depending on year; see below)
$FLUXROOT/Flux_Tower_Data_by_Site/SITE/processed_flux
$FLUXROOT/Flux_Tower_Data_by_Site/SITE/toa5
$FLUXROOT/Flux_Tower_Data_by_Site/SITE/ts_data
the toa5 directory contains 30-minute data in Campbell Scientic's TOA5
format. The processed_flux directory contains FLUXALL_QC files,
FLUXALL_for_gap_filling files, and the output from the MPI
gapfiller/partitioner tool (see below). The ts_data directory
contains 10hz data in Cambell Scientific's TOB1 format.
--------
1.b Required software
The Matlab processing code relies on several external pieces of
software. These must be available through Matlab's system command.
- CardConvert version 4.0 or later (Campbell Scientific proprietary
software)
- 7zip (www.7-zip.org/) (used to compress data files)
- sftp with command line interface. openssh installed via Cygwin
(www.cygwin.com) includes this. (used to transfer data to offsite
servers for backup)
- R (version 3.0 or later, with REddyProc package installed (for gapfilling).
(http://www.r-project.org/,
http://www.bgc-jena.mpg.de/bgi/index.php/Services/REddyProcWebRPackage,
http://r-forge.r-project.org/projects/reddyproc/)
--------------------------
II. Suggested Work Flow
This section describes the typical work flow for processing an
incoming compact flash (CF) card from a site's datalogger. This is a
summary. Complete documentation for Matlab functions (specific tasks
performed, inputs, outputs, etc.) is available from Matlab's
documentation (within Matlab, doc FUNCTIONNAME).
There are two scenarios considered here: the data are on the compact
flash card from the field (section IIa), and the data have already
been copied to the local disk (section IIb). The second situation
occurs sometimes when, for example, the cards are needed immediately
in the field again and there is not time to fully process their data.
IIa. processing data on a compact flash card
1. Place the CF card in the computer's card reader.
2. Within Matlab, run UNM_retrieve_card_data_GUI. Select the site
whose data you are processing from the list and click "Go".
- Progress updates will appear in the Matlab window. These updates
are also echoed to
$FLUXROOT/Card_Processing_Logs/YYYY_MM_DD_HHMM_SITE_card_process.log.
- After the data are copied from the card to disk, processed to TOA5
and TOB1 files, and compressed, the compressed data are copied to
edacdata1.unm.edu by secure file transfer protocol (sftp). The sftp
process opens in a Windows command window and requires the user to
enter the password for mlitvak@edacdata1.unm.edu. Note that (by
design) nothing is echoed to the prompt during password entry.
2.b. run UNM_site_plot_fullyear_time_offsets. This produces a plot of
calculated vs. observed sunrise for the site-year. Adjustments to the
datalogger timestamps are sometimes necessary for reasonable sunrise
times. The offsets in the plot should be as close to zero as
possible. The timing of necessary shifts are specified in
UNM_fix_datalogger_timestamps.m. Record any new adjustments there.
3. Run UNM_RemoveBadData (or UNM_RemoveBadData_pre2012 for data from
2011 or earlier). Inspect the NEE plot to determine whether the
default NEE filters are adequate. If siteyear-specific changes are
necessary, they go in one of the following three helper functions
within UNM_RemoveBadData:
- remove_specific_problem_periods
- specify_siteyear_filter_exceptions
- specify_siteyear_co2_conc_filter_exceptions
Adjust the filtering parameters, rerun UNM_RemoveBadData, and examine
the NEE plot until satisfied with the filtering.
4. When satisfied with the performance of UNM_RemoveBadData, run
UNM_fill_met_gaps_from_nearby_site. This creates a file to be
submitted to the gapfiller/partitioner:
$FLUXROOT/Flux_Tower_Data_by_Site/SITE/processed_flux/SITE_flux_all_YYYY_for_gap_filling_filled.txt.
5. call UNM_run_gapfiller. This gapfills the fluxall file via a
system call to R and REddyProc.
6. Run UNM_Ameriflux_File_Maker. This creates with_gaps and gapfilled
Ameriflux files and produces several diagnostic plots.
IIb. processing data that have been copied to local disk.
1. The GUI interface currently does not handle this situation (see
V. Wishlist, below). process_card_main allows specification of a
location on disk to look for data. For example (from the Matlab
command prompt):
>> process_card_main( UNM_sites.GLand, 'disk', 'data_path', 'C:\Research_Flux_Towers\CF_Data_to_Process\DIRECTORY_CONTAINING_DATA);
2. proceed as usual from step 2b above (section IIa).
--------------------------
III. Top-Level functions
These functions are intended as "top-level", to be called by users
from the Matlab prompt. Use "doc FUNCTIONNAME" within Matlab for more
detailed documentation, inputs and outputs, etc.
UNM_retrieve_card_data_GUI
UNM_RemoveBadData
UNM_RemoveBadData_pre2012
UNM_fill_met_gaps_from_nearby_site
UNM_run_gapfiller
UNM_Ameriflux_File_Maker
--------------------------
IV. Helper functions
These functions are helper functions for the above top level
functions, but are at times independently useful for exploring EC
data.
card_data_processor
combine_and_fill_TOA5_files
dataset_fill_timestamps
dataset_vertcat_fill_vars
dataset_viewer
DOYidx
export_dataset_tim
figure_2_eps
merge_datasets_by_datenum
parse_ameriflux_file
parse_forgapfilling_file
parse_jena_output
parse_TAMU_ameriflux_file
parse_TOA5_file_headers
parse_UNM_site_table
plot_CZO_figure
plot_fingerprint
UNM_Ameriflux_Data_Viewer
UNM_parse_both_ameriflux_files
UNM_parse_fluxall_txt_file
UNM_parse_fluxall_xls_file
UNM_parse_gapfilled_partitioned_output
UNM_parse_QC_txt_file
UNM_parse_QC_xls_file
UNM_parse_sev_met_data
UNM_parse_valles_met_data
UNM_site_plot_fullyear_time_offsets
UNM_write_for_gapfiller_file
PJG_PJ_cumulative_flux_plotter
--------------------------
V. Additional notes
The directory C:\Code\UNM_Flux_Code_as_of_15Aug2011 (on Jemez)
contains the UNM flux processing code as it existed on my arrival at
UNM on 15 August 2011.
Occasionally processing of a datalogger card as outlined in II fails
because Matlab encounters an error. This is usually caused by one of
two things. First, file permissions problems where the user currently
logged in does not have read permission for a data file that the
processing code needs to read, or does not have execute or write
permissions for a directory in which the processing code is trying to
write. The bash script fix_file_permissions.sh addresses these
problems. Note that it must be run with administrator priveleges.
Second, the processing code expects the datalogger card to contain
exactly two files named *.flux.dat and *.ts_data.dat. If this is not
the case the code will fail. At present manual intervention is
necessary: extra files need to be moved and files named differently
need to be renamed. Different names frequently are the result of an
external repair attempt in Campbell Scientific's CardConvert software.
FLUXALL files for site-years 2011 and earlier are Microsoft Excel .xls
documents, and must be processed with UNM_RemoveBadData_pre2012.
Site-years 2012 and later FLUXALL files are delimited ASCII files, and
must be processed with UNM_RemoveBadData_pre2012.
--------------------------
VI. Wishlist
This section describes upgrades I have not had time to implement.
- integrate handling of data that have been copied to local disk into
UNM_retrieve_card_data_GUI. This situation is currently handled
(from the Matlab prompt only) in process_card_main.
- move specification of exceptions to the filters in RemoveBadData to
configuration files. They are currently put directly into matlab
code in helper functions remove_specific_problem_periods,
specify_siteyear_filter_exceptions, and
specify_siteyear_co2_conc_filter_exceptions.
- Convert pre-2012 FLUX_all files to delimited ASCII text files
readable by UNM_RemoveBadData. I have begun work on this on a
couple of occasions. It's a big job because many observations have
different labels at different sites, and often have different labels
within the same site at different times as well. I've reached the
conclusion that the only way to standardize observations labels
while maintaining confidence that nothing has been erroneously
combined or mislabel is to go through every column of each site-year
FLUXALL file and explicitly consider each label. At 150+ columns
per site-year, this is a big job and I have not yet taken the time.
In addition to this some of the pre-2012 FLUX_all Excel files
contain variables (that is, columns) that were added by hand, or
sections that are not actually columnar data. It is not entirely
clear whether some of these sections are important to preserve, and
these sections make automation somewhat troublesome.