- About this document
- New in this version
- Introduction
- Input and validation data
- Methods
- Product evaluation and assessment
- Product description
- Conclusions
- Data and code availability
- Algorithms
- Data
- Quality control
- Supplemental material
- Figures
- Overview
- Basin changes with changing k
- Bulk observation vs. RCM scatter plots
- Modified Tukey plot for all observations
- Bamber 2018
- Observations vs. RCM, map + ts + scatter
- Watson River
- Watson Adjustments
- Leverett Glacier
- Kiattuut Sermiat
- Kingigtorssuaq
- Oriartorfik
- Teqinngalip
- Kobbefjord
- Røde Elv
- Zackenberg
- Qaanaaq
- Elevation histogram
- README
- Appendix
- Misc journal sections
- References
- Meta
- LaTeX setup
This document is an Emacs Org Mode plain-text file with code and text embedded. If you are viewing:
- A DOC, Google Doc, or PDF file, then it was generated by exporting from Org. Not all of the Org parts (code, results, comments, etc.) were exported. The Org source file is available upon request, and may be embedded in the PDF. Most non-Apple PDF viewers provide easy access to embedded or attached files.
- A webpage somewhere, then this is a subset of the code and text that the website renderer has decided to display to you through the browser. You can choose to view the raw source and/or download it and view it locally on your computer.
- A file with an =org= extension in something other than Emacs, then you are seeing the canonical version and the full source, but without any syntax highlighting, document structure, or the ability to execute the code blocks.
- An Org file within Emacs, then this is the canonical version. You should be able to fully interact with and reproduce the contents of this document, although it may require 3rd-party applications (Python, etc.), a similar Emacs configuration, and the data files. This is available upon request.
To recreate this work:
- Open this file in Emacs Org Mode.
- Check that you have the necessary software dependencies installed. See section: Code.
- Download and set up the necessary data files as per the Data section.
- Tangle the embedded code blocks
  - Execute ~C-c C-v C-t~ to run the (org-babel-tangle) function
- Execute
  - Run =make=
    - This should probably be run in an external terminal because it takes hours to days…
  - Update Babel result blocks throughout the document by
    - Cleaning all result blocks with ~C-u C-c C-v k~ or (org-babel-remove-result-one-or-many t), then
    - Executing all blocks (without =:eval no=) using ~C-c C-v C-b~ or (org-babel-execute-buffer)

This is captured programmatically by workflow-update
The discussion of quality control, which previously treated land runoff with depth < 0 as invalid, has been updated. Land runoff with depth << 0 is possible and occurs when land-source runoff enters the subglacial system and discharges subglacially at depth from a marine-terminating glacier. This often occurs when runoff is sourced from nunataks. This scenario also generates land outlets that exist outside of land basins.
Various input data products have been upgraded. BedMachine has been upgraded from v3 to v5. MAR has been upgraded from version 3.11 to 3.12, and the simulation period has been extended from 1979 through September 2019, to 1950 through December 2022. The RACMO simulation period has been extended from 1958 through September 2019 to 1958 through December 2022. Prior to 1990, the RACMO data is unchanged (6-hourly ERA-Interim forcing). From 1990 onward, we now use 3-hourly ERA5 forcing. Both the MAR and RACMO domain boundaries have been corrected, fixing a small alignment error in the initial version.
Additional metadata now includes the citet:mouginot_2019_data basins and regions and the citet:zwally_2012_data sector nearest to each outlet, and the shortest distance between the outlet and the basin, region, or sector boundary. This should be used with caution - some peripheral ice cap outlets may be assigned to the ice sheet, or land-terminating outlets may be nearest to an ice basin that does not overlap with the hydrological basin. The distance can be used as a filter. We also include the nearest named glacier from citet:bjork_2015, and the distance to the citet:bjork_2015 point. We also include the nearest citet:mankoff_2020_solid gate and the distance to that gate. Because citet:mankoff_2020_solid is a continually updating product, and gate locations and IDs may change in the future, we note here that we use gates V3 (file: https://doi.org/10.22008/promice/data/ice_discharge/gates/v02/GSEWLR) that are associated with discharge V55. The gate IDs used here are also valid with many previous ice discharge versions and are likely to be valid with many following ice discharge versions.
\introduction
Over the past decades, liquid runoff from Greenland has increased citep:mernild_2012,bamber_2018_freshwater,trusel_2018,perner_2019 contributing to mass decrease citep:sasgen_2020. When that runoff leaves the ice sheet and discharges into fjords and coastal seas, it influences a wide range of physical citep:straneo_2011,an_2012,mortensen_2013,bendtsen_2015,cowton_2015,mankoff_2016,fried_2019,cowton_2019,beckmann_2019, chemical citep:kanna_2018,balmonte_2019, and biological citep:kamenos_2012,kanna_2018,balmonte_2019 systems citep:catania_2020. The scales of the impacts range from instantaneous at the ice–ocean boundary to decadal in the distal ocean citep:gillard_2016. The influence of freshwater on multiple domains and disciplines citep:catania_2020 is the reason several past studies have estimated runoff and discharge at various temporal and spatial scales (e.g., citet:mernild_2008,mernild_2009,mernild_2010,langen_2015,ahlstrom_2017,citterio_2017,van-as_2018,bamber_2018_freshwater,perner_2019,slater_2019).
To date no product provides discharge estimates at high spatial resolution (~100 m; resolving individual streams), daily temporal resolution, for all of Greenland, covering a broad time span (1950 through December 2022), from multiple regional climate models (RCMs), and with simple database access software to support downstream users. Here we present these data. In the following description and methods, we document the inputs, assumptions, methodologies, and results we use to estimate Greenland discharge from 1950 through December 2022.
Freshwater discharge from Greenland primarily takes three forms: solid ice from calving at marine-terminating glaciers; submarine meltwater from ice-ocean boundary melting at marine-terminating glaciers; and liquid runoff from melted inland surface ice, rain, and condensation. A recent paper by citet:mankoff_2020_ice targets the solid ice discharge plus submarine melt budget by estimating the ice flow rate across gates 5 km upstream from all fast-flowing marine-terminating glaciers in Greenland. Complementing that paper, this paper targets Greenland’s point-source liquid water discharge budget by partitioning RCM runoff estimates to all ice margin and coastal outlets. The sum of these data and citet:mankoff_2020_ice is an estimate of the majority of freshwater (in both liquid and solid form) volume flow rates into Greenland fjords. Those two terms comprise the bulk but not all freshwater - they exclude precipitation directly onto the fjord or ocean surface, as well as relatively minor contributions from evaporation and condensation, sea ice formation and melt, or subglacial basal melting.
The static products (streams, outlets, and basins (Fig. fig:overview)) are derived from an ice sheet surface digital elevation model (DEM), an ice sheet bed DEM, an ice sheet mask, the land surface DEM, and an ocean mask. For the surface DEM, we use ArcticDEM v7 100 m citep:ArcticDEM. Subglacial routing uses ArcticDEM and ice thickness from BedMachine v5 citep:NSIDC_BedMachine_GL_v5,morlighem_2017. Both DEMs are referenced to the WGS84 ellipsoid. For the ice mask we use the Programme for Monitoring of the Greenland Ice Sheet (PROMICE) ice extent citep:citterio_2013. For the ocean mask we use the Making Earth System Data Records for Use in Research Environments (MEaSUREs) Greenland Ice Mapping Project (GIMP) Land Ice and Ocean Classification Mask, Version 1 citep:NSIDC_0714,howat_2014.
The time series product (daily discharge) is derived from daily runoff estimates from RCM calculations over the land and ice areas of Greenland. We use the Modèle Atmosphérique Régional (MAR; citet:fettweis_2017) and the Regional Atmospheric Climate Model (RACMO; citet:noel_2019). Runoff, \(R\), is defined by
\begin{equation} \label{eq:runoff} R = ME + RA - RT - RF. \end{equation}
In Eq. \ref{eq:runoff}, \(ME\) is melt, \(RA\) is rainfall, \(RT\) is retention, and \(RF\) is refreezing. In RACMO, retention occurs only when firn is present (not with bare ice); MAR does include a delay for bare-ice runoff. Neither model delays land runoff. Both RCM outputs were provided regridded to the same 1 km grid using an offline statistical downscaling technique based on the local vertical runoff gradient applied to the subgrid topography citep:noel_2016,fettweis_2020. MAR (v 3.13; citet:delhasse_2020) ran with 7.5 km resolution and 6-hourly ERA5 forcing. RACMO (v 2.3p2; citet:noel_2018,noel_2019) ran with 5.5 km resolution and 6-hourly ERA-Interim forcing. Runoff is assigned an uncertainty of ±15 % (Sect. \ref{sec:uncertain:RCM}).
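As a minimal worked example of Eq. \ref{eq:runoff}, the sketch below applies the runoff definition cell-wise to hypothetical daily component grids; the array names and values are illustrative placeholders, not RCM output.

#+begin_src python :eval no
# Minimal sketch: apply the runoff equation cell-wise to hypothetical daily
# component grids; all values are illustrative, in mm w.e. per day.
import numpy as np

ME = np.array([[12.0, 3.5], [0.0, 7.2]])   # melt
RA = np.array([[ 1.0, 0.0], [0.5, 0.0]])   # rainfall
RT = np.array([[ 2.0, 1.5], [0.0, 0.7]])   # retention
RF = np.array([[ 4.0, 1.0], [0.0, 2.5]])   # refreezing

R = ME + RA - RT - RF                      # runoff
R_lo, R_hi = R * 0.85, R * 1.15            # the +/- 15 % uncertainty used later
print(R)
#+end_src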
We use 10 river discharge daily time series to validate the results of this work. The name, location, time coverage, and relevant data and scientific publications associated with each of these observational data are listed in Table tbl_obs.
#+NAME: tbl_obs
| Location              | Lon (°W) | Lat (°N) | Time      | Data                     | Publication         | Fig(s).                                |
|-----------------------+----------+----------+-----------+--------------------------+---------------------+----------------------------------------|
| Kiattuut Sermiat      |    45.33 |    61.21 | 2013      | citet:hawkings_2016_data | citet:hawkings_2016 | 1 3 4 5 \ref{fig:Ks}                   |
| Kingigtorssuaq (Nuuk) |  51.5801 |  64.1387 | 2008-2018 | citet:GEM_data           |                     | 1 3 4 \ref{fig:K}                      |
| Kobbefjord (Nuuk)     |  51.3810 |  64.1336 | 2006-2017 | citet:GEM_data           |                     | 1 3 4 \ref{fig:Kb}                     |
| Leverett Glacier      |    50.17 |    67.06 | 2009-2012 | citet:tedstone_2017      | citet:hawkings_2015 | 1 3 4 5 \ref{fig:L}                    |
| Oriartorfik (Nuuk)    |  51.4066 |  64.1707 | 2007-2018 | citet:GEM_data           |                     | 1 3 4 \ref{fig:O}                      |
| Qaanaaq               |  69.3030 |  77.4753 | 2017-2018 | citet:qaanaaq_data       | citet:sugiyama_2014 | 1 3 4 5 \ref{fig:Q}                    |
| Røde Elv (Disko)      |  53.4989 |  69.2534 | 2017      | citet:GEM_data           |                     | 1 3 4 5 6 \ref{fig:R}                  |
| Teqinngalip (Nuuk)    |  51.5484 |  64.1586 | 2007-2018 | citet:GEM_data           |                     | 1 3 4 \ref{fig:T}                      |
| Watson River          |    50.68 |    67.01 | 2006-2019 | citet:van-as_2018        | citet:van-as_2018   | 1 3 4 5 \ref{fig:W} \ref{fig:W_adjust} |
| Zackenberg            |  20.5628 |  74.4722 | 1996-2018 | citet:GEM_data           |                     | 1 3 4 5 \ref{fig:Z}                    |
Extract the longitude, latitude, and abbreviation from tbl_obs and import into GRASS for use for the analysis elsewhere in this document.
- Extract the abbreviation for each location
- Convert (lon, lat) to EPSG:3413 (x, y)
# "tbl" is an associative array of tbl_obs rows keyed by the Location column
# (passed in via the org-babel :var header).
for key in "${!tbl[@]}"; do
  row=(${tbl[$key]})
  IFS=" " read lon lat time data pub fig <<< ${row[@]}
  # (lat, -lon) -> EPSG:3413 (x, y); cs2cs output fields are joined with ":"
  IFS=":" read x y z <<< $(echo "${lat} -${lon}" | cs2cs EPSG:4326 EPSG:3413 -f "%.0f" | sed 's/[[:space:]]/:/g')
  echo ${key/\ */} $x $y   # first word of the location name, then x, y
done | sort
Note - to solve the jupyter-python issue with =:cache= results:
- Manually edit this table
- Add =#+NAME: tbl_obs_xy_cache=
- Reduce the 1st column to a 1-letter abbreviation, except
  - Kobbefjord -> Kb
  - Kiattuut Sermiat -> Ks
- Adjust Leverett to “-216646 | -2504583”
- Then access via the table name and not the babel code block name.
#+NAME: tbl_obs_xy_cache
| Ks | -18335  | -3183360 |
| K  | -326372 | -2829354 |
| Kb | -316602 | -2831048 |
| L  | -216646 | -2504583 |
| O  | -317396 | -2826710 |
| Q  | -560538 | -1241281 |
| R  | -335678 | -2246371 |
| T  | -324548 | -2827284 |
| W  | -249713 | -2510668 |
| Z  | 699990  | -1540459 |
We use the following terminology throughout the document:
- Runoff refers to the unmodified RCM data products – melted ice, rain, condensation, and evaporation – that comprise the RCM runoff output variable.
- Discharge refers to the runoff after it has been processed by this work - routed to and aggregated at the outlets. Depending on context, discharge may also refer to the observed stream discharge (Table \ref{tbl_obs}).
- Basins refer to the 100 m x 100 m gridded basins derived from a combination of the ArcticDEM product and the mask.
- Mask refers to the surface classification on that 100 m x 100 m grid and is one of ice, land, or ocean (also called fjord or water). When referring to the surface classification in the RCM, we explicitly state “RCM mask”.
- MAR and RACMO refer to the RCMs, but when comparing discharge estimates between them or to observations, we use MAR and RACMO to refer to our discharge product derived from the MAR and RACMO RCM runoff variables rather than repeatedly explicitly stating “discharge derived from [MAR|RACMO] runoff”. The use should be clear from context.
Streams are calculated from the hydraulic head \(h\) which is the DEM surface for land surface routing, or the subglacial pressure head elevation for subglacial routing. \(h\) is defined as
\begin{equation} \label{eq:head} h = z_b + k \frac{ρ_i}{ρ_w} (z_s - z_b), \end{equation}
with \(z_b\) the ice-free land surface and basal topography, \(k\) the flotation fraction, \(ρ_i\) the density of ice (917 kg m-3), \(ρ_w\) the density of water (1000 kg m-3), and \(z_s\) the land surface for both ice-free and ice-covered surfaces.
Eq. eq:head comes from citet:shreve_1972, where the hydropotential has units of pascals (Pa), but here it is divided by gravitational acceleration \(g\) times the density of water \(ρ_w\) to convert the units from pascals to meters (Pa to m). We compute \(h\) and from that streams, outlets, basins, and runoff for a range of subglacial pressures, implemented as a range of \(k\) values: 0.8, 0.9, and 1.0. We use these three scenarios to estimate sensitivity of the outlet location for all upstream cells, but otherwise only use results from the \(k = 1.0\) scenario. Eq. eq:head makes the assumption that when ice is present all water routes subglacially, meaning that water flows from the surface to the bed in the grid cell where it is generated. In reality, internal catchments and moulins likely drain waters to the bed within a few kilometers of their source citep:yang_2016_internally. The difference between some supraglacial flow and immediate subglacial flow is not likely to impact results because discharge is reported only at the outlet locations.
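The sketch below shows Eq. eq:head applied over the three \(k\) scenarios, assuming the surface and bed DEMs are available as NumPy arrays on the common 100 m grid; the array names and values here are placeholders, not the production workflow.

#+begin_src python :eval no
# Sketch of the hydraulic head (Eq. eq:head); z_s, z_b and the values below
# are placeholders, not the ArcticDEM/BedMachine grids.
import numpy as np

rho_i, rho_w = 917.0, 1000.0               # ice and water density (kg m-3)

def hydraulic_head(z_s, z_b, k=1.0):
    """h = z_b + k (rho_i/rho_w) (z_s - z_b); where ice-free, z_s == z_b so h == z_b."""
    return z_b + k * (rho_i / rho_w) * (z_s - z_b)

z_s = np.array([[1200.0, 800.0], [500.0, 300.0]])   # surface elevation (m)
z_b = np.array([[ 400.0, 350.0], [200.0, 300.0]])   # bed / land elevation (m)

heads = {k: hydraulic_head(z_s, z_b, k) for k in (0.8, 0.9, 1.0)}
#+end_src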
We use the GRASS GIS software citep:neteler_2012,GRASS_GIS_software and the =r.stream.extract= command configured for single-flow direction from eight neighbors (SFD-8) to calculate streams and outlets at the ice edge and coast. Streams are defined only if their upstream contributing area is above a threshold (> 3 km2), so small basins may have outlets but no streams. The software fills all sinks so that all water flows to the domain edge. We then use the =r.stream.basins= tool citep:jasiewicz_2011 to calculate basins upstream from each outlet. Basins < 1 km2 are absorbed into their largest neighbor and the associated outlets are dropped.
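A minimal sketch of this GRASS step is shown below, assuming a GRASS session is already open with the 100 m region set and the hydraulic head available as a raster named =head=. The raster names, the 300-cell threshold (3 km2 of 100 m cells), and the use of the =-l= flag to keep only terminal-segment basins are illustrative choices, not necessarily the exact invocation used to produce the released data.

#+begin_src python :eval no
# Sketch only: stream, outlet, and basin extraction in GRASS, assuming an open
# GRASS session; names, threshold, and flags are illustrative.
import grass.script as gs

# SFD-8 streams and flow directions from the hydraulic head
gs.run_command("r.stream.extract", elevation="head",
               threshold=300,                # ~3 km2 contributing area at 100 m
               stream_raster="streams", stream_vector="streams",
               direction="dir")

# Basins upstream of each terminal stream segment (outlet)
gs.run_command("r.stream.basins", flags="l", direction="dir",
               stream_rast="streams", basins="basins")
#+end_src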
The three choices of \(k\) generate three scenarios of basins and outlets, and we use these to show the sensitivity of every ice grid cell to these choices. Across the three \(k\) scenarios, each cell has three possible outlets, where each outlet is an (x, y) coordinate. To show results in a map view, we reduce these six values (three 2D coordinates) to a single value. For every grid cell in the ice domain we compute the distance between each pair of outlets (six values become three), and we then select the maximum (three become one). Fig. fig:k_basin_change displays this maximum distance - a worst-case scenario of how far the outlet of every inland ice cell may move due to basal routing assumptions.
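A sketch of this reduction, assuming the three outlet coordinate sets are available as (n_cells, 2) NumPy arrays (random placeholder values):

#+begin_src python :eval no
# Sketch of the basin-change metric: maximum pairwise distance between the
# three outlets (one per k scenario) of each ice cell.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
outlets = {k: rng.uniform(0, 1e5, (5, 2)) for k in (0.8, 0.9, 1.0)}  # (x, y) per cell

# three pairwise distances per cell ("six become three") ...
dists = [np.hypot(*(outlets[a] - outlets[b]).T) for a, b in combinations(outlets, 2)]
# ... then the maximum of the three ("three become one")
max_move = np.max(dists, axis=0)
#+end_src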
RCM runoff is summed over each basin for each day of RCM data and assigned to each outlet for that day. This assumes routing between the runoff and the outlet is instantaneous, so all analyses done here include a 7 d smooth applied to the RCM discharge product (cf. citet:van-as_2017). The released data do not include any smoothing.
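For reference, a minimal sketch of the 7 d smoothing applied in the analyses (the released data are not smoothed); the daily series is synthetic, and whether the window is centered is an illustrative choice.

#+begin_src python :eval no
# Sketch of the 7 d smoothing on a synthetic daily discharge series.
import numpy as np
import pandas as pd

t = pd.date_range("2012-01-01", periods=365, freq="D")
discharge = pd.Series(np.random.default_rng(1).gamma(2.0, 5.0, t.size), index=t)  # m3 s-1

smoothed = discharge.rolling(window=7, center=True, min_periods=1).mean()
#+end_src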
The RCM grid and the basins do not always align. Each 100 m x 100 m ArcticDEM pixel is classified as ice citep:citterio_2013, ocean citep:NSIDC_0714, or land (defined as neither ice nor ocean). However, the classification of the mask cells and of the 1 km2 RCM grid cells do not always agree - for example, when a mask cell is classified as ice but the matching RCM cell is land. This disagreement occurs almost everywhere along the ice margin because the 1 km RCM boundary and the 100 m mask boundary rarely align perfectly. The ice margin is where most runoff occurs per unit area, due to the highest temperatures occurring at the lowest ice elevations, so small changes in masks in these locations can introduce large changes in RCM outputs.
We adjust for this imprecise overlap and scale the RCM results to the basin area. Where the mask reports ice and an RCM reports land, the RCM land runoff fraction is discarded, and the RCM ice runoff fraction over this basin is adjusted for the uncovered basin cells (and vice versa for basin land and RCM ice). Small basins with no RCM coverage of the same type have no runoff.
Runoff adjustments using this method are underestimated for large basins with large inland high-elevation regions with low runoff, because this method fills in misaligned cells with each day’s average discharge, but the misalignment (missing runoff) occurs at the ice sheet edge where maximum runoff occurs. However, given that the basin is large, misalignment is proportionally small, and therefore errors are proportionally small. Conversely, when misalignment is proportionally large (e.g., a basin is only 1 % covered by the same RCM classification), this implies a small basin. Because the basin is small, the covered region (no matter how much smaller) must be nearby and not climatically different.
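A sketch of this coverage adjustment for a single basin and day, with illustrative numbers:

#+begin_src python :eval no
# Sketch of the coverage adjustment for one basin on one day; numbers are
# illustrative. RCM ice runoff summed over the overlapping RCM ice cells is
# scaled up to the full basin ice area.
basin_ice_area   = 250.0e6   # m2 of basin cells classified as ice (100 m mask)
covered_ice_area = 235.0e6   # m2 of that area overlapped by RCM ice cells
covered_runoff   = 1.2e6     # m3 per day summed from the overlapping RCM ice cells

if covered_ice_area > 0:
    # fill uncovered basin cells with the day's mean covered runoff rate
    basin_runoff = covered_runoff * basin_ice_area / covered_ice_area
else:
    basin_runoff = 0.0       # small basins with no same-type RCM coverage get no runoff
#+end_src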
RCM inputs are also scaled to adjust for the EPSG:3413 non-equal-area projection. This error is up to 8 % for some grid cells, ranges from -6 % to +8 % over Greenland, and the cumulative error for the entire ice sheet is < 8 %.
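One way to obtain the areal distortion of EPSG:3413 at a given cell is via the projection scale factors. The sketch below uses pyproj (an assumption; not necessarily the tool used in the production code) and illustrative coordinates; the areal scale factor is the ratio of projected to true area, so the true cell area is the nominal 1 km2 divided by this factor.

#+begin_src python :eval no
# Sketch: areal scale factor of EPSG:3413 at a few latitudes using pyproj.
from pyproj import Proj

p3413 = Proj("EPSG:3413")
for lat in (60.0, 70.0, 83.0):
    f = p3413.get_factors(longitude=-45.0, latitude=lat)
    print(f"{lat:4.0f} N  areal scale {f.areal_scale:.3f}  "
          f"area correction {1 / f.areal_scale:.3f}")
#+end_src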
We validate the modeled outlet discharge against the observations first in bulk and then individually. Bulk comparisons are done with scatter plots (Figs. \ref{fig:scatter_daily} & \ref{fig:scatter_yearsum}) and modified Tukey plots comparing observations vs. the ratio of the RCMs to observations (Fig. \ref{fig:tukey}, based on Tukey mean-difference plots, also known as Bland–Altman plots citep:altman_1983,martin_1986).
We introduce the graphics here as part of the methods to reduce replication in figure captions - we show 10 nearly identical graphics (Figs. \ref{fig:W} and \ref{fig:L} through \ref{fig:Q}) for 10 different observation locations, and each graphic uses the same template of six panels.
For each figure (Figs. \ref{fig:W} and \ref{fig:L} to \ref{fig:Q}), the top panel (a) shows a satellite basemap with the land portion of the basin of interest (if it exists) outlined in dark green, the streams within that basin in light green, the basin outlet as an orange filled diamond, and the stream gauge location as an orange unfilled diamond. Ice basin(s) that drain to the outlet are outlined in thick dark blue if they exist, and all other ice basins are outlined in thin dark blue. Both MAR and RACMO use the same domains. The RCM ice domain is in light blue, and the RCM land domain is not shown, but is outside the light blue ice domain (not including the water). The scale of each map varies, but the basin lines (green and dark blue) are on a 100 m grid, and the RCM grid cells (light blue) are on a 1 km grid.
Panel b shows an example time series - whatever data are available for the last calendar year of the observations.
Panel c shows a scatter plot of observations vs. RCM-derived discharge. This is the same data shown in Fig. \ref{fig:scatter_daily} but subset to just the basin of interest. Color encodes day of year, and a kernel density estimation (KDE) of the discharge highlights where most points occur - not necessarily visible without the KDE because the points overlap (total number of plotted points is printed on the graphic near “n:”). The r2 correlation coefficient for each RCM-derived discharge is displayed. The gray band shows the 5 to 95 % prediction interval, and the three solid lines mark the 1:1, 1:5, and 5:1 ratios.
Panel d shows observations vs. the ratio of the RCM to the observations. This is the same data shown in Fig. \ref{fig:tukey}, but subset to just the basin of interest. Color denotes sample density (similar to the KDE in panel c). The horizontal lines mark the mean and the 0.05 and 0.95 quantiles of the ratio between the RCM and the observations. A value of 1 (or 100 %) indicates agreement between observations and the RCM, and a value of 2 or 0.5 is a factor of 2, or a +100/-50 % disagreement. The horizontal split marks the bottom one-third and top two-thirds quantiles of discharge.
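A sketch of the panel (d) statistics - the mean and 5 to 95 % quantiles of the RCM-to-observation ratio, split at the bottom one-third of observed discharge - using synthetic placeholder data:

#+begin_src python :eval no
# Sketch of the ratio statistics on synthetic data; column names are placeholders.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({"obs": rng.gamma(2.0, 20.0, 1000)})
df["rcm"] = df["obs"] * rng.lognormal(0.0, 0.5, 1000)
df["ratio"] = df["rcm"] / df["obs"]

split = df["obs"].quantile(1 / 3)            # bottom one-third vs. top two-thirds
for is_top, part in df.groupby(df["obs"] > split):
    q05, q95 = part["ratio"].quantile([0.05, 0.95])
    label = "top 2/3" if is_top else "bottom 1/3"
    print(f"{label}: mean ratio {part['ratio'].mean():.2f}, 5-95 % {q05:.2f}-{q95:.2f}")
#+end_src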
Results of this work include (1) ice-margin-terminating streams, outlets, and basins; (2) coast-terminating streams, outlets, and basins; (3) discharge at the ice marginal outlets from ice runoff; and (4) discharge at the coastal outlets from land runoff. Discharge products are provided from both the MAR and RACMO RCMs. We note that our subglacial streams represent where the model routes the water and do not indicate actual streams, unlike the land streams that do appear near actual streams when compared to satellite imagery. Even so, these streams routed using simple subglacial theory show remarkable alignment with ice surface streams and lakes visible in satellite imagery. This may support the theory that basal topography exerts a strong control on supraglacial hydrology citep:lampkin_2011,sergienko_2013_glaciological,crozier_2018, or may indicate a poorly represented and smooth bed in BedMachine, and therefore Eq. eq:head is effectively applying surface routing in these locations.
Of the 361,950 km2 of basin land cells, the RCMs cover 339,749 km2 (~94 %) with their land grid cells, and 22,201 km2 (~6 %) of basin grid cells are filled in with our coverage algorithm (Sect. \ref{sec:methods:discharge}; the RCMs have these as ice or ocean). A total of 51,532 km2 of RCM land is discarded because the basins classify part or all of these cells as ice or ocean. Of the 1,781,816 km2 of basin ice cells, the RCMs cover 1,760,912 km2 (~99 %) with their ice cells, and 20,904 km2 (~1 %) of basin grid cells are filled in (the RCMs have these as land or ocean). A total of 21,793 km2 of RCM ice is discarded because the basins classify part or all of these cells as land or ocean (table and data available at https://github.com/GEUS-Glaciology-and-Climate/freshwater citep:github_freshwater).
Our coverage correction (Sect. \ref{sec:methods:discharge}) adjusts RCM ice runoff values by ~3 %. Discarding RCM ice runoff that does not match the underlying mask ice cells results in a 5 % reduction in discharge. However, applying our coverage algorithm to adjust RCM inputs for regions where basins have ice but the RCMs do not results in an 8 % increase from the reduced discharge (net gain of ~3 %). A similar adjustment occurs for RCM land runoff.
Our static products - streams, outlets, and basins - have been previously estimated. citet:lewis_2009 identified 293 distinct hydrologic ice basins and provided a data set of ice basins and ice margin outlets. Our work, a decade later, has ~2 orders of magnitude more basins and outlets because of the higher resolution of the input data, and includes additional data. We provide ice basins, ice margin outlets, ice streams with metadata, land basins, coastal outlets, and land streams with metadata. citet:lewis_2009 generated basins from a 5 km DEM, compared to the 100 m DEM used here. Routing with a 5 km DEM that does not capture small-scale topography is likely to cause some basins and outlets to drain into an incorrect fjord - we find that some land basins delineated with even the 150 m BedMachine land surface may drain into the incorrect fjord, but we did not find similar errors with the 100 m ArcticDEM product used in this work.
Similar products to our time series (discharge) also exist. The most recent of these is from citet:bamber_2018_freshwater, who provide a data product at lower spatial resolution (5 km), lower temporal resolution (monthly), and only coastal discharge, not coastal basins, ice basins, or ice margin outlets and discharge. However, citet:bamber_2018_freshwater surpasses our product in that its spatial coverage includes a larger portion of the Arctic, including Iceland, Svalbard, and Arctic Canada. Furthermore, by providing data at 5 km spatial and monthly temporal resolution, citet:bamber_2018_freshwater implements the main strategy suggested here to increase the signal-to-noise ratio of the data - averaging discharge in space or time (see Sect. \ref{sec:mitigation}).
We show both the geospatial and temporal differences between this product and citet:bamber_2018_freshwater for an example location - Disko Island (Fig. fig:bamber_2018). Spatially our product allows assessment of discharge at interior locations, necessary when comparing with observations that are not at the coast (for example, the Leverett Glacier observations, Fig. fig:L). Temporally, the MAR and RACMO runoff summed over all of Disko Island and to monthly resolution is similar to the monthly Disko Island discharge of citet:bamber_2018_freshwater, but the daily resolution shows increased variability and individual discharge events (from warm days or rain) not seen in the monthly view.
A similar GIS workflow was presented by citet:pitcher_2016, but focusing only on the discharge uncertainty from basal routing assumptions (the \(k\) parameter in Eq. \ref{eq:head}). We find these differences to be smaller than the differences between RCMs or between RCMs and observations (see Sect. \ref{sec:uncertainty}).
\label{sec:validation_obs}
Here we compare our results to all publicly accessible observations we could find or those willing to become open and publicly accessible as part of this work (Table tbl_obs).
This validation compares observations with discharge at stream gauges derived from RCM runoff estimates, much of it coming from far inland on the ice sheet. Disagreement is expected and does not indicate any specific issues in the RCMs but is instead likely due to our routing algorithm (Sect. \ref{sec:methods:discharge}).
Below we discuss first the validation for all discharge estimates together and then the individual outlets. For the individual outlets we begin by focusing on the problematic results in order of severity – Watson River (Figs. \ref{fig:W} & \ref{fig:W_adjust}), Leverett Glacier (Fig. \ref{fig:L}), and Kiattuut Sermiat (Fig. \ref{fig:Ks}) – and show that for two of these three, simple solutions are available, although manual intervention is needed to detect the issue and then adjust results.
A comparison of every day of observational data with discharge > 0 (15,778 days) and the two RCMs (Fig. fig:scatter_daily) shows good agreement with r2 of 0.45 and 0.88 for discharge derived from MAR and RACMO runoff respectively (hereafter “MAR” and “RACMO”). This comparison covers more than 4 orders of magnitude of modeled and observed discharge. The RACMO vs. observed discharge is within a factor of 5 (e.g., plus or minus half an order of magnitude), although both RCMs report only ~50 % of the observed discharge for the largest volumes at the Watson River outlet (Fig. \ref{fig:W}). The reason for the disagreement at the Watson River outlet is discussed in detail in Sect. \ref{sec:W}.
The four near-Nuuk GEM basins (Table \ref{tbl_obs}, Sect. \ref{sec:K}) have ice basins but either no or limited coverage in the RCMs. When excluding these basins from the comparison, the r2 agreement changes to 0.59 and 0.78 for MAR and RACMO respectively, and the 5 to 95 % prediction interval is significantly smaller for MAR (red band in Fig. \ref{fig:scatter_daily}). The largest disagreements throughout this work come from these small basins with no RCM coverage. These disagreements are therefore indicative of differences between the land/ice classification mask used by the RCMs and the basin masks used here, and not necessarily of an insufficient ability of the models to simulate melting (near-surface) climate conditions.
Fig. fig:scatter_yearsum shows a similar view to Fig. fig:scatter_daily, but here each observational data set and associated daily discharge is summed by year for the days in that year when observations exist (hence units m3 and not m3 yr-1; for example, four “L” means there are four calendar years with some observations at the Leverett outlet). Here it is more clear that the Watson River outlet (Sect. \ref{sec:W}) reports ~50 % of the observed discharge, the Kiattuut Sermiat outlet (Sect. \ref{sec:Ks}) overestimates discharge, and the remainder fall within the factor-of-2 lines, except for low discharge at Kingigtorssuaq in MAR, where the RCMs do not cover that small glacier (Sect. \ref{sec:K}).
\label{sec:method:tukey}
Because discharge spans a wide range (~4 orders of magnitude, Fig. \ref{fig:scatter_daily}), a high correlation (r2 of 0.88, Fig. \ref{fig:scatter_daily}) may be due primarily to the range, which is larger than the error citep:altman_1983,martin_1986. Fig. \ref{fig:tukey} compensates for this by comparing the observations with the ratio of the RCM to the observations. This graphic again excludes the four near-Nuuk GEM basins. From Fig. \ref{fig:tukey}, the top two-thirds of observed discharge has modeled discharge underestimated by a factor of 0.78 (MAR) and 0.73 (RACMO), with a 5 to 95 % quantile range of 0.30 to 2.08. The top two-thirds of observed discharge spans ~2 orders of magnitude (width of horizontal lines, from ~10^1 to ~10^3 m3 s-1). The ratio of the RCMs to these observations for the top two-thirds has a 5 to 95 % quantile range of ~1 order of magnitude (distance between horizontal lines; log10(2.08) - log10(0.3) = 0.84). The 5 to 95 % quantile range of the ratio between the RCMs and the observations is therefore half the range of the observations. Put differently, days with high observed discharge may have modeled discharge within ±0.5 order of magnitude, or plus or minus a factor of 5, or +500/-80 %. The modeled discharge is not likely to move farther than this from the observations, and high discharge remains high.
The bottom third of discharge is where the largest disagreement occurs. The mean model values are near the observed - the mean ratio of RCM to observed discharge is 0.67 for MAR (~33 % low) and 1.08 for RACMO (~8 % high) - but the 5 to 95 % quantile range of the ratio between RCM and observations is large. Although large uncertainties for low discharge may not seem to matter for some uses (e.g., estimates of total discharge from Greenland, which is dominated by the largest quantities of discharge), they may matter for other uses. The bottom one-third quantile of observed discharge spans 3 orders of magnitude (10^-2 to ~10^1 m3 s-1), but the uncertainty of the RCM-to-observations ratio spans ~4 and ~2 orders of magnitude for MAR and RACMO respectively (~10^-3 to ~2.2x10^1 for MAR; ~10^-1 to ~2.2x10^1 for RACMO).
| abbrev | total (km2) | land (km2) | land %  | ice (km2) | ice %   |
|--------+-------------+------------+---------+-----------+---------|
| Ks     |      693.33 |     391.31 | 56.4392 |    302.02 | 43.5608 |
| K      |        7.56 |       5.64 | 74.6032 |      1.92 | 25.3968 |
| Kb     |       37.52 |      28.71 | 76.5192 |      8.81 | 23.4808 |
| L      |     1360.88 |          0 |       0 |   1360.88 |     100 |
| O      |       16.37 |      14.39 | 87.9047 |      1.98 | 12.0953 |
| Q      |       13.22 |        2.2 | 16.6415 |     11.02 | 83.3585 |
| R      |      100.12 |      72.39 | 72.3032 |     27.73 | 27.6968 |
| T      |       24.85 |      18.89 | 76.0161 |      5.96 | 23.9839 |
| W      |     1881.79 |     520.91 | 27.6816 |   1360.88 | 72.3184 |
| Z      |      487.02 |     377.76 | 77.5656 |    109.26 | 22.4344 |
The Watson River discharge basin area is 1882 km2, of which 521 km2 (28 %) is land and 1361 km2 (72 %) is ice (Fig. fig:Wa). The partial (last calendar year) discharge time series shows MAR and RACMO agree well with each other but have a maximum of 500 m3 s-1, while observations are up to 4x more (Fig. fig:Wb). Low discharge (both early and late season) is overestimated and high discharge is underestimated, approximately equally for both RCMs (Fig. fig:Wc). The low-discharge overestimate ranges from a mean multiple of 1.68 (MAR) and 1.57 (RACMO) to a 95 % quantile ratio of ~70 (MAR) and ~52 (RACMO). The high-discharge underestimate has a mean multiple of ~0.5 for both MAR and RACMO and a 5 to 95 % quantile range of 0.23 to 1.09.
The Watson River discharge presented here is approximately half of the citet:van-as_2018 discharge for high discharge. The large underestimate for high discharge may be due to either errors in the basin delineation used in this study, errors in the stage–discharge relationship used by citet:van-as_2018, errors in the RCM runoff estimates, or a combination of the above three. All three of these error sources increase with high discharge (and associated melt): basin delineation becomes less certain with inland distance from the ice sheet margin. The river stage–discharge conversion becomes less certain at high stage levels. Runoff calculations become less certain from a snow surface than an ice surface because of, e.g., snow density, subsurface refreezing, and surface darkening.
The complexity of estimating the area of the Watson River catchment is described by citet:monteban_2020, who note that previous studies have used values ranging from 6131 km2 citep:mernild_2010_runoff to 12547 km2 citep:van-as_2012. Our basin is smaller than the basin used in citet:van-as_2018 and similar to that of citet:mernild_2018, who attributed the difference between their modeled outflow and the observations from citet:van-as_2017 to their decision to use surface rather than subglacial routing, and applied a correction term. We find that our basin does not include a separate basin to the south that is part of the Watson River ice basin in citet:van-as_2018 (from citet:lindback_2015 and citet:lindback_2014). We are able to recreate the citet:van-as_2018 basin, but only when using the citet:lindback_2014 bed and the citet:bamber_2013 surface. When using any other combination of bed DEM, surface DEM, or \(k\) values, we are unable to match the citet:lindback_2015 basin. Instead, all our basins resemble the basin shown in Fig. fig:W. To solve this, we manually select two large ice basins to the south of the Watson River ice basin. Modeled and observed discharge agree after including these two basins (Fig. fig:W_adjust), suggesting that basin delineation, not the stage–discharge relationship or the RCM runoff, is the primary cause of this disagreement. Furthermore, it is the additional width at lower elevation from the larger basin, not the increased inland high-elevation area, that likely contributes the runoff needed to match the observations, because 85 % of all surface runoff occurs below 1350 m, and almost all below 1850 m citep:van-as_2017.
At the Watson River outlet, nothing in this product alone suggests that it underestimates observed discharge by 50 %; the observations are needed to highlight the disagreement. Once this disagreement is apparent, it is also not clear how to reduce it without the previous efforts by citet:lindback_2015 and citet:lindback_2014. Basin delineation is discussed in more detail in the uncertainty section (Sect. \ref{sec:uncertain:basin}). The other two problematic areas highlighted above (Sect. \ref{sec:validation_obs}) can be detected and improved without observational support.
The Leverett Glacier basin area is 1361 km2 and 100 % ice (Fig. fig:La). The partial (last calendar year) discharge time series shows MAR and RACMO agree well with each other and with the observations (Fig. fig:Lb), with no seasonal dependence (Fig. fig:Lc). The 5 to 95 % prediction interval for MAR is generally within the 1:5 and 5:1 bands, with a larger spread for RACMO (Fig. fig:Lc). High model discharge is 3 % higher than observed (MAR) or 25 % higher than observed (RACMO), and the 5 to 95 % quantile range of the ratio is 0.73 to 1.62 (MAR) and 0.83 to 2.02 (RACMO). Low model discharge is also centered near the observations, but as always larger errors exist for low discharge (Fig. fig:Ld).
This basin is problematic because the basin feeding the outlet is small (< 5 km2), but even without the observational record, satellite imagery shows a large river discharging from the ice sheet here. Meanwhile, a large (hundreds of km2) ice basin does discharge just a few hundred meters away, but not upstream of this gauge location. We therefore adjust the gauge location onto the ice (equivalent to selecting a different outlet) so that our database access software selects what appears to be the correct basin given the size of the stream in the satellite imagery (Fig. fig:L).
The plots shown here use the adjusted gauge location and modeled discharge appears to match the observed discharge. When plotting (not shown) the modeled discharge for the outlet just upstream of the true gauge location, results are clearly incorrect. This issue - small basins at the margin and incorrect outlet location - is persistent throughout this product and discussed in more detail in Sect. \ref{sec:uncertain:basin}.
The Leverett Glacier basin is a subset of the Watson River outlet basin (Sect. \ref{sec:W}). The strong agreement here supports our claim that the Watson River disagreement is not from the RCM runoff or the stage–discharge relationship but more likely due to basin area. The correct Watson River basin should include some basins outside of the Leverett Glacier basin that still drain to the Watson River outlet gauge location.
<<py_init>>
<<load_all_obs>>
# Kiattuut Sermiat (Ks): total the observed and modeled discharge and
# compute the land-runoff share of each RCM product.
df = obs['Ks']
df['MAR'] = df['MAR_ice'] + df['MAR_ice_upstream']        # ice discharge at and upstream of the outlet
df['RACMO'] = df['RACMO_ice'] + df['RACMO_ice_upstream']
df = df[['obs','MAR','MAR_land','RACMO','RACMO_land']]
df = df.sum(axis='rows')                                   # sum over all days
df['MAR land %'] = df['MAR_land'] / (df['MAR']+df['MAR_land']) * 100
df['RACMO land %'] = df['RACMO_land'] / (df['RACMO']+df['RACMO_land']) * 100
df
The Kiattuut Sermiat discharge basin area is 693 km2, of which 391 km2 (56 %) is land and 302 km2 (44 %) is ice. The basin area is incorrectly large because the land basin reported and shown includes the entire basin that contains the discharge point, of which some is downstream (Fig. fig:Ksa). However, only ~25 % of runoff comes from the land, and only a small portion of the land basin is downstream of the gauge location, so this is not enough to explain the discharge vs. observation disagreement. The partial (last calendar year) discharge time series shows MAR and RACMO agree well with each other, but are significantly higher than the observations (Fig. fig:Ksb). Both low and high discharge are overestimated, but the 5 to 95 % quantile ranges of the ratio are within a factor of 5 (Fig. fig:Ksc), with mean ratios ranging from 1.71 (RACMO, bottom one-third of discharge) to 2.44 (MAR, top two-thirds of discharge).
The Kiattuut Sermiat gauge is in a problematic location in terms of determining the actual (nontheoretical) upstream contributing area. Similar to the Leverett Glacier gauge location, the issues here can be estimated independent of observational data. Specifically, it is not clear if this stream includes water from the larger glacier to the east and east-northeast that feeds this glacier (Fig. fig:Ksa) - in our delineation it does not. Furthermore, several glaciers to the north-northwest and detached from the glacier near the stream gauge appear to drain into a lake that then drains under the glacier and then to the stream gauge. This latter issue is observable in any cloud-free satellite imagery and does not need the basin delineations provided here to highlight the complexities of this field site. Nonetheless, RCM discharge estimates are only slightly more than double the observations.
The Kiattuut Sermiat gauge location may have been selected in part due to its accessibility - it is walking distance from the Narsarsuaq Airport. The data may also suit their intended purpose well and there are likely many results that can be derived independent of the area or location of the upstream source water. However, if the location or area of the upstream contributions is important, then gauge location should balance ease of access and maintenance with the ease with which the data can be interpreted in the broader environment.
\label{sec:K} \label{sec:Kb} \label{sec:O} \label{sec:T}
Four Greenland Ecosystem Monitoring (GEM) program stream gauges are located near Nuuk, with similar basin properties. All are small (7.56 to 37.52 km2) and 10 % to 25 % ice in the basin mask, but two of the four (Kingigtorssuaq, Fig. fig:K; and Oriartorfik, Fig. fig:O) contain small glaciers contributing to observed discharge but no RCM ice cells cover those glaciers, and the remaining two (Teqinngalip, Fig. fig:T; and Kobbefjord, Fig. fig:Kb) have several small glaciers, but only one per basin has RCM ice coverage.
All four of these basins show some weak agreement. The maximum r2 is 0.47 (Fig. fig:Tc) and the minimum is 0.11 (Fig fig:Kc), but we note that the worst agreement comes from a basin with no glaciers in the RCM domain, and that in all cases the mean high discharge agrees well, suggesting high discharge in these small basins with few small glaciers may be due to rain (captured in the RCMs) rather than warm days and melted ice.
\label{sec:R} \label{sec:Z} \label{sec:Q}
Three additional stream gauges remain: Røde Elv, Zackenberg, and Qaanaaq.
The Røde Elv basin is situated at the southern edge of Disko Island (Fig. fig:bamber_2018). It has an area of 100 km2, of which 72 km2 is land and 28 km2 is ice (Fig fig:Ra). The partial (last calendar year) discharge time series shows MAR, RACMO, and the observations all in approximately the same range but with high variability (Fig. fig:Rb). Of the few samples here (n = 98), most are within the factor-of-5 bands for MAR and a few more are outside the bands for RACMO (Fig. fig:Rc). Mean discharge offset ranges from a ratio of 0.82 (RACMO low) to 1.85 (MAR low), with high-discharge estimates slightly closer to observations - a 48 % and 77 % overestimate for MAR and RACMO respectively (Fig. fig:Rd).
The Zackenberg basin in NE Greenland has an area of 487 km2, of which 378 km2 (78 %) is land and 109 km2 (22 %) is ice (Fig. fig:Za). The partial (last calendar year) discharge time series shows disagreements between MAR and RACMO that generally bound the observations (Fig. fig:Zb). RACMO-derived discharge is consistently high for low discharge early in the year, but both discharge products fall mostly within the factor-of-5 bands (Fig. fig:Zc). For high discharge, mean modeled discharge is 9 % high (MAR) and 24 % low (RACMO), with a worst-case 5 to 95 % quantile ratio as low as 0.29 (Fig. fig:Zd).
The Qaanaaq basin in NW Greenland has an area of 13.2 km2, of which 2.2 km2 (17 %) is land and 11 km2 (83 %) is ice (Fig. fig:Qa). The partial (last calendar year) discharge time series shows disagreements between MAR and RACMO that generally bound the observations (Fig fig:Qb). Of the few samples (n = 82), MAR preferentially overestimates and RACMO underestimates discharge, but both generally within a factor of 5 (Fig fig:Qc). The mean high-discharge ratio is 1.26 (MAR) and 0.4 (RACMO) from Fig. fig:Qd.
The volume of data generated here is such that manually examining all of it or editing it to remove artifacts or improve the data would be time and cost prohibitive. A similar warning is provided with the ArcticDEM data used here. However, any ArcticDEM issues interior to a basin do not impact results here that are aggregated by basin and reported at the outlet. ArcticDEM issues that cross basin boundaries should only impact a small part of the basin near the issue.
Uncertainties from the RCM inputs and the observations are considered external to this work, although they are still discussed (Sects. \ref{sec:uncertain:RCM} and \ref{sec:uncertain:obs}). In this work, we introduce one new source of uncertainty - the routing model, which generates both temporal (runoff delay) and spatial (basin delineation) uncertainty.
The RCMs include a time lag between when water melts in the model and when it leaves a grid cell. RACMO retention occurs only when there is firn cover (no retention when bare ice melts); MAR includes a time delay of up to 10 days that is primarily a function of surface slope citep:zuo_1996,yang_2019. However, neither model includes a subglacial system. Properly addressing time delays with runoff requires addressing storage and release of water across a variety of timescales in a variety of media: firn (e.g., citet:munneke_2014,vandecrux_2019), supraglacial streams and lakes (e.g., citet:zuo_1996,smith_2015,yang_2019), the subglacial system (e.g., citet:rennermalm_2013), possibly terrestrial streams and lakes (e.g., citet:van-as_2018) and a variety of other physical processes that are not within the scope of surface mass balance (SMB) modeling. Runoff delay can be implemented outside the RCMs (e.g., citet:liston_2012,mernild_2018), but for this version of the product we assume that once an RCM classifies meltwater as runoff, it is instantly transported to the outlet. Actual lags between melt and discharge range from hours to years citep:colgan_2011_hydrology,van-as_2017,rennermalm_2013,livingston_2013.
Data released here include no additional lag beyond the RCM lag, although a 7 d running mean (cf. citet:van-as_2017) is included in all of the results presented here except Fig. fig:bamber_2018, which shows monthly summed data, and Fig. fig:scatter_yearsum, which shows yearly summed data. When increasing the signal to noise by summing by year (Fig. fig:scatter_yearsum vs. Fig. fig:scatter_daily), model results more closely match observations.
Basin uncertainty is a function of the subglacial routing assumptions (the \(k\) parameter in Eq. eq:head, which in reality varies in both space and time). However, basin uncertainty does not necessarily translate to discharge uncertainty. For example, when comparing two \(k\) simulations, a large basin in simulation \(k_0\) may change only its outlet by a few grid cells in \(k_1\). A small micro-basin may appear in \(k_1\) with its outlet in the same grid cell as the large \(k_0\) outlet. The large change in discharge between the two outlets at the same location in \(k_0\) and \(k_1\) is not an appropriate estimate of uncertainty - rather the large basin in \(k_0\) should be compared with the almost entirely overlapping large basin in \(k_1\) with the different outlet. This fluidity of basins and outlets between \(k\) scenarios makes it almost impossible to define, identify, and compare basins between scenarios, unless working manually with individual basins (as we did, for example, at the Leverett Glacier observation location, modeled upstream basin, and adjusted upstream basin; see Sect. \ref{sec:L}).
Another example has a large basin in simulation \(k_0\) and a similarly large basin in simulation \(k_1\) draining out of the same grid cell, but overlapping only at the outlet grid cell. Upstream the two do not overlap and occupy different regions of the ice sheet. These two basins sharing one outlet (between different \(k\) simulations) could have similar discharge. Put differently, although inland grid cells may change their outlet location by large distances under different routing assumptions (Fig. fig:k_basin_change), that does not imply upstream basin area changes under different routing assumptions. Large changes in upstream catchment area are possible citep:chu_2016_rerouting, but we note citet:chu_2016_rerouting highlight changes at only a few outlets and under the extreme scenario of \(k = 1.11\) describing an overpressured system. Because \(ρ_w/ρ_i = 1.09\), setting \(k = 1.09\) reduces Eq. \ref{eq:head} to \(h = z_s\) and is equivalent to an overpressured system with surface routing of the water. In a limited examination comparing our results with \(k ∈ [0.8, 0.9, 1.0]\), we did not detect basins with large changes in upstream area. In addition, all time series graphics show the mean RCM discharge for \(k = 1.0\), but the uncertainty among all three \(k\) values (not shown) is small enough that it is difficult to distinguish the three separate uncertainty bands - the difference between RCMs or between RCMs and observations is much larger than uncertainty from the \(k\) parameter.
The above issues are specific to ice basins. Land basin outlets do not change location, and the range of upstream runoff from different \(k\) simulations to a land outlet provides one metric of uncertainty introduced by the \(k\) parameter. This uncertainty among all three \(k\) values is small at ice margin outlets. It is even smaller at land outlets which act as spatial aggregators and increase the signal-to-noise ratio.
Below, we discuss the known uncertainties, ranging from least to most uncertain.
The basins presented here are static approximations, based on the 100 m DEM, of a dynamic system. Land basin boundaries are likely to be more precise and accurate than ice basin boundaries because the land surface is better resolved, has larger surface slopes, has negligible subsurface flow, and is less dynamic than the ice surface. Even if basins and outlets seem visually correct from the 100 m product, the basin outline still has uncertainty on the order of hundreds of meters and will therefore include many minor errors and nonphysical properties, such as drainage basin boundaries bisecting lakes. However, all artifacts we did find are significantly smaller than the 1 km2 grid of the RCM inputs. We do not show but note that when doing the same work with the 150 m BedMachine land surface DEM, some basins change their outlet locations significantly - draining on the opposite side of a spit or isthmus and into a different fjord than the streams do when observed in satellite imagery. We have not observed these errors in streams and basins derived from the 100 m ArcticDEM in a visual comparison with Google Earth, although they may still exist.
Moving from land basins to subglacial ice basins, the uncertainty increases because subglacial routing is highly dynamic on timescales from minutes to seasons (e.g., citet:werder_2013). This dynamic system may introduce large spatial changes in outflow location (water or basin “piracy”; citet:ahlstrom_2002,lindback_2015 and citet:chu_2016_rerouting), but citet:stevens_2018 suggests basins switching outlet locations may not be as common as earlier work suggests, and our sensitivity analysis suggests that near the margin, where the majority of runoff occurs, outlet location often changes by less than 10 km under different routing assumptions (Fig. fig:k_basin_change). The largest (> 100 km) changes in outlet location in Fig. fig:k_basin_change occur when the continental or ice flow divides move, and one or two of the \(k\) scenario(s) drain cells to an entirely different coast or sector of the ice sheet. Finally, in some locations water is routed supraglacially, not subglacially (cf. citet:li_2022).
The regions near the domain edges - both the land coast and the ice margin - are covered by many small basins, and in this work basins < 1 km2 are absorbed into their largest neighbor (see Methods section). By definition these basins are now hydraulically incorrect. An example can be seen in the Zackenberg basin (Fig. fig:Za, southwest corner of the basin), where one small basin on the southern side of a hydraulic divide was absorbed into the large Zackenberg basin that should be defined by and limited to the northern side of the mountain range.
Quality issues also exist near the ice margin. At the margin, many of the small basins (absorbed or not) may be incorrect because the bed uncertainty is larger relative to the ice thickness, and therefore uncertainty has a larger influence on routing. Minor mask misalignments may cause hydraulic jumps (waterfalls) at the margin, or sinks that then need to be filled by the algorithm and may overflow (i.e., the stream continues onward) somewhere at the sink edge different from the location of the real stream. The solution for individual outlets is to visually examine the modeled outlet location, nearby streams in satellite imagery, and the area of upstream catchments, as we did for the Leverett Glacier outlet (Sect. \ref{sec:L}). Alternatively, selecting several outlets in an area will likely include the nearby correct outlet. This can be automated: an effective method to aggregate all the micro ice basins that occur at the domain edge is to select the downstream land basin associated with one ice outlet and then all upstream ice outlets for that land basin.
In addition to the basin delineation issues discussed above, the runoff product from the RCMs also introduces uncertainty into the product generated here. The RCM input products do not provide formal time- or space-varying error estimates, but of course do contain errors because they represent a simplified and discretized reality. RCM uncertainty is shown here with a value of ±15 %. The MAR uncertainty comes from an evaluation by the Greenland SMB Model Intercomparison Project (GrSMBMIP; citet:fettweis_2020) that examined the uncertainty of modeled SMB for 95 % of the 10767 in situ measurements over the main ice sheet. The mean bias between the model and the measurements was 15 %, with a maximum of 1000 mmWE yr-1. GrSMBMIP uses values integrated over several months of SMB, suggesting larger uncertainty of modeled runoff at the daily timescale. The RACMO uncertainty comes from an estimated average 5 % runoff bias in RACMO2.3p2 compared to annual cumulative discharge from the Watson River citep:noel_2019. The bias increases to a maximum of 20 % for extreme runoff years (e.g., 2010 and 2012), so here we select 15 %, a value between the reported 5 % and the maximum 20 % that matches the MAR uncertainty. We display ±15 % uncertainty in the graphics here and suggest this is a minimum value for daily runoff data.
The 15 % RCM uncertainty is represented graphically in the time series plots when comparing to each of the observations. It is not shown in the scatter plots because the log–log scaling and many points make it difficult to display. In the time series plots, we show the mean value from the \(k = 1.0\) scenario and note that discharge from the other two \(k\) scenarios covered approximately the same range.
When comparing against observations, additional uncertainty is introduced because the stage–discharge relationship is neither completely precise nor accurate. We use published observation uncertainty when it exists. Only two observational data sets come with uncertainty: Watson River and Qaanaaq. Similar to the RCM uncertainty, they are displayed in the time series but not in the scatter plots.
Traditional uncertainty propagation is further complicated because it is not clear to what extent the three uncertainties (observational, RCM, and routing model) should be treated as independent from each other - all three uncertainties are likely to show some correlation with elevation, slope, air temperature, or other shared properties or processes.
Many of the uncertainties discussed here can be mitigated by increasing the signal-to-noise ratio of the product. Because we provide a high spatial and temporal resolution product, this is equivalent to many signals, each of which has some uncertainty (noise). Averaging results spatially or temporally, if possible for a downstream use of this product, will increase the signal-to-noise ratio and reduce uncertainty.
For example, because we provide basins for the entire ice sheet, total discharge is not subject to basin uncertainty. Any error in the delineation of one basin must necessarily be corrected by the inclusion (if underestimated) or exclusion (if overestimated) of area in a neighboring basin, although neighboring basins may introduce their own errors. Therefore, summing basins reduces the error introduced by basin outline uncertainty and should be done if a downstream product does not need an estimate of discharge from a single outlet. This feature is built in to coastal outlet discharge, which is not as sensitive to our routing algorithm as ice margin outlet discharge because most coast outlets include a range of upstream ice margin outlets (e.g., Fig. fig:W vs. fig:L). Conversely, at the ice margin, outlet location and discharge volume are more uncertain than at the land coast. However, most runoff is generated near the ice margin, and as runoff approaches the margin, there are fewer opportunities to change outlet location (Fig. fig:k_basin_change).
Our coverage algorithm (Sect \ref{sec:methods:discharge}) only fills in glaciated regions that have at least some RCM coverage. When working with basins that have glaciated areas and no RCM coverage as in the case for all four of the GEM outlets near Nuuk, discharge could be approximated by estimating discharge from the nearest covered glaciated area with a similar climatic environment.
Temporally, errors introduced by this study’s assumption of instantaneous discharge can be reduced by summing or averaging discharge over larger time periods, or applying a lag function to the time series as done here and in citet:van-as_2017. Although a given volume of water may remain in storage long term, if one assumes that storage is in roughly steady state, then long-term storage shown by, for example, dye trace studies, can be ignored - the volume with the dye may be stored, but a similar volume should be discharged in its place.
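As a minimal sketch of such temporal aggregation (synthetic numbers, not from the product; window lengths are illustrative):

# Minimal sketch: temporal averaging and a simple lag applied to a daily
# discharge series. The series is synthetic; window lengths are illustrative.
import numpy as np
import pandas as pd

t = pd.date_range("2012-06-01", periods=90, freq="D")
q = pd.Series(np.random.lognormal(mean=2, sigma=1, size=t.size), index=t)  # stand-in daily discharge [m3/s]

weekly = q.resample("7D").mean()           # aggregate to weekly means
smooth = q.rolling(7, center=True).mean()  # 7 day running mean
lagged = q.shift(1)                        # crude 1 day lag; see citet:van-as_2017 for a physically based lag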
The scale of the data is such that manual editing to remove artifacts is time- and cost-prohibitive. Here we provide one example of incorrect metadata. The elevation of each outlet is included as metadata by looking up the bed elevation in the BedMachine data set at the location of each outlet. Errors in BedMachine or in the outlet location (defined by the GIMP ocean mask) introduce errors in outlet elevation.
A large basin in NW Greenland has metadata outlet elevation > 0 (gray in Fig. fig:overview) but appears to be marine terminating when viewed in satellite imagery. Elsewhere the land- vs. marine-terminating color coding in Fig. fig:overview appears to be mostly correct, but this view only provides information about the sign of the elevation and not the magnitude (i.e., if the reported depth is correct). Ice outlets can occur above, at, or below 0 m. It is easier to validate the land-terminating basins, which should in theory all have an outlet elevation of ≤ 0 m. That is not the case (Fig. fig:elev). Outlets at 0 m are coastal outlets. Outlets < 0 m occur when land-source runoff enters the subglacial system and discharges at depth from a marine terminating glacier, but we expect these outlets to be << 0 m, not near 0 m. This edge case also produces land outlets that exist outside of land basins. It is possible for land outlets to be correctly assigned an elevation > 0 m, if a land basin outlet occurs at a waterfall off a cliff (as might occur at the edges of Petermann fjord) or due to DEM discretization of steep cells. However, most of the land outlets near but not at 0 m are likely due to mask misalignment placing a section of coastline in a fjord (negative land elevation) or inland (positive land elevation), or discretization issues in BedMachine. The bulk of land discharge (75 %) occurs within 0 ±10 m elevation and 90 % within 0 ±30 m elevation (Fig. fig:elev).
The liquid water discharge product provided here is only one source of freshwater that leaves the ice sheet and affects fjords and coastal seas. The other primary freshwater source is iceberg calving and submarine melt at the ice/ocean boundary of marine-terminating glaciers. A companion to the liquid water discharge product introduced here is provided by citet:mankoff_2019_ice,mankoff_2020_ice, which estimates solid ice volume flow rates across gates near marine-terminating glaciers. That downstream ice enters fjords as either calving icebergs or liquid water from submarine melting.
Both this product and citet:mankoff_2020_ice provide liquid or solid freshwater volume flow rates at outlets (this product) or grounding lines citep:mankoff_2020_ice, but actual freshwater discharge into a fjord occurs at a more complicated range of locations. Solid ice melts throughout the fjord and beyond (e.g., citet:enderlin_2016,moon_2017), and the freshwater discharge presented here may enter at the reported depth (Sect. \ref{sec:QC}) but rapidly rises up the ice front and eventually flows into the fjord at some isopycnal (see citet:mankoff_2016). The eventual downstream location of the fresh water is not addressed in this work.
Freshwater inputs directly to the water surface are also not included in this product. The flux (per square meter) to the water surface should be similar to the flux to the non-ice-covered land surface - assuming the orographic effects on precipitation produce similar fluxes to the near-land water surface.
Finally, basal melt from 1) geothermal heating (e.g., citet:fahnestock_2001), 2) frictional heating (e.g., citet:echelmeyer_1990), and 3) viscous heat dissipation from runoff (e.g., citet:mankoff_2017) contributes additional discharge beyond the surface melt (see for example citet:johannesson_2020). Geothermal and frictional heating are approximately in steady state and contribute freshwater throughout the winter months.
Of the 20 comparisons between the two RCMs and the 10 observations we note the following.
- In general this product shows good agreement between observations and the modeled discharge from the RCM runoff routed to the outlets, when comparing across multiple basins, especially when ignoring small basins with small glaciers that are not included in the RCMs (Fig. \ref{fig:scatter_daily}). The agreement is not as good when estimating the discharge variability within individual basins. From this, the product is more appropriately used to estimate the magnitude of the discharge from any individual basin, and perhaps provide some idea of the statistical variability, but not necessarily the precise amount of discharge for any specific day, because routing delays are neglected.
- The majority of the 20 comparisons have the 5 to 95 % prediction interval between scales of 1:5 and 5:1. From this, the model results match observations within plus or minus a factor of 5, or half an order of magnitude. Put differently, the daily RCM values for single or few basins have an uncertainty of +500 % or -80 %.
- The uncertainty of +500%/-80% is for “raw” data: daily discharge for one or few basins with a simple temporal smooth. When averaging spatially or temporally over larger areas or longer times, uncertainty decreases (Sect. \ref{sec:uncertainty}). For example, when moving from daily data (Fig. fig:scatter_daily) to annual sums (Fig. fig:scatter_yearsum), the uncertainty is reduced to +100%/-50%.
- The two RCMs agree best with each other for the three observations dominated by large ice domains (Watson River, Sect. \ref{sec:W} & Fig. \ref{fig:W}); Leverett Glacier, Sect. \ref{sec:L} & Fig. \ref{fig:L} which is a subset of the Watson River basin; and Kiattuut Sermiat, Sect. \ref{sec:Ks} & Fig. \ref{fig:Ks}). RCMs agree best with observations for ice-dominated basins with well-resolved bed topography in BedMachine (i.e., correct basins modeled in this work) - here only Leverett Glacier (Sect. \ref{sec:L} & Fig. \ref{fig:L}) meets this criterion.
- Runoff errors increase with low discharge (panels ‘d’ in Figs. \ref{fig:W}, \ref{fig:L} to \ref{fig:Q}).
- For land basins, subglacial routing errors no longer exist, basins are well defined, and the remaining errors come from neglecting runoff delays and from the RCM runoff estimates themselves.
- For ice basins, errors are dominated by basin uncertainty. Errors between similar-sized and neighboring basins are likely to offset and may even cancel each other. Even so, a conservative treatment might consider errors between basins as random and reduce by the sum of the squares when summing discharge from multiple similar-sized and neighboring basins.
These data contain a static map of Greenland’s hydrological outlets, basins, and streams and a time series of discharge from each outlet.
The output data are provided in the following formats:
The stream data are provided as a GeoPackage standard GIS product and a metadata CSV that includes the stream type (start or intermediate segment), network, stream along-flow length, stream straight length, sinuosity, source elevation, outlet elevation, and a variety of stream indices such as the Strahler, Horton, Shreve, Hack, and other parameters citep:jasiewicz_2011. We note that the subglacial streams are unvalidated with respect to actual subglacial conduits, and they should be used with caution.
The outlet data are also provided as a GeoPackage and CSV, each of which include the outlet ID (linked to the basin ID), the longitude, latitude, EPSG:3413 x and y, and the outlet elevation. The outlet elevation is the BedMachine bed elevation at the outlet location, and users should be aware of quality issues identified in Sect. \ref{sec:QC}. The ice outlet metadata includes the ID, longitude, latitude, x, and y of the downstream land outlet, if one exists.
The basin product GeoPackage includes the geospatial region that defines the basin. The metadata CSV includes the basin ID (linked to the outlet ID), and the area of each basin.
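As a sketch of working with the vector products, assuming the freshwater/land_100/ layout used by the export step in this workflow (the published archive layout may differ):

# Minimal sketch: load the outlet and basin GeoPackages with GeoPandas.
# The freshwater/land_100/ path follows the export layout used in this
# workflow and is an assumption; adjust to wherever the data are stored.
import geopandas as gpd

outlets = gpd.read_file("freshwater/land_100/outlets.gpkg")
basins = gpd.read_file("freshwater/land_100/basins.gpkg")
print(outlets[["cat", "lon", "lat", "elev"]].head())  # outlet ID and location metadata
print(basins.area.sum() / 1e6)                        # total basin area [km2] (EPSG:3413 meters)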
The time series discharge product is provided as four NetCDF files per year, one for each domain (ice margin and land coast) and one for each RCM (MAR and RACMO). The NetCDF files contain an unlimited time dimension, usually containing 365 or 366 d; much of the same metadata as the outlets CSV file, including the outlet (also known as station) ID, the latitude, longitude, and elevation of the outlet; and a runoff variable with dimensions (station, time) and units of cubic meters per second (m3 s-1).
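A minimal sketch of reading one of these NetCDF files with xarray; the file name is illustrative, and the variable and dimension names follow the description above:

# Minimal sketch: open one year of discharge for one RCM and one domain.
# File name is illustrative; dimensions are (station, time) per the text above.
import xarray as xr

ds = xr.open_dataset("MAR_ice_2012.nc")
q_total = ds["runoff"].sum(dim="station")     # ice-sheet-wide daily discharge [m3 s-1]
q_annual = (q_total * 86400).sum(dim="time")  # approximate annual volume [m3]
print(ds.dims, float(q_annual))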
The data can be accessed with custom code from the raw data files. However, to support downstream users we provide a tool to access the outlets, basins, and discharge for any region of interest (ROI). The ROI can be a point, a list describing a polygon, or a file, with units in longitude and latitude (EPSG:4326) or meters (EPSG:3413). If the ROI includes any land basins, an option can be set to include all upstream ice basins and outlets, if they exist. The script can be called from the command line (CLI), returning CSV formatted tables, or from within Python, returning standard Python data structures (from the GeoPandas or xarray packages).
For example, to query for discharge at one point (50.5 °W, 67.2 °N), the following command is issued:
python ./discharge.py --base ./freshwater --roi=-50.5,67.2 --discharge
where discharge.py is the provided script, ./freshwater is the folder containing the downloaded data, and --discharge tells the program to return RCM discharge (as opposed to --outlets, which would return basin and outlet information). The program documentation and usage examples are available at https://github.com/GEUS-Glaciology-and-Climate/freshwater citep:github_freshwater.
Because the --upstream option is not set, the --discharge option is set, and the point is over land, the results of this command are a time series of the MAR and RACMO land discharge for the basin containing this point. A small subset (the first 10 days of June 2012) is shown as an example:
| time       | MAR_land  | RACMO_land |
|------------+-----------+------------|
| 2012-06-01 | 0.043025  | 0.382903   |
| 2012-06-02 | 5.5e-05   | 0.095672   |
| 2012-06-03 | 5e-05     | 0.009784   |
| 2012-06-04 | 9e-06     | -0.007501  |
| 2012-06-05 | 0.008212  | 0.007498   |
| 2012-06-06 | 28.601947 | 0.607345   |
| 2012-06-07 | 0.333926  | 0.05691    |
| 2012-06-08 | 0.489437  | 0.204384   |
| 2012-06-09 | 0.038816  | 0.167325   |
| 2012-06-10 | 5.1e-05   | 0.011415   |
If the --upstream option is set, two additional columns are added: one for each of the two RCM ice domains. A maximum of six columns may be returned: 2 RCMs × (1 land + 1 ice + 1 upstream ice domain), because results are summed across all outlets within each domain when the script is called from the command line (summing is not done when the script is accessed from within Python).
If the --outlets option is set instead of the --discharge option, then the results are a table of outlets. For example, moving 10 ° east over the ice,
python ./discharge.py --base ./freshwater --roi=-40.5,67.2 --outlets
results in the following.
| index | id     | lon     | lat    | x      | y        | elev | domain | upstream | coast_id | … |
|-------+--------+---------+--------+--------+----------+------+--------+----------+----------+---|
| 0     | 118180 | -38.071 | 66.33  | 313650 | -2580750 | -78  | land   | False    | -1       | … |
| 1     | 67133  | -38.11  | 66.333 | 311850 | -2580650 | -58  | ice    | False    | 118180   | … |
If the script is accessed from within Python, then the discharge option returns an xarray Dataset of discharge, without aggregating by outlet, and the outlets option returns a GeoPandas GeoDataFrame that includes the geospatial location of all outlets and the outline of all basins, and can be saved to GIS-standard file formats for further analysis.
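If importing the script is not convenient, the documented CLI can also be driven from Python; a minimal sketch, where the CSV parsing details (delimiter, index column) are assumptions:

# Minimal sketch: call the documented CLI and parse its CSV output with pandas.
# Only the flags shown in the text above are used; parsing details are assumed.
import io
import subprocess
import pandas as pd

cmd = ["python", "./discharge.py", "--base", "./freshwater",
       "--roi=-50.5,67.2", "--discharge"]
out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
df = pd.read_csv(io.StringIO(out), index_col=0, parse_dates=True)
print(df.loc["2012-06-01":"2012-06-10"])  # should resemble the example table above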
\conclusions
We provide a 100 m spatial resolution data set of streams, outlets, and basins, and a 1 day temporal resolution data set of discharge through those outlets for the entire ice sheet area from 1950 through December 2022. Access to this database is made simple for nonspecialists with a Python script. Comparing the two RCM-derived discharge products to 10 gauged streams shows the uncertainty is approximately plus or minus a factor of 5, or half an order of magnitude, or +500%/-80%, when comparing daily discharge for single or few basins.
Because of the high spatial (individual basins) and temporal (daily) resolution, uncertainty is larger than when working over larger areas or longer time steps. These larger areas and times can be achieved through spatial and temporal aggregation or by implementing a lag function.
This liquid freshwater volumetric flow rate product is complemented by a solid ice discharge product citep:mankoff_2020_ice. Combined, these provide an estimate of the majority of freshwater (total solid ice and liquid) flow rates from the Greenland Ice Sheet into fjords and coastal seas, at high temporal resolution and process-level spatial resolution (i.e., glacier terminus for solid ice discharge, stream for liquid discharge).
This estimate of freshwater volume flow rate into Greenland fjords aims to support further studies of the impact of freshwater on ocean physical, chemical, and biological properties; fjord nutrient, sediment, and ecosystems; and larger societal impacts of freshwater on the fjord and surrounding environments.
The data from this work are available at doi:10.22008/promice/freshwater citep:GEUS_freshwater_paper.
The code and a website for postpublication updates are available at https://github.com/GEUS-Glaciology-and-Climate/freshwater citep:github_freshwater where we document changes to this work and use the GitHub Issues feature to collect suggested improvements, document those improvements as they are implemented, and document problems that made it through review. This version of the document is generated with git commit version \input{|"git describe --always --dirty='*'"}.
The hydrological basins are defined based on a range of subglacial flow routing regimes.
The head gradient is defined as:
| Location | Description                                       |
|----------+---------------------------------------------------|
| Sea      | Undefined                                         |
| Land     | ArcticDEM 100 m                                   |
| Ice      | (ArcticDEM - thickness) + k * 0.917 * thickness   |
- thickness is from BedMachine.
- k is equal to one of 0.7, 0.8, 0.9, 1.0, or 1.09, where 1.09 ~ \(ρ_w/ρ_i\), or surface routing.
- bed is the ArcticDEM surface minus the BedMachine thickness.
log_info "Calculating subglacial head with k: ${k} (${domain})"
r.mapcalc "head = if(mask_o_l_i@ArcticDEM == 1, null(), 0) + if(mask_o_l_i@ArcticDEM == 2, z_s@ArcticDEM, 0) + if(mask_o_l_i@ArcticDEM == 3, (z_s@ArcticDEM - thickness@BedMachine) + ${k} * 0.917 * thickness@BedMachine)"
Then, we calculate the streams, outlets, and basins based on the head:
<<streams>>
<<outlets>>
<<basins>>
Putting it all together, we want to calculate streams, outlets, and basins to the ice edge (domain = ice), and once to the coast (domain=land). See Section Domains for implementation. This is the top-level ./sob.sh code that implements the streams, outlets, and basins routing and exports the results to disk.
- NOTE :: We only run the land domain once. The upstream basins are a function of subglacial pressure, but the exposed portion of the land basins is independent of subglacial pressure. We run it with k = 1.0 (matching ice_100), although any k could be used because there is no ice overburden on land.
- NOTE :: The land domain is run first, because the ice domains need information from the land domain (the downstream land outlet).
<<init>>
for domain in land ice; do
for k_pct in 100 90 80; do
# when land, only 100
[[ ${domain} == land ]] && [[ ${k_pct} != 100 ]] && break
k=$(echo "scale=2; ${k_pct}/100" | bc -l)
log_info "Setting domain to ${domain}_${k_pct}"
g.mapset -c ${domain}_${k_pct}
<<mask_domain>>
<<head>>
<<sob>>
<<adjust_basins>>
<<metadata>>
<<export>>
done
done
Below, we’ll build out the code defined above.
After calculating the head, we use 3rd-party tools to get the flow direction and streams.
THRESH=300
log_warn "Using threshold: ${THRESH}"
log_info "r.stream.extract..."
r.stream.extract elevation=head threshold=${THRESH} memory=16384 direction=dir stream_raster=streams stream_vector=streams
- The flow direction dir is negative where flow leaves the domain. These are the outlets.
- Encode each outlet with a unique id.
log_info "Calculating outlets"
r.mapcalc "outlets_1 = if(dir < 0, 1, null())"
r.out.xyz input=outlets_1 | \
cat -n | \
tr '\t' '|' | \
cut -d"|" -f1-3 | \
v.in.ascii input=- output=outlets_uniq separator=pipe \
columns="x int, y int, cat int" x=2 y=3 cat=1
Using r.stream.basins, we can get basins for every outlet.
log_info "r.stream.basins..."
r.stream.basins -m direction=dir points=outlets_uniq basins=basins_uniq memory=16384 --verbose
<<absorb_small_basins>>
- For the ice domain, the domain boundary is the ice/land edge.
- For the land domain, the domain boundary is the land/fjord edge.
g.region -dp
log_info "Masking domain to ${domain}"
[[ ${domain} == "ice" ]] && r.mask raster=mask_o_l_i@ArcticDEM maskcats=3 --o # mask to ice
[[ ${domain} == "land" ]] && r.mask raster=mask_o_l_i@ArcticDEM maskcats="2 3" --o # mask to land & ice
<<mask_small_areas>>
I make the following basin adjustments:
- Ice basins have nunatuks masked out. They needed to be classified as “ice” for the routing algorithm, otherwise streams, outlets, and basins (SOBs) occur inland and are not routed to the ice margin, but the “basin” raster used for masking RCM data should not include the nunatuk area.
- Land basins have nunatuks and inland “land islands in the ice” included in the land basin. No outlets occur here, but the land area and basin ID match the associated outlet.
if [[ ${domain} == "ice" ]]; then
log_info "Adjusting ice basins..."
g.rename raster=basins,basins_filled vector=basins,basins_filled
r.mapcalc "basins = basins_filled * mask_ice_holes@Citterio_2013"
r.to.vect -v input=basins output=basins type=area
v.db.dropcolumn map=basins column="label"
fi
if [[ ${domain} == "land" ]]; then
log_info "Adjusting land basins..."
g.rename raster=basins,basins_filled vector=basins,basins_filled
r.mapcalc "basins = basins_filled * not_ice@Citterio_2013"
r.to.vect -v input=basins output=basins type=area
v.db.dropcolumn map=basins column="label"
fi
<<add_metadata>>
<<add_stream_indices>>
- streams [2/2]
- [X] stream indices
- [X] stream length
- basin [2/2]
- [X] area
- [X] ice - has some ice contribution
- outlet [5/5]
- [X] acc value - no, can use area
- [X] BedMachine z_b
- [X] lon, lat
- [X] EPSG 3413 x, y
- [X] link margin outlets to coast outlet
log_info "Adding metadata..."
###
### streams
###
# v.db.addcolumn map=streams column="length DOUBLE"
v.to.db map=streams option=length column=length
###
### outlets
###
v.db.addcolumn map=outlets column="lon DOUBLE PRECISION"
v.db.addcolumn map=outlets column="lat DOUBLE PRECISION"
v.db.addcolumn map=outlets column="x INT"
v.db.addcolumn map=outlets column="y INT"
# v.db.addcolumn map=outlets column="cells INT"
v.db.addcolumn map=outlets column="elev INT"
# r.mask -r
v.what.rast map=outlets raster=x@PERMANENT column=x
v.what.rast map=outlets raster=y@PERMANENT column=y
v.what.rast map=outlets raster=z_b@BedMachine column=elev
# probably a more efficient way to get lon,lat column from x,y...
mkdir -p tmp
db.select -c sql='select x,y,cat from outlets' | m.proj -od input=- | tr '|' ',' > ./tmp/lonlat.csv
db.in.ogr input=./tmp/lonlat.csv output=lonlat
db.select table=lonlat|head
v.db.join map=outlets column=cat other_table=lonlat other_column=field_3
v.db.update map=outlets column=lon query_column=field_1
v.db.update map=outlets column=lat query_column=field_2
v.db.dropcolumn map=outlets columns=field_1,field_2,field_3
db.select table=outlets | head
# distance from outlet ice or coast
if [[ ${domain} == "ice" ]]; then # processing ice domain.
# Find which coast basin we're inside of
ice_domain=$(g.mapset -p)
land_domain=land_100
v.db.addcolumn map=outlets column="coast_id int"
v.what.rast map=outlets type=point raster=basins_filled@${land_domain} column=coast_id
v.db.addcolumn map=outlets column="coast_lon double"
v.db.addcolumn map=outlets column="coast_lat double"
v.db.addcolumn map=outlets column="coast_x int"
v.db.addcolumn map=outlets column="coast_y int"
g.copy vector=outlets@${land_domain},oland
db.execute sql='UPDATE outlets SET coast_lon=(SELECT lon from oland WHERE outlets.coast_id=oland.cat)'
db.execute sql='UPDATE outlets SET coast_lat=(SELECT lat from oland WHERE outlets.coast_id=oland.cat)'
db.execute sql='UPDATE outlets SET coast_x=(SELECT x from oland WHERE outlets.coast_id=oland.cat)'
db.execute sql='UPDATE outlets SET coast_y=(SELECT y from oland WHERE outlets.coast_id=oland.cat)'
fi
# Nearest upstream Zwally sector or Mouginot basin/region for each hydro outlet
v.db.addcolumn map=outlets columns="Z2012_sector INT"
v.db.addcolumn map=outlets columns="Z2012_sector_dist INT"
v.distance from=outlets to=sectors@Zwally_2012 upload=to_attr column=Z2012_sector to_column=cat_
v.distance from=outlets to=sectors@Zwally_2012 upload=dist column=Z2012_sector_dist
v.db.addcolumn map=outlets columns="M2019_ID INT"
v.db.addcolumn map=outlets columns="M2019_ID_dist INT"
v.db.addcolumn map=outlets columns="M2019_basin CHAR(99)"
v.db.addcolumn map=outlets columns="M2019_region CHAR(99)"
v.distance from=outlets to=basins@Mouginot_2019 upload=to_attr column=M2019_ID to_column=cat output=connection
v.distance from=outlets to=basins@Mouginot_2019 upload=dist column=M2019_ID_dist
v.distance from=outlets to=basins@Mouginot_2019 upload=to_attr column=M2019_basin to_column=NAME
v.distance from=outlets to=basins@Mouginot_2019 upload=to_attr column=M2019_region to_column=SUBREGION1
# Nearest Mankoff 2020 gate and distance to gate
v.db.addcolumn map=outlets columns="M2020_gate INT"
v.db.addcolumn map=outlets columns="M2020_gate_dist INT"
v.distance from=outlets to=gates@Mankoff_2020 upload=to_attr column=M2020_gate to_column=gate
v.distance from=outlets to=gates@Mankoff_2020 upload=dist column=M2020_gate_dist
# Nearest Bjørk 2015d name and distance to point
v.db.addcolumn map=outlets columns="B2015_name CHAR(99)"
v.db.addcolumn map=outlets columns="B2015_dist INT"
v.distance from=outlets to=names@Bjork_2015 upload=to_attr column=B2015_name to_column=Official_n
v.distance from=outlets to=names@Bjork_2015 upload=dist column=B2015_dist
###
### basins
###
# area of basin
v.to.db map=basins option=area column=area
log_info "r.stream.order: BEGIN"
date
time r.stream.order -m stream_rast=streams direction=dir elevation=head accumulation=ones@PERMANENT stream_vect=stream_vect strahler=strahler horton=horton shreve=shreve hack=hack topo=topological memory=16384
date
log_info "r.stream.order: END"
# g.copy vector=streams,foo --o
# g.copy vector=stream_vect,bar --o
for c in $(echo strahler horton shreve hack drwal_old topo_dim); do
db.execute sql="ALTER TABLE streams ADD COLUMN ${c} INT"
db.execute sql="UPDATE streams SET ${c}=(SELECT ${c} from stream_vect WHERE stream_vect.cat=streams.cat)"
done
for c in $(echo stright sinosoid cum_length source_elev outlet_elev); do
db.execute sql="ALTER TABLE streams ADD COLUMN ${c} double"
db.execute sql="UPDATE streams SET ${c}=(SELECT ${c} from stream_vect WHERE stream_vect.cat=streams.cat)"
done
# # fix typo: sinosoid -> sinusoid; stright -> straight
db.execute sql="ALTER TABLE streams ADD COLUMN sinusoid DOUBLE"
db.execute sql="UPDATE streams SET sinusoid = sinosoid"
# db.execute sql="ALTER TABLE streams DROP COLUMN sinosoid"
v.db.dropcolumn map=streams columns=sinosoid
db.execute sql="ALTER TABLE streams ADD COLUMN straight DOUBLE"
db.execute sql="UPDATE streams SET straight = stright"
# db.execute sql="ALTER TABLE streams DROP COLUMN stright"
v.db.dropcolumn map=streams columns=stright
log_info "Exporting..."
MAPSET=$(g.mapset -p)
mkdir -p freshwater/${MAPSET}
# db.select table=streams | tr '|' ',' > ./freshwater/${MAPSET}/streams.csv
# db.select table=outlets | tr '|' ',' > ./freshwater/${MAPSET}/outlets.csv
# db.select table=basins | tr '|' ',' > ./freshwater/${MAPSET}/basins.csv
parallel --bar "db.select table={} | tr '|' ',' > ./freshwater/${MAPSET}/{}.csv" ::: streams outlets basins
# v.out.ogr input=streams output=./freshwater/${MAPSET}/streams.gpkg --o
# v.out.ogr input=outlets output=./freshwater/${MAPSET}/outlets.gpkg --o
# v.out.ogr input=basins output=./freshwater/${MAPSET}/basins.gpkg --o
parallel --bar "v.out.ogr input={} output=./freshwater/${MAPSET}/{}.gpkg --o" ::: streams outlets basins basins_filled
- I set DATADIR as a bash environment variable in my login scripts.
- This is so that Python babel blocks can also easily get that property.
import os
DATADIR = os.environ['DATADIR']
<<get_DATADIR>>
print(DATADIR)
set -o nounset
set -o pipefail
# set -o errexit
### uncomment the above line when doing initial run. When rerunning and
### counting on GRASS failing w/ overwrite issues (speed increase), the
### line above must be commented
red='\033[0;31m'; orange='\033[0;33m'; green='\033[0;32m'; nc='\033[0m' # No Color
log_info() { echo -e "${green}[$(date --iso-8601=seconds)] [INFO] ${@}${nc}"; }
log_warn() { echo -e "${orange}[$(date --iso-8601=seconds)] [WARN] ${@}${nc}"; }
log_err() { echo -e "${red}[$(date --iso-8601=seconds)] [ERR] ${@}${nc}" >&2; }
trap ctrl_c INT # trap ctrl-c and call ctrl_c()
function ctrl_c() {
log_warn "Caught CTRL-C"
log_warn "Killing process"
kill -term $$ # send this program a terminate signal
}
debug() { if [[ ${debug:-} == 1 ]]; then log_warn "debug:"; echo "$@"; fi; }
<<GRASS_config>>
https://grass.osgeo.org/grass74/manuals/variables.html
| GRASS_VERBOSE | Behavior                                                        |
|---------------+-----------------------------------------------------------------|
| -1            | complete silence (also errors and warnings are discarded)       |
| 0             | only errors and warnings are printed                            |
| 1             | progress and important messages are printed (percent complete)  |
| 2             | all module messages are printed                                 |
| 3             | additional verbose messages are printed                         |
export GRASS_VERBOSE=3
# export GRASS_MESSAGE_FORMAT=silent
if [ -z ${DATADIR+x} ]; then
echo "DATADIR environment varible is unset."
echo "Fix with: \"export DATADIR=/path/to/data\""
exit 255
fi
set -x # print commands to STDOUT before running them
log_info "x, y, ones..."
MAPSET=$(g.mapset -p)
g.mapset PERMANENT
r.mapcalc "x = x()"
r.mapcalc "y = y()"
r.mapcalc "ones = 1"
g.mapset ${MAPSET}
Don’t process tiny land islands.
- This is supposed to remove small islands in the ocean
- It may also remove very small nunatuks
# remove islands
# frink "90 m^2 * 10 -> hectares" # 8.1
# frink "1 km^2 -> hectares" # 100
# value is in hectares
if [[ ${domain} == "land" ]]; then
r.reclass.area -d input=MASK output=MASK_nosmall value=100.1 mode=lesser method=rmarea
r.mask MASK_nosmall --o
fi
- Merge small (< 1 km^2) basins with their largest neighbor.
- Drop associated outlets too.
# absorb small basins and outlets
# frink "1.0 km^2 / ((90 * 90) m^2)" # 123.4567
# frink "1.0 km^2 / ((150 * 150) m^2)" # 45
# frink "1.0 km^2 / ((100 * 100) m^2)# #100
# minsize is in cells
r.clump -d input=basins_uniq output=basins_nosmall minsize=101
r.mode base=basins_nosmall cover=basins_uniq output=basins
r.to.vect -v input=basins output=basins type=area
v.db.dropcolumn map=basins column="label"
v.to.rast input=outlets_uniq output=outlets_uniq use=cat
# r.mapcalc "outlets = if(outlets_streams == basins, basins, null())"
r.mapcalc "outlets = if(outlets_uniq == basins, basins, null())"
r.to.vect -v input=outlets output=outlets type=point
v.db.dropcolumn map=outlets column=label
# db.dropcolumn -f table=outlets column=area
- Launches GRASS if not running.
- Changes to specified mapset if not already in it.
[[ -z ${mapset} ]] && mapset=PERMANENT
if [[ ${GRASS_VERSION:-x} == "x" ]]; then
[[ -d ./G ]] || grass -e -c EPSG:3413 ./G
[[ -d ./G/${mapset} ]] || grass -e -c ./G/${mapset}
grass ./G/${mapset}
else
[[ ${mapset} == $(g.mapset -p) ]] || g.mapset -c ${mapset} --q
fi
Example usage:
<<grass_init_mapset>>
echo "MAPSET is: " $(g.mapset -p)
echo ""
echo ""
echo "${data}" | tr '>' '\n' | grep -v -E "^ ?$" | grep -v "GRASS"
Example Usage:
<<grass_init_mapset>>
g.region -p
- This correction needs to be applied to any model data.
- It is easiest and fastest to generate an area correction raster for each of the two models on their exact grid.
- To do this, we set up model domains in GRASS, estimate the area correction for each cell, write out a NetCDF file of that raster, and then apply that to each day of the model data.
<<init>>
log_info "Area Error..."
MAR NetCDF files don’t contain projection information that can be used by GRASS. So I find the bottom, top, left, and right edges by…
- longitude where lat is max,
- longitude where lat is min
- latitude where lon is max
- latitude where lon is min
Then pass those four through m.proj to get the x,y bounds of the region in GRASS.
log_info "Creating MAR mapset..."
g.mapset -c MAR
# NOTE: The origin of the hard-code values used here can be found in the Org source file.
T=-45.039822,83.948792
B=-30.694536,58.800426
L=-89.264137,81.557274
R=7.516274,80.071167
Txy=$(m.proj -i coordinates=$T)
Bxy=$(m.proj -i coordinates=$B)
Lxy=$(m.proj -i coordinates=$L)
Rxy=$(m.proj -i coordinates=$R)
echo $Txy $Bxy $Lxy $Rxy
N=$(echo ${Txy} | cut -d"|" -f2)
S=$(echo ${Bxy} | cut -d"|" -f2)
E=$(echo ${Rxy} | cut -d"|" -f1)
W=$(echo ${Lxy} | cut -d"|" -f1)
g.region e=$E w=$W s=$S n=$N -pl res=1000
g.region w=w-500 e=e+500 n=n+500 s=s-500 res=1000 -p
g.region save=MAR
Note - this check diagnosed a misalignment of 1/2 MAR grid cell shifted SOUTH. A correction was implemented during the import phase, so when re-running this, everything may appear to line up properly initially, and then be shifted 1/2 cell (500 m) too far NORTH if the adjustment below is applied a second time.
Review:
grass ./G/MAR
g.region region=MAR -p
d.mon start=wx0
g.list type=raster mapset=ice_surf -m |cat
d.rast basins@ice_surf
g.list type=raster mapset=MAR -m |cat
r.to.vect input=mask_ice output=mask_ice type=area
d.vect mask_ice fill_color=none
- zoom in N - MAR mask seems 0.5 to 1 grid cell shifted south.
- same near Disko
- same in S
Let’s shift everything N by half a grid cell
g.region region=MAR -p
r.mapcalc "mask_ice_shift = mask_ice"
g.region s=s+500 n=n+500
r.region -c map=mask_ice_shift
diff <(g.region -up region=MAR) <(g.region -up)
diff <(g.region -up region=MAR) <(g.region -up raster=mask_ice_shift)
diff <(g.region -up) <(g.region -up raster=mask_ice_shift)
r.to.vect input=mask_ice_shift output=mask_ice_shift type=area
d.vect mask_ice_shift fill_color=none color=red
- EPSG:3413 has projection errors of ~±8 % in Greenland (see the sketch after this list)
- Method
- Email: Re: {GRASS-user} scale error for each pixel
- Webmail: https://www.mail-archive.com/grass-user@lists.osgeo.org/msg35005.html
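A minimal cross-check of the same areal distortion, assuming pyproj is installed (the sample coordinate is arbitrary):

# Minimal sketch: areal scale factor of EPSG:3413 at one point via pyproj,
# and its inverse, which is the per-cell multiplier (err_2D_area) used below.
from pyproj import Proj

p = Proj("EPSG:3413")
f = p.get_factors(longitude=-45.0, latitude=67.0)  # arbitrary point in Greenland
print(f.areal_scale)        # projected cell area / true area
print(1.0 / f.areal_scale)  # multiplier applied to per-cell values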
MAPSET=$(g.mapset -p)
log_info "2D Area Error for ${MAPSET}"
v.mkgrid map=grid position=region type=point
v.out.ascii grid | m.proj input=- -od | cut -d"|" -f1,2 | tr '|' ' ' > ./tmp/distortion_ll_${MAPSET}.txt
PROJSTR=$(g.proj -j | grep -v 'crs')
echo $PROJSTR
cat ./tmp/distortion_ll_${MAPSET}.txt \
| proj -VS ${PROJSTR} \
| grep Areal \
| column -t \
| sed s/\ \ /,/g \
| cut -d, -f4 \
> ./tmp/distortion_err_${MAPSET}.txt
paste -d " " <(m.proj -i input=./tmp/distortion_ll_${MAPSET}.txt separator=space | cut -d" " -f1,2) ./tmp/distortion_err_${MAPSET}.txt | r.in.xyz input=- output=err_2D_inv separator=space
r.mapcalc "err_2D_area = 1/(err_2D_inv)" # convert to multiplier
r.mapcalc "err_2D_line = 1/(err_2D_inv^0.5)" # convert area error to linear error
r.out.gdal --o -c -m input=err_2D_area output=./tmp/err_2D_area_${MAPSET}.nc type=Float32 format=netCDF
if [[ "" == $(g.list type=raster pattern=err_2D_area mapset=.) ]]; then
<<area_error>>
fi
log_info "Creating RACMO mapset..."
g.mapset -c RACMO
FILE=${DATADIR}/RACMO/freshwater/Icemask_Topo_Iceclasses_lon_lat_average_1km.nc
cdo -sinfo ${FILE}
x0=$(ncdump -v x ${FILE} | grep "^ x =" | cut -d" " -f4 | cut -d, -f1)
x1=$(ncdump -v x ${FILE} | tail -n2 | head -n1 | tr ',' '\n' | tail -n1 | cut -d" " -f2)
y0=$(ncdump -v y ${FILE} | grep "^ y =" | cut -d" " -f4 | cut -d, -f1)
y1=$(ncdump -v y ${FILE} | tail -n2 | head -n1 | tr ',' '\n' | tail -n1 | cut -d" " -f2)
g.region w=$x0 e=$x1 s=$y0 n=$y1 res=1000 -p
g.region s=s-500 n=n+500 e=e+500 w=w-500 -p
g.region save=RACMO
See: 2D Area Error
if [[ "" == $(g.list type=raster pattern=err_2D_area mapset=.) ]]; then
<<area_error>>
fi
At this point the static work is done - we have basins, streams, and outlets for the full ice and land domains.
The remaining tasks are to SUM the daily model variables by each basin, and assign results to the outlet for each basin.
Working with the provided NetCDF files directly is possible, but accessing them is very slow (in GRASS). I believe this has to do with extracting one variable from a multi-variable file, and/or chunking and compression issues. Given that the code operates day-by-day, from 1950 through December 2022 there are src_bash{echo $(( ($(date -d 2022-12-31 +%s) - $(date -d 1950-01-01 +%s)) / 86400 ))} {{{results(26662)}}} days, so cutting access time by 1 second saves ~7.4 hours. In reality, by pre-processing the NetCDF files access time is reduced from 10 seconds to 1 second, saving ~2.8 days of processing time (roughly halving the total processing time). The cost for this savings is a few dozen lines of code and a few hours of processing.
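A quick check of that arithmetic:

# Quick check of the time-savings arithmetic above.
from datetime import date

n_days = (date(2022, 12, 31) - date(1950, 1, 1)).days  # 26662 days of daily fields
print(n_days / 3600)       # hours saved per 1 s of access time (~7.4 h)
print(n_days * 9 / 86400)  # days saved reducing access from 10 s to 1 s (~2.8 d)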
- MAR: One RUNOFF variable that contains appropriate runoff for both ice and land
- RACMO: One RUNOFF variable that contains appropriate runoff for both ice and land
import xarray as xr
import numpy as np
import os
import glob
from tqdm import tqdm
from dask.diagnostics import ProgressBar
<<get_DATADIR>>
From Xavier:
RU(CORR) = the (sub grid topography corrected) runoff over the permanent ice part of a pixel. In your case, I suggest you to use RUcorr instead of RU.
RU2 = runoff over the tundra part of a pixel (land runoff).
SF/RF are provided everywhere.
MSK_GIMP = land/sea/ice mask.
Over land you can use RUcorr and RU2 as a function of MSK_GIMP. Over sea, you can use RF+SF.
- Use xarray citep:hoyer_2017 and dask citep:DASK so that we can efficiently process the ~330 GB of data (much larger than RAM).
area = xr.open_dataset("./tmp/err_2D_area_MAR.nc")
ROOT=DATADIR+"/MAR/3.13-freshwater"
filelist = np.sort(glob.glob(ROOT+"/MARv3.13-daily-ERA5-????.nc"))
years = [_.split("-")[-1].split(".")[0] for _ in filelist]
for y in years:
outfile = "./tmp/MAR_runoff_ice_" + y + ".nc"
if os.path.exists(outfile):
continue
infile = ROOT+"/MARv3.13-daily-ERA5-" + y + ".nc"
print("infile: ", infile, "outfile: ", outfile)
ds = xr.open_mfdataset(infile,
chunks={'time': 30},
combine='by_coords')
da = (ds['RUcorr'] * (ds['MSK'] == 2) * area['Band1'].values) # runoff over ice
da.variable.attrs = {'units':'mmWE/day','long_name':'Water runoff','standard_name':'Water runoff'}
ds_sub = da.to_dataset(name='runoff')
delayed_obj = ds_sub.to_netcdf(outfile, compute=False)
with ProgressBar(): results = delayed_obj.compute() # takes a few min...
del(ds); del(da); del(ds_sub); del(delayed_obj)
for y in years:
outfile = "./tmp/MAR_runoff_land_" + y + ".nc"
if os.path.exists(outfile):
continue
infile = ROOT+"/MARv3.13-daily-ERA5-" + y + ".nc"
print("infile: ", infile, "outfile: ", outfile)
ds = xr.open_mfdataset(infile,
chunks={'time': 30},
combine='by_coords')
da = ds['RU2'] * (ds['MSK'] == 1) * area['Band1'].values # runoff over land
da.variable.attrs = {'units':'mmWE/day','long_name':'Water runoff','standard_name':'Water runoff'}
ds_sub = da.to_dataset(name='runoff')
delayed_obj = ds_sub.to_netcdf(outfile, compute=False)
with ProgressBar(): results = delayed_obj.compute()
del(ds); del(da); del(ds_sub); del(delayed_obj)
- Sum land runoff for one year
- Sum ice runoff for one year
- There should be no overlap (and no gaps at the boundaries)
grass -c ./G/tmp/
g.region MAR@MAR
export GRASS_OVERWRITE=1
d.mon start=wx0
d.erase
### MAR ice
seq=$(seq 365)
parallel --bar 'r.external -o source="netCDF:./tmp/MAR_runoff_ice_2000.nc:runoff" output=ice.{} band={}' ::: ${seq}
parallel --bar 'r.region -c map=ice.{} --q' ::: ${seq}
r.series input=$(g.list type=raster pattern=ice.* separator=,) output=ice method=maximum
r.mapcalc "ice0 = if(ice == 0, null(), ice)"
r.colors -a map=ice0 color=blues
### MAR land
parallel --bar 'r.external -o source="netCDF:./tmp/MAR_runoff_land_2000.nc:runoff" output=land.{} band={}' ::: ${seq}
parallel --bar 'r.region -c map=land.{} --q' ::: ${seq}
r.series input=$(g.list type=raster pattern=land.* separator=,) output=land method=maximum
r.mapcalc "land0 = if(land == 0, null(), land)"
r.colors -a map=land0 color=green
r.mapcalc "test = if(ice0) + if(land0)"
r.info test # should be min=NULL max=NULL showing no overlap. What about gaps?
r.mapcalc "test = if(isnull(ice0), 0, 1) + if(isnull(land0), 0, 2)"
# min = 0, max = 2
TODO: Problem: There appears to be no land runoff.
Brice email: Re: All freshwater
The data consist of:
1 km: ice covered regions
- Daily total precipitation (rain + snow) and runoff in mm w.e. or kg/m2 per day covering the whole GrIS and peripheral ice caps (GICS) for the year 1979-2017.
- Mask file including: lon/lat, PROMICE mask (3 = GrIS, 1 and 2 = GICs) and topography from the GIMP DEM upscaled to 1 km.
The data are statistically downscaled from the output of RACMO2.3p2 (5.5 km) to 1 km resolution.
The projection used is Polar Stereographic North (EPSG:3413) and the resolution is exactly 1 km x 1 km.
5.5 km: tundra region
- Daily runoff and evaporation in mm w.e. or kg/m2 per day at 5.5 km for 1979-2017.
Use of the data:
Note that in RACMO2 a positive subl means condensation while negative subl actually means sublimation.
- This data set was provided in its current form by Brice.
- 1 km gridded ice runoff on the EPSG:3413 grid
area = xr.open_dataset("./tmp/err_2D_area_RACMO.nc")
ROOT=DATADIR+"/RACMO/freshwater"
msk = xr.open_dataset(ROOT+"/Icemask_Topo_Iceclasses_lon_lat_average_1km.nc")
msk['landmask'] = (msk['Promicemask'] == 0) & (msk['Topography'] != 0)
filelist = np.sort(glob.glob(ROOT+"/runoff_ice_1km/runoff*.nc"))
years = np.unique([_.split(".")[1].split("_")[0] for _ in filelist])
for i,y in enumerate(tqdm(years)):
outfile = "./tmp/RACMO_runoff_ice_" + y + ".nc"
if os.path.exists(outfile):
continue
filelist_year = [_ for _ in filelist if str(y) in _]
print("infile: ", filelist_year, "outfile: ", outfile)
ds = xr.open_mfdataset(filelist_year,
chunks={'time': 30},
combine='by_coords')
da = (ds['runoffcorr'] * (msk['Promicemask'] != 0) * area['Band1'].values) # ice runoff
da.variable.attrs = {'units':'mm w.e. per day',
'long_name':'Downscaled corrected snowmelt',
'standard_name':'Downscaled_corrected_snowmelt'}
ds_sub = da.to_dataset(name='runoff')
delayed_obj = ds_sub.to_netcdf(outfile, compute=False)
with ProgressBar(): results = delayed_obj.compute()
del(ds); del(da); del(ds_sub); del(delayed_obj)
- This data set provided at 5.5 km on a rotated pole grid
- Code to regrid onto 1 km EPSG:3413 was also provided.
- See the ${DATADIR}/RACMO/freshwater/runoff_land_1km_regrid/ folder for details.
area = xr.open_dataset("./tmp/err_2D_area_RACMO.nc")
ROOT=DATADIR+"/RACMO/freshwater"
ROOT_REGRID = ROOT + "/runoff_land_1km_regrid"
msk = xr.open_dataset(ROOT_REGRID+"/Tundra_Mask_1km.nc")
filelist = np.sort(glob.glob(ROOT_REGRID + "/out/runoff*.nc"))
years = np.unique([_.split(".")[1] for _ in filelist])
for i,y in enumerate(tqdm(years)):
outfile = "./tmp/RACMO_runoff_land_" + y + ".nc"
if os.path.exists(outfile):
continue
print("infile: ", filelist[i], "outfile: ", outfile)
ds = xr.open_mfdataset(filelist[i], chunks={'time': 30}, combine='by_coords')
da = ds['runoff'] * (msk['Tundra']) * area['Band1'].values # runoff over land
da.variable.attrs = {'units':'mm w.e. per day',
'long_name':'Downscaled corrected snowmelt',
'standard_name':'Downscaled_corrected_snowmelt'}
ds_sub = da.to_dataset(name='runoff')
delayed_obj = ds_sub.to_netcdf(outfile, compute=False)
with ProgressBar(): results = delayed_obj.compute()
del(ds); del(da); del(ds_sub); del(delayed_obj)
Domains should be independent.
grass -c ./G/tmp/
g.region RACMO@RACMO
export GRASS_OVERWRITE=1
d.mon start=wx0
d.erase
### RACMO ice
seq=$(seq 365)
parallel --bar 'r.external -o source="netCDF:./tmp/RACMO_runoff_ice_2000.nc:runoff" output=ice.{} band={}' ::: ${seq}
parallel --bar 'r.region -c map=ice.{} --q' ::: ${seq}
r.series input=$(g.list type=raster pattern=ice.* separator=,) output=ice method=maximum
r.mapcalc "ice0 = if(ice == 0, null(), ice)"
r.colors -a map=ice0 color=blues
### RACMO land
parallel --bar 'r.external -o source="netCDF:./tmp/RACMO_runoff_land_2000.nc:runoff" output=land.{} band={}' ::: ${seq}
parallel --bar 'r.region -c map=land.{} --q' ::: ${seq}
r.series input=$(g.list type=raster pattern=land.* separator=,) output=land method=maximum
r.mapcalc "land0 = if(land == 0, null(), land)"
r.colors -a map=land0 color=green
r.mapcalc "test = if(ice0) + if(land0)"
r.info test # should be min=NULL max=NULL showing no overlap.
r.mapcalc "test = if(isnull(ice0), 0, 1) + if(isnull(land0), 0, 2)" # What about gaps?
r.info test
# min = 0, max = 2
d.rast test
d.legend test
- Not an issue when the RCM provides two different domains and a mask (MAR)
- May be an issue when RCM results are interpolated onto a grid from elsewhere (RACMO)
import xarray as xr
import numpy as np
import glob
import os
from matplotlib.pyplot import imshow  # used for the quick visual checks below
area = xr.open_dataset("./tmp/err_2D_area_RACMO.nc")
DATADIR='/home/kdm/data'
ROOT=DATADIR+"/RACMO/freshwater"
### ICE
msk = xr.open_dataset(ROOT+"/Icemask_Topo_Iceclasses_lon_lat_average_1km.nc")
msk['landmask'] = (msk['Promicemask'] == 0) & (msk['Topography'] != 0)
filelist = np.sort(glob.glob(ROOT+"1km/runoff*.nc"))
years = np.unique([_.split(".")[1].split("_")[0] for _ in filelist])
i = 42
y = '2000'
filelist_year = [_ for _ in filelist if str(y) in _]
ds = xr.open_mfdataset(filelist_year,
chunks={'time': 30},
combine='by_coords')
imshow(ds['runoffcorr'][200,:,:], origin='lower')
### LAND
ROOT_REGRID = ROOT + "/regrid_5.5-to-1km"
msk = xr.open_dataset(ROOT_REGRID+"/Tundra_Mask_1km.nc")
filelist = np.sort(glob.glob(ROOT_REGRID + "/out/runoff*.nc"))
years = np.unique([_.split(".")[2].split(".")[0] for _ in filelist])
i = 3
y = '2000'
ds = xr.open_mfdataset(filelist[i], chunks={'time': 30}, combine='by_coords')
imshow(msk['Tundra'], origin='lower')
land_runoff = ds.squeeze()['runoff'].max(axis=0).compute()
imshow(land_runoff, origin='lower')
imshow(land_runoff * msk['Tundra'], origin='lower') # appears that ice runoff "bleeds" into land runoff.
- I expect not - the difference should be due to the area scaling
cdo -v selyear,2000 ./tmp/MAR_runoff_ice.nc ./tmp/MAR_Y2K_tmp.nc
cdo -v yearsum ./tmp/MAR_Y2K_tmp.nc ./tmp/MAR_Y2K.nc
cdo -v selyear,2000 ./tmp/MAR_runoff_land.nc ./tmp/MAR_Y2K_tmp.nc
cdo -v yearsum ./tmp/MAR_Y2K_tmp.nc ./tmp/MAR_Y2K_land.nc
cdo -v selyear,2000 ./tmp/RACMO_runoff_ice.nc ./tmp/RACMO_Y2K_tmp.nc
cdo -v yearsum ./tmp/RACMO_Y2K_tmp.nc ./tmp/RACMO_Y2K.nc
rm ./tmp/{MAR,RACMO}_Y2K_tmp.nc
<<init>>
Have:
| product | loc        | k         | RCM   | DEM       | variable   |
|---------+------------+-----------+-------+-----------+------------|
| runoff  | ice margin | 80,90,100 | MAR   | ArcticDEM | RUcorr     |
| runoff  | ice margin | 80,90,100 | RACMO | ArcticDEM | runoffcorr |
| runoff  | land coast | 100       | MAR   | ArcticDEM | RUcorr     |
| runoff  | land coast | 100       | RACMO | ArcticDEM | runoffcorr |
Can expand to include:
- [ ] product
- More products (e.g. precip, condensation, etc.)
- [ ] location
- Add Fjord Surface. Improve linking ice margin to coast
- [ ] RCM
- More RCMs? More times (future simulations?) Near-realtime Wx forecast?
- [ ] DEM
- Multiple DEMs? Higher res ArcticDEM?
- [X] Other
- Multiple DEM routing schemes (subglacial)
- The algorithm is designed to capture model data that is over basin ice.
- This works without extra effort when operating in the ice domain routed to the ice edge, because data outside the domain is masked.
- However, when working over the ice domain routed to the coast, this includes the land domain. Now 1 km^2 model grid cells that only partially cover an ice basin are included in their entirety.
- As a result, ice runoff routed to the ice edge would not equal ice runoff routed over land to the coast, even though the two should be equal.
- Therefore, I build a basin mask here
- Later, we’ll apply the smallest version of AND(basin_mask, model_mask) limiting to the smallest ice subset.
log_info "Initilazing MAR mapset..."
g.mapset MAR
g.region -d # default region, or raster=basins_merged@coast
r.mapcalc "mask_ice_basin = if(basins@ice_100, 1, null())"
r.mapcalc "mask_land_basin = if(not(isnull(basins@land_100)) & isnull(mask_ice_basin), 1, null())"
r.mapcalc "mask_ice_land_basin = if(not(isnull(basins_filled@land_100)), 1, null())"
g.region MAR@MAR -p
r.in.gdal -o input="NetCDF:${DATADIR}/MAR/3.13-freshwater/MARv3.13-daily-ERA5-2000.nc:MSK" output=MSK
r.region map=MSK region=MAR@MAR
r.mapcalc "mask_ice_MAR = if((MSK == 2) & mask_ice_land_basin, 1, null())"
r.mapcalc "mask_land_MAR = if((MSK == 1) & mask_ice_land_basin, 1, null())"
g.region -d
Below is…
- NON PARALLEL
- not tangled
- Run for 5 days
- Run parallelized version for 5 days
- Check outputs are the same
Timing Test:
r.mapcalc "model = 1" # One synthetic day
r.region map=model region=MAR@MAR --q
r.mask -r
time r.univar -t map=model zones=basins@ice_100 > /dev/null # 10-15 sec
### LOOP VERSION (~11 seconds per day)
rm ./tmp/MAR_ice_runoff.noparallel.bsv
rm ./tmp/MAR_ice_runoff.bsv
for t in $(seq 0 4); do
date
tt=$(( $t + 1 )) # T0 is 0-based. r.external is 1-based
DATESTR=$(date -d"2000-01-01+${t} days" +"%Y-%m-%d")
r.external -o source="netCDF:./tmp/MAR_runoff_ice_2000.nc:runoff" output=model band=${tt} --o --q
r.region map=model region=MAR@MAR --q
r.univar -t map=model zones=basins@ice_100 --q | cut -d"|" -f13 | datamash -t"|" transpose | sed s/^sum/${DATESTR}/ >> ./tmp/MAR_ice_runoff.noparallel.bsv
done
Implement above code but in parallel
function basin_partition() {
JOB="$1" # 0 to ndays-1
SLOT="$2" # 0 to ncpus-1
ZONES="$3" # basins@land_100, basins@ice_80, etc.
RCMfilevar=$4 # MAR_runoff_ice_2000.nc:runoff
REGION=$5 # MAR@MAR
RCMfile=${RCMfilevar/:*/} # <RCM>_runoff_<domain>_<k>.nc
r.external -o source="netCDF:tmp/${RCMfilevar}" output=model_${SLOT} band=${JOB} --o --q 2>/dev/null
r.region map=model_${SLOT} region=${REGION} --q
DATESTR0=$(ncdump -h tmp/${RCMfile} |grep time:units)
DATESTR0=${DATESTR0: 27:10}
DATESTR=$(date -d"${DATESTR0}+$(( ${JOB}-1 )) days" +"%Y-%m-%d")
r.univar -t map=model_${SLOT} zones=${ZONES} --q 2>/dev/null | cut -d"|" -f13 | datamash -t"|" transpose | sed s/^sum/${DATESTR}/
}
export -f basin_partition
DEBUG: Process a few days and make sure the output is the same as the NON PARALLEL version above.
seq 5 | parallel --keep-order --bar basin_partition {#} {%} basins@ice_100 "MAR_runoff_ice_1980.nc:runoff" MAR@MAR >> ./tmp/MAR_ice_runoff.bsv # DEBUG
diff -q ./tmp/MAR_ice_runoff.bsv ./tmp/MAR_ice_runoff.noparallel.bsv
head -n1 ./tmp/MAR_ice_runoff.bsv | tr '|' '\n' | wc
tail -n1 ./tmp/MAR_ice_runoff.bsv | tr '|' '\n' | wc
We want to calculate
- ice runoff to ice margin
- land runoff to coast
Each is calculated separately, so it is easiest to build the ZONES input for r.univar to cover only what we want.
Set up header
log_info "Building BSV headers"
# r.external -o source="netCDF:./tmp/MAR_runoff_ice.nc:runoff" output=model band=1 --o --q # load in a sample day
# r.region map=model region=MAR@MAR
r.mapcalc "model = 1" --o # dummy data to build headers
for k_pct in 100 90 80; do
r.univar -t map=model zones=basins@ice_${k_pct} \
| sed s/^zone/cat/ \
| cut -d"|" -f1 \
| datamash -t"|" transpose \
> ./tmp/header_ice_${k_pct}.bsv
done
# land only 100
k_pct=100
r.univar -t map=model zones=basins@land_${k_pct} \
| sed s/^zone/cat/ \
| cut -d"|" -f1 \
| datamash -t"|" transpose \
> ./tmp/header_land_${k_pct}.bsv
log_info "Using EXTRA GNU parallel options via \${PARALLEL}: ${PARALLEL:-Not set}"
for y in $(seq 1950 2022); do
for k_pct in 100 90 80; do
file_base=MAR_runoff_ice_${y}
infile=${file_base}.nc
outfile=${file_base}_${k_pct}.bsv
log_info "Calculating MAR ice runoff at ice outlets: ${outfile}"
if [[ -e ./dat/${outfile} ]]; then
log_warn "Output file ${outfile} exists. Skipping..."
else
cp ./tmp/header_ice_${k_pct}.bsv ./dat/${outfile}
T1=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | rev | cut -d, -f1 | rev)
seq 0 ${T1} \
| parallel --nice 9 --keep-order --bar basin_partition {#} {%} basins@ice_${k_pct} "${infile}:runoff" MAR@MAR \
>> ./dat/${outfile}
fi
done
done
I expect outputs to be less than inputs because not every model ice cell is fully sampled: not every model ice cell has an ice basin under it, and model ice cells over land are not counted.
import pandas as pd
import xarray as xr
import numpy as np
<<get_DATADIR>>
mask = xr.open_dataset(DATADIR+"/MAR/3.13-freshwater/MARv3.13-daily-ERA5-2000.nc")['MSK']
# First day match? # convert mm/grid cell to m^3
o = pd.read_csv('./dat/runoff_ice2margin_MAR.bsv', sep="|", index_col=0, nrows=1) * 1E-3 * 100 * 100
# first year of input
i = xr.open_dataset('./dat/MAR_runoff_ice.nc').isel(time=0)['runoff'] * 1E-3 * 1000 * 1000 * (mask == 2)
osum = o.sum().sum()
isum = i.sum().values
print(isum, osum)
print(isum/osum)
# check for first year
o = pd.read_csv('./dat/runoff_ice2margin_MAR.bsv', sep="|", index_col=0, nrows=366) * 1E-3 * 100 * 100
i = xr.open_dataset('./dat/MAR_runoff_ice.nc').isel(time=np.arange(366))['runoff'] * 1E-3 * 1000 * 1000 * (mask == 2)
osum = o.sum().sum()
isum = i.sum().values
print(isum, osum)
print(isum/osum)
- Inputs are ~4 % larger than outputs over the first year
- Due to EPSG:3413 scaling?
- Due to Arctic Canada in this mask?
log_info "Using EXTRA GNU parallel options via \${PARALLEL}: ${PARALLEL:-Not set}"
k_pct=100
for y in $(seq 1950 2022); do
file_base=MAR_runoff_land_${y}
infile=${file_base}.nc
outfile=${file_base}_${k_pct}.bsv
log_info "Calculating MAR land runoff at land outlets: ${outfile}"
if [[ -e ./dat/${outfile} ]]; then
log_warn "Output file ${outfile} exists. Skipping..."
else
cp ./tmp/header_land_${k_pct}.bsv ./dat/${outfile}
T1=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | rev | cut -d, -f1 | rev)
seq 0 ${T1} \
| parallel --nice 9 --keep-order --bar basin_partition {#} {%} basins@land_${k_pct} "${infile}:runoff" MAR@MAR \
>> ./dat/${outfile}
fi
done
g.mapset -c RACMO
g.region -d
r.mapcalc "mask_ice_basin = if(basins@ice_100, 1, null())"
r.mapcalc "mask_land_basin = if(not(isnull(basins@land_100)) & isnull(mask_ice_basin), 1, null())"
r.mapcalc "mask_ice_land_basin = if(not(isnull(basins_filled@land_100)), 1, null())"
g.region RACMO@RACMO -p
FILE=${DATADIR}/RACMO/freshwater/Icemask_Topo_Iceclasses_lon_lat_average_1km.nc
r.in.gdal -o input="NetCDF:${FILE}:Promicemask" output=mask_ice
r.region map=mask_ice region=RACMO@RACMO
r.mapcalc "mask_ice_RACMO = if((mask_ice != 0) & mask_ice_land_basin, 1, null())"
r.mapcalc "mask_land_RACMO = if((mask_ice == 0) & mask_ice_land_basin, 1, null())"
g.region -d
r.mapcalc "model = 1" # One synthetic day
r.region map=model region=RACMO@RACMO
g.mapset RACMO
log_info "Using EXTRA GNU parallel options via \${PARALLEL}: ${PARALLEL:-Not set}"
# ice
for y in $(seq 1958 2022); do
for k_pct in 100 90 80; do
file_base=RACMO_runoff_ice_${y}
infile=${file_base}.nc
outfile=${file_base}_${k_pct}.bsv
log_info "Calculating RACMO ice runoff at ice outlets: ${outfile}"
if [[ -e ./dat/${outfile} ]]; then
log_warn "Output file ${outfile} exists. Skipping..."
else
cp ./tmp/header_ice_${k_pct}.bsv ./dat/${outfile}
T0=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | cut -d, -f1)
T1=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | rev | cut -d, -f1 | rev)
seq ${T0} ${T1} \
| parallel --nice 9 --keep-order --bar basin_partition {#} {%} basins@ice_${k_pct} "${infile}:runoff" RACMO@RACMO \
>> ./dat/${outfile}
fi
done
done
# land
k_pct=100
for y in $(seq 1958 2022); do
file_base=RACMO_runoff_land_${y}
infile=${file_base}.nc
outfile=${file_base}_${k_pct}.bsv
log_info "Calculating RACMO land runoff at land outlets: ${outfile}"
if [[ -e ./dat/${outfile} ]]; then
log_warn "Output file ${outfile} exists. Skipping..."
else
cp ./tmp/header_land_${k_pct}.bsv ./dat/${outfile}
T0=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | cut -d, -f1)
T1=$(ncdump -v time ./tmp/${infile} | grep -A 100000 "^ time =" | tr -dc [0-9,] | rev | cut -d, -f1 | rev)
seq ${T0} ${T1} \
| parallel --nice 9 --keep-order --bar basin_partition {#} {%} basins@land_${k_pct} "${infile}:runoff" RACMO@RACMO \
>> ./dat/${outfile}
fi
done
grass -c ./G/fjords
g.region -d res=1000 -a -p
r.mask -r
d.mon start=wx0
d.erase
d.rast shade
d.rast land_shade
# fill everything (Disko bay, Geiki, fjords, etc.)
r.grow -m input=head@coast output=grow radius=50000 old=-1 new=1 --o
# shrink back to just the coastal region
r.grow -m input=grow output=shrink radius=-44000 --o
r.null map=shrink setnull=-1
# fill in cells that are part land/coast
# but might have land at the center at 30 m res
r.grow -m input=shrink output=buffer_coast radius=1001 new=1 --o
d.erase
d.rast shade
d.rast buffer_coast
# for display purposes?
r.to.vect input=buffer_coast output=buffer_coast type=point
# then, r.out.xyz for the model data using buffer_coast as a mask.
# This should produce "x,y,val" for each time step
grass -c ./G/tmp/
export GRASS_OVERWRITE=1
d.mon start=wx0
d.erase
### ICE
d.rast basins@ice_80
# OR
r.to.vect input=basins@ice_80 output=zones type=area
d.vect zones fill_color=none color=red
### MAR
band=250
r.external -o source="netCDF:./tmp/MAR_runoff_ice_2000.nc:runoff" output=map band=${band}
r.region map=map region=MAR@MAR --q
r.colors -a map=map color=viridis
d.rast map
d.vect zones fill_color=none color=red
### RACMO
band=200
r.external -o source="netCDF:./tmp/RACMO_runoff_ice_1958.nc:runoff" output=map band=${band}
r.region map=map region=RACMO@RACMO --q
r.colors -a map=map color=viridis
d.rast map
### LAND
# r.to.vect input=basins@land_100 output=zones type=area
# d.vect zones fill_color=none color=red
d.rast basins@land_100
# MAR
band=220
r.external -o source="netCDF:./tmp/MAR_runoff_land_1999.nc:runoff" output=map band=${band}
r.region map=map region=MAR@MAR --q
r.colors -a map=map color=viridis
d.rast map
# RACMO
band=220
r.external -o source="netCDF:./tmp/RACMO_runoff_land_1999.nc:runoff" output=map band=${band}
r.region map=map region=RACMO@RACMO --q
r.colors -a map=map color=viridis
d.rast map
# r.to.vect input=map_${band} output=map_${band} type=area
# d.vect map_${band} fill_color=none color=blue
<<init>>
See [ Coverage ]
- How much of each ice basin is (1) (is not (*)) covered by model ice cells.
for k in 80 90 100; do
mapset=ice_${k}
g.mapset ${mapset}
r.mask -r
r.stats --q -aN input=basins@${mapset},mask_ice_MAR@MAR separator=, \
> ./tmp/coverage_MAR_ice_${k}.csv
r.stats --q -aN input=basins@${mapset},mask_ice_RACMO@RACMO separator=, \
> ./tmp/coverage_RACMO_ice_${k}.csv
done
- How much of each land basin is (1) (is not (*)) covered by model land cells.
k=100
mapset=land_${k}
g.mapset ${mapset}
r.mask -r
r.stats --q -aN input=basins@${mapset},mask_land_MAR@MAR separator=, \
> ./tmp/coverage_MAR_land_${k}.csv
r.stats --q -aN input=basins@${mapset},mask_land_RACMO@RACMO separator=, \
> ./tmp/coverage_RACMO_land_${k}.csv
Seems the easiest (and fastest) way to apply coverage is to write to NetCDF that is in the same format as the runoff NetCDF files so that they can easily be multiplied together.
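A minimal sketch of that multiplication; the coverage file and variable names are illustrative:

# Minimal sketch: multiply a runoff NetCDF by a coverage NetCDF written on the
# same grid / with the same shape. The coverage file name and variable are
# hypothetical; the runoff file follows the ./tmp naming used in this workflow.
import xarray as xr

runoff = xr.open_dataset("tmp/MAR_runoff_ice_2012.nc")
coverage = xr.open_dataset("tmp/MAR_coverage_ice.nc")  # hypothetical coverage file
corrected = runoff["runoff"] * coverage["coverage"]    # element-wise; broadcasts over time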
Convert annual BSV files to annual NetCDF format files
Want to produce CF-compliant and user-friendly NetCDF that will work with nco and panoply.
- SEE: https://www.nodc.noaa.gov/data/formats/netcdf/v2.0/ timeSeriesProfile
- https://www.nodc.noaa.gov/data/formats/netcdf/v2.0/timeSeriesOrthogonal.cdl
- http://cfconventions.org/cf-conventions/cf-conventions.html#time-series-data
Also modeling after:
- ~/data/Bamber_2018/FWF17.v3_a.nc
- See also: https://gis.stackexchange.com/questions/67012/using-netcdf-for-point-time-series-observations
This code:
- Can be run 1x within Org to produce ds.nc for testing purposes
- Is also included in the tangled bsv2netcdf.py script
In the bsv2netcdf.py script, pass the input file in, and the output file is determined from the input file name.
Can be run in parallel (for example in the Makefile) with: parallel --bar "python bsv2netcdf.py {}" ::: $(ls dat/*_runoff_*.bsv)
import pandas as pd
import xarray as xr
import numpy as np
import datetime
import subprocess
import os
# if "BSVFILE" not in locals(): BSVFILE="./tmp/runoff_ice2margin_MAR.bsv.1979"
#if "BSVFILE" not in locals(): BSVFILE="./tmp/runoff_ice2margin_MAR.bsv.2000"
if "BSVFILE" not in locals(): BSVFILE="./dat/RACMO_runoff_land_1979_100.bsv"
df = pd.read_csv(BSVFILE, index_col=0, delimiter="|")
df.index = pd.to_datetime(df.index)
RCM, var, domain, year, k = os.path.basename(os.path.splitext(BSVFILE)[0]).split("_")
# patch RACMO dates
if (RCM == 'RACMO') & (domain == 'land'):
df.index = pd.date_range(start=str(year)+'-01-01', end=str(year)+'-12-31')[0:df.index.size]
metadir = domain + "_" + k
meta = pd.read_csv("freshwater/" + metadir + "/outlets.csv" , index_col=0).T
meta = meta[df.columns.astype(int)]
ds = xr.Dataset()
ds["station"] = (("station"), df.columns.astype(np.str).astype(np.uint32))
ds["time"] = (("time"), df.index)
ds["station"].attrs["long_name"] = "outlet id"
ds["station"].attrs["cf_role"] = "timeseries_id"
# ds["station"].attrs["units"] = ""
ds["time"].attrs["long_name"] = "time of measurement"
ds["time"].attrs["standard_name"] = "time"
# ds["time"].attrs["units"] = "day of year"
# ds["time"].attrs["calendar"] = "julian"
ds["time"].attrs["axis"] = "T"
# gridsize = 100 x 100
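# CONV converts the BSV values (mm w.e. day-1 on the 100 m x 100 m basin grid,
# per the gridsize note above) to m3 s-1:
# mm -> m (1E-3), per-cell area (100 m * 100 m), day -> s (1/86400).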
CONV = 1E-3 * 100 * 100 / 86400
ds["discharge"] = (("station", "time"), df.values.T*CONV)
ds["discharge"].attrs["long_name"] = RCM + " discharge"
ds["discharge"].attrs["standard_name"] = "water_volume_transport_in_river_channel"
ds["discharge"].attrs["units"] = "m3 s-1"
ds["discharge"].attrs["coordinates"] = "lat lon alt station"
ds["lat"] = (("station"), meta.loc['lat'].astype(np.float32))
#ds["lat"].attrs["coordinates"] = "station"
ds["lat"].attrs["long_name"] = "latitude"
ds["lat"].attrs["standard_name"] = "latitude"
ds["lat"].attrs["units"] = "degrees_north"
ds["lat"].attrs["axis"] = "Y"
ds["lon"] = (("station"), meta.loc['lon'].astype(np.float32))
#ds["lon"].attrs["coordinates"] = "station"
ds["lon"].attrs["long_name"] = "longitude"
ds["lon"].attrs["standard_name"] = "longitude"
ds["lon"].attrs["units"] = "degrees_east"
ds["lon"].attrs["axis"] = "X"
ds["alt"] = (("station"), meta.loc['elev'].astype(np.float32))
ds["alt"].attrs["long_name"] = "height_above_mean_sea_level"
ds["alt"].attrs["standard_name"] = "altitude"
# ds["alt"].attrs["long_name"] = "height above mean sea level"
# ds["alt"].attrs["standard_name"] = "height"
ds["alt"].attrs["units"] = "m"
ds["alt"].attrs["positive"] = "up"
ds["alt"].attrs["axis"] = "Z"
ds["Z2012_sector"] = (("station"), meta.loc["Z2012_sector"].replace(np.nan, value=-1).astype(np.uint32))
ds["Z2012_sector"].attrs["long_name"] = "Zwally (2012) sector nearest outlet"
ds["Z2012_sector_dist"] = (("station"), meta.loc["Z2012_sector_dist"].replace(np.nan, value=-1).astype(np.uint32))
ds["Z2012_sector_dist"].attrs["long_name"] = "Zwally (2012) sector nearest outlet distance"
ds["M2019_ID"] = (("station"), meta.loc["M2019_ID"].replace(np.nan, value=-1).astype(np.uint32))
ds["M2019_ID"].attrs["long_name"] = "Mouginot (2019) basin nearest outlet ID"
ds["M2019_ID_dist"] = (("station"), meta.loc["M2019_ID_dist"].replace(np.nan, value=-1).astype(np.uint32))
ds["M2019_ID_dist"].attrs["long_name"] = "Mouginot (2019) basin nearest outlet ID distance"
ds["M2019_basin"] = (("station"), meta.loc["M2019_basin"])
ds["M2019_basin"].attrs["long_name"] = "Mouginot (2019) basin nearest outlet name"
ds["M2019_region"] = (("station"), meta.loc["M2019_region"])
ds["M2019_region"].attrs["long_name"] = "Mouginot (2019) region nearest outlet"
ds["M2020_gate"] = (("station"), meta.loc["M2020_gate"].replace(np.nan, value=-1).astype(np.uint32))
ds["M2020_gate"].attrs["long_name"] = "Mankoff (2020) gate nearest outlet"
ds["M2020_gate_dist"] = (("station"), meta.loc["M2020_gate_dist"].replace(np.nan, value=-1).astype(np.uint32))
ds["M2020_gate_dist"].attrs["long_name"] = "Mankoff (2020) gate nearest outlet distance"
ds["B2015_name"] = (("station"), meta.loc["B2015_name"])
ds["B2015_name"].attrs["long_name"] = "Bjørk (2015) Official Name gate nearest outlet"
ds["B2015_dist"] = (("station"), meta.loc["B2015_dist"])
ds["B2015_dist"].attrs["long_name"] = "Bjørk (2015) Official Name point distance"
if 'ice' in BSVFILE:
ds["coast_id"] = (("station"), meta.loc["coast_id"].replace(np.nan, value=-1).astype(np.uint32))
ds["coast_id"].attrs["long_name"] = "ID of coastal outlet"
ds["coast_lat"] = (("station"), meta.loc['coast_lat'].astype(np.float32))
ds["coast_lat"].attrs["long_name"] = "latitude"
ds["coast_lat"].attrs["standard_name"] = "latitude"
ds["coast_lat"].attrs["units"] = "degrees_north"
ds["coast_lat"].attrs["axis"] = "Y"
ds["coast_lon"] = (("station"), meta.loc['coast_lon'].astype(np.float32))
ds["coast_lon"].attrs["long_name"] = "longitude"
ds["coast_lon"].attrs["standard_name"] = "longitude"
ds["coast_lon"].attrs["units"] = "degrees_east"
ds["coast_lon"].attrs["axis"] = "X"
ds["coast_alt"] = (("station"), meta.loc['elev'].astype(np.float32))
ds["coast_alt"].attrs["long_name"] = "height_above_mean_sea_level"
ds["coast_alt"].attrs["standard_name"] = "altitude"
ds["coast_alt"].attrs["units"] = "m"
ds["coast_alt"].attrs["positive"] = "up"
ds["coast_alt"].attrs["axis"] = "Z"
ds.attrs["featureType"] = "timeSeries"
ds.attrs["title"] = "Greenland discharge"
ds.attrs["summary"] = "Greenland RCM discharge at basin outlets"
ds.attrs["keywords"] = "Hydrology; Greenland; Runoff; Discharge; Freshwater"
ds.attrs["Conventions"] = "CF-1.7"
ds.attrs["source"] = "git commit: " + \
subprocess.check_output(["git", "describe", "--always"]).strip().decode('UTF-8')
# ds.attrs["comment"] = "TODO"
# ds.attrs["acknowledgment"] = "TODO"
# ds.attrs["license"] = "TODO"
# ds.attrs["date_created"] = datetime.datetime.now().strftime("%Y-%m-%d")
ds.attrs["creator_name"] = "Ken Mankoff"
ds.attrs["creator_email"] = "kdm@geus.dk"
ds.attrs["creator_url"] = "http://kenmankoff.com"
ds.attrs["institution"] = "GEUS"
# ds.attrs["time_coverage_start"] = "TODO"
# ds.attrs["time_coverage_end"] = "TODO"
# ds.attrs["time_coverage_resolution"] = "TODO"
ds.attrs["references"] = "10.22008/promice/freshwater"
ds.attrs["product_version"] = 1.0
And now noweb-weave the bsv2xarray block above into bsv2netcdf.py to convert all the BSV files to NetCDF files.
import numpy as np
import pandas as pd
import xarray as xr
import sys
import os
f = sys.argv[1]
# f = './dat/MAR_runoff_ice_1979_100.bsv'
RCM, var, domain, year, k = os.path.basename(os.path.splitext(f)[0]).split("_")
#MAR runoff ice 2000 100
# Set up output folder and file name
OUTDIR = "./freshwater/" + domain + "_" + k + "/discharge" # ./freshwater_90 or ./freshwater_80
# if k == '100': OUTDIR = "./freshwater/" + domain + "/discharge" # ./freshwater/
if not os.path.exists(OUTDIR):
os.makedirs(OUTDIR, exist_ok=True)
OUTBASE = RCM + "_" + year
OUTFILE = OUTDIR + "/" + OUTBASE + ".nc"
# Generate xarray structure
BSVFILE=f
<<bsv2xarray>>
# load and apply coverage scaling
cf = "coverage_" + RCM + "_" + domain +"_" + k + ".nc"
c = xr.open_dataset("./tmp/"+cf)
c = c.where(c['runoff'] != 0, np.nan, drop=False)
ds['discharge'].values = ds['discharge']/c['runoff'] # apply coverage
ds.to_netcdf(OUTFILE, unlimited_dims='time', encoding={'discharge': {'zlib': True, 'complevel': 2}}, engine='netcdf4', mode='w')
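A quick, optional check of a resulting per-year file (not part of the tangled workflow; the file name is only an example):
import xarray as xr
ds = xr.open_dataset("./freshwater/ice_100/discharge/MAR_2000.nc")
print(ds)                                  # stations, time, discharge, and outlet metadata
print(ds["discharge"].sum(dim="station"))  # total discharge across all outlets, per day [m3 s-1]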
Note:
ln -s ice_100 ice
ln -s land_100 land
import xarray as xr
import numpy as np
for d in ['ice','land']:
for r in ['RACMO','MAR']:
print(d,r)
ds = xr.open_mfdataset('./freshwater/'+d+'/discharge/'+r+'_*.nc', combine='by_coords')
for v in ['Z2012_sector', 'Z2012_sector_dist',
'M2019_ID','M2019_ID_dist','M2019_region','M2019_basin',
'M2020_gate','M2020_gate_dist',
'B2015_name','B2015_dist']:
ds[v] = ds[v].isel({'time':1}).squeeze()
if d == 'ice':
for v in ['coast_id','coast_lat','coast_lon','coast_alt']:
ds[v] = ds[v].isel({'time':1}).squeeze()
for v in ['M2019_basin','M2019_region','B2015_name']:
ds[v] = ds[v].astype(str)
for v in ['B2015_dist','M2020_gate_dist','M2020_gate',
'M2019_ID_dist','M2019_ID','Z2012_sector_dist']:
ds[v] = ds[v].astype(np.uint32)
ds.to_netcdf('./freshwater/'+d+'/'+r+'.nc',
unlimited_dims='time',
encoding={'discharge': {'zlib': True, 'complevel': 2},
'Z2012_sector': {'zlib': True, 'complevel': 2},
'M2019_ID': {'zlib': True, 'complevel': 2},
'M2019_basin': {'zlib': True, 'complevel': 2},
'M2019_region': {'zlib': True, 'complevel': 2},
'B2015_name': {'zlib': True, 'complevel': 2},
'M2020_gate': {'zlib': True, 'complevel': 2},
},
engine='netcdf4', mode='w')
This code, and all code files in this project, are derived products tangled from the sob.org source file.
freshwater.zip: G run
# dist
G:
grass -e -c EPSG:3413 ./G
run: FORCE
# basins, outlets, and streams
grass ./G/PERMANENT --exec ./import.sh
grass ./G/PERMANENT --exec ./sob.sh
# how do basins change under different SOB routing?
grass ./G/PERMANENT --exec ./k_basin_change.sh
# 2D EPSG area error
grass ./G/PERMANENT --exec ./area_error.sh
# RCM
mkdir -p dat
python ./rcm_preprocess.py
grass ./G/PERMANENT --exec ./rcm.sh # partition RCM output to basins
# coverage
grass ./G/PERMANENT --exec ./coverage_calc.sh
python coverage_csv2nc.py
# create end-user data product
parallel --bar "python bsv2netcdf.py {}" ::: $$(ls dat/*_runoff_*.bsv)
python netcdf_combine.py
dist: freshwater.zip
zip -r freshwater.zip freshwater
FORCE: # dummy target
clean_all:
rm -fR G tmp dat freshwater freshwater.zip
clean_RCM:
rm -fR G/MAR G/RACMO
clean_sob:
rm -fR G/land_* G/ice_*
- Show enough information about data files and metadata to aid in reproducibility
file=${DATADIR}/ArcticDEM/arcticdem_mosaic_100m_v4.1_dem.tif
md5sum ${file}
echo ""
gdalinfo ${file}
be66611e17eb925404daa080e690522a /home/kdm/data/ArcticDEM/arcticdem_mosaic_100m_v4.1_dem.tif Driver: GTiff/GeoTIFF Files: /home/kdm/data/ArcticDEM/arcticdem_mosaic_100m_v4.1_dem.tif Size is 74002, 75002 Coordinate System is: PROJCRS["WGS 84 / NSIDC Sea Ice Polar Stereographic North", BASEGEOGCRS["WGS 84", ENSEMBLE["World Geodetic System 1984 ensemble", MEMBER["World Geodetic System 1984 (Transit)"], MEMBER["World Geodetic System 1984 (G730)"], MEMBER["World Geodetic System 1984 (G873)"], MEMBER["World Geodetic System 1984 (G1150)"], MEMBER["World Geodetic System 1984 (G1674)"], MEMBER["World Geodetic System 1984 (G1762)"], MEMBER["World Geodetic System 1984 (G2139)"], ELLIPSOID["WGS 84",6378137,298.257223563, LENGTHUNIT["metre",1]], ENSEMBLEACCURACY[2.0]], PRIMEM["Greenwich",0, ANGLEUNIT["degree",0.0174532925199433]], ID["EPSG",4326]], CONVERSION["US NSIDC Sea Ice polar stereographic north", METHOD["Polar Stereographic (variant B)", ID["EPSG",9829]], PARAMETER["Latitude of standard parallel",70, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8832]], PARAMETER["Longitude of origin",-45, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8833]], PARAMETER["False easting",0, LENGTHUNIT["metre",1], ID["EPSG",8806]], PARAMETER["False northing",0, LENGTHUNIT["metre",1], ID["EPSG",8807]]], CS[Cartesian,2], AXIS["easting (X)",south, MERIDIAN[45, ANGLEUNIT["degree",0.0174532925199433]], ORDER[1], LENGTHUNIT["metre",1]], AXIS["northing (Y)",south, MERIDIAN[135, ANGLEUNIT["degree",0.0174532925199433]], ORDER[2], LENGTHUNIT["metre",1]], USAGE[ SCOPE["Polar research."], AREA["Northern hemisphere - north of 60°N onshore and offshore, including Arctic."], BBOX[60,-180,90,180]], ID["EPSG",3413]] Data axis to CRS axis mapping: 1,2 Origin = (-4000100.000000000000000,4100100.000000000000000) Pixel Size = (100.000000000000000,-100.000000000000000) Metadata: AREA_OR_POINT=Area Image Structure Metadata: COMPRESSION=LZW INTERLEAVE=BAND LAYOUT=COG PREDICTOR=3 Corner Coordinates: Upper Left (-4000100.000, 4100100.000) (179d17'33.71"E, 40d21'17.26"N) Lower Left (-4000100.000,-3400100.000) ( 94d38' 7.22"W, 44d 3'59.59"N) Upper Right ( 3400100.000, 4100100.000) ( 95d19'55.26"E, 43d27'54.66"N) Lower Right ( 3400100.000,-3400100.000) ( 0d 0' 0.01"E, 47d34'59.22"N) Center ( -300000.000, 350000.000) (175d36' 4.66"E, 85d44'47.27"N) Band 1 Block=512x512 Type=Float32, ColorInterp=Gray NoData Value=-9999 Overviews: 37001x37501, 18500x18750, 9250x9375, 4625x4687, 2312x2343, 1156x1171, 578x585, 289x292
file=${DATADIR}/Citterio_2013/PROMICE_250_2019-12-04_EPSG4326/PROMICE_250_2019-12-04.shp
md5sum ${file}
a69e8f7fa340d8998c69aef539de8c58 /home/kdm/data/Citterio_2013/PROMICE_250_2019-12-04_EPSG4326/PROMICE_250_2019-12-04.shp
file=${DATADIR}/GIMP/0714/1999.07.01/GimpOceanMask_90m
md5sum ${file}
echo ""
gdalinfo ${file}
99e30ed3209ef4c6d4fe4e84cd76596c /home/kdm/data/GIMP/0714/1999.07.01/GimpOceanMask_90m Driver: GTiff/GeoTIFF Files: /home/kdm/data/GIMP/0714/1999.07.01/GimpOceanMask_90m Size is 16620, 30000 Coordinate System is: PROJCRS["unnamed", BASEGEOGCRS["WGS 84", DATUM["World Geodetic System 1984", ELLIPSOID["WGS 84",6378137,298.257223563, LENGTHUNIT["metre",1]]], PRIMEM["Greenwich",0, ANGLEUNIT["degree",0.0174532925199433]], ID["EPSG",4326]], CONVERSION["Polar Stereographic (variant B)", METHOD["Polar Stereographic (variant B)", ID["EPSG",9829]], PARAMETER["Latitude of standard parallel",70, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8832]], PARAMETER["Longitude of origin",-45, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8833]], PARAMETER["False easting",1, LENGTHUNIT["metre",1], ID["EPSG",8806]], PARAMETER["False northing",1, LENGTHUNIT["metre",1], ID["EPSG",8807]]], CS[Cartesian,2], AXIS["(E)",south, MERIDIAN[90, ANGLEUNIT["degree",0.0174532925199433, ID["EPSG",9122]]], ORDER[1], LENGTHUNIT["metre",1]], AXIS["(N)",south, MERIDIAN[180, ANGLEUNIT["degree",0.0174532925199433, ID["EPSG",9122]]], ORDER[2], LENGTHUNIT["metre",1]]] Data axis to CRS axis mapping: 1,2 Origin = (-640000.000000000000000,-655550.000000000000000) Pixel Size = (90.000000000000000,-90.000000000000000) Metadata: AREA_OR_POINT=Area Image Structure Metadata: COMPRESSION=DEFLATE INTERLEAVE=BAND Corner Coordinates: Upper Left ( -640000.000, -655550.000) ( 89d18'44.40"W, 81d33'26.51"N) Lower Left ( -640000.000,-3355550.000) ( 55d47'53.80"W, 59d11'57.96"N) Upper Right ( 855800.000, -655550.000) ( 7d32'51.02"E, 80d 4'20.46"N) Lower Right ( 855800.000,-3355550.000) ( 30d41'32.31"W, 58d47'45.95"N) Center ( 107900.000,-2005550.000) ( 41d55'13.60"W, 71d36'44.99"N) Band 1 Block=16620x1 Type=Int16, ColorInterp=Gray
md5sum ${DATADIR}/Morlighem_2017/BedMachineGreenland-v5.nc
echo ""
ncdump -chs ${DATADIR}/Morlighem_2017/BedMachineGreenland-v5.nc
7387182a059dd8cad66ce7638eb0d7cd /home/kdm/data/Morlighem_2017/BedMachineGreenland-v5.nc netcdf BedMachineGreenland-v5 { dimensions: x = 10218 ; y = 18346 ; variables: char mapping ; mapping:grid_mapping_name = "polar_stereographic" ; mapping:latitude_of_projection_origin = 90. ; mapping:standard_parallel = 70. ; mapping:straight_vertical_longitude_from_pole = -45. ; mapping:false_easting = 0. ; mapping:false_northing = 0. ; mapping:proj4text = "+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs" ; mapping:crs_wkt = "PROJCS[\"WGS 84 / NSIDC Sea Ice Polar Stereographic North\",GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]],PROJECTION[\"Polar_Stereographic\"],PARAMETER[\"latitude_of_origin\",70],PARAMETER[\"central_meridian\",-45],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",0],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AXIS[\"X\",EAST],AXIS[\"Y\",NORTH],AUTHORITY[\"EPSG\",\"3413\"]]" ; mapping:_Storage = "contiguous" ; int x(x) ; x:long_name = "Cartesian x-coordinate" ; x:standard_name = "projection_x_coordinate" ; x:units = "meter" ; x:_Storage = "contiguous" ; x:_Endianness = "little" ; int y(y) ; y:long_name = "Cartesian y-coordinate" ; y:standard_name = "projection_y_coordinate" ; y:units = "meter" ; y:_Storage = "contiguous" ; y:_Endianness = "little" ; byte mask(y, x) ; mask:long_name = "mask (0 = ocean, 1 = ice-free land, 2 = grounded ice, 3 = floating ice, 4 = non-Greenland land)" ; mask:grid_mapping = "mapping" ; mask:valid_range = 0b, 4b ; mask:flag_values = 0b, 1b, 2b, 3b, 4b ; mask:flag_meanings = "ocean ice_free_land grounded_ice floating_ice non_greenland_land" ; mask:source = "gimpmask v2.0 (https://byrd.osu.edu/research/groups/glacier-dynamics/data/icemask) combined with coastline from Jeremie Mouginot" ; mask:_Storage = "contiguous" ; float surface(y, x) ; surface:long_name = "ice surface elevation" ; surface:standard_name = "surface_altitude" ; surface:units = "meters" ; surface:grid_mapping = "mapping" ; surface:source = "gimpdem v2.1 (https://byrd.osu.edu/research/groups/glacier-dynamics/data/gimpdem)" ; surface:_Storage = "contiguous" ; surface:_Endianness = "little" ; float thickness(y, x) ; thickness:long_name = "ice thickness" ; thickness:standard_name = "land_ice_thickness" ; thickness:units = "meters" ; thickness:grid_mapping = "mapping" ; thickness:source = "Mass conservation (Mathieu Morlighem)" ; thickness:_Storage = "contiguous" ; thickness:_Endianness = "little" ; float bed(y, x) ; bed:long_name = "bed topography" ; bed:standard_name = "bedrock_altitude" ; bed:units = "meters" ; bed:grid_mapping = "mapping" ; bed:source = "Mass conservation (Mathieu Morlighem)" ; bed:_FillValue = -9999.f ; bed:_Storage = "contiguous" ; bed:_Endianness = "little" ; short errbed(y, x) ; errbed:long_name = "bed topography/ice thickness error" ; errbed:units = "meters" ; errbed:grid_mapping = "mapping" ; errbed:source = "Mathieu Morlighem" ; errbed:_FillValue = -9999s ; errbed:_Storage = "chunked" ; errbed:_ChunkSizes = 3670, 2044 ; errbed:_DeflateLevel = 2 ; errbed:_Shuffle = "true" ; errbed:_Endianness = "little" ; byte source(y, x) ; source:long_name = "data source (0 = none, 1 = gimpdem, 2 = Mass conservation, 3 = synthetic, 4 = interpolation, 5 = 
hydrostatic equilibrium, 6 = kriging, 7 = RTOPO-2, 8 = gravity inversion, 9 = Millan et al. 2021, 10+ = bathymetry data)" ; source:grid_mapping = "mapping" ; source:valid_range = 0b, 50b ; source:flag_values = 0b, 1b, 2b, 3b, 4b, 5b, 6b, 7b, 8b, 9b, 10b, 11b, 12b, 13b, 14b, 15b, 16b, 17b, 18b, 19b, 20b, 21b, 22b, 23b, 24b, 25b, 26b, 27b, 28b, 29b, 30b, 31b, 32b, 33b, 34b, 35b, 36b, 37b, 38b, 39b, 40b, 41b, 42b, 43b, 44b, 45b, 46b, 47b, 48b, 49b, 50b ; source:flag_meanings = "none gimpdem mass_conservation synthetic interpolation hydrodstatic_equilibrium kriging RTopo_2 gravity_inversion millan_etal_2021 bathymetry1 bathymetry2 bathymetry3 bathymetry4 bathymetry5 bathymetry6 bathymetry7 bathymetry8 bathymetry9 bathymetry10 bathymetry11 bathymetry12 bathymetry13 bathymetry14 bathymetry15 bathymetry16 bathymetry17 bathymetry18 bathymetry19 bathymetry20 bathymetry21 bathymetry22 bathymetry23 bathymetry24 bathymetry25 bathymetry26 bathymetry27 bathymetry28 bathymetry29 bathymetry30 bathymetry31 bathymetry32 bathymetry33 bathymetry34 bathymetry35 bathymetry36 bathymetry37 bathymetry38 bathymetry39 bathymetry40 bathymetry41" ; source:source = "Mathieu Morlighem" ; source:_Storage = "contiguous" ; byte dataid(y, x) ; dataid:long_name = "data id" ; dataid:grid_mapping = "mapping" ; dataid:valid_range = 1b, 10b ; dataid:flag_values = 1b, 2b, 7b, 10b ; dataid:flag_meanings = "GIMPdem Radar seismic multibeam" ; dataid:source = "Mathieu Morlighem" ; dataid:_Storage = "chunked" ; dataid:_ChunkSizes = 4587, 2555 ; dataid:_DeflateLevel = 1 ; dataid:_Shuffle = "true" ; short geoid(y, x) ; geoid:long_name = "EIGEN-6C4 Geoid - WGS84 Ellipsoid difference" ; geoid:standard_name = "geoid_height_above_reference_ellipsoid" ; geoid:units = "meters" ; geoid:grid_mapping = "mapping" ; geoid:geoid = "eigen-6c4 (Forste et al 2014)" ; geoid:_Storage = "chunked" ; geoid:_ChunkSizes = 3670, 2044 ; geoid:_DeflateLevel = 2 ; geoid:_Shuffle = "true" ; geoid:_Endianness = "little" ; // global attributes: :Conventions = "CF-1.7" ; :Title = "BedMachine Greenland" ; :Author = "Mathieu Morlighem" ; :version = "28-Jul-2022 (v5.5)" ; :nx = 10218. ; :ny = 18346. ; :Projection = "Polar Stereographic North (70N, 45W)" ; :proj4 = "+init=epsg:3413" ; :sea_water_density\ \(kg\ m-3\) = 1023. ; :ice_density\ \(kg\ m-3\) = 917. ; :xmin = -652925 ; :ymax = -632675 ; :spacing = 150 ; :no_data = -9999. ; :license = "No restrictions on access or use" ; :Data_citation = "Morlighem M. et al., (2017), BedMachine v3: Complete bed topography and ocean bathymetry mapping of Greenland from multi-beam echo sounding combined with mass conservation, Geophys. Res. Lett., 44, doi:10.1002/2017GL074954. (http://onlinelibrary.wiley.com/doi/10.1002/2017GL074954/full)" ; :_NCProperties = "version=2,netcdf=4.7.4,hdf5=1.8.12" ; :_SuperblockVersion = 0 ; :_IsNetcdf4 = 0 ; :_Format = "netCDF-4 classic model" ; }
- WARNING
- May be stale. Using pre-computed md5sum because it takes so long to calculate.
ROOT=${DATADIR}/MAR/3.13-freshwater
LIST=$(cd ${ROOT}; ls *.nc | { tee >(head -n4 >&3; cat >/dev/null) | tail -n4; } 3>&1 ) # first and last four files
parallel --keep-order md5sum ${ROOT}/{} ::: ${LIST}
f678065481615c0d74078dacf0350457 /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2000.nc ae0e868b79f0a9554779d85c0260115e /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2001.nc 792b2794f606127aa3f00b215d641859 /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2002.nc 19ad04f82cce52403533582c4df236cf /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2003.nc f678065481615c0d74078dacf0350457 /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2000.nc ae0e868b79f0a9554779d85c0260115e /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2001.nc 792b2794f606127aa3f00b215d641859 /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2002.nc 19ad04f82cce52403533582c4df236cf /home/kdm/data/MAR/3.12-freshwater/MARv3.12-daily-ERA5-2003.nc
ROOT=${DATADIR}/MAR/3.13-freshwater
ncdump -chs ${ROOT}/MARv3.13-daily-ERA5-2000.nc
netcdf MARv3.13-daily-ERA5-2000 { dimensions: y = 2700 ; x = 1496 ; time = 366 ; variables: float LAT(y, x) ; LAT:units = "degrees" ; LAT:long_name = "Latitude" ; LAT:standard_name = "Latitude" ; LAT:_FillValue = 9.96921e+36f ; LAT:missing_value = 9.96921e+36f ; LAT:_Storage = "chunked" ; LAT:_ChunkSizes = 2700, 1496 ; LAT:_DeflateLevel = 6 ; LAT:_Shuffle = "true" ; LAT:_Endianness = "little" ; float LON(y, x) ; LON:units = "degrees" ; LON:long_name = "Longitude" ; LON:standard_name = "Longitude" ; LON:_FillValue = 9.96921e+36f ; LON:missing_value = 9.96921e+36f ; LON:_Storage = "chunked" ; LON:_ChunkSizes = 2700, 1496 ; LON:_DeflateLevel = 6 ; LON:_Shuffle = "true" ; LON:_Endianness = "little" ; float MSK(y, x) ; MSK:units = "-" ; MSK:long_name = "Land/Ice Mask" ; MSK:standard_name = "Land_Ice_Mask" ; MSK:_FillValue = 9.96921e+36f ; MSK:missing_value = 9.96921e+36f ; MSK:_Storage = "chunked" ; MSK:_ChunkSizes = 2700, 1496 ; MSK:_DeflateLevel = 6 ; MSK:_Shuffle = "true" ; MSK:_Endianness = "little" ; float MSK_MAR(y, x) ; MSK_MAR:units = "-" ; MSK_MAR:long_name = "Original MAR 5x 5km2 Ice Mask" ; MSK_MAR:standard_name = "Original_MAR__5x_5km2_Ice_Mask" ; MSK_MAR:_FillValue = 9.96921e+36f ; MSK_MAR:missing_value = 9.96921e+36f ; MSK_MAR:_Storage = "chunked" ; MSK_MAR:_ChunkSizes = 2700, 1496 ; MSK_MAR:_DeflateLevel = 6 ; MSK_MAR:_Shuffle = "true" ; MSK_MAR:_Endianness = "little" ; float RU2(time, y, x) ; RU2:units = "mmWE/day" ; RU2:long_name = "Water run-off (sub-pixel 2)" ; RU2:standard_name = "Water_run-off__sub-pixel_2_" ; RU2:_FillValue = 9.96921e+36f ; RU2:missing_value = 9.96921e+36f ; RU2:_Storage = "chunked" ; RU2:_ChunkSizes = 100, 2700, 1496 ; RU2:_DeflateLevel = 6 ; RU2:_Shuffle = "true" ; RU2:_Endianness = "little" ; float RUcorr(time, y, x) ; RUcorr:units = "mmWEday" ; RUcorr:long_name = "Water run-off (sub-pixel 1)" ; RUcorr:standard_name = "Water_run-off__sub-pixel_1_" ; RUcorr:_FillValue = 9.96921e+36f ; RUcorr:missing_value = 9.96921e+36f ; RUcorr:_Storage = "chunked" ; RUcorr:_ChunkSizes = 100, 2700, 1496 ; RUcorr:_DeflateLevel = 6 ; RUcorr:_Shuffle = "true" ; RUcorr:_Endianness = "little" ; float SMB2(time, y, x) ; SMB2:units = "mmWE/day" ; SMB2:long_name = "Surface Mass Balance (sub-pixel 2)" ; SMB2:standard_name = "Surface_Mass_Balance__sub-pixel_2_" ; SMB2:_FillValue = 9.96921e+36f ; SMB2:missing_value = 9.96921e+36f ; SMB2:_Storage = "chunked" ; SMB2:_ChunkSizes = 100, 2700, 1496 ; SMB2:_DeflateLevel = 6 ; SMB2:_Shuffle = "true" ; SMB2:_Endianness = "little" ; float SMBcorr(time, y, x) ; SMBcorr:units = "mmWE/day" ; SMBcorr:long_name = "Surface Mass Balance (sub-pixel 1)" ; SMBcorr:standard_name = "Surface_Mass_Balance__sub-pixel_1_" ; SMBcorr:_FillValue = 9.96921e+36f ; SMBcorr:missing_value = 9.96921e+36f ; SMBcorr:_Storage = "chunked" ; SMBcorr:_ChunkSizes = 100, 2700, 1496 ; SMBcorr:_DeflateLevel = 6 ; SMBcorr:_Shuffle = "true" ; SMBcorr:_Endianness = "little" ; float SRF(y, x) ; SRF:units = "m" ; SRF:long_name = "Surface height" ; SRF:standard_name = "Surface_height" ; SRF:_FillValue = 9.96921e+36f ; SRF:missing_value = 9.96921e+36f ; SRF:_Storage = "chunked" ; SRF:_ChunkSizes = 2700, 1496 ; SRF:_DeflateLevel = 6 ; SRF:_Shuffle = "true" ; SRF:_Endianness = "little" ; float SRF_MAR(y, x) ; SRF_MAR:units = "m" ; SRF_MAR:long_name = "Original MAR 5x 5km2 Surface height" ; SRF_MAR:standard_name = "Original_MAR__5x_5km2_Surface_height" ; SRF_MAR:_FillValue = 9.96921e+36f ; SRF_MAR:missing_value = 9.96921e+36f ; SRF_MAR:_Storage = "chunked" ; SRF_MAR:_ChunkSizes = 
2700, 1496 ; SRF_MAR:_DeflateLevel = 6 ; SRF_MAR:_Shuffle = "true" ; SRF_MAR:_Endianness = "little" ; float time(time) ; time:units = "DAYS since 2000-01-01 12:00:00" ; time:long_name = "time" ; time:standard_name = "time" ; time:_Storage = "chunked" ; time:_ChunkSizes = 100 ; time:_DeflateLevel = 6 ; time:_Shuffle = "true" ; time:_Endianness = "little" ; float x(x) ; x:units = "m" ; x:long_name = "x" ; x:standard_name = "x" ; x:_Storage = "chunked" ; x:_ChunkSizes = 1496 ; x:_DeflateLevel = 6 ; x:_Shuffle = "true" ; x:_Endianness = "little" ; float y(y) ; y:units = "m" ; y:long_name = "y" ; y:standard_name = "y" ; y:_Storage = "chunked" ; y:_ChunkSizes = 2700 ; y:_DeflateLevel = 6 ; y:_Shuffle = "true" ; y:_Endianness = "little" ; // global attributes: :title = "Daily MARv3.12 outputs in 2000 interpolated on the 1x1km^2 grid from Noel et al. using ERA5" ; :institution = "University of Liège (Belgium)" ; :contact = "xavier.fettweis@uliege.be" ; :institute = "University of Liège (Belgium)" ; :model = "regional climate model MARv3.12.0" ; :forcing = "ERA5" ; :creation_date = "2021-11-18-T114559Z" ; :history = "Thu Nov 18 11:45:59 2021: ncks --cnk_dmn time,100 -L 6 MARv3.12-daily-ERA5-2000.nc MARv3.12-daily-ERA5-2000.nc4" ; :NCO = "netCDF Operators version 4.8.1 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ; :_NCProperties = "version=2,netcdf=4.7.3,hdf5=1.10.5," ; :_SuperblockVersion = 0 ; :_IsNetcdf4 = 1 ; :_Format = "netCDF-4" ; }
- WARN
- Only check some to save compute time.
echo 'ice'
(cd ${DATADIR}/RACMO/freshwater/runoff_ice_1km; find . -type f -name "*.nc" | LC_ALL=C sort | head -n8 | parallel --keep-order md5sum)
(cd ${DATADIR}/RACMO/freshwater/runoff_ice_1km; find . -type f -name "*.nc" | LC_ALL=C sort | tail -n8 | parallel --keep-order md5sum)
echo 'land_5.5_km'
(cd ${DATADIR}/RACMO/freshwater/runoff_land_5.5km; find . -type f -name "*.nc" | LC_ALL=C sort | head -n8 | parallel --keep-order md5sum)
(cd ${DATADIR}/RACMO/freshwater/runoff_land_5.5km; find . -type f -name "*.nc" | LC_ALL=C sort | tail -n8 | parallel --keep-order md5sum)
echo 'land_1_km'
(cd ${DATADIR}/RACMO/freshwater/runoff_land_1km_regrid; find . -type f -name "*.nc" | LC_ALL=C sort | head -n8 | parallel --keep-order md5sum)
(cd ${DATADIR}/RACMO/freshwater/runoff_land_1km_regrid; find . -type f -name "*.nc" | LC_ALL=C sort | tail -n8 | parallel --keep-order md5sum)
ice | |
219834ba9f1bc98e0a87ab37ea88863d | ./runoff.1990_AMJ.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
7bde9226e0651e2cfbb8082424ba5181 | ./runoff.1990_JAS.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
76d77c3ac9881521825d93c1c43af7d5 | ./runoff.1990_JFM.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
17dd60fa5964af322624e72d9b58ea7c | ./runoff.1990_OND.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
fcdd6121b4c1c94d1fe5018966deacac | ./runoff.1991_AMJ.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
48c894ad8d7c95b5521892c18183ea41 | ./runoff.1991_JAS.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
d405a62f91a201e3b7be931c12e1f678 | ./runoff.1991_JFM.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
bcaadbf2e742b533a612417f748dd5f0 | ./runoff.1991_OND.BN_RACMO2.3p2_ERA5_3h_FGRN055.1km.DD.nc |
acf9b55165baf20b613067383ea3d1fc | ./runoff_WJB_int.1988_AMJ.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
08d48cb9b85912742785fc089a8462bb | ./runoff_WJB_int.1988_JAS.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
9c26be97b007d967bdcc48c34ad1c3ea | ./runoff_WJB_int.1988_JFM.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
0673a072bd62d2957f01d5ca6c6cf570 | ./runoff_WJB_int.1988_OND.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
c9d47dea1a7fe5317447734ac19b7faa | ./runoff_WJB_int.1989_AMJ.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
46b5f01a3dce922eee06f48f415ec9fb | ./runoff_WJB_int.1989_JAS.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
dbfdf673dc82d957b5109b34724db0a6 | ./runoff_WJB_int.1989_JFM.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
af207835fe80b60d58e06b5f6035bf18 | ./runoff_WJB_int.1989_OND.BN_RACMO2.3p2_FGRN055_1km.DD.nc |
land_5.5_km | |
f60e7505e75e58133e82d381db5caa43 | ./runoff.1957.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
7b2de3637a9b79647aa600c8da7d2990 | ./runoff.1958.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
cf60a1b81f84975ee741b483ddedb46c | ./runoff.1959.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
fd74c2d30a0d6589303b8ccbc4679694 | ./runoff.1960.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
d364c5db102ee36981d55e0b46277673 | ./runoff.1961.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
fcb4b8154fce45145ba468c7b0c2a1e6 | ./runoff.1962.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
b87dfb7d13426080a58745d9a917de2b | ./runoff.1963.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
3d055d1d9c0ddacae46acad23a421a9e | ./runoff.1964.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc |
32e7fa643ae40f346999e09c1b17a6d5 | ./runoff.2014.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
1a6d4fdbb8dd14aa5311b3d43954c331 | ./runoff.2015.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
a5f91f7731fb2a7418a7e89e9f67fb43 | ./runoff.2016.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
90634adf3dd1b9ee409002a29ac3e08f | ./runoff.2017.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
906751f527cdb51f5448fd324a705a29 | ./runoff.2018.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
0e481615b966397f0a4c50f48ba1dab9 | ./runoff.2019.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
a813eb69a66bd4c23cf15afcac592420 | ./runoff.2020.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
c5ae487f3ecc5987a17c3c06083ebd5a | ./runoff.2021.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc |
land_1_km | |
879f40c2e9c4598a83458ab93bcd2246 | ./Grid_regular_EPSG3413_1km.nc |
c2caab22f5dc8b816d26b15b07c6c732 | ./Tundra_Mask_1km.nc |
c0c69a24c76cb909ea2ee463d79d10b6 | ./mask.nc |
a3c0a6c3762155ad404fe6be75c49251 | ./out/runoff.1957.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc.1km-bilin.nc |
76d34a54d246b6872d913e7889f66772 | ./out/runoff.1958.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc.1km-bilin.nc |
5f1f04983c6f79fc1d89ccb15ec5946b | ./out/runoff.1959.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc.1km-bilin.nc |
b5d9ffdd8b73d22d9b6bc7da5cbc3c0d | ./out/runoff.1960.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc.1km-bilin.nc |
dc269d5d295640ed71f033bce1d30489 | ./out/runoff.1961.FGRN055_BN_RACMO2.3p2_FGRN055.DD.nc.1km-bilin.nc |
78000c74d9e9553e5237c20205b6f891 | ./out/runoff.2017.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
ced4b9dd79d5c2e0a612b01bd07d2afd | ./out/runoff.2018.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
9f520292d3ab4dbf40246bc462cda4e3 | ./out/runoff.2019.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
59b9237449e3c8dfe1b62e8a330bffbd | ./out/runoff.2020.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
c03ce8c3865abad695537b78e5fcbe13 | ./out/runoff.2021.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
0d0023a324f077136cf8e69b96440617 | ./out/runoff.2022.FGRN055_BN_RACMO2.3p2_ERA5_3h_FGRN055.DD.nc.1km-bilin.nc |
c5e011d91b9feb6b7a533b9f146486f8 | ./weights_FGRN055_to_EPSG3413_1km_bilinear.nc |
8f7bea66a91abcea8f2995a9d36b1132 | ./weights_FGRN055_to_EPSG3413_1km_neareststod.nc |
ROOT=${DATADIR}/RACMO/freshwater/runoff_ice_1km
ncdump -chs ${ROOT}/runoff_WJB_int.1979_AMJ.BN_RACMO2.3p2_FGRN055_1km.DD.nc
netcdf runoff_WJB_int.1979_AMJ.BN_RACMO2.3p2_FGRN055_1km.DD { dimensions: time = 91 ; x = 1496 ; y = 2700 ; variables: float time(time) ; time:units = "DAYS since 1979-04-01 00:00:00" ; time:long_name = "time" ; time:standard_name = "time" ; time:_Storage = "chunked" ; time:_ChunkSizes = 91 ; time:_DeflateLevel = 1 ; time:_Endianness = "little" ; time:_NoFill = "true" ; float x(x) ; x:units = "km" ; x:long_name = "x" ; x:standard_name = "x" ; x:_Storage = "chunked" ; x:_ChunkSizes = 1496 ; x:_DeflateLevel = 1 ; x:_Endianness = "little" ; x:_NoFill = "true" ; float y(y) ; y:units = "km" ; y:long_name = "y" ; y:standard_name = "y" ; y:_Storage = "chunked" ; y:_ChunkSizes = 2700 ; y:_DeflateLevel = 1 ; y:_Endianness = "little" ; y:_NoFill = "true" ; float LON(y, x) ; LON:units = "km" ; LON:long_name = "Easting" ; LON:standard_name = "Easting" ; LON:actual_range = 0.f, 0.f ; LON:missing_value = 0.f, -1.225254e+28f ; LON:_Storage = "chunked" ; LON:_ChunkSizes = 1350, 748 ; LON:_DeflateLevel = 1 ; LON:_Endianness = "little" ; LON:_NoFill = "true" ; float LAT(y, x) ; LAT:units = "km" ; LAT:long_name = "Northing" ; LAT:standard_name = "Northing" ; LAT:actual_range = 0.f, 0.f ; LAT:missing_value = 0.f, -1.225254e+28f ; LAT:_Storage = "chunked" ; LAT:_ChunkSizes = 1350, 748 ; LAT:_DeflateLevel = 1 ; LAT:_Endianness = "little" ; LAT:_NoFill = "true" ; float runoffcorr(time, y, x) ; runoffcorr:units = "mm w.e. per day" ; runoffcorr:long_name = "Downscaled corrected snowmelt" ; runoffcorr:standard_name = "Downscaled_corrected_snowmelt" ; runoffcorr:actual_range = 0.f, 508.326f ; runoffcorr:missing_value = -1.e+30f ; runoffcorr:_Storage = "chunked" ; runoffcorr:_ChunkSizes = 12, 338, 187 ; runoffcorr:_DeflateLevel = 1 ; runoffcorr:_Endianness = "little" ; runoffcorr:_NoFill = "true" ; // global attributes: :title = "Daily runoff field (RACMO2.3p2)" ; :institution = "IMAU (Brice Noel)" ; :grid = "Map Projection:Polar Stereographic Ellipsoid - Map Reference Latitude: 90.0 - Map Reference Longitude: -39.0 - Map Second Reference Latitude: 71.0 - Map Eccentricity: 0.081819190843 ;wgs84 - Map Equatorial Radius: 6378137.0 ;wgs84 meters - Grid Map Origin Column: 160 - Grid Map Origin Row: -120 - Grid Map Units per Cell: 5000 - Grid Width: 301 - Grid Height: 561" ; :history = "libUN (2013.05.22) - Fri Mar 16 18:19:46 2018" ; " ; :_NCProperties = "version=1|netcdflibversion=4.6.0|hdf5libversion=1.10.0" ; :_SuperblockVersion = 0 ; :_IsNetcdf4 = 0 ; :_Format = "netCDF-4 classic model" ; }
FILE=${DATADIR}/Bamber_2018/FWF17.v3_a.nc
md5sum ${FILE}
ncdump -chs ${FILE}
ed9d9bd0580e124146a5503832ced95e /home/kdm/data/Bamber_2018/FWF17.v3_a.nc netcdf FWF17.v3_a { dimensions: Y = 785 ; X = 752 ; TIME = 708 ; variables: double Y(Y) ; Y:_FillValue = NaN ; Y:units = "meters" ; Y:standard_name = "projection_y_coordinate" ; Y:point_spacing = "even" ; Y:axis = "Y" ; Y:_Storage = "contiguous" ; Y:_Endianness = "little" ; double X(X) ; X:_FillValue = NaN ; X:units = "meters" ; X:standard_name = "projection_x_coordinate" ; X:point_spacing = "even" ; X:axis = "X" ; X:_Storage = "contiguous" ; X:_Endianness = "little" ; int64 TIME(TIME) ; TIME:units = "days since 1958-01-01 00:00:00" ; TIME:calendar = "proleptic_gregorian" ; TIME:_Storage = "contiguous" ; TIME:_Endianness = "little" ; short runoff_tundra(TIME, Y, X) ; runoff_tundra:_FillValue = -9999s ; runoff_tundra:long_name = "Tundra runoff" ; runoff_tundra:units = "km3" ; runoff_tundra:grid_mapping = "polar_stereographic" ; runoff_tundra:description = "WARNING! This variable contains runoff routed from all land tundra (i.e. including CAA, Svalbard, Iceland), whereas the paper only shows tundra runoff from Greenland. To reproduce Greenland-only runoff you must mask the tundra variable with the LSMGr mask provided in order to set all tundra runoff originating outside the Greenland land mass to zero." ; runoff_tundra:scale_factor = 0.01 ; runoff_tundra:_Storage = "chunked" ; runoff_tundra:_ChunkSizes = 118, 131, 126 ; runoff_tundra:_DeflateLevel = 4 ; runoff_tundra:_Shuffle = "true" ; runoff_tundra:_Endianness = "little" ; double lon(Y, X) ; lon:_FillValue = NaN ; lon:grid_mapping = "polar_stereographic" ; lon:units = "degrees" ; lon:standard_name = "longitude" ; lon:_Storage = "contiguous" ; lon:_Endianness = "little" ; short runoff_ice(TIME, Y, X) ; runoff_ice:_FillValue = -9999s ; runoff_ice:long_name = "Ice sheet runoff" ; runoff_ice:units = "km3" ; runoff_ice:grid_mapping = "polar_stereographic" ; runoff_ice:scale_factor = 0.01 ; runoff_ice:_Storage = "chunked" ; runoff_ice:_ChunkSizes = 118, 131, 126 ; runoff_ice:_DeflateLevel = 4 ; runoff_ice:_Shuffle = "true" ; runoff_ice:_Endianness = "little" ; double lat(Y, X) ; lat:_FillValue = NaN ; lat:grid_mapping = "polar_stereographic" ; lat:units = "degrees" ; lat:standard_name = "latitude" ; lat:_Storage = "contiguous" ; lat:_Endianness = "little" ; byte polar_stereographic ; polar_stereographic:grid_mapping_name = "polar_stereographic" ; polar_stereographic:scale_factor_at_central_origin = 1. ; polar_stereographic:standard_parallel = 70. ; polar_stereographic:straight_vertical_longitude_from_pole = -45. ; polar_stereographic:false_easting = 0. ; polar_stereographic:false_northing = 0. ; polar_stereographic:latitude_of_projection_origin = 90. ; byte LSMGr(Y, X) ; LSMGr:_FillValue = -99b ; LSMGr:long_name = "Hole-filled Greenland land mass mask" ; LSMGr:units = "none" ; LSMGr:grid_mapping = "polar_stereographic" ; LSMGr:_Storage = "chunked" ; LSMGr:_ChunkSizes = 785, 752 ; LSMGr:_DeflateLevel = 4 ; LSMGr:_Shuffle = "true" ; byte ocean_basins(Y, X) ; ocean_basins:_FillValue = -99b ; ocean_basins:long_name = "ID number of oceanographic basin which each coastal pixel drains into." ; ocean_basins:units = "none" ; ocean_basins:grid_mapping = "polar_stereographic" ; ocean_basins:basins = ", 26:Labrador Sea, 27:Hudson Strait, 28:Davis Strait, 29:Baffin Bay, 30:Lincoln Sea, 32:Irish Sea and St. 
George\'s Channel, 33:Inner Seas off the West Coast of Scotland, 55:Gulf of Riga, 56:Baltic Sea, 57:Gulf of Finland, 58:Gulf of Bothnia, 59:White Sea, 76:North Atlantic Ocean, 77:Gulf of St. Lawrence, 80:Celtic Sea, 82:Hudson Bay, 83:The Northwestern Passages, 84:Arctic Ocean, 86:Barentsz Sea, 87:Greenland Sea, 88:North Sea, 96:Kattegat, 98:Skagerrak, 99:Norwegian Sea" ; ocean_basins:history = "Basins are as defined by \"Limits of Oceans & Seas, Special Publication No. 23\" published by the IHO in 1953. The dataset was composed by the Flanders Marine Data and Information Centre and downloaded from www.marineregions.org on 6 February 2017. Coastal runoff pixels were allocated an ocean basin to run off into using fwf/allocate_coast_to_basin.py." ; ocean_basins:_Storage = "chunked" ; ocean_basins:_ChunkSizes = 785, 752 ; ocean_basins:_DeflateLevel = 4 ; ocean_basins:_Shuffle = "true" ; short solid_ice(TIME, Y, X) ; solid_ice:_FillValue = -9999s ; solid_ice:long_name = "Solid ice discharge" ; solid_ice:units = "km3" ; solid_ice:grid_mapping = "polar_stereographic" ; solid_ice:description = "the monthly discharge data are mean annual values divided by 12." ; solid_ice:scale_factor = 0.001 ; solid_ice:_Storage = "chunked" ; solid_ice:_ChunkSizes = 118, 131, 126 ; solid_ice:_DeflateLevel = 4 ; solid_ice:_Shuffle = "true" ; solid_ice:_Endianness = "little" ; // global attributes: :Conventions = "CF-1.4" ; :institution = "University of Bristol (Andrew Tedstone)" ; :title = "Monthly freshwater fluxes to the ocean across the Arctic, 1958-2016" ; :nx = 752. ; :ny = 785. ; :xmin = -1777980. ; :ymax = -67308. ; :spacing = 5000. ; :description = "This is the dataset that underlies the paper \"Bamber, J., A. Tedstone, M. King, I. Howat, E. Enderlin, M. van den Broeke and B. Noel (2018) Land ice freshwater budget of the Arctic and North Atlantic Oceans. Part I: Data, methods and results. Journal of Geophysical Research: Oceans.\"" ; :_NCProperties = "version=1|netcdflibversion=4.4.1.1|hdf5libversion=1.10.1" ; :_SuperblockVersion = 0 ; :_IsNetcdf4 = 1 ; :_Format = "netCDF-4" ; }
dir=${DATADIR}/van_As_2018
md5sum ${dir}/*
echo ""
head ${dir}/Watson_discharge_day_v03.txt
echo ""
tail ${dir}/Watson_discharge_day_v03.txt
176d156b2881b7b4c3ed75b2ede522e0 /home/kdm/data/van_As_2018/readme Watson.txt 9bde46206b085e4fb7d8c4c62dc9758e /home/kdm/data/van_As_2018/Supplemental Data Set.xls a1b9c2ae0c25e18014a28e873d18b873 /home/kdm/data/van_As_2018/uaar_a_1433799_sm6008.zip 0589c8ac34275f5002ca8cfd5a23b8a6 /home/kdm/data/van_As_2018/Watson_discharge_day_v03.txt b5bc413b8db8b6944b35f3b5b5a0d0b2 /home/kdm/data/van_As_2018/Watson_discharge_hour_v03.txt db56b7f1a7848f15b7158238df1943b1 /home/kdm/data/van_As_2018/Watson River discharge reconstruction 1949-2018.txt Year MonthOfYear DayOfMonth DayOfYear DayOfCentury WaterFluxDiversOnly(m3/s) Uncertainty(m3/s) WaterFluxDivers&Temperature(m3/s) Uncertainty(m3/s) WaterFluxCumulative(km3) Uncertainty(km3) 2006 4 10 100 2292 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 11 101 2293 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 12 102 2294 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 13 103 2295 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 14 104 2296 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 15 105 2297 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 16 106 2298 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 17 107 2299 -999.00 -999.00 0.00 0.00 0.00 0.00 2006 4 18 108 2300 -999.00 -999.00 0.00 0.00 0.00 0.00 2019 10 18 291 7231 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 19 292 7232 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 20 293 7233 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 21 294 7234 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 22 295 7235 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 23 296 7236 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 24 297 7237 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 25 298 7238 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 26 299 7239 -999.00 -999.00 0.00 0.00 8.33 1.37 2019 10 27 300 7240 -999.00 -999.00 0.00 0.00 8.33 1.37
root=~/data.me/qaanaaq
md5sum ${root}/discharge2018.txt
head ${root}/discharge2018.txt
for f in $(ls ${DATADIR}/Tedstone_2017/leverett_Q_????_UTC.csv); do
md5sum ${f}
done
echo " "
head -n15 ${f}
# ls ${DATADIR}/GEM
md5sum ${DATADIR}/GEM/GEM.csv
echo ""
head ${DATADIR}/GEM/GEM.csv
echo ""
tail ${DATADIR}/GEM/GEM.csv
md5sum ${DATADIR}/Hawkings_2016/NarsarsuaqDischarge2013.xlsx
<<init>>
g.mapset -c GIMP_0714
r.in.gdal -o input=${DATADIR}/GIMP/0714/1999.07.01/GimpOceanMask_90m output=mask_ocean_lakes
# ocean mask has some cut-off oceans a.k.a lakes. See for example near -274731,-2747470.
g.region raster=mask_ocean_lakes
# Fill the lakes and treat as land.
r.null map=mask_ocean_lakes setnull=0 # only operate on ocean cells
r.clump input=mask_ocean_lakes output=clumps
main_clump=$(r.stats -c -n clumps sort=desc | head -n1 | cut -d" " -f1)
r.mapcalc "mask_ocean = if(clumps == ${main_clump}, 1, null())"
r.null mask_ocean null=0
g.mapset -c Citterio_2013
mkdir -p tmp
# fix bad shapefile that doesn't import directly into GRASS.
ogr2ogr -s_srs EPSG:4326 -t_srs EPSG:3413 ./tmp/PROMICE_3413 ${DATADIR}/Citterio_2013/PROMICE_250_2019-12-04_EPSG4326
v.in.ogr input=./tmp/PROMICE_3413 output=mask
g.region raster=mask_ocean@GIMP_0714
v.to.rast input=mask output=mask_ice_holes use=val val=1
# Everything should route to the coast. This means we need to fill in
# all the nunatuks so that there are no interior drainage basins.
r.mapcalc "not_ice = if(isnull(mask_ice_holes), 1, null())"
r.clump input=not_ice output=clumps
main_clump=$(r.stats -c -n clumps sort=desc | head -n1 | cut -d" " -f1) # ocean
r.mapcalc "mask_ice = mask_ice_holes ||| if(clumps != ${main_clump})"
r.null mask_ice null=0
log_info "Importing ArcticDEM"
g.mapset -c ArcticDEM
r.external source=${DATADIR}/ArcticDEM/arcticdem_mosaic_100m_v4.1_dem.tif output=arctic_raw # all arctic
g.region raster=mask_ocean@GIMP_0714 align=arctic_raw
g.region save=ArcticDEM
g.mapset PERMANENT
g.region region=ArcticDEM@ArcticDEM
g.region -sp # default location
g.mapset ArcticDEM
r.mapcalc "mask_land = if((mask_ice@Citterio_2013 == 0) && (mask_ocean@GIMP_0714 == 0), 1, 0)"
# set up mask: (o)cean=1 (l)and=2 (i)ce=3
r.mapcalc "mask_o_l_i_4 = (mask_ocean@GIMP_0714 + mask_land*2 + mask_ice@Citterio_2013*3)"
r.mapcalc "mask_o_l_i = if(mask_o_l_i_4 == 4, 1, mask_o_l_i_4)" # cleanup some multi-flagged as ocean.
r.mapcalc "z_s_ocean = arctic_raw" # subset and move from external to internal
r.null map=z_s_ocean null=0
r.mapcalc "z_s = if(mask_o_l_i == 1, null(), z_s_ocean)"
log_info "Importing BedMachine"
g.mapset -c BedMachine
ROOT=${DATADIR}/Morlighem_2017
for var in mask surface thickness bed errbed geoid source; do
echo $var
r.in.gdal -o input="NetCDF:${DATADIR}/Morlighem_2017/BedMachineGreenland-v5.nc:"${var} output=${var}
done
g.rename raster=surface,z_s
g.rename raster=bed,z_b
g.region raster=z_s
g.region res=150 -pa
log_info "Importing NSIDC 0713"
g.mapset -c NSIDC_0713
g.region -d
r.external -o input=${DATADIR}/GIMP/0713/RGB.vrt output=rgb
log_info "Loading Mouginot 2019"
g.mapset -c Mouginot_2019
v.import input=${DATADIR}/Mouginot_2019/Greenland_Basins_PS_v1.4.2.shp output=basins snap=1
# v.to.rast input=basins output=basins use=cat label_column=NAME
# v.to.rast input=basins output=regions use=cat label_column=SUBREGION1
log_info "Loading Zwally 2012"
g.mapset -c Zwally_2012
v.import input=${DATADIR}/Zwally_2012/sectors output=sectors snap=1
# v.to.rast input=sectors output=sectors use=attr column=cat_
- According to the "New in this version" section, use doi:10.22008/promice/data/ice_discharge/gates/v02/UFW2WY
g.mapset -c Mankoff_2020
log_info "Importing Mankoff 2020 gates"
mkdir -p tmp/Mankoff_2020_gates
dest=./tmp/Mankoff_2020_gates/gates.gpkg
if [[ ! -f ${dest} ]]; then
wget "https://dataverse.geus.dk/api/access/datafile/:persistentId?persistentId=doi:10.22008/promice/data/ice_discharge/gates/v02/UFW2WY" -O ${dest}
fi
md5sum ./tmp/Mankoff_2020_gates/*
# d28f4a1c4c72490dc83893c18352f579 ./tmp/Mankoff_2020_gates/gates.gpkg
v.in.ogr input=./tmp/Mankoff_2020_gates/gates.gpkg output=gates
v.db.select map=gates|head
g.mapset -c Bjork_2015
log_info "Importing Bjørk 2015 names"
# ogr2ogr -f "ESRI Shapefile" -t_srs EPSG:3413 EPSG_3413 GreenlandGlacierNames_GGNv01_WGS84.shp
v.in.ogr input=${DATADIR}/Bjørk_2015/EPSG_3413/ output=names
v.db.select map=names|head
<<xy_permanent>>
This is just for the outlines on the Disko island figure.
Sum the data over time, then treat any grid cell where the sum is not 0 as an outlet grid cell.
g.mapset -c Bamber_2018
FILE=${DATADIR}/Bamber_2018/FWF17.v3_a.nc
cdo -v -b F32 timsum ${FILE} ./tmp/B2018.nc # timcumsum?
r.external source=netcdf:./tmp/B2018.nc:runoff_ice output=ice # band=707
r.external source=netcdf:./tmp/B2018.nc:runoff_tundra output=tundra # band=707
r.mask -r
r.mapcalc "MASK = if(ice == 0, null(), 1)" --o
r.to.vect input=ice output=ice type=area
r.mask -r
r.mapcalc "MASK = if(tundra == 0, null(), 1)" --o
r.to.vect input=tundra output=tundra type=area
r.mask -r
Data are too large for Google Earth (app), but using Google Earth tiles in QGIS works fine.
See ./freshwater.qgz and add one or more of the following to QGIS Tile Server (XYZ)
- Google Maps: https://mt1.google.com/vt/lyrs=r&x={x}&y={y}&z={z}
- Google Satellite: http://www.google.cn/maps/vt?lyrs=s@189&gl=cn&x={x}&y={y}&z={z}
- Google Satellite Hybrid: https://mt1.google.com/vt/lyrs=y&x={x}&y={y}&z={z}
- Google Terrain: https://mt1.google.com/vt/lyrs=t&x={x}&y={y}&z={z}
- Bing Aerial: http://ecn.t3.tiles.virtualearth.net/tiles/a{q}.jpeg?g=1
g.mapset -c lakes
r.fill.dir input=head@land_surf output=filled direction=dir areas=problems format=grass
r.mapcalc "lakes = if(filled - head@land_surf > 0.0, 1, null() )"
r.to.vect input=lakes output=lakes type=area
Then load in QGIS
- All land outlets should be at 0 m, or within ArcticDEM uncertainty.
- Some other elevations may exist because masks are not aligned.
- In reality, outlets could exist at high elevations if they enter the ocean via waterfall from a cliff. See for example outlets at the edge of Petermann fjord.
TODO: Weight by runoff
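The TODO above is not implemented here; the figure code below uses unweighted outlet counts. A hedged sketch of one way the weighting could be done, assuming the combined product from netcdf_combine.py with the discharge and alt variables defined earlier (older product versions name the variable runoff instead):
import xarray as xr
import numpy as np
ds = xr.open_dataset("./freshwater/land/MAR.nc")         # combined land product (assumed path)
w = ds["discharge"].mean(dim="time").fillna(0).values    # mean runoff per outlet [m3 s-1]
elev = np.abs(ds["alt"].values)
h, b = np.histogram(elev, bins=100, weights=w)           # runoff-weighted elevation histogram
cdf = np.cumsum(h) / h.sum() * 100                       # % of mean runoff below each elevation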
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib
from matplotlib import rc
rc('font', size=12)
rc('text', usetex=False)
# matplotlib.pyplot.xkcd()
# plt.close(1)
fig = plt.figure(1, figsize=(5,4)) # w,h
# get_current_fig_manager().window.move(0,0)
fig.clf()
fig.set_tight_layout(True)
# import matplotlib.gridspec as gridspec
# gs = gridspec.GridSpec(1, 1) #w,h
# ax = plt.subplot(gs[:,:])
ax = fig.add_subplot(211)
df = pd.read_csv("./freshwater/ice_100/outlets.csv")
df['elev'].plot.hist(bins = 100,
alpha = 0.5,
bottom = 0.1,
color = 'b',
label="Ice",
ax = ax)
df = pd.read_csv("./freshwater/land_100/outlets.csv")
df['elev'].plot.hist(bins = 100,
alpha = 0.5,
bottom = 0.1,
color = 'g',
label="Land",
ax = ax)
ax.set_yscale("log")
ax.set_xlabel("Outlet elevation [m]")
ax.set_ylabel("Count")
ax.set_yticks([1E0,1E1,1E2,1E3,1E4])
ax2 = fig.add_subplot(212)
mm = df['elev'].abs().max()
h,b = np.histogram(df['elev'].abs().values,
bins=mm)
ax2.plot(b[1:], np.cumsum(h),
alpha = 0.5,
label="Land",
color = 'g')
# mm = df['elev'].max()
# h,b = np.histogram(df[df['elev'] >= 0]['elev'].values,
# bins=mm)
# ax2.plot(b[1:], np.cumsum(h),
# alpha = 0.5,
# linestyle = '--',
# label="Land > 0",
# color = 'g')
# # ax2.set_xlim([0.1,mm])
ax2.set_ylim([1E0,3.1E4])
ax2.set_yticks([0,1E4,2E4,3E4])
ax2.set_xlim([1,1000])
# ax2.set_yscale("log")
ax2.set_xscale("log")
# ax2.set_yticks([1E0,1E1,1E2,1E3,1E4,1E5])
ax2.set_ylabel("Cumulative\ncounts")
ax2.set_xlabel("Elevation [m]")
ax.legend()
# ax2.legend(loc=4)
from adjust_spines import adjust_spines as adj
adj(ax, ['left','bottom'])
adj(ax2, ['left','bottom'])
df = pd.DataFrame()
df['Elevation [m]'] = np.append(np.arange(10), np.arange(10, 101, 10))
df.set_index('Elevation [m]', inplace=True)
df['CDF [%]'] = np.cumsum(h)[df.index] / np.cumsum(h).max() * 100
plt.savefig('./fig/outlet_elevation_histogram_cdf.png', transparent=True, bbox_inches='tight', dpi=300)
df
# (cumsum(h)[0:100+1:10])/cumsum(h).max()
Elevation [m] | CDF [%] |
---|---|
0 | 0.200127 |
1 | 39.3483 |
2 | 45.9024 |
3 | 51.4292 |
4 | 56.6392 |
5 | 61.2721 |
6 | 65.2647 |
7 | 68.5634 |
8 | 71.4352 |
9 | 73.9168 |
10 | 76.0015 |
20 | 86.0745 |
30 | 89.8769 |
40 | 92.055 |
50 | 93.5259 |
60 | 94.5566 |
70 | 95.3604 |
80 | 95.9841 |
90 | 96.5078 |
100 | 96.9147 |
conda activate freshwater
cfchecks ./freshwater/land/runoff/MAR_2000.nc
CHECKING NetCDF FILE: ./freshwater/land/runoff/MAR_2000.nc ===================== Using CF Checker Version 4.0.0 Checking against CF Version CF-1.7 Using Standard Name Table Version 76 (2020-10-13T12:38:27Z) Using Area Type Table Version 10 (23 June 2020) Using Standardized Region Name Table Version 4 (18 December 2018) ------------------ Checking variable: station ------------------ WARN: (3.1): units attribute should be present ------------------ Checking variable: time ------------------ ------------------ Checking variable: runoff ------------------ ------------------ Checking variable: lat ------------------ ------------------ Checking variable: lon ------------------ ------------------ Checking variable: alt ------------------ ERRORS detected: 0 WARNINGS given: 1 INFORMATION messages: 0
ncdump -chs ./freshwater/land/runoff/MAR_2000.nc
netcdf MAR_2000 { dimensions: time = UNLIMITED ; // (366 currently) station = 29635 ; variables: uint64 station(station) ; station:long_name = "outlet id" ; station:cf_role = "timeseries_id" ; station:_Storage = "contiguous" ; station:_Endianness = "little" ; int64 time(time) ; time:long_name = "time of measurement" ; time:standard_name = "time" ; time:axis = "T" ; time:units = "days since 2000-01-01 00:00:00" ; time:calendar = "proleptic_gregorian" ; time:_Storage = "chunked" ; time:_ChunkSizes = 512 ; time:_Endianness = "little" ; double runoff(station, time) ; runoff:_FillValue = NaN ; runoff:long_name = "MAR runoff" ; runoff:standard_name = "water_volume_transport_in_river_channel" ; runoff:units = "m3 s-1" ; runoff:coordinates = "lat lon alt station" ; runoff:_Storage = "chunked" ; runoff:_ChunkSizes = 29635, 1 ; runoff:_DeflateLevel = 2 ; runoff:_Shuffle = "true" ; runoff:_Endianness = "little" ; float lat(station) ; lat:_FillValue = NaNf ; lat:long_name = "latitude" ; lat:standard_name = "latitude" ; lat:units = "degrees_north" ; lat:axis = "Y" ; lat:_Storage = "contiguous" ; lat:_Endianness = "little" ; float lon(station) ; lon:_FillValue = NaNf ; lon:long_name = "longitude" ; lon:standard_name = "longitude" ; lon:units = "degrees_east" ; lon:axis = "X" ; lon:_Storage = "contiguous" ; lon:_Endianness = "little" ; float alt(station) ; alt:_FillValue = NaNf ; alt:long_name = "height_above_mean_sea_level" ; alt:standard_name = "altitude" ; alt:units = "m" ; alt:positive = "up" ; alt:axis = "Z" ; alt:_Storage = "contiguous" ; alt:_Endianness = "little" ; // global attributes: :featureType = "timeSeries" ; :title = "Greenland runoff" ; :summary = "Greenland RCM runoff at basin outlets" ; :keywords = "Hydrology; Greenland; Runoff; Freshwater" ; :Conventions = "CF-1.7" ; :source = "git commit: 29aa681" ; :creator_name = "Ken Mankoff" ; :creator_email = "kdm@geus.dk" ; :creator_url = "http://kenmankoff.com" ; :institution = "GEUS" ; :references = "10.22008/promice/freshwater" ; :product_version = 1. ; :_NCProperties = "version=2,netcdf=4.7.4,hdf5=1.10.5" ; :_SuperblockVersion = 0 ; :_IsNetcdf4 = 1 ; :_Format = "netCDF-4" ; }
Yes — annual totals for k = 80, 90, and 100 agree to within ~0.3 %.
import xarray as xr
import glob as glob
import pandas as pd
df = pd.DataFrame()
df[80] = xr.open_mfdataset(glob.glob("./freshwater/ice_80/runoff/MAR_*.nc"), combine='by_coords')\
.resample({'time':'A'})\
.sum()\
.sum(dim='station')['runoff']\
.to_dataframe()\
.values
df[90] = xr.open_mfdataset(glob.glob("./freshwater/ice_90/runoff/MAR_*.nc"), combine='by_coords')\
.resample({'time':'A'})\
.sum()\
.sum(dim='station')['runoff']\
.to_dataframe()\
.values
df[100] = xr.open_mfdataset(glob.glob("./freshwater/ice_100/runoff/MAR_*.nc"), combine='by_coords')\
.resample({'time':'A'})\
.sum()\
.sum(dim='station')['runoff']\
.to_dataframe()\
.values
df['range'] = df.max(axis='columns') - df.min(axis='columns')
df['%'] = df['range'] / df.max(axis='columns') * 100
df.describe()
The 1) MAR and RACMO models, 2) basins based on the GIMP surface, and 3) reality, are not always correctly aligned. Misalignment includes the following:
Reality | Basin | Model | Comment |
---|---|---|---|
Glacier | Glacier | Glacier | aligned |
Land | Land | Land | aligned |
Glacier | Glacier | Land | |
Glacier | Land | Glacier | |
Glacier | Land | Land | |
Land | Land | Glacier | |
Land | Glacier | Land | |
Land | Glacier | Glacier |
We ignore “Reality” and treat the basin output as the standard, which reduces Table tab:misalignment to just two cases where basins and model disagree, with the following effects:
- Where basin = glacier and model = land, glacier runoff is lost
- Where basin = land and model = glacier, land runoff is lost
Mask category | Model category | Description | Area [km2] |
---|---|---|---|
1 | | land (total) | 361,950 |
1 | 0 | model fjord | 717 |
1 | 1 | model land | 339,749 |
1 | 2 | model ice | 21,484 |
* | | no data (total) | 3,676,860 |
* | 0 | model fjord | 1,863,957 |
* | 1 | model land | 51,532 |
* | 2 | model ice | 1,761,221 |
* | * | no data | 150 |
TOTAL | | | 4,038,810 |
Mask category | Model category | Description | Area [km2] |
---|---|---|---|
1 | | ice (total) | 1,781,816 |
1 | 0 | model fjord | 27 |
1 | 1 | model land | 20,876 |
1 | 2 | model ice | 1,760,912 |
* | | no data (total) | 2,256,994 |
* | 0 | model fjord | 1,864,647 |
* | 1 | model land | 370,404 |
* | 2 | model ice | 21,793 |
* | * | no data | 150 |
TOTAL | | | 4,038,810 |
We adjust the model results to the basins in the following ways. For speed, this is done once using the basin and model masks, not the daily values: a coverage factor is calculated one time and then used as a daily multiplier later in the workflow.
Where a basin reports ice and the model reports land, we divide the daily runoff values of the basin cells that do have model ice by the fraction of the basin area covered by model ice (i.e., cells with runoff data). That is, the model land runoff is discarded, and the nearby model ice runoff is scaled up to compensate for the uncovered basin cells. For example, if an ice basin is 90 % covered by model ice, model runoff is divided by 0.9 to estimate what 100 % coverage would report. This scenario occurs frequently at the ice margin because the 1 km model grid cells rarely align with the 90 m resolution basin boundary. It also occurs wherever nunataks exist, because interior “holes” in the ice sheet are filled; otherwise they would act as interior drains that block large areas from routing to the coast, something unlikely to occur in reality. When no model cells cover any part of a small basin, that basin never has any reported runoff.
Where a basin reports land and the model reports ice, the same treatment as above is performed, but for land. That is, the model ice runoff is discarded, and the nearby model land runoff is scaled to compensate for the uncovered basin cells. As above, this occurs frequently at the ice margin for the same reason. Large areas of model ice (and real ice) are occasionally reported over land basins where outlet glaciers are not properly captured in the model masks (FIGURE near UAK).
Where the model mask reports ocean, no basins exist and model runoff, both land and ice, is discarded. Large areas of real ocean and fjord are classified as land in the MAR model mask variable.
The above means that model runoff is not conserved.
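As a minimal sketch of this adjustment (the basin IDs, coverage fractions, and runoff values below are hypothetical; the real coverage factors are computed from the basin and model masks in the workflow that follows), the daily multiplier is simply the inverse of the fraction of each basin covered by model cells with runoff data:

import pandas as pd

# Hypothetical coverage fractions for three ice basins: the fraction of each
# basin's area covered by model ice cells (1.0 = fully covered).
coverage = pd.Series({101: 1.00, 102: 0.90, 103: 0.75})

# Hypothetical daily runoff [m3 s-1] summed from only the model cells that
# do cover each basin.
runoff = pd.DataFrame({101: [10.0, 12.0], 102: [9.0, 9.9], 103: [3.0, 4.5]},
                      index=pd.date_range("2000-06-01", periods=2))

# Dividing by the coverage fraction scales the partial-coverage runoff up to
# an estimate of what 100 % coverage would report (e.g. 90 % coverage -> /0.9).
runoff_adjusted = runoff / coverage
print(runoff_adjusted)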
Scale the model area covering each basin so that it equals the basin area.
First, let's test this with some visual examination
rm -fR ./G/test_coverage
grass -c ./G/test_coverage
g.mapset -c coverage_test
g.region -d
r.mask -r
# pick one and zoom in
r.mapcalc "land_basin = if(basins@ice_surf == 353, 353, null())" --q --o
r.colors map=land_basin color=green
r.mapcalc "ice_basin = if(basins@ice_surf == 12, 12, null())" --q --o
r.colors map=ice_basin color=water
g.region zoom=land_basin --q
d.mon start=wx0
d.erase
d.rast land_basin
d.rast ice_basin
# # check if model ice exists here
d.rast mask_ice_MAR@MAR # yes
r.report --q -h map=ice_basin,mask_ice_MAR@MAR units=k | grep -A4 "|12|"
# SAME -> setting mask not needed for ice.
g.region zoom=land_basin --q
r.report --q -h map=land_basin,mask_land_MAR@MAR units=k | grep -A4 "|353|"
g.region zoom=land_basin --q
r.report --q -h map=ice_basins@land_surf,mask_ice_MAR@MAR units=k | grep -A4 "|353|"
# coverage report
r.report --q -h map=basins@land_surf,mask_ice_MAR@MAR units=k
Generate a test file
grass -c ./G/test_coverage
r.report --q -h map=basins@ice_surf,mask_ice_MAR@MAR units=k | head -n 100
r.stats --q -aN input=basins@ice_surf,mask_ice_MAR@MAR separator=, > ./tmp/coverage_test.csv
exit
Compute coverage
The key line is:
r.stats --q -aN input=basins@ice_surf,mask_ice_MAR@MAR separator=, > ./tmp/coverage_test.csv
In the main processing loop, I’ll add a line like:
r.stats --q -aN input=basins@<MAPSET>,mask_<domain>_<model>@<model> separator=, \
> ./tmp/coverage_<mapset>_<domain>_<model>.csv
And then in post-processing I will calculate coverage and apply it to each basin.
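A minimal sketch of that post-processing step, assuming the three comma-separated columns written by r.stats above are basin category, model mask category, and cell area in m2 (null mask cells printed as "*"):

import pandas as pd

# Per-basin area broken down by model mask category, from the r.stats call above.
cov = pd.read_csv("./tmp/coverage_test.csv", header=None,
                  names=["basin", "mask", "area"])

# Area of each basin covered by non-null model cells vs. total basin area.
covered = cov[cov["mask"].astype(str) != "*"].groupby("basin")["area"].sum()
total = cov.groupby("basin")["area"].sum()

# Coverage fraction per basin; the daily runoff is later divided by this factor.
coverage = (covered / total).fillna(0)
print(coverage.describe())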
g.mapset -c test_alignment
g.region n=-3227490 s=-3253500 w=7110 e=48960 res=90 -pa align=mask_ice_basin@MAR
g.region save=test --o
# MAR grid to this mapset and region subset
r.to.vect input=mask_ice_MAR@MAR output=MAR_grid_ice type=area
r.to.vect input=mask_land_MAR@MAR output=MAR_grid_land type=area
# MAR grid centers at MAR res to this mapset and region subset
g.region res=1000 -p
r.to.vect input=mask_ice_MAR@MAR output=MAR_grid_ice_point type=point
r.to.vect input=mask_land_MAR@MAR output=MAR_grid_land_point type=point
g.region region=test -p
d.mon start=wx1
d.erase
d.vect basins@land_surf fill_color=178:223:138 color=100:100:100 legend_label="Land Basin"
d.vect basins@ice_surf fill_color=166:206:227 color=100:100:100 legend_label="Ice Basin"
# d.vect outlets@land_surf fill_color=green icon=basic/circle size=7 legend_label="Ice Outlet"
# d.vect outlets@ice_surf fill_color=blue icon=basic/circle size=7 legend_label="Land Outlet"
d.vect MAR_grid_ice fill_color=none color=black width=3 legend_label="MAR Ice/Land Boundary"
d.vect MAR_grid_ice_point icon=basic/box color=black fill_color=blue size=3 legend_label="MAR Ice Cell"
d.vect MAR_grid_land_point icon=basic/box color=black fill_color=green size=3 legend_label="MAR Land Cell"
NOTE - a better figure is generated in QGIS using the data created above. See ./fig/alignment.qgz and output ./fig/basin_MAR_align.png
ps.map -e input=./tmp/basin_MAR_align.psmap output=./fig/basin_MAR_align.eps --o
convert -trim -density 300 ./fig/basin_MAR_align.eps ./fig/basin_MAR_align.png
o ./fig/basin_MAR_align.png
border n
# scale 1:1000000
paper a4
end
vareas MAR_grid_ice
color black
pat $GISBASE/etc/paint/patterns/diag_down.eps
scale 1
fcolor 55:126:184
width 1
label "MAR ice"
end
vareas MAR_grid_land
color black
pat $GISBASE/etc/paint/patterns/diag_up.eps
scale 1
fcolor 51:160:44
width 1
label "MAR land"
end
# hatch fill
vareas basins@ice_surf
# pat $GISBASE/etc/paint/patterns/diag_up.eps
# scale 0.5
# fcolor 55:126:184
color 55:126:184
# fcolor 166:206:227
fcolor 240:240:240
label "Basin ice"
end
# vareas basins@ice_surf
# color 55:126:184
# fcolor 166:206:227
# end
vareas basins@land_surf
color 51:160:44
fcolor 127:223:138
label "Basin land"
end
# # crosses on land
# vpoints MAR_grid_land_point
# type point
# color 51:160:44
# scale 3
# # see ${GISBAS}/etc/symbol/basic/
# symbol basic/x
# label "Model Land Grid Cell"
# end
vlegend
where 5.7 1.1
fontsize 14
end
scalebar f
length 4
units kilometers
segment 2
where 1 5
fontsize 14
end
<<py_init>>
<<py_init_graphics>>
import geopandas as gp
from shapely.geometry import Point
import matplotlib
from matplotlib import rc
rc('font', size=12)
rc('text', usetex=False)
# matplotlib.pyplot.xkcd()
# plt.close(1)
fig = plt.figure(1, figsize=(8,11)) # w,h
# get_current_fig_manager().window.move(0,0)
fig.clf()
fig.set_tight_layout(True)
ax = fig.add_subplot(111)
if 'land' not in locals():
land = (gp.read_file("./freshwater/land_100/outlets.gpkg").set_index("cat"))\
.merge(gp.read_file("./freshwater/land_100/basins_filled.gpkg").set_index("cat"), left_index=True, right_index=True)\
.rename(columns={"geometry_x":"outlet", "geometry_y":"basin"})\
.set_geometry('basin')
ice = (gp.read_file("./freshwater/ice_100/outlets.gpkg").set_index("cat"))\
.merge(gp.read_file("./freshwater/ice_100/basins.gpkg").set_index("cat"), left_index=True, right_index=True)\
.rename(columns={"geometry_x":"outlet", "geometry_y":"basin"})\
.set_geometry('basin')
kw = {'ax':ax, 'linewidth':0.3}
land.plot(color=C_lightgreen, edgecolor=C_darkgreen, **kw)
ice[ice.elev > -10].plot(color=(0.8, 0.8, 0.8), edgecolor=C_darkblue, **kw)
ice[ice.elev <= -10].plot(color=C_lightblue, edgecolor=C_darkblue, **kw)
from matplotlib_scalebar.scalebar import ScaleBar
ax.add_artist(ScaleBar(1))
gdf = gp.GeoDataFrame(obs_xy, columns=['abbrev','x','y'], crs="EPSG:3413").set_index("abbrev")
gdf['geometry'] = [Point(x,y) for x,y in zip(gdf['x'],gdf['y'])]
import matplotlib.patheffects as pe
kw = {'color':'k', 'path_effects':[pe.withStroke(linewidth=4, foreground="white")]}
for obs in gdf.index:
basin = land[land.contains(gdf.loc[obs].geometry)].geometry
if obs == 'L':
basin = ice[ice.contains(gdf.loc[obs].geometry)].geometry
box = basin.envelope.scale(1.1, 1.1)
box.plot(ax=ax, color='none', edgecolor='k')
xy = box.iloc[0].exterior.coords.xy
x,y = xy[0][-2], xy[1][-2]
txt=obs
# if obs == 'L': txt = ''
# if obs == 'W':
# txt = 'W & L'
# ax.text(x-35000,y+12000, txt, horizontalalignment='left', **kw)
if obs not in ['K','Kb','O']:
ax.text(x-3000,y+5000, txt, horizontalalignment='right', **kw)
else: # obs in ['K','Kb','O']:
if obs == 'K':
x,y = xy[0][-1], xy[1][-1]
ax.text(x-3000,y-5000, txt, horizontalalignment='right', verticalalignment='top', **kw)
if obs == 'O':
x,y = xy[0][2], xy[1][2]
ax.text(x+3000,y+5000, txt, horizontalalignment='left', verticalalignment='bottom', **kw)
if obs == 'Kb':
x,y = xy[0][1], xy[1][1]
ax.text(x+3000,y-5000, txt, horizontalalignment='left', verticalalignment='top', **kw)
plt.axis('off')
plt.subplots_adjust(top = 1, bottom = 0, right = 1, left = 0, hspace = 0, wspace = 0)
plt.margins(0,0)
plt.gca().xaxis.set_major_locator(plt.NullLocator())
plt.gca().yaxis.set_major_locator(plt.NullLocator())
plt.savefig("./fig/overview.png", bbox_inches='tight', dpi=300, pad_inches=0)
- To find out how much each upstream cell moved between two basin delineations, B0 and B1…
- For each outlet in B0 and B1 get the x and y coordinates and the outlet ID (same as the basin ID).
- Build four raster maps, B0_x, B0_y, B1_x, B1_y, where each cell in x (y) holds the x (y) coordinate of the outlet for that cell.
- The distance between outlet locations is then sqrt((B0_x - B1_x)^2 + (B0_y - B1_y)^2) for each cell.
- When comparing many (n) basin delineations, each cell has n possible (x,y) outlet locations.
- Calculate mean_x, mean_y for each cell from the n (x,y) possibilities.
- Then calculate the average/max/std distance for each cell from the n (x,y) to that mean.
- Then one map captures all n possible delineations (a minimal numerical sketch follows this list).
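A minimal numerical sketch of the per-cell statistic described above (the coordinates are made up; the workflow below computes this from the exported rasters):

import numpy as np

# Rows are cells, columns are the n basin delineations. x and y hold the
# outlet coordinate assigned to each cell by each delineation.
x = np.array([[0., 100., 200.],   # a cell whose outlet moves between delineations
              [50., 50., 50.]])   # a cell whose outlet never moves
y = np.array([[0., 0., 0.],
              [10., 10., 10.]])

# Mean outlet location per cell across the n delineations
mx = x.mean(axis=1, keepdims=True)
my = y.mean(axis=1, keepdims=True)

# Distance from each delineation's outlet to the mean outlet, per cell
dist = np.hypot(x - mx, y - my)

# One map per statistic then captures all n delineations
print("mean distance per cell:", dist.mean(axis=1))
print("max distance per cell: ", dist.max(axis=1))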
<<init>>
g.mapset -c k_basin_change
log_info "Exporting outlet coordinates"
mapsets=$(g.mapsets -l separator="\n" | grep -E "^ice_*")
parallel --bar "r.out.xyz input=outlets@{1} | awk -F'|' '{print \$3, \$1}' > ./tmp/outlets_{1}_x" ::: ${mapsets}
parallel --bar "r.out.xyz input=outlets@{1} | awk -F'|' '{print \$3, \$2}' > ./tmp/outlets_{1}_y" ::: ${mapsets}
- First, encode the outlet location as the category information of each basin, rather than as the value of each (x,y) cell.
- This means we encode information for n basins (20,000?) rather than for n grid cells (4.5 M?).
- Then create new rasters where the grid cells contain the outlet coordinates directly, which makes the math easier further down in the code.
log_info "Encoding outlet coordinates"
coord="x y"
parallel --bar "g.copy basins@{1},cat_{1}_{2}" ::: ${mapsets} ::: ${coord}
parallel --bar "r.category map=cat_{1}_{2} rules=./tmp/outlets_{1}_{2} separator=space" ::: ${mapsets} ::: ${coord}
parallel --bar "r.mapcalc '{1}_{2} = @cat_{1}_{2}'" ::: ${mapsets} ::: ${coord}
log_info "Exporting cells..."
log_warn "Reducing resolution."
LIST=$(g.list type=raster pattern='^[i|l][ceand]*_*[x|y]$' separator=,)
echo "x,y,${LIST}" | tr "," "|" > ./tmp/cells_xy.bsv
g.region res=1000 -a
r.out.xyz input=${LIST} output=- >> ./tmp/cells_xy.bsv
g.region -d
LIST=ice_100_x,ice_100_y,ice_80_x,ice_80_y,ice_90_x,ice_90_y,z_s@ArcticDEM
echo "x,y,${LIST}" | tr "," "|" > ./tmp/cells_elev.bsv
g.region res=1000 -a
r.out.xyz input=${LIST} output=- >> ./tmp/cells_elev.bsv
g.region -d
import pandas as pd
import numpy as np
df_xy = pd.read_csv("./tmp/cells_xy.bsv", delimiter="|", usecols=['x','y']).astype(np.int)
header = pd.read_csv("./tmp/cells_xy.bsv", delimiter="|", nrows=1)
cols_in = header.columns[['ice' in _ for _ in header.columns]]
df = pd.read_csv("./tmp/cells_xy.bsv", delimiter="|", usecols=cols_in)
multi_hdr = [np.array([_.split("_")[1] for _ in cols_in]).astype(np.int),
[_.split("_")[2] for _ in cols_in]]
df.columns = pd.MultiIndex.from_arrays(multi_hdr, names=['k','coord'])
k = np.unique(multi_hdr[0])
for kk in k:
# mean distance for this k =
# all other columns minus this column squared for x
# also for y
# square root of all that
# take the mean of all columns
df[(kk,'avg_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).mean(axis='columns')
# same for max, except last step
df[(kk,'max_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).max(axis='columns')
# df[(kk,'std_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).std(axis='columns')
df_out = pd.DataFrame()
df_out['avg_avg'] = df.xs('avg_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['avg_max'] = df.xs('max_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['avg_std'] = df.xs('std_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['max_avg'] = df.xs('avg_dist', level='coord', axis='columns').max(axis='columns')
df_out['max_max'] = df.xs('max_dist', level='coord', axis='columns').max(axis='columns')
# df_out['max_std'] = df.xs('std_dist', level='coord', axis='columns').max(axis='columns')
# df_out['std_avg'] = df.xs('avg_dist', level='coord', axis='columns').std(axis='columns')
# df_out['std_max'] = df.xs('max_dist', level='coord', axis='columns').std(axis='columns')
# df_out['std_std'] = df.xs('std_dist', level='coord', axis='columns').std(axis='columns')
df_out = df_out.merge(df_xy, left_index=True, right_index=True)
df_out.to_csv("./tmp/cells_xy_ice_stats.bsv", sep="|", float_format='%d')
NO - now that we only generate a single land_100 delineation, this method no longer works
cols_in = header.columns[['land' in _ for _ in header.columns]]
df = pd.read_csv("./tmp/cells_xy.bsv", delimiter="|", usecols=cols_in)
df.columns = pd.MultiIndex.from_arrays(multi_hdr, names=['k','coord'])
for kk in k:
df[(kk,'avg_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).mean(axis='columns')
df[(kk,'max_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).max(axis='columns')
df_out = pd.DataFrame()
df_out['avg_avg'] = df.xs('avg_dist', level='coord', axis='columns').mean(axis='columns')
df_out['max_max'] = df.xs('max_dist', level='coord', axis='columns').max(axis='columns')
df_out = df_out.merge(df_xy, left_index=True, right_index=True)
df_out.to_csv("./tmp/cells_xy_land_stats.bsv", sep="|", float_format='%d')
# run python code above to calculate basin outlet statistics
log_info "Calculating cell outlet statistics..."
python ./k_basin_change.py
log_info "Re-importing rasters."
log_warn "Reduced resolution."
g.region res=1000 -a
r.in.xyz input=./tmp/cells_xy_ice_stats.bsv output=ice_max method=mean x=4 y=5 z=3 skip=1
r.in.xyz input=./tmp/cells_xy_ice_stats.bsv output=ice_avg method=mean x=4 y=5 z=2 skip=1
# r.in.xyz input=./tmp/cells_xy_land_stats.bsv output=land_max method=mean x=4 y=5 z=3 skip=1
# r.in.xyz input=./tmp/cells_xy_land_stats.bsv output=land_avg method=mean x=4 y=5 z=2 skip=1
-1 = -1
0 thru 2999 = 1 0 to < 3 km
3000 thru 10000 = 2 3 to < 10 km
10001 thru 29999 = 3 10 to < 30 km
30000 thru 99999 = 4 30 to < 100 km
100000 thru 10000000 = 5 > 100 km
# for raster in ice_max ice_avg land_max land_avg; do
for raster in ice_max ice_avg; do
r.mask raster=mask_o_l_i@ArcticDEM maskcats="2 3" --o
r.mapcalc "${raster}_land = if(isnull(${raster}), -1 * (mask_o_l_i@ArcticDEM >=2), ${raster})" --o
r.mask -r
r.reclass input=${raster}_land output=${raster}_class --o rules=./tmp/categories.txt
cat << EOF | r.colors map=${raster}_class rules=-
-1 128:128:128
5 255:255:204
4 161:218:180
3 65:182:196
2 44:127:184
1 37:52:148
EOF
done
# https://colorbrewer2.org/?type=sequential&scheme=YlGnBu&n=5
g.mapset k_basin_change
g.region -d
r.mapcalc "outline = if(mask_o_l_i@ArcticDEM != 1)"
r.to.vect input=outline output=outline type=area
g.region res=1000 -a
r.mapcalc "outline_low = if(mask_o_l_i@ArcticDEM != 1, 1, null())"
r.to.vect input=outline_low output=outline_low type=area
g.region -d
- Could be useful to show an elevation contour representative of the max melt elevation
r.mask outline_low maskcats=1 --o
g.region res=1000 -a
r.mapcalc "z_s = z_s@BedMachine" # mask
r.contour input=z_s output=z_s levels=1000,1500,2000
g.region -d
r.mask -r
border n
# scale 1:1000000
paper a3
end
raster ${raster}
# vareas outline_low
# color gray
# fcolor gray
# # width 0.1
# end
vlines z_s
type line
where level = 1500
color black
width 1
end
vlines z_s
type line
where level = 1500
color white
width 5
end
colortable y
nodata n
where 5.5 12
fontsize 24
end
scalebar s
length 100
units kilometers
segment 1
where 7 11.5
fontsize 24
end
# All of Greenland
g.region -d
g.region res=500 -a
raster=land_avg_class
cat << EOF | ps.map -e input=- output=./tmp/${raster}.eps --o
<<psmap>>
<<psmap_legend_GL>>
EOF
convert -trim ./tmp/${raster}.{eps,png}
o ./tmp/${raster}.png
g.region -d
- Use the PSMAP block above
log_info "Generating graphic..."
mkdir -p ./fig
for domain in ice; do
for stat in max avg; do
raster=${domain}_${stat}_class
g.region -d
g.region res=1000 -a
r.mask -r
cat << EOF | ps.map input=- output=./tmp/${raster}.ps --o
<<psmap>>
EOF
convert -trim -transparent white ./tmp/${raster}.ps ./fig/${raster}.png
g.region -d
done
done
import pandas as pd
import numpy as np
df_xy = pd.read_csv("./tmp/cells_elev.bsv", delimiter="|", usecols=['x','y','z_s@ArcticDEM']).astype(np.int)
header = pd.read_csv("./tmp/cells_elev.bsv", delimiter="|", nrows=1)
cols_in = header.columns[['ice' in _ for _ in header.columns]]
df = pd.read_csv("./tmp/cells_elev.bsv", delimiter="|", usecols=cols_in)
multi_hdr = [np.array([_.split("_")[1] for _ in cols_in]).astype(np.int),
[_.split("_")[2] for _ in cols_in]]
df.columns = pd.MultiIndex.from_arrays(multi_hdr, names=['k','coord'])
k = np.unique(multi_hdr[0])
for kk in k:
# mean distance for this k =
# all other columns minus this column squared for x
# also for y
# square root of all that
# take the mean of all columns
df[(kk,'avg_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).mean(axis='columns')
# same for max, except last step
df[(kk,'max_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).max(axis='columns')
# df[(kk,'std_dist')] = (((df.xs('x', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('x', level='coord', axis='columns')[kk]))**2 + (df.xs('y', level='coord', axis='columns').drop(kk, axis='columns').apply(lambda c: c-df.xs('y', level='coord', axis='columns')[kk]))**2)**0.5).std(axis='columns')
df_out = pd.DataFrame()
df_out['avg_avg'] = df.xs('avg_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['avg_max'] = df.xs('max_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['avg_std'] = df.xs('std_dist', level='coord', axis='columns').mean(axis='columns')
# df_out['max_avg'] = df.xs('avg_dist', level='coord', axis='columns').max(axis='columns')
df_out['max_max'] = df.xs('max_dist', level='coord', axis='columns').max(axis='columns')
# df_out['max_std'] = df.xs('std_dist', level='coord', axis='columns').max(axis='columns')
# df_out['std_avg'] = df.xs('avg_dist', level='coord', axis='columns').std(axis='columns')
# df_out['std_max'] = df.xs('max_dist', level='coord', axis='columns').std(axis='columns')
# df_out['std_std'] = df.xs('std_dist', level='coord', axis='columns').std(axis='columns')
df_out = df_out.merge(df_xy, left_index=True, right_index=True)
# df_out.to_csv("./tmp/cells_xy_ice_stats.bsv", sep="|", float_format='%d')
import matplotlib.pyplot as plt
import matplotlib
from matplotlib import rc
rc('font', size=12)
rc('text', usetex=False)
# matplotlib.pyplot.xkcd()
# plt.close(1)
fig = plt.figure(1, figsize=(5,4)) # w,h
# get_current_fig_manager().window.move(0,0)
fig.clf()
fig.set_tight_layout(True)
# import matplotlib.gridspec as gridspec
# gs = gridspec.GridSpec(1, 1) #w,h
# ax = plt.subplot(gs[:,:])
ax = fig.add_subplot(111)
df2 = df_out.copy(deep=True)
df2 = df2[df2['z_s@ArcticDEM'] < 1500]
df2 = df2[df2['max_max'] < 250*1E3]
df2 = df2[df2['max_max'] > 100*10*np.sqrt(2)]
ax.hexbin(df2['max_max']/1E3,df2['z_s@ArcticDEM'], mincnt=1, gridsize=(50,25), bins='log')
# ax.set_xlim(0,100E3)
# ax.set_ylim(0,3000)
# ax.plot(np.sin(np.arange(0,2*pi,0.1)))
# plt.savefig('.png', transparent=True, bbox_inches='tight', dpi=300)
# this is in Bash w/o GRASS
rm -f ./tmp/obs_loc.bsv
for key in "${!tbl_obs[@]}"; do
row=(${tbl_obs[$key]})
echo ${row[0]}"|"${row[1]}"|"${key}"|" >> ./tmp/obs_loc.bsv
done
# in GRASS
<<grass_init_mapset>>
g.mapset -c obs
v.in.ascii input=./tmp/obs_loc.bsv output=obs_loc x=1 y=2
v.db.renamecolumn map=obs_loc column=int_1,x
v.db.renamecolumn map=obs_loc column=int_2,y
v.db.renamecolumn map=obs_loc column=str_1,label
v.db.dropcolumn map=obs_loc column=int_3
db.select table=obs_loc
Recall locations:
for key in "${!tbl[@]}"; do
row=(${tbl[$key]})
echo ${row[@]} ${key}
done
# also
# cat ./tmp/obs_loc.bsv
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.GeoDataFrame(tbl, columns=['abbrev','x','y'], crs="EPSG:3413").set_index("abbrev")
gdf['geometry'] = [Point(x,y) for x,y in zip(gdf['x'],gdf['y'])]
gdf.to_file("./tmp/obs.gpkg", driver="GPKG")
gdf
| abbrev | x       | y        | geometry                 |
|--------+---------+----------+--------------------------|
| Ks     | -18335  | -3183360 | POINT (-18335 -3183360)  |
| K      | -326372 | -2829354 | POINT (-326372 -2829354) |
| Kb     | -316602 | -2831048 | POINT (-316602 -2831048) |
| L      | -216646 | -2504583 | POINT (-216646 -2504583) |
| O      | -317396 | -2826710 | POINT (-317396 -2826710) |
| Q      | -560538 | -1241281 | POINT (-560538 -1241281) |
| R      | -335678 | -2246371 | POINT (-335678 -2246371) |
| T      | -324548 | -2827284 | POINT (-324548 -2827284) |
| W      | -249713 | -2510668 | POINT (-249713 -2510668) |
| Z      | 699990  | -1540459 | POINT (699990 -1540459)  |
View
qgis ./tmp/obs.gpkg &
Issues
- Leverett is always in a micro-basin at the ice sheet edge. Suggest manually moving it or using =upstream= to collect other nearby micro-catchments.
  - Using =upstream= from the original Leverett location gave too much runoff.
  - The new Leverett location is adjusted into the largest nearby ice basin.
- Narsarsuaq is in a micro-basin at the glacier snout. It seems likely that =upstream= fixes this. It appears that upstream glaciers drain to land and then under this glacier.
source ~/.bash_profile
conda activate freshwater
mkdir -p ./dat/runoff
for line in $(cat ./tmp/obs_loc.bsv); do
echo $line
IFS="|" read x y abb <<< ${line[@]}
echo $abb $x $y
python ./discharge.py --base ./freshwater --roi=${x},${y} -u -d > ./dat/runoff/${abb}.csv
done
This takes an hour to run…
g.mapset -c obs
# | Color | R | G | B | hex |
# |-------------+-----+-----+-----+---------|
# | light blue | 166 | 206 | 227 | #a6cee3 |
# | dark blue | 31 | 120 | 180 | #1f78b4 |
# | light green | 178 | 223 | 138 | #b2df8a |
# | dark green | 51 | 160 | 44 | #33a02c |
# | pink | 251 | 154 | 153 | #fb9a99 |
# | red | 227 | 26 | 28 | #e31a1c |
# | pale orange | 253 | 191 | 111 | #fdbf6f |
# | orange | 255 | 127 | 0 | #ff7f00 |
C_lightblue=#a6cee3
C_darkblue=#1f78b4
C_lightgreen=#b2df8a
C_darkgreen=#33a02c
C_pink=#fb9a99
export GRASS_OVERWRITE=1
# for label in $(db.select -c sql="select label from obs_loc where label == 'L'"); do
for label in $(db.select -c sql="select label from obs_loc"); do
v.extract input=obs_loc where="label=='${label}'" output=this
# set up the rasters and vectors
<<plan_view_obs_loc>>
# Make the map
cat << EOF | ps.map -e input=- output=./fig/${label}.eps --o
<<plan_view_obs_loc_psmap>>
EOF
convert -trim -density 300 ./fig/${label}.eps ./fig/${label}.png
# o ./fig/${label}.png
done
## Open each in Inkscape, adjust scalebar if necessary (x4 for L and W), save all (even if no adjustments were made)
# for f in $(ls fig/{?,??}.eps); do inkscape $f; done
## Make PNGs:
# parallel --verbose --bar "convert -trim -density 300 ./fig/{1}.eps ./fig/{1}.png" ::: Kb K Ks L O Q R T W Z
# # B2018 cell nearby
# x=$(db.select -c sql="select round(x) from b2018 where label == '${label}'")
# y=$(db.select -c sql="select round(y) from b2018 where label == '${label}'")
# g.region w=$(( ${x} - 2500 )) e=$(( ${x} + 2499 )) s=$(( ${y} - 2500 )) n=$(( ${y} + 2499 )) -pa
# v.in.region output=b2018_this # vector outlining region. Useful for cropping.
v.select ainput=basins_filled@land_100 binput=this output=selected operator=contains
g.region vector=selected -pa
# expand by 10 %
eval $(g.region -pg)
w_adj=$(echo "(${e} - ${w}) * 0.05" | bc)
h_adj=$(echo "(${n} - ${s}) * 0.05" | bc)
g.region e=e+${w_adj} w=w-${w_adj} n=n+${h_adj} s=s-${h_adj} -pa
g.region w=w-${w_adj}
g.region w=w-${w_adj} # another 10 % for the legend
v.in.region output=region # vector outlining region. Useful for cropping.
# All basins within region
v.overlay ainput=region binput=basins@land_100 operator=and output=basins_land
v.overlay ainput=region binput=basins_filled@land_100 operator=and output=basins_land_filled
v.overlay ainput=region binput=basins@ice_100 operator=and output=basins_ice
# this land basin + outlet + stream
v.select ainput=basins_land binput=this operator=contains output=basins_land_this
v.select ainput=basins_land_filled binput=this operator=contains output=basins_land_filled_this
v.select ainput=outlets@land_100 binput=basins_land_this operator=within output=outlets_land_this
v.select ainput=streams@land_100 atype=line binput=basins_land_this operator=overlap output=streams_land_this
# ice outlets, basins, and streams draining into this basin
v.select ainput=outlets@ice_100 binput=basins_land_filled_this operator=within output=outlets_ice_this
v.select ainput=basins_ice binput=outlets_ice_this operator=contains output=basins_ice_this
v.select ainput=streams@ice_100 atype=line binput=basins_ice_this operator=overlap output=streams_ice_this
if [[ ${label} == "L" ]]; then
v.select ainput=this binput=basins_ice operator=within output=outlets_ice_this
v.select ainput=basins_ice binput=this operator=contains output=outlets_ice_this
v.select ainput=basins_ice binput=outlets_ice_this operator=contains output=basins_ice_this
v.select ainput=streams@ice_100 atype=line binput=basins_ice_this operator=overlap output=streams_ice_this
fi
### Basemap
g.region res=15 -a
parallel --verbose --bar "r.mapcalc \"{}float = (rgb.{#}@NSIDC_0713)^0.33\"" ::: r g b
eval $(r.univar map=rfloat,gfloat,bfloat -g) # get max of all three
parallel --verbose --bar "r.mapcalc \"{} = int( round( ({}float - ${min}) / (${max} - ${min}) * 255 ) )\"" ::: r g b
r.colors map=r,g,b color=grey
# RCM cells here
r.to.vect input=mask_ice_MAR@MAR output=MAR_ice type=area
r.to.vect input=mask_ice_RACMO@RACMO output=RACMO_ice type=area
# d.rgb red=r green=g blue=b
# d.vect basins@land_100 fill_color=none color=${C_darkgreen} width=2
# d.vect outlets@land_100 icon=basic/cross2 fillcolor=${C_darkgreen} color=${C_darkgreen} size=10
# d.vect streams@land_100 color=${C_lightgreen}
# d.vect basins@ice_100 fill_color=none color=${C_darkblue} width=2
# d.vect outlets@ice_100 icon=basic/cross2 fillcolor=${C_darkblue} color=${C_darkblue} size=10
# d.vect streams@ice_100 color=${C_lightblue}
# d.vect this icon=basic/cross2 color=red fill_color=red size=15
# d.out.file output=./fig/b.png size=${width},${height}
border n
paper a4
end
vareas basins_land
color ${C_darkgreen}
fcolor none
lpos 0
width 0.33
end
vareas basins_ice_this
color ${C_darkblue}
fcolor none
label Ice basins
end
vareas basins_ice
color ${C_darkblue}
fcolor none
lpos 0
width 0.25
end
vareas basins_land_this
color ${C_darkgreen}
fcolor none
label Land basin
end
# vlines streams_ice_this
# color ${C_lightblue}
# # lpos 0
# label Streams
# end
vlines streams_land_this
color ${C_lightgreen}
# lpos 0
label Streams
end
vpoints outlets_ice_this
fcolor 255:127:0
# fcolor white
color none
# symbol basic/box
label Outlets
size 5
end
vpoints outlets_land_this
# color 51:160:44
fcolor 255:127:0
color none
size 5
lpos 0
end
vpoints this
type point
color 255:127:0
fcolor none
label Station
end
vareas MAR_ice
color ${C_lightblue}
# pat $GISBASE/etc/paint/patterns/diag_down.eps
# fcolor 126:206:227
fcolor none
scale 1
width 3
label RCM ice
end
vareas RACMO_ice
color ${C_lightblue}
# pat $GISBASE/etc/paint/patterns/diag_up.eps
# fcolor 126:206:227
fcolor none
scale 1
width 1
# label RACMO ice
lpos 0
end
# vareas b2018_this
# color ${C_pink}
# # pat $GISBASE/etc/paint/patterns/diag_up.eps
# # fcolor 126:206:227
# fcolor none
# scale 1
# width 1
# # label RACMO ice
# lpos 0
# end
rgb r g b
vlegend
where 0.1 1
fontsize 8
end
text 3% 3% a
background white
fontsize 20
end
scalebar s
length 1
units kilometers
segment 1
where 1 3.25
fontsize 12
end
<<py_init>>
<<py_init_graphics>>
<<load_all_obs>>
# k='W'; obs={k:obs[k]}
for k in obs.keys():
fig = plt.figure(1, figsize=(8,3.5)) # w,h
fig.clf()
fig.set_tight_layout(True)
ax1 = fig.add_subplot(111)
adjust_spines(ax1, ['left','bottom'])
df = obs[k]
name = df.attrs['name']
df = df.replace(0, np.nan)
kw = {'ax': ax1, 'drawstyle':'steps-post'}
dfly = df[df.index.year == df.dropna().index.year[-1]] # .dropna() # last year
dfly = dfly.loc[dfly['obs'].first_valid_index():dfly['obs'].last_valid_index()]
# dfly = df
dfly['obs'].plot(color=C_darkblue, **kw, alpha=1, label=name + ' observed')
dfly['MAR'].plot(color=C_MAR, **kw, alpha=0.5, label='MAR')
dfly['RACMO'].plot(color=C_RACMO, **kw, alpha=0.5, label='RACMO')
# \pm 15 % according to Xavier
kw={'step':'post', 'alpha':0.25}
ax1.fill_between(dfly.index, dfly['MAR']*0.85, dfly['MAR']*1.15,color=C_MAR, **kw)
ax1.fill_between(dfly.index, dfly['RACMO']*0.85, dfly['RACMO']*1.15,color=C_RACMO, **kw)
# ax1.fill_between(dfly.index, dfly['MAR_90']*0.85, dfly['MAR_90']*1.15, color=C_MAR, **kw)
# ax1.fill_between(dfly.index, dfly['RACMO_90']*0.85, dfly['RACMO_90']*1.15, color=C_RACMO, **kw)
# ax1.fill_between(dfly.index, dfly['MAR_80']*0.85, dfly['MAR_80']*1.15, color=C_MAR, **kw)
# ax1.fill_between(dfly.index, dfly['RACMO_80']*0.85, dfly['RACMO_80']*1.15, color=C_RACMO, **kw)
if k == 'Q': # 9 % 2017, 22 % 2018 and 2019 error according to Ken Kondo email
dfly['err'] = dfly['obs'] * 0.22
ax1.fill_between(dfly.index, dfly['obs']-dfly['err'], dfly['obs']+dfly['err'], color=C_darkblue, step='post', alpha=0.25)
if k == 'W': # Dirk provides err w/ his data
ax1.fill_between(dfly.index, dfly['obs']-dfly['err'], dfly['obs']+dfly['err'], color=C_darkblue, step='post', alpha=0.25)
plt.legend(fontsize=8, fancybox=False, frameon=False)
ax1.set_xlabel("Time")
ax1.set_ylabel("Runoff [m$^{3}$ s$^{-1}$]")
ax1.text(-0.14, 0.95, 'b', size=15, color='white', bbox=dict(facecolor='black'), transform=ax1.transAxes)
plt.savefig("./fig/" + k + "_ts.png", bbox_inches='tight', dpi=300)
<<py_init>>
<<py_init_graphics>>
<<load_all_obs>>
# k="Q"; obs = {k:obs[k]}
for k in obs.keys():
fig = plt.figure(1, figsize=(8,6.5)) # w,h
fig.clf()
fig.set_tight_layout(True)
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
adjust_spines(ax1, ['left','bottom'])
adjust_spines(ax2, ['right','bottom'])
adjust_spines(ax3, ['left','bottom'])
adjust_spines(ax4, ['right','bottom'])
df = obs[k]
df = df.replace(0, np.nan).dropna()
df = np.log10(df)
# df = df[~df.isin([np.nan, np.inf, -np.inf]).any(1)] # clean up nan and inf
kw = {'alpha': 1, 'cmap': cm.viridis, 'marker':'.', 'c':df.index.dayofyear, 'edgecolor':'none', 'clip_on':False}
s = ax1.scatter(df['obs'], df['MAR'], **kw)
s = ax2.scatter(df['obs'], df['RACMO'], **kw)
kw = {'color':'k', 'alpha':0.66, 'linewidth':0.5, 'levels':5}
sns.kdeplot(df['obs'], df['MAR'], ax=ax1, **kw)
sns.kdeplot(df['obs'], df['RACMO'], ax=ax2, **kw)
lims = [np.min([ax1.get_xlim()[0], ax1.get_ylim()[0], ax2.get_xlim()[0], ax2.get_ylim()[0]]),
np.max([ax1.get_xlim()[1], ax1.get_ylim()[1], ax2.get_xlim()[1], ax2.get_ylim()[1]])]
# drop 5/95 outliers
# q = df['obs'].quantile([0.05, 0.95])
# df = df[(df['obs'] > q[0.05]) & (df['obs'] < q[0.95])]
df.sort_values(by='obs', inplace=True)
x = df['obs'].values
y_MAR = df['MAR'].values
y_RACMO = df['RACMO'].values
X = sm.add_constant(x)
# X = x
model = sm.OLS(y_MAR, X)
results = model.fit()
prstd, iv_l, iv_u = wls_prediction_std(results)
ax1.fill_between(x, iv_u, iv_l, color="grey", alpha=0.25)
ax1.text(0.05, 0.85, 'r$^{2}$:' + str(round(results.rsquared,2)), transform=ax1.transAxes, horizontalalignment='left')
model = sm.OLS(y_RACMO, X)
results = model.fit()
prstd, iv_l, iv_u = wls_prediction_std(results)
ax2.fill_between(x, iv_u, iv_l, color="grey", alpha=0.25)
ax2.text(0.05, 0.85, 'r$^{2}$:' + str(round(results.rsquared,2)), transform=ax2.transAxes, horizontalalignment='left')
ax1.text(0.05, 0.95, 'n:' + str(df['obs'].size), transform=ax1.transAxes, horizontalalignment='left')
ax1.set_xlabel('Observed [m$^{3}$ s$^{-1}$]')
ax1.set_ylabel('MAR [m$^{3}$ s$^{-1}$]')
ax2.set_xlabel('Observed [m$^{3}$ s$^{-1}$]')
ax2.set_ylabel('RACMO [m$^{3}$ s$^{-1}$]')
if lims[0] < -3: lims[0] = -3
lims_upstreamp = np.log10((10**np.array(lims))*5)
lims_down = np.log10((10**np.array(lims))/5)
kw = {'alpha':0.5, 'linewidth':1, 'color':'k', 'linestyle':'-'}
for ax in [ax1,ax2]:
ax.plot(([lims[0],lims[1]]), ([lims[0],lims[1]]), **kw)
ax.plot(([lims[0],lims[1]]), ([lims_upstreamp[0],lims_upstreamp[1]]), **kw)
ax.plot(([lims[0],lims[1]]), ([lims_down[0],lims_down[1]]), **kw)
ax.set_ylim(lims[0], lims[1])
ax.set_xlim(lims[0], lims[1])
ticks = np.arange(round(lims[0]), round(lims[1])+1)
ax.set_yticks(ticks)
labels = ['10$^{' + str(int(_)) + '}$' for _ in ticks]
ax.set_yticklabels(labels)
ax.set_xticks(ax.get_yticks())
ax.set_xticklabels(ax.get_yticklabels())
# cax = fig.add_axes([0.46, 0.6, 0.01, 0.2])
# cb = fig.colorbar(s, cax=cax)
cax = fig.add_axes([0.40, 0.53, 0.2, 0.015])
cb = fig.colorbar(s, cax=cax, orientation='horizontal')
cax.xaxis.set_ticks_position('top')
cax.xaxis.set_label_position('top')
cb.set_label('Day of year')
plt.setp(ax1.xaxis.get_majorticklabels(), rotation=30)
plt.setp(ax2.xaxis.get_majorticklabels(), rotation=30)
### LOWER PANELS
# Included here so that we can easily match the x-axis limits
df = obs[k]
df['x'] = df['obs']
df['y_MAR'] = df['MAR'] / df['obs']
df['y_RACMO'] = df['RACMO'] / df['obs']
df = df.replace(0, np.nan).dropna()
df = df[~df.isin([np.nan, np.inf, -np.inf]).any(1)]
df = np.log10(df)
kw = {'mincnt':1,
'bins':'log',
'clip_on':True,
'gridsize':20,
'cmap':cm.cividis}
# plot all to get max of both for colorbar range
h_MAR = ax3.hexbin(df['x'], df['y_MAR'], alpha=0, **kw)
h_RACMO = ax4.hexbin(df['x'], df['y_RACMO'], alpha=0, **kw)
ymin = np.floor(np.min([ax1.get_ylim()[0], ax2.get_ylim()[0]]))
ymax = np.ceil(np.max([ax1.get_ylim()[1], ax2.get_ylim()[1]]))
THRESH=3
if ymax > THRESH: ymax = THRESH
kw['alpha'] = 1
kw['extent'] = [lims[0], lims[1], ymin, ymax]
kw['vmax'] = max([h_MAR.get_array().max(),h_RACMO.get_array().max()])
h_MAR = ax3.hexbin(df['x'], df['y_MAR'], **kw)
h_RACMO = ax4.hexbin(df['x'], df['y_RACMO'], **kw)
df_top = df[df['obs'] > df['obs'].quantile(0.33)]
df_bot = df[df['obs'] < df['obs'].quantile(0.33)]
kwline = {'color':'k'}
kwtext = {'path_effects':[pe.withStroke(linewidth=2, foreground="white")],
'color':'k',
'fontsize':10,
'verticalalignment':'center'}
for d in [df_top, df_bot]:
for ax in [ax3,ax4]:
if d is df_top:
max_MAR = np.max([_[0] for _ in h_MAR.get_offsets()])
max_RACMO = np.max([_[0] for _ in h_RACMO.get_offsets()])
xpos = np.max([max_MAR, max_RACMO])
# xpos = round(xmax)
kwtext['horizontalalignment'] = 'left'
elif d is df_bot:
min_MAR = np.min([_[0] for _ in h_MAR.get_offsets()])
min_RACMO = np.min([_[0] for _ in h_RACMO.get_offsets()])
xpos = np.min([min_MAR, min_RACMO])
kwtext['horizontalalignment'] = 'right'
if ax is ax3: yy = d['y_MAR']
if ax is ax4: yy = d['y_RACMO']
y = yy.mean()
ax.plot([d['x'].min(),d['x'].max()], [y,y], **kwline)
ax.text(xpos, y, str(round(10**y,2)), **kwtext)
y = yy.quantile(0.95)
ax.plot([d['x'].min(),d['x'].max()], [y,y], linestyle='--', **kwline)
ax.text(xpos, y, str(round(10**y,2)), **kwtext)
y = yy.quantile(0.05)
ax.plot([d['x'].min(),d['x'].max()], [y,y], linestyle='--', **kwline)
ax.text(xpos, y-0.15, str(round(10**y,2)), **kwtext)
ax3.set_xlabel('Observed [m$^{3}$ s$^{-1}$]')
ax3.set_ylabel('MAR / Observed')
ax4.set_xlabel('Observed [m$^{3}$ s$^{-1}$]')
ax4.set_ylabel('RACMO / Observed')
for ax in [ax3,ax4]:
ax.set_xlim(lims[0], lims[1])
ax.set_xticks(ax1.get_xticks())
ax.set_xticklabels(ax1.get_xticklabels())
ticks = np.arange(round(ymin), round(ymax)+1)
labels = ['10$^{' + str(int(_)) + '}$' for _ in ticks]
ax.set_yticks(ticks)
ax.set_ylim([ticks[0], ticks[-1]])
ax.set_yticklabels(labels)
ax1.text(-0.33, 0.95, 'c', size=15, color='white', bbox=dict(facecolor='black'), transform=ax1.transAxes)
ax3.text(-0.33, 0.99, 'd', size=15, color='white', bbox=dict(facecolor='black'), transform=ax3.transAxes)
cax = fig.add_axes([0.40, 0.5, 0.2, 0.015])
cb = fig.colorbar(h_MAR, cax=cax, orientation='horizontal')
if k in ['Ks','R','Q']: cax.xaxis.set_minor_formatter(plt.ScalarFormatter())
cb.set_label('Count')
plt.setp(ax3.xaxis.get_majorticklabels(), rotation=30)
plt.setp(ax4.xaxis.get_majorticklabels(), rotation=30)
# clean up upper panel
adjust_spines(ax1,['left'])
adjust_spines(ax2,['right'])
ax1.set_xlabel('')
ax2.set_xlabel('')
plt.savefig("./fig/" + k + "_tukey_scatter.png", bbox_inches='tight', dpi=300)
mkdir ./tmp/stack
f=$(ls fig/{?,??}_ts.png|head -n1) # debug
for f in $(ls fig/{?,??}_ts.png); do
ff=$(basename $f)
k=$(echo ${ff} | cut -d"_" -f1)
echo $ff $k
cp fig/${k}.png ./tmp/stack/map.png
cp fig/${k}_ts.png ./tmp/stack/ts.png
cp fig/${k}_tukey_scatter.png ./tmp/stack/tukey_scatter.png
convert ./tmp/stack/{map,ts,tukey_scatter}.png -gravity center -append fig/${k}_stack.png
done
convert -pointsize 40 -fill red -draw 'text 850,250 "X"' ./fig/L_stack.png ./fig/tmp.png
mv ./fig/tmp.png ./fig/L_stack.png
import matplotlib.pyplot as plt
import matplotlib
from matplotlib import rc
rc('font', size=12)
rc('text', usetex=False)
# matplotlib.pyplot.xkcd()
from discharge import discharge
import geopandas as gp
from adjust_spines import adjust_spines as adjust_spines
if 'r_land' not in locals():
# runoff at Watson land outlet, and for two ice basins to the S not included in the upstream Watson outlets
r_land = discharge(base="./freshwater", roi="-50.68,67.01", upstream=True, quiet=False).discharge()
r_ice0 = discharge(base="./freshwater", roi="-49.40,66.90", upstream=False, quiet=False).discharge()
r_ice1 = discharge(base="./freshwater", roi="-49.25,66.80", upstream=False, quiet=False).discharge()
g_land = discharge(base="./freshwater", roi="-50.68,67.01", upstream=True, quiet=False).outlets().set_geometry('basin')
g_ice0 = discharge(base="./freshwater", roi="-49.40,66.90", upstream=False, quiet=False).outlets().set_geometry('basin')
g_ice1 = discharge(base="./freshwater", roi="-49.25,66.80", upstream=False, quiet=False).outlets().set_geometry('basin')
# plt.close(1)
fig = plt.figure(1, figsize=(4,4.5)) # w,h
fig.clf()
fig.set_tight_layout(True)
import matplotlib.gridspec as gridspec
gs = gridspec.GridSpec(2, 2, width_ratios=[1,1000], height_ratios=[1,2]) #h, w
ax1 = plt.subplot(gs[0, :])
ax2 = plt.subplot(gs[1, :])
C_lightblue = np.array((166, 206, 227))/255
C_darkblue = np.array((31, 120, 180))/255
C_lightgreen = np.array((127, 223, 138))/255
C_darkgreen = np.array((51, 160, 44))/255
kw = {'ax':ax1, 'linewidth':0.3, 'clip_on':False}
g_land[g_land['domain'] == 'land'].plot(color=C_lightgreen, edgecolor='k', **kw)
g_land[g_land['domain'] == 'ice'].plot(color='orange', edgecolor='k', **kw)
g_land[g_land['domain'] == 'land'].set_geometry('outlet').plot(color='r', **kw)
g_ice0[g_ice0['domain'] == 'ice'].plot(color=C_darkblue, edgecolor='k', **kw)
g_ice1[g_ice1['domain'] == 'ice'].plot(color=C_darkblue, edgecolor='k', **kw)
# ax1.set_ylim([-2.59E6,-2.49E6])
ax1.set_ylim([-2.60E6,-2.50E6])
# write out for Google Earth so we can draw approx location of contour line
# g_ice1[(g_ice1['domain'] == 'ice') & (g_ice1['k'] == 100)].drop(columns="outlet").to_file('./tmp/g_ice1.shp')
# 1500 @ -48.3
# 1850 @ -47.33
# most runoff below that (see van As 2017)
from shapely.geometry import Point, Polygon
xy = gp.GeoSeries(data=Point(-48.3, 66.6), crs="EPSG:4326").to_crs("EPSG:3413").loc[0].xy
ax1.plot(list(xy[0])*2,[-2.52E6,-2.57E6], 'k--', alpha=0.5, clip_on=False)
xy = gp.GeoSeries(data=Point(-47.33, 66.6), crs="EPSG:4326").to_crs("EPSG:3413").loc[0].xy
ax1.plot(list(xy[0])*2,[-2.54E6,-2.59E6], 'k--', alpha=0.5, clip_on=False)
# ax1.margins(0,0)
# ax1.xaxis.set_major_locator(plt.NullLocator())
# ax1.yaxis.set_major_locator(plt.NullLocator())
adjust_spines(ax1, [])
from matplotlib_scalebar.scalebar import ScaleBar
ax1.add_artist(ScaleBar(1))
W = pd.read_csv("./dat/runoff/obs_W.csv", index_col=0, parse_dates=True)
MAR = r_land[['MAR_land','MAR_ice_upstream']]\
.sum(dim=['land','ice_upstream'])\
.to_dataframe()\
.merge(r_ice0['MAR_ice']\
.squeeze()\
.to_dataframe()['MAR_ice'], left_index=True, right_index=True)\
.rename(columns={"MAR_ice":"ice0"})\
.merge(r_ice1['MAR_ice']\
.squeeze()\
.to_dataframe()['MAR_ice'], left_index=True, right_index=True)\
.rename(columns={"MAR_ice":"ice1"})\
.merge(W, left_index=True, right_index=True)
combine = ['MAR_land','MAR_ice_upstream']
MAR['land'] = MAR[combine].sum(axis='columns')
MAR.drop(columns=combine, inplace=True)
# identical to MAR
RACMO = r_land[['RACMO_land','RACMO_ice_upstream']]\
.sum(dim=['land','ice_upstream'])\
.to_dataframe()\
.merge(r_ice0['RACMO_ice']\
.squeeze()\
.to_dataframe()['RACMO_ice'], left_index=True, right_index=True)\
.rename(columns={"RACMO_ice":"ice0"})\
.merge(r_ice1['RACMO_ice']\
.squeeze()\
.to_dataframe()['RACMO_ice'], left_index=True, right_index=True)\
.rename(columns={"RACMO_ice":"ice1"})\
.merge(W, left_index=True, right_index=True)
combine = ['RACMO_land','RACMO_ice_upstream']
RACMO['land'] = RACMO[combine].sum(axis='columns')
RACMO.drop(columns=combine, inplace=True)
df = (MAR+RACMO)/2
df['merge'] = df['land'] + df['ice0'] + df['ice1']
df = np.log10(df).dropna()
df = df[~df.isin([np.nan, np.inf, -np.inf]).any(1)] # clean up nan and inf
import seaborn as sns
kw = {'alpha':0.66, 'shade_lowest':False}
sns.kdeplot(df['Observed'], df['land'], cmap=cm.Oranges, ax=ax2, shade=True, **kw)
sns.kdeplot(df['Observed'], df['merge'], cmap=cm.Blues, ax=ax2, **kw)
lims = np.array((ax2.get_xlim(),ax2.get_ylim())).flatten()
ax2.set_xlim(1, 3.8)
ax2.set_ylim(ax2.get_xlim())
ax2.set_xlabel('Observed [m$^{3}$ s$^{-1}$]')
lims = [1,3.8]
if lims[0] < -3: lims[0] = -3
lims_upstreamp = np.log10((10**np.array(lims))*2)
lims_down = np.log10((10**np.array(lims))/2)
kw = {'alpha':0.5, 'linewidth':1, 'color':'k'}
ax2.plot(([lims[0],lims[1]]), ([lims[0],lims[1]]), **kw)
# ax2.plot(([lims[0],lims[1]]), ([lims_upstreamp[0],lims_upstreamp[1]]), **kw)
ax2.plot(([lims[0],lims[1]]), ([lims_down[0],lims_down[1]]), linestyle='--', **kw)
ax2.set_ylim(lims[0], lims[1])
ax2.set_xlim(lims[0], lims[1])
ticks = np.arange(round(lims[0]), round(lims[1])+1)
ax2.set_yticks(ticks)
labels = ['10$^{' + str(int(_)) + '}$' for _ in ticks]
ax2.set_yticklabels(labels)
ax2.set_xticks(ax2.get_yticks())
ax2.set_xticklabels(ax2.get_yticklabels())
adjust_spines(ax2, ['left','bottom'])
ax2.set_ylabel('RCM mean [m$^{^3}$ s$^{-1}$]')
plt.savefig('./fig/watson_adjusted_south.png', transparent=True, bbox_inches='tight', dpi=300)
README for "Greenland liquid water runoff from 1950 through December 2022"
Data DOI: doi:10.22008/promice/freshwater
Source: https://github.com/GEUS-Glaciology-and-Climate/freshwater
* Data Description
Data sets released as part of this work include:
+ Streams
+ Outlets
+ Basins
+ Runoff
** Streams, outlets, and basins
Streams, outlets, and basins are calculated for both land-coast outlets and ice-margin outlets, together with the upstream basins and streams that feed those outlets. For each of these two domains (land coast, ice margin) we estimate streams, outlets, and basins, assuming subglacial water pressure is equal to the ice overburden pressure.
Streams over land closely follow actual streams based on comparison with satellite imagery.
Subglacial streams are merely representative of where the model hydropotential routes most of the water.
Outlets are marked where streams leave the domain.
Basins are the upstream grid cells that feed into each outlet.
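For example, the outlets and basins can be loaded with geopandas (a sketch; paths follow the layout described in the Datasets section below):

import geopandas as gpd

# Land-coast outlets and their basins
outlets = gpd.read_file("./land/outlets.gpkg")
basins = gpd.read_file("./land/basins.gpkg")
print(outlets.head())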
** Runoff
Runoff comes from daily MAR and RACMO RCM runoff, partitioned into each basin, and assigned to each outlet.
The ice dataset reports the ice runoff. The land dataset reports the land runoff. The land dataset also includes the IDs of the upstream ice basins that drain into each land basin, allowing one to find the ice melt runoff at any land outlet.
* Datasets
Each dataset is in a folder named <domain>, where =domain= is one of "ice" or "land". Within each folder the following files and sub-folders are provided:
| Filename | Description |
|--------------+---------------------------------------|
| streams.gpkg | GeoPackage of stream locations |
| streams.csv | Metadata for streams |
| outlets.gpkg | GeoPackage of outlet locations |
| outlets.csv | Metadata for outlets |
| basins.gpkg | GeoPackage of basin locations |
| basins.csv | Metadata for basins |
| runoff | NetCDF files of runoff at each outlet |
In the runoff sub-folders, files are named <RCM>_<YYYY>.nc, where =RCM= is one of "MAR" or "RACMO", and =YYYY= is the year of the data. Within each file are daily runoff values at each outlet.
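For example, one year of runoff can be read with xarray (a sketch; variable names follow the NetCDF headers, e.g. runoff(station, time) in m3 s-1, with =station= holding the outlet id):

import xarray as xr

# One year of MAR runoff at every ice-margin outlet
ds = xr.open_dataset("./ice/runoff/MAR_2000.nc")

# Daily runoff time series at a single outlet
one_outlet = ds["runoff"].isel(station=0)
print(one_outlet.to_dataframe().head())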
* Database access software
The =discharge.py= script provides high-level access to the database. See =README.org= for usage examples.
Complete documentation and example use cases with code are available at https://github.com/GEUS-Glaciology-and-Climate/freshwater, or by contacting Ken Mankoff <kdm@geus.dk>
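For example (a sketch mirroring the calls used elsewhere in this document; the base path and ROI coordinates are illustrative):

from discharge import discharge

# Runoff routed to the outlet nearest a lon,lat point of interest,
# including the upstream ice outlets that drain into that land basin.
r = discharge(base="./freshwater", roi="-50.68,67.01",
              upstream=True, quiet=False).discharge()

# The corresponding outlets and basins as a GeoDataFrame
geom = discharge(base="./freshwater", roi="-50.68,67.01",
                 upstream=True, quiet=False).outlets()

The same query is available from the command line, as used elsewhere in this document: =python ./discharge.py --base ./freshwater --roi=<x>,<y> -u -d > out.csv=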
This work was performed using only open-source software, primarily GRASS GIS
citep:neteler_2012, CDO citep:CDO, NCO citep:NCO, GDAL citep:GDAL, and Python
citep:van-rossum_1995, in particular the Jupyter
citep:kluyver_2016, dask
citep:dask_sw,dask_paper, pandas
citep:mckinney_2010, geopandas
citep:geopandas, numpy
citep:oliphant_2006, xarray
citep:hoyer_2017, and Matplotlib
citep:hunter_2007 packages. The entire work was performed in Emacs
citep:stallman_1981 using Org Mode
citep:schulte_2012 on GNU/Linux and using many GNU utilities (See https://github.com/GEUS-Glaciology-and-Climate/freshwater citep:github_freshwater). The parallel
citep:tange_2011 tool was used to speed up processing. We used proj4
citep:proj4 to compute the errors in the EPSG 3413 projection. The color map for Fig. fig:k_basin_change comes from citet:colorbrewer.
\authorcontribution{KDM produced this work and wrote the code and the text. XF and BN supplied RCM inputs. APA, WIC, DVA, and RSF helped with discussions of methods, quality control, or writing. KK and SS supplied Qaanaaq data. KL provided GEM data.}
\competinginterests{The authors declare that they have no conflict of interest.}
This project uses software: Bash, GRASS, Python, etc. The Python environment is reproducible if you have Conda installed. Below I provide the versions of the software used to create this document, in order to support the goal of bit-matching reproducibility.
for tool in gdal-bin parallel sed gawk netcdf-bin proj-bin nco cdo bash grass-gui datamash; do dpkg -l | grep "ii ${tool} " | cut -c5-90; done| sort
(org-version nil t)
Org mode version 9.5.5 (release_9.5.5 @ /home/kdm/local/src/org-mode/lisp/)
The code below produces ./environment.yml when this file is exported. If that file exists, then =mamba env create -f environment.yml= followed by =mamba activate freshwater= will install all the Python packages used in this document.
conda env export --name freshwater | cat | tee environment.yml
name: freshwater channels: - conda-forge - defaults dependencies: - _libgcc_mutex=0.1=conda_forge - _openmp_mutex=4.5=0_gnu - attrs=19.3.0=py_0 - backcall=0.2.0=pyh9f0ad1d_0 - bleach=3.1.5=pyh9f0ad1d_0 - bokeh=2.1.1=py38h32f6830_0 - boost-cpp=1.72.0=h8e57a91_0 - bzip2=1.0.8=h516909a_2 - ca-certificates=2020.6.20=hecda079_0 - cairo=1.16.0=hcf35c78_1003 - certifi=2020.6.20=py38h924ce5b_2 - cfchecker=4.0.0=py_0 - cfitsio=3.470=hce51eda_6 - cftime=1.2.1=py38h8790de6_0 - cfunits=3.2.8=pyh9f0ad1d_0 - click=7.1.2=pyh9f0ad1d_0 - click-plugins=1.1.1=py_0 - cligj=0.5.0=py_0 - cloudpickle=1.5.0=py_0 - curl=7.71.1=he644dc0_3 - cycler=0.10.0=py_2 - cytoolz=0.10.1=py38h516909a_0 - dask=2.15.0=py_0 - dask-core=2.15.0=py_0 - dbus=1.13.6=he372182_0 - decorator=4.4.2=py_0 - defusedxml=0.6.0=py_0 - descartes=1.1.0=py_4 - distributed=2.21.0=py38h32f6830_0 - entrypoints=0.3=py38h32f6830_1001 - expat=2.2.9=he1b5a44_2 - fiona=1.8.13=py38h033e0f6_1 - fontconfig=2.13.1=h86ecdb6_1001 - freetype=2.10.2=he06d7ca_0 - freexl=1.0.5=h516909a_1002 - fsspec=0.7.4=py_0 - future=0.18.2=py38h32f6830_1 - gdal=3.0.4=py38h172510d_6 - geopandas=0.7.0=py_1 - geos=3.8.1=he1b5a44_0 - geotiff=1.5.1=h05acad5_10 - gettext=0.19.8.1=hc5be6a0_1002 - giflib=5.2.1=h516909a_2 - glib=2.65.0=h6f030ca_0 - gst-plugins-base=1.14.5=h0935bb2_2 - gstreamer=1.14.5=h36ae1b5_2 - h5netcdf=0.8.0=py_0 - h5py=2.10.0=nompi_py38h513d04c_102 - hdf4=4.2.13=hf30be14_1003 - hdf5=1.10.5=nompi_h3c11f04_1104 - heapdict=1.0.1=py_0 - icu=64.2=he1b5a44_1 - importlib-metadata=1.7.0=py38h32f6830_0 - importlib_metadata=1.7.0=0 - ipdb=0.13.3=pyh9f0ad1d_0 - ipykernel=5.3.4=py38h23f93f0_0 - ipython=7.16.1=py38h23f93f0_0 - ipython_genutils=0.2.0=py_1 - ipywidgets=7.5.1=py_0 - jedi=0.17.2=py38h32f6830_0 - jinja2=2.11.2=pyh9f0ad1d_0 - jpeg=9d=h516909a_0 - json-c=0.13.1=hbfbb72e_1002 - jsonschema=3.2.0=py38h32f6830_1 - jupyter=1.0.0=py_2 - jupyter_client=6.1.6=py_0 - jupyter_console=6.1.0=py_1 - jupyter_core=4.6.3=py38h32f6830_1 - kealib=1.4.13=hec59c27_0 - kiwisolver=1.2.0=py38hbf85e49_0 - krb5=1.17.1=hfafb76e_1 - lcms2=2.11=hbd6801e_0 - ld_impl_linux-64=2.34=h53a641e_7 - libblas=3.8.0=17_openblas - libcblas=3.8.0=17_openblas - libclang=9.0.1=default_hde54327_0 - libcurl=7.71.1=hcdd3856_3 - libdap4=3.20.6=h1d1bd15_1 - libedit=3.1.20191231=h46ee950_1 - libffi=3.2.1=he1b5a44_1007 - libgcc-ng=9.2.0=h24d8f2e_2 - libgdal=3.0.4=h3dfc09a_6 - libgfortran-ng=7.5.0=hdf63c60_6 - libgomp=9.2.0=h24d8f2e_2 - libiconv=1.15=h516909a_1006 - libkml=1.3.0=hb574062_1011 - liblapack=3.8.0=17_openblas - libllvm9=9.0.1=he513fc3_1 - libnetcdf=4.7.4=nompi_h9f9fd6a_101 - libopenblas=0.3.10=pthreads_hb3c22a3_3 - libpng=1.6.37=hed695b0_1 - libpq=12.3=h5513abc_0 - libsodium=1.0.17=h516909a_0 - libspatialindex=1.9.3=he1b5a44_3 - libspatialite=4.3.0a=h2482549_1038 - libssh2=1.9.0=hab1572f_4 - libstdcxx-ng=9.2.0=hdf63c60_2 - libtiff=4.1.0=hc7e4089_6 - libuuid=2.32.1=h14c3975_1000 - libwebp-base=1.1.0=h516909a_3 - libxcb=1.13=h14c3975_1002 - libxkbcommon=0.10.0=he1b5a44_0 - libxml2=2.9.10=hee79883_0 - locket=0.2.0=py_2 - lz4-c=1.9.2=he1b5a44_1 - markupsafe=1.1.1=py38h1e0a361_1 - matplotlib=3.2.1=0 - matplotlib-base=3.2.1=py38h2af1d28_0 - mistune=0.8.4=py38h1e0a361_1001 - msgpack-python=1.0.0=py38hbf85e49_1 - munch=2.5.0=py_0 - nbconvert=5.6.1=py38h32f6830_1 - nbformat=5.0.7=py_0 - ncurses=6.2=he1b5a44_1 - netcdf4=1.5.3=nompi_py38heb6102f_103 - notebook=6.0.3=py38h32f6830_1 - nspr=4.27=he1b5a44_0 - nss=3.47=he751ad9_0 - numpy=1.18.5=py38h8854b6b_0 - olefile=0.46=py_0 - openjpeg=2.3.1=h981e76c_3 - 
openssl=1.1.1h=h516909a_0 - packaging=20.4=pyh9f0ad1d_0 - pandas=1.0.4=py38hcb8c335_0 - pandoc=2.10=h14c3975_0 - pandocfilters=1.4.2=py_1 - parso=0.7.0=pyh9f0ad1d_0 - partd=1.1.0=py_0 - patsy=0.5.1=py_0 - pcre=8.44=he1b5a44_0 - pexpect=4.8.0=py38h32f6830_1 - pickleshare=0.7.5=py38h32f6830_1001 - pillow=7.2.0=py38h9776b28_1 - pip=20.1.1=py_1 - pixman=0.38.0=h516909a_1003 - poppler=0.67.0=h14e79db_8 - poppler-data=0.4.9=1 - postgresql=12.3=h8573dbc_0 - proj=7.0.0=h966b41f_5 - prometheus_client=0.8.0=pyh9f0ad1d_0 - prompt-toolkit=3.0.5=py_1 - prompt_toolkit=3.0.5=1 - psutil=5.7.2=py38h1e0a361_0 - pthread-stubs=0.4=h14c3975_1001 - ptyprocess=0.6.0=py_1001 - pygments=2.6.1=py_0 - pyparsing=2.4.7=pyh9f0ad1d_0 - pyproj=2.6.1.post1=py38h7521cb9_0 - pyqt=5.12.3=py38ha8c2ead_3 - pyrsistent=0.16.0=py38h1e0a361_0 - python=3.8.5=cpython_h4d41432_0 - python-dateutil=2.8.1=py_0 - python_abi=3.8=1_cp38 - pytz=2020.1=pyh9f0ad1d_0 - pyyaml=5.3.1=py38h1e0a361_0 - pyzmq=19.0.1=py38ha71036d_0 - qt=5.12.5=hd8c4c69_1 - qtconsole=4.7.5=pyh9f0ad1d_0 - qtpy=1.9.0=py_0 - readline=8.0=he28a2e2_2 - rtree=0.9.4=py38h08f867b_1 - scipy=1.4.1=py38h18bccfc_3 - seaborn=0.10.1=1 - seaborn-base=0.10.1=py_1 - send2trash=1.5.0=py_0 - setuptools=49.2.0=py38h32f6830_0 - shapely=1.7.0=py38hd168ffb_3 - six=1.15.0=pyh9f0ad1d_0 - sortedcontainers=2.2.2=pyh9f0ad1d_0 - sqlite=3.32.3=hcee41ef_1 - statsmodels=0.11.1=py38h1e0a361_2 - tabulate=0.8.7=pyh9f0ad1d_0 - tbb=2020.1=hc9558a2_0 - tblib=1.6.0=py_0 - terminado=0.8.3=py38h32f6830_1 - testpath=0.4.4=py_0 - tiledb=1.7.7=h8efa9f0_3 - tk=8.6.10=hed695b0_0 - toolz=0.10.0=py_0 - tornado=6.0.4=py38h1e0a361_1 - tqdm=4.46.0=pyh9f0ad1d_0 - traitlets=4.3.3=py38h32f6830_1 - typing_extensions=3.7.4.2=py_0 - tzcode=2020a=h516909a_0 - udunits2=2.2.27.6=h4e0c4b3_1001 - wcwidth=0.2.5=pyh9f0ad1d_0 - webencodings=0.5.1=py_1 - wheel=0.34.2=py_1 - widgetsnbextension=3.5.1=py38h32f6830_1 - xarray=0.15.1=py_0 - xerces-c=3.2.2=h8412b87_1004 - xlrd=1.2.0=py_0 - xorg-kbproto=1.0.7=h14c3975_1002 - xorg-libice=1.0.10=h516909a_0 - xorg-libsm=1.2.3=h84519dc_1000 - xorg-libx11=1.6.9=h516909a_0 - xorg-libxau=1.0.9=h14c3975_0 - xorg-libxdmcp=1.1.3=h516909a_0 - xorg-libxext=1.3.4=h516909a_0 - xorg-libxrender=0.9.10=h516909a_1002 - xorg-renderproto=0.11.1=h14c3975_1002 - xorg-xextproto=7.3.0=h14c3975_1002 - xorg-xproto=7.0.31=h14c3975_1007 - xz=5.2.5=h516909a_1 - yaml=0.2.5=h516909a_0 - zeromq=4.3.2=he1b5a44_2 - zict=2.0.0=py_0 - zipp=3.1.0=py_0 - zlib=1.2.11=h516909a_1006 - zstd=1.4.5=h6597ccf_1 - pip: - astroid==2.4.2 - grass-session==0.5 - isort==4.3.21 - lazy-object-proxy==1.4.3 - matplotlib-scalebar==0.6.2 - mccabe==0.6.1 - pylint==2.5.3 - pyqt5-sip==4.19.18 - pyqtchart==5.12 - pyqtwebengine==5.12.1 - toml==0.10.1 - wrapt==1.12.1 prefix: /home/kdm/local/miniconda3/envs/freshwater
(add-to-list 'org-latex-classes
`("copernicus"
"\\documentclass{copernicus}
[NO-DEFAULT-PACKAGES]
[NO-PACKAGES]
[EXTRA]"
("\\section{%s}" . "\\section*{%s}")
("\\subsection{%s}" . "\\subsection*{%s}")
("\\subsubsection{%s}" . "\\subsubsection*{%s}")
("\\paragraph{%s}" . "\\paragraph*{%s}")
("\\subparagraph{%s}" . "\\subparagraph*{%s}"))
)
(setq-local org-latex-title-command "")