# STREAMobj

### Bayesian Optimization of the mn-ratio

River profiles are concave upward if they are in a dynamic equilibrium between uplift and incision, and if our simplified assumptions of steady uplift and the stream power incision law (SPL) hold. The concavity derives from the SPL which states that along-river gradients S are proportional to upslope area A exponentiated by the negative mn-ratio.

I have mentioned the mn-ratio several times in this blog. Usually, we calculate it using slope-area plots or chi analysis both of which are included in TopoToolbox. However, these methods usually lack consistent ways to express the uncertainties of the mn-ratio. The lack of consistency is due to fitting autocorrelated data which elude a straightforward statistical analysis.

Today, I want to present a new function that uses Bayesian Optimization with cross-validation to find a suitable mn-ratio. While Bayesian Optimization is designed to find optimal values of objective functions involving numerous variables, solving an optimization problem with mn as the only variable nicely illustrates the approach.

Bayesian Optimization finds a minimum of a scalar-valued function in a bounded domain. In a classification problem, this value could be the classification loss, i.e., the price paid for misclassifications. In a regression problem, this value might refer to the sum of squared residuals. The value might also be derived using cross-validation, a common approach to assess the predictive performance of a model. Such cross-validation approaches might take into account only random subsets of data, which entails that the value to be optimized might not be the same for the same set of input parameters. Bayesian Optimization can handle stochastic functions.

Now how can we apply Bayesian Optimization for finding the right mn-ratio? The new function mnoptim uses chi analysis to linearize long-river profiles. If there are several river catchments (or drainage network trees), the function will pick a random subset of these trees to fit a mn-ratio and then tests it with another set of drainage basins. This allows us to assess how well an mn-ratio derived in one catchment can actually be applied to another catchment. The goal is to derive a mn-ratio that applies best to other catchments.

Now let’s try this using the new function mnoptim. Here is the code that I’ll use for entire SRTM-3 DEM of Taiwan. I’ll clip the stream network to the area upstream of the 300 m contour to avoid an influence of the alluvial low-lying reaches.

DEM = GRIDobj('taiwan.tif'); FD = FLOWobj(DEM); S = STREAMobj(FD,'minarea',1e6,'unit','map'); C = griddedcontour(DEM,[300 300],true); S = modify(S,'upstreamto',C); A = flowacc(FD); [mn,results] = mnoptim(S,DEM,A,'optvar','mn','crossval',true); % we can refine the results if we need results = resume(mn); % and get an optimal value of mn: bestPoint(results) ans = table mn _______ 0.41482

Now this nicely derives the optimal mn-value of 0.415 which is close to the often reported value of 0.45. Moreover, based on the plot, we gain an impression of the uncertainty of this value. In a transient landscape with frequent knickpoints, the uncertainty about the mn-ratio will be probably larger.

Note that mnoptim requires the latest version of MATLAB: 2017b, as well as the Statistics and Machine Learning Toolbox. It also runs with 2017a, but won’t be able to use parallel computing then.

### Avoiding edge effects in chi analysis

Analyses that use upslope area usually demand that catchments are completely covered by the DEM. Values of upslope areas may be too low if catchments are cut along DEM edges, and so are estimates of discharge. How can we avoid including these catchments into our analyses?

In one of my previous posts on chi analysis, I showed a rather long code to include those catchments that have 20% or less of their outlines on the DEM edges. Here, I’ll be more strict. I’ll remove those parts of the stream network that have pixels on DEM edges in their upslope area. By doing this, we make sure that drainage basins are complete which is vital for chi analysis or estimating discharge.

Here is the code:

DEM = GRIDobj('srtm_bigtujunga30m_utm11.tif'); FD = FLOWobj(DEM); S = STREAMobj(FD,'minarea',1e6,'unit','map'); I = GRIDobj(DEM,'logical'); I.Z(:,:) = true; I.Z(2:end-1,2:end-1) = false; I = influencemap(FD,I);

The influencemap function takes all edges pixels and derives those pixels that they would influence downstream. Again, this is going to be a logical GRIDobj where true values refer to pixels that might potentially be affected by edge effects. Let’s remove those pixels from the stream network S using the modify function.

S2 = modify(S,'upstreamto',~I); D = drainagebasins(FD,S2); imageschs(DEM,[],'colormap',[1 1 1],'colorbar',false) [~,x,y] = GRIDobj2polygon(D); hold on plot(x,y); plot(S,'k') plot(S2,'b','LineWidth',1.5)

Ok, now you have a drainage network that won’t be affected by edge effects for further analysis.

### An alternative to along-river swath profiles

Today, I want to show another application of TopoToolbox’s new smoothing suite that Dirk and I have recently released together with our discussion paper in ESURF. I will demonstrate that using different methods associated with STREAMobj enable you to create plots that are quite similar to swath profiles calculated along rivers. Let’s see how it works.

Swath profiles require a straight or curved path defined by a number of nodes. Each node has a certain orientation defined by one or two of its neighboring nodes. Creating swath profiles then involves mapping values from locations that are perpendicular to that orientation and within a specified distance. We will use a simplified version of mapping values lateral to the stream network which is implemented in the function maplateral. maplateral uses the function bwdist that returns a distance transform from all pixels (the target pixels) to a number of seed pixels in a binary image. In our case, seed pixels are the pixels of the stream network and bwdist identifies the target pixels from which it calculates the euclidean distance to the stream pixels. Usually, there are several target pixels for each seed pixel so that we require an aggregation function that calculates a scalar based on the vector of values that are associated with each seed pixel. Unlike swath profiles, however, this approach entails that some seed pixels may not have target pixels up to the maximum distance. We can thus be pretty sure that some pixels may be missing some of the information that we want to extract. While we cannot avoid this problem using this approach, we can at least reduce its impact using the nonparametric quantile regression implemented in the crs function.

Here is the approach. We’ll load a DEM, derive flow directions and a stream network which is in this case just one trunk river. We use the function maplateral to map elevations in a distance of 2 km to the stream network. Our mapping uses the maximum function to aggregate the values. Hence, we will have for each pixel along the stream network the maximum elevation in its 2 km nearest-neighborhood. The swath is plotted using the function imageschs and the second output of maplateral. Then we plot the maximum values along with a profile of the stream network.

DEM = GRIDobj('srtm_bigtujunga30m_utm11.tif'); FD = FLOWobj(DEM); S = STREAMobj(FD,'minarea',1e6,'unit','map'); S = klargestconncomps(S); S = trunk(S); z = getnal(S,DEM); [zmax,mask] = maplateral(S,DEM,2000,@max); subplot(2,1,1) imageschs(DEM,mask,'truecolor',[1 0 0],... 'colorbar',false,... 'ticklabels','nice'); subplot(2,1,2) plotdz(S,z) hold on plotdz(S,zmax) legend('River profile','maximum heights in 2 km distance')

This looks quite ok, but the topographic profile has a lot of scatter and a large number of values seem not to be representative. Most likely, those are the pixels whose closest pixels fail to extend to the maximum distance of 2 km. Rather, I’d expect a profile that runs along the maximum values of the zigzag line. How can we obtain this line? Well, the crs function could handle this. Though it was originally intended to be applied to longitudinal river profiles, it can also handle other data measured or calculated along stream networks. The only thing we need to “switch off” is the downstream minimum gradient that the function uses by default to create smooth profiles that monotoneously decrease in downstream direction. This is simply done by setting the ‘mingradient’-option to nan. I use a smoothing parameter value K=5 and set tau=0.99 which forces the profile to run along the 99th percentile of the data. You can experiment with different values of K and tau, if you like.

figure zmaxs = crs(S,zmax,'mingradient',nan,'K',5,'tau',0.99,'split',0); plotdz(S,z); hold on plotdzshaded(S,[zmaxs z],'FaceColor',[.6 .6 .6]) hold off

This looks much better and gives an impression on the steepness of the terrain adjacent to the stream network. Let’s finally compare this to a SWATHobj as obtained by the function STREAMobj2SWATHobj.

figure SW = STREAMobj2SWATHobj(S,DEM,'width',4000); plotdz(SW) xlabel('Distance upstream [m]'); ylabel('Elevation [m]');

Our previous results should be similar to the maximum line shown in the SWATHobj derived profile. And I think they are. In fact, the SWATHobj derived profiles also shows some scatter that is likely due to the sharp changes of orientation of the swath centerline. While we can remove some of this scatter by smoothing the SWATHobj centerline, I think that the combination of maplateral and the CRS algorithm provides a convenient approach to sketch along-river swath profiles.

### Steepness derived from smoothed river profiles

Today, I gave a talk in the weekly seminar of Manfred Strecker’s group at our institute. The talk had the title: *Introduction to TopoToolbox and its application in tectonic geomorphology*. Among other, I provided an example of how to calculate and plot the steepness of river networks (see the new function ksn) and how the new smoothing function crs (Schwanghart and Scherler 2017) is helpful here. I am providing the code here for the DEM of the Big Tujunga catchment, but you can easily adapt the code to your own needs.

First, load the DEM and derive flow directions and the stream network. In addition, calculate a flow accumulation grid that we need for calculating ksn.

DEM = GRIDobj('srtm_bigtujunga30m_utm11.tif'); FD = FLOWobj(DEM,'preprocess','carve'); A = flowacc(FD);

We then extract the stream network for a threshold upstream area of 1 sqkm.

S = STREAMobj(FD,'minarea',1e6,'unit','map'); S = klargestconncomps(S,1);

To hydrologically correct and smooth the river profile, we use the function crs:

zs = crs(S,DEM,'K',6,'tau',0.1);

And finally, we plot it all

plotc(S,k); imageschs(DEM,[],'colormap',[1 1 1],'colorbar',false,'ticklabels','nice'); hold on plotc(S,k); colormap(jet); hx = colorbar; hx.Label.String = 'k_{sn}';

Which parameter values should be used to smooth the stream profile? The values depend on the amount of scatter in the profiles, their spatial resolution, and the amount of smoothing. Visual cross-checking of the results is thus advisable. In this lecture, I presented the tool crsapp that let’s you identify a suitable set of parameter values. A screenshot of the tool can be seen below.

**References**

Schwanghart, W., Scherler, D., 2017. Bumps in river profiles: the good, the bad, and the ugly. Earth Surface Dynamics Discussions 1–30. [DOI: 10.5194/esurf-2017-50]

### Better slope-area plots

Plotting upslope area vs. slope in logarithmic axis has been the de-facto standard approach to river-profile analysis. In theory, if a river profile is in a perfect steady state, then this plot should yield a straight line. The TopoToolbox function slopearea calculates these plots. In addition, the function fits a power law S = k*A^(-theta). However, the function has drawbacks, and I want to address one of these in this post.

Slope-area plots require stream gradient, the first derivative of the river profile. If elevation values of the profile have errors, then this uncertainty becomes even more significant in the first derivative. To alleviate this problem, slope is usually averaged in bins. The function slopearea calculates the width of these bins based on upslope area so that bins are equally spaced in log space. However, this approach entails that bin width increases for larger upslope areas. Thus, for larger upslope areas, gradients are averaged over larger stream distances which may hide some of the structures that we wish to detect.

Another approach is to calculate slope averages over river segments of predefined length. While this is not supported by the function slopearea (yet), I’ll show here how to calculate it. The new function labelreach comes in handy here.

Let’s first load a DEM, derive a stream network.

DEM = GRIDobj('srtm_bigtujunga30m_utm11.tif'); FD = FLOWobj(DEM,'preprocess','carve'); S = STREAMobj(FD,'minarea',1000); S = removeshortstreams(S,1000); DEM = imposemin(S,DEM); A = flowacc(FD);

Then we calculate node-attribute lists for upslope area and gradient.

a = getnal(S,A)*(DEM.cellsize^2); g = gradient(S,DEM);

The function labelreach calculates a label for each reach in a STREAMobj. The option ‘seglength’ allows us to subdivide the labels into reaches of ~2000 m length. Note, however, that the function separates reaches at confluences so that segments may not always be exactly 2000 m.

label = labelreach(S,'seglength',2000);

The function accumarray is one of my favorite functions. Here we use it to aggregate the values in the node-attribute lists g and a based on label. As an aggregation function, we use the average, while we express errors as the standard error.

gg = accumarray(label,g,[],@mean); ggs = accumarray(label,g,[],@(x) std(x)/sqrt(numel(x))); ag = accumarray(label,a,[],@mean); ags = accumarray(label,a,[],@(x) std(x)/sqrt(numel(x)));

Finally, let’s plot the results.

plot([ag ag]',[gg+ggs max(gg-ggs,0.0001)]','color',[.7 .7 .7]); hold on plot([ag-ags ag+ags]',[gg gg]','color',[.7 .7 .7]); plot(ag,gg,'o','MarkerFaceColor',[.7 .7 .7]) hold off set(gca,'Xscale','log','Yscale','log'); xlabel('Area [m^3]') ylabel('Gradient [m m^{-1}]') ylim([1e-3 0.6])

Now that looks much better than area-binned slope-area plots. Give it a try…

### Chimaps in a few lines of code (5)

Having prepared a stream network and equipped with a reasonable value of the m/n ratio, we are now ready to plot a chimap that visualizes the planform patterns of chi. The main interest in these maps lies in chi values near catchment divides as large differences between adjacent catchment would indicate a transient behavior of drainage basin reorganization.

Some of you might have already experimented with TopoToolbox to create chimaps. Perhaps you became exasperated with the function chiplot that is restricted to calculations with only one drainage basin and has a bewildering structure array as output. The reason for the confusing output of chiplot is that it is fairly old. At this time, I hadn’t implemented node-attribute lists that are now more common with STREAMobj methods.

Realizing this shortcoming of chiplot, I wrote the function chitransform. chitransform is what I’d refer to as a low-level function that solves the chi-equation using upstream cumulative trapezoidal integration (see the function cumtrapz). chitransform requires a STREAMobj and a flow accumulation grid as input and optionally a mn-ratio (default is 0.45) and a reference area (default is 1 sqkm). It returns a node-attribute list, i.e., a vector with chi values for each node in the STREAMobj. Node-attribute lists are intrinsically tied to the STREAMobj from which they were derived. Yet, they can be used together with several other TopoToolbox functions to produce output.

Ok, let’s derive chi values for our stream network:

A = flowacc(FD); % calculate flow accumulation c = chitransform(S,A,'mn',0.4776);

In the next step, we will plot a color stream network on a grayscale hillshade:

imageschs(DEM,[],'colormap',[1 1 1],'colorbar',false,'ticklabel','nice'); hold on plotc(S,c) colormap(jet) colorbar hold off

Interestingly, there seem to be some locations with quite some differences in chi values on either side of the divide. “Victims” seem to be rather elongated catchments draining northwest. Let’s zoom into one of these locations.

Are these significant differences? Well, it seems by just looking at the range of values. However, to my knowledge no approach exists that provides a more objective way of evaluating the significance of contrasting chi values and their implications about rates of divide migration. Still, we now have a nice map that can aid our geomorphic assessment together with the tectonic and geodynamic interpretation of the Mendocino Triple Junction.

Unfortunately, I must leave the discussion to you since I am quite unfamiliar with the region. If anyone wants to share his or her interpretation, I’d be more than happy to provide space here. So far, I hope that these few posts on chimaps were useful to you and sufficiently informative to enable you to compute chimaps by yourself. In my next post, I will give a short summary and show with another example that eventually chimaps can be derived really in a few lines of code.

### Chimaps in a few lines of code (4)

My last post described the math behind chi analysis. By using the integral approach to the stream power model of incision, we derived an equation that allows us to model the longitudinal river profile as a function of chi. At the end of the last post I stated that this model is a straight line if we choose the right m/n ratio, that is the concavity index. Today, I’ll show how we can obtain this m/n ratio by means of nonlinear optimization using the function chiplot.

Ideally, we find the m/n ratio using a perfectly graded stream profile that is in steady state. I scanned through a number of streams using the GUI flowpathapp and found a nice one that might correspond to Cooskie Creek that Perron and Royden (2013) used in their analysis.

flowpathapp(FD,DEM,S)

The menu *Export > Export to workspace* allows saving the extracted STREAMobj St to the workspace. Then, I use the function chiplot to see how different values of the m/n ratio affect the transformation of the river profile. Setting the ‘mnplot’ option makes chiplot return a figure with chi-transformed profiles for m/n ratios ranging from 0.1 to 0.9 in steps of 0.1 width.

chiplot(S,DEM,A,'mnplot',true,'plot',false)

Clearly, the chi-transformed profile varies from concave upward for low m/n values to convex for high m/n values. A value of 0.4-0.5 seems to be most appropriate but choosing a value would be somewhat hand-wavy. Thus, let’s call the function again, but this time without any additional arguments.

C = chiplot(S,DEM,A);

Calling the function this way will prompt it to find an optimal value of m/n. In addition, it will return a figure with the linearized profile. By default, the function uses the optimization function fminsearch that will vary m/n until it finds a value that maximizes the linear correlation coefficient between the chi-transformed profile and a straight line. So what is that value? Let’s look at the output variable C:

C = mn: 0.4776 mnci: [] beta: 0.1565 betase: 2.2813e-04 a0: 1000000 ks: 114.8828 R2: 0.9988 x_nal: [581x1 double] y_nal: [581x1 double] chi_nal: [581x1 double] d_nal: [581x1 double] x: [582x1 double] y: [582x1 double] chi: [582x1 double] elev: [582x1 double] elevbl: [582x1 double] distance: [582x1 double] pred: [582x1 double] area: [582x1 double]

C is a structure array with a large number of fields. C.mn is the value we are interested in. It is 0.4776 and corresponds nicely to previously reported m/n ratios although it differs from the value 0.36 found by Perron and Royden (2013). Of course, this value might differ from river to river and you should repeat the analysis for other reaches to obtain an idea about the variability of m/n.

Ok, we have the m/n ratio now. In my next blog we will eventually produce the chimap that we initially set out for.

P.S.: We could have also used slope-area plots to derive the m/n ratio (see the function slopearea). However, along-river gradients are usually much more noisy than the profiles and power-law fitting a delicate matter. I’d definitely prefer chi-analysis over slope-area analysis.