Over the past few years, I’ve played around with developing ordered river networks for different projects. I am not an expert in hydrology, but I can get close for cartographic purposes. I am, however, an expert in asking for help from those who know best, and I rely on a lot of very smart people to guide me on my journey.
Recently, I decided to put together a visualization of ordered rivers for New Zealand. I came across a very nice data set offered through the Ministry for the Environment via the Koordinates website and thought I’d like to put it to use.
The river vis project made me wonder if I could build this base dataset myself using some of the recently released elevation data sets via the LINZ Data Service. The short answer to my question is “sorta”. Doing it open source is not an issue, but building an accurate ordered river centerline network is another story. This is a task I cannot take on as a solo project right now, but I could do a little experimentation. Below, I’ll offer some of the methods I tried and things I learned along the way.
Tools and Data
The method I tested used TauDEM and a 1m DEM raster accessed from the LINZ Data Service. I downsampled the DEM to 2m and 5m resolutions and used small areas for testing. Finding an open source tool was easy. I sorted through a few available methods and finally landed on “Terrain Analysis Using Digital Elevation Models” (TauDEM). There are additional methods available through GRASS and SAGA GIS. I chose TauDEM because I had never used it before.
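As a concrete sketch of the down-sampling step, gdal_translate can do the resampling on its own; the choice of the average resampling method and the file names here are my assumptions, not something the workflow below prescribes.

#Downsample a 1m DEM to 5m; -r average is an assumed resampling choice
gdal_translate -tr 5 5 -r average -of GTiff dem_1m.tif dem_5m.tif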
Method Tested
To my knowledge, there is no open source tool where a person can put in a DEM and get a networked river centerline vector out the other side. It takes a number of steps to achieve that goal.
The basic rundown for processing the DEM is to:
- Fill sinks
- Determine flow directions
- Determine watersheds
- Determine flow accumulation
- Classify streams
- Export to vector
TauDEM does require a few extra steps to complete the process, but these steps are explained in the documentation of the tool. It was more about keeping all my variables in the right places and using them at the right time. I recommend using the variable names TauDEM provides.
The full Bash script is below:
#!/bin/bash
#Rough sketch for building river centerlines. Rasters have been clipped prior to running this script.
BASEPATH=/dir/path/to/base
raster_list=$( find $BASEPATH -name "*.tif" )
taudem_outputs=/dir/path/to/outputs
reso=resolution_number
for i in $raster_list
do
INPUT_RASTER=$i
file_name=$( basename $i )
strip_input_extension=$( echo $file_name | sed 's/\.tif$//' )
reso_name=$taudem_outputs/${strip_input_extension}_${reso}res
gdal_translate -tr $reso $reso -of GTiff $i $reso_name.tif
fel=${reso_name}_fel.tif
p=${reso_name}_p.tif
sd8=${reso_name}_sd8.tif
ad8=${reso_name}_ad8.tif
ang=${reso_name}_ang.tif
slp=${reso_name}_slp.tif
sca=${reso_name}_sca.tif
sa=${reso_name}_sa.tif
ssa=${reso_name}_ssa.tif
src=${reso_name}_src.tif
ord=${reso_name}_strahlerorder.tif
tree=${reso_name}_tree.dat
coord=${reso_name}_coord.dat
net=${reso_name}_network.shp
w=${reso_name}_watershed.tif
processed_input_file=$reso_name.tif
#TauDEM Commands
#Fill sinks in the DEM
mpiexec -n 8 pitremove -z $processed_input_file -fel $fel
#D8 flow directions and slopes
mpiexec -n 8 d8flowdir -fel $fel -p $p -sd8 $sd8
#D8 contributing area (flow accumulation)
mpiexec -n 8 aread8 -p $p -ad8 $ad8 -nc
#D-infinity flow directions and slopes
mpiexec -n 8 dinfflowdir -fel $fel -ang $ang -slp $slp
#D-infinity contributing area
mpiexec -n 8 areadinf -ang $ang -sca $sca -nc
#Slope-area product used to identify channel sources
mpiexec -n 8 slopearea -slp $slp -sca $sca -sa $sa
#Maximum upslope slope-area value along D8 flow paths
mpiexec -n 8 d8flowpathextremeup -p $p -sa $sa -ssa $ssa -nc
#Threshold to a stream raster
mpiexec -n 8 threshold -ssa $ssa -src $src
#Build the stream network with Strahler orders, watersheds, and the vector output
mpiexec -n 8 streamnet -fel $fel -p $p -ad8 $ad8 -src $src -ord $ord -tree $tree -coord $coord -net $net -w $w
done
The script is a rough sketch, but it does get results.
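A quick way to sanity check each run is to look at the attribute summary of the exported network shapefile with ogrinfo; the path below is just the $net placeholder from the script.

#Summarize feature count and fields (including stream order) of the exported network
ogrinfo -so -al ${reso_name}_network.shp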
Challenges in the Process
One major challenge for this project was the size of the input DEM versus my computer’s available RAM. I work primarily off a laptop. It’s a good machine, but no match for a proper server setup with some spacious RAM. My laptop struggled with the large high-resolution DEMs, so I needed to downsample the images and choose a smaller test area to get it to work.
Clip the TIFF with gdal_translate -projwin and downsample with -tr:
gdal_translate -tr xres yres -projwin ulx uly lrx lry input.tif output.tif
The second challenge came up because I used a bounding box to clip my test regions. I recommend not doing this; instead, clip your regions using a watershed boundary. Square test regions will give you inaccurate and unhelpful results, because major channels in your DEM will be cut off at the edges of the raster.
Clipping a raster using a shapefile, like a watershed boundary, can be achieved using gdalwarp.
gdalwarp -cutline input.shp input.tif output.tif
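A slightly fuller sketch, assuming the watershed boundary lives in a file called watershed.shp: -crop_to_cutline trims the output extent to the polygon, and the nodata value is my own choice.

gdalwarp -cutline watershed.shp -crop_to_cutline -dstnodata -9999 input.tif output.tif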
Results
I ran my process and QCed the results against aerial imagery and a hillshade I developed from the DEM. The first run gave me good enough results to know I have a lot of work to do, but I did manage to develop a process I was happy with. The tool did a great job; the accuracy of the DEM was a little more challenging. It’s a start. I captured a good number of river channels despite my ill-advised use of a square clipping region, learned a lot about how DEM resolution affects outputs, and gained knowledge about how to spot troublesome artifacts.
Img 1: River capture in a well-defined channel.
From this experiment, there are a few ideas I’d like to explore further:
1. Accuracy of the DEM. The particular DEM I worked with had a number of ‘dams’ in the flows: notably bridges, culverts, vegetation artifacts, and other general errors that caused water to flow in interesting directions. When working with a data set like this, I am curious how to manage these artifacts (one possible approach is sketched after this list).
Img 2: River diversion at a road.
Img 3: River diversion at a culvert or bridge.
2. How to go beyond borders. This analysis can be broken down by watershed, but it will be necessary to link the outflows of one watershed to the next for accurate results.
Img 4: Flow not captured at the raster edge.
3. As DEMs are released at better resolutions, there is a need for scaled-up computing power. The process needs a large amount of RAM. What is the best computational setup for capturing the largest area?
4. Did I do this correctly? I perform this task about once every two years, usually on weekends when the surf is flat and the garden is weeded, so I am not an expert. There is a lot more research to be done to determine if I am using the tools to the best of their abilities.
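For idea 1, one technique I’d like to test (not part of the workflow above) is stream burning: lowering the DEM along known river lines before pit removal so flow is not blocked at bridges and culverts. The sketch below uses gdal_rasterize; the file name, layer name, and 10m burn depth are placeholders, not tested values.

#Subtract a fixed depth from DEM cells that intersect known river lines (edits the raster in place)
gdal_rasterize -burn -10 -add -l known_rivers known_rivers.shp dem_clipped.tif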