20. The LSDCosmoBasin toolkit

The LSDCosmoBasin toolkit is designed to automate the calculation of basinwide erosion rates estimated from the concentration of in situ cosmogenic nuclides. Currently 10Be and 26Al are supported.

The toolkit requires:

  • Data on cosmogenic samples.
  • A file containing filenames of the topographic data, and optional filenames for shielding rasters.
  • A parameter file.
The toolkit then produces:
  • A csv file that contains results of the analysis.
  • A text file that can be copied into the CRONUS online calculator for data comparison.

20.1. Getting your raster into the correct format

You will need to convert the data you have downloaded into a format LSDTopoToolbox can understand.

One of the things you will need is a raster or rasters containing topographic data for the basins from which sediment was sampled for cosmogenic analysis. You can download topographic data from a variety of sources (see Data Sources), and this data will come in a variety of formats and with a variety of projections.

For our purposes, you will need a georeferenced raster that is in a projected coordinate system. We recommend using a UTM projection with the WGS 1984 datum. You can read more about projections and coordinate systems here: Converting data from ArcMap and here: Data preparation 2: Notes for using GDAL to manipulate files for LSDTopoToolbox.

The format you MUST convert your data to is the ENVI bil format: this is a binary format (so that it saves space) that has georeferencing (so the cosmogenic program can determine the correct latitude and longitude of your DEM).

The best way to get rasters into ENVI bil format is with GDAL (read this: Translating your raster into something that can be used by LSDTopoToolbox).

If you are on a Windows computer you can get GDAL through the open source GIS QGIS (see Converting_data_in_QGIS).
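As a minimal sketch, the GDAL conversion to ENVI bil could be scripted from Python. The filenames below are hypothetical, and gdal_translate must be installed on the PATH for the command to actually run:

```python
# Sketch: build the gdal_translate command that converts a raster to ENVI bil.
# The INTERLEAVE=BIL creation option asks the ENVI driver for band-interleaved-
# by-line layout; "Site01_DEM.tif" and "Site01_DEM" are placeholder names.
import subprocess

def envi_bil_command(src, dst_prefix):
    """Return the gdal_translate argument list for ENVI bil output."""
    return ["gdal_translate", "-of", "ENVI",
            "-co", "INTERLEAVE=BIL",
            src, dst_prefix + ".bil"]

cmd = envi_bil_command("Site01_DEM.tif", "Site01_DEM")
# To actually run it (requires GDAL to be installed):
# subprocess.run(cmd, check=True)
```

Note the destination name keeps the prefix the toolkit will later expect in the _CRNRasters.csv file (the path without the .bil extension).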

20.2. Compiling the cosmogenic program

The cosmogenic program is distributed as source code, which you will need to compile. We use the g++ compiler, which is available on virtually all Linux distributions.

If you are on Windows, you can get the g++ compiler by installing cygwin. You will have to go into the devel menu in cygwin to get g++, make and gdb when you install.

When you download the code, you will get a directory with the objects (their names start with LSD).

Within the directory with the objects, there will be two additional directories:

  • driver_functions
  • TNT

The TNT folder contains routines for linear algebra; it is made by NIST. You can happily do everything in LSDTopoToolbox without ever knowing anything about it, but if you want you can read what it is here: TNT homepage.

  1. First, go into the driver_functions folder

  2. Next, type:

    make -f Basinwide_CRN.make

    in a terminal.

  3. You will get a series of messages, and at the end your code will have compiled. If compilation was successful, the final message should look like:

    g++ -Wall -O3 Basinwide_cosmogenic_analysis.o
    ../LSDMostLikelyPartitionsFinder.o ../LSDChiNetwork.o ../LSDIndexRaster.o
    ../LSDRaster.o ../LSDShapeTools.o ../LSDFlowInfo.o ../LSDJunctionNetwork.o
    ../LSDIndexChannel.o ../LSDChannel.o ../LSDIndexChannelTree.o
    ../LSDStatsTools.o ../LSDBasin.o ../LSDParticle.o ../LSDCRNParameters.o
    ../LSDCosmoData.o ../LSDRasterInfo.o -o Basinwide_CRN.exe

20.3. Setting up your data directories

Before you can run the code, you need to set up some data structures.

  1. You can keep your topographic data separate from your cosmogenic data, if you so desire. You’ll need to know the directory paths to these data.

  2. In a single folder (again, it can be separate from topographic data), you must put i) a parameter file, ii) a cosmogenic data file, and iii) a raster filenames file.

  3. These three files must have the same prefix, but each has its own extension.

    • The parameter file has the extension: .CRNParam.
    • The cosmogenic data file has the extension _CRNData.csv.
    • The raster filenames file has the extension _CRNRasters.csv.
  4. For example, if the prefix of your files is Chile_data_analysis, then your three data files will be Chile_data_analysis.CRNParam, Chile_data_analysis_CRNData.csv, and Chile_data_analysis_CRNRasters.csv.

  5. If the files do not have these naming conventions, the code WILL NOT WORK! Make sure you have named your files properly.
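Since the code will not run with misnamed files, it can be worth checking the naming convention up front. A small sketch (the helper functions here are our own, not part of the toolkit):

```python
# Sketch: verify the three-file naming convention before running the analysis.
# The prefix "Chile_data_analysis" follows the example above.
import os

def expected_files(prefix):
    """Return the three filenames the toolkit expects for a given prefix."""
    return [prefix + ".CRNParam",
            prefix + "_CRNData.csv",
            prefix + "_CRNRasters.csv"]

def missing_files(directory, prefix):
    """Return the list of expected files not found in directory."""
    return [f for f in expected_files(prefix)
            if not os.path.isfile(os.path.join(directory, f))]

names = expected_files("Chile_data_analysis")
```

Running missing_files on your data folder before launching the program catches naming mistakes early.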

20.3.1. The parameter file

This must have the extension .CRNParam.

The parameter file could be empty, in which case parameters will just take default values. However, you may set various parameters. The format of the file is:

parameter_name: parameter_value

So for example a parameter file might look like:

min_slope: 0.0001
source_threshold: 12
search_radius_nodes: 1
threshold_stream_order: 1
theta_step: 30
phi_step: 30
Muon_scaling: Braucher
write_toposhield_raster: true
write_basin_index_raster: true

In fact, the list above includes all of the available parameters, shown with their default values. The parameter names are not case sensitive. The parameter values are case sensitive. These parameters are as follows:

  • min_slope: The minimum slope between pixels used in the filling function.
  • source_threshold: The number of pixels that must drain into a pixel to form a channel. NOTE this makes little difference, as the channel network only plays a role in setting channel pixels to which cosmo samples will snap. This merely needs to be set to a low enough value that ensures there are channels associated with each cosmogenic sample.
  • search_radius_nodes: The number of pixels around the cosmogenic sample location to search for a channel. The appropriate setting will depend on the difference between the accuracy of the GPS used to collect sample locations and the resolution of the DEM. If you are using a 30 or 90 m DEM, 1 pixel should be sufficient. More should be used for LiDAR data.
  • threshold_stream_order: The minimum stream order at which the sample snapping routine considers a channel to be a ‘true’ channel.
  • theta_step: Used in toposhielding calculations. This is the step of azimuth (in degrees) over which shielding and shadowing calculations are performed. Codilean (2005) recommends 5, but results differ little with a step of 15. This should be an integer that divides evenly into 360 (if it does not, the code will force it to the closest appropriate integer).
  • phi_step: Used in toposhielding calculations. This is the step of inclination (in degrees) over which shielding and shadowing calculations are performed. Codilean (2005) recommends 5, but results differ little with a step of 10. This should be an integer that divides evenly into 360 (if it does not, the code will force it to the closest appropriate integer).
  • Muon_scaling: The scaling scheme for muons. Options are “Braucher”, “Schaller” and “Granger”. If you give the parameter file anything other than one of these, it will default to Braucher scaling. These scalings take values reported in COSMOCALC as described by Vermeesch 2007.
  • write_toposhield_raster: If true, this writes a toposhielding raster if one does not already exist. This saves time on subsequent runs but will take up some space on your hard disk!
  • write_basin_index_raster: For each DEM this writes an LSDIndexRaster to file with the extension _BASINS, in which each of the basins found for CRN analysis is listed by basinID.
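A sketch of how such a "name: value" parameter file could be parsed, assuming the defaults listed above (this reader is illustrative, not the toolkit's own code):

```python
# Sketch: read a .CRNParam file. Parameter names are case insensitive;
# values keep their case; missing parameters fall back to the defaults above.
DEFAULTS = {
    "min_slope": 0.0001,
    "source_threshold": 12,
    "search_radius_nodes": 1,
    "threshold_stream_order": 1,
    "theta_step": 30,
    "phi_step": 30,
    "muon_scaling": "Braucher",
    "write_toposhield_raster": True,
    "write_basin_index_raster": True,
}

def read_crnparam(lines):
    params = dict(DEFAULTS)
    for line in lines:
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        name = name.strip().lower()   # names are not case sensitive
        value = value.strip()         # values are case sensitive (e.g. "Braucher")
        if name in params:
            default = params[name]
            if isinstance(default, bool):          # bool before int: bool is an int subclass
                params[name] = value.lower() == "true"
            elif isinstance(default, (int, float)):
                params[name] = type(default)(value)
            else:
                params[name] = value
    return params

params = read_crnparam(["Muon_scaling: Schaller", "theta_step: 15"])
```

Note how an empty file simply returns the defaults, matching the behaviour described above.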

20.3.2. The cosmogenic data file

This must have the extension _CRNData.csv.

This is a .csv file: that is, a comma-separated value file. It is in this format to be both Excel and pandas friendly.

The first row is a header that names the columns; after that there should be 7 columns (separated by commas) and unlimited rows. The seven columns are:

sample_name, sample_latitude, sample_longitude, nuclide, concentration, AMS_uncertainty, standardisation
  • The sample name should not start with a number or have spaces.

  • The latitude and longitude should be in decimal degrees. Negative latitude indicates southern hemisphere.

  • Nuclide can be either “Be10” or “Al26”. Any other option will be rejected.

  • Concentration is in atoms/gram

  • AMS uncertainty is also in atoms/gram

  • Standardisation is the name of the standard used in the AMS measurements. This is not always so easy to find in published papers!! The defaults are “07KNSTD” for 10Be and “KNSTD” for 26Al. These seem to be used by many people after 2007, when Kuni Nishiizumi made them available (or at least that is when he published the paper). If the samples are from before 2007 and you don’t know the standard used, you should use “KNSTD” for 10Be and 26Al. There are many more standards floating around, but the Nishiizumi ones seem the most widely used. The options are (take a deep breath), for 10Be:

    "07KNSTD", "KNSTD", "NIST_Certified", "LLNL31000", "LLNL10000", "LLNL3000", "LLNL1000"
    "LLNL300", "NIST_30000", "NIST_30200", "NIST_30300", "NIST_30600", "NIST_27900"
    "S555","S2007", "BEST433", "BEST433N", "S555N", "S2007N"
    

    And for 26Al:

    "KNSTD", "ZAL94", "SMAL11", "0", "ZAL94N", "ASTER", "Z92-0222"
    

An example file would look like this:

Sample_name,Latitude,Longitude,Nuclide,Concentration,Uncertainty,Standardisation
LC07_01,-32.986389,-71.4225,Be10,100000,2500,07KNSTD
LC07_04,-32.983528,-71.415556,Be10,150000,2300,07KNSTD
LC07_06,-32.983028,-71.415833,Al26,4000,2100,KNSTD
LC07_08,-32.941333,-71.426583,Be10,30000,1500,07KNSTD
LC07_10,-33.010139,-71.435389,Be10,140000,25000,07KNSTD
LC07_11,-31.122417,-71.576194,Be10,120502,2500,07KNSTD
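The column rules above can be sketched as a row validator (an illustration of the constraints, not the toolkit's own checks; the function name is our own):

```python
# Sketch: check one _CRNData.csv data row against the rules described above:
# 7 columns; sample names without leading digits or spaces; latitude in
# decimal degrees; nuclide Be10 or Al26; numeric concentration and uncertainty.
VALID_NUCLIDES = {"Be10", "Al26"}

def validate_crndata_row(row):
    """Return a list of problems found in one row (a list of 7 strings)."""
    problems = []
    if len(row) != 7:
        problems.append("expected 7 columns, got %d" % len(row))
        return problems
    name, lat, lon, nuclide, conc, uncert, standard = row
    if name[:1].isdigit() or " " in name:
        problems.append("sample name must not start with a number or contain spaces")
    if not -90.0 <= float(lat) <= 90.0:
        problems.append("latitude out of range")
    if nuclide not in VALID_NUCLIDES:
        problems.append("nuclide must be Be10 or Al26")
    for value in (conc, uncert):
        float(value)  # raises ValueError if not numeric (atoms/gram)
    return problems

row = "LC07_01,-32.986389,-71.4225,Be10,100000,2500,07KNSTD".split(",")
issues = validate_crndata_row(row)
```

The example row is the first data line of the sample file above and passes with no issues.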

20.3.3. The raster names file

This must have the extension _CRNRasters.csv.

This file is a csv file that has as many rows as you have rasters that cover your CRN data. Each row can contain between 1 and 4 columns.

  • The first column is the FULL path name of the elevation raster and its prefix (that is, without the .bil extension), e.g.:

    /home/smudd/basin_data/Chile/CRN_basins/Site01/Site_lat26p0_UTM19_DEM
  • The next column is either a full path name to a snow shielding raster or a snow shielding effective depth. Both the raster and the single value should have units of g/cm^2 snow depth. If there is no number here the default is 0.

  • The next column is either a full path name to a self shielding raster or a self shielding effective depth. Both the raster and the single value should have units of g/cm^2 shielding depth. If there is no number here the default is 0.

  • The next column is the FULL path to a toposhielding raster. If this is blank the code will run topographic shielding for you. Note: topographic shielding is the rate limiting step in the cosmo analysis.

A typical file might look like this:

/home/smudd/basin_data/Site01/Site01_DEM,0,0,/home/smudd/basin_data/Site01/Site01_DEM_TopoShield
/home/smudd/basin_data/Site02/Site02_DEM,5,10
/home/smudd/basin_data/Site03/Site03_DEM,5,/home/smudd/basin_data/Site03/Site03_DEM_SelfShield
/home/smudd/basin_data/Site04/Site04_DEM,/home/smudd/basin_data/Site04/Site04_DEM_SnowShield,/home/smudd/basin_data/Site04/Site04_DEM_SelfShield
/home/smudd/basin_data/Site05/Site05_DEM
/home/smudd/basin_data/Site06/Site06_DEM,/home/smudd/basin_data/Site06/Site06_DEM_SnowShield
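One way to sketch how a row of this file could be interpreted (illustrative only; the actual parsing happens inside the C++ code, and the paths are the hypothetical ones from the example above):

```python
# Sketch: interpret one _CRNRasters.csv row. Columns 2 and 3 are either a
# numeric effective depth (g/cm^2) or a path to a shielding raster; omitted
# columns default to 0. A missing fourth column means "compute toposhielding".
def parse_raster_row(line):
    cols = [c.strip() for c in line.split(",")]

    def depth_or_path(value):
        try:
            return float(value)   # a plain number: effective shielding depth
        except ValueError:
            return value          # otherwise: path to a shielding raster

    return {
        "dem": cols[0],
        "snow_shielding": depth_or_path(cols[1]) if len(cols) > 1 else 0.0,
        "self_shielding": depth_or_path(cols[2]) if len(cols) > 2 else 0.0,
        "topo_shielding": cols[3] if len(cols) > 3 else None,  # None: compute it
    }

row = parse_raster_row("/home/smudd/basin_data/Site02/Site02_DEM,5,10")
```

Here Site02 gets a snow shielding depth of 5 g/cm^2, a self shielding depth of 10 g/cm^2, and no toposhielding raster, so the code would compute topographic shielding itself.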

20.4. Calculating topographic shielding rasters

In most cases, you will not have topographic shielding rasters available, and will need to calculate these.

Shielding calculations are computationally intensive, much more so than the actual erosion rate computations. Because of the computational expense of shielding calculations, we have prepared a series of tools for speeding up this computation.

The topographic shielding routines take the rasters listed in the _CRNRasters.csv file and the samples in the _CRNData.csv file and compute the locations of all CRN basins. The code then clips a DEM around each basin (with a pixel buffer set by the user). These clipped basins are then used for both the shielding calculations and the erosion rate calculations.

This process of clipping out each basin spawns a large number of new DEMs that require a new directory structure. A python script is provided to set up this directory structure and organise the new rasters. This process uses a large amount of hard disk storage because a new DEM is written for each CRN basin.

20.4.1. Steps for preparing the rasters for shielding calculations

  1. First, place the _CRNRasters.csv and _CRNData.csv file into the same folder, and make sure the _CRNRasters.csv file points to the directories that contain the topographic data.

  2. Second, run the python script PrepareDirectoriesForBasinSpawn.py.

    • You can clone this script from GitHub; find it here: https://github.com/LSDtopotools/LSDAutomation
    • You will need to scroll to the bottom of the script and change the path (which is simply the directory path of the _CRNRasters.csv file).
    • You will need to scroll to the bottom of the script and change the prefix (which is simply the prefix of the _CRNRasters.csv file; that is, the filename before _CRNRasters.csv, so if the filename is YoYoMa_CRNRasters.csv then the prefix is YoYoMa). Note this is case sensitive.
    • This python script does a number of subtle things, such as checking directory paths, and then makes a new folder for each DEM. Each folder will contain all of the CRN basins located on the source DEM.
  3. Now you will need to run a C++ program that spawns the small rasters to be used in the shielding calculations. First you have to compile this program.

  4. To compile, navigate to the folder /driver_functions_CRNTools/ and type:

    make -f Spawn_DEMs_for_CRN.make
  5. The program will then compile (you may get some warnings; ignore them).

  6. In the /driver_functions_CRNTools/ folder, you will now have a program Spawn_DEMs_for_CRN.exe. You need to give this program two arguments.

  7. You need to give Spawn_DEMs_for_CRN.exe the path to the data files (i.e., _CRNRasters.csv and _CRNData.csv) and the prefix (so if the files are called YoMa_CRNRasters.csv the prefix is YoMa). Run this with:

    Spawn_DEMs_for_CRN.exe PATHNAME DATAPREFIX

    in Windows or:

    ./Spawn_DEMs_for_CRN.exe PATHNAME DATAPREFIX

    in Linux.

  8. Once this program has run, you should have subfolders containing small DEMs that contain the basins to be analysed.

  9. You will also have files that contain the same PATHNAME and PREFIX but have _Spawned added to the prefix. For example, if your original prefix was CRN_test, the new prefix will be CRN_test_Spawned.

  10. In the file PREFIX_Spawned_CRNRasters.csv you will find the paths and prefixes of all the spawned basins.

20.4.2. Shielding computation

The shielding computation is the most computationally expensive step of the CRN data analysis. Once you have spawned the basins (see above section), you will need to run the shielding calculations.

  1. You will first need to compile the program that calculates shielding. This can be compiled with:

    make -f Shielding_for_CRN.make
  2. The compiled program (Shielding_for_CRN.exe) takes two arguments: the PATHNAME and the PREFIX.

  3. You could simply run this on a single CPU after spawning the basins; for example if the original data had the prefix CRN_test before spawning, you could run the program with:

    ./Shielding_for_CRN.exe PATHNAME CRN_test_Spawned

    where PATHNAME is the path to your _CRNRasters.csv, _CRNData.csv, and .CRNParam (these files need to be in the same path).

  4. However, if you spawned a lot of basins, you might want to run the shielding on multiple CPUs. We provide a python script for running this using an embarrassingly parallel approach.

  5. To set the system up for embarrassingly parallel runs, you need to run the python script ManageShieldingComputation.py, which can be found here: https://github.com/LSDtopotools/LSDAutomation.

  6. In ManageShieldingComputation.py, navigate to the bottom of the script, and enter the path, prefix, and NJobs. NJobs is the number of jobs into which you want to break up the shielding computation.

  7. Once you run this computation, you will get files with the extension _bkNN where NN is a job number.

  8. In addition a text file is generated, with the extension _ShieldCommandPrompt.txt, and from this you can copy and paste job commands into a Linux terminal. These commands are designed for the GeoSciences cluster at the University of Edinburgh: if you use qsub you will need to write your own script.

  9. Note that the parameters for the shielding calculation are in the .CRNParam files. We recommend:

    theta_step: 8
    phi_step: 5

    These are based on extensive sensitivity analyses and balance computational speed with accuracy. Errors will be << 1% even in landscapes with extremely high relief. Our forthcoming paper has details on this.

  10. Again, these computations take a long time. Don’t start them a few days before your conference presentation!!

  11. Once the computations are finished, there will be a shielding raster for every spawned basin raster. In addition, the _CRNRasters.csv file will be updated to reflect the new shielding rasters so that the updated parameter files can be fed directly into the erosion rate calculators.
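The NJobs split in step 6 can be sketched as an embarrassingly parallel partition of the spawned basins (an illustration of the idea, not necessarily the exact chunking used by ManageShieldingComputation.py; the basin names are placeholders):

```python
# Sketch: divide N spawned basins into NJobs contiguous chunks whose sizes
# differ by at most 1, so each _bkNN job file gets a roughly equal share.
def split_into_jobs(items, njobs):
    base, extra = divmod(len(items), njobs)
    chunks, start = [], 0
    for i in range(njobs):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks

basins = ["basin_%02d" % i for i in range(10)]
jobs = split_into_jobs(basins, 3)
```

With 10 basins and NJobs = 3, the jobs receive 4, 3 and 3 basins, and each job can run its shielding calculations independently of the others.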

20.5. Running the driver function

Okay, now you are ready to run the driver function. You’ll need to run the function from the directory where the compiled code (Basinwide_CRN.exe) is located, but it can work on data in some arbitrary location.

  1. You run the code with the command ./Basinwide_CRN.exe pathname_of_data file_prefix
  • The pathname_of_data is just the path to where your data is stored, so for example I put some data into a folder with the path /home/smudd/basin_data/Chile/CRN_data_analysis/. IMPORTANT You MUST remember to put a / at the end of your pathname.
  • The file_prefix is the PREFIX of the files you need for the analysis (that is, the filename without the extension).

20.6. The output files

There are two output files. Both of these files will end up in the pathname that you designated when calling the program.

The first is called file_prefix_CRNResults.csv and the second is called file_prefix_CRONUSInput.txt where file_prefix is the prefix you gave when you called the program.

So, for example, if you called the program with:

./Basinwide_CRN.exe /home/smudd/basin_data/Chile/CRN_data_analysis/ My_Chile_Data

The outfiles will be called:

My_Chile_Data_CRNResults.csv
My_Chile_Data_CRONUSInput.txt

The _CRONUSInput.txt is formatted to be cut and pasted directly into the CRONUS calculator. The file has some notes (which are printed at the top of the file):

->IMPORTANT nuclide concentrations are not original!
  They are scaled to the 07KNSTD!!
->Scaling is averaged over the basin for snow, self and topographic shielding.
->Snow and self shielding are considered by neutron spallation only.
->Pressure is an effective pressure that reproduces Stone scaled production
  that is calculated on a pixel by pixel basis.
->Self shielding is embedded in the shielding calculation and so
  sample thickness is set to 0.

The _CRNResults.csv is rather long. It contains the following data in comma separated columns:

basinID     (A unique identifier for each CRN sample)
sample_name
nuclide     (either 10Be or 26Al)
latitude
longitude
concentration (atoms/g)
concentration_uncert (atoms/g)
erosion rate g_percm2_peryr
erosion rate AMS_uncert g_percm2_peryr
muon_uncert g_percm2_peryr
production_uncert g_percm2_peryr
total_uncert g_percm2_peryr
AvgProdScaling dimensionless
AverageTopoShielding dimensionless
AverageSelfShielding dimensionless
AverageSnowShielding dimensionless
AverageCombinedScaling dimensionless (this is averaged production scaling times toposhielding)
outlet_latitude
OutletPressure hPa
OutletEffPressure hPa (pressure needed to get basin averaged production scaling)
centroid_latitude
CentroidPressure hPa
CentroidEffPressure (pressure needed to get basin averaged production scaling)
ErosionRate_COSMOCALC_in_g_percm2_peryr (assumes 2650 kg/m^3): The erosion
    rate you would get if you took production weighted scaling and used
    cosmocalc.
ErosionRate_COSMOCALC_mmperkyr (assumes 2650 kg/m^3): The erosion
    rate you would get if you took production weighted scaling and used
    cosmocalc.
ErosionRate_in_mmperkyr (to check against cosmocalc, assumes 2650 kg/m^3)
ErosionRate_totaluncertainty_in_mmperkyr
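The relationship between the g/cm^2/yr and mm/kyr columns is a simple density conversion. A sketch, using the assumed rock density of 2650 kg/m^3 (2.65 g/cm^3):

```python
# Sketch: convert a mass-flux erosion rate (g/cm^2/yr) to a surface lowering
# rate (mm/kyr) using the assumed rock density of 2.65 g/cm^3.
RHO = 2.65  # g/cm^3 (2650 kg/m^3)

def g_percm2_peryr_to_mm_perkyr(rate):
    cm_per_yr = rate / RHO              # divide by density: mass flux -> cm/yr
    return cm_per_yr * 10.0 * 1000.0    # cm/yr -> mm/yr -> mm/kyr

rate_mmkyr = g_percm2_peryr_to_mm_perkyr(0.0265)  # 0.0265 g/cm^2/yr -> 100 mm/kyr
```

This lets you cross-check the _in_mmperkyr columns against the g_percm2_peryr columns in the results file.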