Institute for Arctic and Alpine Research,

University of Colorado at Boulder

 
Grid Interpolation of Seabed Data
 


Introduction
The most useful and visually presentable outputs from dbSEABED are gridded data, where map cells of about 1 km² are assigned a parameter value, for instance seabed mud content or average grainsize. Gridded maps are very suitable as illustrations in papers, as input to numerical models, and for draping on 3D surfaces.


Unfortunately, making the grids involves interpolation, a field of spatial data handling that is difficult and involves judgement. There is a bewildering choice of interpolation methods, and the statistical reliabilities are very dependent on the choice of interpolator and on the quality and distribution of the input data. The reliabilities are usually only 80% at best (Cressie 1993; Dubois et al., 1998; Bengio et al., 2004).

Existing interpolators
Many GIS have embedded interpolators, including Inverse Distance Weighting (IDW), Kriging, Polynomial/Spline, Optimal Interpolator, and Natural Neighbour types. Experience shows that in almost every case those 'black box' interpolators give spurious results for seabed mapping, especially in coastal areas. The list of difficulties includes these:
  • Boundaries of offshore sediment zones are badly formed; they are often made to cross obvious environmental zonations (such as water depths) and of course, the coastlines.
  • Where inshore data is scarce (usually the case), properties of the offshore sediments are drawn in too close to the coastlines.
  • Whenever a wide search radius is set to deal with the sparse-data deep-water zones, good detailed information for well-mapped shallow areas is smeared.
  • Global interpolators (the spline, polynomial and trend-surface methods) are particularly bad and make false highs and lows in areas of sparse data.
  • Sediment / rock distributions on the seabed can have sharp boundaries (e.g., Cacchione et al. 1984; imagery in Intelmann et al., 2007). They are not mapped accurately using Kriging, Polynomials, or Optimal Interpolation, which produce continuous-differentiable results more suitable for water properties and potential surfaces like gravity.
  • The errors (uncertainties) calculated by Kriging and Optimal Interpolation are measures of local internal consistency of the data, not a full error analysis involving measurement error, assumptions (e.g., semivariogram model; Tomczak 2003), and other uncertainties.
  • Point data selection is distance based only. Strongly asymmetric results can arise for gridcells lying near data clusters.
In some ways, these shortcomings reflect the fact that most interpolation packages are unalloyed mathematical methods. To address the points above, the mathematical processes need to be modified (i.e., directed and tuned). By doing this, we introduce factors that a human would use to contour data and make a result that fits better with expert knowledge of an area, for instance its environmental zonations. The mathematical underlay is still necessary, though, for rigour and to handle the large data volumes.

Competent Seabed Interpolator (CSI)
To resolve these issues an interpolator has been written for use with seabed data, in particular with dbSEABED datasets. It is called "competent": adequate for the job, fit for purpose; but capable of improvement.
It was written to meet a recurring need for reliable grid generation from dbSEABED. The software is publicly released, open for modification, and can be used for data from other sources. (Readme.txt)

The advantages of CSI include:
  • Efficient to use. Requires an ASCII table of data, a setup file, and 2 template rasters. Available as Fortran code (g95) or a Windows executable. Creates several useful products including an uncertainty grid and a data subset for calibration.
  • The IDW interpolation engine is enhanced. It uses water depth difference (Z; m) and geographic distances (X,Y; km) for weighting. With this the 3-dimensionality of the seafloor is recognized and results trend more with depth zones.
  • The search radius is varied by proximity to land (including islands and reefs), using small radius close inshore, and the maximum for the open ocean.
  • For cells with data the median is embedded (instead of IDW). This allows areas surveyed in detail, sharp seabed discontinuities, and the overall variance to be preserved in outputs.
  • The stock of point data that feeds each cell's result is subsetted (usually to 6), evenly prioritizing the nearest data within each of the 4 quadrants - NSEW. This increases the chance of a result that reflects the most local data that lies evenly around the gridcell. It also decreases ill effects from clustering of input data.
  • A different search radius is used for different parameters, e.g., rock exposures are very localized on the seafloor - small radius (~5km); sand and mud are very dispersed - wide radius (~20km).
  • An uncertainty budget is computed, involving the spatial variabilities, measurement errors of the incoming data, disagreement (variance) between the data within a gridcell, and navigational errors.
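
The features listed above can be illustrated with a minimal Python sketch. It is not the CSI Fortran code, and several details are assumptions made only for illustration: the rule in `search_radius`, the reading of "NSEW" as four compass sectors centred on N, E, S and W, the round-robin selection in `select_by_quadrant`, and the `z_scale` factor that converts a depth difference in metres into an equivalent distance in km. The source does not say how CSI combines the X, Y and Z terms.

```python
import math
from statistics import median

def search_radius(dist_to_land_km, r_min=2.0, r_max=20.0):
    """Hypothetical rule: radius grows with distance from land, capped at r_max."""
    return max(r_min, min(r_max, dist_to_land_km))

def sector(dx, dy):
    """Compass sector of an offset: 0=N, 1=E, 2=S, 3=W (assumed reading of 'NSEW')."""
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return int(((bearing + 45.0) % 360.0) // 90.0)

def select_by_quadrant(cands, n_max=6):
    """Take the nearest point of each sector in turn until n_max are chosen."""
    bins = {0: [], 1: [], 2: [], 3: []}
    for c in sorted(cands, key=lambda c: c["d"]):
        bins[c["sector"]].append(c)
    chosen, rank = [], 0
    while len(chosen) < n_max and any(len(b) > rank for b in bins.values()):
        for s in range(4):
            if len(bins[s]) > rank and len(chosen) < n_max:
                chosen.append(bins[s][rank])
        rank += 1
    return chosen

def estimate_cell(cx, cy, cz, cell_km, points, dist_to_land_km,
                  z_scale=0.1, power=2.0):
    """points: list of (x_km, y_km, depth_m, value). Returns a value or None."""
    half = cell_km / 2.0
    inside = [v for x, y, z, v in points
              if abs(x - cx) <= half and abs(y - cy) <= half]
    if inside:
        return median(inside)      # cells with data: embed the median
    radius = search_radius(dist_to_land_km)
    cands = []
    for x, y, z, v in points:
        dx, dy = x - cx, y - cy
        if math.hypot(dx, dy) > radius:
            continue
        # Assumed XYZ distance: depth difference (m) scaled to km by z_scale.
        d = math.sqrt(dx * dx + dy * dy + (z_scale * (z - cz)) ** 2)
        cands.append({"d": d, "sector": sector(dx, dy), "v": v})
    chosen = select_by_quadrant(cands)
    if not chosen:
        return None                # no data within the search radius
    weights = [1.0 / c["d"] ** power for c in chosen]
    return sum(w * c["v"] for w, c in zip(weights, chosen)) / sum(weights)
```

With two equidistant points of value 0 and 100, the estimate is 50 when both lie at the cell's depth, and it moves toward the point at similar depth when the other lies 100 m deeper. That is the sense in which results "trend more with depth zones".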
Statistical validations
By comparing the gridded maps, including those that CSI generates, with seafloor properties at sites that have not contributed to the grid calculations, we can measure the performance of the gridding methods.

The testbed we used for this covers the Adriatic Sea (Figs 1-4), and Hawth's Intersect Point Tool (Hawth 2007) was used to match the point and grid data.

Consistency test
Consistency of the results in terms of data ranges, means and variance is tested by comparing the griddings with the actual input data points.

| Interpolation Method | Av Value | SD Value | Mean Deviation |
|---|---|---|---|
| CSI (IDW; variable search radius up to 20km; XYZ weighting; embedded cell medians; quadrants) | 50 | 45 | 17 |
| IDW (20km search radius) AV3.x | 55 | 33 | 17 |
| Neighbourhood Mean (20km search radius) AV3.x | 54 | 19 | 34 |
| Proximity gridding (Thiessen polygons) AV3.x | 53 | 44 | 11 |
| IDW gridding (6 point; power 2) AV3.x | 55 | 37 | 13 |
| Natural Neighbour gridding (12 point) AG9.x | 54 | 37 | 14 |
| Ordinary Kriging gridding (12 point) AG9.x | 56 | 30 | 25 |
| Point dataset (N=##) | 54 | 44 | - |

Blue: good performance; Red: poor performance. AV3.x: ArcView version 3.x; AG9.x: ArcGIS version 9.x.
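
As a rough illustration of how the three table columns might be computed once each input point has been matched to the grid value of its cell, here is a short Python sketch. The function name is hypothetical, and two choices are assumptions: the population standard deviation is used, and "Mean Deviation" is taken to be the mean absolute difference between grid and point values. The source does not define either.

```python
from statistics import mean, pstdev

def consistency_stats(grid_values, point_values):
    """Summarise gridded values sampled at the input data locations."""
    if len(grid_values) != len(point_values):
        raise ValueError("expected one grid value per data point")
    deviations = [abs(g - p) for g, p in zip(grid_values, point_values)]
    return {
        "av": mean(grid_values),             # average gridded value
        "sd": pstdev(grid_values),           # spread of the gridded values
        "mean_deviation": mean(deviations),  # assumed: mean |grid - point|
    }
```

A grid whose SD is well below that of the point dataset, as for the Neighbourhood Mean row, has smoothed away much of the variance in the input data.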

Interpolation skill
The effectiveness of CSI at interpolation between data points was tested using withheld data (see REF). If this option is selected CSI lays aside 10% of the points, and computes a grid for testing from the remainder. Results are given below, compared to performance of other interpolators working on other datasets (SIC97).

| Interpolation Method | Av Value (Median) | MAE | RMSE |
|---|---|---|---|
| CSI (As above) | 51 | 26 (Rel: 46%) | 43 (Rel: 80%) |

MAE: Mean Absolute Error (Deviation). RMSE: Root Mean Square Error (Deviation).
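
The withheld-data test can be sketched in Python as follows. This is an illustration under stated assumptions, not CSI's own procedure: the source says only that 10% of the points are laid aside, so the random selection, the `seed` parameter and the function names are hypothetical. The MAE and RMSE formulas are the standard definitions.

```python
import math
import random

def split_withheld(points, fraction=0.10, seed=0):
    """Lay aside a fraction of the points for testing (random choice assumed)."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]   # (gridding set, withheld set)

def mae(predicted, observed):
    """Mean absolute error between gridded and withheld values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def rmse(predicted, observed):
    """Root mean square error between gridded and withheld values."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))
```

The grid is computed from the first returned set only, and the grid values at the withheld sites are then passed to `mae` and `rmse` together with the withheld observations. RMSE penalizes large misses more heavily than MAE, which is consistent with the RMSE in the table being the larger of the two.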

This skill seems low relative to interpolations of the SIC97 and SIC04 benchmarks on radiological and raingauge data. Partly that is because of the data: very sparse spatial sampling, diverse marine samplers, low-precision lab analyses, use of parsed word-based data to handle mixed geologic-biologic substrates, the fixed 0-100% data range, and strong temporal-spatial variations of the seabed.

In basic terms the CSI interpolator achieved >20% of results with zero deviation, and 50% within 8% deviation of mud contents.

The calibration suggests that the uncertainties calculated in an error budget by CSI may be too wide.  Nevertheless, frequency distributions on the grid cell-data deviations (signed and absolute) and the CSI uncertainty values for cells have similar behaviour (Fig. 10). The uncertainty results may still be correct because they allow for some uncertainty factors not explicit in the grid deviations.

Technical notes
Choice of IDW
IDW is not markedly less skilful than the other methods, including Kriging (a Best Linear Unbiased Estimate), on scattered environmental data (e.g., Cressie 1993; Dubois et al., 1998; Bengio et al., 2004). It requires fewer assumptions about stationarity and continuity/differentiability. IDW is also more widely comprehended and used, and it is somewhat easier to modify in search radius, quadrants, embeddings, etc.

Artifacts
These are spurious patterns in an interpolation, resulting from the processing interacting badly with the data distribution. (Figure numbers are given in parentheses.)
  • Crescents (4,5): formed when a point passes into a search radius, impacting on the result formed by the small number of points left; wrongly transfers the property of that point to the search radius rim; in IDW associated with a central "moon".
  • Jagged polygons (6): formed in Neighbourhood Statistics (Thiessen-Voronoi Polygons).
  • Double foci (7,8): formed in Natural Neighbour between close, different valued points.
  • Loss of detail (5,9): exceptions are passed over; depending on settings, this is common from many interpolation engines.
  • Paintball (2): In data-sparse areas the search radius gives out, leaving blank areas.
  • Ignore data hull (6,8): The process proceeds without adapting to the end of data; very pronounced in Proximity and Natural Neighbour methods
Spatial Indexing
Without an optimized search method, gridding programs are very slow because of the intense spatial search requirements. CSI uses grid-based spatial indexing (Wikipedia 2007), reading from direct access (DA) files. For cells where data exists, a key is read from the cell-wise DA file 1. That key points to a record in data-wise DA files 2 and 3. The key in file 2 points to the first of a chain of data points for the cell, and the key in file 3 points to the data for this first point, held in data-wise DA file 4. On large datasets this arrangement gave a 10^6-fold increase in program speed over brute force, requiring only 2N+1 file reads per cell with data, and only one for empty cells.
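
The idea of grid-based spatial indexing can be shown with a small in-memory Python sketch. It is an analogy only: CSI keeps its index in chained records across direct access files, whereas this sketch uses a dictionary keyed by cell, and the class and method names are hypothetical.

```python
import math
from collections import defaultdict

class GridIndex:
    """Bucket points by grid cell so a radius search visits only nearby cells."""

    def __init__(self, cell_km):
        self.cell_km = cell_km
        self.cells = defaultdict(list)   # (col, row) -> indices into self.points
        self.points = []

    def _key(self, x, y):
        return (math.floor(x / self.cell_km), math.floor(y / self.cell_km))

    def add(self, x, y, value):
        self.points.append((x, y, value))
        self.cells[self._key(x, y)].append(len(self.points) - 1)

    def within(self, x, y, radius_km):
        """Return all points no farther than radius_km from (x, y)."""
        reach = math.ceil(radius_km / self.cell_km)
        col, row = self._key(x, y)
        found = []
        for c in range(col - reach, col + reach + 1):
            for r in range(row - reach, row + reach + 1):
                # .get avoids creating entries for empty cells
                for i in self.cells.get((c, r), ()):
                    px, py, _ = self.points[i]
                    if math.hypot(px - x, py - y) <= radius_km:
                        found.append(self.points[i])
        return found
```

An empty cell costs one dictionary lookup, which mirrors the single file read for empty cells described above, while a brute-force search would test every point in the dataset for every gridcell.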

References
  • Cressie, N.A., 1993. Statistics for Spatial Data. New York: Wiley.
  • Wikipedia, 2007. Spatial Index. [URL: "http://en.wikipedia.org/wiki/Spatial_index"]
  • Dubois, G., et al. (Eds), 1998. Spatial Interpolation Comparison 97: Special Issue. Jl Geographic Information Decision Analysis, 2(1-2).
  • Bengio, S., et al. (Eds), 2004. Spatial Interpolation Comparison exercise 2004: Special issue. Applied GIS, 1(2), ##.
  • Cacchione, D.A., Grant, W.D. and Tate, G.B., 1984. Rippled scour depressions on the inner continental shelf off central California, Jl Sediment. Petrol. 54, 1280–1291.
  • Intelmann, S.S., Cochrane, G.R., Edward Bowlby, C., Brancato, M.S. and Hyland, J. 2007. Survey report of NOAA Ship McArthur II cruises AR-04-04, AR-05-05 and AR-06-03: Habitat classification of side scan sonar imagery in support of deep-sea coral/sponge explorations at the Olympic Coast National Marine Sanctuary. Marine Sanctuaries Conservation Series MSD-07-01. U.S. Department of Commerce, National Oceanic and Atmospheric Administration, National Marine Sanctuary Program, Silver Spring, MD. 50 pp. [URL: "http://sanctuaries.noaa.gov/science/conservation/pdfs/mcarthur1.pdf"]
  • Hawth, 2007. Hawth's Analysis Tools for ArcGIS. [URL: "http://www.spatialecology.com/htools/"]
  • Tomczak, M. 2003. Spatial Interpolation and its Uncertainty using Automated Anisotropic Inverse Distance Weighting (IDW) - Cross-validation/Jackknife Approach. In: EUR 2003. Mapping Radioactivity in the Environment. Spatial Interpolation Comparison 1997. EUR 20667 EN, EC. 268 pp. Dubois, G., Malczewski, G., and De Cort, M. (eds). Office for Official Publications of the European Communities, Luxembourg.




1. Input point data distribution


Competent Seabed Interpolator
2. CSI gridding (IDW, variable search radius, XYZ weighting, embedded cell medians)


3. Uncertainties for CSI gridding

Download the CSI software

ArcView 3 (Spatial Analyst)

4. IDW gridding (20km search radius)


5. Neighbourhood Mean gridding (20km search radius)


6. Proximity gridding (Thiessen polygons)


7. IDW gridding (6 point, power 2)


ArcGIS 9 (Spatial Analyst)

8. Natural Neighbour gridding (12 point)


9. Ordinary Kriging gridding (12 point)



10. Frequencies of grid cell-data deviations, and of CSI computed uncertainties.




Author: Chris Jenkins
Date: 3 April 2007
Location: Boulder

Copyright 2007, Univ Colorado