|
We recognize a pattern visually when we see one, whether in a GIS data layer, in a remotely sensed image, or on the landscape.
Machine learning is a form of computer pattern recognition that enables computers to mimic the human skill of
identifying patterns in data. Machine learning can be computationally intensive. This work investigates efficiencies,
expressed as cost and benefit, in the application of machine learning algorithms. The cost is the computer time needed
to calibrate the algorithm, and the benefit is goodness of fit, that is, how well the algorithm learns the pattern in the
data. There may be a point of diminishing returns beyond which further expenditure of computer resources does not
produce additional benefit. Stratified sampling is one cost-reduction strategy. Cost and benefit for machine
learning are illustrated by statistical experiments for computing correlations between measures of roadless area
and population density for the San Francisco Bay Area. The alternative to training efficiencies is to rely on
high-performance computer systems, which may require specialized programming and algorithms optimized
for parallel performance.
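
The cost and benefit trade-off described above can be illustrated with a minimal sketch. It is not the method used in the study: the data are synthetic, the strata names ("urban", "suburban", "rural"), the helper functions, and the sample fractions are all hypothetical, and a simple Pearson correlation stands in for the goodness-of-fit measure. The sketch draws stratified samples of increasing size, records the computer time spent (cost) and the resulting correlation between a roadless measure and population density (benefit), and shows how the estimate can stabilize well before the full data set is used.

```python
import random
import time
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def make_synthetic_cells(n, seed=0):
    """Hypothetical stand-in data: (stratum, roadless measure, population density)."""
    rng = random.Random(seed)
    base = {"urban": 0.1, "suburban": 0.4, "rural": 0.8}
    cells = []
    for _ in range(n):
        stratum = rng.choice(sorted(base))
        roadless = min(1.0, max(0.0, base[stratum] + rng.gauss(0, 0.1)))
        # Assumed relationship: density falls as roadless area rises, plus noise.
        density = 1000 * (1 - roadless) + rng.gauss(0, 100)
        cells.append((stratum, roadless, density))
    return cells


def stratified_sample(cells, fraction, seed=0):
    """Draw the same fraction from every stratum so each stays represented."""
    rng = random.Random(seed)
    by_stratum = {}
    for cell in cells:
        by_stratum.setdefault(cell[0], []).append(cell)
    sample = []
    for members in by_stratum.values():
        k = min(len(members), max(2, round(fraction * len(members))))
        sample.extend(rng.sample(members, k))
    return sample


def cost_benefit(cells, fractions):
    """Return (fraction, sample size, seconds, correlation) for each fraction."""
    rows = []
    for fraction in fractions:
        start = time.perf_counter()
        sample = stratified_sample(cells, fraction)
        r = pearson([c[1] for c in sample], [c[2] for c in sample])
        rows.append((fraction, len(sample), time.perf_counter() - start, r))
    return rows


cells = make_synthetic_cells(20000)
rows = cost_benefit(cells, [0.01, 0.05, 0.25, 1.0])
for fraction, size, seconds, r in rows:
    print(f"fraction={fraction:<5} n={size:<6} time={seconds:.4f}s r={r:.3f}")
```

In a run of this sketch, the time column grows with sample size while the correlation changes little between the smallest and the largest sample, which is the point of diminishing returns in miniature. Real GIS layers and a real learning algorithm would need their own strata definitions and fit measure.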
|
Publications and Websites:
Champion, Jr., Richard, 2007, Cost Benefit Analysis of Computer Resources for Machine Learning: USGS
Open-File Report, in review.
Sleeter, R., 2004, Dasymetric mapping techniques for the San Francisco Bay region, California: Urban and
Regional Information Systems Association, Annual Conference, Proceedings, Reno, Nev., November 7-10, 2004
Watts, R.D., R.W. Compton, J.H. McCammon, C.L. Rich, S.M. Wright, T. Owens, and D.S. Ouren. 2007. Roadless
Space of the Conterminous United States. Science Vol. 316, No. 5825, pp. 736-737.
|