Towards Spatial Data Science for Smart Agriculture Big Data

Thursday, October 26, 2017 - 9:00am - 9:30am
Lind 305
Shashi Shekhar (University of Minnesota, Twin Cities)
There is tremendous growth in agricultural big data, such as high-resolution remote sensing data for global agricultural monitoring, weather prediction for crop insurance, and precision agricultural maps ranging from soil quality to yield. Thus, there is growing interest in leveraging this big data for smart agriculture to enable farmers to optimize farm returns, reduce unnecessary applications of fertilizers and pesticides, preserve natural resources, and contend with impending weather events.

However, classical machine learning techniques often perform poorly when applied to agricultural data due to its spatial nature. First, the datasets are embedded in continuous space, whereas classical datasets (e.g., vectors) are often discrete. Second, the cost of spurious patterns is high in agriculture. Finally, a common assumption in classical machine learning is that data samples are independently generated. In agricultural data, however, this independence assumption is generally false. For example, nearby plots tend to have similar soil type, climate, and precipitation. In spatial statistics, this tendency is called spatial autocorrelation. Ignoring autocorrelation when analyzing data with spatial characteristics may produce hypotheses or models that are inaccurate or inconsistent with the data set.
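
As a rough illustration of spatial autocorrelation, the sketch below computes Moran's I, a standard spatial statistics measure, on a small synthetic grid. The function name `morans_i`, the binary rook (edge-sharing) neighbor weights, and the synthetic data are assumptions made for this example and are not taken from the talk. Values near +1 indicate that neighboring cells are similar, values near 0 indicate no spatial structure, and negative values indicate that neighbors tend to differ.

```python
import numpy as np

def morans_i(grid):
    """Moran's I for a 2-D grid, assuming binary rook (edge-sharing) weights."""
    z = grid - grid.mean()  # deviations from the global mean
    # Products of deviations over horizontally and vertically adjacent cells.
    # Weights are symmetric, so each unordered pair is counted twice.
    cross = 2.0 * ((z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum())
    rows, cols = grid.shape
    s0 = 2.0 * (rows * (cols - 1) + (rows - 1) * cols)  # sum of all weights
    return (grid.size / s0) * cross / (z ** 2).sum()

# Synthetic "field" with a smooth gradient: nearby cells have similar values.
smooth = np.add.outer(np.arange(10.0), np.arange(10.0))

# The same values with locations shuffled: the spatial structure is destroyed.
rng = np.random.default_rng(0)
shuffled = rng.permutation(smooth.ravel()).reshape(smooth.shape)

print("smooth field:  ", morans_i(smooth))    # strongly positive
print("shuffled field:", morans_i(shuffled))  # close to zero
```

Both grids contain exactly the same values, so any method that treats cells as independent samples cannot tell them apart; only a measure that accounts for location does.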

Thus, new spatial data science methods are needed to analyze agricultural data and discover interesting, useful, and non-trivial patterns. This talk surveys new spatial data science methods for discovering hotspots (e.g., circular, linear, rings), interactions (e.g., co-locations, tele-connections), and spatial outliers, as well as methods for change detection and spatial prediction.