This project centered on a GIS process known as ‘image classification’. The ‘image’ is any remotely sensed data captured by an optical sensor, for example aerial photography or satellite imagery, in either the visible or non-visible portions of the electromagnetic spectrum. The ‘classification’ is the process of grouping the pixels of that data according to shared characteristics.
Classification methods fall under two umbrella schemes: supervised and unsupervised. Both calculate a spectral signature for each group and try to match every pixel to a group within the classification scheme. Supervised classification requires the user to digitize training polygons over areas of known use (in our case, classes from the USGS LULC codes). Each polygon is then given a text attribute that differentiates one group from another. This can be, for example, the LULC code itself or a text label such as ‘Residential’ or ‘Industrial’. Supervised classification allows increased flexibility and accuracy, in that the user can specify what they feel is the ‘best example’ of a particular group. This method also requires less cleanup after the classification has run, as only user-specified groups are created. This removes the need to lump together multiple output groups that represent roughly the same use, a step described later for the unsupervised method.
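To make the idea of a spectral signature concrete, here is a minimal Python sketch of a Gaussian maximum likelihood classifier. It is not the GIS software's implementation. It assumes NumPy, that the training pixels have already been extracted from the polygons into an array, and that each class can be modeled by a mean vector and a covariance matrix across bands. The function names train_signatures and classify are hypothetical.

```python
import numpy as np

def train_signatures(pixels, labels):
    """Build one 'spectral signature' (mean, covariance) per training class.

    pixels: (n_samples, n_bands) array of pixels taken from training polygons.
    labels: (n_samples,) array with the class label of each training pixel.
    Assumes at least two training pixels per class.
    """
    signatures = {}
    for cls in np.unique(labels):
        sample = pixels[labels == cls]
        mean = sample.mean(axis=0)
        # A small ridge term keeps the covariance invertible for small samples.
        cov = np.cov(sample, rowvar=False) + 1e-6 * np.eye(pixels.shape[1])
        signatures[cls] = (mean, cov)
    return signatures

def classify(pixels, signatures):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    classes = list(signatures)
    scores = np.empty((pixels.shape[0], len(classes)))
    for j, cls in enumerate(classes):
        mean, cov = signatures[cls]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to this class mean.
        mahal = np.einsum("ij,jk,ik->i", diff, inv, diff)
        # Log-likelihood up to a constant that is the same for all classes.
        scores[:, j] = -0.5 * (mahal + logdet)
    return np.array(classes)[scores.argmax(axis=1)]
```

In this sketch, a training polygon that mixes several land uses simply yields a signature with a broader covariance, which is one plausible reason a mixed polygon can still produce a usable class.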
Below is the result of a supervised image classification on high resolution data for Bellingham, WA. The translucent gray polygons indicate the training areas used to develop this image. Two strategies for digitizing polygons can be seen in it. For certain groups, such as ‘Mixed Forest’ and ‘Bay / Estuary’, the polygons for the most part include only cells that belong to those respective groups. ‘Residential’, however, shows another strategy. The polygons covering this group include a variety of cells belonging to ‘Transportation’, ‘Residential’, and ‘Industrial’. Despite this amalgamation of pixels in the residential training polygons, the computer was able to differentiate the residential areas from the vegetation found mixed in with them. This may have something to do with the fact that, while the residential training polygons generated spectral signatures that took into account a mixture of three groups, two of those groups (Industrial and Transportation) had fairly definitive spectral signatures of their own, generated from other training polygons. In other words, perhaps the computer understood two of the three signals found in the residential polygons and was able to decide that the third must be different and unique. This is completely speculative, however.

The second map shows a supervised classification run on low resolution LANDSAT TM data. The process is the same as described above, but both the study area and the range of the electromagnetic spectrum captured in the image are larger. LANDSAT TM captures data in seven distinct bands, including visible, infrared, and thermal. The low resolution of the data, however, forces the study area to become much larger, encompassing the entire city of Bellingham and its surroundings rather than a subsection, as was the case above.
The following two maps were derived from an unsupervised classification. The computer continues to use the default Maximum Likelihood method to decide which group a particular pixel belongs in, but the spectral signature generation is done entirely by the computer. Without the aid of a ground truthing layer, the computer needs to be told basic characteristics of the desired output, such as the number of classes allowed and the minimum number of cells in each class. The general rule of thumb is that the computer should generate at least twice as many output classes as are ultimately desired. For instance, if a user wishes to have eight final classification groups, they should instruct the computer to generate sixteen and then group them accordingly. The unsupervised classification of high resolution data depicted below required the creation of twenty output groups, which were then manually grouped by the user into LULC classifications. This proved time consuming and required a great deal of manual interpretation by the user.
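The generate-twice-as-many-then-merge workflow can be sketched in Python as follows. This is an illustration only: it swaps in a plain k-means clustering (with NumPy) for whatever signature generation the GIS software performs, and the names cluster_pixels, merge_clusters, and the lookup table are hypothetical. The lookup table stands in for the manual interpretation step, where the analyst decides which LULC class each machine-generated cluster belongs to.

```python
import numpy as np

def _nearest_center(pixels, centers):
    # Distance from every pixel to every center: shape (n_pixels, n_clusters).
    dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def cluster_pixels(pixels, n_clusters, n_iter=20, seed=0):
    """Plain k-means, used here as a stand-in for unsupervised signature generation.

    pixels: (n_pixels, n_bands) array. Returns (cluster id per pixel, centers).
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), n_clusters, replace=False)
    centers = pixels[start].astype(float)
    for _ in range(n_iter):
        assign = _nearest_center(pixels, centers)
        for k in range(n_clusters):
            members = pixels[assign == k]
            if len(members):  # leave an empty cluster's center where it is
                centers[k] = members.mean(axis=0)
    return _nearest_center(pixels, centers), centers

def merge_clusters(assign, lookup):
    """Map machine cluster ids to the analyst's final classes.

    lookup: dict {cluster id: final class label}, filled in by hand.
    """
    return np.array([lookup[k] for k in assign])

# Rule of thumb from the text: ask for twice as many clusters as final classes.
desired_classes = 8
n_clusters = 2 * desired_classes  # sixteen clusters to be merged into eight
```

A call might look like assign, centers = cluster_pixels(pixels, n_clusters), followed by merge_clusters(assign, lookup) once the analyst has inspected each cluster and filled in the lookup table.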

The image below shows the same unsupervised classification process run on the LANDSAT TM data. It was generated with sixteen different classes, which were also grouped into corresponding LULC categories.

Overall, this was a very interesting process to watch unfold. Both types of classification, as well as both types of data, appear to have their own combination of strengths and weaknesses. The high resolution data, for example, proved to be almost ‘too good’ at times, creating pixel-by-pixel differences in grouping. This was alleviated to a point by running the data through a 7×7 cell neighborhood analysis, which smoothed the data together into larger groups. Regardless, both classification methods took time to run. For the supervised method, time went into drawing the training polygons, refining them, and re-running the process. For the unsupervised method, time went into generating larger and larger numbers of classes and then categorizing them manually. Still, the ability to generate spectral signatures through either an automated or a guided process, and to produce quick and relatively accurate land use maps, is astounding.
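The 7×7 neighborhood smoothing mentioned above can be sketched as a majority filter. This is an assumption: the text does not say which neighborhood statistic was used, so the sketch simply replaces each cell with the most common class in its 7×7 window. It uses NumPy (version 1.20 or later for sliding_window_view), and the function name majority_filter is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def majority_filter(classified, size=7):
    """Replace each cell with the most common class in its size x size window.

    classified: 2-D array of class codes. Edges are padded by repeating the
    border cells, and ties go to the lowest class code.
    """
    pad = size // 2
    padded = np.pad(classified, pad, mode="edge")
    # One (size, size) window per original cell: shape (rows, cols, size, size).
    windows = sliding_window_view(padded, (size, size))
    classes = np.unique(classified)
    # For each class, count how many cells in every window carry that class.
    counts = np.stack([(windows == c).sum(axis=(2, 3)) for c in classes])
    return classes[counts.argmax(axis=0)]
```

Isolated single-pixel classes disappear under this filter, which matches the smoothing effect described above. The comparison builds a temporary array 49 times the size of the raster for each class, so a large image would need to be processed in tiles.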
