The Final Solution

This is the 8th and final post on my paper “A Method for Determining and Improving the Horizontal Accuracy of Geospatial Features”. Other posts on this topic:

7. Determining the Spatial Accuracy of Polygons Using Buffer Overlay

In the last post on the subject we described how Buffer Overlay can be used to determine the horizontal accuracy of polygon features. We ran Buffer Overlay on a single Township/Range where we had 259 permits to compare against parcels. This resulted in 259 cumulative probability (CP) curves, one for each permit. The sample below shows 21 of these curves. For each curve, the buffer distance at which the CP crosses one is the accuracy of the permit. If the curve does not cross a CP of one, then the accuracy of that permit could not be determined.
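As a minimal sketch of how an accuracy value could be read off a single CP curve, assume each curve is stored as a list of (buffer distance, CP) pairs. The function name, the sample values, and the choice to treat CP >= 1 as "crossing" are illustrative assumptions, not the code used for the paper.

```python
from typing import Optional, Sequence, Tuple

def accuracy_from_cp_curve(
    curve: Sequence[Tuple[float, float]], threshold: float = 1.0
) -> Optional[float]:
    """Return the smallest buffer distance whose CP reaches the threshold.

    Returns None when the curve never reaches the threshold, meaning the
    accuracy of that feature could not be determined.
    """
    for distance, cp in sorted(curve):  # walk the curve by increasing distance
        if cp >= threshold:
            return distance
    return None

# Hypothetical curves: (buffer distance in feet, cumulative probability)
resolved = [(0.5, 0.42), (1.0, 0.71), (2.0, 0.93), (5.0, 1.0), (10.0, 1.0)]
unresolved = [(0.5, 0.10), (1.0, 0.22), (2.0, 0.35), (5.0, 0.48)]

print(accuracy_from_cp_curve(resolved))
print(accuracy_from_cp_curve(unresolved))
```

In practice a CP computed from overlay lengths may land slightly below one because of floating-point rounding, so a threshold a hair under 1.0 may be a safer choice.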

Here is our initial horizontal accuracy distribution for the 259 permits. The graph below has two peaks: one at < 2 feet, where we have about 50 records, and one at >= 60 feet, where we have about 35. The second peak represents the features for which accuracies could not be determined. The middle part of the curve shows a random distribution of accuracies.

Here is the cumulative curve. If the data were perfect, all 259 permits would have a horizontal accuracy of 0.5 feet, and the cumulative curve (shown below) would be a straight line across at a cumulative count of 259.
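For readers who want to reproduce a cumulative curve like this one, here is a small sketch that counts, for each buffer distance, how many features have an accuracy at or below it. The function name and the sample accuracies are made up for illustration.

```python
from bisect import bisect_right
from typing import List, Sequence

def cumulative_counts(
    accuracies: Sequence[float], distances: Sequence[float]
) -> List[int]:
    """For each distance, count the features whose accuracy is at or below it."""
    ordered = sorted(accuracies)
    return [bisect_right(ordered, d) for d in distances]

# Hypothetical accuracies (feet) for five features and the distances to plot
sample = [0.5, 0.5, 2.0, 10.0, 35.0]
buffers = [0.5, 2.0, 10.0, 60.0]
print(cumulative_counts(sample, buffers))   # [2, 3, 4, 5]

# Perfect data: every feature at 0.5 feet gives a flat line at the total count
perfect = [0.5] * 5
print(cumulative_counts(perfect, buffers))  # [5, 5, 5, 5]
```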

This information is visualized below, where you can see that there are many features at accuracy levels from 2 to 60 feet. This type of visualization is useful for manual correction of features, but we were looking to automate the process.

In most cases, the last buffer and clip operation will extract line work from control that forms a complete ring. This means that, using standard topology tools, the extracted control lines can be built into polygons and used to replace less accurate test (permit) features.

The solution that we implemented consisted of two processes:

  • If the clipped line work formed a complete ring the ring could be easily built into a polygon. This was the Phase 1 Correction.
  • If the clipped line work had gaps, we implemented a more complex algorithm that did a node-to-node comparison and then extended lines to form closed rings. In this case it was possible to create multiple rings, so once those rings were built into polygons their area was compared to the original permit (test feature) to ensure that the new feature was not significantly larger or smaller than the original. This was the Phase 2 Correction.
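The area check in Phase 2 can be sketched roughly as follows. This is only an illustration: the shoelace area function is standard, but the 10% tolerance, the function names, and the sample coordinates are assumptions, since the actual threshold for "significantly larger or smaller" is not given here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

def ring_area(ring: Sequence[Point]) -> float:
    """Unsigned area of a ring using the shoelace formula.

    Works whether or not the first vertex is repeated at the end.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def acceptable_rings(
    original: Sequence[Point],
    candidates: Sequence[Sequence[Point]],
    tolerance: float = 0.10,  # assumed: allow +/- 10% difference in area
) -> List[Sequence[Point]]:
    """Keep candidate rings whose area is within the tolerance of the original."""
    target = ring_area(original)
    return [
        ring for ring in candidates
        if abs(ring_area(ring) - target) <= tolerance * target
    ]

# Hypothetical permit (100 x 100 feet) and two rings built from control lines
permit = [(0, 0), (100, 0), (100, 100), (0, 100)]
close_fit = [(1, 1), (102, 1), (102, 99), (1, 99)]   # area 9898
too_large = [(0, 0), (200, 0), (200, 100), (0, 100)]  # area 20000

kept = acceptable_rings(permit, [close_fit, too_large])
print(len(kept))
```

A relative tolerance is used rather than an absolute one so the same setting behaves sensibly for both small and large permits.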

In this video, the first build is the trivial solution, or Phase 1 Correction; the second build is the more complex Phase 2 Correction.

After the Phase 1 Correction we generated statistics and found that we had corrected 104 permits, or 40% of our 259 features, to parcel. The horizontal accuracy distribution before and after Phase 1 Correction is shown below.

In this close-up we see that the Phase 1 Correction curve spikes at 0.5 feet, above the Initial conditions, and that the rest of the Phase 1 Correction curve is under the Initial curve, as expected.

The same improvements can be seen in the cumulative curve.

After the Phase 1 Correction we also calculated the RMSE and found a significant reduction in the horizontal error when compared to parcel (see below).

Lastly, a visual comparison was done, showing the obvious improvements at the 0.5-foot horizontal accuracy level (shown in yellow below).

The Phase 2 Correction made a 6% (16 record) improvement to our data. Here are some images of the features corrected. Red is the original permit; the green area is the corrected feature following parcel.

The image below shows a common problem with our algorithm on the west side of the polygon where a corner has been clipped. However, in general the new feature is a better representation than the original in red.

The final horizontal accuracy measures after Phase 2 Correction are provided below. The Phase 2 curve is under the Phase 1 curve, which in turn is under the Initial conditions curve, as expected.

In conclusion, we have found that Buffer Overlay analysis is an excellent tool for determining and improving the horizontal accuracy of our permit features to parcels. The initial accuracy assessment is easy to perform and provides data that is easy to interpret. Visualizing the data by these generated accuracies provides an avenue for manual correction of the data. However, with a little more effort, complete rings can be extracted from the last buffer and clip operation and built into more horizontally accurate polygons. This is the low-hanging fruit, but it also provides the biggest bang for the buck.

Our additional efforts in Phase 2, where we connected nodes to close gaps and create rings, only improved our data by 6%. However, we feel confident that we can tweak our algorithm to yield at least a 10-15% gain in the future.


Spatial Accuracy Assessments Using an Excel Spreadsheet

This is the 5th post on my paper “A Method for Determining and Improving the Horizontal Accuracy of Geospatial Features”. Other posts on this topic:

4. Coordinate Sample Builder
6. Five Methods for Determine the Spatial Accuracy of Lines

The Geospatial Positioning Accuracy Standards – Part 3: National Standards for Spatial Data Accuracy states that twenty or more test points are required to conduct a statistically significant accuracy evaluation regardless of the size of the data set or area of coverage. Twenty points make a computation at the 95 percent confidence level reasonable. The 95 percent confidence level means that when 20 points are tested, it is acceptable that one point may exceed the computed accuracy.

Personally, I make use of a spreadsheet that requires 30 points to calculate RMSE and the Standard Deviation for RMSE in both x and y. With this information we can calculate the following accuracy measures:

1. Estimated Root Mean Square of the population errors
2. Estimated Variance of the population errors
3. Estimated Standard Deviation of the population errors
4. Greenwalt & Schultz CMAS Standard normal (Z) interval of the population errors at 95% probability
5. Greenwalt & Schultz CMAS Standard normal (Z) interval of the population errors at 90% probability
6. NSSDA Statistic
7. Confidence interval on the estimate of RMSE at 95% probability
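For readers without the spreadsheet, here is a rough Python sketch of two of the measures above: the RMSE in x and y and the NSSDA statistic. The RMSE formulas and the NSSDA factor of 2.4477 come from the NSSDA standard; the function name, the dictionary keys, and the sample coordinates are my own illustrative assumptions, and this is not a port of the spreadsheet itself.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]

def accuracy_measures(
    test: Sequence[Point], control: Sequence[Point]
) -> Dict[str, float]:
    """Compute RMSE in x, y, and radially, plus the NSSDA horizontal statistic."""
    if not test or len(test) != len(control):
        raise ValueError("test and control must be non-empty and equal in length")
    n = len(test)
    dx = [t[0] - c[0] for t, c in zip(test, control)]
    dy = [t[1] - c[1] for t, c in zip(test, control)]
    rmse_x = math.sqrt(sum(d * d for d in dx) / n)
    rmse_y = math.sqrt(sum(d * d for d in dy) / n)
    rmse_r = math.sqrt(rmse_x ** 2 + rmse_y ** 2)
    # NSSDA horizontal accuracy at 95% confidence. The standard gives this
    # approximation for the case where min(RMSE)/max(RMSE) is between 0.6
    # and 1.0; when RMSE_x equals RMSE_y it reduces to 1.7308 * RMSE_r.
    nssda = 2.4477 * 0.5 * (rmse_x + rmse_y)
    return {"rmse_x": rmse_x, "rmse_y": rmse_y, "rmse_r": rmse_r, "nssda": nssda}

# Hypothetical sample: 30 control points and test points shifted by (3, 4) feet
control_pts = [(i * 10.0, i * 5.0) for i in range(30)]
test_pts = [(x + 3.0, y + 4.0) for x, y in control_pts]

for name, value in accuracy_measures(test_pts, control_pts).items():
    print(f"{name}: {value:.4f}")
```

The confidence interval on the RMSE estimate is left out because it needs chi-square quantiles, which are not in the Python standard library.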

Using this information we were able to calculate the confidence interval on the estimate of RMSEx/y at 95% probability for our permits as:

Here are the two MS Excel spreadsheets for 30 and 100 points. Just copy 30 coordinate pairs in and you will get your accuracy measures.

Spatial Accuracy Assessment for 30 Points
Spatial Accuracy Assessment for 100 Points

Coordinate Sample Builder

This is the 4th post on my paper “A Method for Determining and Improving the Horizontal Accuracy of Geospatial Features”. Other posts on this topic:

3. An Introduction to Parcels
5. Spatial Accuracy Assessments Using an Excel Spreadsheet

Coordinate Sample Builder (CSB) is a toolbar for ArcMap that allows you to create a database of test and control coordinate sample pairs. The toolbar has two icons: one to open the main form and one that allows the selection of vertices. Once vertices are selected, a button at the bottom of the form allows you to submit the record into the database. Pressing the “Next feature” button then zooms you in to the next feature.

To prevent bias, the user needs to sample the same way each time. The easiest way of doing this is to pick a primary corner, say the upper right, and search for vertices in this area before going to a secondary corner, say the lower right. Doing this should eliminate most of the sampling bias. You only need to sample 30 points to generate the RMSE at the 95% confidence interval.

Once we have 30 coordinate pairs, we transfer this information to an Excel spreadsheet that has been set up to calculate several accuracy measures. I’ll cover that tomorrow.

I have placed the code on github and you can watch a video below: