
Accuracy vs Precision Data Quality Analysis

In this assignment, we analyzed 50 points collected at the same location with a handheld GPS device, and used these points to determine the precision and accuracy of the unit's measurements.

I first determined the mean of the collected points ("waypoints") using the Summary Statistics tool, then found the exact coordinates of the resulting "Average Waypoint" with the Absolute X,Y,Z tool. After re-projecting and spatially joining the layers, I created three new fields to hold the 50th, 68th, and 90th percentile distances.
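The steps above can be sketched in Python with numpy. The coordinates below are randomly generated stand-ins for the 50 collected waypoints, which in the actual assignment come from the GPS unit and are processed in ArcGIS Pro:

```python
import numpy as np

# Hypothetical waypoint coordinates in a projected CRS (meters);
# stand-ins for the 50 GPS points collected in the assignment.
rng = np.random.default_rng(0)
waypoints = rng.normal(loc=[500000.0, 3620000.0], scale=3.0, size=(50, 2))

# "Average Waypoint": mean X and Y of all collected points
avg_waypoint = waypoints.mean(axis=0)

# Distance of each waypoint from the average waypoint
dists = np.linalg.norm(waypoints - avg_waypoint, axis=1)

# Precision radii at the 50th, 68th, and 90th percentiles
p50, p68, p90 = np.percentile(dists, [50, 68, 90])
print(f"50th: {p50:.2f} m, 68th: {p68:.2f} m, 90th: {p90:.2f} m")
```

The percentile radii describe precision: for example, 68% of the collected points fall within the 68th-percentile distance of the average waypoint.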

Map1: GPS datapoint distribution and precision/accuracy analysis

My horizontal precision at the 68th percentile is 4.5 meters, and the distance between the "Average Waypoint" and the true reference point is 3.78 meters. Horizontal precision describes the "consistency of a measurement method" and aims for "tightly packed results" (Bolstad, 2016). Horizontal accuracy, on the other hand, "measures how close a database representation of an object is to the true value" (Bolstad, 2016).

My horizontal precision value (4.5 meters) overestimated the spread of the data relative to its accuracy: the precision radius was 0.78 meters larger than the 3.78-meter distance between my average waypoint and the reference point. Looking at the mapped percentiles above, the precision analysis was also imperfect, as only 56% of the mapped points fell within the 68th-percentile radius. For optimal accuracy, the collected data should have fallen within 3.78 meters rather than 4.5 meters.

My vertical average was 27.79 meters, while the true reference point elevation was 22.58 meters, giving a vertical error of +5.21 meters. As with the horizontal result above, my measurements overestimated the true value. An additional 5.21 meters is significant and may render the elevation data unusable.
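The accuracy calculations above reduce to simple arithmetic. Here is a minimal sketch; the elevations match the values reported above, but the X,Y coordinates are purely illustrative assumptions (the real ones come from the Absolute X,Y,Z tool and the surveyed reference point):

```python
import math

# Assumed example coordinates (projected CRS, meters); only the
# elevations (Z) match the values reported in this analysis.
avg_x, avg_y, avg_z = 500001.9, 3620003.3, 27.79
ref_x, ref_y, ref_z = 500000.0, 3620000.0, 22.58

# Horizontal accuracy: straight-line distance from the average
# waypoint to the true reference point
horizontal_accuracy = math.hypot(avg_x - ref_x, avg_y - ref_y)

# Vertical accuracy: signed elevation difference
vertical_accuracy = avg_z - ref_z

print(f"Horizontal accuracy: {horizontal_accuracy:.2f} m")
print(f"Vertical accuracy: {vertical_accuracy:+.2f} m")  # +5.21 m
```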

I believe the data collected by the GPS unit can still be used, but I would need to account for a certain level of inaccuracy. I would suggest that the company that collected the handheld GPS data re-calibrate its devices and hold refresher training for staff to reduce user error. From a GIS analyst's point of view, some of the errors I encountered could stem from formatting issues introduced when we changed the projection. The data could also be old, and the physical marker could have shifted over time.

For the last step in this analysis, I determined the RMSE and created a cumulative distribution function (CDF) graph, as seen below.

Graph 1: Cumulative Distribution Function graph of Map 1 dataset

The CDF in this graph plots the relationship between mean RMSE (x-axis) and cumulative percentage (y-axis); it shows how much RMSE error has accumulated at a given cumulative percentage. For example, at the 10th percentile there is approximately 1.2 meters of mean RMSE error, and at the median (50th percentile) about 2.5 meters. If we base the value on a scale out of 7, the RMSE reduces to 0.36, which falls into the acceptable range between 0.2 and 0.5. Therefore, the data has an acceptable level of error, and we can proceed forward with it.
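The RMSE and empirical CDF described above can be sketched as follows. The error values here are randomly generated placeholders for the per-point positional errors computed in the assignment:

```python
import numpy as np

# Illustrative positional errors (meters) for 50 waypoints; the real
# values come from the distances computed in the assignment.
rng = np.random.default_rng(1)
errors = np.abs(rng.normal(0.0, 2.5, size=50))

# Root Mean Square Error of the positional errors
rmse = np.sqrt(np.mean(errors ** 2))

# Empirical CDF: sort the errors and pair each with its cumulative
# percentage, so the curve reads "X% of errors are <= this value"
sorted_err = np.sort(errors)
cum_pct = np.arange(1, len(sorted_err) + 1) / len(sorted_err) * 100

# Error at the 50th cumulative percentage (median error)
median_err = np.interp(50, cum_pct, sorted_err)

# Normalizing the RMSE against a scale of 7, as in the write-up above
normalized = rmse / 7.0
print(f"RMSE: {rmse:.2f} m, median error: {median_err:.2f} m, "
      f"normalized: {normalized:.2f}")
```

Plotting `sorted_err` against `cum_pct` reproduces the CDF curve shown in Graph 1.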

Other values collected for the RMSE and CDF analysis:


Sources:

Bolstad, P. (2016). GIS Fundamentals: A First Text on Geographic Information Systems (5th ed.). Eider Press. ISBN-13: 978-1506695877
