I remember hearing during the SQL Server 2008 beta that there would be diagnostic stored procedures that would produce information about spatial indexes and help in troubleshooting why an index was not being used. In addition, these procedures would help in determining the best spatial index. A few weeks ago I had the good fortune to run across these procedures.
sp_help_spatial_geometry_index and sp_help_spatial_geography_index put out information as columns
sp_help_spatial_geometry_index_xml and sp_help_spatial_geography_index_xml put out XML format
Each procedure takes a table name, index name, and sample query. There's an option for verbose or basic properties. The parameters and output are doc'd in BOL. Rather than repeat the doc, I'd like to walk through a problem and show how you'd interpret the output.
Say that you are trying to decide between an index with high and medium density to run a query similar to the following:
Declare @geometry geometry =
Geometry::STGeomFromText('POLYGON((500000 300000,500000 340000,540000 340000,540000 300000,500000 300000))', 0)
exec sp_executesql N'select * from spatial_table where LOCATION.STIntersects(@g) = 1', N'@g geometry', @geometry
Create both indexes and run the procedure against each one, using the sample query. There is a lot of very detailed information about the index itself because I was running in verbose mode, including information about the number of cells at each level of tesselation. Before looking at this information, it would be good to review Isaac's excellent blog post on how tesselation works. The most useful info though, was contained in the "Number of rows selected by … filter" properties and filter efficiency properties. In my case there they were
High density index:
Number_Of_Rows_Selected_By_Primary_Filter: 2931
Number_Of_Rows_Selected_By_Internal_Filter: 2399
Number_Of_Times_Secondary_Filter_Is_Called: 532
Medium density index:
Number_Of_Rows_Selected_By_Primary_Filter: 2931
Number_Of_Rows_Selected_By_Internal_Filter: 2666
Number_Of_Times_Secondary_Filter_Is_Called: 265
Note that "primary filter" in this case is the spatial index in question, the goal is to have the rows selected by the primary or internal filter rather than run the secondary filter (the actual Intersects operation). Note that there can be false positives, the actual number of rows returned was not 2931 but 2779.
The best way to judge the effectiveness of the index, however, is with the property Internal_Filter_Efficiency or Primary_Filter_Efficiency. Internal_Filter_Efficiency, for example, showed the high density index at 86% and the medium denisty index at 96%. Clearly, the medium density index is a better choice for this query. In addition, looking at the statistics on the index itself shows that the medium density index is a smaller index with less rows and index row per base row. The figures were:
High density:
Total_Primary_Index_Rows: 5273534 rows
Average_Number_Of_Index_Rows_Per_Base_Row : 10
Medium density:
Total_Primary_Index_Rows: 1965865 rows
Average_Number_Of_Index_Rows_Per_Base_Row : 3
There is also a property, Total_Number_Of_ObjectCells_In_Level0_For_QuerySample, that will tell you if the index *can* be used. A non-zero value indicates this and properties that will assist in determining whether to adjust the bounding box on a geometry index.
Hope this will give you a good head start in working with these excellent informational and diagnostic stored procedures.
One thought on “Using the spatial index diagnostic stored procedures”
Good post, but have you thought about Using the spatial index diagnostic stored procedures before?
Comments are closed.