Guru's info collection: September 2011

Monday, September 26, 2011

SQL Server Statistics

SQL server maintains statistics on distribution of column values across the row. This helps SQL server to decide the right Execution Plan while reading data. We can see the statistics of a table by using SSMS.

If you see the properties of statistics, you can notice it capture the distribution of values in the table.

In the above case, it capture the selectivity of each National ID, This helps SQL server to know upfront how many rows a filter condition will fetch. These statistics are updated periodically if you configured your data base for “Auto Statistic Update”. You can also update the statistic by running TSQL Statement. The more your statistics are up to date, the SQL server can create better execution plan.

UPDATE STATISTICS table_or_indexed_view_name

When you configured your DB for Auto create statistics. When ever you use a column in Join Clause or Where clause the SQL server will create an statistics on the column automatically.

When you configured your DB for Auto Update Statistics, the statistics are get updated when there is more then 20%+500 row changes happens on the table.

SELECT rowmodctr
FROM SYSINDEXES WHERE ID = OBJECT_ID('P')
AND name = '_WA_Sys_00000001_36D11DD4'

You can see the changes of a particular column (not table change) are tracked in the row counter. Using this SQL Server will automatically trigger the Update Statistics.

Based on your system need you can disable the Auto Create/Update Statistics, which will give some performance gain. But in such case you make sure that you have some manual mechanisms to have proper statistics and update on frequent intervals.

The problem with Auto Create/Update Statistics is the system will no know you Peek user time, system ideal time. Due to this Update Statistics may get triggered at you peek performance load. In such case disable Auto update feature and have a nightly schedule to update the statistics.

Saturday, September 24, 2011

SQL Server storage internals

SQL Server stores table data in Pages. A Page is a the smallest unit of data storage in MS SQL Server. The page size is 8 KB unit, due to this, the maximum size of a row is 8 KB.

CREATE TABLE Tbl
(
    ID INT,
    Data CHAR(8000),
    Info CHAR(50)
)

The above code will throw error:

Msg 1701, Level 16, State 1, Line 1
Creating or altering table 'T' failed because the minimum row size would be 16011, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.
But when we create table with VARCHAR, TEXT, BLOB types the data get stored out side of the data Page and pointer is get stored in the data row, that the reason we can have table with multiple VARCHAR(MAX) columns.

Every heap table contains a Page called IAM which holds the pointers to all data pages. You can find this information by querying SYSINDEXES table.

SELECT [first],[root],[FirstIAM] 
FROM SYSINDEXES 
WHERE ID = OBJECT_ID('Tbl')

For tiny table which has less then 64 KB information the memory allocated page by page. The moment table reaches more than 64 KB (8 Pages), the memory allocation happens Extend by Extend. An extend is 8 consecutive pages. So for the table which are more than 64 KB, the IAM Page consists of pointer to all the Extends of the table.

So internally for any heap table IAM is the starting point which says where the data is stored. The similar concept applies for Clustered and non-clustered tables.

The following query fill give information about how many pages used for a given table.

SELECT dpages,reserved, [rows]
 FROM SYSINDEXES 
WHERE ID = OBJECT_ID('Tbl')

To know the data structure of each table type check MSDN

Extend

The next level of memory unit in SQL server is Extend. An Extend can be Mixed Extend or Uniform Extend. When an extend consists of pages of single table then it is Uniform Extend. If an Extend consists of pages which are belong to different tables then it is called Mixed Extend.

Mixed and uniform extents

File

The MDF, NDF files (data files) which we create for a Database are divided into Extends and used. The first Extend’s first page in each file is a file header Page that contains information about the file. The header page has details about the address of available free uniform extends GAM, available free mixed extends SGAM, Available pages with free spaces PFS.

GAM – Global Allocation Map

List of free uniform extends, which can be allocated. When ever the Engine needs a new uniform extends, it takes up one from GAM.

SGAM – Shared Global Allocation Map

List of free mixed extends, which can be allocated. When ever the Engine needs a new new page, it takes up one from SGAM, scan thru the Extend to occupy the free Page.

PFS – Page free space

List of pages which has some free space to hold new data row. When a new data row needs to be inserted it checks the table’s associated pages which has free space to accommodate the new row. If no page available with free space, it takes up new Page / Extend.

For further read check the blog, though it is very old still most of them applicable.

To summarize Database data can be stored in one or more files. Each file consists of logical unit called Extend. Each extend consists of 8 Pages. Each Page can hold one or more data rows.

Guru's info collection

Monday, September 26, 2011

SQL Server Statistics

Saturday, September 24, 2011

SQL Server storage internals

Teams PowerShell

Stackoverflow

Links