Before we begin our journey in SQL Indexing it is important to understand how a SQL Server stores its data.

Database FileGroups and Files

A MS SQL database uses a .mdf file as the primary data file, a number of .ldf files as transaction logs, and optionally a number of .ndf files as secondary data files. These data files (.mdf and .ndf) are organized to FileGroups, for management and allocation purposes.

By default, the first group that is automatically created is called the “Primary File Group”. Each .mdf and .ndf files that are created, automatically belong to that group.

Note that the system tables and the primary data file of a database, always belong to the “Primary File Group”. The Secondary data files may be assigned to a user file group by using the FILEGROUP keyword in the CREATE or ALTER DATABASE statement.

Pages and extents

The fundamental storage unit in SQL server is the “page”. A page is always 8KB and it is allocated at the primary and secondary data files. Note that I/O operations are performed at the page level.

SQL server has to efficiently manage these pages. For that purpose it groups 8 contiguous pages in a unit called the “extent”. Note that log files are not divided into pages, as it just consists of a series of log entries.

image_18

Extent Types

Extents are subdivided in mixed and uniform extents. A uniform extent contains pages from a single object, while a mixed extent contains pages from various objects. You see, an extent can contain up to 8 pages, thus a mixed type extent can contain up to 8 different page object owners.

image_19

Pages from a new table or index are usually allocated to a mixed extent. However, SQL server will allocate a new index to a uniform extent, as long as it has enough rows to generate 8 pages.

A table or index will switch to use uniform extents, for subsequent allocations, when it grows up to a point that uses 8 pages.

Page Types

Similarly to extents, pages are subdivided into categories: data pages, index pages, large object pages and other pages.

Data pages can store all data values except for those which identified as large types, such as text, xml, varchar(max), etc. These are stored in the “large object pages”.

As SQL server data rows cannot span to multiple data pages, large types are stored to a different page. This page type is called the “large object page”. Data values identified as large types, stored in the large object pages, are referred back to the original data page by using reference pointers, as shown in the picture below.

Some data types like varchar and varbinary, while can grow large, may be stored in the original data or index page. This can happen if the row size does not exceed the 8KB page limit. SQL server can dynamically move back and forth such columns, between data page and large object page according to the row size limit.