Thursday, 31 July 2008

SQL Server 2005 INDEXING

I like the quote said by "William James" (from MCITP Server 2005 Database Administration book):

~The art of being wise is the art of knowing what to overlook~

Anyway... here are the gists for INDEXES.

Indexes: Clustered + Nonclustered; Heap = the absence of index on a column

Clustered

Like a dictionary or phone book, the data is kept in *sorted* order

Leaf level rows = Data rows

Nonleaf level pages and data are in seperate storage areas

Only one clustered index on a table

PRIMARY KEY = Clustered by default

Nonclustered

Like the index in the back of the book.

Storage: 8 pages within an extent; 8KB for each page; so total 64KB for an extent;

When SQL Server reads data off the disk, it reads full extents at a time.

We can have up to 249 nonclustered indexes on a table.


Columns that are good candidates for indexes are _____

Primary keys and foreign keys

Cols that are frequently referenced in queries

Cols that are searched for ranges of key values (with WHERE clauses)

Cols that need to be sorted (used in the ORDER BY clause)

Cols that are frequently grouped together during aggregation (i.e., AVG func)


Columns that should not be indexed are _____

Cols that are seldom referenced in queries

Cols that aren't selective, or have few unique values

LOB (Large Object) data types {text + ntext + varchar(max) + nvarchar(max) + varbinary(max) + xml + image}

Composite Index

> 1 column

covering a query

When creating a composite index, one column is referred to as the key column, and the other columns are referred to as the included columns

Index *attributes* can also be learned at http://www.windowsitlibrary.com/Content/77/17/1.html

Fill Factor

fill factor 0 / 100 = 100 percent full
EXAM TIP The fill factor becomes a significant factor in the ultimate size of the index. Consider an index that’s 20MB with a fill factor of 100 percent.
How big would the index be if we set the fill factor at 50 percent, or allowed the pages to be half full? It would be twice as large.


Fragmentation

If you want to prevent fragmentation => use a fill factor appropriately for the number of inserts you expect so that you can prevent, or at least minimize page splits

Pad Index

It directs the system to use the same % as the fill factor when filling the nonleaf levels of the index.

Calculating Size

(S x R) / F

S = Size of index row
R = Number of rows in index
F = Fill factor

TempDB

If you have to frequently rebuild indexes and you leave the default of sorting the results in tempdb, you want to have the initial size of the tempdb set to a little bit larger than the largest index.

If tempdb doesn't start at the size that it needs to be, then Autogrowth will cause it to grow to the size needed. If it starts at 8MB and need to grow to 1GB insize, it'll have to resize itself 51 times!

While this will work, it is much more resource-intensive to have the database constantly resize itself than to set it to the size it needs to be originally. (from MCITP)

Creating an Indexed View

Three basic rules:

  • must be created with SCHEMABINDING option
  • objects must be referenced with a two-part name (i.e. schema.table).
  • the view must be deterministic (it cannot contain any non-deterministic funcs)

Statistics in SQL Server

Density

The % of duplicate rows in an index

Selectivity

The % of rows accessed or returned by a query

Viewing and Updating Statistics

Enable settings by default:

  • AUTO CREATE STATISTICS

  • AUTO UPDATE STATISTICS

DBCC SHOW_STATISTICS (tableName, indexName)

EXAM TIP A common symptom indicating statistics may be out of date is when queries begin to run slower than normal.

EXAM TIP A sudden jump in the number of stored procedures that are being recompiled could indicate that statistics have been updated.

Full-Text Index

Created on textual data to increase performance and capabilities with searches.

Steps:

Verify the database is enabled for full-text searches;

Create a full-text catalog;

Create a full-text indexes;

DataTypes = char + varchar + nvarchar + varchar(max) + varbinary(max) + XML + image

Clustered or nonclustered indexes are not useful with full-text searches

E.G.
SELECT DATABASEPROPERTYEX('AdventureWorks', 'IsFullTextEnabled')

EXEC [AdventureWorks].[dbo].[sp_fulltext_database] @action = 'enable'

CREATE FULLTEXT CATALOG ftCatalog as DEFAULT

CREATE FULLTEXT INDEX ON HumanResources.JobCandidate(Resume) KEY INDEX
PK_JobCandidate_JobCandidateID;

SELECT *
FROM HumanResources.JobCandidate
WHERE CONTAINS(Resume, ' "mechanical engineering" ');

Database Engine Tuniting Advisor (DTA)
● Recommend indexes (clustered and nonclustered) to create
● Recommend indexes to drop
● Recommend indexed views
● Analyze the effects of the proposed changes
● Provide reports summarizing the changes and the effects of the changes

Tuning Options
Physical Design Structures (PDS) in use database => what indexes / indexed views

Partitioning: Full + Aligned

PDA to keep in database => existing indexes / indexed views to be kept

Index Maintenance
Enable to drop unused indexes to eliminate the needless overhead.

DMV (Dynamic Management Views + Functions) :

  • Server-scoped -> VIEW SERVER STATE

  • Database-scoped -> VIEW DATABASE STATE
Fragmentation
logical (pages are out of order)
+
extent (extents are out of order)

TIP The DBCC SHOWCONFIG command has been deprecated in SQL Server 2005 and replaced with the sys.dm_db_index_physical_stats dynamic management function.

Degragmenting a Fragmented Index
Reorganising = ALTER INDEX REORGANIZE {online; statistics are not updated during this operation}

Rebuilding = ALTER INDEX REBUILD {online / offline; defaults -> offline}

Recreating = The index can be dropped and re-created from scratch. {offline;}

In SQL Server 2005, it is common to use the tempdb when rebuilding indexes. Because of this the initial size of the tempdb database is an important consideration. Set it to be a little larger than the largest index.

To easily determine how useful an index is, you can use the sys.dm_db_index_usage_stats dynamic management view.

DBCC (Database Console Commands)
DBCC CHECKDB = to check database integrity;
DBCC OPENTRAN = to check for open transactions within a database;
DBCC SQLPERF = to get information on the transaction log;

Deprecated command Alternative recommendation
INDEXDEFRAG => ALTER INDEX
DBREINDEX => ALTER INDEX
DBREPAIR => DROP DATABASE
SHOWCONTIG => sys.dm_db_index_physical_stats.
EXAM TIP Expect to see some of these deprecated commands as incorrect answers on the test. If Microsoft has deprecated the commands, now saying we should not use them, don’t expect them to be a correct answer on the test.

Filegroups

Single filegroup <- the .mdf file (named databaseName_data by default)

First filegroup - Primary <= primary filegroup

*The transaction log is a seperate file & is not kept in a filegroup at all*

Additional filegroups can be created for the following reasons:

Performance + Recoverability + Partitioning

possibilities:

  • One file and One filegroup
  • Multiple files and One filegroup
  • Multiple files and Multiple filegroups

extension of the secondary file is .ndf

ALTER DATABASE Chapter7 ADD FILEGROUP Secondary;
ALTER DATABASE Chapter7
ADD FILE (NAME = 'SecondFile', FILENAME = 'C:\Program Files\Microsoft
SQL Server\MSSQL.1\MSSQL\DATA\ SecondFile.ndf',
SIZE = 1024KB, FILEGROWTH = 1024KB)
TO FILEGROUP Secondary

Here is a good article about the difference between VARCHAR and NVARCHAR:

http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1266201,00.html

No comments: