Calculating SQL Server Data Compression Savings

SQL Server 2008 Enterprise edition comes with an opportunity for storing table or index data in a compressed format which may save huge amount of storage space and - much more important - IO requests and buffer pool utilization. There's two different options for data compression, namely Row and Page level compression. This blog post is not concerned with how these two work internally and will also not explain the differences between the two. If you like to know more about this, you find much of useful information on the internet - including links to further articles (e.g. here, here, and here).

Whether compression is worth or not isn't an easy question to answer. One aspect that has to be taken into account is certainly the amount of storage that may be saved by storing a distinct table or index in any of the two compressed formats. SSMS offers a Data Compression Wizard than can provide storage-saving estimates for row or page level compression. From the context menu for a table or index just open Storage/Manage Compression. In the Combo box at the top select the compression type (Row or Page) and press the Calculate Button at the bottom. Here's a sample of a calculated saving for an index:

image

Unfortunately, SSMS does not offer an option for calculating estimated savings for more than one table or index at once. If you, let's say, would like to know the estimated storage savings of page level compression for you largest 10 tables, there's no GUI support in SSMS that will assist you in finding an answer. This is, where the stored procedure sp_estimate_data_compression_savings comes in handy. This procedure - as you may have guessed from its name - provides estimated savings for row or page level compression for any table or index. You have to provide the table or index as a parameter to the procedure. In other words: The procedure will only calculate the estimations for one table or index at a time. If you want to retrieve the calculations of more than one table or index as a result set, there's some more work to do, since the procedure has to be invoked multiple times. Here's a script that calculates the estimated savings of page level compression for the database in context.

-- Determine the estimated impact of compression
-- NOTE: This script is only for SQL Server Enterprise and Developer edition.

set
nocount on

-- We create a temp table for the result
if (object_id('tempdb..#comp', 'U') is not
null)
drop table #comp
go
create table #comp
(
object_name
sysname
,schema_name
sysname
,index_id
int
,partition_number
int
,[size_with_current_compression_setting (KB)]
bigint
,[size_with_requested_compression_setting (KB)] bigint

,[sample_size_with_current_compression_setting (KB)] bigint

,[sample_size_with_requested_compression_setting (KB)]
bigint
)
go

-- Calculate estimated impact of page level compression for all
-- user-tables and indexes in all schemas.
-- NOTE:
-- 1) To get the estimated impact of row level compression change the last parameter
-- of sp_estimate_data_compression_savings to 'row' instead.
-- 2) We don't care about partitioning here. If this is important for you,
-- you have to modify forth parameter of sp_estimate_data_compression_savings.
-- Please refer to BOL.

declare @cmd nvarchar(max
)
set @cmd =
''
select @cmd =
@cmd
+
';insert #comp exec sp_estimate_data_compression_savings '''
+ schema_name(schema_id)+''','''
+ name + ''',null, null, ''page'''
from sys.
tables
where objectproperty(object_id, 'IsUserTable') = 1
exec (@cmd)

;
-- Do some further calculations for a more meaningful result
with
compressionSavings as
(
select quotename(schema_name) + '.' + quotename(object_name) as
table_name
,
index_id
,
[size_with_current_compression_setting (KB)]
,
[size_with_requested_compression_setting (KB)]
,cast(
case
when [size_with_current_compression_setting (KB)] = 0 then
0
else 100.0*(1.0-
1.0
*
[size_with_requested_compression_setting (KB)]
/[size_with_current_compression_setting (KB)]
)
end as decimal(6,2)) as
[Estimated Savings (%)]
from #comp
)
select cs.
table_name
,isnull(i.name, i.type_desc) as
index_name
,cs.
[size_with_current_compression_setting (KB)]
,cs.
[size_with_requested_compression_setting (KB)]
,cs.[Estimated Savings (%)]

from compressionSavings as
cs
left outer join sys.indexes as
i
on i.index_id = cs.
index_id
and i.object_id = object_id(cs.table_name, 'U'
)
order by cs.[Estimated Savings (%)]
desc

-- Get rid of the temp table
drop
table #comp
go

The script calculates the impact of Page level compression but may easily be adapted to consider Row level compression instead. Please read the comments inside the script. Also, please notice that the script will only run on SQL Server Enterprise and Developer edition. All other editions don't provide the opportunity for data compression.

Here's a partial result retrieved from running the script against the AdventureWorksDW2008R2 database.

image

If you execute the script, please be aware that it may produce some extensive I/O. Running the script against your production database at business hours wouldn't be a very good idea therefore.