Nov 9, 2015

In Memory OLTP or Memory Optimized OLTP

If an enterprise server needs an upgrade to boost performance, memory is usually the first resource to consider, because it is a quick and cheap upgrade. Memory is not only inexpensive but also easy to install, and if the server supports hot-add memory, it can even be added without bringing the server down for maintenance.

Starting with SQL Server 2014, Microsoft introduced In Memory Tables, also known as Memory Optimized Tables or In Memory OLTP. Memory Optimized Tables can take advantage of additional memory (RAM) to provide better performance.

As the name suggests, the data in these tables stays in memory while the SQL Server service is running. This does not mean that the data vanishes when the server is powered off or the service is interrupted: SQL Server persists a copy of the data through the write-ahead transaction log and checkpoint files in the memory optimized filegroup, so if the data has to be retained it is loaded back into the memory optimized tables when the database comes back online.

While defining the table you can specify whether its durability will be schema only (SCHEMA_ONLY) or schema and data (SCHEMA_AND_DATA).
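For example, a cache or staging table that can afford to lose its rows on a restart could be declared with SCHEMA_ONLY durability. Here is a minimal sketch (the table name and columns are made up for illustration; the full table syntax is shown later in this post):

-- SCHEMA_ONLY durability: the table definition survives a restart, the rows do not
CREATE TABLE dbo.SessionCache
(
SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000),
Payload VARCHAR(200) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)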

Memory Optimized Tables are a brand new concept in SQL Server 2014, and their allocation works a little differently from regular tables and indexes. Unlike regular tables, data in a Memory Optimized Table is stored in data streams rather than in pages linked through an IAM (Index Allocation Map) extent chain. Because there is no IAM chain, each Memory Optimized Table requires at least one index to link its rows together.

The requirements for Memory Optimized Tables are:

1. A filegroup designated for Memory Optimized Data.
2. A data file (container) in the Memory Optimized filegroup.
3. Service Broker (or any feature that enables event notifications) disabled for that database, since it is not supported.
4. At least one index per Memory Optimized Table. The index can be a Hash Index or a Range Index. Hash Indexes use buckets to organize and point to the underlying data, while Range Indexes are nonclustered indexes based on the Bw-tree architecture.

You can either create a new database with a Memory Optimized filegroup from the start, or add a Memory Optimized filegroup and data file to an existing database.
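If you are not sure whether an existing database already has such a filegroup, a quick check against sys.filegroups (run inside the database in question) will tell you. This is just a sketch:

-- A row with type 'FX' means the database already has a memory optimized filegroup
SELECT name, type, type_desc
FROM sys.filegroups
WHERE type = 'FX'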

Here is a small example to introduce Memory Optimized OLTP tables.


Create Database

CREATE DATABASE [MYDB]
CONTAINMENT = NONE
ON PRIMARY
(NAME ='DataFile',
FILENAME='H:\SQLDATAFILES\MYDB_DATAFILE.MDF',
SIZE = 4096KB,
MAXSIZE = 10240KB,
FILEGROWTH=1024KB
)
LOG ON
(NAME = 'LogFile',
FILENAME='H:\SQLDATAFILES\MYDB_LOGFILE.LDF',
SIZE=4096KB,
MAXSIZE=UNLIMITED,
FILEGROWTH=10%
)


Disable Broker


IF EXISTS (SELECT * FROM SYS.DATABASES WHERE NAME = 'MYDB'  AND IS_BROKER_ENABLED = 1)
          ALTER DATABASE MYDB SET DISABLE_BROKER


Add Memory Optimized File Group and Data File


ALTER DATABASE MYDB
ADD FILEGROUP [MYDB_MEMORY_OPTIMIZED_GROUP] CONTAINS MEMORY_OPTIMIZED_DATA

ALTER DATABASE MYDB
ADD FILE
(NAME='MYDB_MEM_OPT_DAT',
FILENAME='H:\SQLDATAFILES\MYDB_MEM_OPT_DAT',
MAXSIZE=UNLIMITED

) TO FILEGROUP [MYDB_MEMORY_OPTIMIZED_GROUP]


Create Table

USE MYDB;

CREATE TABLE MYDB_INMEMORY_TABLE
(
COLUMN1 INT IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
COLUMN2 INT NOT NULL INDEX INDEX1 HASH WITH (BUCKET_COUNT=1000),
COLUMN3 CHAR(100) COLLATE LATIN1_GENERAL_100_BIN2 NOT NULL INDEX INDEX2
)
WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_AND_DATA)
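Once the table exists, it behaves like any other table for basic DML. A quick smoke test, just as a sketch:

-- Insert a couple of rows and read one back via the hash-indexed column
INSERT INTO MYDB_INMEMORY_TABLE (COLUMN2, COLUMN3) VALUES (10, 'FIRST ROW')
INSERT INTO MYDB_INMEMORY_TABLE (COLUMN2, COLUMN3) VALUES (20, 'SECOND ROW')

SELECT COLUMN1, COLUMN2, COLUMN3
FROM MYDB_INMEMORY_TABLE
WHERE COLUMN2 = 10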


It is all done and ready to use. However, it is recommended to create a resource pool for the database's memory so that memory utilization can be capped and monitored.

Create Resource Pool

IF NOT EXISTS(SELECT * FROM SYS.RESOURCE_GOVERNOR_RESOURCE_POOLS WHERE NAME = 'POOL_MYDB')
BEGIN
CREATE RESOURCE POOL POOL_MYDB WITH (MAX_MEMORY_PERCENT=10);
ALTER RESOURCE GOVERNOR RECONFIGURE;
END

Map Resource Pool and Database

EXEC sp_xtp_bind_db_resource_pool 'MYDB', 'POOL_MYDB';
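To confirm the mapping, the resource_pool_id column in sys.databases can be joined to the Resource Governor pools. A sketch (note that the binding only takes effect after the database has been taken offline and online again, as shown next):

SELECT d.name AS DatabaseName, rp.name AS ResourcePoolName
FROM sys.databases AS d
LEFT JOIN sys.resource_governor_resource_pools AS rp ON rp.pool_id = d.resource_pool_id
WHERE d.name = 'MYDB'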

Take Database Offline and Online again to apply the changes

USE MASTER;
ALTER DATABASE MYDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE MYDB SET ONLINE;
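Once the database is back online, the memory consumed by its memory optimized tables can be monitored through the xtp DMVs. A sketch using sys.dm_db_xtp_table_memory_stats:

USE MYDB;

-- Memory allocated and used by each memory optimized table in this database
SELECT OBJECT_NAME(object_id) AS TableName,
memory_allocated_for_table_kb,
memory_used_by_table_kb
FROM sys.dm_db_xtp_table_memory_stats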



Please leave your feedback below if you liked this article or have any comments.


Oct 6, 2015

About Me



Swarndeep Singh is a SQL Server Solutions Specialist with expertise in Performance Tuning, Database Design, Cloud Computing, Database Security, Data Analysis, BI and Auditing. He uses scripting languages including T-SQL, PowerShell and Python in his day-to-day work. He is also fond of working with other technologies such as PostgreSQL, C#, Web Services, AWS and Azure.


In his free time he likes to write posts on his blog at https://www.swarndeep.com. Swarndeep lives in Georgia, USA and works full time as a Database Manager.


Oct 2, 2015

A glance at SQL Server Statistics

Statistics are sometimes ignored when it comes to performance tuning, yet they can play a big role in identifying recurring slowness.

Before executing a query, the query optimizer needs some information in advance, such as the number of records, page density and available indexes. If up-to-date information is available, the query optimizer can choose the best execution plan for the query.

Before going further, here is an analogy: an online store makes sure it has enough stock in its warehouse, ready to ship, to meet demand. If its systems are broken and the online ordering system is unaware of the stock quantities in the warehouse, the ordering system will probably order the item directly from the distributor, which means a longer delivery time.

On the other hand, if the online store has outdated information saying the warehouse has enough stock, the online order will be mistakenly sent to the warehouse and will then have to be re-routed to the distributor.

Both scenarios have a bad impact because the information was outdated somewhere.

This analogy may not fit SQL Server statistics 100%, but it is very close.

Let's carry the same example into SQL Server. Store A has 10 warehouses in a state, at different zip code locations. The store decides to merge all warehouses into one big warehouse to reduce overhead cost.

After the physical change, this information is updated in the database simply by updating the affected zip codes to the single remaining zip code. Note that there is a pre-existing index on the ZipCode column.

The order entry team complains that the system has been a little sluggish since the zip codes were merged. The cause of this sluggishness lies behind the mass update: most probably the statistics are now outdated. Let's walk through it with an example:


  • Create a table:
CREATE TABLE [dbo].[Warehouse_Inventory](
[ItemCode] [varchar](50) NOT NULL,
[ZipCode] [int] NOT NULL
) ON [PRIMARY]
GO

  • Now insert some data:

-- Insert 199,999 rows, one per zip code from 1 to 199,999
declare @zip int, @inventory int
set @zip = 1
while (@zip < 200000)
begin
set @inventory = @zip + 22.5   -- implicitly converted to int; just generates an inventory number
insert into [dbo].[Warehouse_Inventory]
select 'Inventory ' + cast(@inventory as varchar(10)), @zip
set @zip = @zip + 1
end


  • Create a non-clustered index on zip code:

CREATE NONCLUSTERED INDEX [IX_Warehouse_Inventory_ZipCode] ON [dbo].[Warehouse_Inventory]
(
[ZipCode] ASC
)
GO

  •  Let's run our first query to get inventory from location 30001. Only 1 row should be returned.
select zipcode from [Warehouse_Inventory] where ZipCode = 30001 OPTION (RECOMPILE)


The query used the index as expected; now let's look at the statistics behind it. In the screenshot below, the Estimated and Actual number of rows match: what the query optimizer estimated is exactly what was returned. This is what we want to see.

Execution Plan after creating Index
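To look at the underlying statistics object directly (header, density vector and histogram), DBCC SHOW_STATISTICS can be used. A sketch, using the index created above (the statistics object carries the index name):

DBCC SHOW_STATISTICS ('dbo.Warehouse_Inventory', 'IX_Warehouse_Inventory_ZipCode')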


Now the zip codes are merged: about 90% of the warehouses move to one zip code, 30001. In the table, we need to update those 90% of the zip codes accordingly.


update [dbo].[Warehouse_Inventory]
set zipcode = 30001
where zipcode < 180000

Without making any further changes (no index rebuilds, etc.), running the same query again should now return about 180,000 rows. Let's look at the execution plan the query optimizer produces for the same query.

The query optimizer still estimated only 1 row to be returned, even though roughly 180,000 rows actually came back.



Now, let's fix the statistics by updating them manually:

UPDATE STATISTICS dbo.Warehouse_Inventory;

After updating the statistics, the number of rows estimated by the query optimizer is about 180,000. This is what we wanted to see in the execution plan.


If the query optimizer works from a wrong estimate, it can pick a poor execution plan, and that is what causes the performance sluggishness.

Here is a query that can be used to find out how many modifications have been made since the statistics were last updated. It can help in devising a custom statistics maintenance solution for tables that are modified heavily throughout the day in a production application.

SELECT
[sp].[object_id],
OBJECT_NAME([sp].[object_id]) AS 'ObjectName',
[sp].[stats_id] AS 'StatID',
[st].[name] AS 'StatName',
[sp].[last_updated] AS 'LastTimeStatUpdated',
[sp].[modification_counter] AS 'NumberOfModifications'
FROM [sys].[stats] AS [st]
OUTER APPLY sys.dm_db_stats_properties ([st].[object_id],[st].[stats_id]) AS [sp]
where OBJECT_NAME([sp].[object_id]) is not null
and OBJECT_NAME([sp].[object_id]) not like 'sys%'
and OBJECT_NAME([sp].[object_id]) in ('Warehouse_Inventory')

The example above is very simple and you will probably not see a major performance difference from it; it was just meant to demonstrate how to approach statistics.

SQL Server also provides some options at the database level: Auto Update Statistics and Auto Update Statistics Asynchronously.

Auto Update Statistics: if, while compiling a query, the statistics are found to be outdated beyond a threshold, they are updated before the query is executed.

Auto Update Statistics Asynchronously: if, while compiling a query, the statistics are found to be outdated beyond a threshold, the query runs with the existing statistics and an update request is sent to the engine in the background.

One or both of these options can be switched on, as shown in the sketch below. Since every server has a different load, environment and usage, be a little careful before choosing them. In general, OLTP workloads may respond very well to the asynchronous option, while data warehouses tend to do better with the synchronous option.
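A minimal sketch of switching these options on, using MYDB as a placeholder database name (AUTO_UPDATE_STATISTICS_ASYNC only has an effect when AUTO_UPDATE_STATISTICS is also ON):

ALTER DATABASE MYDB SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE MYDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;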

