Sep 25, 2009

SQL Server Virtualization

Introduction

When I first heard of SQL Server Virtualization, I assumed it was a new feature of SQL Server that provides virtualization at the SQL Server level; that is not the case. SQL Server Virtualization is simply running a SQL Server instance on a virtual machine. It is one of the hottest topics these days, especially since Microsoft launched Hyper-V (its hypervisor) and VMware released ESX Server. In more technical terms, the software layer that runs virtual machines is known as a hypervisor. As far as SQL Server is concerned, there is nothing specific you need to configure in SQL Server for virtualization; the only configuration is done in the virtualization software and sometimes in the system BIOS. The original hypervisor, CP/CMS, was developed at IBM in the 1960s.

Types of Virtual Machines, Hypervisors, or Virtual Machine Monitors:

  • Type 1: Native or Bare-Metal.

  • Type 2: Hosted.



Type 2 hypervisors are the most commonly used virtual machine monitors. In this scenario, the virtual machine monitor (VM tool) is installed, or hosted, on an operating system, which is why this is also called hosted virtualization. The virtual machine therefore communicates with the host operating system for all of its resource requirements. VMware Server and Microsoft Virtual PC are some of the well-known virtual machine monitors of this type. Following is the basic architecture of hosted virtualization.
 
Type 1 hypervisors run on the host's hardware and communicate directly with the hardware instead of going through a host OS. The hardware resources and I/O are shared among the virtualized OSs, which is why this is also called bare-metal virtualization. It is also possible to assign specific devices to a specific virtualized OS; this is called partitioning. VMware ESX Server, Microsoft Hyper-V, Citrix Xen, and LynuxWorks are some of the products available in the market. Following is a basic architecture of bare-metal virtualization.



Why or Why Not Use SQL Server Virtualization

When total cost of ownership is your main concern, virtualization is the answer. Since several virtual machines run on a single set of hardware, the cost of buying new hardware can be reduced. Also consider that the performance of a virtualized SQL Server will not be as good as running it on a dedicated machine, because SQL Server virtualization does not include only SQL Server; it also includes the guest operating system and the other tools the OS and SQL Server require. An organization typically weighs three priorities: performance, integrity, and cost. If performance is the only concern, avoid virtualization, because at some point you may not have enough resources to run all the virtualized environments. If integrity or cost is the main concern, virtualization is worth considering.


Conclusion

Virtualization is a useful technology and can save an organization hundreds of thousands of dollars, though there are downsides too. If planned wisely, the downsides can be avoided.
For most organizations, the best plan is to deploy SQL Server virtualization on development, UAT, or other non-production environments, and to avoid it in production as much as possible unless the hardware is good enough to deliver acceptable performance with multiple operating systems and SQL Server instances running on it.
Installing multiple instances on the same operating system should also be considered while planning for SQL Server virtualization.

Send Email when No Rows are transferred in Data Flow Task of SSIS - SQL 2008

There are situations where a user must be notified that NO records were transferred to the destination while executing an SSIS package. There is no single way of doing this; however, I find the following to be the simplest and least complicated approach. The steps are:

1. Declare a variable at package scope in SSIS.



2. Open the Data Flow task where the zero row count needs to be validated. Add a Row Count transformation between the Source and the Destination, and assign the row count result to the variable declared above. Look at the following two images:

 



3. Go back to the Control Flow tab. From the Data Flow task that was modified above, add a Precedence Constraint and join it to a Send Mail Task; add another Precedence Constraint and join it to the task that should be executed next. At this point there will be two Precedence Constraints coming out of the same Data Flow task, both green (Success). Next, conditions need to be added to these Precedence Constraints so that an email is sent to the user when no records were transferred.

4. Remember that a variable has already been populated with the row count above; the same variable will be used here to decide whether any rows were returned. Configure the Precedence Constraint similar to what is shown in the figure below (see the expression sketch after these steps). Doing this will make sure that the email is sent only if the row count is 0.



5. Configure the second Precedence Constraint, the one that was added later, so that the next step is executed only if the row count is greater than 0, similar to the screen below:




6. Once these steps are completed, you will notice an 'fx' icon next to each of the Precedence Constraints.
7. Next, set a Precedence Constraint between the Send Mail Task and the task that should be executed when more than 0 records are transferred. This step can be skipped if the package should not proceed further when 0 rows are returned.
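
As a rough sketch of the expressions used in steps 4 and 5 (assuming the variable declared in step 1 is named RowCount; the name is only an illustration), the two Precedence Constraints would be configured like this:

Constraint to the Send Mail Task:
    Evaluation operation: Expression and Constraint
    Value: Success
    Expression: @[User::RowCount] == 0

Constraint to the next task:
    Evaluation operation: Expression and Constraint
    Value: Success
    Expression: @[User::RowCount] > 0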

After completing all the steps, the Control Flow will look like this:






Sep 21, 2009

Red Gate SQL Backup 5.4 vs SQL Server 2008 Backup


Red Gate 5.4 Backup vs. SQL 2008 Backup vs. SQL 2008 Backup with Encryption

RedGate 5.4 Encrypted Compressed Backup

How it works?

1. Back up the database,
2. Compress the backup,
3. Encrypt the backup with a password,
4. The password needs to be supplied while restoring/backing up the database, or the password needs to be placed in the script in the job.

Database size for testing

7 GB

Backup Time

134 seconds

Backup Size

715 MB

Restore Time

247 seconds

Encryption

Yes, 256-bit.

Compression Level

3 - Full (available options: 1, 2, 3)

Advantages

1. Encrypts the backup file, not the database.
2. Flexible compression levels can be used.
3. Compression ratio is better than SQL Server's native compression.
4. Backup can be split into multiple files.

Limitations

1. Requires purchasing another license.
2. The password is displayed in the backup/restore script, so anyone who has access to view the jobs can see the password.

Tested Script

DECLARE @exitcode int
DECLARE @sqlerrorcode int
exec master..sqlbackup N'-SQL "BACKUP DATABASES [myDB]  TO DISK = ''C:\SQL Data\backup\'' WITH NAME = '''', DESCRIPTION = '''', ERASEFILES = 5, COMPRESSION = 1, THREADS = 1,MAILTO = ''DBATeam@Domain.com''"', @exitcode OUTPUT, @sqlerrorcode OUTPUT
IF (@exitcode <>0) OR (@sqlerrorcode <> 0)
BEGIN
  RAISERROR ('SQL Backup job failed with exitcode: %d  SQL error code: %d', 16, 1, @exitcode, @sqlerrorcode)
END

SQL 2008 Ent Full Backup Compressed

How it works?

1. Back up the database,
2. Compress the backup,
3. Password protection is available, however without encryption. See the URL below for more information - http://www.mssqltips.com/tip.asp?tip=1108 (a short example follows these steps).
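
As a minimal sketch of the password-protected (but unencrypted) native backup described in the linked tip, the SQL Server 2008 syntax looks like this; the file name and password are placeholders, not taken from the tested script below:

BACKUP DATABASE MyDB TO DISK = 'C:\SQL Data\backup\myDB_pwd.bak'
WITH PASSWORD = '*****', COMPRESSION
GO

-- The same password must then be supplied on restore:
RESTORE DATABASE MyDB FROM DISK = 'C:\SQL Data\backup\myDB_pwd.bak'
WITH PASSWORD = '*****', RECOVERY
GO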

Database size for testing

7 GB

Backup Time

120 Seconds

Backup Size

865 MB

Restore Time

246 Seconds

Encryption

No.

Compression Level

Full, Only Option

Advantages

1. No need to purchase an extra license; this is part of the SQL 2008 license.

2. Password protection is available for backups.

3. Supported by Microsoft as part of a Microsoft product.

4. Compresses the backup.

Limitations

1. Does not support Encryption.

Tested Script

BACKUP DATABASE MyDB TO DISK= 'C:\SQL Data\backup\myDB.bak'
WITH COMPRESSION
GO

RESTORE DATABASE MyDB FROM DISK = 'C:\SQL Data\backup\myDB.bak'
WITH RECOVERY,
MOVE 'DATA1' TO 'C:\SQL Data\Test\DATA1.mdf',
MOVE 'LOG1' TO 'C:\SQL Data\Test\LOG1.ldf',
MOVE 'INDEX1' TO 'C:\SQL Data\Test\INDEX1.mdf'
GO

SQL 2008 Ent Full Backup Encrypted Compressed

How it works?

1. SQL 2008 supports encryption at the database level (Transparent Data Encryption), and encryption has to be enabled the first time using the following steps:

     a. A master key has to be created, followed by a database certificate.

     b. Using the certificate, a Database Encryption Key can be created for the database to be encrypted.

     c. Encryption can then be enabled on the database (a CPU-intensive task).

2. Back up the encrypted database.

3. Compress the backup.

Database size for testing

7 GB

Backup Time

218 Seconds

Backup Size

4.73 GB

Restore Time

337 seconds

Encryption

Yes, 256-bit.

Compression Level

Full, only option

Advantages

1. No need to supply a password in the restore or backup command; SQL Server uses the built-in database certificate for security.

2. No need to purchase an extra license; this is part of the SQL 2008 license.

Limitations

1. CPU-intensive task.

2. Backup compression ratio is the lowest of the three options.

3. If the certificate is lost, the data can be lost (see the sketch below on recreating the certificate from its backup).
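
This is why step 3 of the tested script below backs up the certificate. As a minimal sketch (the file names and password mirror the ones used in that script and are only placeholders), the certificate can be recreated on another server from those backup files before restoring the encrypted database:

-- Run in the master database of the destination server; a master key must already exist there.
CREATE CERTIFICATE dbCERT
FROM FILE = 'C:\SQL Data\DB CERTIFICATE\db_bkupcert_1.cert'
WITH PRIVATE KEY(
 FILE = 'C:\SQL Data\DB CERTIFICATE\db_bkupcert_1.pvk',
 DECRYPTION BY PASSWORD = '*****');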

Tested Script


--1. CREATE MASTER ENCRYPTION KEY
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '*****'


--2. CREATE DATABASE CERTIFICATE
CREATE CERTIFICATE dbCERT WITH SUBJECT = 'Database Certificate TDE'

--3. BACKUP CERTIFICATE
BACKUP CERTIFICATE dbCERT TO FILE = 'C:\SQL Data\DB CERTIFICATE\db_bkupcert_1.cert'
WITH PRIVATE KEY(
 FILE = 'C:\SQL Data\DB CERTIFICATE\db_bkupcert_1.pvk', ENCRYPTION BY PASSWORD = '*****');

--4. CREATE DEK -- DATABASE ENCRYPTION KEY
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE dbCERT


--5. ENABLE ENCRYPTION ON SELECTED DATABASE
ALTER DATABASE myDB SET ENCRYPTION ON

--6. THE ABOVE COMMAND STARTS A BACKGROUND PROCESS THAT ENCRYPTS THE DATABASE. MONITOR THE STATUS OF THE
--ENCRYPTION PROCESS USING THE FOLLOWING COMMAND.

SELECT db_name(database_id), encryption_state,percent_complete
FROM sys.dm_database_encryption_keys

Sep 8, 2009

Improving database performance with Table Partitioning


Table partitioning is a new concept introduced in SQL Server 2005. Partitioning gets a performance benefit from SQL Server by splitting a table into multiple parts on the basis of values in a pre-defined column. However, in the old days of SQL Server (6.5-2000), good developers knew how to get a similar benefit by splitting tables themselves: they created multiple tables instead of one and wrote check constraints to group the data across the tables. Let me try to explain this concept using the following example:


create table Sales_1
(
SKU varchar(20),
SaleDate datetime
)

alter table Sales_1
ADD CONSTRAINT sales_date_1 check(SaleDate < '2009-01-01')

create table sales_2 (
SKU varchar(20),
SaleDate datetime
)

alter table sales_2 ADD CONSTRAINT sales_date_2 check (saledate >= '2009-01-01')

We have two tables, sales_1 and sales_2, and on each table we have set up a check constraint. On one table we added a constraint to keep the records prior to 2009, and in the other table we keep the records where the sale belongs to 2009 or later.


create view sales
as
select * from sales_1
union all
select * from sales_2

We created a view, sales, in which we used UNION ALL to combine the rows from both tables. If you query data for dates prior to 2009, only the sales_1 table will be hit; otherwise the query will hit sales_2. This approach is useful when we think of archiving data, since in this case we have effectively archived our pre-2009 data and that table will not get hit unless we write a query that needs it.
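
For example, a query like the following against the view only needs to touch sales_1, because the check constraint on sales_2 rules that table out (the table and column names are the ones from the example above):

select * from sales
where SaleDate < '2009-01-01'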

The main limitation of this approach was inserting rows through the view.

In SQL 2005, it is quite easy to achieve the same thing in a systematic way using table partitioning. By default, SQL Server creates one partition for each table in the database; however, more partitions can be added to a table. Logically speaking, partitioning is a horizontal split of the rows: each set of rows is saved into the partition whose rule matches the value in the specified column.

Let's say your database holds ten years of data with millions of rows; you can split the data (rows) into ten different partitions, one per year. That makes the data logically well organized and also boosts performance. It is important to know that to get a performance benefit, partitioning should be applied to large tables, where performance is generally poor while selecting and storing data.

Well, once partitioning is in place, you don't need to worry about putting data into the right partition, and while selecting data you don't need to write more than one query. It is SQL Server's job to take care of the partitioning rules.

Prior to partitioning a table, two things need to be created: first a partition function, and second a partition scheme based on that function. Once the partition function and scheme are in place, the last step is to create the table on the partition scheme.

Partition Function



In simple terms, a partition function defines the logical boundaries for the data. Let's take the same example as the partitioned view: I want to put the sales information for 2008 into the first partition and the sales for 2009 into the second partition. To do this, I first need to decide which data type to partition on; of course, in this case it will be the datetime data type. One thing needs to be understood here: partition functions are independent of any table, and a partition function can be referenced by any table.


Now I know that I have to define the groups, or boundaries, i.e. 2008-1-1 to 2008-12-31, then 2009-1-1 to 2009-12-31. There are two types of partition functions: left range functions and right range functions.


Left Range

Data equal to a boundary value goes into the left (lower) partition. If we create a left range function with boundaries of 2008-12-31 and 2009-12-31, it creates three partitions: partition 1 holds values equal to or less than 2008-12-31; partition 2 holds values greater than 2008-12-31 and up to 2009-12-31; and any value greater than 2009-12-31 goes into partition 3.


CREATE PARTITION FUNCTION part_sales(DATETIME)
AS RANGE LEFT FOR VALUES
('2008-12-31','2009-12-31')
GO



Right Range

The only difference between a left range and a right range is that with a right range, data equal to a boundary value moves into the next (right) partition. To accomplish the same result as the left range above, the right range can be defined in the following way.


CREATE PARTITION FUNCTION part_sales(DATETIME)
AS RANGE RIGHT FOR VALUES
('2008-1-1', '2009-1-1')
GO


Partition Schemes

This is not all. Once the partition function is created, the next step is to decide into which filegroup the data for each partition should go. The function above defines two boundaries, and therefore three partitions, so to get the best performance out of it we can create three different filegroups. Let's assume that I have created three filegroups, fg1, fg2 and fg3, on three different storage arrays (a sketch of how they might be added is shown below). The partition scheme decides which filegroup the data goes to for each partition.
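
As a minimal sketch of that assumption (the database name, file path, and size are placeholders, not part of the original example), the filegroups could be added like this:

ALTER DATABASE SalesDB ADD FILEGROUP fg1
ALTER DATABASE SalesDB ADD FILEGROUP fg2
ALTER DATABASE SalesDB ADD FILEGROUP fg3
GO

-- Add at least one data file to each filegroup; repeat for fg2 and fg3 on their own drives.
ALTER DATABASE SalesDB ADD FILE
 (NAME = 'fg1_data', FILENAME = 'C:\SQL Data\fg1_data.ndf', SIZE = 100MB)
TO FILEGROUP fg1
GO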


CREATE PARTITION SCHEME ps_Sales
AS PARTITION part_sales
TO(fg1,fg2,fg3)
GO





Partition Table

Once the partition function and partition scheme are created, the next step is to create the partitioned table. In this example the table is created with a nonclustered primary key; the clustered index, when added to the table, partitions the table on the basis of the partition scheme.

CREATE TABLE sales(
order_id INT NOT NULL PRIMARY KEY NONCLUSTERED,
productID int NOT NULL,
Qty INT NOT NULL,
Sale_agent VARCHAR(50) NOT NULL,
Sale_Final_Date datetime NOT NULL
)

CREATE CLUSTERED INDEX idx_Sales_Date
ON sales(sale_final_date)
ON ps_Sales(sale_final_date);


When you query the partitioned table, no special query is required; the query optimizer itself chooses the right partition to return the results.
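
If you want to verify how the rows were distributed, one small sketch (using the function and table defined above) is to ask SQL Server which partition each row landed in via the $PARTITION function:

SELECT $PARTITION.part_sales(Sale_Final_Date) AS partition_number,
 COUNT(*) AS row_count
FROM sales
GROUP BY $PARTITION.part_sales(Sale_Final_Date)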

I will write more blog posts on partitioning that show how to modify a partition, remove a partition, and so on.

Optimizing Indexes with Execution Plans
