For a SQL server DBA – How to start Azure Learning – Part 4 (Database Sharding)

In continuation to my previous post , I will write about SQL Database sharding(For SQL Azure databases, PaaS) in this post. As we know, we have various tiers of SQL Azure databases :

Service Tier/Performance Level DTU MAX DB Size Max Concurrent Requests Max Concurrent Logins Max Sessions Benchmark Transaction Rate Predictability
Basic 5 2 GB 30 30 300 16,600 transactions per hour Good
Standard/S0 10 250 GB 60 60 600 521 transactions per minute Better
Standard/S1 20 250 GB 90 90 900 934 transactions per minute Better
Standard/S2 50 250 GB 120 120 1,200 2,570 transactions per minute Better
Standard/S3 100 250 GB 200 200 2,400 5,100 transactions per minute Better
Premium/P1 125 500 GB 200 200 2,400 105 transactions per second Best
Premium/P2 250 500 GB 400 400 4,800 228 transactions per second Best
Premium/P3 1000 500 GB 1,600 1,600 19,200 735 transactions per second Best

If we see in the above table, we have a limit on the size of the database (Maximum size if 500 GB) . What If , the database size which has to be hosted in SQL Azure is bigger than 500 GB or you need more throughput to handle more transactions.

Generally, for on-premise servers – when we need more throughput , we generally talk about two things:

1. Scale Up – It means we add more hardware for better throughput i.e. add more CPUs/RAM/Disk space as per the requirement.
2. Scale Out – It means move some of the workload to other server e.g. moving read only traffic to AlwaysON secondary server.

For Azure servers – the definition is:

1. Scale Up – It means we move to higher tier for better throughput e.g. if Standard/S3 was being used then upgrade the tier to P3. But sometimes moving to the highest tier is not enough and we don’t have any option to add hardware.
2. Scale Out – It means move some of the workload to other server/databases i.e. splitting the databases and routing the traffic appropriately for the great performance.

Sharding comes into picture when we want to scale out or when there is no further option of scaling up. if you read my first post, I talked about declaring the data types wisely and that’s just because of the reason of keeping the space utilization optimal otherwise, if you hit the maximum limit of the tier then you may need to move to higher performance tier (if below 500 GB) else scale out and go for Sharding.

With Sharding , we create multiple databases on Azure Server/s – and move the data based on a shard key. It’s logically same as table partitioning and the only difference is instead of keeping the data under same table, the data is actually moved under different databases. There is a component known as Shard map manager which helps to route the database queries to appropriate shards.

Note – Earlier, federations were the options for scaling out but along with Web and Business edition, federations will also expire.

Let’s see how Sharding can be enabled. There are two major areas:

1. Development for Shards
2. Administration/Management of Shards

Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards

Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa

Actually, everything is very well documented already so, I won’t re-invent the wheel and will share the good reference material on this topic.

For :

Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards

Please refer this link with the step by step configuration for sharding. Once the recommended package is downloaded, everything is just right there. Please ensure to have Visual Studio 2012 or higher with C# installed:

Shard_studio

If you want all this using PowerShell , you could download the package from https://gallery.technet.microsoft.com/scriptcenter/Elastic-Scale-Shard-c9530cbe

Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases – Check https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-overview-split-and-merge/
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-elasticity/

Everything related to Sharding is mentioned on this link: https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/.

Reference Links –
Video – https://channel9.msdn.com/events/Ignite/2015/BRK3568
Text – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/

In the next post, I will discuss about how Azure DB monitoring is done and how it’s different from on-premise monitoring.

HTH!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s