In continuation to my previous post , I will write about SQL Database sharding(For SQL Azure databases, PaaS) in this post. As we know, we have various tiers of SQL Azure databases :
Service Tier/Performance Level | DTU | MAX DB Size | Max Concurrent Requests | Max Concurrent Logins | Max Sessions | Benchmark Transaction Rate | Predictability |
Basic | 5 | 2 GB | 30 | 30 | 300 | 16,600 transactions per hour | Good |
Standard/S0 | 10 | 250 GB | 60 | 60 | 600 | 521 transactions per minute | Better |
Standard/S1 | 20 | 250 GB | 90 | 90 | 900 | 934 transactions per minute | Better |
Standard/S2 | 50 | 250 GB | 120 | 120 | 1,200 | 2,570 transactions per minute | Better |
Standard/S3 | 100 | 250 GB | 200 | 200 | 2,400 | 5,100 transactions per minute | Better |
Premium/P1 | 125 | 500 GB | 200 | 200 | 2,400 | 105 transactions per second | Best |
Premium/P2 | 250 | 500 GB | 400 | 400 | 4,800 | 228 transactions per second | Best |
Premium/P3 | 1000 | 500 GB | 1,600 | 1,600 | 19,200 | 735 transactions per second | Best |
If we see in the above table, we have a limit on the size of the database (Maximum size if 500 GB) . What If , the database size which has to be hosted in SQL Azure is bigger than 500 GB or you need more throughput to handle more transactions.
Generally, for on-premise servers – when we need more throughput , we generally talk about two things:
1. Scale Up – It means we add more hardware for better throughput i.e. add more CPUs/RAM/Disk space as per the requirement.
2. Scale Out – It means move some of the workload to other server e.g. moving read only traffic to AlwaysON secondary server.
For Azure servers – the definition is:
1. Scale Up – It means we move to higher tier for better throughput e.g. if Standard/S3 was being used then upgrade the tier to P3. But sometimes moving to the highest tier is not enough and we don’t have any option to add hardware.
2. Scale Out – It means move some of the workload to other server/databases i.e. splitting the databases and routing the traffic appropriately for the great performance.
Sharding comes into picture when we want to scale out or when there is no further option of scaling up. if you read my first post, I talked about declaring the data types wisely and that’s just because of the reason of keeping the space utilization optimal otherwise, if you hit the maximum limit of the tier then you may need to move to higher performance tier (if below 500 GB) else scale out and go for Sharding.
With Sharding , we create multiple databases on Azure Server/s – and move the data based on a shard key. It’s logically same as table partitioning and the only difference is instead of keeping the data under same table, the data is actually moved under different databases. There is a component known as Shard map manager which helps to route the database queries to appropriate shards.
Note – Earlier, federations were the options for scaling out but along with Web and Business edition, federations will also expire.
Let’s see how Sharding can be enabled. There are two major areas:
1. Development for Shards
2. Administration/Management of Shards
Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards
Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa
Actually, everything is very well documented already so, I won’t re-invent the wheel and will share the good reference material on this topic.
For :
Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards
Please refer this link with the step by step configuration for sharding. Once the recommended package is downloaded, everything is just right there. Please ensure to have Visual Studio 2012 or higher with C# installed:
If you want all this using PowerShell , you could download the package from https://gallery.technet.microsoft.com/scriptcenter/Elastic-Scale-Shard-c9530cbe
Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases – Check https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-overview-split-and-merge/
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-elasticity/
Everything related to Sharding is mentioned on this link: https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/.
Reference Links –
Video – https://channel9.msdn.com/events/Ignite/2015/BRK3568
Text – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/
In the next post, I will discuss about how Azure DB monitoring is done and how it’s different from on-premise monitoring.
HTH!