Embrace NoSQL technologies as a relational Guy – Graph DB

It’s a fact that NoSQL technologies are growing at a rapid pace. I even heard someone saying NoSQL is old now , NewSQL is the new trend. NewSQL gives performance of NoSQL and follows ACID principals of RDBMS system. Anyways, lets focus on Graph DB for now.

Graph databases are specialized in dealing with relationship data. They specialize in finding the patters and relationships between the certain events or employees of organization or certain operations. It can help to make the application more interactive by suggesting more options based on the previous patterns of browsing or shopping.

Have you noticed:

1. Facebook offers option of suggested friends

2. LinkedIn offering suggested connections or connections from the same organization-that’s the use of Graph Databases.

3. Flipkart/Amazon offering “people also viewed” (Real time recommendations) options help you purchase more products.

4. Master data management where based on the support case, knowledge base article could be suggested for the faster resolution.

5. Dependency analysis of shutting down an IT operation i.e. users which may be impacted if this router is shut down for maintenance. It can help to send the advanced notification to those employees.

Graph DBs are being used for all of the above scenarios. Just check the below picture to see how relationships look like:


Neo4J is one of the best Graph DB companies today. The language used for Neo4J is Cypher. It’s being used largely by the major Tech./healthcare/manufacturing companies.

Please check this video for more details about relationship and properties. It’s series of videos which you could look for to understand more about this subject.



Community Event – HADR with SQL server 2016 AlwaysON – Part 2

After the discussion on the AlwaysON FCI in this post, the focus was on AlwaysON AG. The biggest USP for AlwaysON AG was – unlike AlwaysON FCI , it didn’t need any shared storage.

It’s build on Windows Cluster (For failover) and enhanced database Mirroring (for data replication) features. That’s the reason it’s most usable feature of SQL server 2012/2014.

The only caveat which we had was, the supportability of distributed and cross database transactions(However, cross database reference on the same instance is still not supported). It’s now supported on SQL 2016 AlwaysON AG hosted on only windows 2016 and windows 2012 R2 (KB3090973 installed) + WITH DTC_SUPPORT = PER_DB. Check this article for more information.

The reason of not supporting the distributed and cross databases transactions was – there was no functionality with DTC to support the AlwaysON AG cross database and distributed transaction. Therefore, either SQL database with lower DBID or the SQL instance itself used to play the role of transaction coordinator. During the failover, there were the challenges to locate the transactions for roll forward/back. This article explains this problem in detail.

Let’s move on to AlwaysON AG jargon :

1. Availability replica  -> Instances involved in AlwaysON
2. Availability Databases –> Databases being replicated for AlwaysON
3. Availability Group listener –> It’s like a SQL cluster Network Name which is used for the connection to the SQL server. if failover happens client is still able to connect without changing the connection strings.
4. Availability Mode – Synchronous and Asynchronous (Same like mirroring)

For AlwaysON setup check this article – http://blogs.technet.com/b/canitpro/archive/2013/08/20/step-by-step-creating-a-sql-server-2012-alwayson-availability-group.aspx

1. The new change in SQL 2016 AlwaysON is, now we can have 3 synchronous replicas out of 8 total replicas. Prior to SQL 2016, the count of the replicas was same but we could have 2 synchronous replicas only.

2. Due to the one additional AlwaysON synchronous replica, we can now have failover to next secondary if the first secondary is not able to bring database online due to some reason.

3. Prior to SQL Server 2016, we could route our read traffic to one secondary replica at a time. Through read only routing, we could define which secondary will take over the load if the current secondary is unavailable. Starting SQL 2016 , we can load balance the routing of the read workloads. The load will be distributed across the selected replicas in the round robin fashion.

4. Starting SQL server 2016 we have got Improved health monitoring. Now the failover will happen depending on database health degradation. In the earlier versions, it was based on instance level failover policies.

5. SQL Server 2016 support Group Managed Service accounts (gMSA).The advantage of gMSA over MSA is, it can be mapped to multiple machines and can support Kerberos authentication.


PPTs can be found here – SQL 2016 AlwaysON AG enhancements  AlwaysON Basics

Community Event – HADR with SQL server 2016 AlwaysON (Part 1)

There have been lots of requests from community in Delhi to talk about HADR with SQL Server. Finally, we managed to get the event organized based on the theme of HADR using SQL 2016 AlwaysON. AlwaysON is in the market since SQL server 2012 but the recent enhancement of supporting the distributed/cross databases transactions has made this feature even more usable. Agenda of the event was as follows –

1. Basics of Windows Cluster and the quorum models – Yogesh Arora
2. Basics of AlwaysON FCI (SQL Cluster) and flexible failover policy using Sp_server_diagnostics – Satya Prakash
3. AlwaysON architecture and Internals – Udhay Bhanu Pathania
4. AlwaysON enhancements in SQL 2016 – Sumit Sarabhai
5. How Split-Brain Scenario is avoided by AlwaysON – Sumit Sarabhai
6. Setup a Geo Cluster – Harsh Chawla | Twitter , Blog

To cater to the entire audience, we started the session with the Windows clustering concepts. In that session, the discussion was around:

1. Purpose of windows cluster
2. Various Quorum models and how it helps with automatic failover :

  • Node and File Share majority
  • Node and Disk Majority
  • Majority Node Cluster
  • No Majority Cluster – Disk only

3. purpose of votes in the quorum. Refer this article for quorum models/votes.

After a brief discussion on Windows Cluster, there was a discussion on SQL Cluster and changes in SQL 2016 (post SQL 2008):

1. Flexible Failover Policy – Before SQL 2012, the failover used to be dependant on Isalive and Looksalive checks done by resource DLL and there was very less control with the administrator except changing the timeout values of the above two parameters. Starting SQL 2012, the administrators have more control as there’s an option of flexible failover policy. The resource availability is now being checked using sp_server_diagnostics and failover can be customized depending on the level selected.


For more information on flexible failover policy,please check – https://msdn.microsoft.com/en-us/library/hh710061.aspx

2. Starting SQL 2012 , we have AlwaysON (integrated solution for HADR) which has two parts:

1. AlwaysON FCI (Failover Cluster Instance) formerly known as SQL server failover Cluster
2. AlwaysON AG(Availability Group)

Just to setup the base, AlwaysON FCI is a SQL Cluster as HA solution which needs shared storage to function. The disk ownership stays with the primary node and as soon as the failover happens, the storage ownership changes to the current active node.

3. Another enhancement in AlwaysON FCI is, Multi-subnet cluster. With SQL server 2008 R2 FCI – there was no option for cluster based on IPs from the different subnets. To build Multi-site cluster, stretched V-LAN was required. All the nodes from the cluster had to be mapped to this V-LAN. After the failover the same IP address was taken over by the node in the different subnet. However, SQL 2012 onwards, multi-subnet cluster is supported with multiple IPs and the resource dependency for the virtual network name can use ‘OR’ relationship on these IPs.

4. Clustering on SMB fileshare and CSV storage – SAN is not the only option for shared storage for AlwaysON FCI now, SMB fileshare and CSV storage can also be used to build the SQL cluster.

5. Domain independent availability groups – Starting SQL 2016, SQL cluster nodes can reside on different domains or even no domains at all using Windows 2016.Imagine the cluster in workgroups that’s really cool.

Please find the PPTs for this session attached. I will write about AlwaysON in the next post.


PPTs can be found here – Windows Cluster basics , AlwaysON FCI

Query Writing Guidelines for SQL Server DBAs and Developers

Working on SQL Server Performance and Tuning for last 8 years has given a great experience to understand the common practices at customers’ end. Recently, I was looking at the list of companies in India I have worked with ,  during my tenure at Microsoft is around 48. I wanted to share the common trends and my observation with you through this post. It may be a great help to sort out really common unknown issues on the job as a SQL DBAs and developers.

I have observed a disconnect between DBA and Developer team most of the times. The common dialogues which I have heard from both the DBAs and Developers are:

1. Developers don’t know how to write a query
2. Developers create indexes without knowing their impact
3. Developers never see execution plan of the queries before releasing those on the production server

1. DBAs don’t do nothing 🙂
2. No visibility to the queries performing bad
3. Even if we have the list of queries, we don’t know how to tune them
4. What to look in the execution plans
5. No time for query tuning because of the strict deadlines

If we see all this from the top, there’s noone at fault. For DBAs, they are agitated because entire blame is put on them and for developers, they have to listen to the music from the DBA team. It’s all about educating the developers, on how to write the optimal query, the repercussions of the bad code and how it can bring the entire system down. They need to try their best to avoid writing bad code and there has to be efficient review process in place.

For DBA’s , they need to help the developers for the SQL query writing/tuning education and the challenges they face if the queries are bad. While all that is done, the DBA also need to tune the bad code so that, the application can perform optimally. Moreover , if possible have a monthly session on what queries were tuned and the benefit it yielded for the application. The bad queries should be shared with the developers team on regular intervals so that, those can be tuned/written by the developers. All the process has to function in tandem and wherever, I have observed a disconnect between the teams – the duration of SQL Server downtime is higher.

One big improvement point which I have observed for both the teams can be, the deep conceptual knowledge of how indexing works. That’s the reason for having multiple sessions and blog posts on indexing itself.


All in all, if developers start writing optimal queries – there was lesser chances of performance degradation due to application code and lesser the DBAs need to work for query tuning. There should be monthly sessions between DBAs and Developers on the application queries tuning and writing education. it will not only help the teams to work in tandem but will also help to resolve the teams the common challenges they face during the query writing/tuning process.

We had organized a session on , Optimal Query Writing which was delivered by my colleague Sourabh Agarwal. Please go though this post and let any of us know if you have any questions. It covers some best practices for query writing which can be really helpful for developers.

Check the summary of entire sessions on Query writing and tuning here:


The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights

For a SQL server DBA – How to start Azure Learning – Part 4 (Database Sharding)

In continuation to my previous post , I will write about SQL Database sharding(For SQL Azure databases, PaaS) in this post. As we know, we have various tiers of SQL Azure databases :

Service Tier/Performance Level DTU MAX DB Size Max Concurrent Requests Max Concurrent Logins Max Sessions Benchmark Transaction Rate Predictability
Basic 5 2 GB 30 30 300 16,600 transactions per hour Good
Standard/S0 10 250 GB 60 60 600 521 transactions per minute Better
Standard/S1 20 250 GB 90 90 900 934 transactions per minute Better
Standard/S2 50 250 GB 120 120 1,200 2,570 transactions per minute Better
Standard/S3 100 250 GB 200 200 2,400 5,100 transactions per minute Better
Premium/P1 125 500 GB 200 200 2,400 105 transactions per second Best
Premium/P2 250 500 GB 400 400 4,800 228 transactions per second Best
Premium/P3 1000 500 GB 1,600 1,600 19,200 735 transactions per second Best

If we see in the above table, we have a limit on the size of the database (Maximum size if 500 GB) . What If , the database size which has to be hosted in SQL Azure is bigger than 500 GB or you need more throughput to handle more transactions.

Generally, for on-premise servers – when we need more throughput , we generally talk about two things:

1. Scale Up – It means we add more hardware for better throughput i.e. add more CPUs/RAM/Disk space as per the requirement.
2. Scale Out – It means move some of the workload to other server e.g. moving read only traffic to AlwaysON secondary server.

For Azure servers – the definition is:

1. Scale Up – It means we move to higher tier for better throughput e.g. if Standard/S3 was being used then upgrade the tier to P3. But sometimes moving to the highest tier is not enough and we don’t have any option to add hardware.
2. Scale Out – It means move some of the workload to other server/databases i.e. splitting the databases and routing the traffic appropriately for the great performance.

Sharding comes into picture when we want to scale out or when there is no further option of scaling up. if you read my first post, I talked about declaring the data types wisely and that’s just because of the reason of keeping the space utilization optimal otherwise, if you hit the maximum limit of the tier then you may need to move to higher performance tier (if below 500 GB) else scale out and go for Sharding.

With Sharding , we create multiple databases on Azure Server/s – and move the data based on a shard key. It’s logically same as table partitioning and the only difference is instead of keeping the data under same table, the data is actually moved under different databases. There is a component known as Shard map manager which helps to route the database queries to appropriate shards.

Note – Earlier, federations were the options for scaling out but along with Web and Business edition, federations will also expire.

Let’s see how Sharding can be enabled. There are two major areas:

1. Development for Shards
2. Administration/Management of Shards

Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards

Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa

Actually, everything is very well documented already so, I won’t re-invent the wheel and will share the good reference material on this topic.

For :

Under Development for Shards, the options are:
1. Shard Map Management –> Define Groups of Shards for your application.
2. Data Dependant Routing –> Route incoming requests to the correct shard
3. Multi-Shard querying –> Running a query across multiple shards

Please refer this link with the step by step configuration for sharding. Once the recommended package is downloaded, everything is just right there. Please ensure to have Visual Studio 2012 or higher with C# installed:


If you want all this using PowerShell , you could download the package from https://gallery.technet.microsoft.com/scriptcenter/Elastic-Scale-Shard-c9530cbe

Under Administration/Management of Shards, the options are:
1. Shard Split/Merge –> Grow/Shrink capacity by adding or removing the databases – Check https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-overview-split-and-merge/
2. Shard Elasticity –> Dynamically adjust scale factor of the databases e.g. change the performance tier from standard to Premium or vice-versa – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-elasticity/

Everything related to Sharding is mentioned on this link: https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/.

Reference Links –
Video – https://channel9.msdn.com/events/Ignite/2015/BRK3568
Text – https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/

In the next post, I will discuss about how Azure DB monitoring is done and how it’s different from on-premise monitoring.


RCA of SQL server Database went offline after transaction failed to rollback

Recently, I was working a scenario where we had to find out the cause of the database went offline during rollback of a transaction.

As usual first thing was to check the SQL error log where we found below errors:

Error: 9002, Severity: 17, State: 4. 2012-05-12 14:01:22.02 spid56      The transaction log for database ‘Test’ is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

Here is when the transaction logs were full and caused the failure of rollback of the transactions. Errors as follows:

2012-05-12 14:01:22.04 spid56      Error: 3314, Severity: 21, State: 4. 2012-05-12 14:01:22.04 spid56      During undoing of a logged operation in database ‘Test’, an error occurred at log record ID (88888:1000:101). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

Below errors show the reason of the shutdown of the database which intern caused the transaction rollback to fail:

2012-05-12 14:01:22.05 spid56      Database Test was shutdown due to error 9002 in routine ‘XdesRMReadWrite::RollbackToLsn’. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.

The issue of database shutdown was caused due to the transaction log running out of space. The reason why SQL shuts down the database is to avoid violation to ACID properties and to avoid database inconsistancy.

Here are more details about the transaction failure errors :- http://msdn.microsoft.com/en-us/library/ff713991(v=sql.100).aspx

Another scenario where we saw the same issue: – http://blogs.msdn.com/b/sqlsakthi/archive/2012/05/08/inf-what-happens-to-a-database-when-transaction-rollback-fails.aspx