Embrace NoSQL technologies as a relational Guy – Document Store

The document store is one of the most popular NoSQL technologies today. Despite the name, a document store does not hold Word, PowerPoint, or Excel files; it mostly holds JSON documents.

I am sure you have heard about XML (Extensible Markup Language) documents. XML is commonly used as an intermediate format when different devices or applications need to communicate. For example, if a .NET application wants to talk to a Java application, or even to a device programmed in C, they can exchange XML documents to share data. JSON (JavaScript Object Notation) is based on a similar concept but is more compact and easier to work with, which is why it has been widely accepted as a standard for data exchange.
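To make the comparison concrete, here is a minimal Python sketch that sends the same record once as JSON and once as XML. The record and its field names (`deviceId`, `temperature`, `unit`) are invented for illustration, and only the standard library is used.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for illustration.
reading = {"deviceId": "sensor-42", "temperature": 21.5, "unit": "C"}

# JSON: one call to serialize and one call to parse; numbers keep their type.
json_text = json.dumps(reading)
from_json = json.loads(json_text)

# XML: build the element tree by hand; every value travels as text.
root = ET.Element("reading")
for key, value in reading.items():
    ET.SubElement(root, key).text = str(value)
xml_text = ET.tostring(root, encoding="unicode")

# Parsing the XML back means converting each field to the right type ourselves.
parsed = ET.fromstring(xml_text)
from_xml = {child.tag: child.text for child in parsed}
from_xml["temperature"] = float(from_xml["temperature"])

print(json_text)
print(xml_text)
```

Both formats carry the same information. The JSON round trip simply needs less code and hands back typed values, which is a large part of why it feels easier to use.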

 


 

JSON has become popular with websites like Flipkart and Amazon, with gaming companies, and for cross-device communication because it is easy to use and performs well in large-scale applications.

Let’s take the example of an application that stores data from a sensing device. The device sends thousands of readings per second, and the data arrives as JSON. If the data is stored in an RDBMS, every incoming JSON payload has to be parsed into plain values and inserted into a table. Doing that thousands of times per second adds overhead and can limit how well the system scales. How about saving the data in a document store as JSON itself and then querying it only when results are needed?
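The two ingestion paths can be sketched roughly as follows. This is only an illustration under stated assumptions: the payload and its field names are invented, SQLite stands in for the RDBMS, and a plain Python list stands in for a document collection. It is not the API of any real document store.

```python
import json
import sqlite3

# Hypothetical payload from a sensing device; all field names are invented.
payload = (
    '{"deviceId": "well-7", "ts": "2015-06-01T10:00:00Z", '
    '"readings": [{"type": "pressure", "value": 310.2}, '
    '{"type": "temperature", "value": 88.1}]}'
)

# Relational path: parse the JSON and shred it into one row per reading.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reading (device_id TEXT, ts TEXT, type TEXT, value REAL)")

def store_relational(raw):
    doc = json.loads(raw)
    rows = [(doc["deviceId"], doc["ts"], r["type"], r["value"])
            for r in doc["readings"]]
    db.executemany("INSERT INTO reading VALUES (?, ?, ?, ?)", rows)

# Document path: keep the document whole.
# A plain list stands in for a collection (an assumption, not a real store).
documents = []

def store_document(raw):
    documents.append(json.loads(raw))

def find_by_device(device_id):
    # Query the stored documents only when results are needed.
    return [d for d in documents if d["deviceId"] == device_id]

store_relational(payload)
store_document(payload)
```

In the relational path the application has to know the table layout and map every nested reading onto it. In the document path the payload is stored in the shape it arrived in.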

Other examples are shopping websites, blogging websites, and online book libraries, where we don’t want to be bound to a schema and there is no need to maintain relationships between data. Document store DBs play a vital role here. The release cycle of the application becomes shorter because of the schema-free architecture and the reduced need for database impact analysis.
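Here is a small sketch of what "schema free" means in practice. The catalog entries and field names are made up, and a Python list again stands in for a collection of documents.

```python
# Hypothetical product catalog; documents in one collection need not share fields.
catalog = [
    {"sku": "B-100", "kind": "book", "title": "Learning JSON", "pages": 250},
    {"sku": "P-200", "kind": "phone", "title": "Example Phone",
     "storageGb": 64, "colors": ["black", "silver"]},
]

# A later release adds a new field to new documents only.
# No table alteration or migration of the older documents is involved.
catalog.append({"sku": "B-101", "kind": "book", "title": "Learning NoSQL",
                "pages": 300, "ebook": True})

def titles_with(field):
    # Return the titles of the documents that carry the given field.
    return [doc["title"] for doc in catalog if field in doc]

def ebook_flag(doc):
    # Readers have to tolerate documents written before the field existed.
    return doc.get("ebook", False)

print(titles_with("pages"))
print([ebook_flag(doc) for doc in catalog])
```

The flexibility has a cost: the schema check moves from the database into the application code, as `ebook_flag` shows.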

Just for information, you may have heard of applications based on polyglot persistence, e.g. shopping websites, online libraries, or even video libraries. These kinds of services use multiple database systems: for example, a document store (JSON) for product displays, an RDBMS for transactions, and a graph DB for relationship data. We have a lot of flexibility these days to leverage different DB systems for different needs.

DocumentDB, MongoDB, CouchDB, and RavenDB are key players in the document store market. DocumentDB and MongoLab (MongoDB on Azure) are two managed services available on the Azure PaaS platform. MongoDB, CouchDB, and RavenDB can also be installed on bare-metal machines.


For a free demo of DocumentDB, check this URL – http://www.documentdb.com/sql/demo#


We will discuss DocumentDB in detail in future posts. First, I will try to finish introducing all the NoSQL database types and their uses.

HTH!


Embrace NoSQL technologies as a relational guy! Intro.

I have been writing quite a few posts in the form of series. Recently, I wrote a series of posts on SQL Azure DB, which really helped readers understand the subject. I also meant to write a similar series for SQL on Azure VMs, but somehow I couldn’t finish it. Hopefully, I will finish it in the near future.

For now, let’s talk about what NoSQL is all about. There has been a lot of discussion and publicity around this subject over the past few years, and it is gaining a lot of popularity because of its efficiency. If you live in a world where SQL Server/Oracle/DB2 etc. are the only options for data storage, then you really need to upgrade. That doesn’t mean RDBMS systems like SQL Server/Oracle/DB2 are going out of fashion; it just means that we now need to pick different database systems for different needs. People no longer rely on RDBMS systems alone.

I have been reading some really amazing blog posts written by David Campbell. For now, I’m going to diverge a little from the subject and use the technique David describes to explain the bigger picture. The best way to understand where the data world is going is to understand the following data categories:

1. Operational Data
2. Analytical Data
3. Streaming Data


Operational Data – Operational data is the data that applications use to maintain their state, e.g. payment data or customer information, as held in OLTP or non-transactional systems. Over time, people realized the real value of historical data, which can be used to understand trends, improve customer satisfaction, and build business strategies. That’s how data warehousing technologies started gaining traction.

Analytical Data – This is read-only data, analyzed using data warehouse or Big Data systems to understand business trends and history. Big Data is gaining traction because of the huge volume of data involved: analyzing PBs of data needs extreme hardware capacity. For smaller systems, however, traditional OLAP systems work perfectly fine.

Streaming Data – In this modern world, people want analytics in real time, e.g. from fitness trackers on people’s wrists, toll payment devices in cars, or sensors on oil wells. One way is to store all the data and analyze it afterwards, but sometimes a delay in processing is not affordable, e.g. if an oil company wants to raise an alert when the pressure in a well is increasing, or if you want to know how many cars passed through a specific city in the last 30 minutes. There has to be a provision to read the live stream and make sense of it. This is the world of IoT.
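The "cars in the last 30 minutes" question is a sliding-window count over a live stream. Below is a minimal sketch of the idea, assuming timestamps are plain seconds that arrive in non-decreasing order. The class name `SlidingWindowCounter` is hypothetical, and real stream processing engines offer far more than this.

```python
from collections import deque

class SlidingWindowCounter:
    """Count events seen within the last `window_seconds` (illustrative only)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # event timestamps, oldest first

    def add(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        self.events.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        self._evict(now)
        return len(self.events)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()

# Hypothetical usage: one event per car passing a toll sensor, 30-minute window.
cars = SlidingWindowCounter(window_seconds=30 * 60)
for passed_at in (0, 600, 1700, 2000):
    cars.add(passed_at)
print(cars.count(now=2000))
```

The answer is updated as each event arrives, and only the events inside the window are kept in memory, instead of storing everything first and analyzing it later.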

Understanding the above terms is really important for everyone working on the data platform. A plethora of companies are working to make the data platform a really happening world. Have you heard about the 3 Vs of data: Volume, Variety, and Velocity? In today’s world, it’s difficult to manage these 3 Vs in an RDBMS. With an RDBMS, we talk about structured data in the form of tables and columns; anything and everything you want to store has to fit that form. What about data such as flat files, JSON, or telemetry, which has no fixed structure? As I shared in the beginning, different types of data need different database systems.

In a nutshell, there are five major categories of database technologies:

1. Relational databases (RDBMS)
2. Document Store
3. Column family Store
4. Key Value Store
5. Graph Databases
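As a rough preview, here is how the same tiny data set (one customer with two orders) might be shaped under each of the five categories. Everything below is plain Python data invented for illustration. The names and layouts are assumptions and do not correspond to any specific product.

```python
import json

# 1. Relational: fixed columns, rows linked by keys.
customers = [(1, "Asha")]                        # (customer_id, name)
orders = [(100, 1, "book"), (101, 1, "phone")]   # (order_id, customer_id, item)

def items_relational(customer_id):
    return [item for _, cid, item in orders if cid == customer_id]

# 2. Document store: one self-contained document per customer.
customer_doc = {"id": 1, "name": "Asha",
                "orders": [{"id": 100, "item": "book"},
                           {"id": 101, "item": "phone"}]}

def items_document(doc):
    return [order["item"] for order in doc["orders"]]

# 3. Column family: row key -> column family -> columns.
wide_rows = {"customer:1": {"profile": {"name": "Asha"},
                            "orders": {"100": "book", "101": "phone"}}}

def items_column_family(row_key):
    return list(wide_rows[row_key]["orders"].values())

# 4. Key-value: an opaque value that is looked up only by its key.
kv_store = {"customer:1": json.dumps(customer_doc)}

def items_key_value(key):
    return items_document(json.loads(kv_store[key]))

# 5. Graph: nodes plus labeled edges between them.
nodes = {"c1": {"name": "Asha"}, "p1": {"name": "book"}, "p2": {"name": "phone"}}
edges = [("c1", "BOUGHT", "p1"), ("c1", "BOUGHT", "p2")]

def items_graph(node_id):
    return [nodes[dst]["name"] for src, rel, dst in edges
            if src == node_id and rel == "BOUGHT"]
```

Each function answers the same question, "what did this customer buy?", but the data has to be navigated differently in every model.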

We will discuss each of the NoSQL categories in detail in upcoming posts.

HTH!

Disclaimer: The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.