Sunday, January 3, 2016

Big Data: Is it 3 Vs, 4 Vs or 5 Vs?

Big Data is no longer a new term, and its definition in terms of the widely known 3 Vs, or three main attributes, has already been discussed in detail in many forums. Although it is usually described with the famous three attributes (Volume, Velocity, and Variety), experts in some forums discuss a few more attributes and highlight their significance, sometimes marking them as more important than the initial three. Since this still comes up during sessions, I thought I would explain all 5 Vs in a simple way.

Volume: This was considered the key to Big Data. The "Big" part was added because of the vast amount of data generated by many sources every second, which leaves traditional data processing approaches and technologies powerless. The data generated daily used to be measured in megabytes and gigabytes; within a short period, it has become more about exabytes and zettabytes. This vast amount of data comes from sources such as the 1.49 billion monthly active users on Facebook, 175 million tweets per day, and much more. Modern big data technology can process data that was previously either ignored or discarded.
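To put these figures in perspective, here is a small back-of-the-envelope sketch. It assumes decimal (SI) byte units and uses the tweets-per-day figure quoted above; the variable names are only illustrative.

```python
# Decimal (SI) byte units; binary units (MiB, GiB, ...) would differ slightly.
MEGABYTE = 10**6
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21

# Figure quoted in this post.
TWEETS_PER_DAY = 175_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# Average arrival rate implied by the daily total.
tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY

# How many megabytes fit into one zettabyte.
megabytes_per_zettabyte = ZETTABYTE // MEGABYTE

print(f"{tweets_per_second:.0f} tweets per second on average")
print(f"1 zettabyte = {megabytes_per_zettabyte:.0e} megabytes")
```

Even as a daily average, the tweet figure works out to roughly two thousand events every second, and the jump from megabytes to zettabytes is fifteen orders of magnitude.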

Velocity: This is all about the speed of data generation. It is not about hundreds of transactions generated in a second but thousands, perhaps millions, of transactions arriving as a continuous stream, which are captured for analysis without even being stored in a database and then used to make related decisions. Again, big data technologies have given us many ways of handling velocity that was previously either impossible or hard to process.
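The idea of analysing a stream without storing it can be illustrated with a minimal sketch in plain Python. It is not how any particular big data product works; the function name `rolling_counts` and the sample timestamps are made up for illustration. It keeps only the events from the last second in memory and reports how many arrived in that window.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def rolling_counts(
    timestamps: Iterable[float], window_seconds: float = 1.0
) -> Iterator[Tuple[float, int]]:
    """Yield (timestamp, number of events in the last window_seconds) per event.

    Only events inside the current window are kept in memory; older ones
    are dropped as soon as they fall out of the window.
    """
    window: deque = deque()
    for ts in timestamps:
        window.append(ts)
        # Discard events that are older than the window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        yield ts, len(window)


# Hypothetical event arrival times in seconds (assumed to be in order).
events = [0.0, 0.2, 0.4, 0.9, 1.1, 1.3, 2.5]

# A decision can be made on the fly, for example reacting to the peak rate.
peak = max(count for _, count in rolling_counts(events))
print("Peak events per second:", peak)
```

Because the function is a generator, it can consume an endless stream just as well as this short list, which is the point of handling velocity without first loading everything into a database.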

Variety: We are familiar with structured, neat data that is easy to capture, easy to hold and, of course, not that hard to process for analysis. But it cannot be forgotten that 80%-90% of data is unstructured and has often been ignored because of the difficulties of capturing, holding and processing it. Now different forms of data, such as sentiment data, clickstream data, photos, videos and documents, can be handled easily using modern big data projects.

Veracity: Here is something we do not discuss much with big data. This is all about the uncertainty and trustworthiness of data. Because of the volume, velocity and variety, a quality data set cannot be expected; accuracy is not guaranteed and, in most cases, is not even considered. However, big data technologies let us work with such data and help us handle it too.

Value: Perhaps the most important V. Since we love modern technologies, we often get trapped by buzzwords and initiate solutions based on them without knowing whether the solution adds anything valuable to the business. Make sure your big data solution gives something significant to the business and allows you to see the insight; that is the Value of big data.
