What is Big Data?
First of all, big data is not simply a big data file. Suppose I have a 100,000 GB file: is that big data? No, it’s just a big file.
Big Data actually refers to innovative ways of data processing.
Gartner’s definition of big data: Big Data is high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision-making, and process automation.
Big Data approaches are used when traditional data mining and processing techniques cannot reveal the insights and meaning of the underlying data. Relational database engines handle unstructured, time-sensitive, or very large data poorly. Such data calls for a different processing approach, which is what the term Big Data refers to: massively parallel processing on readily available commodity hardware.
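To make the idea of parallel processing concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not how any particular big data platform works: the data is a small in-memory list, the workers are local processes rather than separate machines, and the function names (split_into_chunks, count_words, merge_counts, parallel_word_count) are invented for this example. The pattern is the point: split the data, let each worker process its own piece independently, then combine the partial results.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split a list of records into at most n_chunks roughly equal parts."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Reduce step: combine the per-chunk counts into one total."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def parallel_word_count(records, workers=4):
    """Run the map step in separate processes, then merge the results."""
    chunks = split_into_chunks(records, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return merge_counts(pool.map(count_words, chunks))


if __name__ == "__main__":
    # Hypothetical sample data standing in for a much larger dataset.
    sample = [
        "rain in the north",
        "sun in the south",
        "rain in the east",
        "wind in the west",
    ]
    print(parallel_word_count(sample, workers=2).most_common(3))
```

Because each chunk is counted without reference to any other chunk, adding more workers (or, in a real cluster, more machines) lets the same code handle more data. That independence is what makes cheap, readily available hardware sufficient.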
In short, Big Data reflects the ever-changing world in which we live. The more change there is, the more of it is captured and recorded. Take the weather, for example. Forecasters collect a significant amount of data about local conditions around the world. Logically, it makes sense that local conditions determine regional effects, and regional effects determine global effects, but the reverse is also true. Weather data therefore shows the typical properties of big data: large amounts of data need to be processed in real time, and the input can come from machines, personal observations, or external forces such as sunspots.
Handling information like this shows why big data is becoming so important:
- Most of the data collected today are unstructured and require different storage and processing methods than traditional relational databases.
- The available computing power is growing by leaps and bounds, which means there are more opportunities to process big data.
- The Internet has democratized data, generating ever more raw data and making more of it widely available.
Data in raw format has no value; it needs to be processed to have value. However, there is a problem inherent in big data here. Is it worth processing data from its raw format into usable insights? Or is there too much data of unknown value to justify the gamble of processing it with big data tools? Most of us agree that there is value in being able to predict the weather. The question is whether that value outweighs the cost of collating all the real-time data into a weather report that can be relied upon.