But with so much hype around the subject of big data, a few myths have been propelled into existence.
1. Everyone is ahead of the game
Interest in big data is at a record high: 73 per cent of the organisations surveyed in recent Gartner research are investing, or planning to invest, in it. But most are still planning or in the early stages of adoption. In fact, only 13 per cent of those surveyed have actually deployed big data solutions.
2. A huge volume of data makes individual data quality flaws insignificant
Many IT leaders believe that individual data quality flaws don’t influence the overall outcome because each flaw is only a tiny part of the mass of data in their organisation.
But this simply isn’t true. Ted Friedman, vice president of Gartner, explains that “although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data. Therefore, the overall impact of poor-quality data on the whole dataset remains the same.
“In addition, much of the data that organisations use in a big data context comes from outside, or is of unknown structure and origin. This means that the likelihood of data quality issues is even higher than before. So data quality is actually more important in the world of big data.”
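Friedman's arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the 2 per cent flaw rate and the two dataset sizes are assumptions chosen for the example, not figures from Gartner. It shows that as the dataset grows, each flaw weighs less, but the flawed share of the whole stays the same.

```python
def flawed_share(total_records, flaw_rate):
    """Return (flawed record count, weight of a single flaw, overall flawed share)."""
    flawed = round(total_records * flaw_rate)
    weight_of_one_flaw = 1 / total_records  # shrinks as the dataset grows
    overall_share = flawed / total_records  # stays at the flaw rate
    return flawed, weight_of_one_flaw, overall_share


# Assumed for illustration only: the same 2% flaw rate at both scales.
FLAW_RATE = 0.02

for size in (10_000, 10_000_000):
    flawed, one_flaw, overall = flawed_share(size, FLAW_RATE)
    print(
        f"{size:,} records: {flawed:,} flawed, "
        f"one flaw = {one_flaw:.7%} of the data, "
        f"flawed share overall = {overall:.1%}"
    )
```

If the flaw rate rises with data from outside or unknown sources, as the quote suggests, the overall share gets worse rather than staying level.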
3. Big data tech enables organisations to read the same sources using multiple data models
The belief is that this flexibility lets end users decide how to interpret any data asset on demand. In practice, however, most information users rely on “schema on write” scenarios, in which data is described, content is prescribed, and there is agreement about the integrity of the data and how it relates to those scenarios.
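The difference between the two approaches can be sketched in Python. Everything here is hypothetical: the event records, the field names and the three functions are invented for illustration and do not come from Gartner or from any particular product. In the first approach the data is validated and typed before it is stored, so every user sees the same agreed structure. In the second, each consumer applies its own interpretation to the raw source when reading it.

```python
import json

# Hypothetical raw events, as they might arrive from an outside source.
RAW_EVENTS = [
    '{"customer": "C-17", "amount": "19.90", "ts": "2015-03-01"}',
    '{"customer": "C-17", "amount": "n/a", "ts": "2015-03-02"}',
]


def write_with_schema(raw):
    """Schema on write: validate and type the record before it is stored."""
    record = json.loads(raw)
    return {
        "customer": str(record["customer"]),
        "amount": float(record["amount"]),  # raises ValueError on "n/a"
        "ts": str(record["ts"]),
    }


def read_as_revenue(raw):
    """Interpret at read time: this consumer treats bad amounts as zero."""
    record = json.loads(raw)
    try:
        return float(record["amount"])
    except ValueError:
        return 0.0


def read_as_activity(raw):
    """Interpret at read time: a second consumer counts every event."""
    json.loads(raw)
    return 1


# Schema on write: bad records are caught once, before anyone uses them.
stored, rejected = [], []
for raw in RAW_EVENTS:
    try:
        stored.append(write_with_schema(raw))
    except ValueError:
        rejected.append(raw)

# Interpretation at read time: same source, two data models, two answers.
revenue = sum(read_as_revenue(raw) for raw in RAW_EVENTS)
activity = sum(read_as_activity(raw) for raw in RAW_EVENTS)

print(len(stored), "stored,", len(rejected), "rejected")
print("revenue:", revenue, "activity:", activity)
```

The second half is flexible, but nothing guarantees that the two consumers agree on what the malformed record means, which is the agreement about integrity that the "schema on write" scenario provides.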
4. Building a data warehouse is a pointless exercise
Although information management (IM) leaders consider building a data warehouse time-consuming, many advanced analytics projects do use a data warehouse during analysis. Where they don’t, IM leaders must still refine the new data types that come with big data to make them suitable for analysis.
5. Data lakes are replacing data warehouses
It’s misleading for vendors to position data lakes as replacements for data warehouses, because a data lake’s foundational technologies lack the maturity and breadth of features found in established data warehouse technologies.
“Data warehouses already have the capabilities to support a broad variety of users throughout an organization. IM leaders don’t have to wait for data lakes to catch up,” said Nick Heudecker, research director at Gartner.