The core premise and significant positive business impacts of big data are well-documented and widely discussed topics. Reduced time to decision and, in some cases, insight into previously unanalyzed trends and industry-specific patterns top the list for most of the CXOs I speak with. As with most new technologies, there are still a number of unanswered questions that need to be addressed before broad acceptance can occur. One of those areas of concern is security.
Simply put, there are two fundamental problems. First, current Hadoop distributions are inherently insecure. They are based primarily on open source projects that were not created with enterprise security in mind. Hardened security, compliance, encryption, policy enablement, and risk management are all glaring gaps in the security posture of the current distributions. Second, traditional security products and controls cannot scale to Hadoop velocity or volume.
Let's take a look at one of several use cases through a typical security lens. The Hadoop architecture leverages a highly distributed storage and computation framework. In typical enterprise software where functional nodes share workloads, communication between nodes is encrypted and the management of that encryption is automated. This fundamental function is not available out of the box with most Hadoop distributions. To mitigate this issue, OS-layer or third-party products are required, creating a separate silo to manage and maintain.
As the business need for competitive advantage pushes more and more big data implementations, the ability to correctly secure these new environments will become paramount. In round two on this topic, I will share three more security issues with Hadoop and discuss how EMC is helping its customers mitigate the risk and gain business advantage in this exciting big data world. Trust me… this is not your father’s data warehouse.