After returning from Strata + Hadoop World in New York, I am struck by how mainstream Big Data has become. No longer the playground of only a few quiet people with black shirts and ponytails, it has broadened to include business users, leaders, and yes, of course, plenty of engineers, architects, and Data Scientists. Amidst the hype of Big Data, I’ll offer three main takeaways from the conference.
1. Strata + Hadoop World growth may be keeping pace with data growth.
Two or three years ago, the conference drew about 500 people, a size that a hotel in midtown Manhattan could accommodate. This year it had 5,500 attendees and required a move to the Javits Center, a huge convention center on the city’s west side. With 11 concurrent tracks, there was an almost dizzying array of sessions to choose from in any given time slot. Although the main industry firms (and vendors, for that matter) claim there is huge opportunity in Big Data, there are still plenty of naysayers insisting it’s a flash in the pan, or only for an elite minority. This explosion in attendance (roughly 11x in two to three years) signals the pull of Big Data and a growing contingent of people who believe in the value it brings.
2. White-Hot Big Data Market
Several years ago, the lone conference message board was populated with notes about people’s lost keys, coats, and the like. Then, when someone hand-wrote that they wanted to hire a Data Scientist, others quickly followed suit, as I mentioned in a previous blog post.
This year, there were two large boards dedicated to job postings. Companies pinned up typed job openings and business cards, and of course hand-wrote openings on the boards, which were divided into areas such as Data Scientist, Data Engineer, Architect, and Business. Clearly, companies are actively looking for people: they held many simultaneous recruiting sessions at rooftop bars in the city, during the day and in the evenings. The competition for talent is intense, with friends of mine commenting that they were invited to several recruiting happy hours hosted by internet giants at the same time (decisions, decisions…).
3. The Future is Memory
Many people still equate Big Data with Hadoop. That’s so three years ago! The trend has moved from Hadoop (think Apache, Pivotal HD, Cloudera, Hortonworks), to SQL on Hadoop (HAWQ, Impala, Stinger, Hive), to the current wave of in-memory computing (Spark, GemFire, Tachyon, and many other new entrants).
Spark, an in-memory data processing engine designed for very high performance, was a huge focus at the conference, and I expect it to remain one for the next few years. In fact, the Strata talks were organized into groups related to Hadoop, Business, and Spark.
As Spark emerges as an open source standard for in-memory computing, expect enterprise-grade in-memory platforms (and related memory-centric storage tools, such as Tachyon) to come to the forefront, as happened with Hadoop. More to come on this in my next post.