Top Big Data Influencers of 2015

2015 was an exciting year for big data and the Hadoop ecosystem. We saw Hadoop become an essential part of the data management strategy of almost all major enterprise organizations. There is now cutthroat competition among IT vendors to help realize the vision of the data hub, data lake and data warehouse with Hadoop and Spark.

As part of its annual assessment of big data and the Hadoop ecosystem, HadoopSphere publishes a list of top big data influencers each year. The list is derived using a scientific methodology that assesses various parameters in each category of influencers. The HadoopSphere Top Big Data Influencers list reflects the people, products, organizations and portals that exercised the most influence on big data and its ecosystem in a particular year. The influencers are listed in the following categories:

Analysts
Social Media
Online Media
Products
Techies
Coach
Thought Leaders

Click here to read the methodology used.

Analysts:
Doug Henschen	It would have been hard to miss Doug Henschen's writing for InformationWeek. With his accomplished media experience and proven expertise in industry analysis, Doug has now joined Wang at Constellation Research, where he talks about big data. His current focus areas include good data, streaming, cloud solutions and self-service data.
Merv Adrian	A sane voice on big data at the influential research firm Gartner, Merv Adrian helps us make sense of the dichotomy between the data warehouse and the data lake. He understands the breadth and depth of the Hadoop ecosystem and provides the vision to cut through the hype.
Tony Baer	When you talk to Tony Baer, don’t expect rebel thoughts, just plain, incisive wisdom unravelling with each statement. With prose that reads like poetry, his analysis leaves an indelible mark on your understanding of the big data ecosystem. He has remained at the top of the Hadoop analyst game for many years in a row now.

Social Media:
Bernard Marr	Bernard Marr is an author, speaker and consultant with wide interests in strategic performance, analytics, KPIs and big data. He is the founder of the Advanced Performance Institute and provides consulting to various organizations. Bernard has a massive following on Twitter, and his LinkedIn posts generate huge interaction and interest.
Cloudera	Cloudera is the market leader in Hadoop distros and at the same time continues to influence social media followers. It may not have the most followers compared to other companies, but most of its messages get the right amplification and impact. Kudos to the Cloudera social marketing team.
Gregory Piatetsky-Shapiro	Gregory is the President of KDnuggets and a founder of KDD (the Knowledge Discovery and Data Mining conferences). His social media messages attract the right amount of traffic and eyeballs, making him one of the most relevant social media influencers.


Online Media:
O’Reilly Media	O’Reilly Media is now a diversified group with interests ranging from books to blogs and from webinars to conferences. With Strata Hadoop World as one of its most visible products after books, O’Reilly Media is definitely shaping big data opinion in the industry.
TDWI	With research papers, blogs, webinars and education events, TDWI continues to attract impressions and leads for marketers.
The Cube	The Cube is a pioneering online television series filmed at various industry events. It brings the best minds onto the show to speak about the future of big data. Chic, image-setting television, it boasts CxO speakers like no other forum can.


Products:
Actian Vortex	Actian Vortex is a really sharp SQL-in-Hadoop product that brings the best of database SQL to the Hadoop and YARN world. With innovative engineering under the hood to support ACID transactions and higher performance, it has inspired quite a few solutions in its arena.
Apache Flink	Apache Flink started off as a research project and soon created a unique identity for its streaming capabilities. It has influenced quite a few features in competing streaming products, such as off-heap memory management, datasets and the like.
Kyvos Insights	Kyvos Insights is an OLAP product that builds cubes at big data scale while assuring a low-latency SLA on Hadoop. With pre-canned cubes, interactive queries on terabytes of data within 2 seconds are a real possibility and an eye-catcher. As the trendsetter for cubes on Hadoop, it has inspired a few imitations in its trail, but none on par so far.

Techies:
Reynold Xin	As one of the co-founders of Databricks and Apache Spark, Reynold Xin continues to influence major innovations in Spark, including Tungsten memory management, DataFrames and many more. Sharp and futuristic, he is a real tech force.
Roman Shaposhnik	With the Open Data Platform (ODP), Roman Shaposhnik has found a new home for corporate Hadoop and continues to lead the initiative magnificently. Pushing other Apache projects like Bigtop alongside and acting as a mentor to others like Ignite, Roman emerged as a true tech leader in the last year.
Todd Lipcon	When Todd brought HA to Hadoop, he brought Hadoop to the enterprise infrastructure. When Todd Lipcon brought Kudu to Hadoop, he brought Hadoop to the enterprise database. Believe it or not, Todd has unassumingly and unwittingly become the enterprise champion for Hadoop.

Coach:
Paco Nathan	If you have looked for a Spark session at an industry event or in online resources, chances are you have attended one of Paco Nathan’s sessions. An evil mad scientist, as he likes to proclaim himself, he is about much more than Spark, with data science, maths, venture capital and learning coaching among his many interests.
Shane Curcuru	Championing community over code and Apache open source over corporate proprietary software, Shane Curcuru has been evangelizing Apache for years now. As one of the ASF directors, he ensures the Apache brand name is taken care of in the right measure and that community-driven projects get their fair share of the sun.

Thought Leaders:
Ion Stoica	As one of the main founders of Apache Spark, Ion Stoica has already rallied the entire data world around one product. However, his vision with Databricks does not seem to be confined to just a batch execution engine. It seems Databricks is out to get a bigger share of the data center with its cloud offerings, and the innovations continue rolling in at an unprecedented velocity.
Mike Olson	As the Chief Strategy Officer and Chairperson of Cloudera, Mike Olson has made sure Cloudera remains at the top of the Hadoop game. Resisting the bait of an IPO or acquisition and staying on the innovation path, he has kept Cloudera a steady ship. Open to disruptions like Spark and embracing partners, he has been a true leader who thinks and acts with vision and authority.


Ahna Carlson, April 5, 2016 at 11:51 AM
Congratulations Merv Adrian, Tony Baer, Constellation, TDWI.
