With Hadoop
2.0's arrival, you will need to explain the benefits of the reigning big data
platform to business types and C-suite executives.
Big data is a popular topic these days, not only in the tech media but also in
mainstream news outlets. And October's official release of Hadoop 2.0, the big
data software framework, is generating even more media buzz.
But while you,
InformationWeek reader, clearly understand Hadoop's significance, there's a
high probability that many people in your organization -- including more than a
few managerial types in the C-suite -- aren't really sure what Hadoop is, what
it does, or why it's important.
So, how do you explain
Hadoop to non-geeks? One approach is to focus on the benefits of Hadoop and big
data, rather than providing mind-numbing details (with forgettable acronyms) on
how it all works.
Forrester
analyst Mike Gualtieri took this "benefits" approach in June when he
posted a brief tutorial video that
provided an easy-to-grasp overview of Hadoop. He calls it a platform that makes
big data easier to manage.
"To understand Hadoop,
you have to understand two fundamental things about it," Gualtieri
explained in his video. They are: How Hadoop stores files, and how it processes
data.
He added: "Imagine you
had a file that was larger than your PC's capacity. You could not store that
file, right? Hadoop lets you store files bigger than what can be stored on one
particular node or server. So you can store very, very large files. It also
lets you store many, many files."
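For readers who do want a peek under the hood, the storage idea Gualtieri describes can be sketched in a few lines of Python. This is a toy illustration, not Hadoop's actual API: the function names, the node names, the tiny block size, and the round-robin placement are all assumptions made for the example. The point is only that a file cut into blocks can be spread across machines, none of which holds the whole thing.

```python
def split_into_blocks(data, block_size):
    """Cut data into consecutive chunks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, node_names):
    """Assign blocks to nodes round-robin (a toy policy, not Hadoop's real one)."""
    placement = {name: [] for name in node_names}
    for index, block in enumerate(blocks):
        node = node_names[index % len(node_names)]
        placement[node].append((index, block))
    return placement


def read_file(placement):
    """Reassemble the original bytes by collecting blocks in index order."""
    indexed = [pair for stored in placement.values() for pair in stored]
    indexed.sort(key=lambda pair: pair[0])
    return b"".join(block for _, block in indexed)


# A 1,000-byte "file" stands in for one too big for any single machine.
big_file = b"x" * 1000
blocks = split_into_blocks(big_file, block_size=300)
cluster = place_blocks(blocks, ["node-a", "node-b", "node-c"])
```

Here no single node stores all 1,000 bytes, yet `read_file(cluster)` returns the original content. A real cluster also keeps extra copies of each block so that a failed machine does not lose data, which this sketch leaves out.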
By focusing less on the
jargon of Hadoop and big data, and more on the platform's real-world benefits,
experts can effectively convey its value to business colleagues who do not have
data-science backgrounds.
Mike Gualtieri explains Hadoop in a video posted on the Forrester blog.
"Mainstream business
users don't need to know how Hadoop works," Gualtieri told InformationWeek
via email. "But they do need to understand that the constraints they once
had on storing and processing data are removed when Hadoop is installed."
As a result, "the
business can start thinking big again when it comes to data," he added.
The barrage of news reports
on all facets of big data has helped introduce business people to Hadoop, even
though a lot more education is needed. Those reports cover big data's potential
to fight various diseases, reduce government bureaucracy, locate terrorists and,
on a more mundane level, help businesses sell more stuff.
"There is less
confusion than there was 12 months ago," Gualtieri said. "Executives
just know that it is a big data technology, and that is enough for them."
OK, so what's this
"MapReduce" thing then? It's part of Hadoop too, right? As Gualtieri
explained in his video: "The second characteristic of Hadoop is its
ability to process that data, or at least (provide) a framework for processing
that data. That's called MapReduce."
But rather than take the
conventional step of moving data over a network to be processed by software,
MapReduce uses a smarter approach tailor-made for big data sets.
Moving data over a network
"can be very, very slow, especially for really large data sets,"
Gualtieri added in the video. "Imagine if you're opening a really, really
big file on your laptop, it takes a long, long time. It takes much longer than
if it's a short, tiny file."
So rather than move the
data to the software, MapReduce moves the processing software to the data.
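For the technically curious, here is a minimal sketch of that idea in Python, using the classic word-count example. It is not Hadoop code: the `node_data` dictionary, the node names, and the three function names are assumptions invented for illustration, and everything runs in one process rather than on a real cluster. What it shows is the shape of the approach: each node counts words in the data it already holds, and only the small summaries travel to be merged.

```python
from collections import Counter

# Hypothetical cluster: each node already holds its own slice of the data.
node_data = {
    "node-a": ["big data is big", "hadoop stores data"],
    "node-b": ["hadoop processes data"],
}


def map_phase(lines):
    """Runs where the lines live; emits one (word, 1) pair per word."""
    return [(word, 1) for line in lines for word in line.split()]


def combine(pairs):
    """Adds up the pairs locally so only a small summary leaves the node."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def reduce_phase(partials):
    """Merges the per-node summaries into one final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Each node processes its own data; only the summaries are brought together.
partials = [combine(map_phase(lines)) for lines in node_data.values()]
word_counts = reduce_phase(partials)
```

In this toy run, `word_counts["data"]` comes out to 3 and `word_counts["hadoop"]` to 2, and the raw text lines never had to leave the node that stored them.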
Hadoop is still very complex to use, but many startups and established
companies are creating tools to change that. It's a promising trend that should
help remove much of the mystery and complexity that shroud Hadoop today.
