Last summer, when HP gave up on smartphones to place a $10.3 billion bet on semantic-search company Autonomy, then-CEO Leo Apotheker's bold plan was to create a new kind of enterprise search-enabled content management. Today, under new CEO Meg Whitman, the company revealed some of the fruits of that acquisition, including support for the Hadoop big-data file system and a new product intended to help marketing folks better understand who uses their websites.
The killer app for big data, according to HP - and thus for Autonomy - is consumer sentiment analytics: the ability to glean from the Web's enormous collection of textual communication the gist of what buyers are saying about a given product or service. And HP isn't the only one moving in this direction. Recently, SAP has recast its HANA in-memory database system as an analytics tool. Radian6 has launched its cloud-based service from Salesforce's cloud, and startups have been trying to eke out space for themselves, as well.
Pop IDOL
In this context, Autonomy's crown jewel is a system called Intelligent Data Operating Layer (IDOL), which is a deeply rooted semantic network. Unlike Google, whose indexing is based on pattern-matching coupled with categorical contextualization, IDOL is a network of weighted associations between passages of text ascertained to have similar meanings. In fact, back when it was autonomous, Autonomy's management marketed IDOL as "the meaning network."
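To make the idea of "weighted associations between passages" a little more concrete, here is a toy sketch in Python. It is not how IDOL works internally, which the article does not describe; it simply assumes that word overlap (cosine similarity over word counts) can stand in for similarity of meaning, and the function names and the 0.3 threshold are invented for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(passage):
    """Lowercased word counts for one passage."""
    return Counter(re.findall(r"[a-z']+", passage.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def association_network(passages, threshold=0.3):
    """Link every pair of passages whose similarity meets the threshold.

    Returns a dict mapping (i, j) index pairs to an association weight.
    The threshold is an arbitrary, illustrative cutoff.
    """
    vectors = [term_vector(p) for p in passages]
    edges = {}
    for i, j in combinations(range(len(passages)), 2):
        weight = cosine(vectors[i], vectors[j])
        if weight >= threshold:
            edges[(i, j)] = weight
    return edges


passages = [
    "the battery life on this phone is terrible",
    "terrible battery life on the new phone",
    "quarterly revenue rose on strong server sales",
]
network = association_network(passages)
```

In this sketch the two battery complaints end up linked with a high weight, while the revenue sentence stays unconnected. A real semantic system would need something far richer than word counts to associate passages that share a meaning but not a vocabulary.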
Based on today's announcements, HP's plan for the analytics space now looks something like this: Rather than rely on simple database queries and transforms that comprise the nucleus of an in-memory database, Autonomy will market IDOL as a data discovery tool for researching and ascertaining the meaning of big data as it's being collected. To ensure Autonomy's connection to big data, HP also announced it will support the Hadoop file system. Reflecting the many mergers and amalgamations that culminated in this product, the result is being called (are you ready?) HP Autonomy Optimost Clickstream Analytics.
Autonomy acquired Promote Multichannel Technology in 2007, giving it a foothold in social media and consumer interactions. The new HP plan puts the man in charge of this division - Andrew Joiner, whose title remains CEO of Autonomy Promote - in the point position. In an interview with ReadWriteWeb, Joiner said that what distinguishes Clickstream from in-memory competitors such as SAP HANA and VMware SQLFire is the way it ascertains the meanings behind unstructured data and places a value on those meanings - value that may be helpful in making important real-world assessments.
"Imagine if you're a Wall Street firm, and you're communicating with customers, subject to a whole host of regulations and corporate [policies]. If you're trying to assess your risk, you cannot only look at the structure side of the equation," Joiner says. "Just because someone sends 50 messages more than another person doesn't necessarily mean that individual has more risk. You have to look at a combination of things: Potentially that person's sending them late at night, and they're a registered person, and they're talking about these types of topics, calling me on a nonreported line, calling me on my personal cell phone - that's a measure of risk. So you have to look at when that person [talks] and who that person is, but then combine that with the topics of the information that are contained in that email, which is the human-friendly information. The combination gives you an assessment of risk."
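Joiner's example combines structured facts about a message (who sent it, when, over which channel) with what the message is actually about. A minimal Python sketch of that combination follows. The field names, topic lists and weights are all hypothetical, chosen only to illustrate the reasoning; none of them come from Autonomy's product.

```python
from dataclasses import dataclass

# Hypothetical reference data, for illustration only.
SENSITIVE_TOPICS = frozenset({"earnings", "merger"})
REPORTED_CHANNELS = frozenset({"corporate_email", "recorded_line"})


@dataclass(frozen=True)
class Message:
    sender_registered: bool  # is the sender a registered person?
    hour_sent: int           # 0-23, local time
    channel: str             # e.g. "corporate_email", "personal_cell"
    topics: frozenset        # topics extracted from the message content


def risk_score(msg):
    """Combine structured metadata with content topics into one score.

    The weights are arbitrary; the point is that no single signal,
    and certainly not message volume, determines the result.
    """
    score = 0.0
    if msg.sender_registered:
        score += 1.0
    if msg.hour_sent >= 22 or msg.hour_sent < 6:  # late at night
        score += 1.0
    if msg.channel not in REPORTED_CHANNELS:      # nonreported line
        score += 2.0
    score += 1.5 * len(msg.topics & SENSITIVE_TOPICS)
    return score


# Fifty routine messages from one person versus a single message from another.
routine = [Message(False, 10, "corporate_email", frozenset({"lunch"}))] * 50
flagged = [Message(True, 23, "personal_cell", frozenset({"merger"}))]

routine_risk = max(risk_score(m) for m in routine)
flagged_risk = max(risk_score(m) for m in flagged)
```

Here the person who sent 50 messages scores lower than the person who sent one, which is the distinction Joiner draws: volume alone says little until it is combined with timing, identity, channel and topic.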
We've seen use cases in which a system ascertains which customers are making the most revealing comments about a product through Twitter or Facebook. But this is something else, which is why Joiner's product purview has been described using the word "surveillance." Is the information you're reading about a company in which your firm has a significant investment actionable? IDOL is designed to answer questions like that.
In a case like this, Joiner says, businesses want to be able to ascertain the relative value of the information they receive based on the behavior of the people providing that information. "What are they interested in today? What have they shared with me today? Can I leverage that information to make my website, the [phone] call with me, my engagement with social media more personalized, dynamic and relevant?"
Who Develops the Analytics Reports?
Will there be applications that Clickstream Analytics users will be able to use right out of the box (to borrow a phrase from the days when software came in boxes) that will enable enterprises whose data is already stored using Hadoop to generate meaningful analyses?
"Because it's geared towards marketers, we tried to remove a lot of the technical orientation," Joiner says. "In other words, we're trying to ensure that, from a deployment perspective, it can be done in the cloud, and they can get up-and-running with clicking and tracking very quickly." There are a few components involved, he continues. One is called the universal tracker, which is used on websites to track user activity. That activity is stored within the HP Vertica infrastructure (which is part of yet another HP acquisition). That information by itself will be enough to generate results using raw data. From there, he goes on, the analytics application answers some of the most difficult questions immediately.
"For instance, we'd like to have a better understanding of who is coming to your website - not what they're doing or what they're actually clicking on, but who those individuals are. We're taking socio-demographic information about your users - behavior that we can learn. What are they reading on the page? That information then becomes part of the corpus of analysis, so you can ask simple questions of [what will make] your website more dynamic and more personable... the real questions that you're trying to ask, that traditional structured analytics can't tell you."
But what is it about the quality of Clickstream's analysis that renders the speed advantage of an in-memory database moot? "Essentially what an in-memory database gives you is a simpler answer, faster," the division CEO says. "There are a couple of things that give those types of [in-memory] platforms a lot of difficulty. One, they're not capable of dealing with audio or video or the rich forms of information that are out there - a whole class of information that exists in rich media, that they're not able to analyze. They can't tell you what customers are calling about in the contact center, they can't tell you what they're sharing in a social channel that includes video. Secondly, if the data is changing - and that's frequently what's happening in social media - to have to structure that information and write explicit queries against it, you're simply going to miss the insights. You have to have something that naturally, statistically derives the meaning of that information such that you can continue to ask the evolving questions."
The Information-Centered Economy
So from Andrew Joiner's perspective, has HP's mission for Autonomy changed - perhaps been fine-tuned since the tumult at the top replaced Leo Apotheker with Meg Whitman? "I believe the macro-vision hasn't and shouldn't change. For the past 30 or 40 years, we've only been focused on the technology side of the equation. And rightfully so - we've had nine evolutionary changes from mainframes to client/server to PCs to mobile and cloud. But if you look at where we're going in the next 30 to 40 years, it's going to be driven around information. The ability to derive and understand information is going to be your source of competitive advantage."
He discussed the rapid evolution within organizations of professionals whose entire job function is the examination and analysis of information - including in the sales department analyzing customers, in the marketing department assessing social media activity, and in the legal department analyzing corporate policies and international regulations and variations on risk. "You have throughout your organization a tremendous amount of individuals dealing with information, and it's a source of advantage if you can process it effectively. I don't believe that macro-vision has changed. What's changed is, HP now has all the components to actually deliver that promise of information."
Photo above: Rafael Matos driving the HP IndyCar in Sao Paulo, Brazil, in 2009. [Photo credit: HP]