TechED 2013 Ramp – Do You Have Big Data?
TechED is only a week away and, as I’ve noted in a previous post, I have several sessions. I thought I would spend the week running up to TechED 2013 North America whetting your appetite for the presentations, either in person or on Channel 9. Today I’ll discuss the session titled Do You Have Big Data? The session is on Tuesday, June 4th, 2013 at 8:30 am. If you are attending TechEd, please shake off the previous night on Bourbon Street and head over early to this session on Big Data.
Let me start off with the obvious by saying it would be a real letdown if I had a session called Do You Have Big Data? and it turned out I thought the answer was no. So yes, I think most organizations have it and they just haven’t begun the journey in earnest yet. What the heck is Big Data, you say?
Traditionally we describe it along the three V’s of Velocity, Volume, and Variety.
- Most define Velocity as the speed at which data arrives. I personally think this misses the issue, as we can handle a great deal of velocity by this definition with traditional data handling methods. But if you define Velocity as the speed at which you can respond to the data, then that is a different question.
- Volume is usually defined as the total amount of data stored. The problem is, how much Volume is needed to qualify as big? I would say it depends on the other two V’s along with your capability to handle it with traditional means.
- Variety is usually described as a data set in which you don’t know the structure of the data as it arrives. Typically I think of documents and web logs as examples that many customers may be encountering.
Some say if you meet two of the three V’s, you have Big Data. Meh.
The problem with the Volume, Velocity, Variety description is that you could be talking about
The same could be said for Water, Humans, or many other objects.
How does any of that help you explain to Business Users how Big Data can solve a problem they have? I don’t have a clue. We are doing what we technologists do all the time: answering a technology question with technical answers. But that doesn’t tell us what something is. So let’s come at it from another angle.
Organizations have been collecting data for a long time now. Data about Customers, Sales, Marketing, Inventory, Production Processes, Fleet Management and on and on. They have struggled for years creating data warehouses in an attempt to get to the single version of the truth. Companies have sprouted Business Intelligence organizations to provide insight into the state of the business. Generally BI Reports tell us what happened. This is called Performance Analytics and it is a good goal for any organization to have, but the reality is these reports simply replaced reporting an organization previously did within ledgers. Maybe the reporting is slightly prettier, but I probably see as many legacy reports from customers that look like they are straight ports of a ledger report from 1985 as I see really exciting BI Reports that tell me a story.
Predictive Analytics, on the other hand, is a game changer for businesses. Predictive Analytics is all about determining causality between events. If you listen carefully enough, your treasure trove of data will speak to you.
- If I market to a customer, will he buy?
- If I use certain words in my online data profile will that improve my response rate?
- What products are customers likely to buy during a hurricane watch?
Big Data is a New Paradigm
A different way to think about Big Data is that it refers to things you can do at scale that cannot be done with traditional data methodologies. By leveraging your historical data and combining it with additional data sources, you can create a new form of data in a way that allows the business to see its markets, customers, or prices in a new light.
What can you do with your data so that you can improve your business? Look at the questions above and ask yourself: Do you know how to answer the ones that might be relevant to your organization? I understand the specifics may be different for you, but you should get the idea by now. Let me give you a hint: You’ll usually need to combine multiple data sets, and this is where the essence of Big Data lies.
Big Data is the ability to use the three V’s to your advantage. Use the Volume, Variety, and Velocity to be able to answer new questions that will improve your business. How do you answer what products customers will buy during a hurricane watch? Well, you’ll need lots of historical purchase information, mashed up with some weather data. Walmart determined that in the United States one of those items is Pop-Tarts. Now they place them prominently at the front of the store to make it easier for us to get a sugar high during the stress of hurricane season.
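To make the hurricane example a little more concrete, here is a minimal Python sketch of the mash-up idea: join purchase history to a weather feed by date, then compare average daily sales on hurricane-watch days against normal days. Everything in it is an assumption for illustration. The records, product names, the `hurricane_watch` lookup, and the `watch_lift` helper are all made up, and this is not how Walmart or anyone else actually did it.

```python
from collections import defaultdict

# Hypothetical daily sales records: (date, product, units sold). Made-up numbers.
purchases = [
    ("2013-05-01", "toaster pastries", 40),
    ("2013-05-01", "bottled water", 100),
    ("2013-05-02", "toaster pastries", 44),
    ("2013-05-02", "bottled water", 110),
    ("2013-05-03", "toaster pastries", 210),
    ("2013-05-03", "bottled water", 420),
    ("2013-05-04", "toaster pastries", 294),
    ("2013-05-04", "bottled water", 210),
]

# Hypothetical weather feed: date -> was a hurricane watch in effect that day?
hurricane_watch = {
    "2013-05-01": False,
    "2013-05-02": False,
    "2013-05-03": True,
    "2013-05-04": True,
}


def watch_lift(purchases, hurricane_watch):
    """Return {product: avg daily units on watch days / avg on normal days}."""
    totals = defaultdict(lambda: {True: 0, False: 0})
    days = {True: set(), False: set()}

    # The "mash-up": look up each sale's date in the weather data.
    for date, product, units in purchases:
        if date not in hurricane_watch:
            continue  # no weather record for this day, so skip it
        on_watch = hurricane_watch[date]
        totals[product][on_watch] += units
        days[on_watch].add(date)

    lift = {}
    for product, by_flag in totals.items():
        # Need both kinds of days, and nonzero normal sales, to form a ratio.
        if not days[True] or not days[False] or by_flag[False] == 0:
            continue
        watch_avg = by_flag[True] / len(days[True])
        normal_avg = by_flag[False] / len(days[False])
        lift[product] = watch_avg / normal_avg
    return lift


lift = watch_lift(purchases, hurricane_watch)
ranked = sorted(lift.items(), key=lambda item: item[1], reverse=True)
for product, ratio in ranked:
    print(f"{product}: {ratio:.1f}x normal daily sales")
```

With this toy data the pastries come out ahead of the water. The point is not the arithmetic, which is trivial, but that neither data set can answer the question on its own. At real scale the same join runs over years of transactions and many more external sources, which is where traditional methods start to strain.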
In my session at TechED we will discuss these concepts in more depth. We’ll then spend a good deal of time walking through examples of what other organizations are doing with Big Data. We’ll explore the algorithms that Data Scientists use to answer these questions. Finally we’ll discuss what tools Microsoft is making available to democratize this process and make it easier to be a practitioner in this field.
If you missed it, you can catch the recording on Channel 9 here.