The Challenges of Big Data
- Published by Alexandre Langlois
Big Data has been a topic of discussion for several months now, with definitions that vary depending on the context. In analyzing the subject, I have come to define the problems of Big Data as the challenges that arise when the mass of information generated by a company's information ecosystem degrades the analytical performance the company requires.
Limitations can appear at several points: extraction from the source systems, data transformation, integration with the MDM (master data management) layer, response time of the analytical layer and the physical limits of the network, among others.
Whether we are talking about appliances (Netezza, SAP HANA, Teradata) or analytical layers (Tableau, Actian) that promote the processing of vast amounts of data, these components do not solve the problem from beginning to end. Big Data is not a big data warehouse or an analytical tool on steroids; it is more than that. Big Data is not a component of any architecture; it is an issue addressed by a solution architecture designed for the specific challenges at hand. The difference is subtle, but important.
Above all, be careful not to confuse the challenges of Big Data with performance problems caused by a deficient solution architecture, which can be solved with existing technology.
You probably already know that the volume of information has been growing exponentially for several years. As stated by IBM in a paper on Big Data, "Every day, we create 2.5 quintillion bytes of data - so much that 90% of the data in the world today has been created in the last two years alone". This becomes more real every day with the ever-growing use of data collected by social media and online games, as well as WebTV and mobile devices: in short, any data that allows a consumer profile to be built for our virtual identity. I am not excluding other cases that could lead to the use of Big Data, such as the information environments of large financial firms and retail chains, but situations that lead to genuine Big Data problems in these contexts are rather rare.
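To give a sense of scale, here is a back-of-envelope sketch in Python based on the IBM figures quoted above. It assumes a constant daily rate over the two years, which is a simplification of my own (the text itself says growth is exponential), so treat the result as an order of magnitude only. The function and constant names are purely illustrative.

```python
# Back-of-envelope scale check for the IBM figures quoted above.
# ASSUMPTION: the daily rate is constant over the two years. Real growth
# is exponential, so this is an order-of-magnitude estimate only.

BYTES_PER_DAY = 2.5e18   # 2.5 quintillion bytes per day, as quoted
DAYS = 2 * 365           # "the last two years"
RECENT_SHARE = 0.90      # share of all data created in that period, as quoted

EXABYTE = 1e18
ZETTABYTE = 1e21


def two_year_volume(bytes_per_day=BYTES_PER_DAY, days=DAYS):
    """Bytes created over the period, assuming a constant daily rate."""
    return bytes_per_day * days


def implied_world_total(recent_bytes, recent_share=RECENT_SHARE):
    """Total volume implied if `recent_bytes` is `recent_share` of all data."""
    return recent_bytes / recent_share


recent = two_year_volume()
total = implied_world_total(recent)

print(f"Daily volume:        {BYTES_PER_DAY / EXABYTE:.1f} EB")
print(f"Two-year volume:     {recent / ZETTABYTE:.1f} ZB")
print(f"Implied world total: {total / ZETTABYTE:.1f} ZB")
```

Under these assumptions, the quoted figures imply something on the order of two zettabytes in total, far beyond what any single appliance or analytical tool is designed to hold.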
As stated by Duncan Stewart in “Deloitte's 2012 TMT prediction”, Big Data will expand in 2012, mainly for Internet companies and the public, finance, retail and media/entertainment sectors. So there will indeed be a market for Big Data, but it will be very targeted.
A concrete example
In late July 2010, Amazon linked the recommendations on its site to the data available on Facebook (your profile, your friends, your interests and your country of residence) in order to tailor content to your virtual identity. Say it is your friend's birthday. You may have forgotten about it, but Amazon has not. It will suggest a gift for that friend according to his/her interests. Imagine the amount of information that must be collected to create an analytical environment capable of analyzing every Amazon customer and their Facebook links. Each time a comment is published, a friend “likes” or “dislikes” something, or anything else happens, all of this information must be indexed and made available for analysis. We are talking about real Big Data.
There are many other examples, including CastleVille, a free Facebook game from Zynga, which has already reached 12.5 million players – that is a very large number! The whole business model behind this game is based on using the information available on the players to convince them to buy virtual items that enhance their gaming experience. The producer has every reason to analyze the available player information in real time to maximize consumer usage (i.e. purchases and expenses) and ensure a pleasant gaming experience, thus retaining customers and generating revenue. We are talking about "just in time" BI, supported by an architecture that allows the processing of Big Data. Both can go hand in hand.
Big Data is not for everyone
I have given you some concrete, real examples of Big Data to demonstrate one thing: the trend is real but does not necessarily apply to everyone. It is important to differentiate between a Big Data issue and a solution architecture issue, because the answers required in each case are quite different. Certainly, there will be more and more players in the world of Big Data looking for solutions to their problems, but remember that the basis of any investment remains its return (tangible or not). I think it is relevant to recall the spectacular failures of large data warehouse projects back when they were fashionable. Organizational (or analytic) maturity, added value and available expertise must all be taken into account before implementing an architecture for processing Big Data, in order to avoid ending up with a... Big Bertha.