Essential Guide to Big Data: Part 1
By Mark Dexter
12th September 2011

Step forward “big data”, a term that has evolved to describe exceptionally large, diverse repositories of digital information and the processes organisations use to organise, search and extract the data they contain.

Never slow to spot a revenue opportunity, many of today’s IT giants are big on big data, especially where demand for massively parallel processing (MPP) systems capable of handling those huge databases dovetails neatly with these vendors’ existing server and storage hardware platforms.

Evidence of how highly those vendors rate that opportunity can be seen in EMC’s cash purchase of data warehouse specialist Greenplum last year, with rival IBM acquiring business intelligence and analytics company Netezza for $1.7bn in September 2010, and HP grabbing Vertica for an undisclosed sum last March.

Many other companies, including Microsoft, Oracle, SAP and Endeca, are looking to sell enhanced database, analytics and business intelligence tools based on the big data concept. The very definition of the term tends to be manipulated to play to individual product strengths in each case, however, meaning big data remains a moving target in many respects.

The data deluge

Handling large data sets is certainly a real problem for many organisations, and it’s one that’s getting bigger by the minute. IDC’s last Digital Universe study estimated that the total volume of data being stored in the world will reach 35ZB (one zettabyte is equal to a trillion gigabytes) by 2020, although much of that will be stored in personal, rather than corporate, systems and not used for business analytics or reporting purposes.

Of more relevance to this particular discussion, perhaps, is recent research from McKinsey Global Institute (MGI), which estimated that organisations across nearly all sectors in the US economy had, on average, at least 200TB stored somewhere within their IT infrastructure, with many storing more than 1 petabyte.

Complexity and speed

Some industry experts, including Microsoft CEO Steve Ballmer, believe that big data should focus less on size and more on the type of data being processed and analysed, including information stored outside the corporate firewall.

Data being searched for analytical and reporting purposes could include internet text, search indexes, call records, medical records, digital images, high definition (HD) video archives, surveillance footage and e-commerce transactions, as well as datasets created by academic, scientific and research departments or by development projects that process large volumes of information.

And all of that information could be unstructured, or distributed in flat schemas with few or no cross-reference relationships, and could also involve time-stamped events extracted from log files, sensors and social networks.

“The true challenge is not one of big data but the more complex issues across all dimensions of information management… variety, complexity and velocity of data are equally significant,” wrote Gartner analyst Stephen Prentice in a research note published in May.


