How are the latest trends increasing demand for Data Scientists?
February 9th, 2023
In this age of big data, and with the need to build better relationships with customers, businesses are demanding faster and easier access to information in order to make reliable, smart decisions. In-Memory is an emerging technology that is gaining attention because it gives users immediate access to the right information, which results in better-informed decisions. I’ll try to explain some of the benefits of the In-Memory approach, some of its limitations, and how it has started to become more prevalent in the database arena.
Firstly, In-Memory is not a new concept. Hasso Plattner and Alexander Zeier note in their book ‘In-Memory Data Management: An Inflection Point for Enterprise Applications’ that in-memory databases have been around since the 1980s. The problem has always been the cost, and the limited amount of RAM database servers could use. For example, 2 megabytes of RAM in 1985 would have cost around £600. This is in direct contrast to current pricing, where you can pick up 8 gigabytes of RAM for well under £100.
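To put those figures in perspective, the price per megabyte has fallen by several orders of magnitude. A quick back-of-the-envelope calculation using the prices quoted above:

```python
# Rough price-per-megabyte comparison using the figures quoted above.
price_per_mb_1985 = 600 / 2          # £600 for 2 MB  -> £300 per MB
price_per_mb_now = 100 / (8 * 1024)  # £100 for 8 GB (8,192 MB) -> roughly a penny per MB

print(f"1985:  ~£{price_per_mb_1985:,.0f} per MB")
print(f"today: ~£{price_per_mb_now:.3f} per MB")
print(f"roughly {price_per_mb_1985 / price_per_mb_now:,.0f}x cheaper per megabyte")
```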
Secondly, modern operating systems such as Windows are now based on 64-bit architecture and can handle a terabyte of RAM with ease, allowing enterprise applications to be delivered on standard, off-the-shelf servers, without a mainframe in sight.
Simply put, In-Memory databases reside in the RAM of the server they are installed on. This can provide performance improvements of 10x to 100x over traditional database systems that store their data on disk. That extra speed lets them handle an incredibly high throughput of data, answering in seconds queries that would have taken minutes or hours on standard tools.
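None of the commercial engines are shown here, but the basic effect is easy to demonstrate with SQLite, which can run either against a file on disk or entirely in RAM. A minimal sketch with an illustrative table; absolute timings will vary by machine, and the gap widens as the working set outgrows what the operating system can cache:

```python
import os
import sqlite3
import time

def load_and_query(conn, rows=200_000):
    """Create a table, bulk-insert rows, run one aggregate query; return elapsed seconds."""
    start = time.perf_counter()
    cur = conn.cursor()
    cur.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        ((i, "UK" if i % 2 else "US", i * 0.01) for i in range(rows)),
    )
    conn.commit()
    cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    cur.fetchall()
    return time.perf_counter() - start

if os.path.exists("sales_demo.db"):
    os.remove("sales_demo.db")              # start the on-disk test from a clean file

in_memory = sqlite3.connect(":memory:")     # the whole database lives in RAM
on_disk = sqlite3.connect("sales_demo.db")  # the same workload, backed by a file on disk

print(f"in-memory: {load_and_query(in_memory):.2f}s")
print(f"on-disk:   {load_and_query(on_disk):.2f}s")
```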
Well, despite the drop in the price of RAM and the improvements in the processing capabilities of the servers running these appliances, some things are still, indeed, finite. The biggest issue for the in-memory approach is the amount of memory a server can physically have installed. In-memory vendors have resorted to several ‘workarounds’ to this problem.
In fact, a number of vendors combine some or all of these workarounds in their solutions, addressing the scalability issues found within in-memory architectures.
Ok – so the problem is solved from a technology perspective. What is really holding people back?
When dealing with the world of Big Data – and by that I mean billions of rows of data from multiple systems – you require ‘big hardware and big ticket software’ to match. The leading lights of the database world (Oracle, SAP, IBM and so on) often hold their cards close to their chests on this, but hardware and software alone can easily reach several hundred thousand dollars, if not more – and that doesn’t include installation, support or training. It is also worth remembering that the database is only one piece of the bigger puzzle, with other technologies required for processes such as Extract, Transform and Load (ETL), data cleansing, manipulation and data mining.
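None of those surrounding tools are named here, but even a toy pipeline shows why the database is only one piece of the puzzle: data has to be extracted from source systems and cleansed before it can be loaded and analysed. A minimal sketch, in which the CSV feed, table and column names are all hypothetical:

```python
import csv
import io
import sqlite3

# Hypothetical CSV export from a source system; in practice this would be a file or feed.
raw_feed = io.StringIO("customer,spend\n alice ,100\nBOB,\nalice,250\n")

# Extract: read the raw rows from the source.
rows = list(csv.DictReader(raw_feed))

# Transform: cleanse customer names and drop rows with missing spend values.
clean = [
    (r["customer"].strip().lower(), float(r["spend"]))
    for r in rows
    if r["spend"] and r["spend"].strip()
]

# Load: write the cleansed rows into the analytical database (in-memory here).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE spend (customer TEXT, amount REAL)")
db.executemany("INSERT INTO spend VALUES (?, ?)", clean)
print(db.execute("SELECT customer, SUM(amount) FROM spend GROUP BY customer").fetchall())
# -> [('alice', 350.0)]
```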
Which brings us on to the second point – ease of use. While the business is certainly demanding faster access to data, it still has to ask the IT department to help get to that data. This is because the Business User (who has the domain expertise, and therefore should have control of the data) does not have the programming expertise to use the tools. This means that, despite the technological advances made, the process for data-led solutions remains the same.
Business requirements gathering is a lengthy and arduous task: the requirements change often and the results don’t always hit the mark. And because In-Memory technology is often forced to work with aggregated data (to fit within the memory limits described above), it still comes with the traditional problems associated with any analytical implementation.
Solutions are coming to market that combine these advances in technology with an approach that gives business users access to their raw data, letting them create applications that meet the business needs first time. This ‘Data Driven’ approach will allow the business to get more from its data in a shorter timeframe.
These solutions also encompass new ways of storing and analysing data, using new algorithms and architectures (such as a ‘Pattern Based architecture’) to shrink the data by over 30 times – allowing for huge datasets to be analysed within the confines of a single server.
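The article doesn’t spell out how a ‘Pattern Based architecture’ works, so the sketch below uses two generic building blocks that many columnar and in-memory stores rely on – dictionary encoding and run-length encoding – purely to illustrate how repetitive business data can shrink dramatically; the column contents are illustrative only:

```python
from itertools import groupby

def dictionary_encode(values):
    """Replace repeated string values with small integer codes plus a lookup table."""
    codes = {}
    encoded = [codes.setdefault(v, len(codes)) for v in values]
    return encoded, codes

def run_length_encode(encoded):
    """Collapse runs of identical codes into (code, run_length) pairs."""
    return [(code, sum(1 for _ in run)) for code, run in groupby(encoded)]

# A sorted, low-cardinality column (e.g. an order 'status') compresses extremely well.
status_column = ["SHIPPED"] * 50_000 + ["PENDING"] * 30_000 + ["CANCELLED"] * 20_000

encoded, lookup = dictionary_encode(status_column)
compressed = run_length_encode(encoded)

print(f"{len(status_column):,} raw values -> {len(compressed)} (code, run-length) pairs "
      f"plus a {len(lookup)}-entry dictionary: {compressed}")
```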
However, relational databases will continue to dominate the marketplace until the perception of In-Memory (as difficult to deploy and expensive to maintain) changes to one of speed, ease of use and low cost. This can only be achieved by integrating new tools for delivery – ones that enable the business user to reap the rewards of improved access to mission-critical data.
The time is fast approaching when technology will have an answer to these problems, so in the future the question will be ‘When will In-Memory take over the world?’ – and that point is not far off.
Matthew Napleton of Data-Re
Biography
Since graduating in 2003, Matthew has been involved in both selling and marketing cutting-edge enterprise technology solutions to the Retail, Logistics, Finance and Government sectors.
Over the last 6 years at Data-Re he has witnessed first-hand the explosion in enterprise data, the challenges that it brings, and how new technologies and approaches can help alleviate these issues – demonstrated by their unique solution, the ZiZo Data Platform. For more information please visit www.datare.co.uk
KDR Recruitment is the home of the best data, technology and analytics jobs. For more data news and views follow KDR on LinkedIn.