Why Hadoop is complicated, and how to approach it more simply

As Toby Wolpe reported previously, Gartner has released a study of its Research Circle members showing that enterprise adoption of Hadoop does not match the level of media buzz.

With almost any new technology, there is generally a gap between what technology journalists and analysts imply everyone is doing with it and what everyone is really doing with it.

Yet despite the Gartner study's finding that, in Toby's words, “only 26% are deploying, piloting or experimenting with Hadoop,” that is actually a very promising figure. Hadoop's heritage is that of a specialist's tool, not a line-of-business tool. That is changing, but the process is far from over. Given this, a penetration rate of 26% is pretty good, and it will only get better.

Hadoop and the incumbent database

Last week at the Microsoft Ignite conference, the Redmond company announced SQL Server 2016 (see the article by Mary Jo Foley), the next version of its popular relational database management system (RDBMS). An important part of this announcement was that PolyBase, which serves as a bridge between Hadoop and SQL Server, will be available in the mainstream version of SQL Server, not only in the “Analytics Platform System” version and the cloud-based “Azure SQL Data Warehouse” (the latter having been announced only the previous week).

In other words, Microsoft is taking the ability to map data stored in the Hadoop Distributed File System (HDFS) as external tables in SQL Server and making it available as a feature to the company's RDBMS customers. We must not forget that SQL Server is one of the leading RDBMSs on the market in terms of installed units and revenue. It is no small thing to let everyone in this vast ecosystem access data in Hadoop using the skills they already have (that is, the Transact-SQL language and its queries).

The opposing viewpoint

The contrarian interpretation of the Gartner study is that Hadoop is somewhat weak. But what is weak is rather the willingness of companies to invest in new, expensive skills only to accept the low productivity that comes with working with Hadoop through its various command-line shells and scripting languages. A good data engine should operate behind the scenes, not in the limelight.

Microsoft's PolyBase technology for SQL Server is just one architectural approach to making Hadoop an enabling component rather than something customers have to examine closely and master themselves.

There are other approaches, both to standing up a Hadoop cluster and to working with Hadoop. Companies such as Qubole and Altiscale address the former and, to a less abstracted degree, so do Amazon Web Services, Microsoft and Google. Other products and tools address Hadoop's front end, sometimes with a SQL interface, sometimes without.

Hadoop is definitely here

Storing data in HDFS can be very attractive from an economic point of view. In many ways, HDFS is Hadoop's killer app. If only for that reason, Hadoop is here to stay. But it is thanks to mature analytical tools for Hadoop, DBMS abstraction layers and Hadoop-as-a-service cloud offerings that Hadoop will become usable for the majority of technology users. Confining those users to a character-mode terminal window will not get the platform there.
