Big Data and IoT analytics: the bridges are being built

The Internet of Things (IoT) is expected to see strong growth, with 4.9 billion connected devices in 2015 and over 25 billion by 2020, according to Gartner. Revenues generated by IoT products and services are expected to exceed $300 billion in 2020. The impact is already visible in most sectors, whether in safety, health, the environment, etc. And because of the very high volume of data made available by these connected objects, the link with Big Data solutions is now clear.

Thus, the IoT can “overcome noticeable problems in ‘data mining’ because it provides new methods for the collection, analysis and effective use of these data,” Gartner analysts note. The IoT is being heralded as the new technological revolution. Solutions will therefore evolve and expand rapidly, with new contributions from Open Source, but not only from there.

Building an analytics ‘stack’ for IoT

The main building blocks of an IoT software stack are already largely, if not entirely, available. For the data transfer protocol (over a wireless connection: Wi-Fi, Bluetooth or a cellular network), MQTT (Message Queuing Telemetry Transport) is often cited, but so is XMPP.
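To give an idea of how MQTT routes messages, here is a minimal sketch of its topic-filter matching, where `+` stands for exactly one topic level and `#` for all remaining levels. It uses only the Python standard library; the function name `topic_matches` and the sample topics are assumptions made for illustration, and the sketch leaves out the special handling of topics beginning with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style topic filter matches a concrete topic.

    Simplified sketch: '+' matches one level, '#' matches all remaining
    levels (including none) and is only valid as the last level.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only in last position.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[index]:
            # A literal level that does not match.
            return False

    # Without '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)
```

A subscriber using the filter `sensors/+/temperature` would receive `sensors/room1/temperature` but not `sensors/room1/humidity`, whereas `sensors/#` would receive both.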

All the data collected must then be stored. Commonly cited options are Hadoop Hive or a NoSQL database (Couchbase, for example, which offers high throughput and low network latency). Additional data can be added using, among others, Hadoop HDFS. Messages can also be pushed and distributed via a message ‘broker’ such as Apache Kafka or, in the MQTT family, Mosquitto. For message flows or ‘workflows’, “real-time” stream processing engines such as Apache Storm can also be used. We are at the heart of Open Source environments.
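The division of roles described above (a broker that distributes messages, a store that keeps them, a real-time engine that processes them) can be sketched with a toy in-memory pipeline. Nothing here is the API of Kafka, Mosquitto, Couchbase or Storm: `InMemoryBroker`, `RunningAverage`, `run_pipeline` and the topic name are hypothetical stand-ins, chosen only to show how the pieces fit together.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Iterable, List, Tuple

Handler = Callable[[str, dict], None]


class InMemoryBroker:
    """Toy stand-in for a message broker: fans each message out to subscribers."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message and return how many handlers received it."""
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


class RunningAverage:
    """Toy stand-in for a real-time processing step."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def update(self, value: float) -> None:
        self.count += 1
        self.total += value

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def run_pipeline(values: Iterable[float]) -> Tuple[List[dict], float]:
    """Publish readings; one subscriber stores them, another aggregates them."""
    broker = InMemoryBroker()
    stored: List[dict] = []  # stand-in for the storage layer
    average = RunningAverage()
    topic = "sensors/temperature"  # hypothetical topic name
    broker.subscribe(topic, lambda _topic, msg: stored.append(msg))
    broker.subscribe(topic, lambda _topic, msg: average.update(msg["value"]))
    for value in values:
        broker.publish(topic, {"value": value})
    return stored, average.mean
```

The point of the design is that the producer only knows the broker: storage and processing are added as independent subscribers, so either can be swapped out without touching the devices.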

Whatever protocol is selected, the flow of event messages and notifications should run smoothly. It is advantageous to store these data in raw form, as received from the source, if one wants to debug or replay sequences for testing. One must be able to make changes, alterations or enhancements, since some data may be missing or replaced by others.
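One simple way to keep raw events for debugging and replay is an append-only file with one JSON document per line. The sketch below assumes that layout; the file name `events.jsonl`, the field names and the helper functions are hypothetical, and a real deployment would more likely rely on the retention features of the broker or of HDFS.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, List


def append_event(log_path: Path, event: dict) -> None:
    """Append one raw event to the log, one JSON document per line."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


def replay(log_path: Path, start: int = 0) -> Iterator[dict]:
    """Yield stored events in order, beginning at line index `start`."""
    with log_path.open("r", encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            if index >= start and line.strip():
                yield json.loads(line)


def demo_replay() -> List[dict]:
    """Store three readings (one with a missing value), then replay from the second."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "events.jsonl"
        for seq, value in enumerate([20.5, None, 21.0]):
            append_event(log_path, {"seq": seq, "device": "sensor-1", "value": value})
        return list(replay(log_path, start=1))
```

Because the log is never rewritten, a test run can replay the same sequence as many times as needed, including the reading whose value is missing.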

In the Hadoop environment

These are a few reference points. Storing the data also opens up the possibility of further analysis at a later date, using the tools of your choice. A wide range is available in the Hadoop environment, Pig being one example.
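As an illustration of the kind of deferred analysis this makes possible, here is a small group-and-average pass over stored events, written in plain Python rather than Pig. The field names `device` and `value` are assumptions; in a Hadoop setting the same logic would be expressed as a grouping and aggregation job over files in HDFS.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List


def average_by_device(events: Iterable[dict]) -> Dict[str, float]:
    """Group events by device and average their values, skipping missing ones."""
    groups: DefaultDict[str, List[float]] = defaultdict(list)
    for event in events:
        value = event.get("value")
        if value is not None:
            groups[event["device"]].append(value)
    return {device: sum(values) / len(values) for device, values in groups.items()}
```

Devices whose readings are all missing simply do not appear in the result, which keeps the gap visible rather than filling it with an invented figure.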

Among recent software solutions along these lines, HPE introduced this summer a new version of its Vertica Big Data platform. In this version, codenamed ‘Excavator’, streaming data functionality has been enriched, as have log search capabilities in text files, especially for IoT devices. This version supports Apache Kafka. With this Open Source message distribution system, it paves the way for monitoring and process control deployments in industry, health or finance.

Log search, thus opened up, will make it possible to collect and organize very large data sets generated by applications or systems, which broadens its use to diagnosing application failures, preventing cyber attacks and detecting unauthorized access.
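To make the unauthorized-access use case concrete, the following sketch scans text log lines for repeated failed logins from the same source. The log format (`login failed user=... src=...`), the threshold and the function name are invented for the example; this is not how Vertica implements log search, only an illustration of the kind of question such a search answers.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical log format, for illustration only.
FAILED_LOGIN = re.compile(r"login failed user=(?P<user>\S+) src=(?P<src>\S+)")


def suspicious_sources(lines: Iterable[str], threshold: int = 3) -> List[str]:
    """Return the sources with at least `threshold` failed logins, sorted by name."""
    failures: Counter = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group("src")] += 1
    return sorted(src for src, count in failures.items() if count >= threshold)
```

At the volumes generated by IoT devices, the same counting would be pushed down into the analytics platform rather than done line by line in a script.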

In parallel, the publisher confirmed its commitment to Big Data by announcing the Haven Startup Accelerator program, which gives developers access to the Vertica community ‘libraries’. It also announced native integration of Vertica with Apache Spark and compatibility with SQL queries from the Hadoop environment. It will be possible to build data models in Spark and then run them in Vertica’s analytics engine.
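The idea of building a model in one engine and running it in another can be illustrated, very loosely, with a least-squares line fitted in Python and then expressed as a SQL scoring query. This is not the Spark or Vertica integration mechanism, which the announcement does not detail; the table name `readings`, the column name `temperature` and both function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equally long inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def scoring_sql(intercept: float, slope: float,
                table: str = "readings", column: str = "temperature") -> str:
    """Express the fitted model as a SQL query the database could run itself."""
    return (f"SELECT {column}, {intercept!r} + {slope!r} * {column} "
            f"AS predicted FROM {table}")
```

The model is trained where the tooling is most convenient, while scoring happens next to the data, so the full data set never has to leave the database.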

Clearly, the border between the proprietary world and Open Source will continue to open up.
