Blog Archives

IoT messaging systems

Compared (in this order): RabbitMQ, QPID, HiveMQ, Mosquitto, Kafka, Redis.
Focus: broker, enterprise, multi-protocol; broker, enterprise; small footprint, unreliable networks; small footprint, unreliable networks; high throughput; in-memory DB + messaging (text only).
Semantics: queue (standard, priority), pub/sub; queue (standard, priority,

Posted in General, BigData, Messaging

Hadoop stream processing

Compared (in this order): Apache Storm, Spark, Apache Samza, Apache Flink, Apache Apex.
Developer: Hortonworks (Twitter); Databricks; LinkedIn; dataArtisans; DataTorrent.
Computation model: Storm – streaming, Trident – micro-batching; micro-batching; streaming; streaming or batching; streaming with time boundaries.
API: Storm – programmatic, Trident – declarative

Posted in General, BigData, Java, Messaging

Hadoop file format comparison

Use case and environment: IoT data lake use case. 6000 devices (each with a unique ID) measure 3 values 60 times per second (60 Hertz). One day of data (24 hours) amounts to 31,104,000,000 records in the database. A table row: [ID:int, timestamp:long,
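
The daily record count follows directly from the figures in the use case. A minimal sketch of that arithmetic, assuming each row holds one sample from one device and that the three measured values share that row (the constant and function names are illustrative, not from the original post):

```python
# Figures taken from the use case: 6000 devices sampling at 60 Hz for 24 hours.
DEVICES = 6_000
SAMPLES_PER_SECOND = 60
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_day(devices=DEVICES, rate_hz=SAMPLES_PER_SECOND,
                    seconds=SECONDS_PER_DAY):
    """Return the number of rows written per day.

    Assumption: one row per device per sample, with the three
    measured values stored as columns of the same row.
    """
    return devices * rate_hz * seconds


if __name__ == "__main__":
    print(f"{records_per_day():,}")  # prints 31,104,000,000
```

If each measured value were stored in its own row instead, the count would be three times higher, so the stated total only matches the one-row-per-sample layout.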

Posted in General, BigData, DWH

Hive vs Spark vs Impala

Compared (in this order): Hive 0.13, Spark 1.6, Impala 2.1.
Support: Hortonworks + Yahoo; Databricks + Yahoo; Cloudera.
Cluster management: YARN; YARN, Mesos, local; YARN (Llama).
Engine: MR, Tez; Spark; impalad.
Where are tables stored: HDFS; HDFS (through Hive Metastore). Distributed shared object space

Posted in General, BigData