Choosing an IoT Big Data Platform: Hadoop vs TDengine and Other Time‑Series Databases
This article examines the challenges of selecting an IoT big‑data platform, compares traditional real‑time databases, Hadoop‑based solutions, and modern time‑series databases such as TDengine, InfluxDB and ClickHouse, and provides practical case studies and criteria for making an informed choice.
The article opens with the rapid growth of IoT data, distinguishes static from dynamic (time‑series) data, and explains why traditional real‑time databases struggle: no horizontal scaling, outdated architectures, weak analytics, and a lack of cloud support.
It outlines the four‑step IoT/Industry 4.0 data pipeline—data collection, edge computing, storage/query/compute, and application delivery—highlighting the importance of edge processing and cloud data engines.
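The four stages above can be sketched end to end in a few lines. This is an illustrative toy, not any vendor's API: all names (`edge_downsample`, `TimeSeriesStore`, the window size) are hypothetical, and the "store" is an in-memory dictionary standing in for a real time-series engine.

```python
import statistics

def collect(sensor_readings):
    """Stage 1 - data collection: raw samples arriving from devices."""
    return list(sensor_readings)

def edge_downsample(samples, window=5):
    """Stage 2 - edge computing: average every `window` samples
    before upload, cutting bandwidth to the cloud by `window`x."""
    return [statistics.mean(samples[i:i + window])
            for i in range(0, len(samples), window)]

class TimeSeriesStore:
    """Stage 3 - storage/query/compute: append-only series per sensor,
    standing in for the cloud data engine."""
    def __init__(self):
        self.series = {}

    def write(self, sensor_id, values):
        self.series.setdefault(sensor_id, []).extend(values)

    def query_max(self, sensor_id):
        return max(self.series[sensor_id])

def serve_dashboard(store, sensor_id):
    """Stage 4 - application delivery: expose an aggregate to an app."""
    return {"sensor": sensor_id, "peak": store.query_max(sensor_id)}
```

A run through all four stages: ten raw samples are collected, reduced to two averages at the edge, written once, then served as a dashboard aggregate.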
Traditional real‑time databases (e.g., OSIsoft PI) are described, followed by their limitations: no horizontal scaling, costly hardware, limited analytics, and no PaaS capability.
The article then discusses generic big‑data stacks built from Hadoop, Kafka, Spark/Flink, and Redis, noting that they suit massive batch processing but impose high cost and complexity on smaller deployments.
Several real‑world case studies are presented:
Smart‑park power monitoring system using an outdated real‑time database and Oracle for history, suffering from limited historical analysis and upgrade difficulty.
Vehicle telematics data warehouse built on Hadoop, facing high hardware and maintenance costs, poor real‑time query performance, and scaling challenges.
Industrial equipment management system combining Kafka, Redis, relational DB, and Cassandra, with fast writes but slow queries.
High‑frequency factory data acquisition requiring 20 ms sampling and massive throughput, suggesting TDengine or Prometheus.
Electric‑vehicle real‑time detection system that outgrew MySQL and switched to TDengine for better write performance and scalability.
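The high‑frequency acquisition case above can be sized with back‑of‑envelope arithmetic: a 20 ms sampling interval means 50 rows per second per measurement point, and the fleet size multiplies that directly. The point count below is a hypothetical figure chosen only to show the scale, not a number from the source.

```python
def ingest_rate(sample_interval_ms, num_points):
    """Rows per second arriving at the database."""
    per_point_hz = 1000 / sample_interval_ms
    return per_point_hz * num_points

def daily_rows(sample_interval_ms, num_points):
    """Rows accumulated over one day (86,400 seconds)."""
    return ingest_rate(sample_interval_ms, num_points) * 86_400

# 20 ms sampling across a hypothetical 10,000 measurement points:
# 50 Hz per point -> 500,000 rows/s -> 43.2 billion rows/day
```

Volumes of this order are what rule out row-oriented relational databases for such workloads and motivate purpose-built engines like TDengine or Prometheus.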
Key selection criteria for time‑series databases are then summarized: high‑performance real‑time writes and queries; columnar storage for structured data; one independent data stream per sensor; few updates or deletes; efficient data‑expiration handling; write‑heavy workloads; stable traffic patterns; and support for massive data volumes.
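Two of these criteria, columnar storage and one stream per sensor, can be illustrated with a minimal sketch. This is a simplified model of the idea, not how TDengine or any real engine is implemented; the class and field names are hypothetical.

```python
class ColumnarSeries:
    """Append-only columnar storage for one sensor's stream: each field
    lives in its own array, so a query over one field never reads the
    others - the core advantage of columnar layout for analytics."""
    def __init__(self, fields):
        self.columns = {f: [] for f in fields}

    def append(self, row):
        """Writes are append-only, matching the no-update, no-delete
        pattern typical of time-series workloads."""
        for field, value in row.items():
            self.columns[field].append(value)

    def scan(self, field, predicate):
        """Scans a single contiguous array instead of whole rows."""
        return [v for v in self.columns[field] if predicate(v)]
```

Keeping one such series per sensor (rather than one giant shared table) keeps each stream's data contiguous on disk, which is exactly why several time‑series engines model the world as one table per device.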
The article concludes with a checklist covering performance gains, business value, total cost of ownership (hardware, operations, development), and emphasizes testing specific pain points to choose the optimal IoT big‑data platform.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.