Tagged articles
6 articles
Tencent Cloud Developer
Jul 2, 2024 · Big Data

Apache Flink Deployment with Pulsar Connector: Setup, Demos, and Best Practices

This guide shows how to deploy Apache Flink 1.17 in Docker, configure off‑heap memory, and connect it to Pulsar via the 4.1.0‑1.17 connector. It runs example jobs that copy topics and perform a windowed word count, and it provides Maven dependencies, custom serialization tips, batching settings, and version‑specific best‑practice notes.

Apache Flink · DataStream · Docker deployment
0 likes · 20 min read
ITPUB
Dec 14, 2023 · Big Data

How to Build a Python‑Hadoop Word Count on a Single‑Node Cluster

This step‑by‑step guide shows how to install and configure a single‑node Hadoop 3.2.0 environment on CentOS 7, set up Python 3.7, write MapReduce mapper and reducer scripts in Python, and run a word‑count job using Hadoop streaming, illustrating core Hadoop concepts and their relevance today.
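
As a rough illustration of the mapper and reducer scripts this summary mentions, the sketch below follows the usual Hadoop streaming convention of tab-separated key/value lines. It is not the article's code: the names `map_lines`, `reduce_sorted`, and `word_count` are hypothetical, and in a real streaming job the mapper and reducer would each live in their own script and read from `sys.stdin`.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper (hypothetical helper): emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_sorted(records):
    """Reducer (hypothetical helper): sum the counts for each word.

    The input must already be sorted by key, which is what the framework
    provides between the map and reduce steps.
    """
    pairs = (record.split("\t", 1) for record in records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def word_count(lines):
    # sorted() stands in for the sort the framework performs between phases.
    return list(reduce_sorted(sorted(map_lines(lines))))
```

Calling `word_count` on a list of text lines mimics the pipeline `mapper | sort | reducer`, which is also a common way to test streaming scripts locally before submitting them to a cluster.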

Hadoop · MapReduce · Python
0 likes · 21 min read
Big Data Technology & Architecture
Apr 2, 2019 · Big Data

Understanding Hadoop MapReduce: Programming Model, WordCount Example, and Job Execution Mechanism

The article explains Hadoop's MapReduce framework as both a programming model and execution engine, detailing its map and reduce phases, the WordCount example code, job startup components, data shuffling, partitioning, and how large‑scale distributed computations are orchestrated across a cluster.
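
The phases named in this summary (map, partition, shuffle, reduce) can be sketched as a small in-memory simulation. This is an assumption-laden toy, not Hadoop's API or the article's WordCount code: all function names are hypothetical, and the character-sum partitioner is used only because it is deterministic, not because Hadoop partitions keys this way.

```python
from collections import defaultdict


def map_phase(document):
    """Map: turn one input record into (word, 1) pairs."""
    return [(word.lower(), 1) for word in document.split()]


def partition(key, num_reducers):
    """Pick a reducer for a key (illustrative only, not Hadoop's partitioner)."""
    return sum(ord(ch) for ch in key) % num_reducers


def shuffle(mapped_pairs, num_reducers):
    """Shuffle: group values by key inside each reducer's partition."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in mapped_pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return partitions


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(documents, num_reducers=2):
    """Run all phases in sequence and merge the reducers' outputs."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    result = {}
    for grouped in shuffle(mapped, num_reducers):
        result.update(reduce_phase(grouped))
    return result
```

Because every occurrence of a key is routed to the same partition, each reducer can total its keys independently, which is the property that lets the real framework spread the reduce work across a cluster.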

Big Data · Distributed computing · Hadoop
0 likes · 10 min read