Practical Guide to Querying HBase with Python happybase and JPype

This tutorial walks through installing the Python happybase library and JPype for Java integration, then shows end-to-end code for connecting to an HBase Thrift server, generating row keys with Java utilities, querying data, and handling type conversions.

Big Data Technology Architecture

Nov 11, 2019

1. Introduction

Python can interact with HBase through the Thrift protocol, which requires the HBase ThriftServer to be running. The happybase library provides a Pythonic wrapper around Thrift, while jpype enables calling Java code (e.g., custom row‑key generators) from Python.
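Because everything below depends on a running ThriftServer, a quick reachability check can save debugging time. The following is a minimal sketch using only the standard library; the function name thrift_port_open is hypothetical, and 9090 is assumed as the port because it is the one used later in this guide. It only confirms that something accepts TCP connections on that port, not that the service is actually HBase Thrift.

```python
import socket


def thrift_port_open(host, port=9090, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling thrift_port_open('thriftserver_ip') before creating the happybase connection gives a clear yes/no answer instead of a Thrift transport error later on.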

2. Environment Preparation

2.1 Install happybase

First verify whether happybase is installed. Install it via pip, which also pulls the thriftpy2 dependency.

# pip install happybase

For offline environments, download the thriftpy2 and happybase source packages and install them manually:

# pip install thriftpy2-0.4.8.tar.gz
# pip install happybase-1.2.0.tar.gz
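To verify whether the packages are present without triggering an import error, one option is to ask the import system directly. This is a small sketch; the helper name is_installed is hypothetical, and it only checks that the module can be located, not that its version matches.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be located."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the two packages this guide relies on.
for name in ("happybase", "thriftpy2"):
    status = "installed" if is_installed(name) else "missing"
    print("{}: {}".format(name, status))
```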

2.2 Install jpype

JPype depends on numpy, so install numpy first, then install the JPype wheel or source package.

# pip install numpy
# pip install JPype1-0.7.0.tar.gz

On Windows, choose the .whl file that matches both the Python version and its architecture (32-bit or 64-bit), and make sure the wheel package is installed.
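If it is unclear which build of Python is running, the interpreter can report it. The sketch below uses only the standard library; the function names are hypothetical, and the cpXY tag format is the one used for CPython wheels.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


def cpython_wheel_tag():
    """Return the CPython version tag used in wheel file names, e.g. 'cp37'."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


print("{}-bit Python, wheel tag {}".format(python_bitness(), cpython_wheel_tag()))
```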

3. Practice

3.1 Using happybase to query data

Create a connection to the Thrift server (replace 'thriftserver_ip' with the actual address):

import happybase
connection = happybase.Connection('thriftserver_ip', 9090, table_prefix=b'ns1', table_prefix_separator=b':')
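The table_prefix and table_prefix_separator arguments are what let a short table name address a table inside the ns1 namespace. As far as I understand happybase's behavior, it joins prefix, separator, and table name into the full name sent to HBase. The function below is a hypothetical stand-in that mimics that joining for illustration; it is not part of happybase.

```python
def prefixed_table_name(name, prefix=b"ns1", separator=b":"):
    """Mimic how a table prefix and separator are joined to a table name."""
    if isinstance(name, str):
        name = name.encode("utf-8")
    return prefix + separator + name


print(prefixed_table_name("tablename"))  # b'ns1:tablename'
```

With the connection settings above, connection.table('tablename') should therefore address ns1:tablename. Note that happybase's default separator is an underscore, which is why the colon is passed explicitly here.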

Obtain a table object and fetch a row by its key:

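table = connection.table('tablename')
# Example key for illustration; section 3.2 shows how to generate a real one
row_key = b'\x01\x91!\x02\x00\x00\x00\x04007720181210'
row = table.row(row_key)  # returns an empty dict if the row does not exist
if row:
    print(row[b'f:column1'])                    # raw bytes
    print(row.get(b'f:column1', b'').decode())  # decoded to str (UTF-8 by default)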

Close the connection when finished:

connection.close()

happybase also supports scan, put, delete, and batch operations; refer to the official documentation for details.
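The other operations mentioned here (put, batch, scan, delete) follow the same pattern as row(). The sketch below assumes a table object obtained from connection.table(...); the function name, the demo- key prefix, and the sample values are made up for illustration, and the f:column1 column is borrowed from the example above.

```python
def demo_write_scan_delete(table, prefix=b"demo-"):
    """Exercise put, batch, scan, and delete on a happybase-style table."""
    # Single write: column names are b'family:qualifier', values are bytes.
    table.put(prefix + b"001", {b"f:column1": b"alpha"})

    # Batched writes are buffered and sent when the with-block exits.
    with table.batch() as batch:
        batch.put(prefix + b"002", {b"f:column1": b"beta"})
        batch.put(prefix + b"003", {b"f:column1": b"gamma"})

    # Scan yields (row_key, data) tuples for keys that share the prefix.
    rows = list(table.scan(row_prefix=prefix, limit=10))

    # Remove one row entirely.
    table.delete(prefix + b"001")
    return rows
```

Since this writes and deletes rows, it is best tried against a scratch table rather than production data.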

3.2 Invoking Java classes for row‑key generation

Prepare the required JAR files and start the JVM:

import os
import jpype

jars = ["/app/lib/custom-1.2.0.jar", "/app/lib/commons-codec-1.9.jar"]
# os.pathsep is ':' on Linux/macOS and ';' on Windows
jvm_classpath = "-Djava.class.path={}".format(os.pathsep.join(jars))
if not jpype.isJVMStarted():
    jpype.startJVM(jpype.getDefaultJVMPath(), "-ea", jvm_classpath)
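A mistyped JAR path usually shows up only later, as a class-not-found error when a Java class is first loaded. One way to fail earlier is to validate the paths while building the classpath option. This is a sketch under that assumption; build_classpath_option and its require_existing flag are hypothetical names, not part of JPype.

```python
import os


def build_classpath_option(jar_paths, require_existing=True):
    """Build a -Djava.class.path option, optionally checking each JAR exists."""
    if require_existing:
        missing = [p for p in jar_paths if not os.path.isfile(p)]
        if missing:
            raise FileNotFoundError("Missing JAR files: " + ", ".join(missing))
    # os.pathsep is the platform's classpath separator (':' or ';').
    return "-Djava.class.path=" + os.pathsep.join(jar_paths)
```

The returned string can be passed to jpype.startJVM in place of jvm_classpath.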

Import the Java utility classes and generate the row key:

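from jpype import JClass

# Utility classes loaded from the JARs on the classpath
MD5Util = JClass("com.example.MD5Util")
BytesUtil = JClass("com.example.BytesUtil")
Hex = JClass("org.apache.commons.codec.binary.Hex")
# pk_id is the input string from which the row key is derived
rowkey_bs = MD5Util.getHashBytes(BytesUtil.toBytes(pk_id))
# str() ensures a Python str even if JPype returns a java.lang.String wrapper
row_key = bytes.fromhex(str(Hex.encodeHexString(rowkey_bs)))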

This converts the Java byte[] to a Python bytes object, completing the string → byte[] → bytes transformation required for HBase queries.
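The hex round trip is needed because Java bytes are signed (-128 to 127), while Python bytes hold unsigned values (0 to 255). The pure-Python sketch below shows the same conversion done by masking each value; the function name is hypothetical, and the input list stands in for the values of a Java byte[]. Whether a JPype byte array can be iterated this way depends on the JPype version, so treat that part as an assumption.

```python
def java_bytes_to_python(signed_values):
    """Convert Java-style signed bytes (-128..127) to a Python bytes object."""
    # Masking with 0xFF maps negative values to their unsigned equivalents.
    return bytes(value & 0xFF for value in signed_values)


signed = [1, -111, 33, 2, 0]           # as Java would report the byte values
converted = java_bytes_to_python(signed)
print(converted)                        # b'\x01\x91!\x02\x00'

# The hex route used above yields the same result.
print(converted == bytes.fromhex("0191210200"))  # True
```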

4. Conclusion

The guide demonstrates how to install and configure happybase and jpype, start the JVM, invoke Java utilities for row‑key creation, and perform basic HBase operations from Python. The steps cover both online and offline installation scenarios and highlight important type‑conversion details.
