
Hopsworks Feature Store: Transparent Dual‑Storage System for Online and Offline Machine Learning Features

This article explains how Hopsworks' feature store unifies low-latency online and high-throughput offline storage in a dual-system architecture built on RonDB. It covers the API, metadata handling, the ingestion pipeline, and benchmarks, and shows how the store simplifies feature access for production machine learning.


Hopsworks Feature Store provides a unified feature repository that abstracts the complexity of dual online and offline storage systems, delivering low‑latency access to fresh feature values for real‑time ML applications while supporting high‑throughput batch processing.

1. Production Machine‑Learning Models

Online inference needs the latest feature values for a given primary key, while training needs high-throughput scans over historical data; no single conventional database serves both workloads efficiently. Teams therefore maintain separate offline data lakes for training and online micro-services for feature engineering, which creates barriers to rapid iteration and feature reuse.

Data‑science view: tight coupling of data and infrastructure hinders production transition and feature reuse.

ML‑engineer view: extensive engineering effort is needed to guarantee consistent access to production data.

2. Hopsworks Feature Store: Transparent Dual‑Storage System

The store combines a high‑bandwidth, low‑cost offline layer (Apache Hudi tables on HopsFS, S3, Azure Blob, external tables) with a low‑latency online key‑value store that holds the latest value per feature. It satisfies four key requirements: consistent features for training and serving, feature freshness, sub‑millisecond latency, and intuitive API accessibility.

HSFS API offers a DataFrame‑style interface for both online and offline reads/writes (Spark, Structured Streaming, Pandas).

Rich metadata enables discovery, management, and reuse of features.

Scalable stateless engine writes to the online store without write amplification.

RonDB provides the underlying fast SQL‑capable key‑value database.
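The contract behind these requirements can be made concrete with a small mock: the online store keeps only the latest row per primary key (freshness, low-latency lookups), while the offline store appends every version (consistent training data, time travel). This is an illustrative sketch in plain Python, not the Hopsworks implementation.

```python
# Illustrative mock of the dual-storage contract (not the actual
# Hopsworks implementation): the online store keeps only the latest
# row per primary key, the offline store keeps the full history.
class DualStore:
    def __init__(self):
        self.online = {}   # primary key -> latest feature row
        self.offline = []  # append-only history (Hudi-like)

    def insert(self, rows):
        for row in rows:
            key = row["store"]        # primary-key column
            self.online[key] = row    # upsert: latest value wins
            self.offline.append(row)  # every version retained

store = DualStore()
store.insert([{"store": 1, "sales": 10}])
store.insert([{"store": 1, "sales": 42}])

assert store.online[1]["sales"] == 42  # freshest value for serving
assert len(store.offline) == 2         # both versions kept for training
```

Because both layers are fed by the same insert, a model sees the same feature definitions at training time and at serving time.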

3. RonDB: Foundation for Online Store, Filesystem and Metadata

RonDB (a distribution of MySQL NDB Cluster) powers the online store and holds all feature-store metadata, including schemas, statistics, and file-system metadata. It is accessed via JDBC and supports high-performance, encrypted, authenticated connections.

4. OnlineFS: Scalable Online Feature Materialization Engine

OnlineFS is a stateless service built on ClusterJ (a high‑performance JNI layer over the native NDB API). It ingests feature DataFrames via Kafka, decodes Avro‑encoded rows, and performs batched upserts into RonDB, scaling linearly with the number of service instances.

Data ingestion steps:

Write features from Pandas or Spark DataFrames to the feature store.

Encode rows with Avro and publish to a dedicated Kafka topic.

Consume and decode messages from Kafka.

Upsert rows into RonDB using primary‑key based batches.
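The four steps above can be sketched end to end in a few lines. This is a simplified stand-in, not OnlineFS itself: JSON stands in for Avro, a plain list for the Kafka topic, and a dict for RonDB; the de-duplication step mirrors the latest-value-per-primary-key semantics of the batched upsert.

```python
import json

# Mock of the OnlineFS ingestion path: json stands in for Avro,
# a list for the Kafka topic, and a dict for RonDB.
rows = [{"store": 1, "sales": 10},
        {"store": 2, "sales": 7},
        {"store": 1, "sales": 12}]

# Steps 1-2: encode each row and publish to the feature group's topic.
topic = [json.dumps(r) for r in rows]

# Step 3: consume and decode the messages.
decoded = [json.loads(m) for m in topic]

# Step 4: group into a primary-key batch and upsert; within a batch
# the last message per key wins (latest-value semantics).
batch = {}
for row in decoded:
    batch[row["store"]] = row  # de-duplicate on primary key
rondb = {}
rondb.update(batch)            # one batched upsert

assert rondb[1]["sales"] == 12 and rondb[2]["sales"] == 7
```

Batching by primary key is what lets the service scale linearly: each stateless instance consumes a partition, folds messages into key-deduplicated batches, and issues one upsert per batch instead of one write per row.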

Code example for creating a feature group and inserting data (the DataFrame variables are placeholders):

# create the feature group metadata object
store_fg = fs.create_feature_group(
    name="store_fg",
    version=1,
    primary_key=["store"],
    description="Store related features",
    online_enabled=True,
)
# insert a batch DataFrame (writes to both the online and offline stores)
store_fg.insert(df)
# or ingest from a streaming DataFrame
store_fg.insert_stream(streaming_df)

5. Accessibility: Transparent API

HSFS abstracts the dual storage so that the same .insert() call writes to both online and offline layers, and the same DataFrame API can be used for reads, enabling seamless integration with existing ETL pipelines or streaming jobs.
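The read path can be pictured the same way: one handle, one read call, routed to the online or offline layer by a single flag. The class and method names below are hypothetical stand-ins used to illustrate the routing, not the actual HSFS signatures.

```python
# Hypothetical sketch of API transparency: the same handle serves both
# a low-latency point lookup (online) and a batch scan (offline).
class FeatureGroup:
    def __init__(self):
        self._online = {"s1": {"store": "s1", "sales": 42}}  # latest rows
        self._offline = [{"store": "s1", "sales": 10},
                         {"store": "s1", "sales": 42}]       # full history

    def read(self, key=None, online=False):
        if online:
            return self._online[key]  # point lookup for inference
        return list(self._offline)    # batch scan for training

fg = FeatureGroup()
serving_vector = fg.read(key="s1", online=True)  # real-time inference
training_rows = fg.read(online=False)            # training dataset
```

The caller never addresses a specific storage system; the flag (or the execution context, for Spark versus serving clients) selects the layer.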

6. Benchmarks

Benchmarks on a 3‑node AWS cluster (m5.2xlarge) show OnlineFS sustaining ~126 K rows/sec for 11‑feature vectors and ~60 K rows/sec for 51‑feature vectors. Write latency p99 is ~250 ms for 1 KB rows. Service lookup throughput scales linearly up to 16 clients with sub‑10 ms latency; batch lookups of 100 vectors increase throughput 15‑fold with modest latency growth.
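A back-of-envelope check of these numbers is instructive: although the row rate drops for wider rows, the count of individual feature values written per second actually rises, and the 15x batching gain amortizes each round trip over many lookups. The arithmetic below only restates the benchmark figures quoted above.

```python
# Derived from the quoted benchmark figures (not new measurements).
rows_small, feats_small = 126_000, 11  # ~126 K rows/sec, 11 features
rows_large, feats_large = 60_000, 51   # ~60 K rows/sec, 51 features

values_small = rows_small * feats_small  # 1,386,000 feature values/sec
values_large = rows_large * feats_large  # 3,060,000 feature values/sec
assert values_large > values_small       # wider rows move more values

# Batched lookups: a batch of 100 vectors at ~15x the single-lookup
# throughput amortizes each round trip over 100/15 ~ 6.7 lookups.
amortized = 100 / 15
assert 6 < amortized < 7
```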

7. Conclusion

Hopsworks, together with a hosted RonDB cluster, delivers a high‑performance, highly available online feature store that achieves >250 k ops/sec and sub‑10 ms p99 latency for 1 KB feature vectors, positioning it as one of the fastest solutions on the market.


Written by Big Data Technology Architecture (Exploring Open Source Big Data and AI Technologies).