JDFrame/SDFrame: A Semantic Java Stream DataFrame Library for Simplified Data Processing
This article introduces JDFrame/SDFrame, a JVM‑level DataFrame library that offers a more semantic and concise API than raw Java 8 stream operations. It shows how to add the Maven dependency, gives practical examples of filtering, grouping, sorting, joining, and pagination, and explains the difference between the mutable JDFrame and the immutable SDFrame.
The author, a senior architect, presents a Java library called JDFrame/SDFrame that mimics DataFrame concepts from Spark/Pandas to make Java 8 stream processing more expressive and less error‑prone.
Quick start: add the Maven dependency
<dependency>
    <groupId>io.github.burukeyou</groupId>
    <artifactId>jdframe</artifactId>
    <version>0.0.4</version>
</dependency>

Then create a list of Student objects (the article provides the full POJO definition) and use the library to run a query that selects students aged 9‑16, sums their scores per school, and returns the top two schools:
SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
        .whereNotNull(Student::getAge)
        .whereBetween(Student::getAge, 9, 16)
        .groupBySum(Student::getSchool, Student::getScore)
        .sortDesc(FI2::getC2)
        .cutFirst(2);
sdf2.show();

The output shows each school name with its aggregated score.
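For comparison, here is a minimal sketch of the same query written with plain Java 8 streams, which is the kind of code the library is meant to shorten. It does not use JDFrame at all. The `Student` class below is a hypothetical stand-in for the article's POJO (only three fields), the class and method names are made up for illustration, and the sketch assumes the age range is inclusive on both ends and that school and score are never null.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopSchoolsWithStreams {

    // Hypothetical minimal Student; the article's full POJO has more fields.
    static class Student {
        private final String school;
        private final Integer age;
        private final BigDecimal score;

        Student(String school, Integer age, BigDecimal score) {
            this.school = school;
            this.age = age;
            this.score = score;
        }

        String getSchool() { return school; }
        Integer getAge() { return age; }
        BigDecimal getScore() { return score; }
    }

    // Sum scores per school for students aged 9 to 16 (inclusive, by assumption),
    // then return the `limit` schools with the highest totals.
    static List<Map.Entry<String, BigDecimal>> topSchools(List<Student> students, int limit) {
        Map<String, BigDecimal> totals = students.stream()
                .filter(s -> s.getAge() != null)                   // like whereNotNull
                .filter(s -> s.getAge() >= 9 && s.getAge() <= 16)  // like whereBetween
                .collect(Collectors.groupingBy(                    // like groupBySum
                        Student::getSchool,
                        Collectors.reducing(BigDecimal.ZERO, Student::getScore, BigDecimal::add)));

        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, BigDecimal>comparingByValue().reversed()) // like sortDesc
                .limit(limit)                                                        // like cutFirst
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("North High", 12, new BigDecimal("80")),
                new Student("North High", 15, new BigDecimal("90")),
                new Student("South High", 10, new BigDecimal("95")),
                new Student("East High", 14, new BigDecimal("60")),
                new Student("East High", null, new BigDecimal("100")), // dropped: null age
                new Student("West High", 20, new BigDecimal("99")));   // dropped: out of range

        // Prints "North High 170" and then "South High 95".
        topSchools(students, 2)
                .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}
```

The stream version needs an intermediate map and a second pipeline over its entries, which is the boilerplate the chained DataFrame-style call avoids.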
The API catalogue covers matrix viewing, filtering, aggregation, deduplication, grouping, sorting, joining, pagination, frame configuration, and miscellaneous utilities such as percentage conversion, partitioning, and row‑number generation. Example snippets include:
// matrix view
void show(int n);
List<String> columns();
<R> List<R> col(Function<T, R> function);
T head();
List<T> head(int n);
T tail();
List<T> tail(int n);
List<T> page(int page, int pageSize);

// filtering
SDFrame.read(studentList)
        .whereBetween(Student::getAge, 3, 6)   // [3,6]
        .whereBetweenR(Student::getAge, 3, 6)  // (3,6]
        .whereNotNull(Student::getName)
        .whereGt(Student::getAge, 3)
        .whereIn(Student::getAge, Arrays.asList(3, 7, 8))
        .whereLike(Student::getName, "jay");

// aggregation
frame.max(Student::getAge);
frame.avg(Student::getAge);
frame.sum(Student::getAge);
frame.groupBySum(Student::getSchool, Student::getAge);
frame.groupByCount(Student::getSchool);

// deduplication
SDFrame.read(studentList).distinct().toLists();
SDFrame.read(studentList).distinct(Student::getSchool).toLists();

// sorting
SDFrame.read(studentList).sortDesc(Student::getAge);
SDFrame.read(studentList).sortAsc(Sorter.sortDescBy(Student::getAge).sortAsc(Student::getLevel));

// joining
// Fragment: the body of the function that merges a matched pair of rows
// (a from the left frame, b from the right frame) into one UserInfo.
UserInfo userInfo = new UserInfo();
userInfo.setKey1(a.getSchool());
userInfo.setKey2(b.getC2().intValue());
userInfo.setKey3(String.valueOf(a.getId()));
return userInfo;

The article also explains the key difference between the two frame types. JDFrame is stateful: each operation takes effect immediately. SDFrame is stateless and behaves like a Java stream, so a new read is required after each terminal operation.
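The comparison to Java streams can be made concrete with the JDK alone. The sketch below does not touch JDFrame or SDFrame internals. It only demonstrates the single-use behavior of `java.util.stream.Stream` that the article likens SDFrame to, and the usual workaround of reading from the source again for each query. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class SingleUseStreamDemo {

    // Returns true if running a second terminal operation on the same stream fails.
    static boolean reuseFails(List<Integer> ages) {
        Stream<Integer> stream = ages.stream().filter(a -> a > 9);
        stream.count(); // first terminal operation consumes the stream
        try {
            stream.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true; // "stream has already been operated upon or closed"
        }
    }

    // Re-reads the source for every query, analogous to calling read(...) again.
    // Returns {count, max}; max is -1 when no element passes the filter.
    static long[] countAndMax(List<Integer> ages) {
        Supplier<Stream<Integer>> read = () -> ages.stream().filter(a -> a > 9);
        long count = read.get().count();
        long max = read.get().max(Integer::compare).orElse(-1);
        return new long[] {count, max};
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(8, 12, 15);
        System.out.println(reuseFails(ages));                   // true
        System.out.println(Arrays.toString(countAndMax(ages))); // [2, 15]
    }
}
```

A stateful frame trades this re-reading for convenience: several results can be taken from one object, at the cost of holding the intermediate data after each step.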
Finally, the author provides links to the source repository, Maven Central, and a tutorial on window functions, and invites readers to discuss further extensions.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.