Databases 11 min read

Performance Evaluation of Multi‑Table Joins in MySQL and Oracle with Large Data Sets

This article investigates the practical limits of joining more than three tables in MySQL by designing experiments with up to 1.5 billion rows, comparing indexed and non‑indexed queries, and contrasting the results with Oracle's performance, while providing full SQL scripts for data generation and analysis.

Selected Java Interview Questions
Selected Java Interview Questions
Selected Java Interview Questions
Performance Evaluation of Multi‑Table Joins in MySQL and Oracle with Large Data Sets

Problem Statement – The Alibaba Java Development Manual advises against joining more than three tables in MySQL; this article questions the rule by designing experiments to test its validity.

Experiment Analysis – The author creates a four‑table schema (student, teacher, course, sc) and formulates a query to find the highest‑scoring student for a specific teacher, then decomposes the query into three simpler statements for clarity.

Test Environment – VMware 10, CentOS 7.4, MySQL 5.7.22 with 4.5 GB RAM, 4 CPU cores, 50 GB SSD. Two data scales are used: 10 million enrollment records and 100 million enrollment records, with corresponding numbers of students, teachers, and courses.

SQL Queries

select Student.Sname, course.cname, score
from Student, SC, Course, Teacher
where Student.s_id=SC.s_id and SC.c_id=Course.c_id and sc.t_id=teacher.t_id
  and Teacher.Tname='tname553'
  and SC.score=(select max(score) from SC where sc.t_id=teacher.t_Id);

The query is split into three parts, each shown in the source.

Data Generation Scripts (MySQL)

use stu;
drop table if exists student;
create table student (
  s_id int(11) not null auto_increment,
  sno int(11),
  sname varchar(50),
  sage int(11),
  ssex varchar(8),
  father_id int(11),
  mather_id int(11),
  note varchar(500),
  primary key (s_id),
  unique key uk_sno (sno)
) engine=innodb default charset=utf8mb4;

/* function to insert 50 million student rows */
create function insert_student_data() returns int deterministic
begin
  declare i int;
  set i = 1;
  while i < 50000000 do
    insert into student values (i, i, concat('name', i), i,
      case when floor(rand()*10)%2 = 0 then 'f' else 'm' end,
      floor(rand()*100000), floor(rand()*1000000), concat('note', i));
    set i = i + 1;
  end while;
  return 1;
end;

Similar scripts are provided for course , sc , and teacher tables, as well as Oracle equivalents using PL/SQL procedures.

Results

The tests reveal that:

Without indexes on join keys, queries become extremely slow, confirming the need for proper indexing.

For data volumes above 100 million rows, simple queries remain marginally acceptable, but multi‑table joins start to strain MySQL.

MySQL fails to complete a four‑table join on 150 million rows, while Oracle finishes the same query in about 26 seconds, demonstrating Oracle's superior join handling.

Conclusion – The “no join over three tables” rule holds for large‑scale MySQL workloads because of performance degradation, though it is not a hard technical limit; with sufficient resources and indexing, MySQL can handle larger joins, but developers should consider moving complex logic to the application layer.

SQLData GenerationMySQLDatabase OptimizationOracleJoin Performance
Selected Java Interview Questions
Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.