Databases 5 min read

Understanding IN vs EXISTS and NOT IN vs NOT EXISTS in SQL

This article explains the performance differences between IN, EXISTS, NOT IN, and NOT EXISTS in SQL queries, provides guidance on when to use each construct based on table sizes and indexes, and demonstrates the concepts with practical code examples.

Selected Java Interview Questions
Selected Java Interview Questions
Selected Java Interview Questions
Understanding IN vs EXISTS and NOT IN vs NOT EXISTS in SQL

1. IN and EXISTS

IN creates a hash join between the outer and inner tables, while EXISTS performs a loop over the outer table and queries the inner table for each iteration; the common belief that EXISTS is always faster than IN is inaccurate.

If the two tables are of comparable size, the performance difference between IN and EXISTS is minimal; if one table is much larger, use EXISTS when the subquery table is large and IN when the subquery table is small.

Example with Table A (small) and Table B (large):

select * from A where cc in (select cc from B)  -- slower, uses index on A.cc
select * from A where exists (select 1 from B where B.cc = A.cc)  -- faster, uses index on B.cc

Conversely, when Table B is the outer table:

select * from B where cc in (select cc from A)  -- faster, uses index on B.cc
select * from B where exists (select 1 from A where A.cc = B.cc)  -- slower, uses index on A.cc

2. NOT IN and NOT EXISTS

NOT IN is not logically equivalent to NOT EXISTS; misuse of NOT IN can introduce serious bugs. The following example demonstrates the issue:

create table #t1(c1 int,c2 int);
create table #t2(c1 int,c2 int);
insert into #t1 values(1,2);
insert into #t1 values(1,3);
insert into #t2 values(1,2);
insert into #t2 values(1,null);

select * from #t1 where c2 not in (select c2 from #t2);  -- returns no rows
select * from #t1 where not exists (select 1 from #t2 where #t2.c2 = #t1.c2);  -- returns rows 1 3

The NOT IN query yields an unexpected empty result set because the subquery returns a NULL value, causing the whole predicate to evaluate to UNKNOWN. NOT EXISTS, on the other hand, can still use indexes and typically performs better.

If the subquery returns any NULL, NOT IN will return no rows; in such cases, NOT EXISTS is preferable. When the subquery column is defined as NOT NULL, NOT IN can be used safely, but EXISTS generally provides better index utilization.

3. Difference between IN and =

select name from student where name in ('zhang','wang','zhao');
select name from student where name='zhang' or name='wang' or name='zhao';

Both queries produce the same result set.

performanceSQLDatabaseEXISTSIN
Selected Java Interview Questions
Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.