Why IN and NOT IN Can Be Problematic in SQL and How to Replace Them
The article explains that using IN and NOT IN in SQL queries often leads to poor performance and incorrect results, especially with large tables or NULL values, and demonstrates safer alternatives such as EXISTS, NOT EXISTS, and JOIN with clear code examples.
WHY?
IN and NOT IN are frequently used keywords, but they should be avoided because they can cause serious performance degradation and produce wrong query results.
1. Low efficiency
In a project two tables (t1 and t2) each contain about 1.5 million rows (≈600 MB). The following query using NOT IN runs for many minutes because the predicate cannot use the index on phone :
select * from t1 where phone not in (select phone from t2)Rewriting the same logic with NOT EXISTS reduces the execution time to about 20 seconds:
select * from t1 where not EXISTS (select phone from t2 where t1.phone = t2.phone)2. Easy to produce wrong results
Consider two tables test1(id1 int) and test2(id2 int) . The intended query to find ids in test1 that also exist in test2 is:
select id1 from test1 where id1 in (select id2 from test2)If the column name in the sub‑query is mistakenly written as id1 instead of id2 , the query still runs and returns all rows (1, 2, 3) without any error, which can be misleading.
Another pitfall appears with NOT IN when the sub‑query contains a NULL value. Adding a NULL to test2 and running:
select id1 from test1 where id1 not in (select id2 from test2)returns an empty result set because the comparison with NULL yields UNKNOWN, so rows that should be returned (e.g., id 3) are filtered out.
Tip: avoid allowing NULLs in columns that are used in set‑based predicates.
HOW?
1. Use EXISTS or NOT EXISTS instead of IN/NOT IN
select * from test1 where EXISTS (select * from test2 where id2 = id1) select * from test1 where NOT EXISTS (select * from test2 where id2 = id1)2. Use JOINs instead of IN/NOT IN
-- Inner join to get matching rows
select id1 from test1 INNER JOIN test2 ON id2 = id1; -- Left join with a NULL check to get non‑matching rows
select id1 from test1 LEFT JOIN test2 ON id2 = id1 where id2 IS NULL;These alternatives are reliable and usually allow the optimizer to use indexes effectively.
Note: IN can still be used safely when the set is a small, constant list, such as IN (0, 1, 2) .
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.