Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-column IN clause for subquery produces wrong results #6651

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

Multi-column IN clause for subquery produces wrong results #6651

monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels

Comments

@monetdb-team
Copy link

Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>
To: SQL devs <>
Version: 11.31.7 (Aug2018)
CC: joedsilva, martin.van.dinther, @PedroTadim

Last updated: 2019-01-14 17:29:12 +0100

Comment 26638

Date: 2018-10-11 20:20:16 +0200
From: Joseph Dsilva <>

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36
Build Identifier:

A SQL Query with multiple columns in the IN clause, such as
... WHERE (c1, c2) IN ( SELECT c1, c2 FROM ... )

produces incorrect results (see reproducibility for sample test case)

Reproducible: Always

Steps to Reproduce:

  1. Create a test table
    CREATE TABLE T1
    (
    C1 INTEGER,
    C2 INTEGER,
    C3 INTEGER
    );

  2. Insert some records
    INSERT INTO T1
    VALUES (1, 2, 3)
    ,(1, 2, 4)
    ,(2, 2, 5)
    ,(1, 3, 6)
    ;

  3. Let us see which c1, c2 combination has more than one entry (results are correct)
    SELECT C1, C2, COUNT()
    FROM T1
    GROUP BY C1, C2
    HAVING COUNT(
    ) > 1
    ;
    +------+------+------+
    | c1 | c2 | L4 |
    +======+======+======+
    | 1 | 2 | 2 |
    +------+------+------+
    1 tuple

  4. Let us find all records from T1 such that C1, C2 has multiple entries for a given value combination. (correct result)
    SELECT T1.C1, T1.C2, T1.C3
    FROM T1,
    (
    SELECT C1, C2
    FROM T1
    GROUP BY C1, C2
    HAVING COUNT(*) > 1
    ) X
    WHERE T1.C1 = X.C1 AND T1.C2 = X.C2
    ;
    +------+------+------+
    | c1 | c2 | c3 |
    +======+======+======+
    | 1 | 2 | 3 |
    | 1 | 2 | 4 |
    +------+------+------+

  5. Let us write the same logic in (4) as a subquery. (results incorrect, the last row should not be there).
    SELECT C1, C2, C3
    FROM T1
    WHERE (C1, C2) IN
    (
    SELECT C1, C2
    FROM T1
    GROUP BY C1, C2
    HAVING COUNT(*) > 1
    )
    ;
    +------+------+------+
    | c1 | c2 | c3 |
    +======+======+======+
    | 1 | 2 | 3 |
    | 1 | 2 | 4 |
    | 1 | 3 | 6 |
    +------+------+------+
    3 tuples

Actual Results:

Please see Reproducibility.

Expected Results:

Please see Reproducibility.

Please see Reproducibility.

$ ./mserver5 --version
MonetDB 5 server v11.31.7 "Aug2018" (64-bit, 128-bit integers)
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 11.7GiB available memory, 8 available cpu cores
Libraries:
libpcre: 8.38 2015-11-23 (compiled with 8.38)
openssl: OpenSSL 1.0.2g 1 Mar 2016 (compiled with OpenSSL 1.0.2g 1 Mar 2016)
libxml2: 2.9.3 (compiled with 2.9.3)
Compiled by: jdsilv2@cerberus (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64 -Wl,-Bsymbolic-functions

Comment 26640

Date: 2018-10-12 11:24:55 +0200
From: MonetDB Mercurial Repository <>

Changeset 7352568e47e0 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=7352568e47e0

Changeset description:

Added test for bug #6651.

Comment 26678

Date: 2018-11-14 16:19:42 +0100
From: MonetDB Mercurial Repository <>

Changeset e8917e90a56c made by Martin van Dinther martin.van.dinther@monetdbsolutions.com in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=e8917e90a56c

Changeset description:

Found another multi-column IN query variant which produces wrong output:
SELECT C1, C2, C3
  FROM T1
 WHERE (C2, C1) IN
       (
	 SELECT C2, C1
	   FROM T1
	  GROUP BY C1, C2
	 HAVING COUNT(*) > 1
       );
Added it to the test script for bug #6651.

Comment 26679

Date: 2018-11-14 16:22:27 +0100
From: Martin van Dinther <<martin.van.dinther>>

Update importance to High-critical

Comment 26681

Date: 2018-11-15 19:11:36 +0100
From: Martin van Dinther <<martin.van.dinther>>

Some more queries to analyse the problem and find some workaround (see query 8):

-- 7. Same query as 5 but using qualified column names in subquery.
-- (produces incorrect results, same as 5)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T1.C1, T1.C2
FROM T1
GROUP BY T1.C1, T1.C2
HAVING COUNT(*) > 1
);

-- 8. Same query as 5 but using alias for table and qualified column names
-- in subquery. (produces correct result, so can be used as a workaround)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) IN
(
SELECT T.C1, T.C2
FROM T1 AS T
GROUP BY T.C1, T.C2
HAVING COUNT(*) > 1
);

-- 9. Query using NOT IN instead of IN (and change COUNT() = 1)
-- (produces correct result in this use case data)
SELECT C1, C2, C3
FROM T1
WHERE (C1, C2) NOT IN
(
SELECT C1, C2
FROM T1
GROUP BY C1, C2
HAVING COUNT(
) = 1
);

From queries 7 and 8 we can see it has to do with a name scoping problem.

Query 8 shows how you can workaround it in the meantime (till there is a software patch released).

Comment 26690

Date: 2018-11-20 10:44:46 +0100
From: MonetDB Mercurial Repository <>

Changeset e9be2776cf1d made by Pedro Ferreira pedro.ferreira@monetdbsolutions.com in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=e9be2776cf1d

Changeset description:

Fix for bug #6651. Label expressions for in's right relation, so the rel_optimizer doesn't get confused.

Also added defensive line for uninitialized variable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants