Wrong answer from merge table after content changes #6374

monetdb-team · 2020-11-30T15:51:11Z

Date: 2017-07-17 14:07:00 +0200
From: Richard Hughes <<richard.monetdb>>
To: SQL devs <>
Version: 11.25.23 (Dec2016-SP5)
CC: @njnes, shawpolakcrax12

Last updated: 2019-06-07 09:47:18 +0200

Comment 25492

Date: 2017-07-17 14:07:00 +0200
From: Richard Hughes <<richard.monetdb>>

Build: default branch 8125f999c846 (2017-7-14)
Reproducible: always

sql>create table sub1 (i int);
operation successful (27.770ms)
sql>create table sub2 (i int);
operation successful (13.348ms)
sql>create merge table mt as select * from sub1 with no data;
operation successful (12.755ms)
sql>alter table mt add table sub1;
operation successful (5.060ms)
sql>alter table mt add table sub2;
operation successful (5.471ms)
sql>insert into sub1 values (1),(2);
2 affected rows (6.190ms)
sql>insert into sub2 values (11),(12);
2 affected rows (6.607ms)
sql>alter table sub1 set read only;
operation successful (6.666ms)
sql>alter table sub2 set read only;
operation successful (6.624ms)
sql>analyze sys.sub1;
sql>analyze sys.sub2;
sql>plan select count() from mt where i >= 5 and i < 100;
+------------------------------------------------------------------+
| rel |
+==================================================================+
| project ( |
| | group by ( |
| | | select ( |
| | | | table(sys.sub2) [ "mt"."i", "mt"."%TID%" NOT NULL ] COUNT |
| | | ) [ int "5" <= "mt"."i" < int "100" ] |
| | ) [ ] [ sys.count() NOT NULL as "L3"."L3" ] |
| ) [ "L3" NOT NULL as "L4"."L3" ] |
+------------------------------------------------------------------+
7 tuples (1.274ms)
sql>select count() from mt where i >= 5 and i < 100;
+------+
| L3 |
+======+
| 2 |
+------+
1 tuple (3.162ms)
sql>alter table sub1 set read write;
operation successful (13.473ms)
sql>insert into sub1 values (7),(8);
2 affected rows (6.402ms)
sql>alter table sub1 set read only;
operation successful (6.637ms)
sql>analyze sys.sub1;
sql>plan select count() from mt where i >= 5 and i < 100;
+------------------------------------------------------------------+
| rel |
+==================================================================+
| project ( |
| | group by ( |
| | | select ( |
| | | | table(sys.sub2) [ "mt"."i", "mt"."%TID%" NOT NULL ] COUNT |
| | | ) [ int "5" <= "mt"."i" < int "100" ] |
| | ) [ ] [ sys.count() NOT NULL as "L3"."L3" ] |
| ) [ "L3" NOT NULL as "L4"."L3" ] |
+------------------------------------------------------------------+
7 tuples (1.264ms)
sql>select count() from mt where i >= 5 and i < 100;
+------+
| L3 |
+======+
| 2 |
+------+
1 tuple (4.480ms)
sql>select T.name,S.* from sys.statistics S join sys.columns C on S.column_id=C.id join sys.tables T on T.id=C.table_id where C.name='i' and T.name like 'sub_';
+------+-----------+------+-------+----------------------------+--------+-------+--------+------+--------+--------+--------+-----------+
| name | column_id | type | width | stamp | sample | count | unique | nils | minval | maxval | sorted | revsorted |
+======+===========+======+=======+============================+========+=======+========+======+========+========+========+===========+
| sub1 | 8265 | int | 4 | 2017-07-17 11:15:28.000000 | 4 | 4 | 4 | 0 | 1 | 8 | true | false |
| sub2 | 8268 | int | 4 | 2017-07-17 11:15:28.000000 | 2 | 2 | 2 | 0 | 11 | 12 | true | false |
+------+-----------+------+-------+----------------------------+--------+-------+--------+------+--------+--------+--------+-----------+
2 tuples (66.370ms)

Notice that the final plan is wrong, resulting in the wrong count(*) immediately below. The plan should include 'sub1' and the count should be 4. The contents of sys.statistics is correct.

What's happening is that there's an in-memory cache of the statistics in sql_column::min/max. The following patch, which simply bodges out all usage of those values, confirms this by making the above test work:

diff -r 8125f999c846 sql/storage/store.c
--- a/sql/storage/store.c Fri Jul 14 16:22:22 2017 +0200
+++ b/sql/storage/store.c Mon Jul 17 12:28:45 2017 +0100
@@ -4724,11 +4724,6 @@
sql_schema *sys = find_sql_schema(tr, "sys");
sql_table *stats = find_sql_table(sys, "statistics");

```
          if (col->min && col->max) {
```
```
                  *min = col->min;
```
```
                  *max = col->max;
```
```
                  return 1;
```

          }
          if (stats) {
                  sql_column *stats_column_id = find_sql_column(stats, "column_id");
                  oid rid = table_funcs.column_find_row(tr, stats_column_id, &col->base.id, NULL);

I'm not sure what the right fix is, however I've looked at all the uses of sql_trans_ranges() and concluded that the cache isn't used enough to be important for performance for us. Removing it entirely, therefore, might not necessarily be wrong.

Comment 25522

Date: 2017-07-28 20:47:43 +0200
From: @njnes

added cleanup code in the sql_analyze call.

Comment 25524

Date: 2017-07-28 22:45:49 +0200
From: MonetDB Mercurial Repository <>

Changeset 6e732bac4683 made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=6e732bac4683

Changeset description:

fixed bug #6374, ie clear cached statistics on analyze call

Comment 25530

Date: 2017-07-31 11:28:10 +0200
From: Richard Hughes <<richard.monetdb>>

This fix doesn't work, I'm afraid.

Running the original test across two connections, one doing all the writes and one read-only, still reproduces the problem:

=====connection 1======
create table sub1 (i int);
create table sub2 (i int);
create merge table mt as select * from sub1 with no data;
alter table mt add table sub1;
alter table mt add table sub2;
insert into sub1 values (1),(2);
insert into sub2 values (11),(12);
alter table sub1 set read only;
alter table sub2 set read only;
analyze sys.sub1;
analyze sys.sub2;

=====connection 1======
alter table sub1 set read write;
insert into sub1 values (7),(8);
alter table sub1 set read only;
analyze sys.sub1;

It looks to me like sql_analyze() is wiping out only its own copy of sql_column, not the one(s) belonging to other connections.

Comment 25650

Date: 2017-09-20 08:55:50 +0200
From: @njnes

the reset_column now also clears the cache on column changes.

The text was updated successfully, but these errors were encountered:

monetdb-team added bug Something isn't working normal SQL labels Nov 30, 2020

monetdb-team closed this as completed Nov 30, 2020

sjoerdmullender added this to the Ancient Release milestone Feb 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wrong answer from merge table after content changes #6374

Wrong answer from merge table after content changes #6374

monetdb-team commented Nov 30, 2020

Wrong answer from merge table after content changes #6374

Wrong answer from merge table after content changes #6374

Comments

monetdb-team commented Nov 30, 2020

Comment 25492

Comment 25522

Comment 25524

Comment 25530

Comment 25650