You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36
Build Identifier:
Perhaps I missed something in the recent MonetDB developments, but after noticing unexpected MAL plan explosions (see BUG #3294), I came to this simple query:
explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';
To further clarify, this is an explain from an older MonetDB version (Dec2011), where you can clearly see that the result of the same select is appended twice to the final result.
In the one I posted in the original report, it seems to me that the same select is computed twice.
A critical test for common term elimination is that instructions should be side-effect free. Appends are not.
Also the equality of the arguments A0, and A1 can not be determined statically.
It would have to be detected at the SQL level leading to a template with just one argument.
A difference that I notice with the Dec2011 plan that I posted is indeed that the two strings are now parameters of the function.
This justifies your explanation about arguments A0 and A1.
Understanding why this is happening doesn't really satisfy me though. The behaviour of the older version was far more correct, I believe. Notice that in real-life queries I get very long plan portions repeated over and over again, just because of this.
I recently had the problem with Feb2013 (or more precisely SciQL-2, but I don't expect that it was related / limited to the SciQL extensions on to of Feb2013) that when using a UNION ALL of several sub-queries in a function declaration, common sub-expressions --- in my specific case an aggregation that was identical in the select- and the having-clause of a group by sub-query --- were not recognized as such, i.e., the identical aggregation was evaluated twice. In my case, introducing a separate sub-function per sub-query and UNION ALL over the sub-function calls "solved" the problem.
(Given to many other obligations, I did not have/find the time to file my problem as a proper bug report ... :-()
While I assumed the problem was related to "too complicated" function definitions, it might actually be that it is related to UNION ALL queries --- the most prominent commonality between Roborto's and my case.
While appends are indeed not side-effect free, selections are.
Roberto's point / question is why the identical selects are not shared / re-used, but evaluated twice ("redundantly").
The same holds for the identical aggregations in my group-by-having case.
(I'll try to provide a reproducible example, but cannot promise when ...)
Comment 19151
Date: 2013-09-09 18:38:22 +0200
From: @mlkersten
Indeed, in this case it all boils down to the handling of constants at the SQL level to derive templates. It is up to the SQL engineer to validate and reassign it.
For what it's worth:
While both share UNION ALL, a notable difference between Roberto's and my case is that the common sub-expression appears "across" the UNION ALL in Roberto's case, while it appears within a sub-query "under" the UNION ALL in my case.
Having said that, with the following simple SQL-only example trying to mimic my (actually more complex SciQL) case, I cannot reproduce the problem, i.e., common subexpression elimination works fine with Feb2013-SP3 & latest Feb2013 from HG (changeset 23a797ab1392):
create table bug_3361_stm ( a int, b int );
explain
select a, sum(b), min(b) from bug_3361_StM group by a having sum(b) > 0 and min(b) > 0
union all
select a, sum(b), max(b) from bug_3361_StM group by a having sum(b) < 0 and max(b) < 0;
And another difference between Roberto's and my case is that in Roberto's case the common sub-expression contains a literal, while mine does not.
As Martin pointed out, the current (Feb2013) query translation creates a MAL plan (function) that uses separate function arguments for each literal, even if in the given instance some literals happen to have the same values. Thus, while there is a common sub-expression in the SQL query, this is no longer "visible"/present in the generated MAL plan.
I would put it this way: rewriting the sql query helps the translator produce better MAL code. Ideally, that should not be needed, as SQL is supposed to be declarative. Especially considering that in many real-life applications query are generated - thus more likely to contain verbose patterns, I would consider this something to try and improve. A good reason is also that it came as an unexpected side effect of other developments, reducing the effectiveness of previously working code optimizations.
On the sql side the 2 variables are created because of the query cache. For now you could switch off the query cache (set cache=false). This should push the problem back to the commonterms optimizer. Currently the optimizer fails at finding the duplicates.
- improved usage for filter functions
- improved push down of selects (with multiple tid() statements)
- made enabling/disabling cache function again (needed for bug #3361)
Indeed, the window for searching for constants in a MAL plan is limited
and also not handled at the SQL layer as such.
The underlying reason is that constants are not separately administered
from ordinary variables.
Similarly, the window for searching identical instructions is limited.
Otherwise, the common term optimizer would become way too expensive O(n^2)
In this case we have
X_48 := algebra.subselect(X_9,X_6,"str":str,"str":str,true,true,false)
X_49 := algebra.subselect(r1_12,"str":str,"str":str,true,true,false);
X_50 := algebra.subselect(X_15,X_6,"str":str,"str":str,true,true,false);
X_51 := algebra.subselect(X_9,X_27,"str":str,"str":str,true,true,false)
X_52 := algebra.subselect(X_15,X_27,"str":str,"str":str,true,true,false);
None of the arguments pairs are identical and in this simple common term
matching fails. The fact that X_27 is an alias for X_6 is not (yet) taken
into account. This could be handled by an extra alias removal before the
commonterms optimizer starts. [TODO]
Martin, do you mean that simply adding the existing aliasRemoval before commonTerms in the current pipeline would do the trick? Or would they need changes anyway?
I am aware of the limited window. We discussed this a few times and I still don't agree with it. ;-)
That's why there is no limit in my code (that's the only change), so the window cannot be a reason in this case.
Yes, that optimizer is expensive, but with the current limitations it's hardly ever useful and its failed application can be way worse than its quadratic execution on long plans (that's my opinion of course).
Comment 20934
Date: 2015-06-19 20:38:53 +0200
From: @mlkersten
If you add the alias optimizer:
....
"optimizer.deadcode();"
"optimizer.aliases();"
"optimizer.commonTerms();"
....
You will get the following more interesting result.
From my point of view this bug report can be closed as fixed.
However.... ;) I hope you don't mind if I submit others on the same topic. Re-use of intermediate results remains a nasty issue with generated SQL queries.
The conclusion for this issues seemed to be, from the last comments, that that aliases optimizer should have been placed before the commonTerms optimizer, in order to catch instructions that are not identical but equivalent.
The bug has been closed as fixed in Jul2015, but I see the default pipe hasn't changed.
explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';
eventually worked correctly by
setting cache to false
calling aliases optimizer before commonTerms optimizer
Now this seems not to be enough any more.
To me this remains a serious issue. Plans are exploding and repeating 40-50 times the same computations.
Comment 22037
Date: 2016-04-14 13:08:01 +0200
From: @mlkersten
Yes, the issue is known and we are considering to make caching default false. Calling the optimizer for each query instead. That would remove this issue.
reordered the optimizers such that the alias gets removed. This solve the commonterms problem. For now still the cache needs to be disabled. We are still thinking/testing if the agressive statement cache could be removed (or only used for a smaller set of statements (is updates)).
Date: 2013-09-09 17:11:03 +0200
From: @swingbit
To: SQL devs <>
Version: -- development
CC: @mlkersten, @njnes, @drstmane
Last updated: 2020-09-23 17:31:10 +0200
Comment 19143
Date: 2013-09-09 17:11:03 +0200
From: @swingbit
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36
Build Identifier:
Perhaps I missed something in the recent MonetDB developments, but after noticing unexpected MAL plan explosions (see BUG #3294), I came to this simple query:
explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';
Which translates to:
+----------------------------------------------------------------+
| mal |
+================================================================+
| function user.s2_1(A0:str,A1:str):void; |
| X_4:bat[:oid,:str] := bat.new(nil:oid,nil:str); |
| X_3 := sql.mvc(); |
| X_10 := sql.bind(X_3,"sys","functions","mod",0); |
| X_7:bat[:oid,:oid] := sql.tid(X_3,"sys","functions"); |
| X_56 := algebra.subselect(X_10,X_7,A0,A0,true,true,false); |
| (X_13,r1_13) := sql.bind(X_3,"sys","functions","mod",2); |
| X_57 := algebra.subselect(r1_13,A0,A0,true,true,false); |
| X_16 := sql.bind(X_3,"sys","functions","mod",1); |
| X_58 := algebra.subselect(X_16,X_7,A0,A0,true,true,false); |
| X_18 := sql.subdelta(X_56,X_7,X_13,X_57,X_58); |
| X_21 := sql.bind(X_3,"sys","functions","name",0); |
| (X_23,r1_28) := sql.bind(X_3,"sys","functions","name",2); |
| X_25 := sql.bind(X_3,"sys","functions","name",1); |
| X_26 := sql.projectdelta(X_18,X_21,X_23,r1_28,X_25); |
| X_27 := bat.append(X_4,X_26,true); |
| X_28:bat[:oid,:oid] := X_7; |
| X_29 := sql.projectdelta(X_28,X_10,X_13,r1_13,X_16); |
| X_30 := algebra.subselect(X_29,A1,A1,true,true,false); |
| X_33 := sql.projectdelta(X_28,X_21,X_23,r1_28,X_25); |
| X_35 := algebra.leftfetchjoin(X_30,X_33); |
| X_36 := bat.append(X_27,X_35,true); |
| X_38 := sql.resultSet(1,1,X_36); |
| sql.rsColumn(X_38,".L","name","varchar",256,0,X_36); |
| X_43 := io.stdout(); |
| sql.exportResult(X_43,X_38); |
| end s2_1; |
| optimizer.dataflow() |
+----------------------------------------------------------------+
It looks to me that this repeats twice the same select. Is that true?
Reproducible: Always
Comment 19144
Date: 2013-09-09 17:41:02 +0200
From: @mlkersten
Roberto,
The select statements are not identical.
They depend on different components
| X_56 := algebra.subselect(X_10,X_7,A0,A0,true,true,false); | mod,0
| X_57 := algebra.subselect(r1_13,A0,A0,true,true,false); | mod,2
| X_58 := algebra.subselect(X_16,X_7,A0,A0,true,true,false); | mod,1
| X_30 := algebra.subselect(X_29,A1,A1,true,true,false); | projection
Optimizer seems to have squeezed it as it should.
regards, Martin
Comment 19145
Date: 2013-09-09 17:49:16 +0200
From: @swingbit
Hi Martin,
I was aware of the 0,1,2 binds.
What I don't understand is: if the result consists of the same selection appended twice, why does X_36 (the second append) read as:
X_36 := bat.append(X_27,X_35,true);
and not as
X_36 := bat.append(X_27,X_26,true);
?
Comment 19146
Date: 2013-09-09 18:02:30 +0200
From: @swingbit
To further clarify, this is an explain from an older MonetDB version (Dec2011), where you can clearly see that the result of the same select is appended twice to the final result.
In the one I posted in the original report, it seems to me that the same select is computed twice.
+----------------------------------------------------------------------+
| mal |
+======================================================================+
| function user.s2_9():void; |
| X_30:bat[:oid,:str] := bat.new(nil:oid,nil:str); |
| X_1 := sql.mvc(); |
| X_21:bat[:oid,:str] := sql.bind(X_1,"sys","functions","mod",0); |
| X_22 := algebra.uselect(X_21,"str":str); |
| X_19:bat[:oid,:str] := sql.bind(X_1,"sys","functions","mod",2); |
| X_23 := algebra.kdifference(X_22,X_19); |
| X_20 := algebra.uselect(X_19,"str":str); |
| X_24 := algebra.kunion(X_23,X_20); |
| X_17:bat[:oid,:str] := sql.bind(X_1,"sys","functions","mod",1); |
| X_18 := algebra.uselect(X_17,"str":str); |
| X_25 := algebra.kunion(X_24,X_18); |
| X_15:bat[:oid,:oid] := sql.bind_dbat(X_1,"sys","functions",1); |
| X_16 := bat.reverse(X_15); |
| X_26 := algebra.kdifference(X_25,X_16); |
| X_27 := algebra.markT(X_26,0@0:oid); |
| X_28 := bat.reverse(X_27); |
| X_9:bat[:oid,:str] := sql.bind(X_1,"sys","functions","name",0); |
| X_7:bat[:oid,:str] := sql.bind(X_1,"sys","functions","name",2); |
| X_11 := algebra.kdifference(X_9,X_7); |
| X_12 := algebra.kunion(X_11,X_7); |
| X_4:bat[:oid,:str] := sql.bind(X_1,"sys","functions","name",1); |
| X_13 := algebra.kunion(X_12,X_4); |
| X_29 := algebra.leftjoin(X_28,X_13); |
| X_33 := bat.append(X_30,X_29,true); |
| X_35 := bat.append(X_33,X_29,true); |
| X_36 := sql.resultSet(1,1,X_35); |
| sql.rsColumn(X_36,".functions","name":str,"varchar",256,0,X_35); |
| X_44 := io.stdout(); |
| sql.exportResult(X_44,X_36); |
| end s2_9; |
+----------------------------------------------------------------------+
Comment 19147
Date: 2013-09-09 18:12:35 +0200
From: @mlkersten
A critical test for common term elimination is that instructions should be side-effect free. Appends are not.
Also the equality of the arguments A0, and A1 can not be determined statically.
It would have to be detected at the SQL level leading to a template with just one argument.
Comment 19148
Date: 2013-09-09 18:23:16 +0200
From: @swingbit
A difference that I notice with the Dec2011 plan that I posted is indeed that the two strings are now parameters of the function.
This justifies your explanation about arguments A0 and A1.
Understanding why this is happening doesn't really satisfy me though. The behaviour of the older version was far more correct, I believe. Notice that in real-life queries I get very long plan portions repeated over and over again, just because of this.
Do you think this is the way it should behave?
Comment 19149
Date: 2013-09-09 18:32:44 +0200
From: @drstmane
Possibly related:
I recently had the problem with Feb2013 (or more precisely SciQL-2, but I don't expect that it was related / limited to the SciQL extensions on to of Feb2013) that when using a UNION ALL of several sub-queries in a function declaration, common sub-expressions --- in my specific case an aggregation that was identical in the select- and the having-clause of a group by sub-query --- were not recognized as such, i.e., the identical aggregation was evaluated twice. In my case, introducing a separate sub-function per sub-query and UNION ALL over the sub-function calls "solved" the problem.
(Given to many other obligations, I did not have/find the time to file my problem as a proper bug report ... :-()
While I assumed the problem was related to "too complicated" function definitions, it might actually be that it is related to UNION ALL queries --- the most prominent commonality between Roborto's and my case.
Would this make this bug report "valid", again?
Comment 19150
Date: 2013-09-09 18:36:08 +0200
From: @drstmane
NB:
While appends are indeed not side-effect free, selections are.
Roberto's point / question is why the identical selects are not shared / re-used, but evaluated twice ("redundantly").
The same holds for the identical aggregations in my group-by-having case.
(I'll try to provide a reproducible example, but cannot promise when ...)
Comment 19151
Date: 2013-09-09 18:38:22 +0200
From: @mlkersten
Indeed, in this case it all boils down to the handling of constants at the SQL level to derive templates. It is up to the SQL engineer to validate and reassign it.
Comment 19152
Date: 2013-09-09 19:23:14 +0200
From: @drstmane
For what it's worth:
While both share UNION ALL, a notable difference between Roberto's and my case is that the common sub-expression appears "across" the UNION ALL in Roberto's case, while it appears within a sub-query "under" the UNION ALL in my case.
Comment 19153
Date: 2013-09-09 19:49:08 +0200
From: @drstmane
Having said that, with the following simple SQL-only example trying to mimic my (actually more complex SciQL) case, I cannot reproduce the problem, i.e., common subexpression elimination works fine with Feb2013-SP3 & latest Feb2013 from HG (changeset 23a797ab1392):
create table bug_3361_stm ( a int, b int );
explain
select a, sum(b), min(b) from bug_3361_StM group by a having sum(b) > 0 and min(b) > 0
union all
select a, sum(b), max(b) from bug_3361_StM group by a having sum(b) < 0 and max(b) < 0;
+------------------------------------------------------------------------+
| mal |
+========================================================================+
| function user.s7_1{autoCommit=true}(A0:lng,A1:int,A2:lng,A3:int):void; |
| X_7:bat[:oid,:int] := bat.new(nil:oid,nil:int); |
| X_6 := sql.mvc(); |
| X_10:bat[:oid,:oid] := sql.tid(X_6,"sys","bug_3361_stm"); |
| X_13 := sql.bind(X_6,"sys","bug_3361_stm","b",0); |
| (X_16,r1_16) := sql.bind(X_6,"sys","bug_3361_stm","b",2); |
| X_19 := sql.bind(X_6,"sys","bug_3361_stm","b",1); |
| X_21 := sql.delta(X_13,X_16,r1_16,X_19); |
| X_22 := algebra.leftfetchjoin(X_10,X_21); |
| X_23 := sql.bind(X_6,"sys","bug_3361_stm","a",0); |
| (X_25,r1_25) := sql.bind(X_6,"sys","bug_3361_stm","a",2); |
| X_27 := sql.bind(X_6,"sys","bug_3361_stm","a",1); |
| X_28 := sql.delta(X_23,X_25,r1_25,X_27); |
| X_29 := algebra.leftfetchjoin(X_10,X_28); |
| (X_30,r1_30,r2_30) := group.subgroupdone(X_29); |
| X_33:bat[:oid,:lng] := aggr.subsum(X_22,X_30,r1_30,true,true); |
^^^^ ^^^
| X_34:bat[:oid,:int] := aggr.submin(X_22,X_30,r1_30,true); |
^^^^ ^^^
| X_38 := algebra.leftfetchjoin(r1_30,X_29); |
| X_42:bat[:oid,:int] := aggr.submax(X_22,X_30,r1_30,true); |
^^^^ ^^^
| X_35 := algebra.thetasubselect(X_34,A1,">"); |
^^^^
| X_37 := algebra.thetasubselect(X_33,X_35,A0,">"); |
^^^^
| X_39 := algebra.leftfetchjoin(X_37,X_38); |
| X_40 := bat.append(X_7,X_39,true); |
| X_43 := algebra.thetasubselect(X_42,A3,"<"); |
^^^^
| X_45 := algebra.thetasubselect(X_33,X_43,A2,"<"); |
^^^^
| X_46 := algebra.leftfetchjoin(X_45,X_38); |
| X_47 := bat.append(X_40,X_46,true); |
| X_56:bat[:oid,:int] := bat.new(nil:oid,nil:int); |
| X_58 := algebra.leftfetchjoin(X_37,X_34); |
^^^^
| X_59 := bat.append(X_56,X_58,true); |
| X_60 := algebra.leftfetchjoin(X_45,X_42); |
^^^^
| X_61 := bat.append(X_59,X_60,true); |
| X_49:bat[:oid,:lng] := bat.new(nil:oid,nil:lng); |
| X_52 := algebra.leftfetchjoin(X_37,X_33); |
^^^^
| X_53 := bat.append(X_49,X_52,true); |
| X_54 := algebra.leftfetchjoin(X_45,X_33); |
^^^^
| X_55 := bat.append(X_53,X_54,true); |
| X_62 := sql.resultSet(3,1,X_47); |
| sql.rsColumn(X_62,".L10","a","int",32,0,X_47); |
| sql.rsColumn(X_62,".L10","L1","bigint",32,0,X_55); |
| sql.rsColumn(X_62,".L10","L2","int",32,0,X_61); |
| X_76 := io.stdout(); |
| sql.exportResult(X_76,X_62); |
| end s7_1; |
+------------------------------------------------------------------------+
44 tuples (3.083ms)
Comment 19154
Date: 2013-09-09 20:29:27 +0200
From: @drstmane
And another difference between Roberto's and my case is that in Roberto's case the common sub-expression contains a literal, while mine does not.
As Martin pointed out, the current (Feb2013) query translation creates a MAL plan (function) that uses separate function arguments for each literal, even if in the given instance some literals happen to have the same values. Thus, while there is a common sub-expression in the SQL query, this is no longer "visible"/present in the generated MAL plan.
Comment 19155
Date: 2013-09-11 11:51:48 +0200
From: @swingbit
How literals are passed as function parameters seems indeed the be the main cause here.
If I rewrite my original query as
explain
with sel as (select name from sys.functions where mod='str')
select name from sel
union all
select name from sel;
Then I get the MAL plan that I would expect:
+----------------------------------------------------------------+
| mal |
+================================================================+
| function user.s5_1(A0:str):void; |
| X_3:bat[:oid,:str] := bat.new(nil:oid,nil:str); |
| X_2 := sql.mvc(); |
| X_9 := sql.bind(X_2,"sys","functions","mod",0); |
| X_6:bat[:oid,:oid] := sql.tid(X_2,"sys","functions"); |
| X_48 := algebra.subselect(X_9,X_6,A0,A0,true,true,false); |
| (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2); |
| X_49 := algebra.subselect(r1_12,A0,A0,true,true,false); |
| X_15 := sql.bind(X_2,"sys","functions","mod",1); |
| X_50 := algebra.subselect(X_15,X_6,A0,A0,true,true,false); |
| X_17 := sql.subdelta(X_48,X_6,X_12,X_49,X_50); |
| X_20 := sql.bind(X_2,"sys","functions","name",0); |
| (X_22,r1_27) := sql.bind(X_2,"sys","functions","name",2); |
| X_24 := sql.bind(X_2,"sys","functions","name",1); |
| X_25 := sql.projectdelta(X_17,X_20,X_22,r1_27,X_24); |
| X_26 := bat.append(X_3,X_25,true); |
| X_27 := bat.append(X_26,X_25,true); |
| X_28 := sql.resultSet(1,1,X_27); |
| sql.rsColumn(X_28,".L","name","varchar",256,0,X_27); |
| X_35 := io.stdout(); |
| sql.exportResult(X_35,X_28); |
| end s5_1; |
| optimizer.dataflow() |
+----------------------------------------------------------------+
I would put it this way: rewriting the sql query helps the translator produce better MAL code. Ideally, that should not be needed, as SQL is supposed to be declarative. Especially considering that in many real-life applications query are generated - thus more likely to contain verbose patterns, I would consider this something to try and improve. A good reason is also that it came as an unexpected side effect of other developments, reducing the effectiveness of previously working code optimizations.
Will this report be reconsidered?
Comment 20104
Date: 2014-08-29 10:29:45 +0200
From: @swingbit
A year later, I'd like to reopen this bug report.
It was marked as "INVALID", but as the discussion shows, it was not invalid at all.
Perhaps you want to mark it as "WON'T FIX", but I hope you reconsider, as I find it serious for every non-trivial query scenario.
Comment 20536
Date: 2014-12-17 16:55:21 +0100
From: @njnes
On the sql side the 2 variables are created because of the query cache. For now you could switch off the query cache (set cache=false). This should push the problem back to the commonterms optimizer. Currently the optimizer fails at finding the duplicates.
Comment 20537
Date: 2014-12-17 16:57:10 +0100
From: MonetDB Mercurial Repository <>
Changeset 9a7952fd5a0c made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=9a7952fd5a0c
Changeset description:
Comment 20931
Date: 2015-06-19 18:23:12 +0200
From: @swingbit
Setting cache=false does indeed inline the parameters.
However commonTerms cannot find the duplicates yet. I don't really understand why.
Notice that the problem I am describing here is the root of MAL plans longer than 30K lines (where I would expect max 2-3K lines).
On Oct2014-SP4, with cache=false (Jul2015 behaves the same):
sql>set cache=false;
operation successful (0.641ms)
sql>explain
more>select name from sys.functions where mod='str'
more>union all
more>select name from sys.functions where mod='str';
+-------------------------------------------------------------------------------+
| mal |
+===============================================================================+
| function user.main{autoCommit=true}():void; |
| barrier X_63 := language.dataflow(); |
| X_3:bat[:oid,:str] := bat.new(nil:oid,nil:str); |
| X_2 := sql.mvc(); |
| X_9:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",0); |
| X_6:bat[:oid,:oid] := sql.tid(X_2,"sys","functions"); |
| X_48 := algebra.subselect(X_9,X_6,"str":str,"str":str,true,true,false); |
| (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2); |
| X_49 := algebra.subselect(r1_12,"str":str,"str":str,true,true,false); |
| X_15:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",1); |
| X_50 := algebra.subselect(X_15,X_6,"str":str,"str":str,true,true,false); |
| X_17 := sql.subdelta(X_48,X_6,X_12,X_49,X_50); |
| X_19:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",0); |
| (X_21,r1_28) := sql.bind(X_2,"sys","functions","name",2); |
| X_23:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",1); |
| X_24 := sql.projectdelta(X_17,X_19,X_21,r1_28,X_23); |
| X_25 := bat.append(X_3,X_24,true); |
| X_27:bat[:oid,:oid] := X_6; |
| X_51 := algebra.subselect(X_9,X_27,"str":str,"str":str,true,true,false); |
| X_52 := algebra.subselect(X_15,X_27,"str":str,"str":str,true,true,false); |
| X_29 := sql.subdelta(X_51,X_27,X_12,X_49,X_52); |
| X_31 := sql.projectdelta(X_29,X_19,X_21,r1_28,X_23); |
| X_32 := bat.append(X_25,X_31,true); |
| language.pass(X_6); |
| language.pass(X_9); |
| language.pass(X_15); |
| language.pass(X_27); |
| language.pass(X_12); |
| language.pass(X_49); |
| language.pass(X_19); |
| language.pass(X_21); |
| language.pass(r1_28); |
| language.pass(X_23); |
| exit X_63; |
| X_33 := sql.resultSet(1,1,X_32); |
| sql.rsColumn(X_33,".L","name","varchar",256,0,X_32); |
| X_39 := io.stdout(); |
| sql.exportResult(X_39,X_33); |
| end main; |
+-------------------------------------------------------------------------------+
39 tuples (3.649ms)
Comment 20932
Date: 2015-06-19 18:37:22 +0200
From: @mlkersten
Indeed, the window for searching for constants in a MAL plan is limited
and also not handled at the SQL layer as such.
The underlying reason is that constants are not separately administered
from ordinary variables.
Similarly, the window for searching identical instructions is limited.
Otherwise, the common term optimizer would become way too expensive O(n^2)
In this case we have
X_48 := algebra.subselect(X_9,X_6,"str":str,"str":str,true,true,false)
X_49 := algebra.subselect(r1_12,"str":str,"str":str,true,true,false);
X_50 := algebra.subselect(X_15,X_6,"str":str,"str":str,true,true,false);
X_51 := algebra.subselect(X_9,X_27,"str":str,"str":str,true,true,false)
X_52 := algebra.subselect(X_15,X_27,"str":str,"str":str,true,true,false);
None of the arguments pairs are identical and in this simple common term
matching fails. The fact that X_27 is an alias for X_6 is not (yet) taken
into account. This could be handled by an extra alias removal before the
commonterms optimizer starts. [TODO]
Comment 20933
Date: 2015-06-19 18:49:47 +0200
From: @swingbit
Martin, do you mean that simply adding the existing aliasRemoval before commonTerms in the current pipeline would do the trick? Or would they need changes anyway?
I am aware of the limited window. We discussed this a few times and I still don't agree with it. ;-)
That's why there is no limit in my code (that's the only change), so the window cannot be a reason in this case.
Yes, that optimizer is expensive, but with the current limitations it's hardly ever useful and its failed application can be way worse than its quadratic execution on long plans (that's my opinion of course).
Comment 20934
Date: 2015-06-19 20:38:53 +0200
From: @mlkersten
If you add the alias optimizer:
....
"optimizer.deadcode();"
"optimizer.aliases();"
"optimizer.commonTerms();"
....
You will get the following more interesting result.
sql>set cache=false;
operation successful (0.629ms)
sql>explain select name from sys.functions where mod='str'
more>union
more>select name from sys.functions where mod='str' ;
+--------------------------------------------------------------------------------------+
| mal |
+======================================================================================+
| function user.main{autoCommit=true}():void; |
| X_38 := bat.new(nil:oid,nil:str); |
| X_46 := bat.append(X_38,".L"); |
| X_41 := bat.new(nil:oid,nil:str); |
| X_48 := bat.append(X_41,"name"); |
| X_42 := bat.new(nil:oid,nil:str); |
| X_50 := bat.append(X_42,"varchar"); |
| X_43 := bat.new(nil:oid,nil:int); |
| X_52 := bat.append(X_43,256); |
| X_45 := bat.new(nil:oid,nil:int); |
| X_54 := bat.append(X_45,0); |
| X_3:bat[:oid,:str] := bat.new(nil:oid,nil:str); |
| X_2 := sql.mvc(); |
| X_9:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",0); |
| X_6:bat[:oid,:oid] := sql.tid(X_2,"sys","functions"); |
| X_69 := algebra.subselect(X_9,X_6,"str","str",true,true,false); |
| (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2); |
| X_70 := algebra.subselect(r1_12,nil:bat[:oid,:oid],"str","str",true,true,false); |
| X_15:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",1); |
| X_72 := algebra.subselect(X_15,X_6,"str","str",true,true,false); |
| X_18 := sql.subdelta(X_69,X_6,X_12,X_70,X_72); |
| X_20:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",0); |
| (X_22,r1_25) := sql.bind(X_2,"sys","functions","name",2); |
| X_24:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",1); |
| X_25 := sql.projectdelta(X_18,X_20,X_22,r1_25,X_24); |
| (X_26,r1_31,r2_31) := group.subgroupdone(X_25); |
| X_29 := algebra.leftfetchjoin(r1_31,X_25); |
| X_30 := bat.append(X_3,X_29,true); |
| X_32 := bat.append(X_30,X_29,true); |
| (X_33,r1_64,r2_64) := group.subgroupdone(X_32); |
| X_36 := algebra.leftfetchjoin(r1_64,X_32); |
| sql.resultSet(X_46,X_48,X_50,X_52,X_54,X_36); |
| end user.main; |
| optimizer.mitosis() |
| optimizer.dataflow() |
+--------------------------------------------------------------------------------------+
35 tuples (4.034ms)
Happy? ;)
Comment 20936
Date: 2015-06-22 10:46:43 +0200
From: @swingbit
Thanks, that looks much better indeed :)
From my point of view this bug report can be closed as fixed.
However.... ;) I hope you don't mind if I submit others on the same topic. Re-use of intermediate results remains a nasty issue with generated SQL queries.
Comment 21211
Date: 2015-08-28 13:42:22 +0200
From: @sjoerdmullender
Jul2015 has been released.
Comment 22034
Date: 2016-04-14 12:07:04 +0200
From: @swingbit
The conclusion for this issues seemed to be, from the last comments, that that aliases optimizer should have been placed before the commonTerms optimizer, in order to catch instructions that are not identical but equivalent.
The bug has been closed as fixed in Jul2015, but I see the default pipe hasn't changed.
Is there a reason? Was that not a good fix?
Comment 22036
Date: 2016-04-14 12:57:05 +0200
From: @swingbit
Actually: the test case
explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';
eventually worked correctly by
Now this seems not to be enough any more.
To me this remains a serious issue. Plans are exploding and repeating 40-50 times the same computations.
Comment 22037
Date: 2016-04-14 13:08:01 +0200
From: @mlkersten
Yes, the issue is known and we are considering to make caching default false. Calling the optimizer for each query instead. That would remove this issue.
Comment 22060
Date: 2016-04-17 11:41:10 +0200
From: MonetDB Mercurial Repository <>
Changeset 6db31f60efeb made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=6db31f60efeb
Changeset description:
Comment 22061
Date: 2016-04-17 11:42:47 +0200
From: @njnes
reordered the optimizers such that the alias gets removed. This solve the commonterms problem. For now still the cache needs to be disabled. We are still thinking/testing if the agressive statement cache could be removed (or only used for a smaller set of statements (is updates)).
Comment 22125
Date: 2016-05-06 19:02:04 +0200
From: @swingbit
Just a reminder that the commit above modifies only default_pipe. The others still have the pushselect optimizer after the alias optimizer.
Comment 28115
Date: 2020-09-23 17:31:10 +0200
From: @njnes
the oct2020 will no longer have the query cache, ie should run these issues without problems.
The text was updated successfully, but these errors were encountered: