You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When doing concurrent 2 mclient with one querying sys.queue, mserver segfault with the given (abbreviated) stacktrace
11 0x00007f4257d5a312 in __GI___assert_fail (assertion=0x7f42598838de "(v[i] & 0xC0) == 0x80", file=0x7f4259883710 "gdk_atoms.c", line=1283, function=0x7f4259883af6 <PRETTY_FUNCTION.10385> "strPut") at assert.c:101
12 0x00007f425959c941 in strPut (h=0x7f42341f3050, dst=0x7f4250dad290, v=0x7f41d6596b20 '\333' <repeats 160 times>, "\300") at gdk_atoms.c:1283
13 0x00007f425963fef5 in BUNappend (b=0x7f42342cc010, t=0x7f41d6596b20, force=0 '\000') at gdk_bat.c:1318
14 0x00007f4259d93785 in SYSMONqueue (cntxt=0x7f4252f7f330, mb=0x7f4234239880, stk=0x7f42342dee20, pci=0x7f42342fb5c0) at sysmon.c:103
15 0x00007f4259c1418a in runMALsequence (cntxt=0x7f4252f7f330, mb=0x7f4234239880, startpc=1, stoppc=0, stk=0x7f42342dee20, env=0x0, pcicaller=0x0) at mal_interpreter.c:631
16 0x00007f4259c132db in callMAL (cntxt=0x7f4252f7f330, mb=0x7f4234239880, env=0x7f4250dad998, argv=0x7f4250dada40, debug=0 '\000') at mal_interpreter.c:447
From what I could debug, here is what happened:
QRYqueue[0] and QRYqueue[1] are filled
QRYqueue[0] is finished and cleaned in runtimeProfileFinish, QRYqueue[1] is moved in QRYqueue[0].
We still have a QRYqueue information in QRYqueue[1], that should not pose problems since qtop is here
When a query is done on sys.queue, the loop in sysmon.c:89 is done from 0 to QRYqueue[i].tag, so we hit QRYqueue[1].query which has been freed but not setted to NULL.
Exporting qtop and doing the loop until qtop seems to solve the problem.
Looks like an off-by one (h->base[h->size] is set to 0)
No. memset sets the memory starting at h->base+h->free for (h->size-h->free) bytes. That is to say, from h->base+h->free up to (but not including(!)) h->base+h->free+(h->size-h->free) == h->base+h->size.
Still, the bug on sys.queue is valid. Freed memory is read and garbage can appear in the query column.
Just by doing 2 simultaneous
while mclient db -s "select * from sys.queue;" ; do :; done
This will trigger an "utf8strlenmax: Assertion `n == 0' failed."
Comment 22254
Date: 2016-07-25 21:04:50 +0200
From: @mlkersten
Running the one liner does not cause problems.
Running two clients causes an assertion in BUNappend
in SYSMONqueue, which is protected by a mal_delaylock (weird)
0 0x00007fd50b4389c8 in raise () from /lib64/libc.so.6
1 0x00007fd50b43a65a in abort () from /lib64/libc.so.6
2 0x00007fd50b431187 in __assert_fail_base () from /lib64/libc.so.6
3 0x00007fd50b431232 in __assert_fail () from /lib64/libc.so.6
4 0x00007fd50e0f44f1 in strPut (h=0x7fd454155c60, dst=0x7fd461f2f288, v=0x7fd45414f1f0 '\333' <repeats 40 times>, "q") at /export/scratch1/mk/default//package/gdk/gdk_atoms.c:1323
5 0x00007fd50e18ef54 in BUNappend (b=0x7fd454155b20, t=0x7fd45414f1f0, force=0 '\000') at /export/scratch1/mk/default//package/gdk/gdk_bat.c:955
6 0x00007fd50e83af7d in SYSMONqueue (cntxt=0x7fd5014bc330, mb=0x7fd454107280, stk=0x7fd45415a320, pci=0x7fd4541371b0) at /export/scratch1/mk/default//package/monetdb5/modules/mal/sysmon.c:77
Run the server under valgrind. One of the messages is:
==21299== Invalid read of size 1
==21299== at 0x5566DF1: strCmp (gdk_atoms.c:1084)
==21299== by 0x5601AF9: setcolprops (gdk_bat.c:818)
==21299== by 0x56025CC: BUNappend (gdk_bat.c:952)
==21299== by 0x4F94D5D: SYSMONqueue (sysmon.c:76)
==21299== by 0x4E85B1E: runMALsequence (mal_interpreter.c:629)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== Address 0x1353c920 is 16 bytes inside a block of size 56 free'd
==21299== at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==21299== by 0x55614DA: GDKfree (gdk_utils.c:1804)
==21299== by 0x4E7AF32: runtimeProfileFinish (mal_runtime.c:113)
==21299== by 0x4E86654: runMALsequence (mal_interpreter.c:760)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== by 0x4EAE928: MSserveClient (mal_session.c:449)
==21299== Block was alloc'd at
==21299== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==21299== by 0x55610E0: GDKmalloc_prefixsize (gdk_utils.c:1679)
==21299== by 0x55611F7: GDKmallocmax (gdk_utils.c:1722)
==21299== by 0x55612FF: GDKmalloc (gdk_utils.c:1746)
==21299== by 0x55617B8: GDKstrdup (gdk_utils.c:1881)
==21299== by 0x4E7AA32: runtimeProfileInit (mal_runtime.c:87)
==21299== by 0x4E851E5: runMALsequence (mal_interpreter.c:500)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
In other words, the query being added by SYSMONqueue has aleady been freed.
Date: 2016-03-02 09:14:04 +0100
From: anthonin.bonnefoy
To: MonetDB5 devs <>
Version: 11.21.19 (Jul2015-SP4)
CC: @mlkersten
Last updated: 2016-10-13 10:04:07 +0200
Comment 21849
Date: 2016-03-02 09:14:04 +0100
From: anthonin.bonnefoy
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0 Iceweasel/45.0
Build Identifier:
When doing concurrent 2 mclient with one querying sys.queue, mserver segfault with the given (abbreviated) stacktrace
11 0x00007f4257d5a312 in __GI___assert_fail (assertion=0x7f42598838de "(v[i] & 0xC0) == 0x80", file=0x7f4259883710 "gdk_atoms.c", line=1283, function=0x7f4259883af6 <PRETTY_FUNCTION.10385> "strPut") at assert.c:101
12 0x00007f425959c941 in strPut (h=0x7f42341f3050, dst=0x7f4250dad290, v=0x7f41d6596b20 '\333' <repeats 160 times>, "\300") at gdk_atoms.c:1283
13 0x00007f425963fef5 in BUNappend (b=0x7f42342cc010, t=0x7f41d6596b20, force=0 '\000') at gdk_bat.c:1318
14 0x00007f4259d93785 in SYSMONqueue (cntxt=0x7f4252f7f330, mb=0x7f4234239880, stk=0x7f42342dee20, pci=0x7f42342fb5c0) at sysmon.c:103
15 0x00007f4259c1418a in runMALsequence (cntxt=0x7f4252f7f330, mb=0x7f4234239880, startpc=1, stoppc=0, stk=0x7f42342dee20, env=0x0, pcicaller=0x0) at mal_interpreter.c:631
16 0x00007f4259c132db in callMAL (cntxt=0x7f4252f7f330, mb=0x7f4234239880, env=0x7f4250dad998, argv=0x7f4250dada40, debug=0 '\000') at mal_interpreter.c:447
From what I could debug, here is what happened:
We still have a QRYqueue information in QRYqueue[1], that should not pose problems since qtop is here
Exporting qtop and doing the loop until qtop seems to solve the problem.
Reproducible: Always
Comment 21850
Date: 2016-03-02 09:40:06 +0100
From: anthonin.bonnefoy
By the way, in gdk_atoms.c:1247, the line
memset(h->base + h->free, 0, h->size - h->free);
Looks like an off-by one (h->base[h->size] is set to 0)
Comment 21909
Date: 2016-03-15 10:30:58 +0100
From: @sjoerdmullender
(In reply to anthonin.bonnefoy from comment 1)
No. memset sets the memory starting at h->base+h->free for (h->size-h->free) bytes. That is to say, from h->base+h->free up to (but not including(!)) h->base+h->free+(h->size-h->free) == h->base+h->size.
Comment 22072
Date: 2016-04-21 10:19:41 +0200
From: anthonin.bonnefoy
Yeah, my bad.
Still, the bug on sys.queue is valid. Freed memory is read and garbage can appear in the query column.
Just by doing 2 simultaneous
while mclient db -s "select * from sys.queue;" ; do :; done
This will trigger an "utf8strlenmax: Assertion `n == 0' failed."
Comment 22254
Date: 2016-07-25 21:04:50 +0200
From: @mlkersten
Running the one liner does not cause problems.
Running two clients causes an assertion in BUNappend
in SYSMONqueue, which is protected by a mal_delaylock (weird)
0 0x00007fd50b4389c8 in raise () from /lib64/libc.so.6
1 0x00007fd50b43a65a in abort () from /lib64/libc.so.6
2 0x00007fd50b431187 in __assert_fail_base () from /lib64/libc.so.6
3 0x00007fd50b431232 in __assert_fail () from /lib64/libc.so.6
4 0x00007fd50e0f44f1 in strPut (h=0x7fd454155c60, dst=0x7fd461f2f288, v=0x7fd45414f1f0 '\333' <repeats 40 times>, "q") at /export/scratch1/mk/default//package/gdk/gdk_atoms.c:1323
5 0x00007fd50e18ef54 in BUNappend (b=0x7fd454155b20, t=0x7fd45414f1f0, force=0 '\000') at /export/scratch1/mk/default//package/gdk/gdk_bat.c:955
6 0x00007fd50e83af7d in SYSMONqueue (cntxt=0x7fd5014bc330, mb=0x7fd454107280, stk=0x7fd45415a320, pci=0x7fd4541371b0) at /export/scratch1/mk/default//package/monetdb5/modules/mal/sysmon.c:77
Comment 22257
Date: 2016-07-25 21:37:33 +0200
From: @sjoerdmullender
Run the server under valgrind. One of the messages is:
==21299== Invalid read of size 1
==21299== at 0x5566DF1: strCmp (gdk_atoms.c:1084)
==21299== by 0x5601AF9: setcolprops (gdk_bat.c:818)
==21299== by 0x56025CC: BUNappend (gdk_bat.c:952)
==21299== by 0x4F94D5D: SYSMONqueue (sysmon.c:76)
==21299== by 0x4E85B1E: runMALsequence (mal_interpreter.c:629)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== Address 0x1353c920 is 16 bytes inside a block of size 56 free'd
==21299== at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==21299== by 0x55614DA: GDKfree (gdk_utils.c:1804)
==21299== by 0x4E7AF32: runtimeProfileFinish (mal_runtime.c:113)
==21299== by 0x4E86654: runMALsequence (mal_interpreter.c:760)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== by 0x4EAE928: MSserveClient (mal_session.c:449)
==21299== Block was alloc'd at
==21299== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==21299== by 0x55610E0: GDKmalloc_prefixsize (gdk_utils.c:1679)
==21299== by 0x55611F7: GDKmallocmax (gdk_utils.c:1722)
==21299== by 0x55612FF: GDKmalloc (gdk_utils.c:1746)
==21299== by 0x55617B8: GDKstrdup (gdk_utils.c:1881)
==21299== by 0x4E7AA32: runtimeProfileInit (mal_runtime.c:87)
==21299== by 0x4E851E5: runMALsequence (mal_interpreter.c:500)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
In other words, the query being added by SYSMONqueue has aleady been freed.
Comment 22259
Date: 2016-07-25 21:58:24 +0200
From: MonetDB Mercurial Repository <>
Changeset 9f67d824a657 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=9f67d824a657
Changeset description:
Comment 22260
Date: 2016-07-25 22:00:12 +0200
From: @sjoerdmullender
Should be fixed now.
Comment 24492
Date: 2016-10-13 10:04:07 +0200
From: @sjoerdmullender
Jun2016-SP2 has been released.
The text was updated successfully, but these errors were encountered: