Attached file: deadlock.sciql (application/octet-stream, 4253 bytes)
Description: SciQL test script
revert unintended(?) changes that "slipped in" with changeset [489815265a61](https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=489815265a61)
these appear unrelated to fixing bug #3346,
and rather the result of too coarse copy-&-paste
Date: 2013-08-20 17:26:15 +0200
From: @drstmane
To: MonetDB5 devs <>
Version: 11.15.11 (Feb2013-SP3)
CC: @mlkersten
Last updated: 2013-09-27 13:47:18 +0200
Comment 19035
Date: 2013-08-20 17:26:15 +0200
From: @drstmane
Created attachment 223
SciQL test script
On my 8-core machine (4 physical cores with hyperthreading), the attached SciQL script finishes successfully in less than 2 seconds (debug build) when run with t = {1,7,8} threads (mserver5 --set gdk_nr_threads=t).
However, when run with t = {2,3,4,5,6} threads, the script hangs because several threads --- 3(!) when the server is run with 2(!) threads --- block on MT_sema_down(&q->s, "q_dequeue"); in dataflow's q_dequeue() function; see also the gdb trace below.
This happens with the latest version of the SciQL-2 branch (changeset c935ec8da74c), but similar (or identical?) behaviour has been observed also with earlier versions, i.e., before the recent re-cast of the worker-pool had been propagated from the Feb2013 branch.
(gdb) thread apply all bt
Thread 8 (Thread 0x7fffea7d7700 (LWP 27115)):
0 0x000000337ce0d6a0 in sem_wait () from /lib64/libpthread.so.0
1 0x00007ffff7903b3b in q_dequeue (q=0x7fffd4003660) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:210
2 0x00007ffff7905530 in DFLOWscheduler (flow=0x7fffd40035d0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:573
3 0x00007ffff7905b33 in runMALdataflow (cntxt=0x628028, mb=0x7fffdc2eed50, startpc=12, stoppc=43, stk=0x7fffd4003e70) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:653
4 0x00007ffff7aa719f in MALstartDataflow (cntxt=0x628028, mb=0x7fffdc2eed50, stk=0x7fffd4003e70, pci=0x7fffdc526270) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/modules/mal/language.c:136
5 0x00007ffff790083c in runMALsequence (cntxt=0x628028, mb=0x7fffdc2eed50, startpc=1, stoppc=54, stk=0x7fffd4003e70, env=0x7fffdc591100, pcicaller=0x7fffdc5257b0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:650
6 0x00007ffff7900c8f in runMALsequence (cntxt=0x628028, mb=0x7fffdc33b810, startpc=38, stoppc=39, stk=0x7fffdc591100, env=0x0, pcicaller=0x0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:730
7 0x00007ffff7903f95 in DFLOWworker (t=0x7ffff7f2e028 <workers+8>) at /ufs/manegold/_/Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:301
8 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
9 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7fffea9d8700 (LWP 27114)):
0 0x000000337ce0d6a0 in sem_wait () from /lib64/libpthread.so.0
1 0x00007ffff7903b3b in q_dequeue (q=0x7fffe00037b0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:210
2 0x00007ffff7905530 in DFLOWscheduler (flow=0x7fffe0003670) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:573
3 0x00007ffff7905b33 in runMALdataflow (cntxt=0x628028, mb=0x7fffdc528030, startpc=12, stoppc=43, stk=0x7fffe00040b0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:653
4 0x00007ffff7aa719f in MALstartDataflow (cntxt=0x628028, mb=0x7fffdc528030, stk=0x7fffe00040b0, pci=0x7fffdc525390) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/modules/mal/language.c:136
5 0x00007ffff790083c in runMALsequence (cntxt=0x628028, mb=0x7fffdc528030, startpc=1, stoppc=54, stk=0x7fffe00040b0, env=0x7fffdc591100, pcicaller=0x7fffdc56f5b0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:650
6 0x00007ffff7900c8f in runMALsequence (cntxt=0x628028, mb=0x7fffdc33b810, startpc=41, stoppc=42, stk=0x7fffdc591100, env=0x0, pcicaller=0x0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:730
7 0x00007ffff7903f95 in DFLOWworker (t=0x7ffff7f2e020 ) at /ufs/manegold/_/Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:301
8 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
9 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7fffeabd9700 (LWP 27113)):
0 0x000000337ce0d6a0 in sem_wait () from /lib64/libpthread.so.0
1 0x00007ffff7903b3b in q_dequeue (q=0x7fffdc338b90) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:210
2 0x00007ffff7905530 in DFLOWscheduler (flow=0x7fffdc2eb7c0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:573
3 0x00007ffff7905b33 in runMALdataflow (cntxt=0x628028, mb=0x7fffdc33b810, startpc=21, stoppc=42, stk=0x7fffdc591100) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_dataflow.c:653
4 0x00007ffff7aa719f in MALstartDataflow (cntxt=0x628028, mb=0x7fffdc33b810, stk=0x7fffdc591100, pci=0x7fffdc33e8d0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/modules/mal/language.c:136
5 0x00007ffff790083c in runMALsequence (cntxt=0x628028, mb=0x7fffdc33b810, startpc=1, stoppc=104, stk=0x7fffdc591100, env=0x7fffdc594120, pcicaller=0x7fffdc582590) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:650
6 0x00007ffff7900c8f in runMALsequence (cntxt=0x628028, mb=0x7fffdc11d020, startpc=1, stoppc=288, stk=0x7fffdc594120, env=0x7fffdc58bdf0, pcicaller=0x7fffdc2f3cc0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:730
7 0x00007ffff7900c8f in runMALsequence (cntxt=0x628028, mb=0x7fffdc313f00, startpc=1, stoppc=0, stk=0x7fffdc58bdf0, env=0x0, pcicaller=0x0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:730
8 0x00007ffff78ffd01 in callMAL (cntxt=0x628028, mb=0x7fffdc313f00, env=0x7fffeabd8b48, argv=0x7fffeabd8ba0, debug=0 '\000') at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_interpreter.c:472
9 0x00007fffef073b06 in SQLexecutePrepared (c=0x628028, be=0x7fffdc02b470, q=0x7fffdc2044c0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/sql/backends/monet5/sql_scenario.c:1888
10 0x00007fffef073f42 in SQLengineIntern (c=0x628028, be=0x7fffdc02b470) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/sql/backends/monet5/sql_scenario.c:1951
11 0x00007fffef0745c7 in SQLengine (c=0x628028) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/sql/backends/monet5/sql_scenario.c:2057
12 0x00007ffff792e12a in runPhase (c=0x628028, phase=4) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:522
13 0x00007ffff792e313 in runScenarioBody (c=0x628028) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:566
14 0x00007ffff792e436 in runScenario (c=0x628028) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:586
15 0x00007ffff792f4a8 in MSserveClient (dummy=0x628028) at /ufs/manegold/_/Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_session.c:431
16 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
17 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7fffeadda700 (LWP 27110)):
0 0x000000337c2eb863 in select () from /lib64/libc.so.6
1 0x00007ffff713bb95 in MT_sleep_ms (ms=50) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/gdk/gdk_posix.c:1226
2 0x00007fffef198995 in store_manager () at /ufs/manegold//Monet/HG/ANY/source/MonetDB/sql/storage/store.c:1593
3 0x00007fffef11806e in mvc_logmanager () at /ufs/manegold/_/Monet/HG/ANY/source/MonetDB/sql/server/sql_mvc.c:195
4 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
5 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7fffeafdb700 (LWP 27109)):
0 0x000000337c2eb863 in select () from /lib64/libc.so.6
1 0x00007ffff7aad259 in SERVERlistenThread (Sock=0x1a5cd10) at /ufs/manegold/_/Monet/HG/ANY/source/MonetDB/monetdb5/modules/mal/mal_mapi.c:209
2 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
3 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7ffff03cb700 (LWP 27108)):
0 0x000000337c2eb863 in select () from /lib64/libc.so.6
1 0x00007ffff713bb95 in MT_sleep_ms (ms=1000) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/gdk/gdk_posix.c:1226
2 0x00007ffff791d9ac in profilerHeartbeat (dummy=0x0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_profiler.c:1431
3 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
4 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7ffff05cc700 (LWP 27107)):
0 0x000000337c2eb863 in select () from /lib64/libc.so.6
1 0x00007ffff713bb95 in MT_sleep_ms (ms=50) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/gdk/gdk_posix.c:1226
2 0x00007ffff707712c in GDKvmtrim (limit=0x7ffff779ecf8 <GDK_mem_maxsize>) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/gdk/gdk_utils.c:921
3 0x000000337ce07d15 in start_thread () from /lib64/libpthread.so.0
4 0x000000337c2f253d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7ffff6b3e840 (LWP 27082)):
0 0x000000337ce0e12d in read () from /lib64/libpthread.so.0
1 0x0000003380e2a3c1 in rl_getc () from /lib64/libreadline.so.6
2 0x0000003380e2abc9 in rl_read_key () from /lib64/libreadline.so.6
3 0x0000003380e15d51 in readline_internal_char () from /lib64/libreadline.so.6
4 0x0000003380e162a5 in readline () from /lib64/libreadline.so.6
5 0x00007ffff791e7d2 in getConsoleInput (c=0x627d40, prompt=0x65ac70 ">", linemode=0, exit_on_error=1) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_readline.c:329
6 0x00007ffff791ed94 in readConsole (cntxt=0x627d40) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_readline.c:473
7 0x00007ffff792f6f2 in MALreader (c=0x627d40) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_session.c:491
8 0x00007ffff792e12a in runPhase (c=0x627d40, phase=0) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:522
9 0x00007ffff792e229 in runScenarioBody (c=0x627d40) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:552
10 0x00007ffff792e436 in runScenario (c=0x627d40) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_scenario.c:586
11 0x00007ffff792f4a8 in MSserveClient (dummy=0x627d40) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/monetdb5/mal/mal_session.c:431
12 0x000000000040367c in main (argc=3, av=0x7fffffffd6c8) at /ufs/manegold//Monet/HG/ANY/source/MonetDB/tools/mserver/mserver5.c:622
(gdb)
Comment 19036
Date: 2013-08-20 19:54:35 +0200
From: @mlkersten
Yes, there is a potential deadlock in the following situation. Recall that we have a single worker pool per session. If we call a function containing a parallel block, we effectively reduce the available pool by one worker, because the calling MAL instruction is put on hold without releasing its worker thread. After a few such calls, all workers may be occupied handling a MAL function call, each putting new instructions in the queue that no free worker is left to execute.
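To make the starvation argument concrete, here is a minimal single-threaded model in Python. It is only a sketch under assumptions: the function name `pool_completes`, the one-child-per-call shape, and the nesting depths are hypothetical and are not taken from mal_dataflow.c or from the attached script. Each modelled call takes a worker from a fixed pool, and a call that still has nesting left enqueues its child and keeps its worker until the child returns.

```python
from collections import deque


def pool_completes(num_workers, nesting_depth):
    """Return True if a chain of nested calls finishes, False if the pool starves.

    Hypothetical model: each call needs a worker; a call with nesting left
    enqueues one child and keeps its worker while waiting for that child.
    """
    free = num_workers
    waiting_callers = 0              # workers parked inside a nested call
    queue = deque([nesting_depth])   # queued calls, by remaining nesting
    while queue:
        if free == 0:
            return False             # work is queued, but nobody can take it
        remaining = queue.popleft()
        free -= 1                    # a worker picks up the call
        if remaining > 0:
            queue.append(remaining - 1)
            waiting_callers += 1     # the worker stays occupied by the caller
        else:
            free += 1 + waiting_callers  # leaf returns; callers unwind in turn
            waiting_callers = 0
    return True
```

In this model a chain nested two levels deep starves a pool of two workers but completes with three; it does not try to reproduce the exact thread counts reported above.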
Comment 19037
Date: 2013-08-20 21:22:52 +0200
From: @mlkersten
The situation is re-created with the test BugTracker-2013/Tests/nestedcalls.sql
A related one is BugTracker-2013/Tests/recursive.sql
Comment 19040
Date: 2013-08-21 08:41:04 +0200
From: @drstmane
Please be aware that the problem (also) exists in the Feb2013 (release) branch (and in the SciQL-2 branch, which is spawned off the Feb2013 branch). --- You added your tests (only) to the default (development) branch.
Comment 19041
Date: 2013-08-21 08:48:32 +0200
From: @drstmane
Do I understand you correctly that with "an unfortunate constellation" of nested/recursive MAL function calls --- only if the calling or the called or both functions involve dataflow blocks? --- all worker threads might become DFLOWschedulers, so that no threads are left to do the actual work (DFLOWworker), and all DFLOWschedulers wait for work that no available worker thread can do?
Given that the DFLOWscheduler is not supposed/expected to do much work itself, would it be an option to spawn a new scheduler-thread with each (non-inlined/-inlineable) MAL function call --- only if the calling or the called or both functions involve dataflow blocks? ---, thus keeping the worker threads free to do the "real" work?
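The proposal above can be sketched in Python as follows. This is an illustration of the idea only, not the fix that was eventually committed; `nested_call`, `run`, and the squaring workload are hypothetical stand-ins. Every modelled function call waits on its own scheduler thread, and only the actual work is submitted to the fixed pool, so a pool worker never blocks on a nested call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def nested_call(pool, depth, results):
    """One modelled function call, running on its own scheduler thread."""
    if depth > 0:
        # The nested call gets a fresh scheduler thread, not a pool worker.
        child = threading.Thread(
            target=nested_call, args=(pool, depth - 1, results)
        )
        child.start()
        child.join()
    # Only the actual work is handed to the pool, so workers never block.
    future = pool.submit(pow, depth, 2)
    results.append(future.result(timeout=10))


def run(num_workers, depth):
    """Run a chain of nested calls against a pool of num_workers workers."""
    results = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        nested_call(pool, depth, results)
    return results
```

Here `run(1, 3)` completes even with a single pool worker, because the waiting is done by the extra scheduler threads. The cost is one additional thread per nested call, which is the trade-off the question raises.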
Comment 19042
Date: 2013-08-21 11:39:30 +0200
From: MonetDB Mercurial Repository <>
Changeset 98ac58eef94c made by Stefan Manegold Stefan.Manegold@cwi.nl in the MonetDB repo, refers to this bug.
For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=98ac58eef94c
Changeset description:
Comment 19047
Date: 2013-08-21 23:37:05 +0200
From: @drstmane
More exhaustive testing indicates that Martin's fix (changeset 489815265a61 ff.) does also work fine with the initial SciQL problem.
Thanks to Martin for the prompt fix and for providing a concise SQL-only test in sql/test/BugTracker-2013/Tests/nestedcalls.sql .