You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
if you have a column with only two rows, the SAMPLE function does not pick each approximately 50% of the time. something is biasing the SAMPLE draws.
Reproducible: Always
Steps to Reproduce:
here's the sql command.
SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 1;
but you have to run it a thousand times, however you prefer. here's the R code to do that with MonetDB.R--
here's a reproducible example using R code to repeat the sampling 1000 times. in both SAMPLE examples below, the database pulls the 2 less than 200 times out of 1000. shouldn't it be close to 500 out of 1000? this seems not random (misleading to users?) sorry if i'm misunderstanding something.. thank you!!
start in an empty directory somewhere
setwd( "C:/My Directory/MonetDB" )
dbGetQuery( db , "SELECT 1 AS col UNION ALL SELECT 2 AS col" )
out <- NULL
for ( i in 1:1000 ){
out <- c( out , dbGetQuery( db , "SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 0.5" ) )
}
not random
table( unlist( out ) )
1 2
880 120
out <- NULL
for ( i in 1:1000 ){
out <- c( out , dbGetQuery( db , "SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 1" ) )
}
ALSO not random
table( unlist( out ) )
1 2
856 144
Actual Results:
in both of the above examples of R output, more than 4 out of 5 draws were 1s instead of 2s.
Could not reproduce on Windows 7 Professional (64bits) with MonetDB Oct2014-SP3, both 32 and 64bits. Also could not find any flaw with random number generator on Windows.
Comment 21114
Date: 2015-08-10 20:41:20 +0200
From: Anthony Damico <>
sorry, this has not been fixed on windows using the Jul2015 release
table( unlist( out ) )
1 2
832 168
table( unlist( out ) )
1 2
830 170
MonetDB 5 server v11.21.1 "Jul2015"
Serving database 'test', using 8 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 15.914 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
Start processing logs sql/sql_logs version 52200
Finished processing logs sql/sql_logs
MonetDB/SQL module loaded
Date: 2015-08-23 19:45:49 +0200
From: Anthony Damico <>
just tested this- looks like it's been solved on the latest testing version. thanks thanks!!
MonetDB 5 server v11.21.3 "Jul2015"
Serving database 'test', using 8 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 15.914 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
Start processing logs sql/sql_logs version 52200
Finished processing logs sql/sql_logs
MonetDB/SQL module loaded
Date: 2015-05-28 09:58:52 +0200
From: Anthony Damico <>
To: SQL devs <>
Version: 11.19.11 (Oct2014-SP3)
CC: ajdamico, @hannesmuehleisen, @njnes
Last updated: 2015-08-28 13:41:54 +0200
Comment 20895
Date: 2015-05-28 09:58:52 +0200
From: Anthony Damico <>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0
Build Identifier:
if you have a column with only two rows, the SAMPLE function does not pick each approximately 50% of the time. something is biasing the SAMPLE draws.
Reproducible: Always
Steps to Reproduce:
here's the sql command.
SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 1;
but you have to run it a thousand times, however you prefer. here's the R code to do that with MonetDB.R--
here's a reproducible example using R code to repeat the sampling 1000 times. in both SAMPLE examples below, the database pulls the 2 less than 200 times out of 1000. shouldn't it be close to 500 out of 1000? this seems not random (misleading to users?) sorry if i'm misunderstanding something.. thank you!!
start in an empty directory somewhere
setwd( "C:/My Directory/MonetDB" )
library(MonetDB.R)
batfile <-
monetdb.server.setup(
database.directory = paste0( getwd() , "/MonetDB" ) ,
monetdb.program.path =
ifelse(
.Platform$OS.type == "windows" ,
"C:/Program Files/MonetDB/MonetDB5" ,
""
) ,
dbname = "test" ,
dbport = 50000
)
pid <- monetdb.server.start( batfile )
db <- dbConnect( MonetDB.R() , "monetdb://localhost:50000/test" , wait = TRUE )
dbGetQuery( db , "SELECT 1 AS col UNION ALL SELECT 2 AS col" )
out <- NULL
for ( i in 1:1000 ){
out <- c( out , dbGetQuery( db , "SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 0.5" ) )
}
not random
table( unlist( out ) )
1 2
880 120
out <- NULL
for ( i in 1:1000 ){
out <- c( out , dbGetQuery( db , "SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 1" ) )
}
ALSO not random
table( unlist( out ) )
1 2
856 144
Actual Results:
in both of the above examples of R output, more than 4 out of 5 draws were 1s instead of 2s.
Expected Results:
half 1s half 2s
Comment 20900
Date: 2015-06-02 13:50:32 +0200
From: MonetDB Mercurial Repository <>
Changeset 91c472eabc48 made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=91c472eabc48
Changeset description:
Comment 20903
Date: 2015-06-03 14:20:44 +0200
From: @hannesmuehleisen
Looking into this, in particular the behaviour of rand() on our beloved Windows.
Comment 20905
Date: 2015-06-03 15:26:08 +0200
From: @hannesmuehleisen
Could not reproduce on Windows 7 Professional (64bits) with MonetDB Oct2014-SP3, both 32 and 64bits. Also could not find any flaw with random number generator on Windows.
Comment 21114
Date: 2015-08-10 20:41:20 +0200
From: Anthony Damico <>
sorry, this has not been fixed on windows using the Jul2015 release
1 2
832 168
1 2
830 170
MonetDB 5 server v11.21.1 "Jul2015"
Serving database 'test', using 8 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 15.914 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
Start processing logs sql/sql_logs version 52200
Finished processing logs sql/sql_logs
MonetDB/SQL module loaded
Comment 21116
Date: 2015-08-11 14:56:41 +0200
From: MonetDB Mercurial Repository <>
Changeset 29e2ba371d58 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=29e2ba371d58
Changeset description:
Comment 21118
Date: 2015-08-11 15:22:27 +0200
From: MonetDB Mercurial Repository <>
Changeset 5c206e0510fe made by Hannes Muehleisen hannes@cwi.nl in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=5c206e0510fe
Changeset description:
Comment 21162
Date: 2015-08-23 19:45:49 +0200
From: Anthony Damico <>
just tested this- looks like it's been solved on the latest testing version. thanks thanks!!
MonetDB 5 server v11.21.3 "Jul2015"
Serving database 'test', using 8 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 15.914 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
Start processing logs sql/sql_logs version 52200
Finished processing logs sql/sql_logs
MonetDB/SQL module loaded
Comment 21193
Date: 2015-08-28 13:41:54 +0200
From: @sjoerdmullender
Jul2015 has been released.
The text was updated successfully, but these errors were encountered: