Date: 2020-06-04 11:17:58 +0200
From: @joerivanruth
To: SQL devs <>
Version: 11.37.7 (Jun2020)
Last updated: 2020-07-27 09:30:11 +0200
Comment 27783
Date: 2020-06-04 11:17:58 +0200
From: @joerivanruth
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0
Build Identifier:
If the write-ahead log becomes so large that it consists of multiple files,
only the last file is included in the snapshot.
Reproducible: Always
Steps to Reproduce:
The attached python script can be used to reproduce. Invoke it as follows:
./demo.py /tmp/demo 150M
This is what happens:
- The script starts an mserver and opens two concurrent connections (a sketch of the connection sequence follows this list).
- connection1 creates a table, adds some data, commits, observes the data it added, and then adds some more data that it leaves uncommitted.
- connection2 observes the committed data and repeatedly adds lots of additional data, which it commits.
- Because connection1's transaction is still ongoing, the committed data ends up in the write-ahead log, which becomes spread over multiple files because it is so large. In my example, the log consists of the files log.4 and log.5.
- hot_snapshot creates a snapshot containing only log.5.
- When the server is started from the restored snapshot, log.4 is missing and log.5 is ignored, so we lose data.
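For concreteness, here is a minimal sketch of that connection sequence using pymonetdb. It is not the attached demo.py; the port, credentials, and row counts are assumptions taken from the transcript below:

import pymonetdb

def connect():
    # Assumes an mserver5 already running on port 50000 with the
    # default monetdb/monetdb credentials.
    return pymonetdb.connect(database='foo', hostname='localhost',
                             port=50000, username='monetdb', password='monetdb')

conn1 = connect()
conn1.set_autocommit(False)
cur1 = conn1.cursor()
cur1.execute("CREATE TABLE foo(i BIGINT)")
cur1.execute("INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 1)")
conn1.commit()
# Leave an uncommitted insert pending so connection1's transaction stays open
# and the write-ahead log cannot be truncated.
cur1.execute("INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 33)")

conn2 = connect()
conn2.set_autocommit(False)
cur2 = conn2.cursor()
# Commit enough data to make the log spill over into multiple files
# (log.4 and log.5 in the transcript).
for n in (150_000_000, 150_000_000, 100):
    cur2.execute(f"INSERT INTO foo SELECT 42 FROM sys.generate_series(0, {n})")
    conn2.commit()

# Take the hot snapshot while connection1's transaction is still open.
cur2.execute(r"CALL sys.hot_snapshot(r'/tmp/demo/snap.tar')")
conn2.commit()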
Actual Results:
This is the output of the script on my system:
Starting ['mserver5', '--dbpath=/tmp/demo/foo', '--set', 'mapi_port=50000']
Agent 1: DROP TABLE IF EXISTS foo
0.002s (0.002s total)
Agent 1: CREATE TABLE foo(i BIGINT)
0.002s (0.004s total)
Agent 1: INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 1)
0.003s (0.007s total)
Agent 1: COMMIT
0.011s (0.018s total)
Agent 1: SELECT COUNT(*) FROM foo
==> 1
0.003s (0.018s total)
Agent 1: INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 33)
0.002s (0.020s total)
Agent 2: SELECT COUNT(*) FROM foo
==> 1
0.002s (0.020s total)
Agent 2: INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 150000000)
2.033s (2.052s total)
Agent 2: COMMIT
1.428s (3.480s total)
Agent 2: INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 150000000)
3.329s (6.809s total)
Agent 2: COMMIT
1.858s (8.667s total)
Agent 2: INSERT INTO foo SELECT 42 FROM sys.generate_series(0, 100)
2.105s (10.772s total)
Agent 2: COMMIT
1.282s (12.053s total)
Agent 2: SELECT COUNT(*) FROM foo
==> 300000101
0.001s (12.053s total)
Contents of /tmp/demo/foo/sql_logs/sql:
log [4] 10
log.4 2400001006
log.5 524288
Starting snapshot
Agent 2: CALL sys.hot_snapshot(r'/tmp/demo/snap.tar')
0.454s (12.507s total)
Killing server
Deleting database dir
Restoring snapshot
Contents of /tmp/demo/foo/sql_logs/sql:
log [4] 10
log.5 524288
Restarting server
Starting ['mserver5', '--dbpath=/tmp/demo/foo', '--set', 'mapi_port=50000']
Counting restored rows
Agent 3: SELECT COUNT(*) FROM foo
MonetDB Error: SELECT: no such table 'foo'
Expected Results:
After restoring the snapshot, table 'foo' should exist and have 300000101 rows.
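One way to confirm which write-ahead log files actually ended up in the archive is to list the tar members. This is a sketch using Python's standard tarfile module, with the snapshot path taken from the transcript above:

import tarfile

# List the write-ahead log files contained in the snapshot archive.
with tarfile.open('/tmp/demo/snap.tar') as tar:
    for member in tar.getmembers():
        if '/sql_logs/' in member.name:
            print(member.name, member.size)

On the affected version this should show the small 'log' catalog file and log.5 but not log.4, matching the restored directory listing in the transcript.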
Comment 27787
Date: 2020-06-04 11:29:54 +0200
From: @joerivanruth
*** Bug #6874 has been marked as a duplicate of this bug. ***
Comment 27789
Date: 2020-06-04 11:30:07 +0200
From: @joerivanruth
*** Bug #6875 has been marked as a duplicate of this bug. ***
Comment 27790
Date: 2020-06-04 11:31:43 +0200
From: @joerivanruth
Created attachment 681
python script that demonstrates the issue