thash files not released upon drop table #6863

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels
bug Something isn't working normal SQL

Comments

@monetdb-team

Date: 2020-05-12 17:13:06 +0200
From: @swingbit
To: SQL devs <>
Version: 11.35.19 (Nov2019-SP3)
CC: @yzchang

Last updated: 2020-06-09 12:07:56 +0200

Comment 27704

Date: 2020-05-12 17:13:06 +0200
From: @swingbit

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36
Build Identifier:

In the following:

  • I trigger the creation of a hash on a column.
  • I verify that the thash file is written on disk
  • I drop the table
  • I verify that all the column files are deleted, except the thash file

sql>create table t(s string);
operation successful

sql>copy into t from '100kuuids.txt' on client;
100000 affected rows

-- no thash file yet
sql>select schema,table,column,location,count,hashes,phash from sys.storage('sys','t');
+--------+-------+--------+----------+--------+--------+-------+
| schema | table | column | location | count  | hashes | phash |
+========+=======+========+==========+========+========+=======+
| sys    | t     | s      | 06/600   | 100000 |      0 | false |
+--------+-------+--------+----------+--------+--------+-------+
1 tuple

-rw-------. 1 roberto roberto 448K May 12 17:01 06/600.tail
-rw-------. 1 roberto roberto 3.8M May 12 17:02 06/600.theap

-- this select triggers the creation of a hash on column t.s
sql>select * from t where s='e40ee8bd-9a6b-4c4e-b8a2-c103aa1a0aa3';
+--------------------------------------+
| s                                    |
+======================================+
| e40ee8bd-9a6b-4c4e-b8a2-c103aa1a0aa3 |
+--------------------------------------+
1 tuple

sql>select schema,table,column,location,count,hashes,phash from sys.storage('sys','t');
+--------+-------+--------+----------+--------+--------+-------+
| schema | table | column | location | count  | hashes | phash |
+========+=======+========+==========+========+========+=======+
| sys    | t     | s      | 06/600   | 100000 | 983088 | false |
+--------+-------+--------+----------+--------+--------+-------+
1 tuple

-rw-------. 1 roberto roberto 448K May 12 17:01 06/600.tail
-rw-------. 1 roberto roberto 1.0M May 12 17:03 06/600.thash
-rw-------. 1 roberto roberto 3.8M May 12 17:02 06/600.theap

sql>drop table t;
operation successful

-rw-------. 1 roberto roberto 1.0M May 12 17:03 06/600.thash

The leftover thash file is deleted only on a server restart.

The issue is that these files can be rather large, so this quickly becomes unworkable in a scenario where the server needs to stay up for a long time and data is regularly replaced (tables dropped and re-created).
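
To keep an eye on how much disk space such leftovers take while the server stays up, something along these lines could be used. It is only a sketch: it assumes the file layout shown in the listings above (a column's files share one base name, e.g. 06/600.tail and 06/600.thash), and the function name find_orphan_thash is made up for this example, it is not part of MonetDB.

```python
from pathlib import Path


def find_orphan_thash(bat_dir):
    """Return (path, size_in_bytes) for every .thash file without a sibling .tail file.

    Assumption: a hash file whose column was dropped has lost its .tail
    sibling, as in the directory listings in this report.
    """
    orphans = []
    for thash in sorted(Path(bat_dir).rglob("*.thash")):
        # 06/600.thash -> 06/600.tail
        if not thash.with_suffix(".tail").exists():
            orphans.append((thash, thash.stat().st_size))
    return orphans


def total_orphan_bytes(bat_dir):
    """Sum the sizes of all orphaned .thash files under bat_dir."""
    return sum(size for _, size in find_orphan_thash(bat_dir))
```

Calling total_orphan_bytes() on the directory that holds the column files (the exact path depends on the installation) gives the number of bytes that would only be reclaimed by a restart.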

Reproducible: Always

Comment 27705

Date: 2020-05-12 17:16:46 +0200
From: @yzchang

Hai Roberto,

Just very quickly:

  • do you still have the hash file after letting mserver5 "rest" for some time?
  • do you still have the hash file after restarting mserver5?

=> i.e. mserver5 doesn't clean up the hash file in any of its later clean up actions.

Comment 27706

Date: 2020-05-12 17:28:04 +0200
From: @swingbit

Hi Jennie,

The thash file stays there indefinitely if not restarting.

It is removed at the next mserver5 restart.

Comment 27707

Date: 2020-05-12 17:40:39 +0200
From: @yzchang

thanks. at least we got it right by restart. but it should have been removed by drop table. i'm going to test this in the RC Jun2020.

Comment 27708

Date: 2020-05-12 19:00:27 +0200
From: MonetDB Mercurial Repository <>

Changeset c174d01fcc3f made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=c174d01fcc3f

Changeset description:

Destroy index files when a BAT is destroyed.
This should fix bug #6863.
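
The idea in the changeset description can be illustrated with a toy model. This is not MonetDB's C code and not the actual patch: the function destroy_bat_files, its include_indexes flag, and the extension lists are hypothetical names for this sketch, and only the .tail, .theap and .thash extensions are taken from the report.

```python
from pathlib import Path

# Extensions seen in the report; the split into "data" and "index" is an
# assumption made for this sketch.
DATA_EXTENSIONS = (".tail", ".theap")
INDEX_EXTENSIONS = (".thash",)


def destroy_bat_files(base, include_indexes=True):
    """Delete the files sharing the base path `base` and return the removed paths.

    With include_indexes=False the .thash file is left behind, which mirrors
    the behaviour described in this bug report.
    """
    extensions = DATA_EXTENSIONS + (INDEX_EXTENSIONS if include_indexes else ())
    removed = []
    for ext in extensions:
        path = Path(str(base) + ext)
        if path.exists():
            path.unlink()
            removed.append(path)
    return removed
```

In this model, destroying a column with base path 06/600 removes 600.tail, 600.theap and 600.thash in one step, so nothing has to wait for a restart.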

Comment 27709

Date: 2020-05-12 19:58:10 +0200
From: @swingbit

I've back-ported the fix to Nov2019 and tried.
Unfortunately I don't see any difference. The thash file still isn't removed.

Comment 27710

Date: 2020-05-12 20:20:52 +0200
From: @sjoerdmullender

You should only need the change in gdk_storage.c, not the one in gdk_hash.c.

I did see some different behavior when I was testing, but I couldn't put my finger on it. What I did, did fix it in at least one test I did.

Comment 27711

Date: 2020-05-12 20:25:12 +0200
From: @swingbit

Yes! That was it.
I reverted the change in gdk_hash.c and now it works as expected.
Thanks

Comment 27804

Date: 2020-06-09 12:07:56 +0200
From: MonetDB Mercurial Repository <>

Changeset 3d1bffd73ea2 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=3d1bffd73ea2

Changeset description:

Destroy index files when a BAT is destroyed.
This should fix bug #6863.

@monetdb-team added the bug, normal and SQL labels on Nov 30, 2020
@sjoerdmullender added this to the Ancient Release milestone on Feb 7, 2022