Count string rows in union of string tables leaks (RSS) memory #6925

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels
bug Something isn't working normal SQL

Comments

@monetdb-team

Date: 2020-07-13 18:18:38 +0200
From: jpastuszek
To: SQL devs <>
Version: 11.37.7 (Jun2020)

Last updated: 2020-07-27 09:30:13 +0200

Comment 27905

Date: 2020-07-13 18:18:38 +0200
From: jpastuszek

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Build Identifier:

This query, when run a couple of times, causes MonetDB (mserver5) to use up all 30 GiB of RAM and either become very slow (throttling) or crash with an allocation error:

SELECT count(*) FROM (SELECT message FROM logs.message_20200326
UNION SELECT message FROM logs.message_20200327
UNION SELECT message FROM logs.message_20200328
UNION SELECT message FROM logs.message_20200329
UNION SELECT message FROM logs.message_20200330
UNION SELECT message FROM logs.message_20200331
UNION SELECT message FROM logs.message_20200401
UNION SELECT message FROM logs.message_20200402
UNION SELECT message FROM logs.message_20200403
UNION SELECT message FROM logs.message_20200404
UNION SELECT message FROM logs.message_20200405
UNION SELECT message FROM logs.message_20200406
UNION SELECT message FROM logs.message_20200407) d

The message column is of type STRING and holds between 0 and ~32k characters (average 362); on average, each row has about 5 other rows as duplicates.

When I run this query, the server spends its time (according to the "perf top" command) in:

36.55% [.] strHash
4.70% [.] strPut
2.97% [.] BATgroup_internal
1.04% [.] strCmp
0.66% [.] VarHeapValRaw
0.15% [.] MT_thread_setworking
0.06% [.] pthread_getspecific@plt
0.06% [.] insert_string_bat
0.04% [.] BATproject2
0.01% [.] strcmp@plt
0.00% [.] BATcreatedesc
0.00% [.] HASHmask
0.00% [.] getBBPsize

From the query plan, it looks like it runs "group.groupdone" (BATgroup_internal) on the tables.

A single run of the query counts 1353683 rows and uses around 9 GB of RSS memory, which is not freed when the query completes. With every subsequent run of the same query, the mserver5 process consumes an additional 9 GB of RSS memory.

Reproducible: Always

Steps to Reproduce:

  1. Run the count(*) query on a union of large tables with a string column.
  2. Observe the RSS memory usage of the mserver5 process.
  3. Keep repeating the query until the server dies.

Actual Results:

Server runs out of memory.

Expected Results:

MonetDB should free memory after each run of the query.

The query with only two tables in the UNION does not leak. With three tables it does, and the leak grows proportionally with each table added.
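Step 2 of the reproduction steps (observing RSS between runs) can be automated. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line. The function names are hypothetical, and `run_query` is any callable that executes the query through whichever client is in use; nothing here is part of MonetDB itself.

```python
import re


def parse_vmrss_kib(status_text):
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())


def rss_after_each_run(run_query, read_rss, runs):
    """Call run_query() `runs` times and record the RSS after each run."""
    samples = []
    for _ in range(runs):
        run_query()
        samples.append(read_rss())
    return samples
```

With the mserver5 PID in hand, `rss_after_each_run(run_query, lambda: read_rss_kib(pid), 5)` returns one sample per run. Samples that keep climbing by a similar amount would match the behaviour described in this report.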

Comment 27906

Date: 2020-07-14 09:41:07 +0200
From: MonetDB Mercurial Repository <>

Changeset f5070ee98dcb made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=f5070ee98dcb

Changeset description:

When resetting a view, make sure the string heap is not shared again.
This fixes bug #6925.

Comment 27907

Date: 2020-07-14 11:05:50 +0200
From: @sjoerdmullender

The problem was caused by a reference counting issue. A BAT that wasn't actually used anymore at the end of the query kept a reference count of 1 and so was not destroyed. The fix I pushed corrects that. I checked: after a run of the query, the set of BATs that are still in use now remains completely unchanged, so no more leaks here.
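The mechanism described here can be illustrated with a toy model. This is only a sketch of reference counting in general, not MonetDB's buffer pool code: the class, the method names, and the `tmp_N` object names are all hypothetical. It shows how one missing release leaves an object alive after each query, and how comparing the set of live objects before and after a run exposes the leak.

```python
class ToyPool:
    """Toy reference-counted pool (illustration only, not MonetDB code)."""

    def __init__(self):
        self.refcount = {}

    def acquire(self, name):
        # Take one reference, creating the object on first use.
        self.refcount[name] = self.refcount.get(name, 0) + 1

    def release(self, name):
        # Drop one reference; destroy the object when none remain.
        self.refcount[name] -= 1
        if self.refcount[name] == 0:
            del self.refcount[name]

    def live(self):
        # Names of all objects that still hold at least one reference.
        return set(self.refcount)


def run_query(pool, run_id, forget_release):
    """Simulate a query that creates one temporary object with two holders."""
    name = f"tmp_{run_id}"
    pool.acquire(name)  # the query's own reference
    pool.acquire(name)  # a second holder, e.g. a view sharing the data
    pool.release(name)
    if not forget_release:
        pool.release(name)  # without this, the count stays at 1 forever
```

Running `run_query` with `forget_release=True` adds one entry to `pool.live()` per run, while `forget_release=False` leaves the set unchanged.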

Comment 27908

Date: 2020-07-14 15:34:38 +0200
From: MonetDB Mercurial Repository <>

Changeset 6f7dc9f47480 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=6f7dc9f47480

Changeset description:

When resetting a view, make sure the string heap is not shared again.
This fixes bug #6925.

Comment 27910

Date: 2020-07-16 17:25:07 +0200
From: jpastuszek

Sjoerd,

I have tested your fix on my setup and I still observe the memory leaking.

I have prepared this test to reproduce the leak:

CREATE TEMPORARY TABLE dat AS
SELECT DISTINCT func AS s FROM sys.functions ORDER BY length(func) DESC SAMPLE 100 SEED 0
ON COMMIT PRESERVE ROWS;

CREATE TEMPORARY TABLE t AS
SELECT d.s || ' ' || d2.s || ' ' || d3.s AS s
FROM dat d
CROSS JOIN dat d2
CROSS JOIN dat d3
ON COMMIT PRESERVE ROWS;

CREATE TEMPORARY TABLE t2 AS SELECT 't2 ' || s AS s FROM t ON COMMIT PRESERVE ROWS;
CREATE TEMPORARY TABLE t3 AS SELECT 't3 ' || s AS s FROM t ON COMMIT PRESERVE ROWS;
CREATE TEMPORARY TABLE t4 AS SELECT 't4 ' || s AS s FROM t ON COMMIT PRESERVE ROWS;
CREATE TEMPORARY TABLE t5 AS SELECT 't5 ' || s AS s FROM t ON COMMIT PRESERVE ROWS;

Now, if you run the following query multiple times, the DB will eventually run out of memory and fail at allocation:

SELECT count(*) FROM (SELECT s FROM t UNION SELECT s FROM t2 UNION SELECT s FROM t3 UNION SELECT s FROM t4 UNION SELECT s FROM t5) d;

I hope this is helpful,
Jakub
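To check the earlier observation (two tables do not leak, three or more do) against these test tables, the same query can be generated for a varying number of tables. This is a small sketch with a hypothetical helper name; it only builds the SQL text and assumes the table and column names come from a trusted source, since they are interpolated directly.

```python
def union_count_query(tables, column="s"):
    """Build a count(*) over the UNION of `column` from each table."""
    if not tables:
        raise ValueError("at least one table is required")
    selects = [f"SELECT {column} FROM {table}" for table in tables]
    return "SELECT count(*) FROM (" + " UNION ".join(selects) + ") d;"


# Queries over the first 2, 3, 4 and 5 test tables.
all_tables = ["t", "t2", "t3", "t4", "t5"]
queries = [union_count_query(all_tables[:n]) for n in range(2, 6)]
```

The last entry of `queries` is identical to the five-table query above, and the first one is the two-table variant that was reported not to leak.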

Comment 27912

Date: 2020-07-16 20:16:23 +0200
From: @sjoerdmullender

There was another leak that I fixed in changeset 5d976ee54e97.
With that fix, do you still have a problem?
I'm also going to try your latest test myself of course.

Comment 27913

Date: 2020-07-16 20:24:47 +0200
From: @sjoerdmullender

I don't see any leaked BATs and I see the virtual memory size returning to the exact same value between queries. So it looks pretty good, I'd say.

Comment 27917

Date: 2020-07-17 16:32:37 +0200
From: jpastuszek

OK, adding the second patch does the trick. No more leak. Thank you!

Comment 27918

Date: 2020-07-17 16:53:33 +0200
From: @sjoerdmullender

Great. Thanks for reporting and testing.

@monetdb-team monetdb-team added bug Something isn't working normal SQL labels Nov 30, 2020
@sjoerdmullender sjoerdmullender added this to the Ancient Release milestone Feb 7, 2022