Suspected memory leak in mserver5 when creating/dropping tables #6401

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels: bug, major, MAL/M5

Comments


Date: 2017-09-06 08:08:59 +0200
From: Edvinas <<edvinas.rasys>>
To: MonetDB5 devs <>
Version: 11.33.3 (Apr2019)
CC: arkr17997, edvinas.rasys, kravchenko.anton86, @mlkersten, @njnes

Last updated: 2020-06-03 16:58:53 +0200

Comment 25614

Date: 2017-09-06 08:08:59 +0200
From: Edvinas <<edvinas.rasys>>

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36
Build Identifier:

Our use of MonetDB involves creating and dropping a lot of tables in quick succession. We've observed that our servers continuously run out of memory and hang, most often during such a drop/create/import sequence.

We managed to narrow the scenario down to the create/drop table sequence. This has been tried with both the MonetDB Node.js driver and mclient; both produce the same results.

Reproducible: Always

Steps to Reproduce:

1. monetdb create test
2. monetdb release test
3. Run the following sh script:

#!/bin/sh

DB="test"

for i in $(seq 1 100)
do
    mclient -d $DB <<EOF
create table test_$i ( PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255)
);
drop table test_$i;
EOF
    echo table_$i processed
done

Actual Results:

When observed with htop, the RES value of mserver5 keeps increasing when step 3 is repeated several times.

Expected Results:

Since the tables contain no data and are dropped immediately after being created, the RES value should be the same after the script finishes as it was before it started.
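
A minimal sketch of how the RES growth could be tracked without htop, assuming a single mserver5 process is running; the reproduce.sh name stands in for the script from step 3 and is illustrative:

#!/bin/sh
# Report the resident set size (VmRSS) of the mserver5 process before and
# after the create/drop loop, using /proc (assumes a single mserver5 process).

rss_of_mserver5() {
    grep VmRSS /proc/$(pidof mserver5)/status
}

rss_of_mserver5        # e.g. VmRSS:   107192 kB
./reproduce.sh         # the create/drop loop from step 3
rss_of_mserver5        # ideally unchanged; in practice it keeps growing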

This issue has been observed on MonetDB versions 11.23.13, 11.25.23, and 11.27.5, running on Ubuntu Linux 14.04.

Currently tested on:
mserver5 --version
MonetDB 5 server v11.27.5 "Jul2017-SP1" (64-bit, 128-bit integers)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2017 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 2.0GiB available memory, 2 available cpu cores
Libraries:
libpcre: 8.31 2012-07-06 (compiled with 8.31)
openssl: OpenSSL 1.0.1f 6 Jan 2014 (compiled with )
libxml2: 2.9.1 (compiled with 2.9.1)
Compiled by: root@dev.monetdb.org (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -g -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64

Not sure if this is useful, but here is some debugger info from our servers when the memory is exhausted:

(gdb) where
#0  0x00007f6819301c53 in gethostid () at ../sysdeps/unix/sysv/linux/gethostid.c:101
Cannot access memory at address 0x8

(gdb) thr app all bt

Thread 374 (Thread 0x7f680f7d4700 (LWP 6381)):
#0  0x00007f6819301c53 in gethostid () at ../sysdeps/unix/sysv/linux/gethostid.c:101
#1  0x0000000000000000 in ?? ()

Thread 373 (Thread 0x7f680f5d3700 (LWP 6363)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00007f681a074600 in ?? () from /usr/lib/libmonetdb5.so.23
#2  0x00007f68195dd184 in start_thread (arg=0x7f680f5d3700) at pthread_create.c:312
#3  0x00007f681930a37d in eventfd (count=-129114208, flags=0) at ../sysdeps/unix/sysv/linux/eventfd.c:42
#4  0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7f68130ef700 (LWP 1151)):
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x21:
#0  0x00007f6819301c53 in gethostid () at ../sysdeps/unix/sysv/linux/gethostid.c:101
Cannot access memory at address 0x21
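
A minimal sketch of how a fuller trace could be captured non-interactively the next time the server hangs, assuming gdb is available on the host; the output filename is illustrative:

#!/bin/sh
# Attach gdb to the running mserver5 and dump backtraces of all threads.
pid=$(pidof mserver5)
gdb -p "$pid" -batch -ex "thread apply all bt" > mserver5-backtrace.txt 2>&1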

Comment 25920

Date: 2017-11-27 12:57:02 +0100
From: @sjoerdmullender

The problem is most likely in the storage allocator for the gtrans structure that just keeps on growing.

Comment 26139

Date: 2018-01-31 08:46:15 +0100
From: Edvinas <<edvinas.rasys>>

Is this issue being looked at? I'm not able to see any reference to it in the recent commits or release notes.

Comment 26199

Date: 2018-02-15 10:43:08 +0100
From: Edvinas <<edvinas.rasys>>

Retested with 11.27.13 and this is still an issue for us. It is difficult to observe with 100 table rebuilds as in the proposed test case, but with 10,000 rebuilds (for i in $(seq 1 10000)) it appears to leak around 30-40 MB. The leak does not appear to depend on the length of the columns, but it does depend on the number of columns:

10k table rebuilds - 5 columns (varchar 255) ~ 30-40MB
10k table rebuilds - 10 columns (varchar 255) ~ 60MB
10k table rebuilds - 10 columns (varchar 500) ~ 60MB
10k table rebuilds - 20 columns (varchar 500) ~ 110MB

This averages out to around 0.5-0.6 KB per column per rebuild.

Our typical production rebuild scenario involves around 40 tables, each with 2000 columns, so extrapolating gives a leak of around 40-50 MB per rebuild. Our real-world observations appear to be higher than that, but this issue appears to be the main cause.
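
A minimal sketch of how the per-column behaviour could be reproduced, assuming the same database and mclient setup as the original script; NCOLS, NRUNS, and the generated column names are illustrative:

#!/bin/sh
# Rebuild a table NRUNS times with NCOLS varchar columns and report the
# VmRSS of mserver5 before and after the loop.
DB="test"
NCOLS=${1:-10}
NRUNS=${2:-10000}

cols="c0 varchar(255)"
i=1
while [ $i -lt $NCOLS ]
do
    cols="$cols, c$i varchar(255)"
    i=$((i + 1))
done

grep VmRSS /proc/$(pidof mserver5)/status

i=1
while [ $i -le $NRUNS ]
do
    mclient -d $DB <<EOF
create table test_$i ( $cols );
drop table test_$i;
EOF
    i=$((i + 1))
done

grep VmRSS /proc/$(pidof mserver5)/status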

Comment 26468

Date: 2018-05-25 21:13:58 +0200
From: Anton Kravchenko <<kravchenko.anton86>>

(In reply to Sjoerd Mullender from comment 1)

> The problem is most likely in the storage allocator for the gtrans structure
> that just keeps on growing.

I hit this memory leak too. Have you tried to test your hypothesis (by printing the size of the storage allocator for the gtrans while creating/dropping tables)?

Comment 26928

Date: 2019-03-19 00:05:34 +0100
From: @mlkersten

Any progress on this issue for the upcoming Apr19 release?

Comment 26935

Date: 2019-03-19 16:25:28 +0100
From: @mlkersten

Running the script on the default branch in March 2019 with a 10K sequence does not show any increase in RES size. Run on Fedora. Closing the report for now.

Comment 27116

Date: 2019-07-09 14:30:25 +0200
From: Edvinas <<edvinas.rasys>>

I've run the script with the latest (11.33.3) version on

Ubuntu 16.04
Ubuntu 18.04
CentOS 7.6.1810

and the issue is still present. When observed in htop, both the VIRT and RES sizes just keep growing.

Comment 27183

Date: 2019-07-28 14:57:45 +0200
From: @mlkersten

Ran the script against the default branch on July 27.
It made 10K connections, each creating and dropping a table.

23297 mk 20 0 1211360 107192 45752 S 0.0 0.3 0:13.88 mserver5
...
23297 mk 20 0 1211360 145036 45752 S 0.5 0.4 0:50.81 mserver5

It shows that the RES size increased by
145036 - 107192 = 37844 KB -> 3.7 KB/test

Comment 27587

Date: 2020-03-14 22:41:39 +0100
From: @mlkersten

Ran the script with 1000 connections against the default branch on March 14, 2020:

22065 mk 20 0 1139068 86196 41416 S 21.5 0.3 0:01.78 mserver5
22065 mk 20 0 1139068 89556 41416 S 0.3 0.3 0:05.14 mserver5

3360 K for 1000 connections.

It shows that the RES size increased by
89556 - 86196 = 3360 KB -> 3.3 KB/test

Comment 27588

Date: 2020-03-14 22:48:52 +0100
From: @mlkersten

And for a 10,000 run we have:

25546 mk 20 0 1136608 84416 41888 S 20.8 0.3 0:00.87 mserver5
25546 mk 20 0 1138660 130988 41888 S 14.6 0.4 0:50.21 mserver5

130988 - 84416 = 46572 KB -> 4.6 KB/test

This could be the basis for a torture test.
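
A minimal sketch of what such a torture test could look like, combining the create/drop loop with a /proc-based RSS check; the 10 MB growth threshold and the table/column names are illustrative choices, not a project convention:

#!/bin/sh
# Torture test: rebuild a table many times and fail if the RSS of mserver5
# grows by more than an (arbitrary) threshold.
DB="test"
RUNS=10000
LIMIT_KB=10240   # illustrative threshold: 10 MB

rss() { awk '/VmRSS/ {print $2}' /proc/$(pidof mserver5)/status; }

before=$(rss)
i=1
while [ $i -le $RUNS ]
do
    mclient -d $DB <<EOF
create table torture_$i (PersonID int, LastName varchar(255));
drop table torture_$i;
EOF
    i=$((i + 1))
done
after=$(rss)

echo "RSS before: ${before} KB, after: ${after} KB"
if [ $((after - before)) -gt $LIMIT_KB ]; then
    echo "FAIL: RSS grew by $((after - before)) KB"
    exit 1
fi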

Comment 27634

Date: 2020-03-29 19:08:21 +0200
From: MonetDB Mercurial Repository <>

Changeset 90bdae54a2ef made by Niels Nes niels@cwi.nl in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=90bdae54a2ef

Changeset description:

do a deep copy of the global transaction structure, within the store_apply_delta's, when no other 'sessions' are running.
This frees any data leftover from previous drops solving bug #6401.
Current setting is extreme (ie always if possible). We may need to have a tunable for this.

Comment 27635

Date: 2020-03-29 19:09:04 +0200
From: @njnes

The leak is now fully fixed by doing a deep copy of the central structures.
