Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A simple way of speeding up impscheck for dense canditers #6847

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

A simple way of speeding up impscheck for dense canditers #6847

monetdb-team opened this issue Nov 30, 2020 · 0 comments
Labels
enhancement New feature or request GDK Kernel

Comments

@monetdb-team
Copy link

Date: 2020-04-06 02:24:45 +0200
From: Nenya Edjah <<nenya_edjah>>
To: GDK devs <>
Version: -- development

Last updated: 2020-06-03 16:58:52 +0200

Comment 27647

Date: 2020-04-06 02:24:45 +0200
From: Nenya Edjah <<nenya_edjah>>

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36
Build Identifier:

In the impscheck() macro, there is an:

 if (im[icnt] & mask)

check to determine if the corresponding segments of the data should be scanned.

In the else case, we essentially skip over those irrelevant cache lines with a while loop:

 while (p < ci->ncand && o < e) {         
     p++;                                 
     o = canditer_next(ci);               
 }                                        

For dense canditers, this is inefficient since the canditer_next(ci) operation basically just does some arithmetic operations. So, it doesn't make sense to waste CPU cycles by iterating through a while loop. Especially if the amount of data being skipped is large.

A simple optimization is to replace the snippet above with the following:

 if (ci->tpe == cand_dense) {                 
     BUN skip_sz = MIN(ci->ncand - p, e - o); 
     p += skip_sz;                            
     o += skip_sz;                            
     ci->next += skip_sz;                     
 } else {                                     
     while (p < ci->ncand && o < e) {         
         p++;                                 
         o = canditer_next(ci);               
     }                                        
 }                                            

I implemented this change and ran a test on a nearly sorted column with 100m elements. When using a single CPU, this reduces the average query latency from 140ms to 4ms.

Since this is simple change, I just wanted to let you guys know about it in case you wanted to get it implemented in a future version MonetDB.

Best,
Nenya

Reproducible: Always

Comment 27648

Date: 2020-04-06 09:50:54 +0200
From: MonetDB Mercurial Repository <>

Changeset b0e2acf99f2f made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https//devmonetdborg/hg/MonetDB?cmd=changeset;node=b0e2acf99f2f

Changeset description:

Speed up imprints check for dense candidate lists.
Thanks lots to Nenya Edjah for the report and the core of the change.
This fixes bug #6847.

Comment 27649

Date: 2020-04-06 09:52:17 +0200
From: @sjoerdmullender

Thanks for the analysis and patch.
As you can see, I made some bigger changes, but in essence, it's your patch.

Comment 27650

Date: 2020-04-06 12:27:47 +0200
From: Nenya Edjah <<nenya_edjah>>

:D

@monetdb-team monetdb-team added enhancement New feature or request GDK Kernel labels Nov 30, 2020
@sjoerdmullender sjoerdmullender added this to the Ancient Release milestone Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request GDK Kernel
Projects
None yet
Development

No branches or pull requests

2 participants