ringo
March 2, 2025, 7:44am
1
100,000 entries, running each operation on every entry:
1062ms add
1477ms search
9ms add (no rehash)
9ms search (no rehash)
2430ms delete <- I don't know why this takes so long
Is this a spatial hash table for objects placed in 2D,
or is it just a hash table implementing an ordinary dictionary/set/map data structure?
Did you write it yourself, or are you using some existing library? Which is it?
I have a hypothesis, but it's not useful to guess yet without more information.
Did you try to profile your code? Do you know how? Which platform are you using?
I don't know why your deletes are slow without seeing the source code, but here's what SDL3's hashtable looks like:
/*
Simple DirectMedia Layer
Copyright (C) 1997-2025 Sam Lantinga <slouken@libsdl.org>
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
(The rest of the file has been truncated.)
A discussion on how it works is here:
main ← taisei-project:robin-hood-hashtable (opened 09:39 PM, 18 Sep 24 UTC)
The original SDL hashtable implementation is pretty naive. Every key-value pair is stored in a separate allocation, and collisions are resolved via a linked list. While this scheme is very simple to implement, it's bad for cache locality (and therefore performance).
This new implementation uses an open-addressing scheme, which places every pair in one contiguous array instead of separate buckets. It automatically grows as the load factor increases, which was a TODO in the old code. Linear probing is used to resolve collisions, which keeps colliding items close in memory. [Robin hood hashing](https://programming.guide/robin-hood-hashing.html) is used to greatly reduce variance in the probe sequence length across all items. The backward shifting optimization for deletions (described at the previous link) is also implemented, but "smart search" is not.
This is a very versatile hashtable optimized for lookup performance. I originally wrote this for Taisei Project, where it served us well in hot paths for years. The motivation for porting it was to speed up some hash lookups in the Vulkan GPU backend.
It's definitely not the most sophisticated or optimal algorithm, but I think it's a good balance between performance and implementation simplicity. The main thing holding it back, though, is that SDL's hashtables are not templated, so we're stuck with pointer-sized keys and values. It's often just barely not enough, requiring us to malloc the key or [do other silly things](https://github.com/libsdl-org/SDL/blob/7a924b36aeb3b9a068ec09266069f36becb5a57c/src/render/gpu/SDL_pipeline_gpu.c#L195). This throws a wrench into all the cache-locality goodness, though doing less pointer chasing is still beneficial in the end.
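The backward-shift deletion mentioned in that PR is the part most relevant to slow deletes, so here is a minimal sketch of linear probing with backward-shift deletion, assuming uint32_t keys and values and a power-of-two capacity. This illustrates the technique; it is not SDL's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

#define CAPACITY 16 /* must be a power of two for the mask trick */

typedef struct {
    uint32_t key;
    uint32_t value;
    bool occupied;
} Slot;

typedef struct {
    Slot slots[CAPACITY];
} Table;

static uint32_t hash_u32(uint32_t key) {
    /* any decent integer mixer works here */
    key ^= key >> 16;
    key *= 0x45d9f3bu;
    key ^= key >> 16;
    return key;
}

/* How far the item in slot i sits from its ideal slot. */
static uint32_t probe_distance(const Table *t, uint32_t i) {
    uint32_t ideal = hash_u32(t->slots[i].key) & (CAPACITY - 1);
    return (i - ideal) & (CAPACITY - 1);
}

/* Assumes the table is never completely full (real code tracks
 * the load factor and grows before that can happen). */
static bool table_remove(Table *t, uint32_t key) {
    uint32_t i = hash_u32(key) & (CAPACITY - 1);
    while (t->slots[i].occupied && t->slots[i].key != key) {
        i = (i + 1) & (CAPACITY - 1); /* linear probing */
    }
    if (!t->slots[i].occupied) {
        return false; /* key not present */
    }
    /* Backward shift: pull the following items one slot back
     * until we hit an empty slot or an item that is already
     * in its ideal position. No tombstones needed. */
    for (;;) {
        uint32_t next = (i + 1) & (CAPACITY - 1);
        if (!t->slots[next].occupied || probe_distance(t, next) == 0) {
            t->slots[i].occupied = false;
            return true;
        }
        t->slots[i] = t->slots[next];
        i = next;
    }
}
```

The point of the backward shift is that lookups never have to skip over deleted markers, so delete-heavy workloads don't degrade the way tombstone schemes do.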
ringo
March 3, 2025, 4:07am
4
Both.
It's an expansion of a C structure for objects, e.g. equipment, HP, etc.; the original structure only has common values like position and size.
It's also a dictionary for callbacks and events, and callback arguments use it too.
Yes, I wrote it myself, except for wyhash.
I guess only one loop causes this issue, but I don't know what the actual cause is.
Just deleting.
The platform is Windows 11, 64-bit.
ringo
March 3, 2025, 4:12am
5
I just realized I couldn't benchmark no-rehash adding properly.
Adding with no rehash actually takes 350ms.
Do hash tables generally take this long?
ringo
March 3, 2025, 5:00am
6
I looked at it; that code is very beautiful and simple.
I think I'm making a simple thing more complicated.
I wrote my hash table with chaining, but instead of a linked list each bucket is a uint8_t *array plus a uint8_t num count,
and I store all the data in it: key, value, value_datasize.
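Based on that description, a bucket might look something like the sketch below. The names and the exact entry layout are my assumptions for illustration, not ringo's actual code:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *data; /* packed entries: [key][value_size][value] ... */
    uint8_t  num;  /* number of entries in this bucket */
    size_t   used; /* bytes used in data */
    size_t   cap;  /* bytes allocated for data */
} Bucket;

/* Append one entry: a fixed-size uint64_t key followed by a
 * variable-size value blob, matching the "key, value,
 * value_datasize" description. Returns 0 on success. */
static int bucket_add(Bucket *b, uint64_t key,
                      const void *value, uint32_t value_size) {
    size_t need = sizeof key + sizeof value_size + value_size;
    if (b->used + need > b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 64;
        while (new_cap < b->used + need) new_cap *= 2;
        uint8_t *p = realloc(b->data, new_cap);
        if (!p) return -1;
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->used, &key, sizeof key);
    memcpy(b->data + b->used + sizeof key, &value_size, sizeof value_size);
    memcpy(b->data + b->used + sizeof key + sizeof value_size,
           value, value_size);
    b->used += need;
    b->num++;
    return 0;
}
```

One caveat with variable-size entries in a packed array: deleting from the middle means shifting all the bytes after it, which is another potential O(n) cost per delete.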
ringo
March 3, 2025, 5:32am
7
With 10k entries:
time add 24
time search 1
time delete 30
time add no rehash 5
time search no rehash 1
I understand that n operations at O(1) each should be O(1) * n,
but I don't know why it behaves like O(n) * n.
A cache locality problem? Hmm…
Just use a profiler instead of wondering what is going on.
OK buddy, but what is your algorithm exactly? Or code?
On Windows, you can profile your code in Visual Studio on a line-by-line basis.
jamesL
March 6, 2025, 7:05am
10
Or try stb_ds.h:
type-safe dynamic arrays and hash tables for C; will compile in C++.
From:
stb single-file public domain libraries for C/C++
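For a taste of what using it looks like, here's a minimal example with stb_ds's integer-keyed hash map macros (these are part of the library's documented API):

```c
#include <stdio.h>

#define STB_DS_IMPLEMENTATION
#include "stb_ds.h"

int main(void) {
    /* the map is just a pointer that starts out NULL */
    struct { int key; float value; } *map = NULL;

    hmput(map, 42, 3.5f);        /* insert or update */
    float v = hmget(map, 42);    /* lookup (returns 0 if absent) */
    printf("%g, %d entries\n", v, (int)hmlen(map));

    hmdel(map, 42);              /* delete by key */
    hmfree(map);                 /* free the whole map */
    return 0;
}
```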
ringo
March 7, 2025, 2:59pm
11
I've been busy recently, sorry for the late reply.
I did some profiling with gprof and realized my hash table isn't the problem;
my allocator is the problem.
My free() takes 80% of the time.
My allocator keeps a stack of the pointers currently in use,
and freeing searches that stack for the specific pointer, which takes O(n). Alloc takes O(1).
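For illustration, this is the pattern being described as I understand it (a reconstruction, not the actual allocator): live pointers tracked in an array used as a stack, so free() has to scan for the pointer before removing it:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 100000

typedef struct {
    void  *live[MAX_LIVE]; /* pointers currently in use */
    size_t count;
} PtrStack;

static void *stack_alloc(PtrStack *s, size_t size) {
    void *p = malloc(size);
    if (p) s->live[s->count++] = p; /* O(1): push on top */
    return p;
}

static void stack_free(PtrStack *s, void *p) {
    /* O(n): linear search for the pointer. With ~100,000 live
     * allocations, this scan can easily dominate the profile. */
    for (size_t i = 0; i < s->count; i++) {
        if (s->live[i] == p) {
            /* shift everything above it down (also O(n)) */
            for (size_t j = i + 1; j < s->count; j++)
                s->live[j - 1] = s->live[j];
            s->count--;
            free(p);
            return;
        }
    }
}
```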
I should change this
ringo
March 7, 2025, 9:26pm
12
OK, I no longer use a stack, just an array.
Now my freeing only takes O(1).
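One common way to make free O(1) in a scheme like this is to store each allocation's array index in a small header and swap with the last entry on removal. This is a guess at the kind of fix involved, not the actual code:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 100000

typedef struct {
    size_t index; /* this allocation's slot in the live array */
    /* user data follows the header */
} Header;

typedef struct {
    Header *live[MAX_LIVE];
    size_t  count;
} Pool;

static void *pool_alloc(Pool *p, size_t size) {
    Header *h = malloc(sizeof(Header) + size);
    if (!h) return NULL;
    h->index = p->count;
    p->live[p->count++] = h;
    return h + 1; /* hand out the memory just past the header */
}

static void pool_free(Pool *p, void *ptr) {
    if (!ptr) return;
    Header *h = (Header *)ptr - 1;
    /* O(1): move the last entry into the freed slot, fix its index */
    Header *last = p->live[--p->count];
    p->live[h->index] = last;
    last->index = h->index;
    free(h);
}
```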