ringo
March 2, 2025, 7:44am
1
100,000 entries, running each operation on every entry:
1062ms add
1477ms search
9ms add (no rehash)
9ms search (no rehash)
2430ms delete <- I don't know why this takes so long
Is this a spatial hash table for objects placed in 2D,
or is it just a hash table implementing an ordinary dictionary/set/map data structure?
Did you write it yourself, or are you using some existing library? Which is it?
I have a hypothesis, but it's not useful to guess yet without more information.
Did you try to profile your code? Do you know how? Which platform are you using?
I don't know why your deletes are slow without seeing the source code, but here's what SDL3's hashtable looks like:
/*
Simple DirectMedia Layer
Copyright (C) 1997-2025 Sam Lantinga <slouken@libsdl.org>
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
(The rest of the file has been truncated.)
A discussion on how it works is here:
main ← taisei-project:robin-hood-hashtable (opened 09:39 PM, 18 Sep 24 UTC)
The original SDL hashtable implementation is pretty naive. Every key-value pair is stored in a separate allocation, and collisions are resolved via a linked list. While this scheme is very simple to implement, it's bad for cache locality (and therefore performance).
This new implementation uses an open-addressing scheme, which places every pair in one contiguous array instead of separate buckets. It automatically grows as the load factor increases, which was a TODO in the old code. Linear probing is used to resolve collisions, which keeps colliding items close in memory. [Robin hood hashing](https://programming.guide/robin-hood-hashing.html) is used to greatly reduce variance in the probe sequence length across all items. The backward shifting optimization for deletions (described at the previous link) is also implemented, but "smart search" is not.
This is a very versatile hashtable optimized for lookup performance. I originally wrote this for Taisei Project, where it served us well in hot paths for years. The motivation for porting it was to speed up some hash lookups in the Vulkan GPU backend.
It's definitely not the most sophisticated or optimal algorithm, but I think it's a good balance between performance and implementation simplicity. The main thing holding it back, though, is that SDL's hashtables are not templated, so we're stuck with pointer-sized keys and values. It's often just barely not enough, requiring us to malloc the key or [do other silly things](https://github.com/libsdl-org/SDL/blob/7a924b36aeb3b9a068ec09266069f36becb5a57c/src/render/gpu/SDL_pipeline_gpu.c#L195). This throws a wrench into all the cache-locality goodness, though doing less pointer chasing is still beneficial in the end.
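The backward-shift deletion mentioned in that PR is the part most relevant to slow deletes, so here is a minimal sketch of linear probing with backward-shift deletion, assuming uint32_t keys and values and a power-of-two capacity. This illustrates the technique; it is not SDL's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

#define CAPACITY 16 /* must be a power of two for the mask trick */

typedef struct {
    uint32_t key;
    uint32_t value;
    bool occupied;
} Slot;

typedef struct {
    Slot slots[CAPACITY];
} Table;

static uint32_t hash_u32(uint32_t key) {
    /* any decent integer mixer works here */
    key ^= key >> 16;
    key *= 0x45d9f3bu;
    key ^= key >> 16;
    return key;
}

/* How far the item in slot i sits from its ideal slot. */
static uint32_t probe_distance(const Table *t, uint32_t i) {
    uint32_t ideal = hash_u32(t->slots[i].key) & (CAPACITY - 1);
    return (i - ideal) & (CAPACITY - 1);
}

/* Assumes the table is never completely full (real code tracks
 * the load factor and grows before that can happen). */
static bool table_remove(Table *t, uint32_t key) {
    uint32_t i = hash_u32(key) & (CAPACITY - 1);
    while (t->slots[i].occupied && t->slots[i].key != key) {
        i = (i + 1) & (CAPACITY - 1); /* linear probing */
    }
    if (!t->slots[i].occupied) {
        return false; /* key not present */
    }
    /* Backward shift: pull the following items one slot back
     * until we hit an empty slot or an item that is already
     * in its ideal position. No tombstones needed. */
    for (;;) {
        uint32_t next = (i + 1) & (CAPACITY - 1);
        if (!t->slots[next].occupied || probe_distance(t, next) == 0) {
            t->slots[i].occupied = false;
            return true;
        }
        t->slots[i] = t->slots[next];
        i = next;
    }
}
```

The point of the backward shift is that lookups never have to skip over deleted markers, so delete-heavy workloads don't degrade the way tombstone schemes do.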
ringo
March 3, 2025, 4:07am
4
Both.
It's an expansion of a C structure for objects, e.g. equipment, HP, etc.; the original structure only has common values like position and size.
It's also a dictionary for callbacks and events, and callback arguments use it too.
Yes, I wrote it myself, except for wyhash.
I guess only one loop causes this issue, but I don't know what the actual cause is.
Just deleting.
The platform is Windows 11, 64-bit.
ringo
March 3, 2025, 4:12am
5
I just realized I couldn't benchmark no-rehash adding properly.
Adding with no rehash actually takes 350ms.
Do hash tables generally take this long?
ringo
March 3, 2025, 5:00am
6
I looked at it; that code is very beautiful and simple.
I think I'm making a simple thing more complicated.
I wrote my hash table with chaining, but instead of a linked list each bucket is a uint8_t *array plus a uint8_t num count,
and I store all the data in it: key, value, value_datasize.
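Based on that description, a bucket might look something like the sketch below. The names and the exact entry layout are my assumptions for illustration, not ringo's actual code:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *data; /* packed entries: [key][value_size][value] ... */
    uint8_t  num;  /* number of entries in this bucket */
    size_t   used; /* bytes used in data */
    size_t   cap;  /* bytes allocated for data */
} Bucket;

/* Append one entry: a fixed-size uint64_t key followed by a
 * variable-size value blob, matching the "key, value,
 * value_datasize" description. Returns 0 on success. */
static int bucket_add(Bucket *b, uint64_t key,
                      const void *value, uint32_t value_size) {
    size_t need = sizeof key + sizeof value_size + value_size;
    if (b->used + need > b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 64;
        while (new_cap < b->used + need) new_cap *= 2;
        uint8_t *p = realloc(b->data, new_cap);
        if (!p) return -1;
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->used, &key, sizeof key);
    memcpy(b->data + b->used + sizeof key, &value_size, sizeof value_size);
    memcpy(b->data + b->used + sizeof key + sizeof value_size,
           value, value_size);
    b->used += need;
    b->num++;
    return 0;
}
```

One caveat with variable-size entries in a packed array: deleting from the middle means shifting all the bytes after it, which is another potential O(n) cost per delete.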
ringo
March 3, 2025, 5:32am
7
With 10k entries:
time add 24
time search 1
time delete 30
time add no rehash 5
time search no rehash 1
I understand that n operations at O(1) each should be O(1) * n,
but I don't know why it behaves like O(n) * n.
A cache locality problem? Hmm…
Just use a profiler instead of wondering what is going on.
OK buddy, but what is your algorithm exactly? Or code?
On Windows, you can profile your code in Visual Studio on a line-by-line basis.
jamesL
March 6, 2025, 7:05am
10
Or try stb_ds.h:
type-safe dynamic arrays and hash tables for C; will compile in C++.
From:
stb single-file public domain libraries for C/C++
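For a taste of what using it looks like, here's a minimal example with stb_ds's integer-keyed hash map macros (these are part of the library's documented API):

```c
#include <stdio.h>

#define STB_DS_IMPLEMENTATION
#include "stb_ds.h"

int main(void) {
    /* the map is just a pointer that starts out NULL */
    struct { int key; float value; } *map = NULL;

    hmput(map, 42, 3.5f);        /* insert or update */
    float v = hmget(map, 42);    /* lookup (returns 0 if absent) */
    printf("%g, %d entries\n", v, (int)hmlen(map));

    hmdel(map, 42);              /* delete by key */
    hmfree(map);                 /* free the whole map */
    return 0;
}
```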
ringo
March 7, 2025, 2:59pm
11
I've been busy recently, sorry for the late reply.
I did some profiling with gprof and realized my hash table isn't the problem;
my allocator is the problem.
My free() takes 80% of the time.
My allocator keeps a stack of the pointers currently in use,
and freeing searches that stack for the specific pointer, which takes O(n). Alloc takes O(1).
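For illustration, this is the pattern being described as I understand it (a reconstruction, not the actual allocator): live pointers tracked in an array used as a stack, so free() has to scan for the pointer before removing it:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 100000

typedef struct {
    void  *live[MAX_LIVE]; /* pointers currently in use */
    size_t count;
} PtrStack;

static void *stack_alloc(PtrStack *s, size_t size) {
    void *p = malloc(size);
    if (p) s->live[s->count++] = p; /* O(1): push on top */
    return p;
}

static void stack_free(PtrStack *s, void *p) {
    /* O(n): linear search for the pointer. With ~100,000 live
     * allocations, this scan can easily dominate the profile. */
    for (size_t i = 0; i < s->count; i++) {
        if (s->live[i] == p) {
            /* shift everything above it down (also O(n)) */
            for (size_t j = i + 1; j < s->count; j++)
                s->live[j - 1] = s->live[j];
            s->count--;
            free(p);
            return;
        }
    }
}
```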
I should change this
ringo
March 7, 2025, 9:26pm
12
OK, I no longer use a stack, just an array.
Now my freeing only takes O(1).
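One common way to make free O(1) in a scheme like this is to store each allocation's array index in a small header and swap with the last entry on removal. This is a guess at the kind of fix involved, not the actual code:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 100000

typedef struct {
    size_t index; /* this allocation's slot in the live array */
    /* user data follows the header */
} Header;

typedef struct {
    Header *live[MAX_LIVE];
    size_t  count;
} Pool;

static void *pool_alloc(Pool *p, size_t size) {
    Header *h = malloc(sizeof(Header) + size);
    if (!h) return NULL;
    h->index = p->count;
    p->live[p->count++] = h;
    return h + 1; /* hand out the memory just past the header */
}

static void pool_free(Pool *p, void *ptr) {
    if (!ptr) return;
    Header *h = (Header *)ptr - 1;
    /* O(1): move the last entry into the freed slot, fix its index */
    Header *last = p->live[--p->count];
    p->live[h->index] = last;
    last->index = h->index;
    free(h);
}
```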