News: 1739444412

  ARM Give a man a fire and he's warm for a day, but set fire to him and he's warm for the rest of his life (Terry Pratchett, Jingo)

Undergrad and colleagues accidentally shred 40-year hash table gospel

(2025/02/13)


It isn't often that a decades-old assumption underpinning modern technology is overturned, but a recent paper based on the work of an undergraduate and his two co-authors has done just that.

That assumption refers to hash tables, and a conjecture based on work from the 1980s regarding the optimal way to store and query the data in them. The student, formerly of Rutgers University in New Jersey, came up with a new kind of hash table that is faster and uses fewer steps to find specific elements, all while being unaware of that conjecture.

As detailed by [1]Quanta Magazine , Andrew Krapivin, now a graduate at the University of Cambridge, is one of the co-authors on a paper, " [2]Optimal Bounds for Open Addressing Without Reordering ," published last month that sets out how his hash table can find elements faster than was previously considered possible.

[3]

Hash tables have been around since the 1950s, and are an example of a key-value store where a hash function is used to generate the index for the data value based on the key itself.

[4]

[5]

Previously, a historic paper authored by computer scientist Andrew Yao, " [6]Uniform hashing is optimal ," had asserted that the best way of finding an individual element or an empty location in a hash table is simply to access potential locations randomly, an approach known as uniform probing.

The new paper states that despite its simplicity, Yao's conjecture had never been settled.

[7]

There was a way to get around this involving the insertion algorithm carrying out a reordering process after insertion, i.e. to optimize the placement of elements within the hash table. But it wasn't clear if this was a necessary step in order to speed things up.

[8]Google exec sees enterprise quantum app on closer horizon

[9]Tiny Linux kernel tweak could cut datacenter power use by 30%, boffins say

[10]Abstract, theoretical computing qualifications are turning teens off

[11]BASIC co-creator Thomas Kurtz hits END at 96

The 2025 paper claims that even without reordering elements over time, it is possible to construct a hash table using Krapivin's method that achieves far better probe complexity – the average number of locations that need to be checked (probed) to find a specific key – than previous hash table methods.

The authors of the paper say that they refer to their insertion strategy as "elastic hashing" because of the way that the algorithm often probes much further down the table before snapping back to the position it ends up using.

According to Quanta, the paper demonstrates that for Krapivin's hash table method, the time required for worst-case queries and insertions is proportional to (log x) 2 , which is much faster than the previously assumed linear time complexity in x. Here, x is a number that describes how close the hash table is to being completely full, where x = 100 means the table is 99 percent full, and x = 1,000 means the table is 99.9 percent full.

Krapivin is said to have come up with this method after reading a paper titled " [12]Tiny Pointers ," co-authored by his professor at Rutgers, and exploring how to further miniaturize pointers so they used even less memory space. ®

Get our [13]Tech Resources



[1] https://www.quantamagazine.org/undergraduate-upends-a-40-year-old-data-science-conjecture-20250210/

[2] https://arxiv.org/abs/2501.02305

[3] https://pubads.g.doubleclick.net/gampad/jump?co=1&iu=/6978/reg_offbeat/science&sz=300x50%7C300x100%7C300x250%7C300x251%7C300x252%7C300x600%7C300x601&tile=2&c=2Z64lMkx1tDYrMVKhYc4wZQAAARM&t=ct%3Dns%26unitnum%3D2%26raptor%3Dcondor%26pos%3Dtop%26test%3D0

[4] https://pubads.g.doubleclick.net/gampad/jump?co=1&iu=/6978/reg_offbeat/science&sz=300x50%7C300x100%7C300x250%7C300x251%7C300x252%7C300x600%7C300x601&tile=4&c=44Z64lMkx1tDYrMVKhYc4wZQAAARM&t=ct%3Dns%26unitnum%3D4%26raptor%3Dfalcon%26pos%3Dmid%26test%3D0

[5] https://pubads.g.doubleclick.net/gampad/jump?co=1&iu=/6978/reg_offbeat/science&sz=300x50%7C300x100%7C300x250%7C300x251%7C300x252%7C300x600%7C300x601&tile=3&c=33Z64lMkx1tDYrMVKhYc4wZQAAARM&t=ct%3Dns%26unitnum%3D3%26raptor%3Deagle%26pos%3Dmid%26test%3D0

[6] https://dl.acm.org/doi/pdf/10.1145/3828.3836

[7] https://pubads.g.doubleclick.net/gampad/jump?co=1&iu=/6978/reg_offbeat/science&sz=300x50%7C300x100%7C300x250%7C300x251%7C300x252%7C300x600%7C300x601&tile=4&c=44Z64lMkx1tDYrMVKhYc4wZQAAARM&t=ct%3Dns%26unitnum%3D4%26raptor%3Dfalcon%26pos%3Dmid%26test%3D0

[8] https://www.theregister.com/2025/02/06/google_quantum_apps/

[9] https://www.theregister.com/2025/01/29/linux_kernel_tweak/

[10] https://www.theregister.com/2024/11/28/bcs_computer_science_gcse/

[11] https://www.theregister.com/2024/11/20/rip_thomas_kurtz/

[12] https://arxiv.org/abs/2111.12800

[13] https://whitepapers.theregister.com/



Well Done.

Anonymous Coward

Very clever.

It will be interesting to see how this filters through related areas of knowledge.

:)

Re: Well Done.

Anonymous Coward

Of course my interest is business sensitive, so I can't show my handle.

If my employer found out I'm interested in seeing how this filters through related areas of knowledge I'd be for the chop!

But the value I've given to all you readers, knowing that there is a poster here who thinks this is interesting, that's what I care about!!!!

Re: Well Done.

Anonymous Coward

My post was a 'passing' comment, not intended to rival an in depth PHD level exposition !!!

Wouldn't it been much easier to read my post and say to yourself ... Meh !!!

You found my post of no value ... fine that is OK with me, yet you expended so much effort to tell everyone !!!???

Something regarding 'kettles' and 'black' comes to mind or does 'your' post have much more value than mine because 'you' posted it.

People who think that their every thought NEEDS to be posted ...

Including when they simply think 'Meh ... I don't care !!!' ... <=== This is what we ALL need to know.

My mistake ... how silly of me to think otherwise.

:)

Re: Well Done.

cyberdemon

When I was at Uni, I discovered a new kind of hash, too.

Sadly, I have none of it left..

Re: Well Done.

cyberdemon

Since at least one person didn't get it, I should explain that my subtle joke with the icon is one that I stole from a Private Eye cartoon..

Sherlock Holmes (rifling through his desk drawers): "Watson! My Opium and Hashish drawers are empty!"

Dr. Watson: "No 'shit', Sherlock?"

Reminds me of how best to fill a cabin

Charlie Clark

Certainly an interesting article and the details of the pointers are key, with databases the potential big benefactors. Of course, for any real world scenario, you also have to factor in any overhead associated with non-random strategy.

Re: Reminds me of how best to fill a cabin

Anonymous Coward

Charle Clark,

Careful DON'T post your opinion the OP ^^^^ may object because it does not 'interest him or her' !!!

N.B.

It is a crime to post as AC yet the objection is from an AC ... Huh !!!???

:)

Re: Reminds me of how best to fill a cabin

Anonymous Coward

Take your meds. You'll do better. You really will.

Dicts?

Paddy

I wonder if this could be applied to speed up Python dicts and sets, as they use hashes?

Re: Dicts?

cornetman

Hashes are everywhere. Whether or not this discovery is generally applicable, we shall see. I still have yet to read the paper but it does sound quite interesting.

DOS Air:
All the passengers go out onto the runway, grab hold of the plane, push it
until it gets in the air, hop on, jump off when it hits the ground again.
Then they grab the plane again, push it back into the air, hop on, et
cetera.