Updated The claim trie memory reduction (markdown)

Alex Grin 2021-10-14 16:22:02 -04:00
parent ed9b4ac68b
commit fb679ca5db

@ -43,11 +43,13 @@ class Node {
With this kind of structure, I can walk from the root node to the leaf node for any search in O(len(name)) time. I can keep a set of nodes that need their hashes updated as I go. It works okay, but now consider the general inefficiencies of this approach. Example keys: superfluous, stupendous, and stupified. How does that look?
```
🄴➙🄽➙🄳➙🄾➙🅄➙🅂
➚ 🅃➙🅄➙🄿➙🄸➙🄵➙🄸➙🄴➙🄳
🅂
➘ 🅄➙🄿➙🄴➙🅁➙🄵➙🄻➙🅄➙🄾➙🅄➙🅂
```
In other words, we're now using 25 nodes to hold three data points. All but two of those nodes have one or no child, and 22 of the 25 nodes have an empty data member. This is RAM-intensive and very wasteful.
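The counts above can be checked with a minimal sketch of a one-character-per-node trie. `CharNode`, `TrieStats`, and the helper functions are hypothetical names made up for this illustration, not the `Node` class from this page, and the empty root is left out of the count to match the diagram.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical per-character trie node (an assumption for illustration,
// not the Node class described on this page).
struct CharNode {
    std::map<char, std::unique_ptr<CharNode>> children;
    bool hasData = false;  // stands in for the claim data member
};

struct TrieStats {
    std::size_t nodes = 0;      // nodes, excluding the empty root
    std::size_t branching = 0;  // nodes with two or more children
    std::size_t withData = 0;   // nodes that actually hold data
};

// Walk one character at a time, creating nodes as needed: O(len(key)).
void insertKey(CharNode& root, const std::string& key) {
    CharNode* node = &root;
    for (char c : key) {
        auto& child = node->children[c];
        if (!child) child = std::make_unique<CharNode>();
        node = child.get();
    }
    node->hasData = true;
}

// Depth-first tally of the statistics discussed in the text.
void collect(const CharNode& node, bool isRoot, TrieStats& stats) {
    if (!isRoot) {
        ++stats.nodes;
        if (node.children.size() >= 2) ++stats.branching;
        if (node.hasData) ++stats.withData;
    }
    for (const auto& entry : node.children) collect(*entry.second, false, stats);
}

// Builds the three-key example from the text and returns its statistics.
TrieStats statsForExample() {
    CharNode root;
    const std::string keys[] = {"superfluous", "stupendous", "stupified"};
    for (const auto& key : keys) insertKey(root, key);
    TrieStats stats;
    collect(root, true, stats);
    return stats;
}
```

Calling `statsForExample()` yields 25 nodes, of which 2 branch and 3 carry data, which is where the "22 of the 25" figure comes from.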
@ -61,15 +63,19 @@ Over the years there have been many proposals to improve this structure. I'm goi
It turns out that idea #1 makes all the difference: you have to combine the nodes as much as possible. That turns the above trie into 5 nodes, down from 25:
```
🄴🄽🄳🄾🅄🅂
➚ 🅃🅄🄿➙🄸🄵🄸🄴🄳
🅂
➘ 🅄🄿🄴🅁🄵🄻🅄🄾🅄🅂
```
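Node combining can be sketched as a small radix-style trie in which each node stores a run of characters and is split only where two keys diverge. This is an illustrative assumption about how the merge could be done, not the implementation measured on this page; `RadixNode`, `insertKey`, `countNodes`, and `buildExample` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical compressed node: one node holds a whole run of characters.
struct RadixNode {
    std::string label;     // characters merged into this node
    bool hasData = false;  // stands in for the claim data member
    std::map<char, std::unique_ptr<RadixNode>> children;  // keyed by first char of child label
};

void insertKey(RadixNode& root, const std::string& key) {
    RadixNode* node = &root;
    std::size_t pos = 0;
    while (pos < key.size()) {
        auto it = node->children.find(key[pos]);
        if (it == node->children.end()) {
            // No child shares the next character: the rest of the key becomes one leaf.
            auto leaf = std::make_unique<RadixNode>();
            leaf->label = key.substr(pos);
            leaf->hasData = true;
            node->children[key[pos]] = std::move(leaf);
            return;
        }
        RadixNode* child = it->second.get();
        // Length of the prefix shared by the child's label and the remaining key.
        std::size_t common = 0;
        while (common < child->label.size() && pos + common < key.size() &&
               child->label[common] == key[pos + common]) {
            ++common;
        }
        if (common < child->label.size()) {
            // Keys diverge inside this label: split it, pushing the tail down one level.
            auto tail = std::make_unique<RadixNode>();
            tail->label = child->label.substr(common);
            tail->hasData = child->hasData;
            tail->children = std::move(child->children);
            child->children.clear();
            child->label.resize(common);
            child->hasData = false;
            const char tailKey = tail->label[0];
            child->children[tailKey] = std::move(tail);
        }
        pos += common;
        node = child;
    }
    node->hasData = true;
}

// Number of nodes below `node` (the empty root itself is not counted).
std::size_t countNodes(const RadixNode& node) {
    std::size_t total = 0;
    for (const auto& entry : node.children) total += 1 + countNodes(*entry.second);
    return total;
}

// Builds the three-key example from the text.
std::unique_ptr<RadixNode> buildExample() {
    auto root = std::make_unique<RadixNode>();
    const std::string keys[] = {"superfluous", "stupendous", "stupified"};
    for (const auto& key : keys) insertKey(*root, key);
    return root;
}
```

For the three example keys, `countNodes(*buildExample())` gives 5, with labels `s`, `tup`, `endous`, `ified`, and `uperfluous`, matching the diagram.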
## Section 2: the experiments
Timed experiments for 1 million insertions of random data [a-zA-Z0-9]{1, 60}
TABLE GOES HERE
A few notes about the table:
@ -110,7 +116,7 @@ set.end() == set.lower_bound("C" + char(1))
The general find algorithm:
![](https://s3.us-west-2.amazonaws.com/secure.notion-static.com/8c57dec4-5909-4568-bc0e-0902e6c13952/untitled?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20211014%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20211014T202050Z&X-Amz-Expires=86400&X-Amz-Signature=fa216fce572dea2d2d63bd4216cbea6f35954476f0cd5c68599e8af78d39da84&X-Amz-SignedHeaders=host&response-content-disposition=filename%20%3D%22untitled%22)
The general find algorithm in pseudo-code: