Hash Table
Basic Definition
Basically, hash tables are like ini files, storing information in the following format:
Tablename
item1=data1
item2=data2
...
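As a small sketch of how such a table comes to exist in the first place (the table name, item names and data are simply taken from the listing above, and the size 10 is an arbitrary choice), the commands described in the next section could be used like this:
hmake Tablename 10
hadd Tablename item1 data1
hadd Tablename item2 data2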
However, hash tables are a lot faster than ini or text files, which is especially noticeable when you have a lot of entries. One disadvantage is that you cannot easily see all entries at once: with a file, you can simply open it and look at every entry, whereas you cannot just open a hash table like that. To still get an overview, you can use the following script:
; let's make a little alias called showhash. At the end, it will look like /showhash <name>
alias showhash {
  ; echo the name and a kind of "table header"
  echo -a $1
  echo -a item => data
  var %i = 1
  ; let's loop through all items. $hget($1,0).item will return the total number of items.
  while (%i <= $hget($1,0).item) {
    echo -a $hget($1,%i).item => $hget($1,%i).data
    ; increase looping-variable
    inc %i
  }
}
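Assuming the table from the earlier sketch, typing /showhash Tablename would then list item1 => data1 and item2 => data2 in the active window.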
Modifying hash tables
You can create a new hash table using the /hmake command and delete one using /hfree. Using /hadd and /hdel, you can modify the data stored in your hash table. Because hash tables are not saved automatically, you have to use the /hsave and /hload commands to keep their contents across sessions.
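A minimal sketch of these commands in action (the table name "scores" and the file name "scores.txt" are just examples):
; create a table and add some data; adding to an existing item name replaces its data
hmake scores 100
hadd scores john 5
hadd scores mary 8
hadd scores john 6
; remove a single item again
hdel scores mary
; save the table to a file, free it, and load it back into a fresh table later on
hsave scores scores.txt
hfree scores
hmake scores 100
hload scores scores.txt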
Retrieving data
After storing data in a hash table, you can of course retrieve it again. To do so, use the $hget and $hfind identifiers.
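A short sketch, assuming the "scores" table from the previous example:
; $hget(table,item) returns the data stored for an item
echo -a john has $hget(scores,john) points
; $hget(table,N).item and .data return the Nth item and its data, as used in /showhash above
echo -a first item: $hget(scores,1).item
; $hfind searches item names; here we ask for the first item name matching the wildcard jo*
echo -a first match: $hfind(scores,jo*,1,w)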
More detailed information
The following explanations aren't necessary for using hash tables in mIRC scripts; however, these facts could help you make your scripts faster, especially if you are going to use hash tables for a lot of data.
A hash table is basically an array of N buckets (N being the size given to /hmake), where each bucket is just a pointer (4 bytes) to a linked list of items. In principle you could enlarge the number of buckets later on by resizing the array, but that means the hash values for items change: the bucket number for an item, i.e. the index into the array of pointers, is determined by the formula "hashfunc(itemname) modulo N". Changing N on the fly essentially means you have to reassign most of the items to other buckets, which is a nightmare with large hash tables.
This means that $hget will indeed be slower if you take a smaller hash table and fill it with the same number of items as before: with M items spread over N buckets, the average number of items to traverse in a bucket when looking for a match is about M/(2N), so $hget takes much longer on a big file loaded into a small hash table than on the same file loaded into a big hash table.
A benchmark would show this, although you'd of course have to repeat the $hget a large number of times, as the time taken by an individual $hget is still negligible.
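A rough benchmark sketch along those lines could look as follows (the alias name /hashbench is made up here; $ticks is mIRC's millisecond counter):
; usage: /hashbench <table> <item>  - times 100,000 lookups of <item> in <table>
alias hashbench {
  var %start = $ticks, %i = 0
  ; a single $hget is too fast to measure, so repeat the same lookup many times
  while (%i < 100000) {
    var %dummy = $hget($1,$2)
    inc %i
  }
  echo -a 100000 lookups took $calc($ticks - %start) ms
}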
On the other hand, if your file contains M entries, /hload does M implicit $hget lookups to ensure that item names won't be added twice. If the ratio M:N (i.e. items:size) is 10:1, each of those lookups traverses about 5 items on average, so the load does roughly 5 * M item comparisons; if the ratio is 100:1, that becomes roughly 50 * M, and so on. For large values of M, this difference is noticeable, even with one single /hload. So you should choose N very carefully before issuing a /hload.
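For example, if you expect a file of roughly 10,000 entries, creating the table with a correspondingly large size before loading keeps the buckets short (the table and file names here are only examples):
; size the table close to the expected number of items before loading
hmake bigtable 10000
hload bigtable bigfile.txt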
Note that there is no such thing as an "overflow" in hash tables; buckets point to linked lists that can be of arbitrary length. If you make a hash table of size 1, you are essentially using a single linked list of items, and everything will still work fine (but $hget would then be just as slow as $hfind).
Deleting unwanted items from a hash table
In some cases, you'll want to loop over all the items in a hash table, deleting the ones you no longer want. It may be a bit unclear at first how to do this: deleting items while looping over them is a bit like removing the ground under your own feet; how can you be sure you don't end up skipping over items, for example?
Fortunately, there's one fundamental fact that you can use to do this: if you loop over N items using $hget().item and then delete the (N+1)th item, those first N items will remain the same. The (N+1)th position, though, will now hold a new item (typically the one that used to be the (N+2)th). For example, if $hget().item returns A, B, C and D for N values 1, 2, 3 and 4 respectively, and you then delete item C, you can be sure that $hget().item with N values 1 and 2 will still return A and B after deleting C. Item D, however, will now be item 3 instead of 4, because there are never any "gaps" between item numbers.
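A small throwaway demonstration of this fact (it assumes no table named "demo" exists yet; note that the positions will not necessarily follow the order in which the items were added):
alias renumberdemo {
  hmake demo 10
  hadd demo A 1 | hadd demo B 2 | hadd demo C 3 | hadd demo D 4
  ; list every position, delete the item at position 3, then list again:
  ; positions 1 and 2 keep their items, the later items shift down by one
  var %i = 1
  while (%i <= $hget(demo,0).item) { echo -a before: %i = $hget(demo,%i).item | inc %i }
  hdel demo $hget(demo,3).item
  var %i = 1
  while (%i <= $hget(demo,0).item) { echo -a after: %i = $hget(demo,%i).item | inc %i }
  hfree demo
}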
The following alias shows how to put this into practice: it loops over all the items in a given hash table and calls a custom identifier, $want-to-delete, on each of them, deleting the items for which $want-to-delete returns $true. The loop will always complete (no infinite loop!), and after it has completed, all the items that were to be deleted will indeed be gone.
; usage: /purgeitems <hashtable>
alias purgeitems {
  ; our counter: we start off at the first item
  var %i = 1
  ;
  ; as long as the %i-th item exists, continue looping
  while ($hget($1,%i).item != $null) {
    ; temporarily store the name of this item in a variable
    ; $v1 refers to the "$hget($1,%i).item" bit above
    var %t = $v1
    ;
    ; now perform a check to see if we still want to keep this item;
    ; $want-to-delete is a custom alias (not included) and returns
    ; either $true or $false, indicating whether we should delete %t
    if ($want-to-delete(%t)) {
      ; yes, we want to delete this item; use its item name (%t) to
      ; actually delete it using /hdel
      hdel $1 %t
      ;
      ; the important part is that we do NOT increase %i here;
      ; we just deleted the %i-th item, so the %i-th item is now a
      ; new item. due to the way hashtable iteration works, we also
      ; know for sure that we have not seen the %i-th item yet!
    }
    else {
      ; we're leaving this item in, so let's increase %i so that we
      ; move on to the next item.
      inc %i
    }
  }
}
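Since $want-to-delete is not included above, here is one hypothetical way to fill it in: this version deletes every item whose stored data is 0, a criterion that cannot be expressed as a simple wildcard on item names (the table name "scores" is hard-coded purely for the example):
alias want-to-delete {
  ; return $true for items we no longer want: here, items whose data in
  ; the "scores" table is 0 (purgeitems only passes us the item name in $1)
  if ($hget(scores,$1) == 0) { return $true }
  return $false
}
With that in place, /purgeitems scores would remove every item whose data is 0.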
Note that this is a very expensive (slow) operation; you should not use $hget().item unless you have to. If you can, use /hdel with the -w (wildcard) switch instead!
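For example, deleting every item whose name starts with "temp." takes a single command (the table and item names are, again, just examples):
hdel -w mytable temp.*
mIRC matches the wildcard against the item names and removes all matching items in one go, without any scripted loop.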