Question 1
Assume that linear probing is used for hash tables. To improve the complexity of the operations performed on the table, a special AVAILABLE object is used. Assuming that all keys are positive integers, the following two techniques were suggested to improve the complexity further:
i) When an entry is removed, instead of marking its location as AVAILABLE, store the negated key in its place (e.g. if the removed key was 16, store -16). A search for the removed key would then terminate as soon as its negated value is found (instead of continuing, as it would if AVAILABLE were used).
ii) Instead of using AVAILABLE, find a key in the table that should have been placed in the location of the removed entry, then move that key (the entire entry, of course) into that location (instead of marking the location as AVAILABLE). The motive is to find that key faster, since it is now in its hashed location. This would also remove the dependence on the AVAILABLE object.
Will either of these proposals improve the achieved complexity? You should analyze both time complexity and space complexity. Additionally, will either of these approaches cause incorrect behavior (in terms of functionality)? If so, explain clearly using illustrative examples.
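For reference, the baseline scheme that both proposals modify (linear probing with an AVAILABLE marker) can be sketched as follows. This is only a minimal illustration, not a prescribed implementation: the class name LinearProbingTable, the default capacity of 11, the hash function k mod capacity, and the choice to store keys without values are all assumptions made for the example.

```python
AVAILABLE = object()  # sentinel for a slot whose entry was removed


class LinearProbingTable:
    """Hypothetical baseline: open addressing with linear probing."""

    def __init__(self, capacity=11):  # capacity is an assumed value
        self.capacity = capacity
        self.slots = [None] * capacity  # None means "never used"

    def _probe(self, key):
        # Visit every slot once, starting at the home slot key mod capacity.
        start = key % self.capacity
        for i in range(self.capacity):
            yield (start + i) % self.capacity

    def put(self, key):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None:
                # End of the probe chain: the key is not in the table.
                if first_free is None:
                    first_free = idx
                break
            if slot is AVAILABLE:
                # Remember the first reusable slot, but keep scanning
                # in case the key is stored further along the chain.
                if first_free is None:
                    first_free = idx
            elif slot == key:
                return  # key already present
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = key

    def find(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None:
                return None  # a never-used slot ends the search
            if slot is not AVAILABLE and slot == key:
                return idx
            # AVAILABLE slots are skipped: the search must continue past them.
        return None

    def remove(self, key):
        idx = self.find(key)
        if idx is None:
            return False
        self.slots[idx] = AVAILABLE
        return True


table = LinearProbingTable()
for k in (5, 16, 27):  # all three keys share home slot 5 when capacity is 11
    table.put(k)
table.remove(16)
print(table.find(27))
```

In this sketch the search for 27 has to step over the slot vacated by 16, which is the behavior that any replacement for AVAILABLE needs to preserve.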
Question 2
To reduce the maximum number of collisions in the hash table described in Question 6 above, someone proposed using a larger array of 15 elements (that is, roughly 15% bigger) and, of course, modifying the hash function to h(k) = k mod 15. The idea was to reduce the load factor and hence the number of collisions.
Is this proposal valid? If yes, indicate why such modifications would actually reduce the number of collisions. If no, indicate clearly why you believe the proposal does not make sense.
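One way to examine such a proposal empirically is to count how many keys land on an already-claimed home slot for each candidate table size. The sketch below is a rough aid only: the function count_collisions and the key list are hypothetical (the keys of the referenced table are not reproduced here), the original size of 13 is inferred from the "roughly 15% bigger" remark, and probing after a collision is ignored.

```python
def count_collisions(keys, n):
    """Count keys whose home slot (k mod n) was already claimed by an earlier key."""
    claimed = set()
    collisions = 0
    for k in keys:
        home = k % n
        if home in claimed:
            collisions += 1
        else:
            claimed.add(home)
    return collisions


# Hypothetical keys, chosen only for illustration.
hypothetical_keys = [3, 18, 33, 7, 20, 46]

for size in (13, 15):  # 13 is an assumed original size
    print(size, count_collisions(hypothetical_keys, size))
```

The counts depend on how the keys relate to the modulus as well as on the load factor, so the comparison should be repeated with the actual keys of the table in question.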
Question 3
Assume an open addressing hash table implementation in which the size of the array is N = 19 and double hashing is used for collision handling. The second hash function is defined as:
d(k) = q - (k mod q),
where k is the key being inserted into the table and q = 11 is a prime number. Use the simple modular operation (k mod N) as the first hash function.
i) Show the content of the table after performing the following operations, in order:
put(38), put(15), put(43), put(22), put(71), put(8), put(28), put(37), put(19).
ii) What is the size of the longest cluster caused by the above insertions?
iii) How many collisions occurred as a result of the above operations?
iv) What is the current value of the table's load factor?
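The probing rule used in Question 3 can be sketched in code as follows. This is a minimal illustration under stated assumptions: d(k) is read as q - (k mod q), the j-th probe is taken to be (h(k) + j * d(k)) mod N, a collision is counted as each occupied slot encountered during an insertion, and the helper names (h, d, probe_sequence, put) and the demo keys 57 and 76 are hypothetical and not part of the question.

```python
from itertools import islice

N = 19  # table size from the question
Q = 11  # prime q used by the second hash function


def h(k):
    """First hash function: k mod N."""
    return k % N


def d(k):
    """Second hash function, read as q - (k mod q); always in 1..Q."""
    return Q - (k % Q)


def probe_sequence(k):
    """Yield the slots examined for key k under double hashing."""
    start, step = h(k), d(k)
    for j in range(N):
        yield (start + j * step) % N


def put(table, k):
    """Insert k; return (slot used, number of occupied slots probed)."""
    collisions = 0
    for idx in probe_sequence(k):
        if table[idx] is None:
            table[idx] = k
            return idx, collisions
        collisions += 1
    raise RuntimeError("no free slot found")


table = [None] * N
print(list(islice(probe_sequence(57), 4)))  # first four probes for key 57
print(put(table, 57))
print(put(table, 76))  # 76 shares home slot 0 with 57
```

Because N is prime and d(k) is never zero, each probe sequence in this sketch visits all N slots before repeating.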