Sunday, February 5, 2017

HashMap in Java Best Practice

Introduction
In HashMap, most important point is key designing. Key’s hash code is used primarily in conjunction to its equals() method, for putting a key in map and then getting it back from map. So if hash code of key object changes after we have put a key value pair in map, then its almost impossible to fetch the value object back from map. It is a case of memory leak.
On runtime, JVM computes hash code for each object and provide it on demand. When we modify an object’s state, JVM set a flag that object is modified and hash code must be AGAIN computed. So, next time you call object’s hashCode() method, JVM recalculate the hash code for that object.
For this basic reasoning, key objects are suggested to be IMMUTABLE. IMMUTABILITY allows you to get same hash code every time, for a key object. So it actually solves most of the problems in one go. Also, this class must honor the hashCode() and equals() methods contract.
This is the main reason why immutable classes like String, Integer or other wrapper classes are a good key object candidate. Remember setter method is not present for variables which we want to make immutable
Example

import java.util.*;

public class MapTesting {

public static void main(String[] args) {
HashMap<Student, String> h = new HashMap<Student, String>();
h.put(new Student("vikas", 12, "sixth"), "Vicky");
h.put(new Student("akshay", 12, "seventh"), "Akii");
h.put(new Student("lucky", 14, "sixth"), "lucky");
h.put(new Student("kash", 15, "ninth"), "kashy");
Iterator e = h.entrySet().iterator();
while (e.hasNext()) {
Map.Entry m = (Map.Entry) e.next();
System.out.print("Key is " + m.getKey());
System.out.println("value is " + m.getValue());
}

}

}

class Student {

private String Name;
private int age;
private String standard;

public Student(String name, int age, String standard) {
Name = name;
this.age = age;
this.standard = standard;
}

@Override
public String toString() {
return "Student [Name=" + Name + ", age=" + age + ", standard=" + standard + "]";
}

public String getName() {
return Name;
}

public int getAge() {
return age;
}



public String getStandard() {
return standard;
}

@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((Name == null) ? 0 : Name.hashCode());
result = prime * result + age;
result = prime * result + ((standard == null) ? 0 : standard.hashCode());
return result;
}

@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (!(obj instanceof Student))
return false;
Student other = (Student) obj;
if (Name == null) {
if (other.Name != null)
return false;
} else if (!Name.equals(other.Name))
return false;
if (age != other.age)
return false;
if (standard == null) {
if (other.standard != null)
return false;
} else if (!standard.equals(other.standard))
return false;
return true;
}
}



But remember that immutability is recommended and not mandatory. If you want to make a mutable object as key in hashmap, then you have to make sure that state change for key object does not change the hash code of object. This can be done by overriding the hashCode() method. But, you must make sure you are honoring the contract with equals() also.


An example is always better for demonstration, right? Then lets have one

In this example, I have created an account class with only two fields for simplicity. I have overridden the hash code and equals method such that it uses only account number to verify the uniqueness of Account object. All other possible attributes of Account class can be changed on runtime.
Note-:Remember setAccountNumber() is missing.



public class Account 
{
    private int accountNumber;
    private String holderName;
    public Account(int accountNumber) {
        this.accountNumber = accountNumber;
    }
    public String getHolderName() {
        return holderName;
    }
    public void setHolderName(String holderName) {
        this.holderName = holderName;
    }
    public int getAccountNumber() {
        return accountNumber;
    }
    //Depends only on account number
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + accountNumber;  
        return result;
    }
    //Compare only account numbers
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Account other = (Account) obj;
        if (accountNumber != other.accountNumber)
            return false;
        return true;
    }
}


But approach works fine till Java 7.   If still key happens to be same, then in case performance in worst case will be O(n). To avoid this Java 8 uses binary tree in place of linkedlist in case of collision.
But if we use above programming approach then also it wont make any difference. Hence our key object class should implement Comparable class

Well, previously entries with conflicting keys were simply appended to linked list, which later had to be traversed. Now HashMappromotes list into binary tree, using hash code as a branching variable. If two hashes are different but ended up in the same bucket, one is considered bigger and goes to the right. If hashes are equal (as in our case), HashMap hopes that the keys are Comparable, so that it can establish some order. This is not a requirement of HashMap keys, but apparently a good practice. If keys are not comparable, don't expect any performance improvements in case of heavy hash collisions.

class Key implements Comparable<Key> {
private final int value;
Key(int value) {
this.value = value;
}
@Override
public int compareTo(Key o) {
return Integer.compare(this.value, o.value);
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass())
return false;
Key key = (Key) o;
return value == key.value;
}
@Override
public int hashCode() {
return value;
}
}

Note : source      http://openjdk.java.net/jeps/180
The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).

HashMap simply uses hash value of key object and then further passing this hash value in HashMap's hash function to find bucket. In case of collision, earlier item was added in linkedlist in same bucket but now as of Java 8, balanced tree is used.

No comments:

Post a Comment

System Design :: Performace Tuning: Scaling, Resiliency, persistence

Netflix System Deisgn