How to Design a High Performance/Scalable Hit Counter Class?


Design a hit counter which counts the number of hits received in the past 5 minutes.

Each function accepts a timestamp parameter (in seconds granularity) and you may assume that calls are being made to the system in chronological order (ie, the timestamp is monotonically increasing). You may assume that the earliest timestamp starts at 1.

It is possible that several hits arrive roughly at the same time.

Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
HitCounter counter = new HitCounter();
 
// hit at timestamp 1.
counter.hit(1);
 
// hit at timestamp 2.
counter.hit(2);
 
// hit at timestamp 3.
counter.hit(3);
 
// get hits at timestamp 4, should return 3.
counter.getHits(4);
 
// hit at timestamp 300.
counter.hit(300);
 
// get hits at timestamp 300, should return 4.
counter.getHits(300);
 
// get hits at timestamp 301, should return 3.
counter.getHits(301); 
 
/**
 * Your HitCounter object will be instantiated and called as such:
 * HitCounter* obj = new HitCounter();
 * obj->hit(timestamp);
 * int param_2 = obj->getHits(timestamp);
 */
HitCounter counter = new HitCounter();

// hit at timestamp 1.
counter.hit(1);

// hit at timestamp 2.
counter.hit(2);

// hit at timestamp 3.
counter.hit(3);

// get hits at timestamp 4, should return 3.
counter.getHits(4);

// hit at timestamp 300.
counter.hit(300);

// get hits at timestamp 300, should return 4.
counter.getHits(300);

// get hits at timestamp 301, should return 3.
counter.getHits(301); 

/**
 * Your HitCounter object will be instantiated and called as such:
 * HitCounter* obj = new HitCounter();
 * obj->hit(timestamp);
 * int param_2 = obj->getHits(timestamp);
 */

Follow up: What if the number of hits per second could be very large? Does your design scale?

Queue-based Hit Counter Design

We can use a Queue to maintain all the timestamps in order. When adding a new timestamp, or getting the number of hits, we can scan the queue and remove those entries that are older than the maximum time interval e.g. 5 minutes. This takes O(N), thus not scalable.

If we don’t scan and remove older entries, we still need to go through the entire queue (which may grow large) to find out the number of hits by comparing the timestamps differences.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
class HitCounter {
public:
    /** Initialize your data structure here. */
    HitCounter() {
        
    }
    
    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    void hit(int timestamp) {
        purge(timestamp, INTERVAL);
        Q.push(timestamp);
    }
    
    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    int getHits(int timestamp) {
        purge(timestamp, INTERVAL);
        return Q.size();
    }
        
private:
    queue<int> Q;
    int INTERVAL = 300; // 5 minutes
    
    void purge(int timestamp, int INTERVAL) {
       while (!Q.empty()) {
           auto p = Q.front();
           if (timestamp - p >= INTERVAL) {
               Q.pop();
           } else {
               break;
           }
       }         
    }
};
</int>
class HitCounter {
public:
    /** Initialize your data structure here. */
    HitCounter() {
        
    }
    
    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    void hit(int timestamp) {
        purge(timestamp, INTERVAL);
        Q.push(timestamp);
    }
    
    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    int getHits(int timestamp) {
        purge(timestamp, INTERVAL);
        return Q.size();
    }
        
private:
    queue<int> Q;
    int INTERVAL = 300; // 5 minutes
    
    void purge(int timestamp, int INTERVAL) {
       while (!Q.empty()) {
           auto p = Q.front();
           if (timestamp - p >= INTERVAL) {
               Q.pop();
           } else {
               break;
           }
       }         
    }
};
</int>

Design a Hit Counter based on Buckets

As we are only interested in the number of hits in the last 5 minutes, in seconds granularity, we might divide the timestamps into 300 units. Thus, this hints to use two buckets, one for storing the actual timestamps, and another for the counters.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
class HitCounter {
public:
    /** Initialize your data structure here. */
    HitCounter() {
        
    }
    
    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    void hit(int timestamp) {
        int t = timestamp % 360;
        if (time[t] != timestamp) {
            time[t] = timestamp; // timestamps could be 300, 600, 900
            hits[t] = 1;  // reset the counter
        } else {
            hits[t] ++;  // same timestamp, update hits
        }
    }
    
    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    int getHits(int timestamp) {
        int r = 0;
        for (int i = 0; i < 360; ++ i) {
            if (timestamp - time[i] < 300) {
                r += hits[i];
            }
        }
        return r;
    }
    
private:
    int time[360] = { -1 };
    int hits[360] = { 0 };
};
class HitCounter {
public:
    /** Initialize your data structure here. */
    HitCounter() {
        
    }
    
    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    void hit(int timestamp) {
        int t = timestamp % 360;
        if (time[t] != timestamp) {
            time[t] = timestamp; // timestamps could be 300, 600, 900
            hits[t] = 1;  // reset the counter
        } else {
            hits[t] ++;  // same timestamp, update hits
        }
    }
    
    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    int getHits(int timestamp) {
        int r = 0;
        for (int i = 0; i < 360; ++ i) {
            if (timestamp - time[i] < 300) {
                r += hits[i];
            }
        }
        return r;
    }
    
private:
    int time[360] = { -1 };
    int hits[360] = { 0 };
};

The space and time complexity is both O(1), highly scalable when the number of hits are large!

–EOF (The Ultimate Computing & Technology Blog) —

GD Star Rating
loading...
583 words
Last Post: Finding Out Which Content Is Unacceptable For Your Website
Next Post: Algorithms to Compute the Factor Combinations for An Integer using DFS and BFS

The Permanent URL is: How to Design a High Performance/Scalable Hit Counter Class?

Leave a Reply