Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper and citations is sorted in ascending order, return the researcher's h-index.
According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.
You must write an algorithm that runs in logarithmic time.
Example 1:
Input: citations = [0,1,3,5,6] Output: 3 Explanation: [0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2:
Input: citations = [1,2,100] Output: 2
Constraints:
n == citations.length1 <= n <= 1050 <= citations[i] <= 1000citations is sorted in ascending order.Problem Overview: You receive a sorted array citations where citations[i] represents the number of citations for a research paper. The array is sorted in ascending order. Your goal is to compute the researcher's H-index: the maximum h such that at least h papers have ≥ h citations.
The sorted property changes how you approach the problem compared to the original H-Index problem. Instead of scanning all possible values, you can use the index position to infer how many papers remain on the right side of the array.
Approach 1: Linear Scan from the End (O(n) time, O(1) space)
Start from the end of the sorted array where citation counts are largest. For each index i, the number of papers with at least citations[i] citations equals n - i. Check whether citations[i] >= n - i. The first position that satisfies this condition determines the H-index.
This works because the array is sorted: moving left only decreases citation counts while increasing the number of papers considered. Once the condition becomes true, you’ve found the largest valid H-index. The algorithm simply iterates once across the array, making it easy to implement with constant memory.
This approach is practical when input size is moderate or when code simplicity matters more than theoretical optimality. It relies only on sequential iteration over an array.
Approach 2: Binary Search on Citation Threshold (O(log n) time, O(1) space)
The sorted order enables a classic binary search. Instead of checking every index, search for the smallest index i where citations[i] >= n - i. The value n - i represents how many papers lie to the right (including the current one), which corresponds to a potential H-index candidate.
During the search, compute mid. If citations[mid] >= n - mid, the condition might hold for an even smaller index, so move the right boundary left. Otherwise, move the left boundary right because the citation count is too small to support that many papers.
Once the search converges, the H-index equals n - left. This reduces the complexity from linear to logarithmic time while maintaining constant memory usage. Binary search fits naturally because the condition forms a monotonic boundary across the sorted array.
This solution combines array indexing with binary search logic and is the expected optimal approach in most technical interviews.
Recommended for interviews: Start by explaining the linear scan to demonstrate understanding of the H-index definition. Then optimize using binary search by leveraging the sorted array property. Interviewers typically expect the O(log n) binary search solution because recognizing the monotonic condition shows strong algorithmic reasoning.
Given the array is sorted, we can efficiently search for the h-index using binary search, aiming for logarithmic time complexity. The idea is to use the binary search to find the maximum h such that citations[h] ≥ h.
The C solution defines a function hIndex using binary search to find the h-index. It utilizes a loop to adjust left and right boundaries based on comparison, reducing the search space logarithmically.
Time Complexity: O(log n).
Space Complexity: O(1).
In this linear scan approach, we traverse the sorted citations list from beginning to end. The goal is to determine the maximum valid h-index by checking citations against their corresponding paper count.
This C implementation uses a simple loop to check each citation's sufficiency against its position in the list, returning the calculated h-index when the condition is satisfied.
Time Complexity: O(n).
Space Complexity: O(1).
We notice that if there are at least x papers with citation counts greater than or equal to x, then for any y \lt x, its citation count must also be greater than or equal to y. This exhibits monotonicity.
Therefore, we use binary search to enumerate h and obtain the maximum h that satisfies the condition. Since we need to satisfy that h papers are cited at least h times, we have citations[n - mid] \ge mid.
The time complexity is O(log n), where n is the length of the array citations. The space complexity is O(1).
| Approach | Complexity |
|---|---|
| Binary Search Approach | Time Complexity: O(log n). |
| Linear Scan Approach | Time Complexity: O(n). |
| Binary Search | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Linear Scan from End | O(n) | O(1) | When implementation simplicity matters and the array size is manageable |
| Binary Search on Sorted Citations | O(log n) | O(1) | Best choice when the citations array is sorted and optimal performance is required |
H Index II | Binary search | Leetcode #275 • Techdose • 38,757 views views
Watch 9 more video solutions →Practice H-Index II with our built-in code editor and test cases.
Practice on FleetCode