Home
DSA Course
Advanced Topics
Suffix Array
Suffix Array
A Suffix Array is a sorted array of all suffixes of a string, used for efficient string matching and analysis.
Suffix Array & LCP
Ready
Controls
See the Logic in Motion
Stop memorizing code. Unlock the full interactive visualizer to master the logic step-by-step.Intuition
The Problem: Finding patterns in a huge text (like the human genome) is slow if we scan linearly.
The Solution: If we create all possible suffixes of the text and sort them lexicographically, we can use Binary Search to find any substring in O(M log N) time.
Combined with the LCP (Longest Common Prefix) array, it solves complex problems like finding the longest repeated substring.
Concept
Structure:
- Suffixes: Substrings starting from index
ito the end. - Array: An array of integers representing the starting indices of these suffixes, sorted alphabetically.
Example: for "banana"
- Suffixes: banana, anana, nana, ana, na, a
- Sorted: a, ana, anana, banana, na, nana
- Suffix Array: [5, 3, 1, 0, 4, 2]
How it Works
- Generate: List all suffixes of string `S`.
- Sort: Sort them lexicographically. The indices of the sorted suffixes form the Suffix Array `SA`.
- LCP Construction: (Optional) Compute the Longest Common Prefix between adjacent suffixes in the sorted array to enable advanced queries.
- Search: Use binary search on the suffix array to locate pattern `P`.
Step-by-Step Breakdown
1. Naive Construction: Create array of objects `(index, string)`, sort strings, iterate to extract indices. `O(N^2 log N)`.
2. Optimized Construction: Use prefix doubling (sorting by 1 char, then 2, 4, 8...). `O(N log^2 N)` or `O(N log N)`.
3. Building LCP:
- Use Kasai's Algorithm `O(N)`.
- Maintains an invariant relation between LCP of `suffix[i]` and `suffix[i+1]`.
When to Use
- Finding longest repeated substring.
- Finding longest common substring of two strings.
- Counting unique substrings.
- Fast substring search in static text.
When NOT to Use
- Dynamic text (insertions/deletions) -- Suffix Trees or Arrays are expensive to rebuild.
- Simple pattern matching where KMP/Z-Algo suffices.
How to Identify
"Multiple pattern queries on static text", "Longest repeated substring", "Lexicographically kth substring".
Stop Guessing, Start Mastering.
Build the FAANG intuition. Master this pattern with optimized implementations, visual dry runs, and our curated collection of high-yield problems.
