graph TD
A["20 (1-4)"]
B["6 (1-2)"]
C["14 (3-4)"]
D["2 (1-1)"]
E["4 (2-2)"]
F["6 (3-3)"]
G["8 (4-4)"]
A --> B
A --> C
B --> D
B --> E
C --> F
C --> G
Segment trees
Segment tree is a data structure that allows efficient range query processing and updates on arrays.
In this section everything is assumed to be 1-indexed unless said otherwise.
What Is a Segment Tree
It is a binary tree where each node represents the value of query on a segment (subarray) of the original array. The tree is built with the following rules:
- Leaf nodes represent a segment of length \(1\) .
- Each internal nodes represents a combination (sum, min, max, etc.) of it’s two children.
Let’s say that we have the following array [2, 4, 6, 8]. A sum segment tree built on this array looks as follows:
Note that the leaf nodes represent only one element and the other nodes merge it’s children by taking the sum of their values.
Node Count and Padding
Let N be the number of leaf nodes in a segment tree, and n be the length of the original array.
Nodes Count
- In a segment tree, the number of nodes at each level forms the sequence
[1, 2, 4, …], i.e., powers of two. - Each power of 2 is eqaul to the sum of all previous powers of 2 plus 1.
- Therefore, the total number of nodes in a segment tree is
2N - 1.
Padding
- If the original array has
nelements andnis not a power of two, we pad the array with additional elements so that the number of leaf nodesNbecomes the next power of two, ensuring that the extra elements do not affect the values of the other nodes. - In the worst case, we may need to add up to
nelements, because the next power of two is less than2n.
Example
If we change the previous example to [2, 4, 6], we need to add an element to make the length a power of two without affecting the other values. In this case, the best value to add is 0, since it does not change the sums of other nodes. The resulting segment tree will look like the following:
graph TD
A["12 (1-4)"]
B["6 (1-2)"]
C["6 (3-4)"]
D["2 (1-1)"]
E["4 (2-2)"]
F["6 (3-3)"]
G["0 (4-4) padded"]
A --> B
A --> C
B --> D
B --> E
C --> F
C --> G
Representation
We will represent our segment tree as an array giving indices to each node. So the previous segment tree will be given indices as follows.
graph TD
A["1: sum = 12 (1-4)"]
B["2: sum = 6 (1-2)"]
C["3: sum = 6 (3-4)"]
D["4: sum = 2 (1-1)"]
E["5: sum = 4 (2-2)"]
F["6: sum = 6 (3-3)"]
G["7: sum = 0 (4-4) padded"]
A --> B
A --> C
B --> D
B --> E
C --> F
C --> G
Each node i has children at positions 2 * i and 2 * i + 1, and its parent is at position i / 2.
In this example, the number of leaf nodes N is 4, which is also the index of the first leaf node. Therefore, a leaf node at position i corresponds to index i - N + 1 in the original array.
Querying
Let’s define a function query(id, l, r, ql, qr) that returns the answer to a query from ql to qr, where:
idis the index of the current node.lis the beginning of the interval covered by the current node.ris the end of the interval covered by the current node.qlis the beginning of the query range.qris the end of the query range.
The function works as follows:
- If
[ql, qr]and[l, r]are completely disjoint, discard the current node, as it does not contribute to the query. - If
[ql, qr]completely covers[l, r], return the value stored at the current node. - Otherwise, split the range at the midpoint
m = (l + r) / 2and recurse on the children:query(2 * id, l, m, ql, qr)
query(2 * id + 1, m + 1, r, ql, qr)
Then merge the results from both children to obtain the answer for the current node.
To solve a query [ql, qr], call query(1, 1, N, ql, qr).
The time complexity of this algorithm is \(O(\log n)\).
Point Update
Some problems will ask to update index j to value v. To do this we’ll update the leaf node at position j + N - 1 and recompute all it’s ancestors values.
The number of ancestors of leaf node j + N - 1 is logn. Therefore the complexity of point update is \(O(\log n)\)
Implementation
When building a segment tree we only have two things: the original array a(0-indexed), and its size n.
Tree Size
First we need to compute the number of leafs N. It can be computed in two ways:
Either with an equation:
int N = 1 << (32 - __builtin_clz(n - 1));Or with a loop:
int N;
for (N = 1; N < n; N *= 2); // the ; is crucialThe size of the tree will be 2*N.
int seg[2*N]Building The Tree
We’ll write the code to build a sum segment tree.
Leafs Construction
To build the tree we first need to put the leafs’ value.
for (int i = 0 ; i < n; i++) {
seg[i + N] = a[i];
}Then we pad new elements to complete the leafs.
for (int i = n + N; i < 2*N; i++){
seg[i] = 0; // padded value depends on the segment tree
}Internal Nodes Construction
To build Internal nodes we’ll loop over them in reverse order and merge their children.
for (int i = N - 1; i > 0 ; i--) {
seg[i] = seg[i * 2] + seg[i * 2 + 1]; // merging children's values
}Queries
int query(int id, int l, int r, int ql, int qr) {
// check if disjoint
if (ql > r || l > qr) {
return 0;
}
// check if [ql, qr] fully covers [l, r]
if (ql <= l && r <= qr) {
return seg[id];
}
// recurse and merge
int m = (l + r) / 2;
return query(id * 2, l, m, ql, qr) + query(id * 2 + 1, m + 1, r, ql, qr);
}Point Updates
void pointUpdate(int j, int v){
a[j] = v;
int idx = j + N - 1;
seg[idx] = v;
// reconstruct all ancestors
for (int i = idx / 2; i > 0; i /= 2) {
seg[i] = seg[i * 2] + seg[i * 2 + 1];
}
}Example Problem
Dynamic Range Minimum Queries
Given array \(x\) of \(n\) elements and \(q\) queries. Each query is one of two types
- “1 \(k\) \(u\)”: update the value at position \(k\) to \(u\).
- “2 \(a\) \(b\)”: what is the minimum value in range [\(a\),\(b\)]?
Solution
We’ll simply make a segment tree for minimum and apply the queries and point updates directly in it. Since the tree is a minimum sgment tree there will be some differences from the previous snippets because the
#include <iostream>
#include <climits>
using namespace std;
const int MAXN = 200000;
const int N = (1 << (32 - __builtin_clz(MAXN - 1)));
int seg[N * 2];
int a[MAXN];
int query(int id, int l, int r, int ql, int qr) {
// check if disjoint
if (ql > r || l > qr) {
return INT_MAX;
}
// check if [ql, qr] fully covers [l, r]
if (ql <= l && r <= qr) {
return seg[id];
}
// recurse and merge
int m = (l + r)/2;
return min(query(id * 2, l, m, ql, qr) , query(id * 2 + 1, m + 1, r, ql, qr));
}
void pointUpdate(int j, int v) {
a[j - 1] = v;
int idx = j + N - 1;
seg[idx] = v;
// reconstruct all ancestors
for (int i = idx/2; i > 0; i /= 2) {
seg[i] = min(seg[i * 2] , seg[i * 2 + 1]);
}
}
int main() {
int n, q;
cin >> n >> q;
for (int i = 0; i < n; i++) {
cin >> a[i];
}
for (int i = 0 ; i < n; i++) {
seg[i + N] = a[i];
}
for (int i = n + N; i < 2*N; i++) {
seg[i] = INT_MAX; // padded value depends on the segment tree
}
for (int i = N - 1; i > 0; i--) {
seg[i] = min(seg[i * 2] , seg[i * 2 + 1]);
}
while (q--) {
int type;
cin >> type;
if (type == 1) {
int k, u;
cin >> k >> u;
pointUpdate(k, u);
}
else {
int ql, qr;
cin >> ql >> qr;
cout << query(1, 1, N, ql, qr) << "\n";
}
}
}