Square Root Decomposition

Square Root Decomposition is a technique that allows range queries to be computed in \(O(\sqrt n)\) time, improving over the \(O(n)\) time required by a regular array.

Note

A range query is a query that asks to compute value (e.g., sum, minimum, or maximum) for a range of elements.

Data Structure

Let’s consider an array \(a\) of length \(n\). To build a square root decomposition structure, we first divide \(a\) into blocks of size \(\lceil \sqrt{n} \rceil\) (let’s call this \(sz\)). Then, for each block, we compute the relevant value (e.g., sum, minimum, or maximum) and store it in an array \(b\), where \(b_i\) represents the computed value for block \(i\).

Note

\(\lceil \sqrt{n} \rceil\) is \(\sqrt n\) rounded upwards to the nearest integer.

Since \(sz\) is approximately \(\sqrt n\), then the number of blocks will be around \(\frac{n}{\sqrt n}\) which is \(\sqrt n\).

\[ \underbrace{a_0, a_1, \dots, a_{sz-1}}_{b_{0}}, \underbrace{a_{sz}, a_{sz + 1}, \dots, a_{2 * sz - 1}}_{b_{1}}, \dots, \underbrace{a_{(sz - 1) * sz}, a_{(sz - 1) * sz + 1}, \dots, a_{n - 1}}_{b_{sz}} \]

Computing Queries

Suppose we want to process a range query from \(l\) to \(r\).

The trivial approach is to loop through all elements from \(l\) to \(r\) and compute the relevant value. This takes \(O(n)\) time in the worst case.

Using a square root decomposition structure, we notice that the range \([l, r]\) usually fully covers several blocks, except possibly the first block (containing \(l\)) and the last block (containing \(r\)).

To answer the query efficiently, we first handle the two partially covered blocks separately. Then, we loop over the fully covered blocks in between, using the precomputed values for each block. This reduces the query time to \(O(\sqrt{n})\).

Point Updates

Some tasks require updating a specific index \(j\) to a new value \(v\) while still answering queries efficiently.

To do this, we simply update \(a[j]\) to \(v\), which takes \(O(1)\) time. For the block array \(b\), we observe that index \(j\) belongs to exactly one block. Therefore, we can recompute the relevant value for this block in \(O(\sqrt{n})\) time.

As a result, the total complexity of a point update is \(O(\sqrt{n})\).

Implementation

Let’s assume that we are calculating range maximum queries.

Preprocessing

Here we will build the structure.

for (int i = 0; i < n; i++) {
    b[i / sz] = max(a[i], b[i / sz]); // element i belongs to block i/sz
}

Answering Queries

int queryMax(int l, int r) {
    int blockL = l / sz;
    int blockR = r / sz;

    int ans = INT_MIN;

    // special case when query starts and ends in the same block
    if (blockL == blockR) { 
        for(int i = l; i <= r; i++) {
            ans = max(ans, a[i]);
        }
        return ans;
    }

    // compute first block
    for (int i = l; i < (blockL + 1) * sz; i++) {
        ans = max(ans, a[i]);
    }

    // compute intermedieate blcoks
    for (int i = blockL + 1; i < blockR ; i++) {
        ans = max(ans, b[i]);
    }

    // compute last block
    for (int i = blockR * sz; i <= r ; i++){
        ans = max(ans , a[i]);
    }

    return ans;
}

Point Updates

void pointUpdate(int j , int v){
    int blockJ = j / sz;

    a[j] = v;
    
    //recompute b[blockJ]
    b[blockJ] = INT_MIN;

    for (int i = blockJ * sz; i < (blockJ + 1) * sz; i++) {
        b[blockJ] = max(b[blockJ], a[i]);
    }
}

Dynamic Range Minimum Queries

Dynamic Range Minimum Queries
CSES easy

Given array \(x\) of \(n\) elements and \(q\) queries. Each query is one of two types

  • “1 \(k\) \(u\)”: update the value at position \(k\) to \(u\).
  • “2 \(a\) \(b\)”: what is the minimum value in range [\(a\),\(b\)]?

Solution

This task is similar to the previous implementation, but now we are asked to compute the minimum. There are a few minor considerations to keep in mind:

  1. The arrays are defined globally, which is why we use MAXN as the maximum possible value for \(n\).

  2. In the queries, we apply k--, l--, and r-- because the input is 1-indexed, while our implementation uses 0-indexed arrays.

#include <iostream>
#include <cmath>    
#include <climits>  

using namespace std;
const int MAXN = 200000;
const int sz = (int)ceil(sqrt(MAXN));
int a[MAXN];
int b[sz]; 

int queryMin(int l, int r) {
    int blockL = l / sz;
    int blockR = r / sz;

    int ans = INT_MAX;

    // special case when query starts and ends in the same block
    if (blockL == blockR) { 
        for(int i = l; i <= r; i++) {
            ans = min(ans, a[i]);
        }
        return ans;
    }

    // compute first block
    for (int i = l; i < (blockL + 1) * sz; i++) {
        ans = min(ans, a[i]);
    }

    // compute intermedieate blcoks
    for (int i = blockL + 1; i < blockR ; i++) {
        ans = min(ans, b[i]);
    }

    // compute last block
    for (int i = blockR * sz; i <= r ; i++){
        ans = min(ans , a[i]);
    }

    return ans;
}

void pointUpdate(int j , int v){
    int blockJ = j / sz;

    a[j] = v;
    
    //recompute b[blockJ]
    b[blockJ] = INT_MAX;

    for (int i = blockJ * sz; i < (blockJ + 1) * sz; i++) {
        b[blockJ] = min(b[blockJ], a[i]);
    }
}

int main() {
    int n, q;
    cin >> n >> q;

    for (int i = 0; i < n; i++) {
        cin >> a[i];
    }

    for (int i = 0 ; i < sz; i++) {
        b[i] = INT_MAX;
    }

    for (int i = 0; i < n; i++) {
        b[i / sz] = min(a[i], b[i / sz]); // element i belongs to block i/sz
    }

    while (q--){
        int type;
        cin >> type;

        if (type == 1){
            int k , u;
            cin >> k >> u;
            k--;
            pointUpdate(k, u);
        }
        else{
            int l , r;
            cin >> l >> r;
            l--; r--;
            cout << queryMin(l , r) << "\n";
        }
    }
}