Sieve of Eratosthenes

We know how to check if one number is prime (\(O(\sqrt{N})\)). But what if you need to find all prime numbers up to \(1,000,000\)?

Using the previous method inside a loop would be way too slow (\(N \times \sqrt{N}\)). We need a bulk operation. We need a sieve.

The Concept

Instead of checking numbers one by one, we filter them.

Imagine writing down all numbers from 1 to 20. 1. We know 2 is prime. Circle it. Now cross out all multiples of 2 (4, 6, 8, 10…). 2. The next uncrossed number is 3. Circle it. Cross out all multiples of 3 (6, 9, 12…). 3. The next uncrossed number is 5 (4 is crossed). Circle it. Cross out multiples…

By the time you reach the end, only the prime numbers remain uncrossed.

The Algorithm

We use a boolean array isPrime[] where isPrime[i] is true if i is potentially prime. Initially, we assume everything is prime.

Basic Steps:

  1. Create an array of booleans, set all to true.
  2. Mark 0 and 1 as false (not prime).
  3. Start loop from i = 2.
  4. If i is prime (true), it means no smaller number divided it.
  5. Loop through all multiples of i (2*i, 3*i…) and mark them false.

The Code

#include <iostream>
#include <vector>
using namespace std;

// Maximum number we want to check (e.g., 1 million)
const int MAX_N = 1000000;
vector<bool> is_prime(MAX_N + 1, true); // Assume all are prime first

void sieve() {
    is_prime[0] = is_prime[1] = false; // Edge cases

    for (int i = 2; i * i <= MAX_N; i++) {
        // If i is not crossed out yet, it is a prime
        if (is_prime[i]) {
            // Cross out all multiples of i
            // Optimization: Start from i * i (smaller multiples are already handled)
            for (int j = i * i; j <= MAX_N; j += i) {
                is_prime[j] = false;
            }
        }
    }
}

Why start at \(i \times i\)?

This is a crucial optimization. When \(i = 5\), why don’t we cross out 10 (\(2 \times 5\)) or 15 (\(3 \times 5\))?

  • Because 10 was already crossed out by 2.
  • Because 15 was already crossed out by 3.

The first multiple of 5 that hasn’t been touched yet is \(5 \times 5 = 25\).

Complexity

This algorithm is famously efficient. Even though there are nested loops, it is surprisingly NOT \(O(N^2)\). It is \(O(N \log (\log N))\). That is almost linear. For \(N = 10^7\) (10 million), it runs in a fraction of a second.

Example Problem

Noldbach

Noldbach

Noldbach states that at least \(k\) prime numbers from \(2\) to \(n\) inclusively can be expressed as the sum of three integer numbers: two neighboring prime numbers and \(1\). For example, \(19 = 7 + 11 + 1\), or \(13 = 5 + 7 + 1\).

Two prime numbers are called neighboring if there are no other prime numbers between them.

Find out if Noldbach is right or wrong.

Solution

To solve this, we need a list of all prime numbers between \(2\) and \(n\). While you could test each number individually, the Sieve of Eratosthenes is faster for generating a list of primes in a range.

After listing all the primes up to \(n\), we should check for every two consecutive prime numbers \(p\) and \(q\), whether \(p + q + 1\) is a prime number (remember from last lesson how to do this!).

#include <iostream>
#include <vector>
using namespace std;

// Maximum number we want to check (e.g., 1 million)
const int MAX_N = 1000000;
vector<bool> is_prime(MAX_N + 1, true); // Assume all are prime first
vector<int> primes;

int findPrime(int n) {
    is_prime[0] = is_prime[1] = false; // Edge cases

    for (int i = 2; i * i <= MAX_N; i++) {
        // If i is not crossed out yet, it is a prime
        if (is_prime[i]) {
            // Cross out all multiples of i
            // Optimization: Start from i * i (smaller multiples are already handled)
            for (int j = i * i; j <= MAX_N; j += i) {
                is_prime[j] = false;
            }
        }
    }

    for (int i = 2; i <= n; ++i){
        if (is_prime[i]) // We only care about prime numbers
            primes.push_back(i);
    }

    int count = 0;
    for (int i = 1; i < (int)primes.size(); ++i){
        if (is_prime[primes[i] + primes[i-1] + 1]){ // check whether consecutive primes + 1 is a prime
            count++;
        }
    }

    return count;
}


int main() { 
    int n, k;
    cin >> n >> k;

    int count = findPrime(n);

    if (count >= k)
        cout << "YES\n";
    else
        cout << "NO\n";
    return 0;
}