Sieve of Eratosthenes
We know how to check if one number is prime (\(O(\sqrt{N})\)). But what if you need to find all prime numbers up to \(1,000,000\)?
Using the previous method inside a loop would be way too slow (\(N \times \sqrt{N}\)). We need a bulk operation. We need a sieve.
The Concept
Instead of checking numbers one by one, we filter them.
Imagine writing down all numbers from 1 to 20. 1. We know 2 is prime. Circle it. Now cross out all multiples of 2 (4, 6, 8, 10…). 2. The next uncrossed number is 3. Circle it. Cross out all multiples of 3 (6, 9, 12…). 3. The next uncrossed number is 5 (4 is crossed). Circle it. Cross out multiples…
By the time you reach the end, only the prime numbers remain uncrossed.
The Algorithm
We use a boolean array isPrime[] where isPrime[i] is true if i is potentially prime. Initially, we assume everything is prime.
Basic Steps:
- Create an array of booleans, set all to
true. - Mark
0and1asfalse(not prime). - Start loop from
i = 2. - If
iis prime (true), it means no smaller number divided it. - Loop through all multiples of
i(2*i,3*i…) and mark themfalse.
The Code
#include <iostream>
#include <vector>
using namespace std;
// Maximum number we want to check (e.g., 1 million)
const int MAX_N = 1000000;
vector<bool> is_prime(MAX_N + 1, true); // Assume all are prime first
void sieve() {
is_prime[0] = is_prime[1] = false; // Edge cases
for (int i = 2; i * i <= MAX_N; i++) {
// If i is not crossed out yet, it is a prime
if (is_prime[i]) {
// Cross out all multiples of i
// Optimization: Start from i * i (smaller multiples are already handled)
for (int j = i * i; j <= MAX_N; j += i) {
is_prime[j] = false;
}
}
}
}Why start at \(i \times i\)?
This is a crucial optimization. When \(i = 5\), why don’t we cross out 10 (\(2 \times 5\)) or 15 (\(3 \times 5\))?
- Because 10 was already crossed out by 2.
- Because 15 was already crossed out by 3.
The first multiple of 5 that hasn’t been touched yet is \(5 \times 5 = 25\).
Complexity
This algorithm is famously efficient. Even though there are nested loops, it is surprisingly NOT \(O(N^2)\). It is \(O(N \log (\log N))\). That is almost linear. For \(N = 10^7\) (10 million), it runs in a fraction of a second.
Example Problem
Noldbach
Noldbach states that at least \(k\) prime numbers from \(2\) to \(n\) inclusively can be expressed as the sum of three integer numbers: two neighboring prime numbers and \(1\). For example, \(19 = 7 + 11 + 1\), or \(13 = 5 + 7 + 1\).
Two prime numbers are called neighboring if there are no other prime numbers between them.
Find out if Noldbach is right or wrong.
Solution
To solve this, we need a list of all prime numbers between \(2\) and \(n\). While you could test each number individually, the Sieve of Eratosthenes is faster for generating a list of primes in a range.
After listing all the primes up to \(n\), we should check for every two consecutive prime numbers \(p\) and \(q\), whether \(p + q + 1\) is a prime number (remember from last lesson how to do this!).
#include <iostream>
#include <vector>
using namespace std;
// Maximum number we want to check (e.g., 1 million)
const int MAX_N = 1000000;
vector<bool> is_prime(MAX_N + 1, true); // Assume all are prime first
vector<int> primes;
int findPrime(int n) {
is_prime[0] = is_prime[1] = false; // Edge cases
for (int i = 2; i * i <= MAX_N; i++) {
// If i is not crossed out yet, it is a prime
if (is_prime[i]) {
// Cross out all multiples of i
// Optimization: Start from i * i (smaller multiples are already handled)
for (int j = i * i; j <= MAX_N; j += i) {
is_prime[j] = false;
}
}
}
for (int i = 2; i <= n; ++i){
if (is_prime[i]) // We only care about prime numbers
primes.push_back(i);
}
int count = 0;
for (int i = 1; i < (int)primes.size(); ++i){
if (is_prime[primes[i] + primes[i-1] + 1]){ // check whether consecutive primes + 1 is a prime
count++;
}
}
return count;
}
int main() {
int n, k;
cin >> n >> k;
int count = findPrime(n);
if (count >= k)
cout << "YES\n";
else
cout << "NO\n";
return 0;
}