Graphs

Definition

A graph is a mathematical structure that represents relationships between individual items called vertices. A single relationship is called an edge, which indicates that a pair of vertices are connected.

Example: Cities and Roads

A classic example of a graph is a network of cities connected by roads. Consider several cities in a country, some pairs of cities are connected by roads, while others are not. If you plan to travel from city \(A\) to city \(B\), you may need to pass through one or more intermediate cities along the way.

Why Use Graphs?

Representing problems as graphs allows us to solve problems efficiently using graph algorithms, such as:

  • “Is it possible to reach city \(B\) from city \(A\)?”
  • “What is the shortest path from city \(A\) to city \(B\)?”

Storing Various Graphs

There are various types of graphs and various ways to store each type of graph. We will discuss the types that most commonly appear in problems and the ways to store them.

Types

Direction

Edges can sometimes have a direction, this means that an edge \((U, V)\) and edge \((V, U)\) are different. This is usually stated in a problem using a term like unidirectional or directed whereas terms like bidirectional and undirected mean that the edges have no specific direction.

Weights

In some graphs, edges (and possibly vertices) can have one or more values associated with them. An example of this would be to assign a duration to the roads in our cities and roads example. We can say that edge \((U, V, W)\) means that travelling along the road between cities \(U\) and \(V\) requires \(W\) time.

Note

Notice that in the phrasing of \((U, V, W)\) no specific direction for the edge was mentioned, so it is safe to assume the edge is undirected.

Storing

Before we discuss storing methods, we will discuss some queries we might perform that will help us compare different storing methods and understand what are the strengths and weaknesses of each.

  • What is the space complexity of this method?
  • What is the time complexity of checking if the edge \((U, V)\) exists?
  • Given some vertex \(U\), what is the time complexity of looping over all vertices \(V\) adjacent to it?

We must also understand how a graph is given to us in the input of a problem. The most common way is to first read \(N\) and \(M\) which represent the number of vertices and the number of edges respectively, then in each of the next \(M\) lines a pair of integers will be given \(U\) \(V\) (additionally maybe some weight \(W\)). This means there is an edge between vertices \(U\) and \(V\) (and if the graph is directed its usually from \(U\) to \(V\) unless stated otherwise).

Tip

We recommend you take a look at the website: Graph Editor

Finally, its best practice to store all graph related containers globally. That is because most graph algorithms are best implemented with functions and its easier to access the graph if it was global. For this it is crucial to know the constraints on variables like \(N\) and \(M\) to properly set the sizes of our containers and avoid out of bounds access.

Caution

Here we assume that the graph cannot have multiple edges i.e a pair of vertices directly connected by several edges.

Array of Edges

This is the simplest way to store edges, since the edges are given as a list of pairs we will store it directly into an array of pairs.


const int maxN = 100'000;
const int maxM = 200'000;

pair<int,int> E[maxM];

int main () {
    int n, m;
    cin >> n >> m;

    for (int i = 0 ; i < m ; i++) {
        int u, v; cin >> u >> v;
        E[i] = {u, v};
    }
}
Note

For weighted graphs we can define a struct to hold all \(3\) values or use std::tuple or std::array. Generally it’s best practice to avoid nesting pairs.

Category Complexity
Space \(O(M)\)
Checking if an edge exists \(O(M)\)
Looping over neighbors of a vertex \(O(M)\)

Adjacency List

In this method we will create \(N\) lists, each one using a vector, where the \(i_{th}\) list stores all neighbors of vertex \(i\).


const int maxN = 100'000;
const int maxM = 200'000;

vector<int> adjacency_list[maxN + 1]; // add one because usually vertices are 1-indexed

int main () {
    int n, m;
    cin >> n >> m;

    for (int i = 0 ; i < m ; i++) {
        int u, v; cin >> u >> v;
        adjacency_list[u].push_back(v); // add v to the neighbors of u
        adjacency_list[v].push_back(u); // if undirected, add u to v's list as well
    }
}
Note

For weighted graphs we can use vector<pair<int,int>> where \((V, W)\) means \(V\) is a neighbor and the edge to it has weight \(W\).

Category Complexity
Space \(O(M)\)
Checking if an edge exists \(O(K)\) where \(K\) is the number of neighbors
Looping over neighbors of a vertex \(O(K)\) where \(K\) is the number of neighbors
Tip

Although looping over the neighbors seems slow, it is in fact \(O(M)\) for all vertices combined.

WarningChallenge

Can you modify this container to speed up “checking if an edge exists” without affecting the other queries.

Adjacency Matrix

This method creates an \(N \times N\) matrix of booleans \(G\) where \(G_{i,j}\) is true if vertex \(i\) is directly connected to vertex j and false otherwise.

const int maxN = 1'000;
const int maxM = 200'000;

bool G[maxN + 1][maxN + 1] // add one because usually vertices are 1-indexed

int main () {
    int n, m;
    cin >> n >> m;

    for (int i = 0 ; i < m ; i++) {
        int u, v; cin >> u >> v;
        G[u][v] = true;
        G[v][u] = true; // if the graph is undirected we need to write this line as well
    }
}
Note

For weighted graphs, instead of booleans, \(G_{i, j}\) can be an integer representing the weight of the edge from \(i\) to \(j\), but watch out for how you mark nonexistent edges.

Category Complexity
Space \(O(N^2)\)
Checking if an edge exists \(O(1)\)
Looping over neighbors of a vertex \(O(N)\)