Relations, Well-Foundedness, and Noetherian Induction

Joseph Sullivan

August 2020



If you’re acquainted with math, you’ve definitely seen induction before. To prove some property \(P(n)\) for all natural numbers \(n\), it’s sufficient to prove these two things:

  1. \(P(1)\) is true. This is referred to as our base case.

  2. \(P(n)\implies P(n+1)\). That is, by assuming \(P(n)\) true, we can show \(P(n+1)\). This is referred to as our induction step.

Let’s do an example. We’ll prove that for any natural number \(n\), a suitcase can be packed with \(n\) shirts.

Proof. We perform induction on \(n\), the number of shirts. Clearly a suitcase can be packed with one shirt, so our base case is satisfied. Now, suppose we can pack a suitcase with \(n\) shirts for a particular \(n\). Obviously we can always pack just one more shirt. Thus, we’ve proved our claim.


Okay, clearly this proof is invalid since we cannot pack arbitrarily many shirts in a suitcase. Our induction step is flawed: there’s a point where we cannot add just one more shirt. Nonetheless, the idea is there—we manually show that a claim is true for \(n=1\), then we show that we can always pack one more shirt, regardless of \(n\). Let’s talk about a different type of induction, one that’s a bit... stronger.

Strong Induction

As before, we want to prove a property \(P(n)\) for all natural numbers. It turns out that it’s also sufficient to prove the following:

  1. \(P(1)\) is true.

  2. \(P(1) \land \dots \land P(n-1) \implies P(n)\). That is, by assuming our property is true for all \(k<n\), we can show our property for \(n\).1

Intuitively, strong induction feels like it should work. If we prove \(P(1)\), then by our induction step we then deduce \(P(2)\). Thus, we have \(P(1)\) and \(P(2)\), so by our induction step we then deduce \(P(3)\). Thus, we have \(P(1),P(2),\) and \(P(3)\), so... You get the idea.

Of course “...” doesn’t constitute a mathematically rigorous proof. Later in this post we’ll prove that the Principle of Strong Induction follows from regular induction when we start talking about well-founded relations.


For the sake of completeness, let’s do an example of strong induction. We’ll prove that a tree with \(n\) vertices has \(n-1\) edges. A tree is a type of connected graph that has no cycles in it, e.g.

Now let’s prove our claim, using induction.

Proof. We perform induction on the number of vertices \(n\). For our base case, we consider a tree with one vertex. It must have zero edges, since otherwise there would be at least two vertices. Thus, our base case holds.

For our induction step, we assume that all trees with \(k<n\) vertices have \(k-1\) edges. Now, we consider an arbitrary tree with \(n\) vertices.

We can assume that \(n\geq 2\), since if \(n=1\), then we know our property is true by the base case. Therefore, we assume this tree has at least two vertices and likewise at least one edge. We choose an arbitrary edge in our tree, and we delete it. Since trees have no cycles, deleting this edge will break the tree into two components. Each component is a tree as well, since if either component contained a cycle, then the original tree would contain a cycle.

Let \(n_1\) be the number of vertices of the first component, and \(n_2\) the number of vertices of the other component. We know \(n=n_1+n_2\) since the component together with the deleted edge form the original tree. Likewise, by our induction hypothesis the first component tree has \(n_1-1\) edges, and the other contains \(n_2-1\) edges. Therefore, the original tree, containing both components and the extra edge, has \[(n_1-1) + (n_2-1) + 1 = n-1\] edges. Thus, our claim is true.


Induction on natural numbers is really powerful, especially since a lot of things are natural numbers—e.g. the number of leaves on a tree, the number of edges in a graph, the number of characters in a formula, etc.

But let’s spice it up a little bit.


So far we’ve been doing induction on the natural numbers. Don’t get me wrong—numbers are pretty cool. But like, math is so much more than numbers, you feel me? We want to generalize induction to a much larger class of problems.

We note that induction, in the forms that we’ve seen, takes advantage of the ordering of natural numbers. That is, we prove a property true for the minimum natural number, then we build upward from there. Thus, for us to generalize induction we first have to generalize ordering and relations.

For natural numbers, there are various well-known relations, e.g. \(<,\leq,>,\geq,=,\) and more. You might have also seen \(\mid\), the divides relation. For example, \(2\mid 4\), that is, \(2\) divides—or goes into—\(4\). On the other hand, \(3 \nmid 5\), and also \(5\nmid 3\).

We can also have relations on other sets, not just natural numbers. For example, if we have a graph, we can define a relation on the set of vertices:

For any two vertices \(u,v\), we either have \(u\) is adjacent to \(v\), or \(u\) is not adjacent to \(v\). For succinctness, we could say \(u A v\) to say \(u\) is adjacent to \(v\), and we call \(A\) our relation. For example, in our particular graph we have \(a A d\), but not \(a A e\).

Even more, we can have a relation relate objects that don’t even belong to the same set (“Heterogenuous Relation,” n.d.). For example, if \(A\) is the set of oceans on the Earth, and \(B\) is the set of continents, then we could have the relation \(R\), where an ocean \(a\in A\) relates to a continent \(b\in B\)—that is, \(a R b\)—if the ocean \(a\) borders the continent \(b\).

We see that a relation, call it \(R\), on sets \(X\) and \(Y\) is some sort of mechanism that tells us whether an object \(x\in X\) relates to \(y\in Y\). That is, a relation tells us either \(x R y\), or it tells us that it tells us that not \(x R y\).

This is the jist of it, but mathematicians have a somewhat unintuitive formalism for talking about relations. Let \(X\) and \(Y\) be sets, and let \(X\times Y\) be the cartesian product, i.e. all possible pairs \((x,y)\) where \(x\in X\) and \(y\in Y\). Then, \(R\) is a subset of \(X\times Y\), where \((x,y)\) is contained in \(R\) if and only if \(x R y\).

For example, let’s consider \(\leq\) on the natural numbers. In an odd way we say that \(\leq\hspace{0.03cm} \subseteq \mathbb{N}\times \mathbb{N}\). Is \((4,6)\) an element of \(\leq\)? Yes, because \(4\leq 6\). Is \((5,5)\in \leq\)? Again, yes, because \(5\leq 5\). However, \((6,4)\) is not an element of \(\leq\), since \(6\not\leq 4\).

As we’ve seen in our examples, relations have a wide variety of forms. Let’s define some properties to classify relations.


From here onward, we’re going to be considering only relations that act on a single set, i.e. we consider \(R\subseteq X\times X\). These are referred to as homogeneous relations, and seem to show up the most frequently.

In the interest of time, I’m going to list a bunch of properties with brief definitions, then focus in on a few of them (“Binary Relation,” n.d.).


One particularly important relation, the \(\leq\) relation on the natural numbers, is reflexive, antisymmetric, transitive, and total. It’s reflexive because \(n\leq n\). It’s antisymmetric because \(n\leq m\) and \(m\leq n\) implies \(n=m\). It’s transitive because \(n\leq m\) and \(m\leq k\) implies \(n\leq k\). Finally, it’s total because for all natural numbers \(n,m\), one of them is larger, or they are both equal. If \(m\) is larger, then \(n\leq m\). Otherwise, \(m\leq n\).

Partial Orders

The first three properties from the previous example—reflexivity, antisymmetry, and transitivity—turn out to be really important. In fact, we call any relation that obeys those three properties, a partial order. A set that has an associated partial order, is called a partially ordered set, or a poset for short. If \(X\) is our set in question and \(\preceq\) is our partial order, notationally we refer to \((X,\preceq)\) as our poset. For simplicity we often call the set \(X\) itself a poset.

Partial orders are a really powerful generalization of other more common orderings. We can now define concepts like maximum in a comprehensive way. Let \(X\) be a set, and let \(\preceq\) be an partial order on \(X\).

It turns out, however, that the terms maximum and minimum can be too restrictive, because partial orders can be somewhat weird.

For example, let \(N_2\) be the set of all natural numbers greater than or equal to \(2\), i.e. \(N_2=\{2,3,4,\dots\}\). Then, consider the relation \(\mid\), “divides.” You might have noticed that “divides” is reflexive, antisymmetric, and transitive, so therefore it’s a partial order. Does \(N_2\) have a minimum element with respect to this ordering?

We know that \(3,4,5,\dots\) can not be minimum elements, since \(k\nmid 2\) if \(k>2\) with respect to the traditional ordering. However, \(2\) is not a minimum element, because \(2\nmid 3\). Thus, \(N_2\) has no minimum element. At the same time, we did observe that there is nothing “below” \(2\), that for all \(k\in N_2\) \(k\neq 2\), \(k\nmid 2\). In fact, the same can be said for every prime number, since prime numbers are only divisible by themselves and \(1\), but \(1\not\in N_2\).

As stated above, a minimum element is smaller than every other element with respect to the partial order. Inspired by our example, however, we also define the term minimal and maximal:

In our example, the prime numbers are are minimal, since no other elements in \(N_2\) come before them. One thing to note is that minimal/maximal elements are not necessarily unique (e.g. the prime numbers), but minimum/maximum elements are unique, if they exist. This can be proved pretty quickly:

Proof. Let \((X,\preceq)\) be a partially ordered set with a maximum, and let \(x,y\) be (not necessarily distinct) maximum elements. Since \(x\) is maximum, in particular \(y\preceq x\). Likewise, because \(y\) is maximum, \(x\preceq y\). By antisymmetry, \(x=y\), so the maximum is unique.


The proof for the uniqueness of a minimum element is similar. It’s also worth mentioning that a minimum is automatically minimal and likewise a maximum is maximal. Likewise, if the minimum exists, there is only one minimal element—the minimum. Dually, if the maximum exists, there is only one maximal element—the maximum.

One more interesting fact—in a total order (a partial order that is total), an element is maximal if and only if it is the maximum, and an element is minimal if and only if it is the minimum.

Finally, it’s also worth mentioning that a partial order \(\preceq\) also has an associated strict partial order \(\prec\). If \((X,\preceq)\) is a poset, we say \(x\prec y\) if \(x\preceq y\) and \(x\neq y\). You can check for yourself that consequently \(\prec\) must be asymmetric, and transitive. Now let’s see some examples of these concepts in action.

Hasse Diagrams

For finite sets, it’s often convenient to specify a partial order through a directed graph. For example, let our set be \(X=\{a,b,c,d,e,f,g,h\}\), and we say that for arbitrary elements \(u,v\in X\), \(u\preceq v\) if and only if there is an arrow from the vertex associated with \(u\) to the vertex associated with \(v\) in the following directed graph:

Three things of note: (1) it’s implied that \(u\preceq u\) for all vertices \(u\in X\), (2) not all directed graphs give rise to a partial order since some graphs might not obey transitivity or antisymmetry, and (3) this graph is really ugly. Mathematicians figured out a nice way to mostly fix (2) and definitely help out (3), through what’s referred to as a Hasse Diagram (“Hasse Diagram,” n.d.).

If you look at our directed graph, a lot of the arrows could be removed if we infer transitivity. For example, in our graph we have arrows \(\overrightarrow{ab}\), \(\overrightarrow{be}\), and \(\overrightarrow{ae}\). However, if we chose to infer transitivity, we could take away \(\overrightarrow{ae}\). Continuing, we remove all unnecessary arrows in a process called transitive reduction. We get the following graph:

The backwards process when we infer transitivity, is called the transitive closure. This makes the graph a lot prettier, and this also ensures that the inferred relation is transitive. Often people remove the arrows and assume that the edges point in the upwards direction. The only problem is that our graph might not be antisymmetric. For example, on the set \(Y=\{a,b,c\}\), one might specify the graph

which represents a relation that is not antisymmetric, since \(a\preceq b\) and \(b\preceq a\), but \(a\neq b\).

Now that we have a convenient way to represent partial orders on finite sets, let’s visualize what it means for an element to be minimal versus a minimum. Let’s look at the set \(X=\{a,b,c,d,e,f,g\}\) with the partial order on the left and the set \(X'=\{b,c,d,e,f,g\}\) with the partial order on the right:

The element \(a\in X\) is the minimum of \(X\), since for all elements \(x\in X\), \(a\preceq x\). Intuitively, we can draw a path upward from \(a\) to get to any other element in the graph.

However, we don’t have a minimum element in \(X'\). Rather, \(b,c,d\in X'\) are all minimal elements. None of them are the minimum, since they do not compare to each other. Nonetheless, they are all still minimal since no element precedes them.

We’ve built an understanding of partial orders, but there’s still one more property of a relation that we need to discuss before we can generalize induction.


We started our discussion of partial orders by talking about the standard ordering \(\leq\) on the natural numbers. We saw that this ordering is reflexive, symmetric, and transitive. It also has another property that we need for induction: well-foundedness.

Definition 1. A partially ordered set \((X,\preceq)\) is well-founded if every nonempty subset contains at least one minimal element.

To motivate this definition, let’s look back regular induction. We prove a claim for the base case, i.e. \(P(1)\), and then the induction step. Then, our claim is true for all natural numbers \(n\).

One way of reasoning about why induction works is because we do not have any infinite descending chains. For example, \(P(7)\) is true, because have the finite descending chain \[1<2<\dots<6<7.\] Then, because have have the base case \(P(1)\) and the induction step, we get \(P(2)\). From \(P(2)\) we get \(P(3)\). From \(P(3)\) we get \(P(4)\). In a finite number of steps, we get to \(P(7)\).

Let’s be a little bit more formal about this. Let \((X,\preceq)\) be a partially ordered set. An infinite descending chain \(C\) is an infinite sequence \(\{x_n\}_{n=1}^\infty\) such that \(x_{n+1}\prec x_n\). This brings us to the following theorem: (Klappenecker, n.d.; “Condition for Well-Foundedness,” n.d.)

Theorem 1. Let \((X,\preceq)\) be a well-founded partially ordered set. Then \(X\) has no infinite descending chain.

Proof. Let \((X,\preceq)\) be a poset. Assume that \(X\) is well-founded. Suppose, to the contrary, that \(X\) contains an infinite descending chain \(C\). Since \(C\) is a nonempety subset of \(X\) and because \(X\) is well-founded, we know \(C\) has a minimal element \(y\). Since \(y\in C\), for some \(n\), \(y=x_n\). However, by definition, \(x_{n+1}\preceq x_n\) and \(x_{n+1}\neq x_n\), so \(y\) is not a minimal element. Thus, we have a contradiction.


Having no infinite descending chains is referred to as the descending chain condition (DCC). Now we have a feeling about why well-foundedness is useful—it prevents infinite descending chains, meeting the descending chain condition. In fact, the reverse implication can also be deduced with the axiom of choice, but maybe we’ll save that for another day. We first started talking about well-foundedness when discussing natural numbers, but we haven’t formally proved that they are actually well-founded. Let’s do something slightly stronger (pahio 2013).

Theorem 2. Let \(\mathbb{N}\) be the natural numbers with the standard order \(\leq\). Then, every nonempty subset of \(\mathbb{N}\) has a minimum element.

Proof. Let \(A\) be an arbitrary nonempty subset of \(\mathbb{N}\), and let \(C\) be the set of natural numbers less than or equal to all elements of \(A\). That is, \[C = \{x\in \mathbb{N}: x\leq a \; \forall a\in A\}.\] Since \(1\leq n\) for all natural numbers \(n\), we know \(1\in C\).

Since \(A\) is nonempty, there exists \(a\in A\), and in particular \(a+1\not\in C\) because \(a+1 \not\leq a\). Therefore, \(C\neq \mathbb{N}\). Suppose, to the contrary, that for all \(c\in C\), \(c+1\in C\). By induction this implies that \(C=\mathbb{N}\), which is a contradiction.

Thus, there exists \(c_0\in C\) such that \(c_0+1\not\in C\). Since \(c_0+1\not\in C\), this implies that there exists \(a_0\in A\) such that \(c_0+1\not\leq a_0\), i.e. \(a_0 < c_0+1\). This means \(a_0\leq c_0\), and by definition of \(c_0\in C\) we know \(c_0\leq a_0\). By transitivity, \(c_0=a_0\).

Finally, since \(a_0\in C\), we know \(a_0\leq a\) for all \(a\in A\). Therefore, \(a_0\in A\) is the minimum element of \(A\).


Recall from before that minimum elements are automatically minimal. Thus, the natural numbers (with the usual ordering) are well-founded. For natural numbers we proved an even stronger condition. When a poset obeys the property that every nonempty subset has a minimum element, we say the poset is well-ordered. Thus, the natural numbers are well-ordered.


It’s worth discussing some more eccentric well-founded partial orders. Let \(X_1,X_2,\dots,X_n\) be sets with partial orders \(\preceq_1,\preceq_2,\dots,\preceq_n\) respectively.

We can then define an order on the Cartesian product \(\prod_{i=1}^n X_i = X_1\times \dots \times X_n\), the set of all \(n\)-tuples \((x_1,\dots,x_n)\) where \(x_i\in X_i\).

The product order \(\preceq\) is defined by \[(x_1,\dots,x_n) \preceq (y_1,\dots,y_n)\] if and only if \(x_i \preceq_i y_i\) for \(i=1,\dots,n\). You can verify yourself that the product order is reflexive, antisymmetric, and transitive.

For instance, consider \(\mathbb{N}\times \mathbb{N}\times \mathbb{N}\) with the product order. In this example, \((2,2,2)\preceq (2,3,4)\), but \((2,2,2)\not\preceq (1,2,3)\), because in the first coordinate we have \(2\not\leq 1\).

In general, when is the product order well-founded? We prove this theorem to provide a sufficient condition.

Theorem 3. Let \(X_1,\dots,X_n\) be posets with well-founded partial orders \(\preceq_1,\dots,\preceq_n\). Then, the product order on \(\prod_{i=1}^n X_i\) is well-founded.

Proof. By induction on \(n\). When \(n=1\), the product order is just the given well-founded partial order, so it is well-founded by assumption.

Then, assume that the theorem is true for \(n\), that when \(X_1,\dots,X_n\) are well-founded partially ordered sets, then \(\prod_{i=1}^n X_i\) with the product order is a well-founded partial order.

Now, let \(X_1,\dots,X_n,X_{n+1}\) be well-founded partially ordered sets. The product order \(\prod_{i=1}^{n+1} X_i\) is in fact2 the same as the product order on \(\left(\prod_{i=1}^n X_i\right) \times X_{n+1}\). Both terms here are well-founded: the left by our induction hypothesis, and the right by assumption. Thus, it suffices to show the well-foundedness of the product order of a pair of well-founded partial orders in order to show the \(n+1\) case.

We refer to our pair of well-founded posets as \(X_1,X_2\) and prove \(X_1\times X_2\) is a well-founded poset with the product order. Let \(S\) be a nonempty subset of \(X_1\times X_2\), and let \(S_1\) be the projection of \(S\) onto the first coordinate, that is, \[S_1 = \{y\in X_1 : \exists x_2 (y,x_2)\in S\}.\] Since \(S_1 \subseteq X_1\) and \(S_1\) is nonempty because \(S\) is nonempty, we know \(S_1\) has a minimal element, call it \(y_1\), because \(X_1\) is well-founded. Then, let \(S_2\) be the set of \(x_2\in X_2\) such that \((y_1,x_2)\in S\).

We note that \(S_2\) is a nonempty subset of \(X_2\), because if \(y_1\in S_1\), then there must be some pair \((y_1,x_2)\in S\) by definition. Furthermore, because \(S_2\subseteq X_2\) and \(X_2\) is well-founded, there must be a minimal element \(y_2\in S_2\). This then implies that \((y_1,y_2)\in S\). We claim that this element is minimal in \(X_1\times X_2\) with the product order.

Suppose that \((z_1,z_2) \preceq (y_1,y_2)\). This implies that \(z_1\preceq_1 y_1\) and \(z_2\preceq_2 y_2\). Recall our definition of set \(S_1\). Since \(z_1\in S_1\) and \(y_1\) is a minimal element, we cannot have \(z_1\preceq y_1\) unless \(z_1=y_1\). Thus, \(z_1=y_1\), and we are now considering \((y_1,z_2)\preceq (y_1,y_2)\).

Now, recall our definition of \(S_2\). Since \(z_2\) is paired with \(y_1\) in \(S\), we know \(z_2\in S_2\). Additionally, we know that \(y_2\) is a minimal element in \(S_2\). Therefore, \(z_2\preceq_2 y_2\) likewise implies \(z_2=y_2\).

Thus, \((z_1,z_2)\preceq (y_1,y_2)\) implies \((z_1,z_2)=(y_1,y_2)\). Therefore, \((y_1,y_2)\) is minimal, and our theorem is proved.


There are several other examples of well-founded partial orders: the lexicographic ordering on the Cartesian product of well-founded orders, the set of finite subsets of natural numbers with inclusion as the ordering, etc (Klappenecker, n.d.). Finally, now that we have a grasp on well-foundedness, we can generalize induction.

Noetherian Induction

Let \((X,\preceq)\) be a set with a well-founded partial order. The Principle of Noetherian Induction states, to prove a property \(P(x)\) for all \(x\in X\), it suffices to prove

  1. Induction basis. \(P(x)\) is true for all minimal elements \(x\in X\).

  2. Induction step. \(P(y)\) is true assuming \(P(x)\) for all \(x\prec y\), for all non-minimal \(y\in X\).

Let’s prove that Noetherian induction works, that proving these two statements is sufficient (Klappenecker, n.d.).

Proof. Let \((X,\preceq)\) be a set with a well-founded partial order, and let \(P(x)\) be a property. Suppose that we have proved (1) and (2). We want to show that \(P(x)\) is true for all \(x\in X\). Let \(C\) be the set of counterexamples to our property \(P(x)\). That is, \[C=\{x\in X : P(x) \text{ is false.}\}.\] Suppose, to the contrary, that \(C\) is not empty. Then, because \(C\subseteq X\) and \(X\) is well-founded, this implies \(C\) has a minimal element \(y\). If \(y\) is also minimal in the entirety of \(X\), then we know \(P(y)\) by (1) so \(y\not\in C\) and we have a contradiction.

Otherwise, we know that for all \(x\in X\) such that \(x\prec y\), \(x\not\in C\). In other words, because \(y\) is a minimal counterexample, anything that precedes \(y\) is not a counterexample.

Therefore, we know \(P(x)\) for all \(x\prec y\), so by (2) we know \(P(y)\). This implies \(y\not\in C\), which is a contradiction.

Thus, the set of counterexamples is empty, so \(P(x)\) is true for all \(x\in X\).


This all might seem a little abstract and vague, but it turns out to be really useful. Let’s run through an example.


Let’s prove that any natural number greater than or equal to 2 can be written as a product of primes. Recall from our discussion of partial orders, that \(N_2=\{2,3,4,\dots\}\) is a poset with the relation \(\mid\), “divides.” This poset is in fact well-founded, with an element being minimal if and only if it is prime. We leave proving these two statements as an exercise to the reader.

As a result of these two properties, to prove our claim we can perform Noetherian induction on \(N_2\) with respect to the \(\mid\) order. Let’s do that.

Proof. By Noetherian induction on \(n\in N_2\). All of our minimal elements can be written as a product of primes trivially, since a prime number \(p\) can be simply written as itself.

Now, let \(n\) be a non-minimal (i.e. composite) natural number, and assume that all numbers \(m\) preceding \(n\) (i.e. all divisors of \(n\)) can be written as a product of primes. Since \(n\) is composite, it can be written as product \(xy\). Thus, \(x \mid n\) and \(y \mid n\). By our induction hypothesis, we know \(x,y\) can be written as product of primes, i.e. \(x = p_{x_1} \cdots p_{x_k}\) and \(y=p_{y_1}\cdots p_{y_j}\), where each \(p_{x_i}\) or \(p_{y_i}\) are not necessarily distinct.

Therefore, since \(n=xy\), we know \(n=p_{x_1} \cdots p_{x_k}p_{y_1}\cdots p_{y_j}\), so \(n\) is a product of primes.


To be fair, this proof is overly complicated. We could have gotten away with strong induction, using the fact that \(x,y<n\). However, you might have realized that strong induction is, in fact, a special case of Noetherian Induction, using the regular ordering of the natural numbers, which we had proved earlier is well-founded.

I hope to showcase more applications of Noetherian induction in the future. Maybe I’ll explain how to prove that a certain strategy is optimal in a game called Nim.

“Binary Relation.” n.d. Wikipedia.
“Condition for Well-Foundedness.” n.d. Proof Wiki.
“Hasse Diagram.” n.d. Wikipedia.
“Heterogenuous Relation.” n.d. Wikipedia.
Huang, Guan-Shieng. 2008. “Fundamentals of Mathematics Lecture 5: Induction.” National Chi Nan University, Taiwan. 2008.
Klappenecker, Andreas. n.d. “Noetherian Induction.”
pahio. 2013. “Natural Numbers Are Well-Ordered.” 2013.

  1. the symbol \(\land\) means “and”↩︎

  2. Proof is left as exercise to the reader.↩︎