Sylow Theorems

Joseph Sullivan

December 2021

My first time going through the Sylow theorems was not a good time. I really didn’t get what was going on, and I think that a lot of this struggle could have been avoided if the lectures/textbooks didn’t treat the Sylow theorems as one big bundle, but rather as independent results that are all useful in discovering the structure of a group of finite order.

For this reason, my notes are structured in three parts: group actions, the existence of Sylow subgroups, and the action on the Sylow subgroups.

Group Actions

Definition 1. A group action of a group \(G\) on a set \(X\) is a homomorphism \[\mu: G\to \mathop{\mathrm{Perm}}(X),\] where \(\mathop{\mathrm{Perm}}(X)\) denotes the permutation group (set of bijections under composition) of \(X\). Then, we use the notation \[g\cdot x = \mu(g)(x)\] for \(g\in G\), \(x\in X\).

Definition 2. We define some notions related to a group action. Let \(G\) act on \(X\).

  • For \(x\in X\), define the stabilizer of \(x\) \[\mathop{\mathrm{Stab}}(x) = \{g\in G \operatorname{\big|}g\cdot x = x\},\] i.e., the set of \(g\in G\) that fix \(x\).

  • For \(x\in X\), define the orbit of \(x\) \[G\cdot x = \{g\cdot x \in X \operatorname{\big|}g\in G\},\] i.e., the set of \(G\)-translations of \(x\).

  • We say the action is transitive if there is only one orbit.

Proposition 1. Let \(G\) act on \(X\). For all \(x\in X\), \(\mathop{\mathrm{Stab}}(x)\) is a subgroup of \(G\).

Remark 1. In finite group theory, one very important result on group actions is the Orbit Stabilizer Theorem. It allows us to go between the study of orbits and stabilizer subgroups.

Theorem 1 (Orbit Stabilizer Theorem). Let \(G\) be a group which acts on \(X\). Then, for all \(x\in X\), \[|G\cdot x| = [G: \mathop{\mathrm{Stab}}(x)].\]

Remark 2. Before proving the Orbit Stabilizer Theorem, let’s talk about why it should feel true. The size of an orbit is the number of distinct \(G\)-translations of \(x\). The orbit stabilizer theorem just says that the number of distinct \(G\)-translations is the size of \(G\) (number of possibly redundant translations) divided by redundancies (the size of the stabilizer).

In this sense, the Orbit Stabilizer Theorem is very much like the First Isomorphism Theorem—the orbit is like the image of a homomorphism, the group \(G\) is like the domain, and the stabilizer is like the kernel.

Let’s now prove the theorem.

Proof. We define a bijection between \(G\cdot x\) and the set of (left) \(\mathop{\mathrm{Stab}}(x)\) cosets, which we denote by \(G/\mathop{\mathrm{Stab}}(x)\), \[\begin{aligned} f: G/\mathop{\mathrm{Stab}}(x) &\longrightarrow G\cdot x\\ g\mathop{\mathrm{Stab}}(x) &\longmapsto g\cdot x. \end{aligned}\] We have to show that \(f\) is well-defined on coset representatives, that \(f\) is injective, and that \(f\) is surjective.

First, \(f\) is well-defined because if \(g\mathop{\mathrm{Stab}}(x) = h\mathop{\mathrm{Stab}}(x)\), then \(h^{-1}g\in \mathop{\mathrm{Stab}}(x)\), so \(h^{-1}g \cdot x = x\). Therefore, \(g\cdot x = h\cdot x\), so \(g\mathop{\mathrm{Stab}}(x), h\mathop{\mathrm{Stab}}(x)\) have the same image.

Next, we show \(f\) is injective. Assume \(g\cdot x = h\cdot x\). Then, \(h^{-1}g\cdot x = x\), so \(h^{-1}g\in \mathop{\mathrm{Stab}}(x)\) and \(g\mathop{\mathrm{Stab}}(x) = h\mathop{\mathrm{Stab}}(x)\).

Finally, \(f\) is surjective because if \(g\cdot x \in G\cdot x\), then \(g\cdot x\) is the image of \(g\mathop{\mathrm{Stab}}(x)\).

Thus, \(f\) is a bijection, so \[\begin{aligned} |G/\mathop{\mathrm{Stab}}(x)| &= |G\cdot x|\\ [G:\mathop{\mathrm{Stab}}(x)] &= |G\cdot x|. \end{aligned}\] ◻
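To see the theorem in action, here is a small brute-force check in Python. It is only a sketch under my own assumptions: the choice of \(G=S_4\) acting on the 2-element subsets of \(\{0,1,2,3\}\), and the helper names `act`, `orbit`, and `stabilizer`, are illustrative and not part of the notes above.

```python
from itertools import permutations, combinations

# G = S_4, stored as tuples g where g[i] is the image of i (illustrative choice).
G = list(permutations(range(4)))

# X = the 2-element subsets of {0, 1, 2, 3}; g acts by applying g to each element.
X = [frozenset(c) for c in combinations(range(4), 2)]

def act(g, x):
    """The translate g . x of the subset x."""
    return frozenset(g[i] for i in x)

def orbit(x):
    """The orbit G . x, i.e. all distinct translates of x."""
    return {act(g, x) for g in G}

def stabilizer(x):
    """Stab(x), i.e. all g in G that fix x."""
    return [g for g in G if act(g, x) == x]

for x in X:
    # Orbit Stabilizer: |G . x| = [G : Stab(x)] = |G| / |Stab(x)| for finite G.
    assert len(orbit(x)) * len(stabilizer(x)) == len(G)

x0 = X[0]
orbit_size = len(orbit(x0))
stab_size = len(stabilizer(x0))
print(orbit_size, stab_size)
```

Here the action is transitive, so every orbit has size 6 and every stabilizer has size 4, which matches \(24/4=6\).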

Let’s take a moment to talk about a few group actions that are important for studying the structure of groups. We’ll see these examples later when we prove the Sylow theorems.

Example 1. For \(g\in G\), let \(c_g: G\to G\) denote conjugation by \(g\). Then, we have a group action of \(G\) on itself by conjugation, i.e., \[\begin{aligned} \varphi: G &\longrightarrow \mathop{\mathrm{Perm}}(G)\\ g &\longmapsto c_g, \end{aligned}\] so \(g\cdot h = ghg^{-1}\). For \(g\in G\), \[\mathop{\mathrm{Stab}}(g) = C_G(g),\] the centralizer of \(g\), i.e., the elements that commute with \(g\). Then, \[G\cdot g = \mathop{\mathrm{Cl}}(g),\] the conjugacy class of \(g\). We can then write \(G\) as the disjoint union of its distinct orbits (the conjugacy classes). Let \(g_1,\dots,g_n\) represent the conjugacy classes. We then have \[G = \bigsqcup_{i=1}^n \mathop{\mathrm{Cl}}(g_i).\] If \(g\in Z(G)\) (the center, the set of elements that commute with all of \(G\)), then \(\mathop{\mathrm{Cl}}(g)=\{g\}\). Therefore, if we let \(h_1,\dots,h_m\) represent the non-center conjugacy classes, then \[G = Z(G) \sqcup \bigsqcup_{i=1}^m \mathop{\mathrm{Cl}}(h_i),\] so \[|G| = |Z(G)| + \sum_{i=1}^m |\mathop{\mathrm{Cl}}(h_i)|.\] By the Orbit Stabilizer Theorem, \(|\mathop{\mathrm{Cl}}(h_i)| = [G: C_G(h_i)]\), so \[|G| = |Z(G)| + \sum_{i=1}^m [G: C_G(h_i)].\] This equation is referred to as the class equation, and it is very useful for deducing things about the center or centralizers.
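The class equation can be checked by hand on a small group, or by a short script. The following Python sketch does this for the dihedral group of order 8, realized as permutations of the vertices of a square. The choice of group, the generators `r` and `s`, and the helper names `compose`, `inverse`, `generate`, and `conj_class` are all my own assumptions for illustration.

```python
def compose(g, h):
    """The permutation g after h, i.e. i -> g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    """The inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def generate(gens):
    """Closure of gens under composition (this is the generated subgroup, since it is finite)."""
    identity = tuple(range(len(gens[0])))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

# Symmetries of a square with vertices 0, 1, 2, 3 (an illustrative choice of group).
r = (1, 2, 3, 0)   # rotation: i -> i + 1 mod 4
s = (0, 3, 2, 1)   # reflection fixing vertices 0 and 2
G = generate([r, s])

def conj_class(h):
    """The orbit of h under conjugation by G."""
    return frozenset(compose(compose(g, h), inverse(g)) for g in G)

classes = {conj_class(h) for h in G}
center = [h for h in G if len(conj_class(h)) == 1]
noncentral_sizes = sorted(len(c) for c in classes if len(c) > 1)

# Class equation: |G| = |Z(G)| + sum of the sizes of the non-central classes.
assert len(G) == len(center) + sum(noncentral_sizes)
print(len(G), len(center), noncentral_sizes)
```

For this group the script reports \(8 = 2 + (2+2+2)\): a center of order 2 and three non-central classes of size 2.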

Example 2. Let \(H_1,\dots,H_n\) be subgroups of \(G\) that are conjugate to each other and closed under conjugation (i.e., for all \(g\in G\), \(i=1,\dots,n\), \(gH_ig^{-1} = H_j\) for some \(j\)). Then, \(G\) acts on the set \(\{H_1,\dots,H_n\}\) by conjugation, \(g\cdot H_i = gH_ig^{-1}\), and \[\mathop{\mathrm{Stab}}(H_i) = N_G(H_i),\] the normalizer of \(H_i\), i.e., the elements \(g\in G\) such that \(gH_ig^{-1} = H_i\).

Existence of Sylow Subgroups

Our entire focus right now is to guarantee the existence of certain subgroups called Sylow subgroups. These large subgroups are very useful for classifying groups of finite order or deducing properties like simplicity.

Definition 3. Let \(p\) be a prime, \(G\) a group of order \(mp^\alpha\), \(p\nmid m\).

  • A \(p\)-group is a group \(P\) of order \(p^\beta\) for some \(\beta\in \mathbb{N}\).

  • A \(p\)-subgroup of \(G\) is a subgroup of \(G\) which is a \(p\)-group.

  • A Sylow \(p\)-subgroup of \(G\) is a \(p\)-subgroup of \(G\) of order \(p^\alpha\) (i.e., the highest power of \(p\) possible).

Theorem 2 (Sylow Theorem 1). Let \(G\) be a finite group and \(p\) a prime. Then, there exists a Sylow \(p\)-subgroup of \(G\).

We outline the strategy. We will prove this theorem by induction on the size of \(G\). Therefore, we need to show how to reduce the problem of finding a Sylow \(p\)-subgroup of \(G\) to finding a Sylow \(p\)-subgroup of a group of smaller order.

There are two strategies that we could try to reduce the size of \(G\).

  1. Take a normal subgroup \(N\) of order \(p\), apply our induction hypothesis to the quotient \(G/N\), and then pull back a Sylow \(p\)-subgroup of \(G/N\) to a Sylow \(p\)-subgroup of \(G\).

  2. Find a proper subgroup \(H\) of \(G\) whose order is divisible by the same power of \(p\) as \(|G|\), and apply the induction hypothesis to \(H\).

We will show that we can always do one of these two strategies. We will start with some lemmas that will help us implement strategy 1. Feel free to skip the lemmas/proofs, and go back once you see that it is necessary.

Lemma 1. Let \(G\) be a group, \(x\in G\) of order \(n<\infty\), \(a\in \mathbb{Z}\setminus \{0\}\). Then, \[|x^a|=\frac{n}{\gcd(n,a)}\]

Proof. Let \(d=\gcd(n,a)\), so we can write \[n=db, \qquad a=dc\] for \(b,c\) coprime. We want to show that \(|x^a| = b\). We will show that \[|x^a| \operatorname{\big|}b, \qquad b \operatorname{\big|}|x^a|.\] First, we have \[(x^{a})^b = x^{dcb} = (x^{db})^c = (x^n)^c = 1.\] This implies \(|x^a| \operatorname{\big|}b\).

Next, we have \[(x^{a})^{|x^a|} = x^{a|x^a|}= 1,\] so \(n\operatorname{\big|}a|x^a|\), and (removing a factor of \(d\)) we get \[b \operatorname{\big|}c|x^a|.\] By construction \(b,c\) are coprime, so \(b\operatorname{\big|}|x^a|\). Thus, \(|x^a| = b\). ◻
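As a sanity check on Lemma 1, we can compute orders by brute force. The sketch below works in the cyclic group \(\left\langle x\right\rangle \cong \mathbb{Z}/n\) written additively, so \(x^a\) corresponds to \(a \bmod n\). The function name `order_of_power` and the tested ranges are my own choices for illustration.

```python
from math import gcd

def order_of_power(n, a):
    """Order of x^a when x has order n, computed in <x> = Z/n (additive notation)."""
    k = 1
    # The order is the least k >= 1 with k * a = 0 in Z/n.
    while (k * a) % n != 0:
        k += 1
    return k

# Compare the brute-force order with the formula n / gcd(n, a) from Lemma 1.
for n in range(1, 30):
    for a in range(-30, 31):
        if a != 0:
            assert order_of_power(n, a) == n // gcd(n, a)

print(order_of_power(12, 8))  # 12 / gcd(12, 8) = 3
```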

Lemma 2 (Cauchy’s Theorem for Abelian Groups). Let \(G\) be an abelian group such that \(p \operatorname{\big|}|G|\). Then, there exists an element of order \(p\).

Proof. We prove by strong induction on \(|G|\). The base case \(|G|=p\) is trivial, just take any non-identity element. Otherwise, assume \(|G|>p\), and take some non-identity element \(x\in G\). Either \(x\) will have order which is a multiple of \(p\) (this case is nice), or we will have to pass to a quotient, apply our induction hypothesis, and pull back an element from the quotient.

In our first case, assume \(p \operatorname{\big|}|x|\). Then \(|x|=pn\), so \(x^n \in G\) has order \(p\) by Lemma 1.

Otherwise, assume \(p \nmid |x|\), and let \(N=\left\langle x\right\rangle\). Since \(G\) is abelian, \(N\) is normal (all subgroups are normal in an abelian group), so we can consider the quotient \(G/N\). If we write \(|G|=mp\), then \[|G/N| = \frac{mp}{|N|}.\] Since \(p\nmid |x| = |N|\), we will still have \(p\operatorname{\big|}|G/N|\). Since \(x\) is not the identity, \(G/N\) is strictly smaller than \(G\).

Thus, we can apply the induction hypothesis and get an element \(\overline{y} \in G/N\) of order \(p\). Choose a preimage \(y\in G\). We now argue that \(p \operatorname{\big|}|y|\). We know \(\left\langle y^p\right\rangle \leq \left\langle y\right\rangle\) because \(y^p \in \left\langle y\right\rangle\), but \(\left\langle y^p\right\rangle\neq \left\langle y\right\rangle\) because \(y^p \in N\) (i.e., \(\overline{y^p} = 1\)) but \(y\not\in N\). Therefore, \(|y^p|\) strictly divides \(|y|\), so by Lemma 1 \[\frac{|y|}{\gcd(|y|,p)} \neq |y|,\] so \(\gcd(|y|,p)\neq 1\), so \(p\operatorname{\big|}|y|\). This puts us back in the first case, so we always have an element of order \(p\). ◻
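To make Lemma 2 concrete, the following Python sketch lists the elements of prime order in a small abelian group. The group \(\mathbb{Z}/6\times\mathbb{Z}/4\) and the helper names `element_order` and `elements_of_order` are my own assumptions for illustration, and the search is plain brute force rather than the inductive argument of the proof.

```python
from itertools import product

def element_order(x, moduli):
    """Order of x in Z/n_1 x ... x Z/n_k (additive), found by repeated addition."""
    zero = tuple(0 for _ in moduli)
    k, y = 1, x
    while y != zero:
        y = tuple((yi + xi) % n for yi, xi, n in zip(y, x, moduli))
        k += 1
    return k

def elements_of_order(p, moduli):
    """All elements of order exactly p in the product of cyclic groups."""
    group = product(*(range(n) for n in moduli))
    return [x for x in group if element_order(x, moduli) == p]

moduli = (6, 4)  # the abelian group Z/6 x Z/4 of order 24 (illustrative choice)

# 2 and 3 divide 24, so Lemma 2 promises elements of order 2 and of order 3.
assert elements_of_order(2, moduli)
assert elements_of_order(3, moduli)
print(len(elements_of_order(2, moduli)), len(elements_of_order(3, moduli)))
```

In this group there are 3 elements of order 2 and 2 elements of order 3, and none of order 5, as expected since \(5\nmid 24\).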

Proof of Sylow Theorem 1. We prove the theorem by induction on the order of \(G\). For our base case, if \(|G|=1\), then \(G\) itself is a Sylow \(p\)-subgroup.

Now, assume \(G\) has order \(mp^\alpha\) (such that \(p\nmid m\)).

In order to apply strategy 1, we would like to find a normal subgroup of order \(p\) so that when we pull back a Sylow \(p\)-subgroup, we would already be done. To “find” such a subgroup, we’ll try looking in the center. We consider two cases: \(p \operatorname{\big|}|Z(G)|\) and \(p \nmid |Z(G)|\).

Case 1. \(p\operatorname{\big|}|Z(G)|\). By Lemma 2 applied to the abelian group \(Z(G)\), there is then an element \(x\in Z(G)\) of order \(p\), and because \(x\) is in the center, \(N=\left\langle x\right\rangle\) is a normal subgroup. Then, \(G/N\) is a group of order \(mp^{\alpha-1}\). By induction \(G/N\) has a Sylow \(p\)-subgroup \(\overline{P}\), which has preimage \(P\) under projection. Then, \(P/N = \overline{P}\), so \(P\) has order \(p^{\alpha-1}\cdot p = p^\alpha\). Thus, \(P\) is a Sylow \(p\)-subgroup of \(G\).

Case 2. \(p\nmid |Z(G)|\). Letting \(g_1,\dots,g_r\) be representatives of the non-central conjugacy classes, we have by rearranging the class equation \[|G| - \sum_{i=1}^r [G: C_G(g_i)] = |Z(G)|.\] We then cannot have \(p\operatorname{\big|}|G|\) and \(p\operatorname{\big|}[G: C_G(g_i)]\) for all \(i=1,\dots,r\) because this would imply \(p\operatorname{\big|}|Z(G)|\). Thus, either \(p\nmid |G|\), in which case \(\alpha=0\) and the trivial subgroup is a Sylow \(p\)-subgroup, or for some \(i\), \(p\nmid [G:C_G(g_i)]\). In the latter case, \[[G:C_G(g_i)] \cdot |C_G(g_i)| = mp^\alpha,\] so \(p^\alpha \operatorname{\big|}|C_G(g_i)|\). Additionally, \(|C_G(g_i)| < |G|\) because \(g_i\not\in Z(G)\), so some element of \(G\) does not commute with \(g_i\) and is not in \(C_G(g_i)\). Therefore, we can apply our induction hypothesis to get a Sylow \(p\)-subgroup \(P\) of \(C_G(g_i)\), which has order \(p^\alpha\). Then, \(P\) is a Sylow \(p\)-subgroup of \(G\) itself. ◻

Acting on the Sylow Subgroups

Now that we have established the existence of Sylow \(p\)-subgroups, we want to deduce the remaining Sylow theorems by studying group actions on Sylow \(p\)-subgroups. First, we define some notation.

Definition 4. Let \(p\) be a prime, \(G\) a finite group. We write \(\mathrm{Syl}_p(G)\) for the collection of Sylow \(p\)-subgroups of \(G\), and \(n_p=|\mathrm{Syl}_p(G)|\) for the number of Sylow \(p\)-subgroups.

Proposition 2. For any subgroup \(H\leq G\), \(H\) acts on \(\mathrm{Syl}_p(G)\) by conjugation. The stabilizer of \(P\in \mathrm{Syl}_p(G)\) under this action is \(N_H(P)=N_G(P)\cap H\).

Proof. Assume \(G\) has order \(mp^\alpha\), \(p\nmid m\). Since conjugation is a group automorphism, for any \(h\in H\) and \(P\in \mathrm{Syl}_p(G)\), we know the image of conjugation \(hPh^{-1}\) is still a subgroup of \(G\), and it has order \(p^\alpha\) because conjugation is an isomorphism (in particular bijection).

Thus, conjugation maps Sylow \(p\)-subgroups to Sylow \(p\)-subgroups, so the action is well-defined. It is not difficult to then show conjugation satisfies the axioms for an action, and that the stabilizers are the appropriate normalizers. ◻

Theorem 3 (Sylow Theorems 2, 3). Let \(p\) be a prime, and let \(G\) be a group of order \(mp^\alpha\), where \(p\nmid m\). Then,

  1. If \(P\) is a Sylow \(p\)-subgroup of \(G\) and \(Q\) is any \(p\)-subgroup of \(G\), then there exists \(g\in G\) such that \(Q\leq gPg^{-1}\).

  2. \(n_p \equiv 1 \pmod{p}\) and \(n_p \operatorname{\big|}m\).

(These statements are taken from Dummit and Foote).

Remark 3. The second Sylow theorem has two important corollaries: we can extend \(p\)-subgroups to Sylow \(p\)-subgroups, and the conjugation action on \(\mathrm{Syl}_p(G)\) is transitive. Both of these facts are extremely useful for deducing information about the structure of a group given its order.
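To illustrate how the third Sylow theorem narrows things down given only the order of a group, here is a small Python sketch that lists the values of \(n_p\) allowed by the two conditions \(n_p\equiv 1 \pmod{p}\) and \(n_p \operatorname{\big|}m\). The function name `sylow_count_candidates` is my own, and the function assumes its argument `p` is prime.

```python
def sylow_count_candidates(order, p):
    """Values of n_p allowed by n_p = 1 (mod p) and n_p | m, where order = m * p^alpha.

    Assumes p is prime; this only lists candidates, it does not decide which occur.
    """
    m = order
    while m % p == 0:
        m //= p  # strip every factor of p, leaving m with p not dividing m
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

print(sylow_count_candidates(15, 3))  # m = 5
print(sylow_count_candidates(15, 5))  # m = 3
print(sylow_count_candidates(12, 2))  # m = 3
print(sylow_count_candidates(12, 3))  # m = 4
```

For order 15 the only candidate is \(n_3=n_5=1\), so both Sylow subgroups are unique and hence normal (since all Sylow \(p\)-subgroups are conjugate). For order 12 the conditions leave \(n_2\in\{1,3\}\) and \(n_3\in\{1,4\}\), so more work is needed.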

Before proving Sylow Theorems 2 and 3, we prove the following lemma that will help us count the size of orbits under the conjugation action on \(\mathrm{Syl}_p(G)\) by a \(p\)-subgroup \(Q\).

Lemma 3. Let \(P\in \mathrm{Syl}_p(G)\), and \(Q\) a \(p\)-subgroup. Then, \[Q\cap N_G(P) = Q\cap P.\]

Proof. We know \(Q\cap P \subseteq Q\cap N_G(P)\) because \(P\leq N_G(P)\). For the reverse inclusion, let \(H=Q\cap N_G(P)\). We now need to show \(H \subseteq P\) (since we already know \(H\subseteq Q\)).

Our strategy is to show \(PH=P\), since normalizers are the thing that lets us talk about \(PH\) as a subgroup.

First, \(PH\) is a subgroup of \(G\) because \(H\leq N_G(P)\). We know \(P\leq PH\), so \(p^\alpha \operatorname{\big|}|PH|\). We want to argue that \(PH\) is a \(p\)-group, so \(|PH| \operatorname{\big|}p^\alpha\). We know \[|PH| = \frac{|P||H|}{|P\cap H|},\] so \(|PH|\) is a power of \(p\), since \(P,H,P\cap H\) are all \(p\)-subgroups (\(H\leq Q\) and \(P\cap H\leq P\), so their orders are powers of \(p\) by Lagrange’s theorem). Thus, \(|PH|=p^\alpha\). Since \(PH\) contains \(P\), which also has order \(p^\alpha\), we get \(PH=P\), and \(H\leq P\). ◻


Proof of Sylow Theorems 2,3. We break the proof into a few sections.

(a) Action by Arbitrary \(p\)-subgroup \(Q\). We first prove a general lemma about the action of \(p\)-subgroups on \(\mathrm{Syl}_p(G)\) by conjugation (as in Proposition 2). By Sylow theorem 1, there exists a Sylow \(p\)-subgroup \(P\). We consider the orbit of \(P\) under conjugation by \(G\), which we denote \[\mathcal{S}= \{P_1,\dots, P_r\}\] (We do not yet know that \(r=n_p\), i.e. that the conjugation action is transitive. We will establish this later).

Then, let \(Q\) be any \(p\)-subgroup of \(G\). This subgroup acts on \(\mathcal{S}\subseteq \mathrm{Syl}_p(G)\) by conjugation, but when conjugating by elements of \(Q\) we might have smaller orbits. We can write \(\mathcal{S}\) as the disjoint union of \(Q\)-orbits, which we denote \[\mathcal{S}= \mathcal{O}_1 \sqcup \cdots \sqcup \mathcal{O}_s.\] In particular, \(r=|\mathcal{O}_1| + \dots + |\mathcal{O}_s|\). Renumber the elements of \(\mathcal{S}\) (the elements \(P_1,\dots,P_r\)) so that \(P_1,\dots,P_s\) represent the \(Q\)-orbits, with \(P_i\in \mathcal{O}_i\) for \(i=1,\dots,s\). Then, by the Orbit Stabilizer theorem \[|\mathcal{O}_i| = [Q: N_Q(P_i)].\] We know \(N_Q(P_i)=Q\cap N_G(P_i)\), and by Lemma 3 \(Q\cap N_G(P_i) = Q\cap P_i\). Thus, \[|\mathcal{O}_i| = [Q: Q\cap P_i]. \tag{Orbit Size}\]

(b) Modular Relation. We now prove that \(r\equiv 1 \pmod{p}\), which will allow us to establish Sylow Theorems 2 and 3. Choose \(Q=P_1\), so that \(|\mathcal{O}_1|=1\). We now show that each of the other \(|\mathcal{O}_i|\) has a factor of \(p\). We have by (Orbit Size) that \[|\mathcal{O}_i| = [P_1 : P_1 \cap P_i],\] so for \(i\neq 1\), we have \(P_1\neq P_i\), so \(P_1\cap P_i\) is a proper subgroup of \(P_1\) and \(|P_1\cap P_i|\) must be missing at least one factor of \(p\), so \[\frac{p^\alpha}{p^{\alpha-1}} \operatorname{\big|}[P_1 : P_1 \cap P_i].\] Thus, \[\begin{aligned} r &\equiv |\mathcal{O}_1| + \sum_{i=2}^s |\mathcal{O}_i| \pmod{p}\\ &\equiv 1 + \sum_{i=2}^s 0 \pmod{p}, \end{aligned}\] which establishes \(r\equiv 1 \pmod{p}\).

(c) Sylow Theorem 2. Let \(Q\) be any \(p\)-subgroup of \(G\). Suppose, to the contrary, that \(Q\not\leq P_i\) for every \(P_i\). We will try to contradict our new-found modular relation. By our assumption, \(Q\cap P_i\) (a \(p\)-subgroup) is strictly smaller than \(Q\), so \([Q: Q\cap P_i]\) has a factor of \(p\). By (Orbit Size), we then have \(p \operatorname{\big|}|\mathcal{O}_i|\) for each \(i\), which implies \(r\equiv 0 \pmod{p}\). This is a contradiction, so \(Q\) is contained in some \(P_i\), which is a conjugate of our original fixed Sylow \(p\)-subgroup \(P\). Thus, \[Q \leq gPg^{-1}\] for some \(g\).

(d) Sylow Theorem 3. First, we use Sylow Theorem 2 to show that the action of \(G\) on \(\mathrm{Syl}_p(G)\) by conjugation is transitive. Let \(Q\) be a Sylow \(p\)-subgroup, which we wish to show is conjugate to \(P\). By Sylow Theorem 2, \(Q\leq gPg^{-1}\) for some \(g\in G\). Then, \(|Q|=|gPg^{-1}|=p^\alpha\), so \(Q=gPg^{-1}\). Thus, conjugation is transitive.

Therefore, \(r=n_p\), since \(r=|\mathcal{S}|\) is the order of the orbit of \(P\), which is all of \(\mathrm{Syl}_p(G)\). This tells us \[n_p \equiv 1 \pmod{p}.\]

Next, \(n_p \operatorname{\big|}m\) because by the Orbit Stabilizer theorem applied to the conjugation action of \(G\) on \(\mathrm{Syl}_p(G)\), we get \[n_p = [G: N_G(P)] = \frac{|G|}{|N_G(P)|},\] and \(P\leq N_G(P)\) with \(|P|=p^\alpha\), so \(p^\alpha\) divides \(|N_G(P)|\) and hence \(|G|/|N_G(P)|\) is a factor of \(m\). ◻
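The conclusions of Sylow Theorems 2 and 3 can be checked by brute force on a small group. The Python sketch below does this for \(S_4\), which has order \(24 = 3\cdot 2^3\). The choice of group, the helper names, and the search strategy are my own assumptions: in particular, the search relies on the fact that every Sylow subgroup of \(S_4\) is generated by at most two elements, which would not hold for an arbitrary group.

```python
from itertools import permutations

def compose(g, h):
    """The permutation g after h, i.e. i -> g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    """The inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens (closure under composition in a finite group)."""
    identity = tuple(range(len(gens[0])))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

G = list(permutations(range(4)))  # S_4, order 24 = 3 * 2^3 (illustrative choice)

def sylow_subgroups(p):
    """All subgroups of G whose order is the largest power of p dividing |G|."""
    size = 1
    while len(G) % (size * p) == 0:
        size *= p
    found = set()
    # Assumption: each Sylow subgroup of S_4 is generated by at most two elements.
    for a in G:
        for b in G:
            H = generate([a, b])
            if len(H) == size:
                found.add(H)
    return found

def conjugate(g, H):
    """The conjugate subgroup g H g^{-1}."""
    return frozenset(compose(compose(g, h), inverse(g)) for h in H)

for p, m in [(2, 3), (3, 8)]:
    syl = sylow_subgroups(p)
    n_p = len(syl)
    assert n_p % p == 1 and m % n_p == 0        # n_p = 1 (mod p) and n_p | m
    P = next(iter(syl))
    assert {conjugate(g, P) for g in G} == syl  # conjugation is transitive
    print(p, n_p)
```

For \(S_4\) this finds \(n_2=3\) and \(n_3=4\), and in both cases the conjugates of a single Sylow \(p\)-subgroup account for all of \(\mathrm{Syl}_p(G)\).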