Monday, November 21, 2016

Chirality and the Conway Polynomial

I'd like to relay a simple conjecture in knot theory, which, despite its simplicity to state, remains unproven. It has to do with the notion of chirality and its opposite, amphicheirality. A knot $K\subset S^3$ is said to be amphicheiral if it is isotopic to its mirror image. There are two types. If the strand orientation is preserved, it is said to be $+$ amphicheiral, while if the strand orientation is reversed, it is said to be $-$ amphicheiral. Equivalently, in both cases there exists an orientation-reversing homeomorphism $h\colon S^3\to S^3$ which fixes the knot setwise and which either preserves or reverses the knot's strand orientation. If the knot is hyperbolic, then $h$ can be realized by an isometry, and must be of finite order. One particularly interesting case is when $h$ is an involution. In that case, whether or not $K$ is hyperbolic, we say the knot is strongly $\pm$ amphicheiral. It is possible that a knot can be both $+$ and $-$ amphicheiral simultaneously. The figure $8$ knot has this property.

The Jones Polynomial $V_K(t)\in \mathbb Z[t,t^{-1}]$ is a knot invariant which does not see strand orientation and which satisfies $V_{K^*}(t)=V_{K}(t^{-1})$ where $K^*$ is the mirror image of $K$. Thus the Jones Polynomial of an amphicheiral knot must be symmetric under $t\mapsto t^{-1}$, and indeed, this is a good way to detect chirality of many knots, e.g. the trefoil.
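For instance (a sympy one-liner of mine; the value $-t^{-4}+t^{-3}+t^{-1}$ for the Jones polynomial of the left-handed trefoil is the standard one):

```python
from sympy import symbols, simplify

t = symbols('t')
V = -t**-4 + t**-3 + t**-1           # Jones polynomial of the left-handed trefoil
asym = simplify(V - V.subs(t, 1/t))  # zero iff V is symmetric under t -> 1/t
print(asym)                          # nonzero, so the trefoil is chiral
```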

The Conway Polynomial $C_K(z)\in \mathbb Z[z^2]$ on the other hand is a knot invariant which does not distinguish between $K$ and $K^*$, so it cannot detect chirality in the same way as the Jones. However, we have

Conjecture:
Let $\overline{C_K(z)}\in\mathbb Z_4[z^2]$ be the image of $C_K(z)$ under the natural projection $\mathbb Z[z^2]\twoheadrightarrow \mathbb Z_4[z^2]$. If $K$ is amphicheiral, then
$$\exists f(z)\in \mathbb Z_4[z] \text{ such that }\overline{C_K(z)}=f(z)f(-z).$$

(Like a true topologist, I am using $\mathbb Z_k$ to denote the cyclic group of order $k$.)
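To make the conjecture concrete, here is a minimal brute-force check in Python (a sketch of mine, not taken from any of the papers discussed below): given the Conway polynomial of the figure $8$ knot, $C_K(z)=1-z^2$, it searches for a factorization $f(z)f(-z)$ over $\mathbb Z_4$.

```python
from itertools import product

def polymul_mod4(a, b):
    """Multiply polynomials given as coefficient lists (low degree first), mod 4."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % 4
    return out

def splits_mod4(c):
    """Search for f(z) in Z_4[z] with f(z) f(-z) = c(z) mod 4.
    c: coefficients of the Conway polynomial, low degree first. Returns f or None."""
    deg_f = (len(c) - 1) // 2                   # deg f(z)f(-z) = 2 deg f
    target = [x % 4 for x in c]
    for coeffs in product(range(4), repeat=deg_f + 1):
        f = list(coeffs)
        f_neg = [((-1) ** i * x) % 4 for i, x in enumerate(f)]   # f(-z) mod 4
        if polymul_mod4(f, f_neg) == target:
            return f
    return None

# Figure-eight knot: C_K(z) = 1 - z^2; amphicheiral, and indeed it splits:
print(splits_mod4([1, 0, -1]))   # [1, 1], i.e. f(z) = 1 + z
```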

In this blog post, I would like to explain several confluent lines of evidence for this conjecture. I made an equivalent conjecture in 2006, in a paper with the same name as my blog post. Later, my student Vajira Manathunga showed that my conjecture is equivalent to the one just stated.

The case of strongly amphicheiral knots:
Kawauchi and Hartley proved some time ago that if $K$ is strongly amphicheiral, then $C_K(z)=f(z)f(-z)$ for some $f(z)\in \mathbb Z[z]$. Indeed in the $+$ case, they showed that $C_K(z)=f(z^2)^2$ for some $f$. (They phrased their results in terms of Alexander Polynomials.)

Later Hartley extended this result to the case of all $-$ amphicheiral knots. At this point, the reader may wonder whether the real conjecture ought to be over the integers and not over $\mathbb Z_4$, but it turns out to be false over $\mathbb Z$. Hartley and Kawauchi did not consider the general case in their papers. It may be that they didn't see a reason why it should hold, or perhaps they had a counterexample which they didn't publish. In any event, the first published counterexample was due to Ermotti et al., who found a $+$ amphicheiral knot with an orientation-reversing symmetry of order $4$, which does not split over $\mathbb Z$.

Independently of the Kawauchi and Hartley story, I was led to conjecture the $\mathbb Z_4$ version after looking at certain properties of Vassiliev invariants related to the Conway polynomial. Gratifyingly, the Ermotti et al. example does split over $\mathbb Z_4$. Manathunga has since come up with lots of examples which don't split over $\mathbb Z$. See our joint paper on the arXiv. Indeed, in that paper, we prove the conjecture for the case of a positive amphicheiral symmetry that preserves a braid axis, using the Burau representation to calculate the Alexander (and hence Conway) polynomial.

I would like to sketch the argument given by Hartley and Kawauchi for the strongly positive case as well as explain a little bit the Vassiliev invariant background which led to this conjecture.

Review of the Alexander Polynomial:
The Alexander Polynomial is an invariant which is strongly related to the Conway polynomial, and has a very pleasant topological definition. The complement of any knot in $S^3$ has $H_1(S^3\setminus K)=\mathbb Z$, so the covering space $X_K$ of $S^3\setminus K$ associated with the commutator subgroup has deck transformation group $\mathbb Z$. This means that $H_1(X_K)$ is a $\mathbb Z[\mathbb Z]$-module. We will think of the group ring $\mathbb Z[\mathbb Z]$ as Laurent polynomials in $t$: $\mathbb Z[t,t^{-1}]$. Moreover, we will switch to rational coefficients, since $\mathbb Q[t^{\pm 1}]$ is a PID. Let $M_K$ be the first homology of $X_K$ with rational coefficients, regarded as a $\Lambda=\mathbb Q[t^{\pm 1}]$-module. This is called the Alexander Module. One can show, for example by finding an explicit presentation for $M_K$, that $M_K$ is a finitely generated torsion module over $\Lambda$. By the classification of modules over a PID, we have
$$M_K\cong \bigoplus \Lambda/ f_i \Lambda$$
for some $f_i\in \Lambda$.

The Alexander polynomial $\Delta_K(t)$ is defined to be the product of all the $f_i$: $\Delta_K(t)=\prod_i f_i$. I.e. it is the order of the Alexander module $M_K$. The polynomials $f_i$ are only well-defined up to units in $\Lambda$. So in addition to multiplying by rational numbers, one can also multiply by powers of $t$. The Alexander polynomial is therefore only well-defined up to these operations. However, there is a satisfying normal form. It turns out that $\Delta_K(t)$ is equivalent, up to multiplying by powers of $t$, to $\Delta_K(t^{-1})$. So one can assume that it is symmetric: $\Delta_K(t)=\Delta_K(t^{-1})$. Moreover one can multiply by a constant so that $\Delta_K(1)=1$. With these conventions, though not obviously, $\Delta_K(t)$ has integer coefficients. In fact, Levine showed that each $f_i$ has these properties.

The Conway Polynomial is a repackaging of the Alexander polynomial under the change of variables $z=t^{1/2}-t^{-1/2}$.
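As a sanity check (a small sympy computation of mine; the Conway polynomials $1+z^2$ for the trefoil and $1-z^2$ for the figure $8$ knot are standard values), the substitution recovers the familiar Alexander polynomials up to units:

```python
from sympy import symbols, sqrt, expand, simplify

t, z = symbols('t z', positive=True)

conway = {'trefoil': 1 + z**2, 'figure-eight': 1 - z**2}
for name, C in conway.items():
    # z = t^(1/2) - t^(-1/2), so z^2 = t - 2 + 1/t
    alex = simplify(expand(C.subs(z, sqrt(t) - 1/sqrt(t))))
    print(name, alex)
# trefoil:      t - 1 + 1/t
# figure-eight: 3 - t - 1/t   (equal to t - 3 + 1/t up to a unit)
```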

Hartley and Kawauchi's Argument for strongly positive amphicheiral knots:
Theorem: If $K$ is strongly positive amphicheiral, then the Alexander module is a direct double $M_K\cong N \oplus N$ for some $\Lambda$-module $N$. This implies that the Conway polynomial is a square $f(z^2)^2$.

Proof: Let $\alpha\colon S^3\to S^3$ be an orientation-reversing involution that preserves the knot's strand orientation. It turns out that it must have two fixed points in the complement of the knot. This implies that $\alpha$'s restriction to $S^3\setminus K$ lifts to an involution $\tilde{\alpha}$ of the universal abelian cover $X_K$. We will use that $M_K$ has a nondegenerate sesquilinear pairing, called the Blanchfield form:
$$B\colon M_K\otimes M_K\to \mathbb Q(t)/\mathbb Q[t,t^{-1}].$$
$B(x,y)$ is defined as follows. Since $M_K$ is torsion, there is some $\Delta\in \mathbb Q[t,t^{-1}]$ such that $\Delta\cdot y$ bounds: $\Delta\cdot y=\partial \sigma$ for some $2$-chain $\sigma$ in $X_K$. Indeed one could simply let $\Delta$ be the Alexander polynomial. Then we let
$$B(x,y)=\frac{1}{\Delta}\sum_{n\in\mathbb Z} t^{-n}\big((t^nx)\cdot \sigma\big),$$
where $\cdot$ denotes the algebraic intersection number.

$B$ is sesquilinear: $B(fx,gy)=f\bar g B(x,y)$ and $B(x,y)=\overline{B(y,x)}$.

Moreover, because $\alpha$ preserves the knot's orientation, it must reverse a meridian. Hence $\tilde{\alpha} t\tilde\alpha=t^{-1}$, which implies that $B(\tilde\alpha x,\tilde\alpha y)=-\overline{B(x,y)}.$ Now the trick is to define $B'(x,y)=B(x,\tilde{\alpha} y)$. It is not hard to see that this is an antisymmetric nondegenerate bilinear form (not sesquilinear) on $M_K$.

Now $M_K$ is a finitely generated module over a PID, so it decomposes as a direct sum of pieces $\Lambda/f\Lambda$, and we may assume that the $f$'s are powers of primes $p\in \Lambda$. We can therefore decompose $M_K$ into a direct sum of $p$-primary components: $$M_K=\oplus_p M_K^{(p)}$$ where each $M_K^{(p)}$ is a direct sum of cyclic modules of order $p^k$ for some $k$. It is not hard to see that this must be an orthogonal decomposition with respect to $B'$.

Now, each $M_K^{(p)}$ decomposes into a direct sum of cyclic modules of order $p$, plus those of order $p^2$ and so forth. This decomposition is not necessarily orthogonal, but can be fixed up to be so, using an upper triangular argument. So we get summands $\oplus_{i=1}^{j(k,p)} \Lambda/p^k\Lambda$ supporting a nondegenerate antisymmetric bilinear form. Tensoring with the field $F=\Lambda/p\Lambda$, we get a symplectic vector space of dimension $j(k,p)$, which must therefore be even! This implies that $M_K$ is a direct double, as desired. $\Box$



So, now that we understand this classical result of Hartley and Kawauchi, how did Vassiliev invariants enter the picture? For the purposes of this blog post it is not that important to know what exactly a Vassiliev invariant is. It is a type of knot invariant. Some are "Vassiliev" and some are not. The name is used as an adjective to describe some knot invariants in much the way the proper name "Hausdorff" is used as an adjective to describe certain topological spaces. The first thing we want to bring attention to is that the coefficients of the Conway polynomial are Vassiliev invariants of knots:
$$C_K(z)=1+\sum_{i\geq 1}c_{2i}(K)z^{2i},$$
and their Vassiliev degree is $2i$. These invariants are not additive under connected sum, a property which is often desirable. A standard trick to make them additive is to apply the formal logarithm of power series $\log\colon 1+z\mathbb Q[[z]]\to z\mathbb Q[[z]]$ to $C_K(z)$ and then consider the coefficients. However, this trick does not preserve integrality of the coefficients. This leads to the following definitions:

Definition: Define $$\exp_{\mathbb Z}\colon z\mathbb Z[[z]]\to 1+z\mathbb Z[[z]]$$ by the formula $$\exp_{\mathbb Z}(a_1 z+a_2 z^2+\cdots)= (1-z)^{a_1}(1+z^2)^{a_2}(1-z^3)^{a_3}\cdots=\prod (1+(-z)^n)^{a_n}$$

It is not hard to see that $\exp_{\mathbb Z}$ is bijective and sends addition to multiplication. We define $\log_{\mathbb Z}\colon 1+z\mathbb Z[[z]]\to z\mathbb Z[[z]]$ as its inverse.

Primitive Vassiliev Invariants from the Conway Polynomial: We now define primitive (i.e. additive under connected sum) Vassiliev invariants of degree $2k$ coming from the Conway polynomial, $pc_{2k}\colon \{\text{Knots}\}\to \mathbb Z$, by
$$\log_{\mathbb Z}(C_K(z))=\sum_{i=1}^\infty pc_{2i}(K) z^{2i},$$
where the formal logarithm $\log_{\mathbb Z}$ is taken with respect to the variable $z^2$.
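For those who like to experiment, here is a minimal Python sketch of $\log_{\mathbb Z}$ (my own implementation, not from the paper), computed greedily by stripping off one factor $(1+(-x)^n)^{a_n}$ at a time, in the variable $x=z^2$:

```python
def mul(a, b, N):
    """Multiply truncated power series (coefficient lists, low degree first) mod x^(N+1)."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

def inv(a, N):
    """Inverse of a power series with constant term 1, mod x^(N+1)."""
    out = [0] * (N + 1)
    out[0] = 1
    for n in range(1, N + 1):
        out[n] = -sum(a[k] * out[n - k] for k in range(1, n + 1) if k < len(a))
    return out

def log_Z(F, N):
    """Exponents a_1..a_N with F = prod_n (1 + (-x)^n)^(a_n) mod x^(N+1)."""
    cur = (list(F) + [0] * (N + 1))[:N + 1]
    a = []
    for n in range(1, N + 1):
        a_n = cur[n] * (-1) ** n        # coefficient of x^n forced by the nth factor
        a.append(a_n)
        base = [0] * (N + 1)
        base[0], base[n] = 1, (-1) ** n
        factor = inv(base, N) if a_n > 0 else base
        for _ in range(abs(a_n)):       # divide cur by (1 + (-x)^n)^(a_n)
            cur = mul(cur, factor, N)
    return a

# pc invariants: log_Z of the Conway polynomial in the variable x = z^2,
# so the output lists pc_2, pc_4, pc_6, ...
print(log_Z([1, 1], 4))    # trefoil,      C = 1 + z^2:  [-1, -1, 0, -1]
print(log_Z([1, -1], 4))   # figure-eight, C = 1 - z^2:  [1, 0, 0, 0]
```

The greedy step works because, once the factors of lower index are stripped off, $(1+(-x)^n)^{a_n}$ is the only remaining factor that affects the coefficient of $x^n$.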

These invariants have an interesting property. Modulo $2$, they drop in degree. $pc_{2n}\pmod 2$ is of degree $2n-1$. On the other hand, no torsion is known among Vassiliev knot invariants. It is thus natural to conjecture

Conjecture: There exist integer-valued Vassiliev knot invariants $v_{2n-1}$ of degree $2n-1$ such that $v_{2n-1}\equiv pc_{2n}\pmod 2$.

Such invariants have been explicitly identified for $n=1,2$.

A knot invariant taking values in an abelian group is said to be odd if it changes sign under mirror image. Thus odd $\mathbb Z$-valued knot invariants vanish on amphicheiral knots.

Somewhat unmotivated conjecture: The invariants $v_{4n-1}$ can be chosen to be odd.

This appears to be untrue for general $v_{2n-1}$, but experimental evidence suggested to me that it could be true in the stated half of cases! If this is true, it implies that $pc_{4n}\equiv 0\pmod 2$ on amphicheiral knots.

The following proposition then tells us how these considerations on Vassiliev invariants lead to the mod $4$ splitting conjecture.

Proposition: The invariants $pc_{4n}$ vanish mod $2$ for all $n$ iff $C_K(z)$ factors as $f(z)f(-z)$ modulo $4$.

So, one approach to proving the mod 4 splitting conjecture is to explicitly identify degree $4n-1$ Vassiliev invariants $v_{4n-1}$ which have the same parity as $pc_{4n}$. This appears to be a difficult problem. Already for $n=2$ the answer is not known. In degree $7$, the space of all Vassiliev invariants is spanned by polynomial invariants coming from the HOMFLY and Kauffman polynomials, together with a $2$-fold cable. Computer calculations show that $v_{4n-1}$ must involve this cabled invariant, though we have not explicitly identified it.



Wednesday, November 9, 2016

The Kurosh Problems for PI algebras via Combinatorics

This is a continuing series of posts on some of the highlights of a class at UCSD being given by Efim Zelmanov.

For a field $F$, recall the Kurosh problem as I stated it for algebras.

Kurosh Problem for nilalgebras: Suppose $A$ is a finitely generated $F$-algebra which is nil. That is, for every $a\in A$ there exists $N\geq 1$ such that $a^N=0$. Is $A$ finite dimensional (equivalently, nilpotent)?

We saw that the Golod-Shafarevich construction gives a counterexample. However, it turns out that the answer is positive for PI algebras, which we will show. There is another version of the Kurosh problem which I didn't state since it was not relevant for the Burnside problem. It replaces the concept of "nil" with "algebraic."

Definition: An element $a\in A$ is said to be algebraic if there is a nonzero polynomial $f(t)$ with $F$-coefficients such that $f(a)=0\in A$. An algebra is said to be algebraic if every element is algebraic.

Kurosh Problem for algebraic algebras: Suppose $A$ is algebraic and finitely generated. Is $A$ finite dimensional?

Since nil implies algebraic, the Golod-Shafarevich construction shows that this is false in general, but we will show that it is indeed true for PI algebras.

Everything follows from a combinatorial lemma about words. But first we need a definition. Let $X$ be a finite alphabet with a fixed order, say $x_1<x_2<\cdots<x_m$. Order the set of words $X^*$ lexicographically, with the convention that if $v$ is an initial word for $w$, then $v>w$. (This is the opposite of the usual convention.) So for example $x_1x_2<x_1^2$ and $x_1x_2>x_1x_2x_3$.

Definition: A word $w\in X^*$ is $n$-divisible if $w=w'u_1\cdots u_n w''$ for some nonempty subwords $u_i$ such that every nontrivial permutation $u_{\sigma(1)}\cdots u_{\sigma(n)}$ is a smaller word than $u_1\cdots u_n$. If a word is not $n$-divisible, we say it is $n$-indivisible.

If $A$ satisfies a polynomial identity of degree $n$, then by linearizing we can assume it is of the form $x_1\ldots x_n+\sum_{1\neq\sigma\in \mathcal S_n}\alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(n)}$. Plugging in $x_i=u_i$, we can rewrite an $n$-divisible word as a linear combination of smaller words. Hence $A$ is spanned by words in its generators which are $n$-indivisible.
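To get a feel for the definition, here is a brute-force $n$-divisibility checker (my own sketch; the alphabet is written $\texttt{a}<\texttt{b}<\cdots$ instead of $x_1<x_2<\cdots$):

```python
from itertools import combinations, permutations

def smaller(v, w):
    """v < w in the order above: lexicographic, except that a proper
    initial subword is LARGER than the word it starts."""
    if v != w and w.startswith(v):
        return False   # v is an initial subword of w, so v > w
    if v != w and v.startswith(w):
        return True
    return v < w       # otherwise the first differing letter decides

def n_divisible(word, n):
    """Does word contain consecutive subwords u_1...u_n such that every
    nontrivial permutation of them gives a strictly smaller word?"""
    for start in range(len(word)):
        for cuts in combinations(range(start + 1, len(word) + 1), n):
            pts = (start,) + cuts
            u = [word[pts[i]:pts[i + 1]] for i in range(n)]
            base = ''.join(u)
            if all(smaller(''.join(u[i] for i in p), base)
                   for p in permutations(range(n)) if p != tuple(range(n))):
                return True
    return False

print(n_divisible('ba', 2))    # True:  u_1 = 'b', u_2 = 'a', and 'ab' < 'ba'
print(n_divisible('aabb', 2))  # False: no descent anywhere
```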

Shirshov $\mathcal N(m,k,n)$ lemma: There exists a function $\mathcal N\colon \mathbb N^3\to \mathbb N$ such that any word of length $\geq \mathcal N(m,k,n)$ in the alphabet $X=\{x_1,\ldots, x_m\}$ either has a subword of the form $v^k$ or is $n$-divisible.

Proof of Shirshov lemma:
We proceed by induction on the pair $(n,m)$ ordered lexicographically. For the base case $m=1$, we can take $\mathcal N(1,k,n)=k$, and for $n=1$ any nonempty word is trivially $1$-divisible. So now suppose we know how to define it for all pairs less than $(n,m)$.

Define a set of words
$$T=\{x_m^sx_{i_1}\cdots x_{i_\ell}\,|\, 1\leq s<k,\ 1\leq\ell<\mathcal N(m-1,k,n),\ i_j<m\ \forall j\}
$$
Note that $T$ is finite. We say a word is a $T$-word if it is a product of elements of $T$. Any word that starts with $x_m$, ends with some $x_i\neq x_m$, is $n$-indivisible, and doesn't contain any $k$th powers must be a $T$-word. This is because any such word can be written as a product of $T$-type words except that $s$ and $\ell$ could be too large. However, $s$ is bounded by the hypothesis of no $k$th powers, and each $\ell$ is inductively bounded by $\mathcal N(m-1,k,n)$, since each block of letters $<x_m$ contains no $k$th powers and is $n$-indivisible.

Now we want to show that a long word $v$ in $X$ must contain a $k$th power or be $n$-divisible. So suppose $v$ does not contain $k$th powers and is $n$-indivisible. Any such word $v$ can be written as $v=v'v''x_m^\mu$, where $v'$ is a word in $x_1,\ldots,x_{m-1}$, $v''$ is a $T$-word, $\ell(v')<\mathcal N(m-1,k,n)$, and $\mu<k$. So we need to control the length of the $T$-word $v''$.

Now $T$-words can be ordered by looking at the order with respect to $X$ or with respect to $T$. Luckily, or actually by design, these give the same order on $T$-words of the same composition.

Lemma: Given two $T$-words $u,v$ of the same composition (that is they use the same letters with the same multiplicity from the alphabet $X$, or equivalently $T$), then
$$u<_T v\Leftrightarrow u<_X v.$$
Proof of lemma:
If $u$ and $v$ begin with the same letter of $T$, then proceed by induction. Otherwise, suppose that the first $T$-letter of $v$ is an initial subword of the first $T$-letter of $u$. Then $u<v$ with respect to both orders. $\Box$

Now suppose that $\ell_T(v'')>\mathcal N(|T|,k,n-1)+1$, where $\ell_T$ denotes length as a word in the alphabet $T$. Write $v''=wt$, where $t$ is the final $T$-letter. Then $w$ is a $T$-word of length $\geq \mathcal N(|T|,k,n-1)$, so by induction (applied to the alphabet $T$) it either contains a $k$th power or is $(n-1)$-divisible; since $v$ contains no $k$th powers, $w$ must be $(n-1)$-divisible. We need to promote this $(n-1)$-division to an $n$-division. So suppose $u_1\cdots u_{n-1}$ is an $(n-1)$-division, where the $u_i$ are $T$-words.

Lemma: $u_1\cdots u_{n-1} x_m$ is $n$-divisible.
Proof of lemma:
Write $u_i=x_mu'_i$. Divide the word as $$\underbrace{x_m}\underbrace{u_1'x_m}\underbrace{u_2'x_m}\cdots\underbrace{u'_{n-1}x_m}.$$
Every nontrivial permutation of these words decreases the order, given our assumption that this is true for the $u_i$'s. $\Box$

So, to finish the proof, we can define
$$\mathcal N(m,k,n)=[\mathcal N(m-1,k,n)+k] + [\mathcal N(m-1,k,n)+k]\cdot\mathcal N(m^{\mathcal N(m-1,k,n)},k,n-1).$$
Here the first $[\mathcal N(m-1,k,n)+k]$ controls the length of $v'$ and $x_m^\mu$. The second $[\mathcal N(m-1,k,n)+k]$ bounds the length of an element of $T$, while $m^{\mathcal N(m-1,k,n)}$ is an upper bound on the size of $T$.

This completes the proof of the Shirshov Lemma. $\Box$

We actually need a slightly strengthened version of the Shirshov Lemma, which itself requires an elementary lemma.

Lemma: Let $v,w\in X^*$ be words in the alphabet $X$. Then $v,w$ commute iff they are powers of the same word.
Proof of lemma: 
We use induction on the sum of lengths $\ell(v)+\ell(w)$. Assume that $vw=wv$. Then if $\ell(v)=\ell(w)$, it is clear that $v=w$, since they are the initial words on both sides of the equality. On the other hand if $\ell(v)<\ell(w)$, say, then $v$ is an initial word of $w$: $w=vw'$. So $vvw'=vw'v$, and hence $vw'=w'v$. By induction $v=u^k$ and $w'=u^\ell$, and $w=u^{k+\ell}$. $\Box$

Lemma: Let $v$ be a word that is not a nontrivial power. If $\ell(v)\geq n$, then $v^{2n}$ is $n$-divisible.
Proof of lemma:
Write $v=x_{i_1}\cdots x_{i_\ell}$ for $\ell\geq n$. Consider all cyclic permutations of the word. If two cyclic permutations are equal, then $v'v''=v''v'$ for some decomposition, meaning that $v$ is a proper power by the previous lemma. Now write
$$v=v_1'v_1''=v_2'v_2''=\cdots=v_\ell'v_\ell'',$$
which are all the ways of decomposing $v$ into two words. Order them in such a way that
$$v_1''v_1'>v_2''v_2'>\cdots>v_\ell'' v_\ell'.$$
Note that $v^2$ contains all of these as subwords: $v^2=v_i'\underbrace{v_i''v_i'}v_i''$. Consider
$$v^{2n}=v^2\cdots v^2=v_1'\underbrace{v_1''v_1'v_1''v_2'}\underbrace{v_2''v_2'v_2''v_3'}\cdots\underbrace{v_{n-1}''v_{n-1}'v_{n-1}''v_n'}\underbrace{v_n''v_n'v_n''}$$
Each bracketed subword begins with $v_i''v_i'$, and these initial segments strictly decrease, so any nontrivial permutation of the bracketed subwords yields a smaller word. The bracketed subwords therefore give an $n$-division. $\Box$

This leads to the following strengthening of the Shirshov Lemma:
Theorem: Let $k\geq 2n$. There exists $\mathcal N(m,k,n)$ such that any word $w$ of length $\geq \mathcal N$ is $n$-divisible or contains a subword $v^k$ with $\ell(v)<n$.

We can now deduce some interesting consequences.

Corollary: Let $A=\langle a_1,\ldots, a_m\rangle$ be a finitely generated $F$-algebra satisfying a polynomial identity of degree $n$, such that every product $a_{i_1}\cdots a_{i_s}$ with $s<n$ is nilpotent. Then $A$ is a nilpotent algebra, and hence finite dimensional.
Proof of corollary:
Let $k$ be the maximum of $2n$ and the degrees of nilpotency of the finitely many products of length $<n$. Every word of length $\geq\mathcal N(m,k,n)$ in the generators is either $n$-divisible, hence a linear combination of smaller words of the same length, or contains a subword $v^k$ with $\ell(v)<n$, which is $0$. Then $A^{\mathcal N(m,k,n)}=(0)$. $\Box$
This implies the Kurosh problem for nilalgebras.

Corollary: Suppose instead that all products $a_{i_1}\cdots a_{i_s}$ with $s<n$ are algebraic. Then $A$ is finite dimensional.
Proof of corollary:
Let $k$ exceed $2n$ and the algebraic degrees of the finitely many products of length $<n$; then each $v^k$ with $\ell(v)<n$ is a linear combination of lower powers of $v$, so $A$ is spanned by products of length $<\mathcal N(m,k,n)$. $\Box$

Corollary: Suppose $A$ is a finitely generated nilalgebra where the degrees of nilpotency are bounded. Then $A$ is finite dimensional.
Proof of corollary:
If $x^N=0$ for all $x\in A$, then $A$ satisfies the polynomial identity $x^N=0$, and the first corollary applies. $\Box$

Corollary: Suppose $A$ is a finitely generated algebraic algebra of bounded degree. Then $A$ is finite dimensional.
Proof of corollary:
Say $n$ bounds the degree. Then $S_n([y,x^n],[y,x^{n-1}],\cdots,[y,x])=0$. Why? If $x$ is algebraic of degree $\leq n$, then $\{x,\ldots, x^n\}$ is linearly dependent, and since $S_n$ is multilinear and alternating, the standard polynomial vanishes. We put the $y$'s in there so that the identity is not itself $0$ as a noncommutative polynomial. $\Box$





Wednesday, November 2, 2016

Amitsur-Levitsky Theorem

In a previous post, we claimed that $M_n(R)$ satisfies a polynomial identity of degree $2n$ for any commutative $F$-algebra $R$. In fact this is true for commutative rings, as well.

For each $k$, define the standard polynomial $\displaystyle S_k=\sum_{\sigma\in\mathcal S_k}(-1)^{|\sigma|}x_{\sigma(1)}\cdots x_{\sigma(k)}$ in the noncommuting variables $x_1,\ldots, x_k$.

Theorem: (Amitsur-Levitsky 1950) $M_n(R)$ satisfies the identity $S_{2n}=0$.

We will outline an elegant short proof due to Shmuel Rosset (1976). It is clearly sufficient to prove the theorem for $M_n(\mathbb Z)$, and then, since $\mathbb Z\subset \mathbb Q$, for $M_n(\mathbb Q)$. The point is that we now need only consider coefficients in a field of characteristic $0$.
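Before the proof, a quick numerical sanity check (a NumPy sketch of mine, not part of Rosset's argument): $S_4$ vanishes on random $2\times 2$ integer matrices, while $S_3$ generally does not.

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """Sign of a permutation, by counting transpositions while sorting."""
    p, s = list(perm), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def standard_poly(mats):
    """The standard polynomial S_k evaluated on a list of k square matrices."""
    n = mats[0].shape[0]
    total = np.zeros((n, n), dtype=np.int64)
    for p in permutations(range(len(mats))):
        prod = np.eye(n, dtype=np.int64)
        for i in p:
            prod = prod @ mats[i]
        total += sign(p) * prod
    return total

rng = np.random.default_rng(0)
A = [rng.integers(-5, 6, size=(2, 2)) for _ in range(4)]
print(standard_poly(A))        # always the zero matrix (the theorem for n = 2)
print(standard_poly(A[:3]))    # S_3 on 2x2 matrices: generally nonzero
```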

Consider the alternating (Grassmann) algebra $G$ with $2n$ anticommuting generators $\xi_1,\ldots,\xi_{2n}$. $G$ is a superalgebra: $G=G_0\oplus G_1$, where $G_0$ is spanned by products of an even number of generators and $G_1$ is spanned by products of an odd number of generators. In particular we will use that $G_0$ is a commutative algebra.

Now let $A_1,\ldots, A_{2n}\in M_n(\mathbb Q)$. Consider
$$X=A_1\otimes \xi_1+\cdots+A_{2n}\otimes \xi_{2n}.$$
Note that $$X^{2n}=S_{2n}(A_1,\ldots,A_{2n})\otimes \xi_1\cdots \xi_{2n}.$$ Thus, to show that $S_{2n}(A_1,\ldots, A_{2n})=0$, it is sufficient to show that $X^{2n}=0$. Note that $X^2\in M_n(G_0)$ is a matrix taking values in the commutative algebra $G_0$.

Lemma: Let $B\in M_n(R)$ for a commutative $\mathbb Q$-algebra $R$. If $\mathrm{tr}(B)=\mathrm{tr}(B^2)=\cdots=\mathrm{tr}(B^{n})=0$, then $B^n=0$.
Proof of lemma:
The matrix $B$ satisfies its characteristic polynomial, and the coefficients of the characteristic polynomial are polynomials in traces of powers of $B$. For example, in degree $2$, the characteristic polynomial can be written
$$\lambda^2-\mathrm{tr}(B)\lambda+\frac{1}{2}(\mathrm{tr}(B)^2-\mathrm{tr}(B^2)).$$
Plugging in $B$, we get $B^2-\mathrm{tr}(B)B+\frac{1}{2}(\mathrm{tr}(B)^2-\mathrm{tr}(B^2))I=0$, so if the traces are trivial, $B^2=0$ as well. One can prove that the characteristic polynomial of a $2\times 2$ matrix can be written this way by first embedding our field in its algebraic closure and considering the Jordan normal form $\begin{bmatrix}\lambda _1&*\\0&\lambda_2\end{bmatrix} $. Then the characteristic polynomial is $$(x-\lambda_1)(x-\lambda_2)=x^2-(\lambda_1+\lambda_2)x+\lambda_1\lambda_2.$$
On the other hand $\mathrm{tr}(B)=\lambda_1+\lambda_2$ and $\mathrm{tr}(B^2)=\lambda_1^2+\lambda_2^2$. In general, the coefficients of the characteristic polynomial for a matrix of arbitrary size will be symmetric polynomials in the eigenvalues and are expressible as polynomials in $\sum \lambda_i,\sum \lambda_i^2,\ldots$ (Newton's identities, which require rational coefficients). $\Box$
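One can watch this mechanism work with sympy (a sketch of mine; the random $3\times 3$ integer matrix is just a test case, and the division by $k$ in Newton's identities is exactly where characteristic $0$ is used):

```python
import random
from sympy import Matrix, Rational, symbols, simplify

def elem_from_power_sums(p):
    """Newton's identities: e_1..e_n from the power sums p_1..p_n."""
    e = [Rational(1)]                                   # e_0 = 1
    for k in range(1, len(p) + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)                                 # division by k: needs char 0
    return e[1:]

random.seed(1)
n = 3
B = Matrix(n, n, lambda i, j: random.randint(-3, 3))
p = [(B ** k).trace() for k in range(1, n + 1)]         # tr(B), tr(B^2), tr(B^3)
e = elem_from_power_sums(p)

x = symbols('x')
newton = x ** n + sum((-1) ** k * e[k - 1] * x ** (n - k) for k in range(1, n + 1))
print(simplify(newton - B.charpoly(x).as_expr()))       # 0: the coefficients match
# So tr(B^k) = 0 for k <= n forces the char poly to be x^n,
# and Cayley-Hamilton then gives B^n = 0.
```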

So to finish the proof, we need to show that $\mathrm{tr}(X^{2k})=0$ for all $k\leq n$. Now
$$X^{2k}=\sum A_{i_1}\cdots A_{i_{2k}}\otimes \xi_{i_1}\cdots \xi_{i_{2k}},$$
and the traces cancel in pairs: $\mathrm{tr}(A_{i_1}\cdots A_{i_{2k}})=\mathrm{tr}(A_{i_2}\cdots A_{i_{2k}}A_{i_1})$, while the corresponding terms $\xi_{i_1}\cdots \xi_{i_{2k}}$ and $\xi_{i_2}\cdots\xi_{i_{2k}}\xi_{i_1}$ appear with opposite sign, since cyclically shifting $2k$ anticommuting generators by one step takes $2k-1$ transpositions. $\Box$






Tuesday, November 1, 2016

Polynomial Identities

Let $F$ be a field and $F\langle X\rangle$ the free associative algebra with alphabet $X$. Let $f(x_1,\ldots,x_m)\in F\langle X\rangle$ be a nonzero polynomial (in non-commuting variables.) We say that an algebra $A$ satisfies the identity $f=0$ if $f(a_1,\ldots,a_m)=0$ for all $m$-tuples of elements of $A$.


Examples:
  1. $A$ commutative: then $A$ satisfies $x_1x_2-x_2x_1=0$.
  2. $A$ nilpotent (say $A^n=0$.) Then $A$ satisfies $x_1\ldots x_n=0$.
  3. $A$ finite dimensional (say of dimension $d$.) Let $$f(x_1,\ldots,x_{d+1})=\sum_{\sigma\in \mathcal{S}_{d+1}}(-1)^{|\sigma|}x_{\sigma(1)}\ldots x_{\sigma(d+1)}.$$ Then $f$ is multilinear and antisymmetric. If $e_1,\ldots,e_d$ is a basis for $A$, then $f(a_1,\ldots,a_{d+1})$ can be expressed as a linear combination of elements $f(e_{i_1},\ldots,e_{i_{d+1}})$ by multilinearity. However in each such summand, some basis element appears twice, and so $f$ vanishes by antisymmetry. We denote this particular $f$ by $S_{d+1}(x_1,\ldots,x_{d+1})$.
We will call algebras satisfying a polynomial identity "PI algebras."

Lemma: If $A$ satisfies a polynomial identity, it satisfies a multilinear polynomial identity.

Proof:
Let $f$ be a nonzero identity satisfied by $A$. We show how to linearize in the first variable $x_1$. Suppose $x_1$ appears with maximal degree $d$. Introduce new variables $y_1,\ldots, y_d$ and consider the sum
$$f(y_1+\cdots+y_d,x_2,\ldots, x_n)-\sum_i f(y_1+\cdots+\hat{y}_i+\cdots+y_d,x_2,\ldots,x_n)+\cdots$$
which is an alternating sum in which the second term is the sum over ways of omitting one variable, the next term is the sum over ways of omitting two variables, etc. This still vanishes, since it is a sum of $f$'s, each of which is $0$, but it has the effect of killing any monomials involving fewer than $d$ instances of $x_1$, and it linearizes those that remain. In fact, thinking of the $x_1$'s as indicating $d$ spots in a subword, it sums over putting the variables $y_1,\ldots, y_d$ in those spots in all possible ways. So for example $f(x_1,x_2)=x_1^2x_2x_1$ would get linearized to
$$\sum_{\sigma\in \mathcal{S}_3} y_{\sigma(1)}y_{\sigma(2)}x_2y_{\sigma(3)}.$$
Now  that we have linearized $x_1$, repeat the process to linearize the other variables.
$\Box$
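The inclusion-exclusion can be checked symbolically with sympy's noncommutative symbols (a sketch of mine for the example $f(x_1,x_2)=x_1^2x_2x_1$ above):

```python
from itertools import combinations, permutations
from sympy import symbols, expand

y1, y2, y3, x2 = symbols('y1 y2 y3 x2', commutative=False)
ys = [y1, y2, y3]

def f(x1_, x2_):
    return x1_ * x1_ * x2_ * x1_      # f(x1, x2) = x1^2 x2 x1, degree 3 in x1

# alternating sum over nonempty subsets of {y1, y2, y3} substituted for x1
lin = 0
for r in range(1, 4):
    for subset in combinations(ys, r):
        lin += (-1) ** (3 - r) * f(sum(subset), x2)

# the multilinearization: sum over S_3 of y_s(1) y_s(2) x2 y_s(3)
expected = sum(a * b * x2 * c for a, b, c in permutations(ys))
print(expand(lin - expected))         # 0
```

The sum over subsets keeps exactly the terms where each of $y_1,y_2,y_3$ is used at least once, which is why only the permutation terms survive.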

Lemma: If $A$ is a PI algebra and $R$ is a commutative $F$-algebra, then $A\otimes_F R$ is also a PI algebra.

Proof:
In view of the previous lemma, we may assume that $A$'s polynomial identity, $f(x_1,\ldots, x_m)=0$ is multilinear. So $$f=\sum_{\sigma\in S_m}\alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(m)},$$ for some coefficients $\alpha_\sigma\in F$.
 We claim that $A\otimes_F R$ satisfies the same polynomial identity. Since $f$ is linear in each variable, $f$ applied to a general $m$-tuple of elements of $A\otimes_F R$ is a linear combination of $f$ applied to basic tensors. Now
$$\begin{align*}
f(a_1\otimes r_1,\ldots ,a_m\otimes r_m)&=\sum_{\sigma\in \mathcal{S}_m}\alpha_\sigma a_{\sigma(1)}\cdots a_{\sigma(m)}\otimes r_{\sigma(1)}\cdots r_{\sigma(m)}\\
&= \sum_{\sigma\in \mathcal{S}_m}\alpha_\sigma a_{\sigma(1)}\cdots a_{\sigma(m)}\otimes r_{1}\cdots r_{m}\\
&=f(a_1,\ldots,a_m)\otimes r_1\cdots r_m\\
&=0
\end{align*}$$
$\Box$

Corollary: The algebra of $n\times n$ matrices over a commutative $F$-algebra $R$, $M_n(R)$, is a PI algebra.

Proof:
$M_n(F)$ is finite dimensional, so is a PI algebra as we saw earlier. On the other hand,
$M_n(R)=M_n(F)\otimes _F R$, so the previous lemma applies. $\Box$

Indeed, since the dimension of $M_n(F)$ is $n^2$, we know that $M_n(R)$ will satisfy the identity $S_{n^2+1}(x_1,\ldots,x_{n^2+1})=0$. It becomes an interesting question as to whether this is the best that can be done. Does $M_n(R)$ satisfy a lower degree identity? The answer turns out to be yes.

Theorem: $M_n(R)$ satisfies the identity $S_{2n}(x_1,\ldots,x_{2n})=0$ but does not satisfy any identity of degree $<2n$.

Proof of the lower bound: Suppose that $M_n(R)$ satisfies an identity of degree $k<2n$. Linearize, rename variables, and multiply by a scalar so that $$f=x_1\ldots x_k+\sum_{\sigma\neq 1} \alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(k)}.$$ Let $e_{ij}$ be the matrix with a $0$ in every entry except at the $(i,j)$ entry, where there is a $1$.
Now let $$x_1=e_{11},\ x_2=e_{12},\ x_3=e_{22},\ x_4=e_{23},\ldots$$
Then the only ordering of the factors whose product is nonzero is $x_1 x_2\cdots x_k$, so $f(x_1,\ldots,x_k)= x_1\cdots x_k\neq 0$, and therefore $M_n(R)$ does not satisfy the identity $f$. $\Box$
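For $n=2$ and $k=3$ the staircase substitution can be checked directly (a NumPy sketch of mine):

```python
import numpy as np
from itertools import permutations

def e(i, j, n=2):
    """Matrix unit e_ij inside M_n."""
    m = np.zeros((n, n), dtype=int)
    m[i - 1, j - 1] = 1
    return m

mats = [e(1, 1), e(1, 2), e(2, 2)]        # staircase substitution, k = 3 < 2n = 4
for p in permutations(range(3)):
    prod = mats[p[0]] @ mats[p[1]] @ mats[p[2]]
    print(p, prod.flatten())
# only the identity order (0, 1, 2) gives a nonzero product, namely e_12
```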

Stay tuned until the next blog post for the fact that $S_{2n}=0$.







Sunday, October 30, 2016

Grigorchuk's group

In a previous post we constructed a rather intricate counterexample to the Generalized Burnside Problem using the Golod-Shafarevich criterion for the infinite dimensionality of algebras.
In this post, we will look at a famous group introduced by Grigorchuk which will also be a counterexample to the generalized Burnside problem. Namely, it will be an infinite, finitely generated, $2$-group. The construction is amazingly elementary.

Start with an infinite rooted binary tree $T$, which we visualize growing downward.

The non-root nodes of the tree are in 1-1 correspondence with finite sequences of $0$'s and $1$'s. $0$ means head left while $1$ means head right.  The Grigorchuk group $G$ is a subgroup of the automorphism group of $T$. Let $a\in\mathrm{Aut}(T)$ be the involution which exchanges the left and right branches of the tree. On the level of finite bitstrings, it just flips the first bit. We define three other automorphisms $b$, $c$, and $d$ as follows:

Each of $b,c,d$ fixes the rightmost infinite chain of nodes and acts on the sequence of subtrees branching off to the left by the repeating pattern $a,a,1$, cyclically shifted for each generator: $b$ acts by $a,a,1,a,a,1,\ldots$; $c$ by $a,1,a,a,1,a,\ldots$; $d$ by $1,a,a,1,a,a,\ldots$.

If an automorphism fixes the two nodes adjacent to the root, it can be written as a pair of automorphisms $(x,y)$, each acting on a tree isomorphic to $T$. In other words, let $\mathrm{stab}_1$ denote the group of automorphisms fixing the two nodes of distance $1$ from the root. Then $\mathrm{stab}_1\cong \mathrm{Aut}(T)\times \mathrm{Aut}(T)$. With this convention, we see that $$b=(a,c), c=(a,d), d=(1,b).$$ Moreover one can take these as recursive definitions for $b,c,d$! It is easy to see that $\{1,b,c,d\}$ form a Klein 4-group.

Definition: Grigorchuk's group $G$ is the subgroup of $\mathrm{Aut}(T)$ generated by $\{a,b,c,d\}$.

By definition, $G$ is finitely generated. We also want to show that it is infinite and torsion.

To see that it is infinite, calculate $aba=(c,a), aca=(d,a)$ and $ada=(b,1)$. Thus $H=\mathrm{stab}_1\cap G$ contains elements with $a,b,c,d$ as their first coordinates. So $H$ surjects onto $G$ itself via projection to the first coordinate. But $H$ is a proper subgroup of $G$ (in fact index $2$), so the only way this could happen is if $G$ is infinite.

Now we show that $G$ is a torsion group. Any element of $G$ can be written as a product of generators which alternate between $a$ and non-$a$ letters. Moreover any such word is conjugate to one which begins with $a$ and ends with a non-$a$ letter. So consider a word
$$g=au_1au_2\ldots au_k,$$ where $u_i\in\{b,c,d\}$. We proceed by induction on the length $2k$. (If $k=1$, it is not hard to check that $ab,ac,ad$ are all torsion.) We consider two cases: $k$ even and $k$ odd. If $k$ is even, then $g\in\mathrm{stab}_1$, so we can write $g=(x,y)$. But $x$ and $y$ are now both words of length $\leq k$. For example, $$abac=(aba)c=(c,a)(a,d)=(ca,ad).$$ So we have reduced to a smaller case. Now suppose that $k$ is odd. Then $g\not\in\mathrm{stab}_1$, so we can't quite do the same thing. However $g^2\in\mathrm{stab}_1$. Now $g^2=(x,y)$ where $x$ and $y$ have length $\leq|g|$, so it doesn't look like we've made progress. However, if $d$ appears as one of the letters, then the length of each $x,y$ does go down. For example, if $g=abacad$, then
$$\begin{align*}
g^2&=(aba)c(ada)b(aca)d\\
&=(c,a)(a,d)(b,1)(a,c)(d,a)(1,b)\\
&=(cabad,adcab)
\end{align*}$$
The point is that $d$ appears (at least) twice and parity considerations show that it puts a copy of $1$ into the first and second coordinates. Thus $g^2$ is torsion, and hence so is $g$, by induction.

If $d$ doesn't appear in $g$, then similar parity considerations will show that every appearance of $b$ in $g$ will put a copy of $c$ into $x$ and $y$, while every appearance of $c$ will put a copy of $d$ into $x$ and $y$. That is $g^2=(x,y)$ where $x$ and $y$ now have either a $c$ or $d$ in both. If there is a $d$, then $x$ and $y$ are both torsion, and so therefore is $g^2$ and $g$. If there is a $c$, then $$g^4=(x^2,y^2)=((\alpha_1,\alpha_2),(\alpha_3,\alpha_4)),$$ where $d$ appears in each $\alpha_i$, so $g^4$ is torsion here as well.

Thus we have shown that $G$ is torsion (in fact a $2$-group)!
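The recursive definitions of $a,b,c,d$ translate directly into code acting on vertices as bitstrings, which makes the torsion claim easy to probe experimentally (a sketch of mine; checking up to a fixed depth only certifies relations on a finite quotient of the tree):

```python
from itertools import product

def a(w):  # swap the two subtrees at the root: flip the first bit
    return (1 - w[0],) + w[1:] if w else w

def b(w):  # b = (a, c)
    if not w: return w
    return ((0,) + a(w[1:])) if w[0] == 0 else ((1,) + c(w[1:]))

def c(w):  # c = (a, d)
    if not w: return w
    return ((0,) + a(w[1:])) if w[0] == 0 else ((1,) + d(w[1:]))

def d(w):  # d = (1, b)
    if not w: return w
    return ((0,) + w[1:]) if w[0] == 0 else ((1,) + b(w[1:]))

GEN = {'a': a, 'b': b, 'c': c, 'd': d}

def act(word, w):
    """Apply a word in the generators (leftmost first) to a vertex bitstring."""
    for g in word:
        w = GEN[g](w)
    return w

def order(word, depth=12, max_pow=64):
    """Least k with word^k acting trivially on all vertices at the given depth
    (fixing every depth-12 vertex fixes all shallower vertices too)."""
    pts = [tuple(v) for v in product((0, 1), repeat=depth)]
    cur = pts
    for k in range(1, max_pow + 1):
        cur = [act(word, w) for w in cur]
        if cur == pts:
            return k

print(order('ad'), order('ac'), order('ab'))   # 4 8 16
```

The printed order $4$ for $ad$ matches the computation in the text: $ada=(b,1)$ gives $(ad)^2=(b,b)$, which squares to the identity.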






Thursday, October 27, 2016

Niven's proof of the irrationality of Pi (Edited)

There is a beautiful short proof of $\pi$'s irrationality due to Ivan Niven, which I'd like to present here. In some sense, one could just read Niven's short one page paper, but I'd like to go over it anyway, if only to assure myself that there are no hidden details! I have presented this proof to a couple of my undergraduate classes, and I am editing this post to try to address some of the points where students tended to get confused.

As with most proofs of irrationality, we assume toward a contradiction that $\pi=a/b$ for some positive integers $a,b$. In other words, we suppose that $\pi=a/b$ and get a contradiction, i.e. a false statement. Since we get a false statement, it means that our initial assumption has to have been incorrect, and so $\pi$ cannot possibly equal $a/b$ for any $a,b$ integers.

Our strategy will be to use $a,b$ to construct a "magic" function, one so magical that it can't possibly exist! Our function will have the following two properties:
  1. $\int_0^\pi f(x)\sin x\,dx$ is a positive integer.
  2. $0<\int_0^\pi f(x)\sin x\,dx<1$
This is of course impossible, and is the desired contradiction. 

Now define the magic function to be of the form $$f(x)=\frac{x^n(a-bx)^n}{n!}$$
where $n$ is a positive integer which we will now determine. Let us look at $f(x)$ on the interval $[0,\pi]$. The biggest $x^n$ can be is $\pi^n$, whereas the biggest $(a-bx)^n$ can be is $a^n$ (when $x=0$). So $f(x)\leq \frac{\pi^na^n}{n!}$. This is a crude upper bound, it is not the exact maximum, but that's okay. Also, $f(x)>0$ when $0<x<\pi=a/b$ since one can easily see that $(a-bx)>0$ for $x<\pi$. So we have
$$ 0<\int_0^\pi f(x)\sin x\,dx \leq \frac{\pi^na^n}{n!}\cdot \pi,$$
because the integrand is bounded by $\frac{\pi^na^n}{n!}$ and the length of the interval is $\pi$. It is a calculus exercise to show that $\lim_{n\to\infty} \pi\frac{\pi^na^n}{n!}=0$. One way to solve this exercise is to note that this limit is equal to $\pi\lim_{n\to\infty} \frac{c^n}{n!}$ for $c=\pi a$, and $\sum_{n=0}^\infty \frac{c^n}{n!} =e^c$ is a power series which converges everywhere. So in particular $\lim_{n\to\infty} \frac{c^n}{n!}=0$. So now that we know the limit is $0$, we may pick a large $n$ so that $\int_0^\pi f(x)\sin x\,dx<1$ since the integral is bounded by a sequence tending to $0$. This establishes the second property we want for our "magic" function $f$.

To establish the first property, we use integration by parts to calculate  $\int_0^\pi f(x)\sin x\,dx$.
$$\begin{align*}
 \int f(x) \sin x\,dx&= -f(x)\cos x+\int f'(x)\cos x\,dx\\
&=-f(x)\cos x+f'(x)\sin x -\int f''(x)\sin x\,dx\\
&=-f(x)\cos x+f'(x)\sin x + f''(x)\cos x -\int f'''(x)\cos x\,dx\\
&\vdots\\
&=-f(x)\cos x+f'(x)\sin x + f''(x)\cos x-f'''(x)\sin x+\cdots \pm f^{(2n)}(x)\cos x
\end{align*}
 $$
The reason this eventually terminates is that $f(x)$ is a polynomial of degree $2n$. So $f^{(2n+1)}(x)=0$. We now have an antiderivative for $f(x)\sin x$, which we can use to calculate $\int_0^\pi f(x)\sin x\,dx$ by plugging in $\pi$ and subtracting off the value at $0$. Notice that all the $\sin$ terms disappear since $\sin(0)=\sin(\pi)=0$. We get
$$\begin{multline*} \int_0^\pi f(x)\sin x\,dx= f(\pi)-f^{(2)}(\pi)+f^{(4)}(\pi)-\cdots\pm f^{(2n)}(\pi)\\
 +f(0)-f^{(2)}(0)+\cdots \mp f^{(2n)}(0)
\end{multline*}$$

Hence, in order to show $\int_0^\pi f(x)\sin x\,dx$ is an integer, it will suffice to show that all derivatives of $f$ evaluated at $\pi$ and $0$ are integers. In fact, it is not difficult to check that $f(\pi-x)=f(x)$ using that $\pi=a/b$ by direct algebraic computation. Taking the derivative of both sides, using the chain rule, we get:
$$\begin{align*}
f(\pi-x)&=f(x)\\
-f'(\pi-x)&=f'(x)\\
f''(\pi-x)&=f''(x)\\
&\vdots\\
(-1)^kf^{(k)}(\pi-x)&=f^{(k)}(x)
\end{align*}$$
Plugging in $x=\pi$, we get $(-1)^k f^{(k)}(0)=f^{(k)}(\pi)$. So if we can show that $f^{(k)}(0)$ are integers, then it will automatically follow that $f^{(k)}(\pi)$ are integers, since they differ only by a sign.

So, now our proof will finally be done if we can show that $f^{(k)}(0)$ are integers. There is a nice trick for computing $f^{(k)}(0)$ for polynomials $f(x)$. Taylor's theorem says the coefficient of $x^k$ is $f^{(k)}(0)/k!$, or to put it another way $$f^{(k)}(0)=k! \times \text{ coefficient of }x^k.\,\,(\dagger)$$
We can compute the coefficients of powers of $x$ in $f(x)$ using the binomial theorem. First of all, because we are multiplying by $x^n$, the powers of $x$ in $f(x)$ range from $x^n$ to $x^{2n}$. So $f^{(k)}(0)=0$ for all $k<n$. Now, for $0\leq d\leq n$, the term involving $x^d$ in $(a-bx)^n$ is given by $\binom{n}{d}(-bx)^da^{n-d}=\binom{n}{d}(-b)^da^{n-d} x^d$. So the coefficient of $x^{n+d}$ in $\frac{1}{n!} x^n(a-bx)^n$ is given by
$\frac{1}{n!}\binom{n}{d}(-b)^da^{n-d}$. So, by the above formula $(\dagger)$ for calculating derivatives, we have
$$f^{(n+d)}(0)=\frac{(n+d)!}{n!}\binom{n}{d}(-b)^da^{n-d}.$$
This is an integer because $\frac{(n+d)!}{n!}=\frac{(n+d)(n+d-1)\cdots (n+1)n(n-1)\cdots 1}{n(n-1)\cdots1}=(n+d)(n+d-1)\cdots (n+1)$ is an integer. 

So in summary, the polynomial $f(x)$ only has nonzero derivatives $f^{(k)}(0)$ for $n\leq k\leq 2n$ and these derivatives have the explicit form given in the above formula, establishing that they are integers.
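These integrality claims are easy to spot-check with sympy (a sketch of mine; the values $a=22$, $b=7$, $n=4$ are arbitrary stand-ins for the assumed $\pi=a/b$):

```python
from sympy import symbols, factorial, diff, Rational

x = symbols('x')
a, b, n = 22, 7, 4
f = x**n * (a - b*x)**n / factorial(n)

for k in range(2 * n + 1):
    d0 = diff(f, x, k).subs(x, 0)                 # f^(k)(0)
    dp = diff(f, x, k).subs(x, Rational(a, b))    # f^(k)(a/b), the assumed "pi"
    assert d0 == (-1) ** k * dp                   # the symmetry f(a/b - x) = f(x)
    print(k, d0, dp)
# every printed value is an integer, and f^(k)(0) = 0 for k < n
```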

 It would be interesting to figure out exactly what was needed about $\pi$ to make this argument work. Can it be modified to show other numbers are irrational?

Edit: Matt Baker's Blog has a discussion on how to generalize this in a motivated way.  



Wednesday, October 26, 2016

Burnside Problems

I am sitting in on a nice class given by Efim Zelmanov at UCSD and I thought I'd talk a little bit on this blog about some of the stuff he has covered. This post will be about the Burnside problems:

Generalized Burnside Problem: Suppose that $G$ is a finitely generated group where every element has finite order. Is $G$ a finite group?

Burnside Problem: Suppose that $G$ is a finitely generated group, such that there exists an $N$ for which $g^N=1$ for all $g\in G$. Is $G$ finite?

It turns out that the answer to both these questions is no, and in this blog post we will explore how a detour through the theory of algebras helps settle these questions!

In 1941, Kurosh formulated the following question. Suppose that $F$ is a field and that $A$ is a finitely generated associative $F$-algebra (not assumed to have a unit). An element $a\in A$ is said to be nilpotent if $a^n=0$ for some $n\geq 1$, and an algebra is called a nilalgebra if all of its elements are nilpotent. The Kurosh problem is

Kurosh nilalgebra problem: Is every finitely generated nilalgebra finite dimensional?

Kurosh noticed that if there is a counterexample to this question for a field of positive characteristic, then there is a counterexample to the Generalized Burnside Problem. Here is the argument for that.

Theorem: If there is a counterexample to the Kurosh problem, there is a counterexample to the generalized Burnside problem.
Proof:
Suppose that there is some finitely generated infinite-dimensional nilalgebra over $F$ where $\mathrm{char}(F)=p>0$. Let $A=\langle a_1,\ldots, a_m\rangle$ be such an example. Let $\hat{A}=A\oplus F\cdot 1$ be the unital hull of $A$: the algebra you get by adding a unit. (Since $A$ was nil, it couldn't possibly have had a unit to begin with!) Now notice that for all $a\in A$, $1+a$ is invertible, since $1-a+a^2-a^3+\cdots$ is a finite sum and gives an inverse. Thus $1+a\in G(\hat A)$, the group of units of $\hat A$. Consider the subgroup of the group of units
$$G=\langle 1+a_1,\ldots,1+a_m\rangle\subset 1+A\subset G(\hat{A}).$$
Now every element in $1+A$ is torsion: we know there is some $n(a)$ where $a^{n(a)}=0$. Choose $k$ so that $p^k\geq n(a)$. Then $$(1+a)^{p^k}=1+a^{p^k}=1$$ since we are in characteristic $p$. Therefore $G$ is a finitely generated torsion group. If we can show it is infinite, we have a counterexample to the Generalized Burnside Problem.
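The identity $(1+a)^{p^k}=1+a^{p^k}$ is the freshman's dream in characteristic $p$; here is a tiny check with a nilpotent matrix over $\mathbb F_2$ (my own illustration):

```python
import numpy as np

# strictly upper triangular over F_2, so a^4 = 0
a = np.array([[0, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
u = (np.eye(4, dtype=int) + a) % 2
print(np.linalg.matrix_power(u, 4) % 2)   # (1+a)^(2^2) = 1 + a^4 = 1: the identity
```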

Suppose to the contrary that $|G|=d$. Let $g_1\ldots g_d$ be a product of elements of $G$. Then $\{1,g_1,g_1g_2,\ldots,g_1\ldots g_d\}$ is a set of $d+1$ elements, and by the pigeonhole principle two of the elements must coincide: say $g_1\ldots g_i=g_1\ldots g_j$ $(j>i)$. Then $g_{i+1}\ldots g_j=1$. So we can remove a segment from the original word $g_1\ldots g_d$ without changing the group element it represents. In particular any element of $G$ is a product of $<d$ generators (we don't need inverses because the group is torsion).
In other words,
$$\forall g\in G\,\, g=(1+a_{i_1})\cdots (1+a_{i_k})\,\,\,\,(k<d)$$
But now consider an arbitrary product of $d$ generators of $A$: $a_{i_1}\ldots a_{i_d}$. Then
$$(1+a_{i_1})\cdots (1+a_{i_d})=(1+a_{j_1})\cdots(1+a_{j_k})\,\,\,\, k<d.$$
Now expanding this out, we see that $a_{i_1}\ldots a_{i_d}$ is a linear combination of products of generators of length $<d$. This implies $A$ is finite dimensional, which is a contradiction. $\Box$

So it now suffices to find a counterexample to the Kurosh problem. We begin with some graded algebra conventions. Let $F\langle X\rangle$ be the free associative algebra on a finite alphabet $X=\{x_1,\ldots ,x_m\}$. This is graded by length of words:
$$F\langle X\rangle =F\cdot 1+F\langle X\rangle_1+F\langle X\rangle_2+\cdots$$.
 We say a subspace $H\subseteq F\langle X\rangle$ is graded if $H=\sum_i(H\cap F\langle X\rangle_i)$. If $R$ is a graded subspace, we can form the quotient algebra $\langle X\,|\, R=(0)\rangle$. Then since the ideal generated by $R$, $I=\mathrm{id}(R)$, is also graded, the factor algebra $$A=F\langle X\rangle/I=F\cdot 1+A_1+A_2+\cdots$$ is graded as well. (Note we assume all relations have degree $\geq 2$.)

Definition: Given a graded vector space $V=\oplus_{i\geq 0} V_i$ in which each graded summand is finite dimensional, define the Hilbert series $H_V(t)=\sum_{i\geq 0} (\mathrm{dim} V_i)t^i$. Note this is a formal power series.

We can order formal power series lexicographically by coefficients.

Proposition (Golod-Shafarevich): In the situation above, $$H_A(t)(1-mt+H_R(t))\geq 1,$$ where the comparison is of formal power series.

We will come back to the proof of this proposition at the end of the post. According to Zelmanov, it was given to Golod as a homework problem by Shafarevich!

Now, suppose that the proposition is true, and moreover suppose that we can find a $t_0$ with $0<t_0<1$ such that
  1. $H_R(t_0)$ converges.
  2. $1-mt_0+H_R(t_0)<0$.
Then the above proposition shows that $H_A(t_0)$ does not converge, so that $A$ must be infinite dimensional. (The formal inequality implies the numerical inequality when both sides converge.)

The construction of a finitely generated nilalgebra which is infinite dimensional:

Let $F$ be a countable field. Then the free algebra on $X$ without identity is also countable:
$$F_0\langle X\rangle =\{f_1,f_2,\ldots\}.$$
Let $d_1=2$. Let $d_2>\mathrm{deg}(f_1^{d_1})$, and $d_3>\mathrm{deg}(f_2^{d_2})$ etc. Let $R$ be the set of homogeneous components of $f_i^{d_i}$. By choice of the $d_i$'s, every degree occurs no more than once.

Therefore $H_R(t)\leq t^2+t^3+\cdots$, and so $H_R(t)$ converges when $0<t<1$. In order to satisfy the criterion 2. above, we need $1-mt_0+H_R(t_0)<0$, which will be satisfied if
$1-mt_0+t_0^2+t_0^3+\cdots<0$ for some $0<t_0<1$. This is an easy precalculus exercise!
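For concreteness, here is the exercise done numerically with the geometric majorant $t^2+t^3+\cdots=t^2/(1-t)$ (a sketch of mine; note that with this crude bound the number of generators matters, e.g. $m=4$ works at $t_0=1/2$):

```python
def upper(m, t):
    """1 - m t + t^2/(1 - t): an upper bound for 1 - m t + H_R(t)."""
    return 1 - m * t + t**2 / (1 - t)

for m in (2, 3, 4):
    best = min(upper(m, i / 1000) for i in range(1, 1000))
    print(m, round(best, 4))
# m = 4 dips below zero (upper(4, 0.5) = -0.5), so criterion 2 is satisfied;
# with this particular bound, m = 2 and m = 3 never go strictly negative
```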

So the algebra $A=\langle X\,|\, R=(0)\rangle$ is infinite dimensional. Moreover it is nil by construction: every element of $F_0\langle X\rangle$ is some $f_i$, and all the homogeneous components of $f_i^{d_i}$ lie in $R$, so $f_i^{d_i}$ maps to $0$ in $A$.

Returning to the proof of the Golod-Shafarevich inequality:

Let $r_i=\dim(R_i)$, $a_i=\dim(A_i)$, and, letting $I=\mathrm{id}(R)$, $b_i=\dim(I_i)$. Note that $a_i+b_i=\dim F_0\langle X\rangle_i=m^i$. The Golod-Shafarevich inequality can now be rewritten as
$$
a_n-ma_{n-1}+\sum_{i+j= n} a_ir_j\geq 0
$$

We use the following basic observation: $I\subseteq XI+RF\langle X\rangle$. This follows since a general element of $I$ is a linear combination of elements of the form $urv$, where $r\in R$ and $u,v$ are words. If $u$ is nonempty, then $urv\in XI$; otherwise, it is in $RF\langle X\rangle$. Now choose a complementary graded subspace $S$ so that $F\langle X\rangle=I+S$. Then $\mathrm{dim}(S_i)=a_i$. Moreover, since $RI\subseteq XI$ (relations have degree $\geq 2$), we can now say
$$I\subseteq XI+RS$$
Thus $b_n\leq mb_{n-1}+\sum_{i+j=n}r_ia_j$, which is $$(m^n-a_n)\leq m(m^{n-1}-a_{n-1})+\sum_{i+j=n}r_ia_j,$$
which is equivalent to the desired inequality. $\Box$