
Monday, November 21, 2016

Chirality and the Conway Polynomial

I'd like to relay a simple conjecture in knot theory, which, despite its simplicity to state, remains unproven. It has to do with the notion of chirality and its opposite amphicheirality. A knot K\subset S^3 is said to be amphicheiral if it is isotopic to its mirror image. There are two types. If the strand orientation is preserved, it is said to be + amphicheiral, while if the strand orientation is reversed, it is said to be - amphicheiral. Equivalently in both cases, there exists an orientation-reversing homeomorphism h\colon S^3\to S^3 which fixes the knot setwise and which either preserves or reverses the knot's strand orientation. If the knot is hyperbolic, then h can be realized by an isometry, and must be of finite order. One particularly interesting case is when h is an involution. In that case, whether or not K is hyperbolic, we say the knot is strongly \pm amphicheiral. It is possible that a knot can be both + and - amphicheiral simultaneously. The figure 8 knot has this property.

The Jones Polynomial V_K(t)\in \mathbb Z[t,t^{-1}] is a knot invariant which does not see strand orientation and which satisfies V_{K^*}(t)=V_{K}(t^{-1}), where K^* is the mirror image of K. Thus the Jones Polynomial of an amphicheiral knot must be symmetric under t\mapsto t^{-1}, and indeed, this is a good way to detect chirality of many knots, e.g. the trefoil.
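As a quick sanity check, here is this symmetry test in sympy. The polynomials in the comments are the standard Jones polynomials of the right-handed trefoil and of the figure 8 knot; the rest is just bookkeeping.

```python
# Testing V_K(t) = V_K(1/t), a necessary condition for amphicheirality.
import sympy as sp

t = sp.symbols('t')

# Right-handed trefoil: V(t) = -t^-4 + t^-3 + t^-1 (standard value).
V_trefoil = -t**(-4) + t**(-3) + t**(-1)
print(sp.simplify(V_trefoil - V_trefoil.subs(t, 1/t)) == 0)  # False: chiral

# Figure 8 knot: V(t) = t^2 - t + 1 - 1/t + 1/t^2 (standard value).
V_fig8 = t**2 - t + 1 - 1/t + t**(-2)
print(sp.simplify(V_fig8 - V_fig8.subs(t, 1/t)) == 0)        # True
```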

The Conway Polynomial C_K(z)\in \mathbb Z[z^2] on the other hand is a knot invariant which does not distinguish between K and K^*, so it cannot detect chirality in the same way as the Jones. However, we have

Conjecture:
Let \overline{C_K(z)}\in\mathbb Z_4[z^2] be the image of C_K(z) under the natural projection \mathbb Z[z^2]\twoheadrightarrow \mathbb Z_4[z^2]. If K is amphicheiral, then
\exists f(z)\in \mathbb Z_4[z] \text{ such that }\overline{C_K(z)}=f(z)f(-z)\in \mathbb Z_4[z].

(Like a true topologist, I am using \mathbb Z_k to denote the cyclic group of order k.)
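Since the conjecture concerns finitely many coefficients mod 4, it can be tested by brute force on any given knot. Here is a minimal Python sketch (the helper names are mine, and the search only considers factors f of bounded degree):

```python
# Search for f in Z_4[z] with f(z)f(-z) = C_K(z) mod 4.
from itertools import product

def mult_mod4(f, g):
    # multiply two coefficient lists (constant term first) over Z_4
    h = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            h[i + j] = (h[i + j] + x * y) % 4
    return h

def splits_mod4(conway, max_deg=4):
    for coeffs in product(range(4), repeat=max_deg + 1):
        f = list(coeffs)
        f_neg = [c if i % 2 == 0 else (-c) % 4 for i, c in enumerate(f)]
        prod_ = mult_mod4(f, f_neg)
        n = max(len(prod_), len(conway))
        if all((prod_[i] if i < len(prod_) else 0) ==
               (conway[i] % 4 if i < len(conway) else 0) for i in range(n)):
            return f
    return None

# Figure 8 knot (amphicheiral): C(z) = 1 - z^2 splits, e.g. as (1+z)(1-z).
print(splits_mod4([1, 0, -1]))
# Trefoil (chiral): C(z) = 1 + z^2 admits no such factorization of low degree.
print(splits_mod4([1, 0, 1]))
```

Note that splitting mod 4 is only a necessary condition for amphicheirality, so a successful search proves nothing about a given knot, while, if the conjecture holds, a failed (complete) search certifies chirality.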

In this blog post, I would like to explain several confluent lines of evidence for this conjecture. I made an equivalent conjecture in 2006, in a paper with the same name as my blog post. Later, my student Vajira Manathunga showed that my conjecture is equivalent to the one just stated.

The case of strongly amphicheiral knots:
Kawauchi and Hartley proved some time ago that if K is strongly amphicheiral, then C_K(z)=f(z)f(-z) for some f(z)\in \mathbb Z[z]. Indeed in the + case, they showed that C_K(z)=f(z^2)^2 for some f. (They phrased their results in terms of Alexander Polynomials.)

Later Hartley extended this result to the case of all - amphicheiral knots. At this point, the reader may wonder whether the real conjecture ought to be over the integers and not over \mathbb Z_4, but it turns out to be false over \mathbb Z. Hartley and Kawauchi did not consider the general case in their papers. It may be that they didn't see a reason why it should hold, or perhaps they had a counterexample which they didn't publish. In any event the first published counterexample was due to Ermotti et al, who found a + amphicheiral knot with an orientation reversing symmetry of order 4, which does not split over \mathbb Z.

Independently of the Kawauchi and Hartley story, I was led to conjecture the \mathbb Z_4 version after looking at certain properties of Vassiliev invariants related to the Conway polynomial. Gratifyingly, the Ermotti et al. example does split over \mathbb Z_4. Manathunga has since come up with many examples which don't split over \mathbb Z. See our joint paper on the arXiv. Indeed, in that paper, we prove the conjecture for the case of a positive amphicheiral symmetry that preserves a braid axis, using the Burau representation to calculate the Alexander (and hence Conway) polynomial.

I would like to sketch the argument given by Hartley and Kawauchi for the strongly positive case as well as explain a little bit the Vassiliev invariant background which led to this conjecture.

Review of the Alexander Polynomial:
The Alexander Polynomial is an invariant which is strongly related to the Conway polynomial, and has a very pleasant topological definition. The complement of any knot in S^3 has H_1(S^3\setminus K)=\mathbb Z, so the covering space X_K of S^3\setminus K associated with the commutator subgroup has deck transformation group \mathbb Z. This means that H_1(X_K) is a \mathbb Z[\mathbb Z]-module. We will think of the group ring \mathbb Z[\mathbb Z] as Laurent polynomials in t: \mathbb Z[t,t^{-1}]. Moreover, we will switch to rational coefficients, since \mathbb Q[t^{\pm 1}] is a PID. Let M_K be the first homology of X_K with rational coefficients, looked upon as a \Lambda=\mathbb Q[t^{\pm 1}] module. This is called the Alexander Module. One can show, for example by finding an explicit presentation for M_K, that M_K is a finitely generated torsion module over \Lambda. By the classification of modules over a PID, we have
M_K\cong \bigoplus \Lambda/ f_i \Lambda
for some f_i\in \Lambda.

The Alexander polynomial \Delta_K(t) is defined to be the product of all the f_i: \Delta_K(t)=\prod_i f_i. I.e. it is the order of the Alexander module M_K. The polynomials f_i are only well-defined up to units in \Lambda. So in addition to multiplying by rational numbers, one can also multiply by powers of t. The Alexander polynomial is therefore only well-defined up to these operations. However, there is a satisfying normal form. It turns out that \Delta_K(t) is equivalent up to multiplying by powers of t to \Delta_K(t^{-1}). So one can assume that it is symmetric: \Delta_K(t)=\Delta_K(t^{-1}). Moreover one can multiply by a constant so that \Delta_K(1)=1. With these conventions, though not obviously, \Delta_K(t) has integer coefficients. In fact, Levine showed that each f_i has these properties.

The Conway Polynomial is a repackaging of the Alexander polynomial under the change of variables z=t^{1/2}-t^{-1/2}.
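To make this change of variables concrete, here is a small sympy sketch (the helper is mine). It writes a symmetric Alexander polynomial as a_0+\sum_k a_k(t^k+t^{-k}) and uses t^k+t^{-k}=T_k, where T_0=2, T_1=z^2+2 and T_{k+1}=T_1T_k-T_{k-1}, all consequences of z^2=t-2+t^{-1}:

```python
import sympy as sp

z = sp.symbols('z')

def conway_from_alexander(coeffs):
    # coeffs = [a_0, a_1, ...] for Delta = a_0 + sum a_k (t^k + t^{-k}),
    # normalized so that Delta(1) = 1
    T = [sp.Integer(2), z**2 + 2]             # T_0, T_1
    while len(T) < len(coeffs):
        T.append(sp.expand(T[1] * T[-1] - T[-2]))
    return sp.expand(coeffs[0] + sum(a * Tk for a, Tk in zip(coeffs[1:], T[1:])))

# Trefoil: Delta = -1 + (t + 1/t)   ->  z**2 + 1
print(conway_from_alexander([-1, 1]))
# Figure 8: Delta = 3 - (t + 1/t)   ->  1 - z**2
print(conway_from_alexander([3, -1]))
```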

Hartley and Kawauchi's Argument for strongly positive amphicheiral knots:
Theorem: If K is strongly positive amphicheiral, then the Alexander module is a direct double M_K\cong N \oplus N for some \Lambda-module N. This implies that the Conway polynomial is a square f(z^2)^2.

Proof: Let \alpha\colon S^3\to S^3 be an orientation-reversing involution that preserves the knot's strand orientation. It turns out that it must have two fixed points in the complement of the knot. This implies that \alpha's restriction to S^3\setminus K lifts to an involution \tilde{\alpha} of the universal abelian cover X_K. We will use that M_K has a nondegenerate sesquilinear pairing, called the Blanchfield form:
B\colon M_K\otimes M_K\to \mathbb Q(t)/\mathbb Q[t,t^{-1}].
By definition, B(x,y) is defined as follows. Since M_K is torsion, there is some \Delta\in \mathbb Q[t,t^{-1}] and a 2-chain \sigma in X_K such that \Delta\cdot y=\partial \sigma. Indeed one could simply let \Delta be the Alexander polynomial. Then we let
B(x,y)=\frac{1}{\Delta}\sum_{n\in\mathbb Z} t^{-n}\,\big((t^nx)\cdot \sigma\big),
where \cdot denotes the algebraic intersection number.

B is sesquilinear: B(fx,gy)=f\bar g B(x,y) and B(x,y)=\overline{B(y,x)}.

Moreover, because \alpha preserves the knot's orientation, it must reverse a meridian. Hence \tilde{\alpha} t\tilde\alpha=t^{-1}, which implies that B(\tilde\alpha x,\tilde\alpha y)=-\overline{B(x,y)}. Now the trick is to define B'(x,y)=B(x,\tilde{\alpha} y). It is not hard to see that this is an antisymmetric nondegenerate bilinear form (not sesquilinear) on M_K. Now M_K is a finitely generated module over a PID; so it decomposes as a direct sum of pieces \Lambda/f\Lambda. Moreover we may assume that the f's are powers of primes p\in \Lambda. We can therefore decompose M_K into a direct sum of p-primary components: M_K=\oplus_p M_K^{(p)} where each M_K^{(p)} is a direct sum of cyclic modules of order p^k for some k. It is not hard to see that this must be an orthogonal decomposition with respect to B'. Now, each M_K^{(p)} decomposes into a direct sum of cyclic modules of order p, plus those of order p^2 and so forth. This decomposition is not necessarily orthogonal, but can be fixed up to be so, using an upper triangular argument. So we get summands \oplus_{i=1}^{j(k,p)} \Lambda/p^k\Lambda supporting a nondegenerate antisymmetric bilinear form. Tensoring with the field F=\Lambda/p\Lambda, we get a symplectic vector space of dimension j(k,p), which must therefore be even! This implies that M_K is a direct double, as desired. \Box



So, now that we understand this classical result of Hartley and Kawauchi, how did Vassiliev invariants enter the picture? For the purposes of this blog post it is not that important to know what exactly a Vassiliev invariant is. It is a type of knot invariant. Some are "Vassiliev" and some are not. The name is used as an adjective to describe some knot invariants in much the way the proper name "Hausdorff" is used as an adjective to describe certain topological spaces. The first thing we want to bring attention to is that the coefficients of the Conway polynomial are Vassiliev invariants of knots:
C_K(z)=1+\sum_{i\geq 1}c_{2i}(K)z^{2i},
and their Vassiliev degree is 2i. These invariants are not additive under connected sum, though additivity is often a desirable property. A standard trick to make them additive is to apply the formal logarithm of power series \log\colon 1+z\mathbb Z[[z]]\to z\mathbb Z[[z]] to C_K(z) and then consider the coefficients. However, this trick does not preserve integrality of the coefficients. This leads to the following definitions:

Definition: Define \exp_{\mathbb Z}\colon z\mathbb Z[[z]]\to 1+z\mathbb Z[[z]] by the formula \exp_{\mathbb Z}(a_1 z+a_2 z^2+\cdots)= (1-z)^{a_1}(1+z^2)^{a_2}(1-z^3)^{a_3}\cdots=\prod (1+(-z)^n)^{a_n}

It is not hard to see that \exp_{\mathbb Z} is bijective and sends addition to multiplication. We define \log_{\mathbb Z}\colon 1+z\mathbb Z[[z]]\to z\mathbb Z[[z]] as its inverse.

Primitive Vassiliev Invariants from the Conway Polynomial: We now define primitive (i.e. additive under connected sum) Vassiliev invariants of degree 2k coming from the Conway polynomial, pc_{2k}\colon \{\text{Knots}\}\to \mathbb Z, by
\log_{\mathbb Z}(C_K(z))=\sum_{i=1}^\infty pc_{2i}(K) z^{2i},
where the formal logarithm \log_{\mathbb Z} is taken with respect to the variable z^2.
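Here is a minimal computational sketch of \exp_{\mathbb Z}, \log_{\mathbb Z}, and the resulting pc invariants, using truncated integer power series (the list representation and helper names are my own choices):

```python
# Truncated power series as lists of integer coefficients; p[i] is the
# coefficient of w^i, where w stands for z^2, truncated at degree D.
D = 8

def mul(p, q):
    r = [0] * (D + 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            if i + j <= D:
                r[i + j] += x * y
    return r

def inv(p):
    # inverse of a series with constant term 1
    r = [0] * (D + 1); r[0] = 1
    for k in range(1, D + 1):
        r[k] = -sum(p[i] * r[k - i] for i in range(1, k + 1))
    return r

def power(p, e):
    base = p if e >= 0 else inv(p)
    r = [1] + [0] * D
    for _ in range(abs(e)):
        r = mul(r, base)
    return r

def factor(n):
    f = [0] * (D + 1); f[0] = 1; f[n] = (-1) ** n   # 1 + (-w)^n
    return f

def exp_Z(a):
    r = [1] + [0] * D
    for n, e in a.items():
        r = mul(r, power(factor(n), e))
    return r

def log_Z(f):
    # peel off the factor (1 + (-w)^n)^{a_n} one degree at a time
    a, g = {}, f[:]
    for n in range(1, D + 1):
        a[n] = (-1) ** n * g[n]
        g = mul(g, power(factor(n), -a[n]))
    return a

# pc invariants: log_Z of the Conway polynomial in the variable w = z^2.
# Trefoil: C = 1 + z^2; figure 8 knot: C = 1 - z^2.
for name, C in [("trefoil", [1, 1]), ("figure 8", [1, -1])]:
    pc = log_Z(C + [0] * (D - 1))
    assert exp_Z(pc) == C + [0] * (D - 1)     # round trip
    print(name, {2 * n: c for n, c in pc.items() if c})
```

For the figure 8 knot the only nonzero value is pc_2=1, while the trefoil produces pc_2=pc_4=pc_8=pc_{16}=-1 within this truncation.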

These invariants have an interesting property. Modulo 2, they drop in degree. pc_{2n}\pmod 2 is of degree 2n-1. On the other hand, no torsion is known among Vassiliev knot invariants. It is thus natural to conjecture

Conjecture: There exist integer-valued Vassiliev knot invariants v_{2n-1} of degree 2n-1 such that v_{2n-1}\equiv pc_{2n}\pmod 2.

Such invariants have been explicitly identified for n=1,2.

A knot invariant taking values in an abelian group is said to be odd if it changes sign under mirror image. Thus odd \mathbb Z-valued knot invariants vanish on amphicheiral knots.

Somewhat unmotivated conjecture: The invariants v_{4n-1} can be chosen to be odd.

This appears to be untrue for general v_{2n-1}, but experimental evidence suggested to me that it could be true in the stated half of cases! If this is true, it implies that pc_{4n}\equiv 0\pmod 2 on amphicheiral knots.

The following proposition then tells us how these considerations on Vassiliev invariants lead to the mod 4 splitting conjecture.

Proposition: pc_{4n}\equiv 0\pmod 2 for all n if and only if C_K(z) factors as f(z)f(-z) modulo 4.

So, one approach to proving the mod 4 splitting conjecture is to explicitly identify degree 4n-1 Vassiliev invariants v_{4n-1} which have the same parity as pc_{4n}. This appears to be a difficult problem. Already for  n=2 the answer is not known. In degree 7, the space of all Vassiliev invariants is spanned by polynomial invariants coming from the HOMFLY and Kauffman polynomials, together with a 2-fold cable. Computer calculations show that v_{4n-1} must involve this cabled invariant, though we have not explicitly identified it.



Wednesday, November 9, 2016

The Kurosh Problems for PI algebras via Combinatorics

This is a continuing series of posts on some of the highlights of a class at UCSD being given by Efim Zelmanov.

For a field F, recall the Kurosh problem as I stated it for algebras.

Kurosh Problem for nilalgebras: Suppose A is a finitely generated F-algebra which is nil. That is, for every a\in A there exists N\geq 1 such that a^N=0. Is A finite dimensional (equivalently, nilpotent)?

We saw that the Golod-Shafarevich construction gives a counterexample. However, it turns out that the answer is positive for PI algebras, which we will show. There is another version of the Kurosh problem which I didn't state since it was not relevant for the Burnside problem. It replaces the concept of "nil" with "algebraic."

Definition: An element a\in A is said to be algebraic if there is a polynomial f(t) with F-coefficients such that f(a)=0\in A. An algebra is said to be algebraic if every element is algebraic.

Kurosh Problem for algebraic algebras: Suppose A is algebraic and finitely generated. Is A finite dimensional?

Since nil implies algebraic, the Golod-Shafarevich construction shows that this is false in general, but we will show that it is indeed true for PI algebras.

Everything follows from a combinatorial lemma about words. But first we need a definition. Let X be a finite alphabet with a fixed order, say x_1<x_2<\cdots<x_m. Order the set of words X^* lexicographically, with the convention that if v is an initial word for w, then v>w. (This is the opposite of the usual convention.) So for example x_1x_2<x_1^2 and x_1x_2>x_1x_2x_3.

Definition: A word w\in X^* is n-divisible if w=w'u_1\cdots u_n w'' for some subwords u_i and every nontrivial permutation u_{\sigma(1)}\ldots u_{\sigma(n)} is a smaller word than u_1\ldots u_n. If a word is not n-divisible, we say it is n-indivisible.
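Since the definition is finite and combinatorial, n-divisibility of short words can be tested by brute force. Here is a sketch (function names mine), with letters written as digit characters and the order convention above, in which a proper initial word is larger:

```python
from itertools import combinations, permutations

def less(v, w):
    # v < w in the modified lexicographic order (proper prefix is LARGER)
    if v == w:
        return False
    if w.startswith(v):        # v a proper initial word of w  =>  v > w
        return False
    if v.startswith(w):
        return True
    return v < w               # ordinary lexicographic comparison

def is_n_divisible(word, n):
    L = len(word)
    # cuts 0 <= c_0 < ... < c_n <= L give word = w' u_1 ... u_n w''
    for cuts in combinations(range(L + 1), n + 1):
        u = [word[cuts[i]:cuts[i + 1]] for i in range(n)]
        base = ''.join(u)
        if all(less(''.join(u[i] for i in sigma), base)
               for sigma in permutations(range(n))
               if sigma != tuple(range(n))):
            return True
    return False

print(is_n_divisible('21', 2))    # True: the swap '12' is smaller than '21'
print(is_n_divisible('1122', 2))  # False: no 2-division exists
```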

If A satisfies a polynomial identity of degree n, then by linearizing we can assume it is of the form x_1\ldots x_n+\sum_{1\neq\sigma\in \mathcal S_n}\alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(n)}. Plugging in x_i=u_i, we can rewrite an n-divisible word as a linear combination of smaller words. Hence A is spanned by words in its generators which are n-indivisible.

Shirshov \mathcal N(m,k,n) lemma: There exists a function \mathcal N\colon \mathbb N^3\to \mathbb N such that any word of length \geq \mathcal N(m,k,n) in the alphabet X=\{x_1,\ldots, x_m\} either has a subword of the form v^k or is n-divisible.

Proof of Shirshov lemma:
We proceed by induction on the pair (n,m) ordered lexicographically. For the base case m=1, we can take \mathcal N(1,k,n)=k. So now suppose we know how to define it for all pairs less than (n,m).

Define a set of words
T=\{x_m^sx_{i_1}\cdots x_{i_\ell}\,|\, s\geq 1, \ell\geq 1, \forall j\,i_j<m, s<k, \ell<\mathcal N(m-1,k,n)\}
Note that T is finite. We say a word is a T-word if it is a product of elements of T. Any word that starts with x_m and ends with some x_i\neq x_m, and is both n-indivisible and doesn't contain any kth powers, must be a T-word. This is because any such word can be written as a product of T-type words except that s and \ell could be large. However, the size of s is bounded by the hypothesis of no kth powers, and each \ell is inductively bounded by \mathcal N(m-1,k,n) since the corresponding subword contains no kth powers and is n-indivisible.

Now we want to show that a long word v in X must contain kth powers or be n-divisible. So suppose v does not contain kth powers and is n-indivisible. Any such word v can be written as v'(x_1,\ldots,x_{m-1})v''x_m^\mu, where v'' is a T-word, \ell(v')<\mathcal N(m-1,k,n) and \mu<k. So we need to control the length of the T-word v''.

Now T-words can be ordered by looking at the order with respect to X or with respect to T. Luckily, or actually by design, these give the same order on T-words of the same composition.

Lemma: Given two T-words u,v of the same composition (that is they use the same letters with the same multiplicity from the alphabet X, or equivalently T), then
u<_T v\Leftrightarrow u<_X v.
Proof of lemma:
If u and v begin with the same letter of T, then proceed by induction. Otherwise, suppose that the first T word of v is an initial word of the first T word of u. Then, u<v with respect to both orders. \Box

Now suppose that \ell_T(v'')>\mathcal N(|T|,k,n-1)+1. Then v''=\underline{\hspace{2em}}\,t, where t\in T and the initial blank contains an (n-1)-division. We need to promote this (n-1)-division to an n-division. So suppose u_1\cdots u_{n-1} is an (n-1)-division, where the u_i are T-words.

Lemma: u_1\cdots u_{n-1} x_m is n-divisible.
Proof of lemma:
Write u_i=x_mu'_i. Divide the word as \underbrace{x_m}\underbrace{u_1'x_m}\underbrace{u_2'x_m}\cdots\underbrace{u'_{n-1}x_m}.
 Every permutation of these words decreases the order, given our assumption that this is true for the u_i's. \Box

So, to finish the proof, we can define \begin{multline*} \mathcal N(m,k,n)=[\mathcal N(m-1,k,n)+k] +\\ [\mathcal N(m-1,k,n)+k]\cdot\mathcal N(m^{\mathcal N(m-1,k,n)},k,n-1) \end{multline*}
Here the first [\mathcal N(m-1,k,n)+k] controls the length of v' and x_m^\mu. The second [\mathcal N(m-1,k,n)+k] bounds the length of an element of T, while m^{\mathcal N(m-1,k,n)} is an upper bound on the size of T.

This completes the proof of the Shirshov Lemma. \Box
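The recursion can be transcribed directly into code, although the tower of exponents makes all but tiny inputs hopeless. Here is a sketch, with base cases I am supplying: \mathcal N(1,k,n)=k as in the proof, and \mathcal N(m,k,1)=1, since every nonempty word is trivially 1-divisible.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def N(m, k, n):
    if n == 1:
        return 1            # any nonempty word is 1-divisible
    if m == 1:
        return k            # x_1^k contains a kth power
    block = N(m - 1, k, n) + k
    return block + block * N(m ** N(m - 1, k, n), k, n - 1)

print(N(2, 2, 2), N(3, 2, 2), N(2, 2, 3))   # 8 20 180
# For n >= 4 (or larger parameters at n = 3) the first argument of the
# inner call explodes and the recursion is no longer feasible.
```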

We actually need a slightly strengthened version of the Shirshov Lemma, which itself requires an elementary lemma.

Lemma: Let v,w\in X^* be words in the alphabet X. Then v,w commute iff they are powers of the same word.
Proof of lemma: 
We use induction on the sum of lengths \ell(v)+\ell(w). Assume that vw=wv. Then if \ell(v)=\ell(w), it is clear that v=w, since they are the initial words on both sides of the equality. On the other hand if \ell(v)<\ell(w), say, then v is an initial word of w: w=vw'. So vvw'=vw'v, and hence vw'=w'v. By induction v=u^k and w'=u^\ell, and w=u^{k+\ell}. \Box
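This lemma is easy to confirm exhaustively on short words (helper names mine):

```python
from itertools import product

def primitive_root(w):
    # the shortest u with w = u^(len(w)/len(u))
    for d in range(1, len(w) + 1):
        if len(w) % d == 0 and w[:d] * (len(w) // d) == w:
            return w[:d]

words = [''.join(p) for L in range(1, 7) for p in product('ab', repeat=L)]
for v in words:
    for w in words:
        # vw = wv  iff  v and w are powers of a common word
        assert (v + w == w + v) == (primitive_root(v) == primitive_root(w))
print("checked all pairs of words of length <= 6 over {a,b}")
```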

Lemma: Let v be a word that is not a nontrivial power. If \ell(v)\geq n, then v^{2n} is n-divisible.
Proof of lemma:
Write v=x_{i_1}\cdots x_{i_\ell} for \ell\geq n. Consider all cyclic permutations of the word. If two cyclic permutations are equal, then v'v''=v''v' for some decomposition, meaning that v is a proper power by the previous lemma. Now write
v=v_1'v_1''=v_2'v_2''=\cdots=v_\ell'v_\ell'',
which are all the ways of decomposing v into two words. Order them in such a way that
v_1''v_1'>v_2''v_2'>\cdots>v_\ell'' v_\ell'.
Note that v^2 contains all of these as subwords: v^2=v_i'\underbrace{v_i''v_i'}v_i''. Consider
v^{2n}=v^2\cdots v^2=(v_1'\underbrace{v_1''v_1'v_1'')(v_2'}\underbrace{v_2''v_2'v_2'')(v_3'}v_3''v_3'v_3'')\cdots
The bracketed subwords give an n-division. \Box

This leads to the following strengthening of the Shirshov Lemma:
Theorem: Let k\geq 2n. Then there exists \mathcal N(m,k,n) such that any word w of length \geq \mathcal N is n-divisible or contains a subword v^k with \ell(v)<n.

We can now deduce some interesting consequences.

Corollary: Let A=\langle a_1,\cdots, a_m\rangle be a finitely generated PI F-algebra such that every product a_{i_1}\ldots a_{i_s} for s<n is nilpotent. Then A is a nilpotent algebra, and hence finite dimensional.
Proof of corollary:
Let k be the maximum of all the degrees of nilpotency and 2n. Then A^{\mathcal N(m,k,n)}=(0). \Box
This implies the Kurosh problem for nilalgebras.

Corollary: Suppose that all words a_{i_1}\cdots a_{i_s} for s<n are algebraic. Then A is finite dimensional.
Proof of corollary:
A is spanned by products of length <\mathcal N(m,k,n): any longer word is either n-divisible, hence a linear combination of smaller words, or contains a subword v^k with \ell(v)<n, which can be rewritten in terms of lower powers of v since v is algebraic. \Box

Corollary: Suppose A is a finitely generated nilalgebra where the degrees of nilpotency are bounded. Then A is finite dimensional.
Proof of corollary:
We have x^n=0 for all x\in A, so it satisfies a polynomial identity. \Box

Corollary: Suppose A is a finitely generated algebraic algebra of bounded degree. Then A is finite dimensional.
Proof of corollary:
Say n bounds the degree. Then S_n([y,x^n],[y,x^{n-1}],\cdots,[y,x])=0. Why? If x is algebraic of degree n, then \{x,\ldots, x^n\} are linearly dependent, so that the standard polynomial will be 0. We put the y's in there so the identity is not itself 0 as a noncommutative polynomial. \Box





Wednesday, November 2, 2016

Amitsur-Levitsky Theorem

In a previous post, we claimed that M_n(R) satisfies a polynomial identity of degree 2n for any commutative F-algebra R. In fact this is true for commutative rings, as well.

For each k, define the standard polynomial \displaystyle S_k=\sum_{\sigma\in\mathcal S_k}(-1)^{|\sigma|}x_{\sigma(1)}\cdots x_{\sigma(k)} in the noncommuting variables x_1,\ldots, x_k.

Theorem: (Amitsur-Levitsky 1950) M_n(R) satisfies the identity S_{2n}=0.

We will outline an elegant short proof due to Shmuel Rosset (1976). It is clearly sufficient to prove the theorem for M_n(\mathbb Z), and then since \mathbb Z\subset \mathbb Q, for M_n(\mathbb Q). The point being that we now need only consider coefficients in a field of characteristic 0.

Consider the alternating (Grassmann) algebra G with 2n anticommuting generators \xi_1,\ldots,\xi_{2n}. G is a superalgebra: G=G_0\oplus G_1, where G_0 is spanned by products of an even number of generators and G_1 is spanned by products of an odd number of generators. In particular we will use that G_0 is a commutative algebra.

Now let A_1,\ldots, A_{2n}\in M_n(\mathbb Q). Consider
X=A_1\otimes \xi_1+\cdots+A_{2n}\otimes \xi_{2n}.
Note that X^{2n}=S_{2n}(A_1,\ldots,A_{2n})\otimes \xi_1\cdots \xi_{2n}. Thus, to show that S_{2n}(A_1,\ldots, A_{2n})=0, it is sufficient to show that X^{2n}=0. Note that X^2\in M_n(G_0) is a matrix taking values in the commutative algebra G_0.

Lemma: Let B\in M_n(R) for a commutative \mathbb Q-algebra R. Then if \mathrm{tr}(B)=\mathrm{tr}(B^2)=\cdots=\mathrm{tr}(B^{n})=0, then B^n=0.
Proof of lemma:
The matrix B satisfies its characteristic polynomial, and the coefficients of the characteristic polynomial are polynomials in traces of powers of B. For example, in degree 2, the characteristic polynomial can be written
\lambda^2-\mathrm{tr}(B)\lambda+\frac{1}{2}(\mathrm{tr}(B)^2-\mathrm{tr}(B^2)).
Plugging in B, we get B^2-\mathrm{tr}(B)B+\frac{1}{2}(\mathrm{tr}(B)^2-\mathrm{tr}(B^2))=0, so if the traces are trivial, B^2=0 as well. One can prove that the characteristic polynomial of a 2\times 2 matrix can be written this way by first embedding our field in its algebraic closure and considering the Jordan normal form \begin{bmatrix}\lambda _1&*\\0&\lambda_2\end{bmatrix}. Then the characteristic polynomial is (x-\lambda_1)(x-\lambda_2)=x^2-(\lambda_1+\lambda_2)x+\lambda_1\lambda_2.
On the other hand \mathrm{tr}(B)=\lambda_1+\lambda_2 and \mathrm{tr}(B^2)=\lambda_1^2+\lambda_2^2. In general, the coefficients of the characteristic polynomial for a matrix of arbitrary size will be symmetric polynomials in the eigenvalues and are expressible as polynomials in \sum \lambda_i,\sum \lambda_i^2,\ldots. \Box
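As a sanity check, here is the 2\times 2 case in sympy, confirming that the trace expressions really are the coefficients of the characteristic polynomial (Cayley-Hamilton):

```python
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
B = sp.Matrix([[a, b], [c, d]])
expr = B**2 - B.trace() * B \
       + sp.Rational(1, 2) * (B.trace()**2 - (B * B).trace()) * sp.eye(2)
print(sp.expand(expr))   # the 2x2 zero matrix
```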

So to finish the proof, we need to show that \mathrm{tr}(X^{2k})=0 for all k\leq n. Now
X^{2k}=\sum A_{i_1}\cdots A_{i_{2k}}\otimes \xi_{i_1}\cdots \xi_{i_{2k}},
and the trace will cancel in pairs since \mathrm{tr}(A_{i_1}\cdots A_{i_{2k}})=\mathrm{tr}(A_{i_2}\cdots A_{i_{2k}}A_{i_1}) and these two terms appear with opposite sign.
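The theorem is also easy to test numerically. Here is a numpy sketch evaluating the standard polynomial on random 2\times 2 integer matrices, so n=2 and the identity is S_4=0; for contrast, S_3 generically fails:

```python
import numpy as np
from itertools import permutations

def standard_poly(mats):
    k = len(mats)
    total = np.zeros_like(mats[0])
    for sigma in permutations(range(k)):
        sign = 1                      # sign via inversion count
        for i in range(k):
            for j in range(i + 1, k):
                if sigma[i] > sigma[j]:
                    sign = -sign
        term = np.eye(mats[0].shape[0], dtype=int)
        for i in sigma:
            term = term @ mats[i]
        total = total + sign * term
    return total

rng = np.random.default_rng(0)
A = [rng.integers(-5, 5, size=(2, 2)) for _ in range(4)]
print(standard_poly(A))   # zero matrix, as Amitsur-Levitsky predicts
B = [rng.integers(-5, 5, size=(2, 2)) for _ in range(3)]
print(standard_poly(B))   # S_3 on 2x2 matrices is generally nonzero
```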






Tuesday, November 1, 2016

Polynomial Identities

Let F be a field and F\langle X\rangle the free associative algebra with alphabet X. Let f(x_1,\ldots,x_m)\in F\langle X\rangle be a nonzero polynomial (in non-commuting variables.) We say that an algebra A satisfies the identity f=0 if f(a_1,\ldots,a_m)=0 for all m-tuples of elements of A.


Examples:
  1. A commutative: then A satisfies x_1x_2-x_2x_1=0.
  2. A nilpotent (say A^n=0.) Then A satisfies x_1\ldots x_n=0.
  3. A finite dimensional (say of dimension d.) Let f(x_1,\ldots,x_{d+1})=\sum_{\sigma\in \mathcal{S}_{d+1}}(-1)^{|\sigma|}x_{\sigma(1)}\ldots x_{\sigma(d+1)}. Then f is multilinear and antisymmetric. If e_1,\ldots,e_d is a basis for A, then f(a_1,\ldots,a_{d+1}) can be expressed as a linear combination of elements f(e_{i_1},\ldots,e_{i_{d+1}}) by multilinearity. However in each such summand, some basis element appears twice, and so f vanishes by antisymmetry. We denote this particular f by S_{d+1}(x_1,\ldots,x_{d+1}).
We will call algebras satisfying a polynomial identity "PI algebras."

Lemma: If A satisfies a polynomial identity, it satisfies a multilinear polynomial identity.

Proof:
Let f be a nonzero identity satisfied by A. We show how to linearize in the first variable x_1. Suppose x_1 appears with maximal degree d. Introduce new variables y_1,\ldots, y_d and consider the sum
f(y_1+\cdots+y_d,x_2,\ldots, x_n)-\sum f(y_1+\cdots+\hat{y}_i+\cdots+y_d,x_2,\ldots)+\cdots
which is an alternating sum in which the second term is the sum of ways of omitting one variable. The next term will be the sum of ways of omitting two variables etc. Then this identity still holds since it is a sum of f's, each of which is 0, but it has the effect that it kills any monomials involving fewer than d instances of x_1 and it linearizes those that do. In fact, thinking of the x_1's as indicating d spots in a subword, it sums over putting the variables y_1,\ldots, y_d in those spots in all possible ways. So for example f(x_1,x_2)=x_1^2x_2x_1 would get linearized to
\sum_{\sigma\in \mathcal{S}_3} y_{\sigma(1)}y_{\sigma(2)}x_2y_{\sigma(3)}.
Now  that we have linearized x_1, repeat the process to linearize the other variables.
\Box
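The inclusion-exclusion above can be verified directly with sympy's noncommutative symbols, on the worked example f(x_1,x_2)=x_1^2x_2x_1:

```python
import sympy as sp
from itertools import combinations, permutations

x2 = sp.Symbol('x2', commutative=False)
y = sp.symbols('y1 y2 y3', commutative=False)

def f(x1, x2):
    return x1 * x1 * x2 * x1          # degree d = 3 in x1

# alternating sum over nonempty subsets of {y1, y2, y3}
lin = sp.Integer(0)
for r in range(1, 4):
    for subset in combinations(y, r):
        lin += (-1) ** (3 - r) * f(sum(subset), x2)

# the claimed linearization: sum over placements of y1, y2, y3
expected = sum(y[s[0]] * y[s[1]] * x2 * y[s[2]]
               for s in permutations(range(3)))
print(sp.expand(lin - expected) == 0)   # True
```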

Lemma: If A is a PI algebra and R is a commutative F-algebra, then A\otimes_F R is also a PI algebra.

Proof:
In view of the previous lemma, we may assume that A's polynomial identity, f(x_1,\ldots, x_m)=0 is multilinear. So f=\sum_{\sigma\in S_m}\alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(m)}, for some coefficients \alpha_\sigma\in F.
 We claim that A\otimes_F R satisfies the same polynomial identity. Since f is linear in each variable, f applied to a general m-tuplet of elements of A\otimes_F R is a linear combination of f applied to basic tensors. Now
\begin{align*} f(a_1\otimes r_1,\ldots ,a_m\otimes r_m)&=\sum_{\sigma\in \mathcal{S}_m}\alpha_\sigma a_{\sigma(1)}\cdots a_{\sigma(m)}\otimes r_{\sigma(1)}\cdots r_{\sigma(m)}\\ &= \sum_{\sigma\in \mathcal{S}_m}\alpha_\sigma a_{\sigma(1)}\cdots a_{\sigma(m)}\otimes r_{1}\cdots r_{m}\\ &=f(a_1,\ldots,a_m)\otimes r_1\cdots r_m\\ &=0 \end{align*}
\Box

Corollary: The algebra M_n(R) of n\times n matrices over a commutative F-algebra R is a PI algebra.

Proof:
M_n(F) is finite dimensional, so is a PI algebra as we saw earlier. On the other hand,
M_n(R)=M_n(F)\otimes _F R, so the previous lemma applies. \Box

Indeed, since the dimension of M_n(F) is n^2, we know that M_n(R) will satisfy the identity S_{n^2+1}(x_1,\ldots,x_{n^2+1})=0. It becomes an interesting question as to whether this is the best that can be done. Does M_n(R) satisfy a lower degree identity? The answer turns out to be yes.

Theorem: M_n(R) satisfies the identity S_{2n}(x_1,\ldots,x_{2n})=0 but does not satisfy any identity of degree <2n.

Proof that no identity of degree k<2n holds: Suppose that M_n(R) satisfies an identity of degree k<2n. Linearize, rename variables, and multiply by a scalar so that f=x_1\ldots x_k+\sum_{\sigma\neq 1} \alpha_\sigma x_{\sigma(1)}\cdots x_{\sigma(k)}. Let e_{ij} be the matrix with a 0 in every entry except at the (i,j) entry, where there is a 1.
Now let x_1=e_{11},x_2=e_{12},x_3=e_{22}, x_4=e_{23},\ldots
Then the only order in which this multiplies to be nonzero is x_1 x_2\ldots x_k, so f(x_1,\ldots,x_k)= x_1\ldots x_k\neq 0, and therefore M_n(R) does not satisfy the identity f. \Box

Stay tuned until the next blog post for the proof that M_n(R) satisfies S_{2n}=0.







Sunday, October 30, 2016

Grigorchuk's group

In a previous post we constructed a rather intricate counterexample to the Generalized Burnside Problem using the Golod-Shafarevich criterion for the infinite dimensionality of algebras.
In this post, we will look at a famous group introduced by Grigorchuk which will also be a counterexample to the generalized Burnside problem. Namely, it will be an infinite, finitely generated, 2-group. The construction is amazingly elementary.

Start with an infinite rooted binary tree T, which we visualize growing downward.

The non-root nodes of the tree are in 1-1 correspondence with finite sequences of 0's and 1's. 0 means head left while 1 means head right.  The Grigorchuk group G is a subgroup of the automorphism group of T. Let a\in\mathrm{Aut}(T) be the involution which exchanges the left and right branches of the tree. On the level of finite bitstrings, it just flips the first bit. We define three other automorphisms b, c, and d as follows:

Each of b,c,d fixes the rightmost chain of nodes and alternates acting by a,a,1,a,a,1,\ldots on the sequence of trees branching off to the left.

If an automorphism fixes the two nodes adjacent to the root, it can be written as a pair of automorphisms (x,y), each acting on a tree isomorphic to T. In other words, let \mathrm{stab}_1 denote the group of automorphisms fixing the two nodes of distance 1 from the root. Then \mathrm{stab}_1\cong \mathrm{Aut}(T)\times \mathrm{Aut}(T). With this convention, we see that b=(a,c), c=(a,d), d=(1,b). Moreover one can take these as recursive definitions for b,c,d! It is easy to see that \{1,b,c,d\} form a Klein 4-group.

Definition: Grigorchuk's group G is the subgroup of \mathrm{Aut}(T) generated by \{a,b,c,d\}.

By definition, G is finitely generated. We also want to show that it is infinite and torsion.

To see that it is infinite, calculate aba=(c,a), aca=(d,a) and ada=(b,1). Thus H=\mathrm{stab}_1\cap G contains elements with a,b,c,d as their first coordinates. So H surjects onto G itself. But H is a proper subgroup of G (in fact index 2) so the only way this could happen is if G is infinite.

Now we show that G is a torsion group. Any element of G can be written as a product of generators which alternate between a and non-a letters. Moreover any such word is conjugate to one which begins with a and ends with a non-a letter. So consider a word
g=au_1au_2\ldots au_k, where u_i\in\{b,c,d\}. We proceed by induction on the length 2k. (If k=1, it is not hard to check that ab,ac,ad are all torsion.) We consider two cases: k even and k odd. If k is even, then g\in\mathrm{stab}_1, so we can write g=(x,y). But x and y are now both words of length \leq k. For example, abac=(aba)c=(c,a)(a,d)=(ca,ad). So we have reduced to a smaller case. Now suppose that k is odd. Then g\not\in\mathrm{stab}_1, so we can't quite do the same thing. However g^2\in\mathrm{stab}_1. Now g^2=(x,y) where x and y have length \leq|g|, so it doesn't look like we've made progress. However, if d appears as one of the letters, then the length of each x,y does go down. For example, if g=abacad, then
\begin{align*} g^2&=(aba)c(ada)b(aca)d\\ &=(c,a)(a,d)(b,1)(a,c)(d,a)(1,b)\\ &=(cabad,adcab) \end{align*}
The point is that d appears (at least) twice and parity considerations show that it puts a copy of 1 into the first and second coordinates. Thus g^2 is torsion, and hence so is g, by induction.

If d doesn't appear in g, then similar parity considerations will show that every appearance of b in g will put a copy of c into x and y, while every appearance of c will put a copy of d into x and y. That is g^2=(x,y) where x and y now have either a c or d in both. If there is a d, then x and y are both torsion, and so therefore is g^2 and g. If there is a c, then g^4=(x^2,y^2)=((\alpha_1,\alpha_2),(\alpha_3,\alpha_4)), where d appears in each \alpha_i, so g^4 is torsion here as well.

Thus we have shown that G is torsion (in fact a 2-group)!
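The recursive definitions make G very pleasant to experiment with. Here is a sketch implementing the generators as maps on finite bitstrings (tuples of 0s and 1s, with 0 meaning left); the search depth is my own crude choice and is only adequate for short words:

```python
from itertools import product

def a(s):
    return s if not s else (1 - s[0],) + s[1:]

def b(s):
    if not s: return s
    return (0,) + a(s[1:]) if s[0] == 0 else (1,) + c(s[1:])   # b = (a, c)

def c(s):
    if not s: return s
    return (0,) + a(s[1:]) if s[0] == 0 else (1,) + d(s[1:])   # c = (a, d)

def d(s):
    if not s: return s
    return (0,) + s[1:] if s[0] == 0 else (1,) + b(s[1:])      # d = (1, b)

def order(word, depth=12):
    # order of the composite, computed on all bitstrings of length <= depth
    strings = [t for L in range(depth + 1) for t in product((0, 1), repeat=L)]
    def step(batch):
        out = []
        for s in batch:
            for g in reversed(word):
                s = g(s)
            out.append(s)
        return out
    cur, n = step(strings), 1
    while cur != strings:
        cur, n = step(cur), n + 1
    return n

print([order([g]) for g in (a, b, c, d)])   # [2, 2, 2, 2]
print(order([a, b]), order([a, d]))         # finite orders, as proved above
```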






Thursday, October 27, 2016

Niven's proof of the irrationality of Pi (Edited)

There is a beautiful short proof of \pi's irrationality due to Ivan Niven, which I'd like to present here. In some sense, one could just read Niven's short one page paper, but I'd like to go over it anyway, if only to assure myself that there are no hidden details! I have presented this proof to a couple of my undergraduate classes, and I am editing this post to try to address some of the points where students tended to get confused.

As with most proofs of irrationality, we assume toward a contradiction that \pi=a/b for some positive integers a,b. In other words, we suppose that \pi=a/b and get a contradiction, i.e. a false statement. Since we get a false statement, it means that our initial assumption has to have been incorrect, and so \pi cannot possibly equal a/b for any a,b integers.

Our strategy will be to use a,b to construct a "magic" function, one so magical that it can't possibly exist! Our function will have the following two properties:
  1. \int_0^\pi f(x)\sin x\,dx is a positive integer.
  2. 0<\int_0^\pi f(x)\sin x\,dx<1
This is of course impossible, and is the desired contradiction. 

Now define the magic function to be of the form f(x)=\frac{x^n(a-bx)^n}{n!}
where n is a positive integer which we will now determine. Let us look at f(x) on the interval [0,\pi]. The biggest x^n can be is \pi^n, whereas the biggest (a-bx)^n can be is a^n (when x=0). So f(x)\leq \frac{\pi^na^n}{n!}. This is a crude upper bound, it is not the exact maximum, but that's okay. Also, f(x)>0 when 0<x<\pi=a/b since one can easily see that (a-bx)>0 for x<\pi. So we have
0<\int_0^\pi f(x)\sin x\,dx \leq \frac{\pi^na^n}{n!}\cdot \pi,
because the integrand is bounded by \frac{\pi^na^n}{n!} and the length of the interval is \pi. It is a calculus exercise to show that \lim_{n\to\infty} \pi\frac{\pi^na^n}{n!}=0. One way to solve this exercise is to note that this limit is equal to \pi\lim_{n\to\infty} \frac{c^n}{n!} for c=\pi a, and \sum_{n=0}^\infty \frac{c^n}{n!} =e^c is a power series which converges everywhere. So in particular \lim_{n\to\infty} \frac{c^n}{n!}=0. So now that we know the limit is 0, we may pick a large n so that \int_0^\pi f(x)\sin x\,dx<1 since the integral is bounded by a sequence tending to 0. This establishes the second property we want for our "magic" function f.

To establish the first property, we use integration by parts to calculate  \int_0^\pi f(x)\sin x\,dx.
\begin{align*}  \int f(x) \sin x\,dx&= -f(x)\cos x+\int f'(x)\cos x\,dx\\ &=-f(x)\cos x+f'(x)\sin x -\int f''(x)\sin x\,dx\\ &=-f(x)\cos x+f'(x)\sin x + f''(x)\cos x -\int f'''(x)\cos x\,dx\\ &\vdots\\ &=-f(x)\cos x+f'(x)\sin x + f''(x)\cos x-f'''(x)\sin x+\cdots \pm f^{(2n)}(x)\cos x \end{align*}
The reason this eventually terminates is that f(x) is a polynomial of degree 2n. So f^{(2n+1)}(x)=0.  We now have an antiderivative for f(x)\sin x, which we can use to calculate \int_0^\pi f(x)\sin x\,dx  by plugging in \pi and subtracting off the value at 0. Notice that all the \sin terms disappear since \sin(0)=\sin(\pi)=0. We get
\begin{multline*} \int_0^\pi f(x)\sin x\,dx= f(\pi)-f^{(2)}(\pi)+f^{(4)}(\pi)-\cdots\pm f^{(2n)}(\pi)\\  +f(0)-f^{(2)}(0)+\cdots \mp f^{(2n)}(0) \end{multline*}

Hence, in order to show \int_0^\pi f(x)\sin x\,dx is an integer, it will suffice to show that all derivatives of f evaluated at \pi and 0 are integers. In fact, it is not difficult to check that f(\pi-x)=f(x) using that \pi=a/b by direct algebraic computation. Taking the derivative of both sides, using the chain rule, we get:
\begin{align*} f(\pi-x)&=f(x)\\ -f'(\pi-x)&=f'(x)\\ f''(\pi-x)&=f''(x)\\ &\vdots\\ (-1)^kf^{(k)}(\pi-x)&=f^{(k)}(x) \end{align*}
Plugging in x=\pi, we get (-1)^k f^{(k)}(0)=f^{(k)}(\pi). So if we can show that f^{(k)}(0) are integers, then it will automatically follow that f^{(k)}(\pi) are integers, since they differ only by a sign.

So, now our proof will finally be done if we can show that f^{(k)}(0) are integers. There is a nice trick for computing f^{(k)}(0) for polynomials f(x). Taylor's theorem says the coefficient of x^k is f^{(k)}(0)/k!, or to put it another way f^{(k)}(0)=k! \times \text{ coefficient of }x^k.\,\,(\dagger)
We can compute the coefficients of powers of x in f(x) using the binomial theorem. First of all, because we are multiplying by x^n, the powers of x in f(x) range from x^n to x^{2n}. So f^{(k)}(0)=0 for all k<n. Now, for 0\leq d\leq n, the term involving x^d in (a-bx)^n is given by \binom{n}{d}(-bx)^da^{n-d}=\binom{n}{d}(-b)^da^{n-d} x^d. So the coefficient of x^{n+d} in \frac{1}{n!} x^n(a-bx)^n is given by
\frac{1}{n!}\binom{n}{d}(-b)^da^{n-d}. So, by the above formula (\dagger) for calculating derivatives, we have
f^{(n+d)}(0)=\frac{(n+d)!}{n!}\binom{n}{d}(-b)^da^{n-d}.
This is an integer because \frac{(n+d)!}{n!}=\frac{(n+d)(n+d-1)\cdots (n+1)n(n-1)\cdots 1}{n(n-1)\cdots1}=(n+d)(n+d-1)\cdots (n+1) is an integer. 

So in summary, the polynomial f(x) only has nonzero derivatives f^{(k)}(0) for n\leq k\leq 2n and these derivatives have the explicit form given in the above formula, establishing that they are integers.
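Since the whole argument turns on these derivative formulas, here is a direct sympy check for a sample n (the values a=22, b=7 are of course just stand-ins; no integers actually satisfy \pi=a/b):

```python
import sympy as sp

x = sp.symbols('x')
n, a, b = 3, 22, 7
f = x**n * (a - b*x)**n / sp.factorial(n)

for k in range(2*n + 1):
    val = sp.diff(f, x, k).subs(x, 0)
    assert val.is_integer          # every derivative at 0 is an integer
    if k >= n:
        d = k - n
        assert val == (sp.factorial(k) / sp.factorial(n)
                       * sp.binomial(n, d) * (-b)**d * a**(n - d))
    else:
        assert val == 0            # powers of x in f start at x^n
print("all derivatives of f at 0 are integers")
```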

 It would be interesting to figure out exactly what was needed about \pi to make this argument work. Can it be modified to show other numbers are irrational?

Edit: Matt Baker's Blog has a discussion on how to generalize this in a motivated way.  



Wednesday, October 26, 2016

Burnside Problems

I am sitting in on a nice class given by Efim Zelmanov at UCSD and I thought I'd talk a little bit on this blog about some of the stuff he has covered. This post will be about the Burnside problems:

Generalized Burnside Problem: Suppose that G is a finitely generated group where every element has finite order. Is G a finite group?

Burnside Problem: Suppose that G is a finitely generated group, such that there exists an N for which g^N=1 for all g\in G. Is G finite?

It turns out that the answer to both these questions is no, and in this blog post we will explore how using a detour through the theory of algebras will help us settle this question!

In 1941, Kurosh formulated the following question. Suppose that F is a field, A_F is an associative F algebra (not assuming a unit) and that A is finitely generated. An element a\in A is said to be nilpotent if \exists n\geq 1 \,\,a^n=0, and an algebra is called a nilalgebra if all of its elements are nilpotent. The Kurosh problem is

Kurosh nilalgebra problem: Is every finitely generated nilalgebra finite dimensional?

Kurosh noticed that if there is a counterexample to this question for a field of positive characteristic, then there is a counterexample to the Generalized Burnside Problem. Here is the argument for that.

Theorem: If there is a counterexample to the Kurosh problem, there is a counterexample to the generalized Burnside problem.
Proof:
Suppose that there is some finitely generated infinite-dimensional nilalgebra over F where \mathrm{char}(F)=p>0. Let A=\langle a_1,\ldots, a_m\rangle be such an example. Let \hat{A}=A\oplus F\cdot 1 be the unital hull of A: the algebra you get by adding a unit. (Since A was nil, it couldn't possibly have had a unit to begin with!) Now notice that for all a\in A, 1+a is invertible since 1-a+a^2-a^3+\cdots is a finite sum and an inverse. Thus 1+a\in G(\hat A), the group of units of \hat A. Consider the subgroup of the group of units
G=\langle 1+a_1,\ldots,1+a_m\rangle\subset 1+A\subset G(\hat{A}).
Now every element in 1+A is torsion: we know there is some n(a) where a^{n(a)}=0. Choose k so that p^k\geq n(a). Then (1+a)^{p^k}=1+a^{p^k}=1 since we are in characteristic p. Therefore G is a finitely generated torsion group. If we can show it is infinite, we have a counterexample to the Generalized Burnside Problem.

Suppose to the contrary that |G|=d. Consider a product g_1\ldots g_d of d elements of G. Then \{1,g_1,g_1g_2,\ldots,g_1\ldots g_d\} is a set of d+1 elements, and by the pigeonhole principle two of the elements must coincide: say g_1\ldots g_i=g_1\ldots g_j (j>i). Then g_{i+1}\ldots g_j=1. So we can remove a segment from the original word g_1\ldots g_d without changing the group element it represents. In particular any element of G is a product of <d generators (we don't need inverses because the group is torsion).
In other words,
\forall g\in G\,\, g=(1+a_{i_1})\cdots (1+a_{i_k})\,\,\,\,(k<d)
But now consider an arbitrary product of d generators of A: a_{i_1}\ldots a_{i_d}. Then
(1+a_{i_1})\cdots (1+a_{i_d})=(1+a_{j_1})\cdots(1+a_{j_k})\,\,\,\, k<d.
Now expanding this out, we see that a_{i_1}\ldots a_{i_d} is a linear combination of products of generators of length <d. This implies A is finite dimensional, which is a contradiction. \Box

So it now suffices to find a counterexample to the Kurosh problem. We begin with some graded algebra conventions. Let F\langle X\rangle be the free associative algebra on a finite alphabet X=\{x_1,\ldots ,x_m\}. This is graded by length of words:
F\langle X\rangle =F\cdot 1+F\langle X\rangle_1+F\langle X\rangle_2+\cdots.
 We say a subspace H\subseteq F\langle X\rangle is graded if H=\sum_i(H\cap F\langle X\rangle_i). If R is a graded subspace, we can form the quotient algebra \langle X\,|\, R=(0)\rangle. Then since the ideal generated by R, I=\mathrm{id}(R), is also graded, the factor algebra A=F\langle X\rangle/I=F\cdot 1+A_1+A_2+\cdots is graded as well. (Note we assume all relations have degree \geq 2.)

Definition: Given a graded vector space V=\oplus_{i\geq 0} V_i in which each graded summand is finite dimensional, define the Hilbert series H_V(t)=\sum_{i\geq 0} (\mathrm{dim} V_i)t^i. Note this is a formal power series.

We can order formal power series lexicographically by coefficients.

Proposition (Golod-Shafarevich): In the situation above, H_A(t)(1-mt+H_R(t))\geq 1, where the comparison is of formal power series.

We will come back to the proof of this proposition at the end of the post. According to Zelmanov, it was given to Golod as a homework problem by Shafarevich!

Now, suppose that the proposition is true, and moreover suppose that we can find a t_0 with 0<t_0<1 such that
  1. H_R(t_0) converges.
  2. 1-mt_0+H_R(t_0)<0.
Then the above proposition shows that H_A(t_0) does not converge, so that A must be infinite. (The formal inequality implies the numerical inequality when both sides converge.)

The construction of a finitely generated nilalgebra which is infinite dimensional:

Let F be a countable field. Then the free algebra on X without identity is also countable:
F_0\langle X\rangle =\{f_1,f_2,\ldots,\}.
Let d_1=2. Let d_2>\mathrm{deg}(f_1^{d_1}), and d_3>\mathrm{deg}(f_2^{d_2}) etc. Let R be the set of homogeneous components of f_i^{d_i}. By choice of the d_i's, every degree occurs no more than once.

Therefore H_R(t)\leq t^2+t^3+\cdots, and so H_R(t) converges when 0<t<1. In order to satisfy the criterion 2. above, we need 1-mt_0+H_R(t_0)<0, which will be satisfied if
1-mt_0+t_0^2+t_0^3+\cdots<0 for some 0<t_0<1. This is an easy precalculus exercise!
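For concreteness, here is that exercise done numerically, using the bound H_R(t)\leq t^2+t^3+\cdots=t^2/(1-t). The minimum of 1-mt+t^2/(1-t) on (0,1) stays \geq 0 for m=2,3 but dips below zero for m\geq 4, so (with this crude bound) four generators suffice:

```python
def phi(m, t):
    # 1 - m t + H_R(t), with H_R bounded by t^2/(1-t)
    return 1 - m*t + t**2 / (1 - t)

for m in range(2, 6):
    best = min(phi(m, i / 100) for i in range(1, 100))
    print(m, round(best, 4))

print(phi(4, 0.3))   # e.g. t_0 = 0.3 works for m = 4: phi < 0
```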

So the algebra A=\langle F_0\langle X\rangle\,|\, R=(0)\rangle is infinite dimensional. Moreover it is nil by construction: every element of F_0\langle X\rangle has a power which maps to zero in A.

Returning to the proof of the Golod-Shafarevich inequality:

Let r_i=\mathrm{dim}(R_i), a_i=\mathrm{dim}(A_i), and, letting I=\mathrm{id}(R), b_i=\dim(I_i). Note that a_i+b_i=\mathrm{dim} F_0\langle X\rangle_i=m^i. The Golod-Shafarevich inequality can now be rewritten as
a_n-ma_{n-1}+\sum_{i+j= n} a_ir_j\geq 0

We use the following basic observation: I\subseteq XI+RF\langle X\rangle. This follows since a general element of I is a linear combination of elements of the form u_iRv_i. If R is not at the left of the word, then u_iRv_i\in XI. Otherwise, it is in RF\langle X\rangle. Now choose a complementary graded subspace S so that F\langle X\rangle=I+S. Then \mathrm{dim}(S_i)=a_i. Moreover, we can now say
I\subseteq XI+RS
Thus b_n\leq mb_{n-1}+\sum_{i+j=n}r_ia_j, which is (m^n-a_n)\leq m(m^{n-1}-a_{n-1})+\sum_{i+j=n}r_ia_j,
which is equivalent to the desired inequality.