Expanders, Property Testing and the PCP theorem

Lecture 8: Confining a random walk on an expander is hard

Posted in lectures by HQN on September 23, 2008

Let G be an (n,d,\alpha)-spectral expander and let B be a subset of vertices of G of size \beta n. Suppose we choose a vertex of G uniformly at random and then walk randomly for t steps.

Theorem 1 (Confining a random walk is hard). Let (B,t) be the event that the walk is confined within B the entire time. Then, Prob\left[(B,t)\right] \leq \sqrt\beta (\alpha + (1-\alpha)\beta)^t. In particular, if \alpha < 1 and \beta < 1, then this confinement probability is exponentially small in t.

Proof. Let \mathbf P = (p_{ij}) be the “projection onto B” matrix, i.e. p_{ii} = 1 for all i \in B and p_{ij}=0 for all other i,j. Noting that \mathbf P is idempotent, it is not difficult to see that

Prob\left[(B,t)\right] = \|(\mathbf{P\hat A})^t\mathbf{Pu}\|_1 = \|(\mathbf{P\hat AP})^t\mathbf{Pu}\|_1 \leq \sqrt n \|(\mathbf{P\hat AP})^t\mathbf{Pu}\|_2

Thus, to bound Prob\left[(B,t)\right] we need to know how much the matrix \mathbf{M = P\hat AP} shrinks a vector after each multiplication. Since \mathbf M is symmetric, the same trick we used in the last lecture gives, for any non-zero vector \mathbf y,

\frac{\|\mathbf{My}\|}{\|\mathbf y\|} \leq \lambda_1(\mathbf M) = \max_{\mathbf z\neq \mathbf 0} \frac{\mathbf z^T\mathbf{Mz}}{\mathbf z^T\mathbf z}

Now, consider any non-zero vector \mathbf z. Let \mathbf{x = Pz}. Then, \mathbf z^T\mathbf{Mz} = \mathbf z^T\mathbf{P\hat APz} = \langle \mathbf{\hat Ax, x} \rangle. First, express \mathbf x as a linear combination of the orthonormal eigenvectors of \mathbf{\hat A}:

\mathbf x = c_1\mathbf u_1 + c_2 \mathbf u_2 + \cdots + c_n \mathbf u_n

Let \mathbf x^{\|} = c_1\mathbf u_1 and \mathbf x^{\perp} = \mathbf{x-x^{\|}}. Then,

\langle \mathbf{\hat Ax, x} \rangle = \langle \mathbf{\hat A x^{\|}} + \mathbf{\hat A x^{\perp}}, \mathbf x^{\|} + \mathbf x^{\perp} \rangle = \|\mathbf x^{\|}\|^2 + \langle \mathbf{\hat A x^\perp, x^\perp} \rangle

(The cross terms vanish: \mathbf{\hat A x^{\|}} = \mathbf x^{\|} since \hat\lambda_1 = 1, and \mathbf{\hat A x^\perp} stays in the span of \mathbf u_2, \dots, \mathbf u_n, which is orthogonal to \mathbf x^{\|}.)

The usual trick gives

\langle \mathbf{\hat A x^\perp, x^\perp} \rangle = \|\mathbf x^\perp\|^2 \frac{\langle \mathbf{\hat A x^\perp, x^\perp} \rangle}{\|\mathbf x^\perp\|^2} \leq \hat\lambda \|\mathbf x^\perp\|^2 = \hat\lambda \left(\|\mathbf x\|^2 - \|\mathbf x^{\|}\|^2\right)

Since \mathbf x is supported on B, Cauchy-Schwarz gives \|\mathbf x^{\|}\|^2 = c_1^2 = \langle \mathbf x, \mathbf u_1\rangle^2 = \left(\frac{1}{\sqrt n}\sum_{i\in B} x_i\right)^2 \leq \frac{|B|}{n}\|\mathbf x\|^2 = \beta \|\mathbf x\|^2. We conclude that

\langle \mathbf{\hat Ax, x} \rangle \leq \hat\lambda\|\mathbf x\|^2 + (1-\hat\lambda)\|\mathbf x^{\|}\|^2 \leq (\alpha + (1-\alpha)\beta)\|\mathbf x\|^2 \leq (\alpha + (1-\alpha)\beta)\|\mathbf z\|^2,

where the last step uses \|\mathbf x\| = \|\mathbf{Pz}\| \leq \|\mathbf z\|.

Consequently, each time we apply \mathbf M to a vector we shrink its length by a factor of at least (\alpha + (1-\alpha)\beta). There are t applications, and the initial vector \mathbf{Pu} has length \sqrt{\beta/n}; combined with the \sqrt n factor above, the theorem follows.
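Theorem 1 is easy to check numerically. The following minimal sketch (entirely ours; the lecture contains no code) computes Prob\left[(B,t)\right] exactly as \|(\mathbf{P\hat A})^t\mathbf{Pu}\|_1 on the complete graph K_n, which is an (n, n-1, 1/(n-1))-spectral expander, and compares it against the bound.

import numpy as np

# Sanity check of Theorem 1 on the complete graph K_n, an
# (n, n-1, 1/(n-1))-spectral expander. Illustrative choice, not from the post.
n, t = 20, 10
A = (np.ones((n, n)) - np.eye(n)) / (n - 1)    # normalized adjacency matrix \hat A
alpha = 1.0 / (n - 1)                          # \hat\lambda(K_n) = 1/(n-1)

B = np.arange(n // 4)                          # confine the walk to beta*n vertices
beta = len(B) / n
P = np.zeros((n, n)); P[B, B] = 1.0            # projection onto B

v = P @ (np.ones(n) / n)                       # P u: uniform start, restricted to B
for _ in range(t):
    v = P @ (A @ v)                            # one walk step, then project back onto B
prob = v.sum()                                 # ||(P \hat A)^t P u||_1 = Prob[(B,t)]

bound = np.sqrt(beta) * (alpha + (1 - alpha) * beta) ** t
print(prob, bound)
assert prob <= bound                           # the bound holds (with plenty of room here)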

Theorem 2 (Saving random bits for RP). Recall that a language L is in the class RP if there exists a polynomial-time randomized algorithm A satisfying the following conditions:

  • x \in L \Rightarrow Prob(A accepts x) \geq 1/2
  • x \notin L \Rightarrow Prob(A accepts x) = 0

Suppose A uses r random bits. To reduce the error probability to 1/2^k, the easy way is to run the algorithm k times with fresh random bits and accept the input if the algorithm accepts it at least once. However, this approach requires kr random bits. By imposing a (strongly explicit) (n,d,\alpha)-spectral expander on the space of random strings of length r (so n = 2^r), we can obtain the same effect with only r + O(k\log d) random bits: use r random bits to pick a starting vertex, then take a length-O(k) random walk, spending \log d bits per step, and run A on every random string visited. When x \in L, the set B of random strings on which A rejects has density at most 1/2, so by Theorem 1 the walk stays confined to B — the only way every run rejects — with probability at most roughly (\alpha + (1-\alpha)\cdot\frac 1 2)^{O(k)}. Here, we require \alpha + (1-\alpha)\cdot\frac 1 2 < 1, i.e. \alpha < 1. A strongly explicit expander is an expander for which there is a polynomial-time algorithm that, given a vertex, computes its neighboring vertices.
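Below is a minimal code sketch (ours) of this amplification. The helpers accepts(x, rand) — the algorithm A run with a given random string — and neighbors(rand) — the d neighbors of a vertex of the strongly explicit expander on \{0,1\}^r — are hypothetical stand-ins: the post does not fix an implementation for either, and the constant 4 in the walk length is an arbitrary placeholder for the O(k) constant, which depends on \alpha.

import random

def amplify_rp(x, accepts, neighbors, r, k):
    """One-sided error reduction for RP via an expander walk (sketch)."""
    rand = random.getrandbits(r)                # r truly random bits: the start vertex
    for _ in range(4 * k):                      # O(k) walk steps
        if accepts(x, rand):
            return True                         # RP has one-sided error: accepting is always safe
        rand = random.choice(neighbors(rand))   # ~log(d) random bits per step
    return False                                # every visited random string rejected

If x \notin L, no random string makes A accept, so the sketch never errs in that direction; if x \in L, it rejects only when the whole walk is confined to the bad set B, which Theorem 1 makes exponentially unlikely in k.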

Some linear algebra

Posted in lectures by HQN on September 21, 2008

I mentioned the Perron-Frobenius theorem in the lecture. There are two versions of the theorem, one for positive matrices and one for non-negative matrices. Define the spectral radius \rho(\mathbf A) = \max\{|\lambda|\}, where the max goes over all eigenvalues \lambda of \mathbf A. Here |\lambda| is the modulus of \lambda, as we allow complex eigenvalues.

Theorem 1 (Perron-Frobenius, positive matrices). Let \rho be the spectral radius of a positive matrix \mathbf A. Then,

  1. \rho is positive
  2. \rho is a (real) eigenvalue of \mathbf A
  3. there exists a positive \rho-eigenvector \mathbf x, i.e. \mathbf{Ax} = \rho\mathbf x with \mathbf x > \mathbf 0
  4. \rho has geometric and algebraic multiplicity 1, i.e. \rho is a simple root of \det(\lambda\mathbf{I-A})=0 and the \rho-eigenspace has dimension 1
  5. |\lambda|<\rho for all other eigenvalues \lambda \neq \rho of \mathbf A, i.e. \rho is the unique eigenvalue of maximum modulus

Theorem 2 (Perron-Frobenius, non-negative matrices). Let \rho be the spectral radius of a non-negative matrix \mathbf A. Then,

  1. \rho is non-negative
  2. \rho is a (real) eigenvalue of \mathbf A
  3. there exists a non-negative \rho-eigenvector \mathbf x, i.e. \mathbf{Ax} = \rho\mathbf x with \mathbf x \geq \mathbf 0

Theorem 2 follows from Theorem 1 by perturbing the non-negative matrix (so that it becomes positive) and then taking a limit. See my favourite linear algebra book for the proofs of these theorems.
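The positive-matrix version is easy to see in action. A quick numerical illustration (ours, not from the lecture) on a random entrywise-positive matrix:

import numpy as np

# Check the conclusions of Theorem 1 on a random positive matrix.
rng = np.random.default_rng(0)
A = rng.random((6, 6)) + 0.1                    # entrywise positive

eigvals, eigvecs = np.linalg.eig(A)
order = np.argsort(-np.abs(eigvals))            # sort by modulus, largest first
rho = eigvals[order[0]]                         # eigenvalue achieving the spectral radius

print(abs(rho.imag) < 1e-12 and rho.real > 0)   # rho is a positive real eigenvalue
x = eigvecs[:, order[0]].real
x = x if x.sum() > 0 else -x                    # eigenvectors are defined up to sign
print(np.all(x > 0))                            # the rho-eigenvector can be taken positive
print(np.abs(eigvals[order[1]]) < np.abs(rho))  # rho strictly dominates all other moduli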

Lecture 7: Random walks on expanders are rapidly mixing

Posted in lectures by HQN on September 19, 2008

Let \hat{\mathbf A} be the normalized adjacency matrix of a d-regular graph G, let 1=\hat \lambda_1 \geq \dots \geq \hat \lambda_n be the eigenvalues of this matrix, and define \hat\lambda(G) = \max\{|\hat\lambda_2|, |\hat\lambda_n|\}.

Suppose we take a random walk on G starting from an initial distribution \pi^{(0)}. Then, the distribution after t steps is \pi^{(t)} = \hat{\mathbf A}^t\pi^{(0)}. Using the spectral theorem, it is not difficult to prove that, if \hat\lambda(G) <1 then \lim_{t\to \infty} \pi^{(t)} = \mathbf u, where \mathbf u is the uniform distribution. The more difficult question is “how fast is the walk converging to this limit?”

To answer this question, we need a notion of distance between two distributions. The total variation distance is probably the most commonly used. Since we are working on a finite probability space, the total variation distance is half the l_1-distance. Thus, we would like a theorem bounding the rate at which \|\pi^{(t)}-\mathbf u\|_1 tends to zero. We proved that

Theorem 1 (Convergence in l_1). \|\pi^{(t)}-\mathbf u\|_1 \leq \sqrt n\, \hat\lambda^t.

Theorem 2 (Convergence in l_2). \|\pi^{(t)}-\mathbf u\|_2 \leq \hat\lambda^t.

Theorem 1 follows from Theorem 2 by the Cauchy-Schwarz inequality. To prove Theorem 2, note that

\|\pi^{(t)}-\mathbf u\|_2 = \|\hat{\mathbf A}^t\pi^{(0)} - \mathbf u\|_2 =  \|\hat{\mathbf A}^t(\pi^{(0)} - \mathbf u)\|_2 =  \|\hat{\mathbf A}^t\mathbf v\|_2

where \mathbf v = \pi^{(0)} - \mathbf u is perpendicular to \mathbf u and \|\mathbf v\|_2 \leq 1. Each time we apply \hat{\mathbf A} to a vector \mathbf y perpendicular to \mathbf u (its first eigenvector), we shrink the length of \mathbf y by a factor of at least \hat\lambda. Hence the theorem follows.

An (n,d,\alpha)-spectral expander is a d-regular graph on n vertices such that \hat \lambda(G) \leq \alpha. If \alpha < 1, then an (n,d,\alpha)-spectral expander certainly has spectral gap 1 - \hat\lambda_2 bounded away from zero. Conversely, a graph whose spectral gap is bounded away from zero becomes a spectral expander after adding self-loops (the self-loops push \hat\lambda_n away from -1).
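Both convergence theorems, and the self-loop remark, can be checked concretely in a few lines. The sketch below (our own example) uses the 4-dimensional hypercube, which is bipartite (\hat\lambda_n = -1), so we add self-loops via a lazy walk to bring \hat\lambda(G) below 1:

import numpy as np

# Mixing of the lazy walk on the 4-dimensional hypercube (16 vertices, 4-regular).
n, d = 16, 4
H = np.zeros((n, n))
for x in range(n):
    for b in range(d):
        H[x, x ^ (1 << b)] = 1.0 / d   # normalized adjacency of the hypercube
A = (np.eye(n) + H) / 2.0              # lazy walk: self-loops kill bipartiteness

eig = np.sort(np.linalg.eigvalsh(A))
lam = max(abs(eig[0]), abs(eig[-2]))   # \hat\lambda(G) = max(|lambda_2|, |lambda_n|); 0.75 here

pi = np.zeros(n); pi[0] = 1.0          # start at one vertex, so ||pi - u||_2 <= 1
u = np.ones(n) / n
for t in range(1, 21):
    pi = A @ pi
    assert np.linalg.norm(pi - u, 2) <= lam ** t + 1e-12               # Theorem 2
    assert np.linalg.norm(pi - u, 1) <= np.sqrt(n) * lam ** t + 1e-12  # Theorem 1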

Lecture 6: Cheeger-Alon-Milman Inequality

Posted in lectures by HQN on September 16, 2008

1. We showed that Gap-Max-3SAT(d')_{1,s} is NP-Hard for some constants s<1 and d', using (n,d,1)-edge expanders (d a constant). The proof relies on the fact that such an expander can be constructed in polynomial time. In the last lecture we only proved that such expanders exist for all sufficiently large n; we have not yet discussed how to construct them in polynomial time. Efficient constructions of good expanders will probably be discussed next week.

2. We proved half of the so-called Cheeger-Alon-Milman inequality, which states that

\frac{d-\lambda_2}{2} \leq h(G) \leq \sqrt{2d(d-\lambda_2)}

for any d-regular graph G. Here, \lambda_2 is the second largest eigenvalue of the adjacency matrix of G and h(G) is the edge-isoperimetric number of the graph. We showed only that \frac{d-\lambda_2}{2} \leq h(G). Briefly, the proof went as follows.

First, we “symmetrize” h(G).

\displaystyle{h(G) = \min_{|S|\leq n/2} \frac{|E(S,\bar S)|}{|S|} \geq \min_{|S|\leq n/2} \frac{|E(S,\bar S)|}{|S|\frac{2|\bar S|}{n}}} = \frac 1 2 \min_{\emptyset \neq S \subset V} \frac{|E(S,\bar S)|}{\frac 1 n |S||\bar S|}

Define the sparsity of G to be

\displaystyle{\phi(G) = \min_{\emptyset \neq S \subset V} \frac{|E(S,\bar S)|}{\frac 1 n |S||\bar S|}}

Then, we have just shown that h(G) \geq \frac 1 2 \phi(G), so it remains to show that \phi(G) \geq d - \lambda_2. To do so, we turn the expression for \phi(G) into a more algebraic form. Assigning a value of 1 to each vertex in S and 0 to each vertex in \bar S, we get a vector \mathbf x \in \{0,1\}^n. It is not difficult to see that

\displaystyle{\phi(G) = \min_{\mathbf x \in \{0,1\}^n, \mathbf x \neq \mathbf{0,1}} \frac{\sum_{uv\in E}(x_u-x_v)^2}{\frac{1}{2n} \sum_u\sum_v (x_u-x_v)^2}}

So, \phi(G) is the optimal value of a certain optimization problem over zero-one vectors in \mathbb R^n. If we relax the zero-one condition, we get a lower bound for \phi(G). Specifically,

\displaystyle{\phi(G) \geq \min_{\mathbf x \in \mathbb R^n, \mathbf x \neq c\mathbf 1} \frac{\sum_{uv\in E}(x_u-x_v)^2}{\frac{1}{2n} \sum_u\sum_v (x_u-x_v)^2}}

where the min is over all \mathbf x that are not constant vectors, so that the denominator is non-zero.

Now, subtracting (or adding) the same amount from (to) each x_u does not change the ratio. Hence, we can replace each vector \mathbf x \in \mathbb R^n by \mathbf z \in \mathbb R^n where z_u = x_u - \frac 1 n \sum_v x_v. This way, \sum_u z_u = 0, or simply \mathbf z \perp \mathbf 1. Consequently,

\displaystyle{\phi(G) \geq \min_{\mathbf z \in \mathbb R^n, \mathbf z \neq \mathbf 0, \mathbf z \perp \mathbf 1} \frac{\sum_{uv\in E}(z_u-z_v)^2}{\frac{1}{2n} \sum_u\sum_v (z_u-z_v)^2}}

Now, let \mathbf A be the adjacency matrix of G, then it can be verified straightforwardly that

\sum_{uv\in E}(z_u-z_v)^2 = d\mathbf z^T\mathbf z - \mathbf z^T\mathbf A\mathbf z

Moreover, if \mathbf z \perp \mathbf 1, then \frac{1}{2n} \sum_u\sum_v (z_u-z_v)^2 = \mathbf z^T\mathbf z (expand the square and use \sum_u z_u = 0). Consequently,

\displaystyle{\phi(G) \geq \min_{\mathbf z \in \mathbb R^n, \mathbf z \neq \mathbf 0, \mathbf z \perp \mathbf 1} \left( d - \frac{\mathbf z^T\mathbf A\mathbf z}{\mathbf z^T\mathbf z} \right) = d - \max_{\mathbf z \in \mathbb R^n, \mathbf z \neq \mathbf 0, \mathbf z \perp \mathbf 1} \frac{\mathbf z^T\mathbf A\mathbf z}{\mathbf z^T\mathbf z} = d-\lambda_2}
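For a small graph, the whole chain — the definition of \phi(G), the algebraic identities, and the final bound \phi(G) \geq d - \lambda_2 — can be verified by brute force. A sketch (our own choice of example, the 3-dimensional hypercube, on which the bound happens to be tight):

import numpy as np
from itertools import combinations

# Brute-force check of phi(G) >= d - lambda_2 on the 3-cube (3-regular, n = 8).
n, d = 8, 3
A = np.zeros((n, n))
for x in range(n):
    for b in range(d):
        A[x, x ^ (1 << b)] = 1.0               # adjacency matrix of the 3-cube

lam2 = np.linalg.eigvalsh(A)[-2]               # second largest eigenvalue (ascending order)

def cut(S):
    """Number of edges between S and its complement."""
    Sbar = [v for v in range(n) if v not in S]
    return sum(A[u, v] for u in S for v in Sbar)

phi = min(cut(S) / (len(S) * (n - len(S)) / n)  # sparsity: min over all proper subsets S
          for k in range(1, n)
          for S in combinations(range(n), k))
print(phi, d - lam2)                            # both 2.0: the bound is tight here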

For a more elaborate discussion of the Cheeger-Alon-Milman inequality, see a nice series of posts by Luca Trevisan on the topic.

Lecture 5: Expanders, Probabilistic Proof of Existence

Posted in lectures by HQN on September 14, 2008

We gave a proof showing that most regular graphs are good (edge/vertex) expanders. In terms of edge-expansion, Béla Bollobás showed in this paper that, for every d\geq 3, most random d-regular graphs have edge-expansion rate (i.e. edge-isoperimetric number) at least d/2 - c_1\sqrt d for some constant c_1. (Specifically, we can set c_1=\sqrt{\lg 2}.) On the other hand, given any d-regular graph G of order n, if we select a random subset U of vertices of size \lfloor n/2 \rfloor, then the expected number of edges crossing the (U,\bar U) cut is \frac{\frac{dn}{2}\lfloor n/2 \rfloor \lceil n/2 \rceil}{\binom n 2}.

Dividing by |U| = \lfloor n/2 \rfloor (some cut must do at least as well as the expectation), the edge-isoperimetric number is bounded above by \frac d 2 \cdot \frac{n+1}{n-1}. Thus, for large graphs the edge-expansion rate is (roughly) at most \frac d 2.
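The expectation computation is just linearity of expectation — each edge crosses the cut with probability \lfloor n/2 \rfloor \lceil n/2 \rceil / \binom n 2 — and can be sanity-checked by exhaustive averaging on a small graph (our own example, again the 3-dimensional hypercube):

import math
from itertools import combinations

# Average |E(U, U-bar)| over all U of size floor(n/2) in the 3-cube (3-regular, n = 8).
n, d = 8, 3
edges = [(x, x ^ (1 << b)) for x in range(n) for b in range(d) if x < (x ^ (1 << b))]
k = n // 2
cuts = [sum((u in U) != (v in U) for u, v in edges)   # count edges crossing the cut
        for U in map(set, combinations(range(n), k))]
avg = sum(cuts) / len(cuts)
formula = (d * n / 2) * (n // 2) * ((n + 1) // 2) / math.comb(n, 2)
print(avg, formula)   # both equal 48/7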

Better yet, Noga Alon proved in this paper that there is a constant c_2 such that the edge-isoperimetric number of any d-regular graph (with a sufficiently large number of vertices) is at most d/2 - c_2 \sqrt d.

In summary, we can roughly think of the edge-isoperimetric number of a random d-regular graph as d/2 - \Theta(\sqrt d).

For small values of d, Bollobás’s analysis yields the following. For every sufficiently large n,

  • there exists an (n,3,2/11)-edge expander
  • there exists an (n,4,11/25)-edge expander
  • there exists an (n,5,18/25)-edge expander
  • there exists an (n,6,26/25)-edge expander
  • there exists an (n,9,2.06)-edge expander

Lecture 4: Proof that PCP Theorem is equivalent to hardness of Gap-Max-SAT

Posted in lectures by HQN on September 8, 2008

In today’s lecture, I proved that the PCP theorem is equivalent to the fact that

  • Gap-MaxE3SAT_{1,s} is NP-Hard for some constant s \in (0,1)

From this fact, it’s easy to show that

  • MaxE3SAT is NP-Hard to approximate to within some constant \alpha <1

I also outlined things that will be discussed about expanders.

Lecture 3: Outline of Dinur’s Proof

Posted in lectures by HQN on September 5, 2008

In today’s lecture, I discussed the following:

  1. A list of results regarding various PCPs, following the outline in Madhu Sudan’s lecture notes (see Section 2.2 on History of Results).
  2. Mentioned that the PCP theorem is equivalent to the NP-Hardness of Gap-MaxE3SAT_{1,s} for some constant s \in (0,1), which is equivalent to the NP-Hardness of approximating MaxE3SAT to within a certain constant ratio \alpha<1.
  3. Defined the so-called Gap-Max-CG-\Sigma_{1,s} problem, whose NP-Hardness is also equivalent to the PCP theorem.
  4. Briefly showed one direction of the PCP theorem, namely PCP[O(\log n), O(1)]\subseteq NP.
  5. Outlined Dinur’s proof of the PCP theorem.

Another article of interest is from Radhakrishnan and Sudan.

On Monday, I will prove #2 above formally, and start talking about expanders.