Notation. Let \( (\cdot)^{\top} \) denote the real transpose. Let \( [n]=\{1, \ldots, n\} \). Let \( \mathcal{B}(x, r) \) denote the Euclidean ball centered at \( x \) with radius \( r \). Let \( \|\cdot\| \) denote the \( \ell_{2} \) norm for vectors and the spectral norm for matrices. For any non-zero \( x \in \mathbb{R}^{n} \), let \( \hat{x}=x /\|x\| \). Let \( \Pi_{i=d}^{1} W_{i}=W_{d} W_{d-1} \ldots W_{1} \). Let \( I_{n} \) be the \( n \times n \) identity matrix. Let \( \mathcal{S}^{k-1} \) denote the unit sphere in \( \mathbb{R}^{k} \). We write \( c=\Omega(\delta) \) when \( c \geqslant C \delta \) for some positive constant \( C \). Similarly, we write \( c=O(\delta) \) when \( c \leqslant C \delta \) for some positive constant \( C \). When we say that a constant depends polynomially on \( \epsilon^{-1} \), we mean that it is at least \( C \epsilon^{-k} \) for some positive \( C \) and positive integer \( k \). For notational convenience, we write \( a=b+O_{1}(\epsilon) \) if \( \|a-b\| \leqslant \epsilon \), where \( \|\cdot\| \) denotes \( |\cdot| \) for scalars, the \( \ell_{2} \) norm for vectors, and the spectral norm for matrices. Define \( \operatorname{sgn}: \mathbb{R} \rightarrow \mathbb{R} \) by \( \operatorname{sgn}(x)=x /|x| \) for non-zero \( x \) and \( \operatorname{sgn}(0)=0 \). For a vector \( v \in \mathbb{R}^{n} \), \( \operatorname{diag}(\operatorname{sgn}(v)) \) is the \( n \times n \) diagonal matrix whose \( i \)-th diagonal entry is \( \operatorname{sgn}\left(v_{i}\right) \), and \( \operatorname{diag}(v>0) \) is the \( n \times n \) diagonal matrix whose \( i \)-th diagonal entry is 1 if \( v_{i}>0 \) and 0 otherwise.
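As an illustration only, several of these conventions can be sketched in plain Python. The helper names (`hat`, `sgn`, `diag_sgn`, `diag_pos`, `ordered_product`, `is_O1`) are hypothetical and chosen here for readability. Vectors are represented as lists of floats and matrices as lists of rows, which is an assumption of this sketch rather than anything prescribed by the text.

```python
import math

def norm2(x):
    """Euclidean (l2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(t * t for t in x))

def hat(x):
    """Normalized vector x / ||x||; only defined for non-zero x."""
    n = norm2(x)
    if n == 0:
        raise ValueError("hat(x) is undefined for the zero vector")
    return [t / n for t in x]

def sgn(t):
    """sgn(t) = t / |t| for non-zero t, and sgn(0) = 0."""
    return 0.0 if t == 0 else t / abs(t)

def diag(entries):
    """Square matrix (list of rows) with `entries` on the diagonal."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def diag_sgn(v):
    """diag(sgn(v)): i-th diagonal entry is sgn(v_i)."""
    return diag([sgn(t) for t in v])

def diag_pos(v):
    """diag(v > 0): i-th diagonal entry is 1 if v_i > 0 and 0 otherwise."""
    return diag([1.0 if t > 0 else 0.0 for t in v])

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def ordered_product(Ws):
    """Given [W_1, ..., W_d], return W_d W_{d-1} ... W_1."""
    P = Ws[0]
    for W in Ws[1:]:
        P = matmul(W, P)  # each later factor multiplies on the left
    return P

def is_O1(a, b, eps):
    """Vector case of a = b + O_1(eps), i.e. ||a - b|| <= eps."""
    return norm2([s - t for s, t in zip(a, b)]) <= eps

# Small example (values chosen arbitrarily for illustration).
v = [3.0, 0.0, -4.0]
print(hat(v))        # [0.6, 0.0, -0.8]
print(diag_sgn(v))   # diagonal entries 1, 0, -1
print(diag_pos(v))   # diagonal entries 1, 0, 0

W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[0.0, 1.0], [1.0, 0.0]]
print(ordered_product([W1, W2]))  # W2 W1 = [[0.0, 1.0], [1.0, 2.0]]
```

The order of multiplication matters in `ordered_product`: the list is indexed as \( W_{1}, \ldots, W_{d} \), but the product places \( W_{d} \) leftmost, so each successive factor is applied on the left.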