Chapter 2 • Exercise 4

Part 1

This just follows from \begin{align} \mathbb{E}_{\pi^t \sim p^t} \left[\tilde{\bm{l}}^t(\pi' \mid \pi^t)\right] &= \sum_{\pi^t \in [A]} p^t(\pi^t) \cdot \frac{\bm{l}^{t}(\pi')}{p^t(\pi')} \cdot \bm{1}\{\pi^t = \pi'\} = \bm{l}^{t}(\pi') \end{align} Let \(q: [A] \to \mathbb{R}\) be any probability distribution over \([A]\). Then since both expectation and inner product are linear, we must have \begin{align} \langle q, \bm{l}^t\rangle = \left\langle q, \mathbb{E}_{\pi^{t} \sim p^t} \tilde{\bm{l}}^t\right\rangle = \mathbb{E}_{\pi^{t} \sim p^t} \left\langle q, \tilde{\bm{l}}^t\right\rangle \end{align} and hence \begin{align} \mathbb{E}[\textbf{Reg}] = \mathbb{E} \left[\sum_{t=1}^T \mathbb{E}_{\pi^t \sim p^t} \left\langle p^t, \tilde{\bm{l}}^t\right\rangle\right] - \mathbb{E} \left[\sum_{t=1}^T \mathbb{E}_{\pi^t \sim p^t} \left\langle e_{\pi}, \tilde{\bm{l}}^t\right\rangle\right] \end{align} as desired.

Part 2

We must have \begin{align} \mathbb{E}_{\pi \sim p^t} \left[\tilde{\bm{l}}^t(\pi\mid \pi')^2\right] = \sum_{\pi \in [A]} p^t{\pi} \cdot \left(\frac{\bm{l}^t(\pi)}{p^t({\pi})}\right)^2 \bm{1}\{\pi = \pi'\} = \frac{\bm{l}^t(\pi)^2}{p^t(\pi)} \end{align} Therefore \begin{align} \mathbb{E}_{\pi^t \sim p^t}\left[\mathbb{E}_{\pi \sim p^t}\left[\tilde{\bm{l}}^t(\pi \mid \pi^t)^2\right]\right] &= \mathbb{E}_{\pi^t \sim p^t}\left[ \frac{\bm{l}^t(\pi^t)^2}{p^t(\pi^t)} \right] \\ &= \sum_{\pi^t \in [A]} p^t(\pi^t) \frac{\bm{l}^t(\pi^t)^2}{p^t(\pi^t)} \\ &\leq A \end{align} since \(\bm{l}^t \in [0, 1]\)

Part 3

We reduce this to the Low Noise EWA setup. Choose any \(\pi' = \pi'_1, \pi'_2, \ldots \pi'_t \in \mathbb{\Pi}\). Consider the sequence \((x_t, y_t) = (t, \pi'_t)\). Consider the family \(\mathcal{F}: \{1, \ldots, T\} \to \Pi\) which returns the same policy for all its inputs. Finally consider the loss \( \Pi \times \Pi \times l' : \{1, \ldots, T\} \) (augmentation of \(t\) can be done by including it in \(y\)) defined by \begin{align} l'(f(t), \pi'_t) = \tilde{\bm{l}}^t(\pi'_t \mid f(t)) \quad i,j \in [A] \end{align} Then by the low noise EWA result, we have that the given construction of \(p^t\) satisfies (\(f^t\) becomes \(\pi^t\) since \(\mathcal{F}\) is set of constant functions) \begin{align} \sum_{i=1}^{T} \mathbb{E}_{\pi^t \sim p^t} \left[\tilde{\bm{l}}^t(\pi'_t \mid \pi^t)\right] - \tilde{\bm{l}}^t(\pi'_t \mid \pi^\star) \leq \frac{\eta A}{2} + \frac{\log A}{\eta} \leq \sqrt{2 A T \log A} \end{align} for all \(\pi^\star\). Note here we have used result in part 2, as well as fact that \(|\mathcal{F}| = A\).

Since this statement is true for all \(\pi' \in \Pi^T\), we can do an expectation over all \(\pi'\) to get \begin{align} \mathbb{E}\left[\sum_{i=1}^{T} \mathbb{E}_{\pi^t \sim p^t} \left\langle p^t, \tilde{\bm{l}}^t\right\rangle\right] - \mathbb{E}\left[\sum_{i=1}^T\mathbb{E}_{\pi^t \sim p^t} \left\langle e_{\pi^{\star}}, \tilde{\bm{l}}^t\right\rangle\right] \leq \sqrt{2 A T \log A} \end{align} for all \(\pi^{\star}\) as desired

Return home