Model-free reinforcement learning algorithms, e.g., Q-learning and policy gradient, are widely adopted in numerous applications. While these algorithms enjoy good empirical performance, theoretical guarantees on their behavior remain to be established. In this presentation, we will introduce the first work to prove the convergence of Q-learning, in the sense of sublinear regret. This analytical result provides a better understanding of why and how Q-learning converges.
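For orientation, the tabular Q-learning update at the heart of the analysis can be sketched as follows. The two-state MDP, step sizes, and exploration schedule below are illustrative assumptions, not the construction used in the talk's regret analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))

# Toy deterministic dynamics (an assumption for illustration):
# action 1 in state 0 yields reward 1 and moves to state 1;
# everything else yields reward 0 and moves to state 0.
def step(s, a):
    if s == 0 and a == 1:
        return 1, 1.0   # (next state, reward)
    return 0, 0.0

gamma, alpha, eps = 0.9, 0.1, 0.1
s = 0
for t in range(5000):
    # Epsilon-greedy exploration: mostly exploit, occasionally explore.
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    # Standard Q-learning update: move Q[s, a] toward the bootstrapped target.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```

After training, the learned values prefer the rewarding action in state 0, as expected.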
The essence of many online learning problems is the exploration-exploitation tradeoff: exploring actions to learn the underlying reward models while capitalizing on the information gathered so far to exploit the best actions. Bandits are a classic mathematical abstraction of this setting that admits tractable analysis. Over the past decades, several fundamental results have been established on the performance of bandit algorithms such as epsilon-greedy, UCB, and Thompson sampling, under different modelling assumptions on finite and infinite action sets (linear, convex, Lipschitz, or kernelized models). We will overview these results at the Trends in AI Theory seminar series.
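As a concrete instance of one of the algorithms mentioned above, here is a minimal UCB1 sketch on a toy Bernoulli bandit; the arm means and horizon are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.2, 0.5, 0.8])   # true (unknown) Bernoulli arm means
n_arms, horizon = len(means), 2000

counts = np.zeros(n_arms)           # pulls per arm
values = np.zeros(n_arms)           # empirical mean reward per arm

for t in range(1, horizon + 1):
    if t <= n_arms:
        arm = t - 1                 # pull each arm once to initialize
    else:
        # UCB1 index: empirical mean plus an exploration bonus that
        # shrinks as an arm is pulled more often.
        ucb = values + np.sqrt(2.0 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = float(rng.random() < means[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]
```

Over the horizon, the best arm accumulates the vast majority of pulls, which is the mechanism behind the logarithmic-regret guarantees covered in the talk.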
I will give a gentle introduction to the convergence of stochastic processes, followed by a brief description of SGD convergence: the classical settings, assumptions, and results. I will briefly mention what one can obtain with slightly stronger assumptions or weaker requirements on the speed of convergence. We will then look at how these results apply to prove a weak result for MAML. I will outline how MAML updates violate some of the previously stated assumptions, and discuss how one can still recover a result.
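The classical setting referred to above can be sketched concretely: SGD on a smooth convex objective with unbiased stochastic gradients and a decaying step size. The synthetic least-squares problem and the particular step-size schedule below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true                      # noiseless linear model (an assumption)

w = np.zeros(d)
for t in range(1, 20001):
    i = rng.integers(n)             # sample one data point uniformly
    # Stochastic gradient of the per-sample loss 0.5 * (x_i . w - y_i)^2;
    # it is an unbiased estimate of the full gradient.
    grad = (X[i] @ w - y[i]) * X[i]
    w -= (0.1 / np.sqrt(t)) * grad  # classical decaying step size ~ c / sqrt(t)
```

Under these assumptions the iterates approach the minimizer, matching the qualitative behavior that the classical convergence theorems formalize.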
In this talk I will describe the machinery of overparametrized neural networks, explain what the Neural Tangent Kernel is, and provide mathematical context for the statement that as the number of parameters in a network grows to infinity, its training behavior resembles that of kernel methods. This will be followed by a brief overview of approaches to error estimates for kernel methods and the implied results for Neural Tangent Kernels, including (if time permits) a recent work by MediaTek Research.
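To make the object of the talk tangible, here is a sketch of the empirical NTK, K(x, x') = ⟨∇θ f(x), ∇θ f(x')⟩, for a one-hidden-layer ReLU network with explicit gradients. The width, inputs, and parameterization below are illustrative assumptions; at this width the kernel already concentrates near its infinite-width value:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 3, 10000                     # input dimension, hidden width
W = rng.normal(size=(m, d))         # hidden weights
a = rng.normal(size=m)              # output weights

def grad_f(x):
    """Gradient of f(x) = (1/sqrt(m)) * a . relu(W x) w.r.t. all parameters."""
    z = W @ x
    h = np.maximum(z, 0.0)
    dz = (z > 0).astype(float)
    g_a = h / np.sqrt(m)                                # df/da_j
    g_W = (a * dz)[:, None] * x[None, :] / np.sqrt(m)   # df/dW_jk
    return np.concatenate([g_a, g_W.ravel()])

x1 = np.array([1.0, 0.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0])
# For a unit input, the infinite-width limit of K(x, x) here is 1
# (1/2 from the output-weight gradients, 1/2 from the hidden-weight gradients).
K11 = grad_f(x1) @ grad_f(x1)
K12 = grad_f(x1) @ grad_f(x2)
```

In the infinite-width limit this kernel becomes deterministic and fixed during training, which is precisely what makes the kernel-method analogy precise.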
To protect networks from adversarial examples, verifying the robustness properties of neural networks has become a critical problem in AI security. In recent years, previous works have mainly focused on verifying ReLU-based CNNs. However, many widely used network architectures for image classification (e.g., AlexNet, LeNet, VGG) contain maxpool layers. In this talk, I will first review how the dual approach solves the verification problem. Then, I will present a novel decomposition idea to handle the maxpool function, which further enables us to verify maxpool-based CNNs. Finally, I will show the performance of our proposed method in terms of precision and time efficiency.
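To illustrate why a decomposition helps at all, note that a max can be rewritten exactly in terms of ReLUs, so machinery built for ReLU verification becomes applicable. The sketch below shows this generic identity; it is not necessarily the specific decomposition proposed in the talk:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max_via_relu(a, b):
    # Exact identity: max(a, b) = b + relu(a - b).
    return b + relu(a - b)

def maxpool4(xs):
    # A 4-element maxpool window reduces to a tree of pairwise maxes,
    # each expressed through a single ReLU.
    m1 = max_via_relu(xs[0], xs[1])
    m2 = max_via_relu(xs[2], xs[3])
    return max_via_relu(m1, m2)
```

Each pairwise max costs one ReLU, so a k-element pooling window unfolds into k - 1 ReLUs arranged in a tree.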
Conditional Generative Adversarial Networks (cGANs) are implicit generative models that allow sampling from class-conditional distributions. Existing cGANs are based on a wide range of different discriminator designs and training objectives. One popular design in earlier works is to include a classifier during training, with the assumption that good classifiers can help eliminate samples generated with wrong classes. Nevertheless, including classifiers in cGANs often comes with the side effect of only generating easy-to-classify samples. Recently, some representative cGANs avoid this shortcoming and reach state-of-the-art performance without classifiers. It thus remains unanswered whether classifiers can be resurrected to design better cGANs. In this work, we demonstrate that classifiers can be properly leveraged to improve cGANs. We start by using the decomposition of the joint probability distribution to connect the goals of cGANs and classification in a unified framework. The framework, along with a classic energy model to parameterize distributions, justifies the use of classifiers for cGANs in a principled manner. It explains several popular cGAN variants, such as ACGAN, ProjGAN, and ContraGAN, as special cases with different levels of approximation, which provides a unified view and brings new insights into understanding cGANs. Experimental results demonstrate that the design inspired by the proposed framework outperforms state-of-the-art cGANs on multiple benchmark datasets, especially on the most challenging ImageNet. The code is available at this https URL.
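The decomposition of the joint probability mentioned above is, at its core, the generic Bayes factorization; the sketch below states it in LaTeX, while the paper's exact parameterization and objectives may differ:

```latex
% Joint likelihood decomposition (a sketch, not the paper's exact form):
\[
\log p(x, y) = \log p(y \mid x) + \log p(x),
\]
% where the first term corresponds to a standard classification objective,
% and the marginal can be parameterized with a classic energy model:
\[
p(x) = \frac{\exp(-E(x))}{Z}, \qquad Z = \int \exp\bigl(-E(x')\bigr)\, dx'.
\]
```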
Most theoretical insights about neural networks are based on the study of the asymptotic limit in which the networks can be described as Gaussian processes; in physics, this corresponds to non-interacting field theories. Moving away from the asymptotic limit yields a non-Gaussian process that corresponds to turning on particle interactions in the field theory, allowing correlation functions of neural network outputs to be computed with Feynman diagrams. This enables a theoretical understanding of neural networks in terms of Wilsonian effective field theory.