Enhancing Evolutionary Conversion Rate Optimization via Multi-Armed Bandit Algorithms

Conversion rate optimization means designing web interfaces so that more visitors perform a desired action (such as registering or purchasing) on the site. One promising approach, implemented in Sentient Ascend, is to optimize the design using evolutionary algorithms, evaluating each candidate design online with actual visitors. Because such evaluations are costly and noisy, several challenges emerge: How can available visitor traffic be used most efficiently? How can good solutions be identified most reliably? How can a high conversion rate be maintained during optimization? This paper proposes a new technique to address these issues. Traffic is allocated to candidate solutions using a multi-armed bandit algorithm, directing more traffic to the evaluations that are most informative. In a best-arm identification mode, the best candidate can be identified reliably at the end of evolution, and in a campaign mode, the overall conversion rate can be optimized throughout the entire evolution process. Multi-armed bandit algorithms thus improve the performance and reliability of machine discovery in noisy real-world environments.
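To make the traffic-allocation idea concrete, the following is a minimal sketch (not the paper's actual implementation) of Thompson sampling over candidate designs modeled as Bernoulli arms: each arm's conversion rate gets a Beta posterior, a rate is sampled per arm for each visitor, and the visitor is shown the design with the highest sample. The function name and interface are illustrative assumptions.

```python
import random

def thompson_sampling_allocation(successes, failures, num_visitors):
    """Illustrative sketch: allocate a batch of visitors to candidate
    designs via Thompson sampling over Bernoulli conversion rates.

    successes[i] / failures[i]: observed conversions / non-conversions
    for candidate design i so far. Returns per-design visitor counts.
    """
    k = len(successes)
    counts = [0] * k
    for _ in range(num_visitors):
        # Draw a plausible conversion rate for each design from its
        # Beta(successes + 1, failures + 1) posterior (uniform prior).
        samples = [random.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(k)]
        # Route this visitor to the design with the highest sampled rate.
        counts[samples.index(max(samples))] += 1
    return counts
```

Under this scheme, designs whose posteriors concentrate on low conversion rates receive progressively less traffic, while uncertain or promising designs keep being evaluated, which is the efficiency gain the abstract describes.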
