Constantinos Daskalakis2014-03-14 2:09 PM

零和游戏接近最优化的无悔算法Near-Optimal No-Regret Algorithms for Zero-Sum Games

Abstract:

We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales as O (1/√T) with the number T of rounds. This regret bound is optimal but not rare, as there are a multitude of learning algorithms with this regret guarantee. However, when our algorithm is used by both players of a zero-sum game, their average regret scales as O (ln T/T), guaranteeing a near-linear rate of convergence to the value of the game. This represents an almost-quadratic improvement on the rate of convergence to the value of a game known to be achieved by any no-regret learning algorithm, and is essentially optimal as we show a lower bound of Ω (1/T). Moreover, the dynamics produced by our algorithm in the game setting are strongly-uncoupled in that each player is oblivious to the payoff matrix of the game and the number of strategies of the other player, has limited private storage, and is not allowed funny bit arithmetic that can trivialize the problem; instead he only observes the performance of his strategies against the actions of the other player and can use private storage to remember past played strategies and observed payoffs, or cumulative information thereof. Here, too, our rate of convergence is nearly-optimal and represents an almost-quadratic improvement over the best previously known strongly-uncoupled dynamics.

KEYWORDS

SHARE & LIKE

COMMENTS

ABOUT THE AUTHOR

Constantinos Daskalakis

Associate Professor at MIT's Electrical Engineering and Computer Science department, a member of CSAIL, and affiliated with LIDS and ORC.

0 Following 2 Fans 0 Projects 10 Articles

SIMILAR ARTICLES

论文摘要 最近在[12]中表明, 在所有具有任意(可能的组合)可行性约束和受任意(可能组合)需求约束的独立添加的投标人的多维贝叶斯拍卖问题中,收入最优化可以从计算上有效减持至福利最优化。这种减持在所有拍卖假设中给可以得到有效解决的最优化机制设计问题提供了一种多项式时间解决方案,但它的这种逼近很脆弱且不

Read More

论文摘要 线性编程低密度奇偶校验码(和例如压缩感知的相关领域)的解码由于其近似于迭代解码算法的实际性能和它对有限块长度分析的可处理性在近几年已获得越来越多的关注。一些和Feldman等人所著同期的著作展示了如何使用膨胀曲线特性分析LP解码。这一分析线性具有低误差率,比经验观察性能更低上几个数量级。如同

Read More

Abstract:We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales as O (1/√T) w

Read More

论文摘要: 1951年,约翰·F·纳什证明了任何博弈都存在纳什均衡。但是他的证明是非构造性质的,依赖于布劳威尔的不动点定理,于是也引发了以下的问题:是不是存在一种多项式时间算法来计算纳什均衡呢?以及这是不是取决于布劳威尔其不动点定理的内在规律?基于此点出发,之后提出了很多算法试图寻找纳什均衡点,但是没

Read More

论文摘要: 基于允许插入与缺失(即简称为indels)条件的层序演化模型我们引进了一级多项式时间系统发育重建算法。基于合适的假设,我们的算法要求序列长度随叶形分类群数呈现多项式形式的增长。我们的方法是是基于距离测量而且在很大程度上绕开了多重调节的问题。Abstract:We introduce the

Read More

论文摘要: 对于多维度单位需求定价问题即当消费者的心理价值相互独立时(但是未必是恒等分布的),我们提供了一个多项式时间近似算法。对于任意epsilon>0的情况,我们得到(1+epsilon)这个因子来近似最优收益。有以下三种情况:当这些价值数据是从符合单调风险率分布中取样得到时,近似值符合时间多项式

Read More

论文摘要 我们引入了一种新系统发育重建算法,不同于以往大多数严谨的推理技术,该算法不依赖于有关分支长度或树深度的假设。该算法返回了保证含有以下所有边界的森林: 1)足够长 2)足够接近叶。 真正树中有多少能够得到恢复依赖于所提供序列的长度。该算法基于距离且在多项式时间中运行。 Abstract:We

Read More

论文摘要 排序和搜索的经典问题假设了被比较对象的一个基本线性排序。在本文中我们基于偏序背景研究了这些问题,其中对象的某些对不可类比。从组合角度来看,这种概括很有趣,且它可在例如会议提交的非基于线性排序排名场景中有直接应用。它也可应用于例如包括生物网络的某些类型网络的重构中。与二十年前由Faigle和T

Read More

Abstract:A major task of evolutionary biology is the reconstruction of phylogenetic trees from molecular data. The evolutionary model is given by a Mar

Read More

Abstract:We introduce CLS, for continuous local search, a class of polynomial-time checkable total functions that lies at the intersection of PPAD and

Read More