This article considers a class of experimentation games with Lévy bandits encompassing those of Bolton and Harris (1999, Econometrica, 67, 349–374) and Keller, Rady, and Cripps (2005, Econometrica, 73, 39–68). Its main result is that efficient (perfect Bayesian) equilibria exist whenever players’ payoffs have a diffusion component. Hence, the trade-offs emphasized in the literature do not rely on the intrinsic nature of bandit models but on the commonly adopted solution concept (Markov perfect equilibrium). This is not an artefact of continuous time: we prove that efficient equilibria arise as limits of equilibria in the discrete-time game. Furthermore, it suffices to relax the solution concept to strongly symmetric equilibrium.
Rating systems not only provide information to users but also motivate the rated agent. This paper solves for the optimal (effort-maximizing) rating system within the standard career concerns framework. It is a mixture two-state rating system. That is, it is the sum of two Markov processes, with one that reflects the belief of the rater and the other the preferences of the rated agent. The rating, however, is not a Markov process. Our analysis shows how the rating combines information of different types and vintages. In particular, an increase in effort may affect some (but not all) future ratings adversely.
We provide tight bounds on the rate of convergence of the equilibrium payoff sets for repeated games under both perfect and imperfect public monitoring. The distance between the equilibrium payoff set and its limit vanishes at rate $(1-\delta)^{1/2}$ under perfect monitoring, and at rate $(1-\delta)^{1/4}$ under imperfect monitoring. For strictly individually rational payoff vectors, these rates improve to 0 (i.e., all strictly individually rational payoff vectors are exactly achieved as equilibrium payoffs for $\delta$ high enough) and $(1-\delta)^{1/2}$, respectively.
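To give a sense of how different the square-root and fourth-root rates are, the following minimal sketch evaluates the two rate terms for a few discount factors. It is only an illustration: the function name `gap_rates` is hypothetical, and the multiplicative constants in the actual bounds are ignored (set to 1 here as an assumption).

```python
def gap_rates(delta):
    """Return the rate terms (1 - delta)^(1/2) and (1 - delta)^(1/4).

    Assumption: constants in the bounds are normalized to 1, so only
    the orders of magnitude are meaningful, not the values themselves.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    eps = 1.0 - delta
    perfect = eps ** 0.5     # rate term under perfect monitoring
    imperfect = eps ** 0.25  # rate term under imperfect public monitoring
    return perfect, imperfect


if __name__ == "__main__":
    for delta in (0.9, 0.99, 0.9999):
        perfect, imperfect = gap_rates(delta)
        print(f"delta={delta}: perfect ~ {perfect:.4f}, imperfect ~ {imperfect:.4f}")
```

For any discount factor strictly between 0 and 1, the fourth-root term is the larger of the two, so the term under imperfect monitoring shrinks more slowly as the discount factor approaches 1.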
This paper studies the design of a recommender system for organizing social learning on a product. To improve incentives for early experimentation, the optimal design trades off fully transparent social learning by over-recommending a product (or “spamming”) to a fraction of agents in the early phase of the product cycle. Under the optimal scheme, the designer spams very little about a product right after its release but gradually increases the frequency of spamming and stops it altogether when the product is deemed sufficiently unworthy of recommendation. The optimal recommender system involves randomly triggered spamming when recommendations are public — as is often the case for product ratings — and an information “blackout” followed by a burst of spamming when agents can choose when to check in for a recommendation. Fully transparent recommendations may become optimal if a (socially-benevolent) designer does not observe the agents’ costs of experimentation.
We analyse strategic experimentation in which information arrives through fully revealing, publicly observable “breakdowns.” With hidden actions, there exists a unique equilibrium that involves randomization over stopping times. This randomization induces belief disagreement on the equilibrium path. When actions are observable, the equilibrium is pure, and welfare improves. We analyse the role of policy interventions such as subsidies for experimentation and risk-sharing agreements. We show that the optimal risk-sharing agreement restores the first-best outcome, independent of the monitoring structure.
We analyze the optimal design of dynamic mechanisms in the absence of transfers. The designer uses future allocation decisions as a way of eliciting private information. Values evolve according to a two-state Markov chain. We solve for the optimal allocation rule, which admits a simple implementation. Unlike with transfers, efficiency decreases over time, and both immiseration and its polar opposite are possible long-run outcomes. Considering the limiting environment in which time is continuous, we show that persistence hurts.
We study a discrete-time model of repeated moral hazard without commitment. In every period, a principal finances a project, choosing the scale of the project and a contingent payment plan for an agent, who has the opportunity to appropriate the returns of a successful project unbeknownst to the principal. The absence of commitment is reflected both in the solution concept (perfect Bayesian equilibrium) and in the ability of the principal to freely revise the project’s scale from one period to the next. We show that removing commitment from the equilibrium concept is relatively innocuous: if the players are sufficiently patient, there are equilibria with payoffs low enough to effectively endow the players with the requisite commitment, within the confines of perfect Bayesian equilibrium. In contrast, the frictionless choice of scale has a significant effect on the project’s dynamics. Starting from the principal’s favorite equilibrium, the optimal contract eventually converges to the repetition of the stage-game Nash equilibrium, operating the project at maximum scale and compensating the agent (only) via immediate payments.
This paper studies strongly symmetric equilibria (SSE) in continuous-time games of strategic experimentation with Poisson bandits. SSE payoffs can be studied via two functional equations similar to the HJB equation used for Markov equilibria. This is valuable for three reasons. First, these equations retain the tractability of Markov equilibrium, while allowing for punishments and rewards: the best and worst equilibrium payoffs are explicitly solved for. Second, they capture behavior of the discrete-time game: as the period length goes to zero in the discretized game, the SSE payoff set converges to their solution. Third, they encompass a large payoff set: there is no perfect Bayesian equilibrium in the discrete-time game with frequent interactions with higher asymptotic efficiency.