Dynamic Pricing using Multi-Armed Bandits

Multi-armed bandit methods from reinforcement learning are commonly used by decision-makers to learn in uncertain environments, where the goal is both to gather new information and to exploit current information in a way consistent with utility maximization. This exploration-exploitation framework has applications in a number of areas, including robotics, clinical trial design, advertising, and dynamic pricing. Our project seeks to characterize a variety of reinforcement learning approaches and to determine a better approach to personalized pricing using reinforcement learning. The project is likely to include simulation studies as well as empirical analysis using data from a well-known platform that serves as a marketplace for buyers and artists.
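
To give prospective applicants a feel for the exploration-exploitation trade-off in a pricing setting, here is a minimal simulation sketch in Python. It uses epsilon-greedy, one simple bandit strategy chosen purely for illustration; it is not necessarily a method the project will use. The price points, purchase probabilities, and the function name epsilon_greedy_pricing are all hypothetical.

```python
import random


def epsilon_greedy_pricing(prices, buy_prob, rounds=5000, epsilon=0.1, seed=0):
    """Offer one price per round and learn the average revenue of each price.

    prices   -- candidate price points (the "arms"); hypothetical values
    buy_prob -- simulated purchase probability at each price; an assumption
                standing in for real, unknown customer behavior
    """
    rng = random.Random(seed)
    counts = [0] * len(prices)           # times each price was offered
    avg_revenue = [0.0] * len(prices)    # running mean revenue per offer

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a price at random to keep learning.
            arm = rng.randrange(len(prices))
        else:
            # Exploit: offer the price with the best estimate so far.
            arm = max(range(len(prices)), key=lambda i: avg_revenue[i])

        sold = rng.random() < buy_prob[arm]
        revenue = prices[arm] if sold else 0.0

        # Incremental update of the running mean for the chosen price.
        counts[arm] += 1
        avg_revenue[arm] += (revenue - avg_revenue[arm]) / counts[arm]

    return counts, avg_revenue


# Hypothetical market: expected revenues are 4.0, 5.0, and 3.0 per offer.
prices = [5.0, 10.0, 15.0]
buy_prob = [0.8, 0.5, 0.2]
counts, avg_revenue = epsilon_greedy_pricing(prices, buy_prob)
for p, n, r in zip(prices, counts, avg_revenue):
    print(f"price={p:5.2f}  offers={n:5d}  avg revenue={r:.2f}")
```

The epsilon parameter controls how often the seller experiments with a random price instead of offering the price that currently looks best. Personalized pricing would additionally condition the choice on customer features, which this sketch omits.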

The RA would be expected to contribute in a number of ways to the project, including but not limited to:

Identifying, reviewing, and presenting state-of-the-art literature in the area
Analyzing data to document and identify patterns, including regression analysis and similar tasks
Implementing existing algorithms and adapting them to the current research design
Generating regular reports on aspects requested by the professors

Requisite Skills and Qualifications:

Data Analysis, Programming in R / Python, Data Visualization and Report Generation. Experience with Dynamic Programming is a plus but not required.