Dynamic Pricing using Multi-Armed Bandits
Multi-armed bandit methods from reinforcement learning are commonly used by decision-makers to learn in uncertain environments, where the goal is both to learn about the environment and to exploit current information in a way consistent with utility maximization. This exploration-exploitation framework has applications in a number of areas, including robotics, clinical trial design, advertising, and dynamic pricing. Our project seeks to characterize a variety of reinforcement learning approaches and to determine a better approach to personalized pricing. The project is likely to include simulation studies as well as empirical analysis using data from a well-known platform that serves as a marketplace for buyers and artists.
The RA would be expected to contribute in a number of ways to the project, including but not limited to:
- Identifying, reviewing and presenting state of the art literature in the area
- Analyzing data to document and identify patterns, including regression analysis and similar tasks
- Implementing existing algorithms and adapting them to the current research design
- Generating regular reports on aspects requested by the professors
Requisite Skills and Qualifications:
Data analysis, programming in R/Python, data visualization, and report generation. Experience with dynamic programming is a plus but not required.