Which Features Matter When Many Models Fit the Data?
A central goal of social science is to understand what drives outcomes. Across domains such as consumer behavior, customer satisfaction, and political attitudes, researchers seek to identify which factors matter and how important they are. Feature importance sits at the core of this enterprise, translating empirical patterns into scientific claims, managerial recommendations, and policy guidance.
In applied work, a common workflow is to fit a predictive model and apply a post-hoc explanation method, such as SHAP, to estimate feature importance. These estimates are often treated as if they reflect stable properties of the underlying phenomenon. Yet this interpretation implicitly assumes that the fitted model is a faithful stand-in for the data-generating process.
This project challenges that assumption. The Rashomon effect refers to the observation that many datasets admit multiple models with similarly strong predictive performance but meaningfully different internal structures. Because explanations are derived from models, model multiplicity implies explanation multiplicity: equally accurate models can assign different importance to the same feature.
Consider a simple but common example. A firm measures multiple dimensions of customer experience, such as empathy, reliability, and assurance, and uses them to predict an overall satisfaction rating. While these component measures are relatively stable and interpretable, the final satisfaction score is often noisy and subjective. Because this noise limits the predictive accuracy any model can achieve, many distinct models can fit satisfaction about equally well. One model may emphasize empathy, another reliability, and a third some combination of both. All fit the data, yet each suggests a different story about what “matters most.”
Rather than treating feature importance as a single point estimate, this project proposes to define it over the Rashomon set of near-optimal models. Features whose importance is stable across models can be distinguished from those whose apparent influence depends on modeling choices. In this view, disagreement across models is not a nuisance but information about what the data can and cannot support.
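To make the idea concrete, here is a minimal sketch of importance ranges over an approximate Rashomon set, using the customer-satisfaction example. Everything in it is an illustrative assumption rather than the project's actual method: the data are synthetic, the candidate models are linear models sampled at random around the least-squares fit, the Rashomon set is taken to be every candidate whose mean squared error is within a tolerance `epsilon` of the best one, and "importance" is simply the absolute standardized coefficient. The function name `rashomon_importance_ranges` and its parameters are hypothetical.

```python
import numpy as np


def rashomon_importance_ranges(X, y, epsilon=0.05, n_candidates=5000, seed=0):
    """Return (min, max) importance per feature over an approximate Rashomon set.

    Assumptions of this sketch: models are linear, importance is the absolute
    standardized coefficient, and the Rashomon set is approximated by random
    sampling around the least-squares solution.
    """
    rng = np.random.default_rng(seed)

    # Standardize features and center the outcome so coefficients are comparable.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()

    # Reference model: ordinary least squares, the best fit in this model class.
    beta_ols, *_ = np.linalg.lstsq(Xs, yc, rcond=None)
    best_mse = np.mean((yc - Xs @ beta_ols) ** 2)

    # Candidate models: random perturbations of the OLS coefficients.
    noise = rng.normal(scale=0.5, size=(n_candidates, X.shape[1]))
    candidates = np.vstack([beta_ols, beta_ols + noise])

    # Mean squared error of every candidate (one column per candidate).
    mse = np.mean((yc[:, None] - Xs @ candidates.T) ** 2, axis=0)

    # Approximate Rashomon set: candidates within (1 + epsilon) of the best loss.
    in_set = mse <= (1 + epsilon) * best_mse
    importance = np.abs(candidates[in_set])

    return importance.min(axis=0), importance.max(axis=0), int(in_set.sum())


# Synthetic data: empathy and reliability are strongly correlated,
# assurance is independent, and satisfaction is a noisy outcome.
rng = np.random.default_rng(42)
n = 500
shared = rng.normal(size=n)
empathy = shared + 0.3 * rng.normal(size=n)
reliability = shared + 0.3 * rng.normal(size=n)
assurance = rng.normal(size=n)
satisfaction = (0.5 * empathy + 0.5 * reliability + 0.3 * assurance
                + rng.normal(size=n))

X = np.column_stack([empathy, reliability, assurance])
names = ["empathy", "reliability", "assurance"]

lo, hi, n_models = rashomon_importance_ranges(X, satisfaction)
width = hi - lo

print(f"models in approximate Rashomon set: {n_models}")
for name, low, high in zip(names, lo, hi):
    print(f"{name:12s} importance range: [{low:.2f}, {high:.2f}]")
```

In this setup, a wide range for a feature indicates that near-equally accurate models disagree about its importance, while a narrow range indicates a more stable conclusion. Because empathy and reliability are constructed to be highly correlated, their ranges should come out wider than the range for assurance. Random sampling only approximates the Rashomon set, so the reported ranges are likely to understate the true spread.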
By estimating feature importance via the Rashomon set, this project aims to provide a more principled foundation for interpretation in machine-learning–based social science, clarifying when explanatory claims are robust and when uncertainty is intrinsic.
Requisite Skills and Qualifications:
We are looking for research assistants who are intellectually curious and interested in questions at the intersection of machine learning, data analysis, and social science inference. Ideal candidates should be motivated by understanding why models behave as they do, not just how to improve predictive performance.
No single background is required. Familiarity with empirical data analysis, statistical reasoning, or machine learning concepts is helpful, and experience working with predictive models, interpretability tools, or high-dimensional datasets is a plus but not necessary.
Equally important are strong analytical thinking and a willingness to engage with open-ended research questions. The project involves reasoning about uncertainty, robustness, and interpretation, so comfort with ambiguity and interest in foundational questions are highly valued. Programming experience (e.g., Python, R, or similar languages) and the ability to work with data are beneficial, but students without this experience who have strong quantitative intuition and a desire to learn are also encouraged to apply.
This project is well suited for students considering graduate study or research careers in economics, marketing, statistics, data science, computer science, psychology, or related fields. Successful research assistants will have opportunities to develop new skills, contribute intellectually to an ongoing research agenda, and engage deeply with questions that matter for empirical social science.