Behavioral integrity captures the perceived alignment between an individual’s words and actions (Simons 2002). In a recently published article (Dikolli et al. 2020), my coauthors and I proposed a method of measuring the behavioral integrity of public company CEOs. Using this measure, we showed that auditors charge higher fees to companies led by CEOs with lower behavioral integrity. The critical data source for the measure is the letter that the CEO writes to the company’s shareholders each year as part of the annual report.
To expand our understanding of how CEOs’ behavioral integrity affects their companies and stakeholders, we need to extend our database of shareholder letters. While the letters can be obtained from various sources, we want to start with the annual reports in Mergent Online’s database, to which the Yale Library recently secured API access. We are looking for help in downloading these annual reports and extracting the shareholder letters from them. Once the database of shareholder letters has been expanded, we will update our behavioral integrity data and also investigate the content of the letters to examine the narratives employed by CEOs and how these relate to firm-specific and macroeconomic trends.
Requisite Skills and Qualifications:
Experience accessing data from APIs programmatically (e.g., using Perl, Python, or Postman to access data from a SOAP API) in order to download large numbers of documents and maintain a database of identifying data for each downloaded document. Some experience using Perl or Python for textual analysis or natural language processing would be useful, as would experience extracting particular pages from a PDF and converting PDF documents to plain text. Some familiarity with the SEC’s document database (EDGAR) would also be useful but is not required.
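As a rough illustration of the bookkeeping side of this work, the sketch below shows one way to keep a database of identifying data for each downloaded document, using only the Python standard library. It is not the project's actual pipeline and does not call the Mergent Online API. The table layout, the function names (open_manifest, is_downloaded, record_document), and the fields doc_id, company, and fiscal_year are all assumptions made for this example.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_manifest(db_path):
    """Open (or create) a SQLite manifest of downloaded documents.

    The schema is hypothetical; real identifying fields would depend on
    what the API returns for each annual report.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               doc_id      TEXT PRIMARY KEY,
               company     TEXT NOT NULL,
               fiscal_year INTEGER NOT NULL,
               path        TEXT NOT NULL,
               sha256      TEXT NOT NULL
           )"""
    )
    conn.commit()
    return conn


def is_downloaded(conn, doc_id):
    """Return True if doc_id is already recorded, so a rerun can skip it."""
    row = conn.execute(
        "SELECT 1 FROM documents WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return row is not None


def record_document(conn, out_dir, doc_id, company, fiscal_year, content):
    """Save the raw bytes of one document and record it in the manifest.

    Assumes doc_id is safe to use as a file name and that content holds
    the PDF bytes already fetched by some separate download step.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{doc_id}.pdf"
    path.write_bytes(content)
    digest = hashlib.sha256(content).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (?, ?, ?, ?, ?)",
        (doc_id, company, fiscal_year, str(path), digest),
    )
    conn.commit()
    return path
```

Checking is_downloaded before each request would let an interrupted bulk download resume without fetching the same report twice, and the stored hash gives a simple way to detect a file that was later altered or truncated.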