Any letter-writer signatures and titles were deleted prior to analysis to avoid introducing bias.

Our longer-term goals will be to see the effects of this system on the promotion process within the department, with an expectation that more junior faculty will become eligible for advancement. These effects will be evaluated by tracking the progress and content of junior faculty teaching portfolios compared to previous years and time to successful promotion. With a bottom-heavy young faculty group, our expectation is that this system will better prepare people for promotion as they can track their activities and determine where they need to place more effort to enhance their portfolio. Finally, this system will be used to improve the mentorship infrastructure within the department. Assigned faculty mentors will use the ARVU dashboard to mentor junior faculty on their progress for promotion. This dashboard will provide another data point for mentors to advise junior faculty where they need to focus their efforts in order to progress professionally.

Stanford University School of Medicine, Department of Emergency Medicine, Palo Alto, California; Northwestern University Feinberg School of Medicine, Department of Emergency Medicine, Chicago, Illinois; Ohio State University College of Medicine, Department of Emergency Medicine, Columbus, Ohio

Gender disparities exist in academic medicine. Women in academic medicine are less likely to achieve the rank of professor or hold senior leadership positions compared to men, even after adjusting for age, experience, specialty, and research productivity. Previous studies in other professional fields have shown that there are differences in the language used to describe men and women in letters of recommendation. Additional studies have shown that evaluations of women medical students are more likely than evaluations of male medical students to describe the student as “caring,” “compassionate,” and “empathetic,” in addition to “bright” and “organized.”

In addition, women are more often portrayed as teachers and students, and less often portrayed as researchers or professionals compared to men. Within emergency medicine (EM), the letter of recommendation, including both standardized letters and traditional letters, has been cited as one of the top four most important factors in selecting applicants to residency, along with EM rotation grade, interview, and clinical grades. More specifically, the letter of recommendation has been cited as the most important factor in selecting applicants to interview. Historically, in EM, letters of recommendation were written without guidelines or restrictions. In 1996, the Council of Residency Directors in Emergency Medicine implemented the standardized letter of recommendation (SLOR), which was renamed the standardized letter of evaluation (SLOE) in 2013. The SLOE contains both a quantitative evaluation of an applicant and a narrative portion of 250 words or fewer. The SLOE narrative provides a focused assessment of the noncognitive attributes of potential residency candidates. The standardized format and universal instructions make the SLOE a good text sample to study for variation in language by gender. Additionally, while there are several studies analyzing traditional letters of recommendation for language variation between genders, there is a gap in the current literature in analyzing standardized letters of recommendation. Previously, our research team published a study in Academic Emergency Medicine Education and Training that showed minimal differences in language use between genders in evaluating 237 SLOEs from applicants invited to interview at a single academic EM residency for the 2015-2016 application cycle. The small dataset, and the potential for a homogeneous sample, prompted the current investigation with a goal of confirming or refuting the original results with a larger dataset.

The choice to include all applicants was made with a goal of potentially increasing the variability in the language used within the SLOE. The aim of this study was to compare differences in language within specific word categories used to describe men and women applicants in the SLOE narrative for all applicants to a single academic EM residency program for the 2016-2017 application cycle. We secondarily sought to determine whether there was an association between differences in word categories and invitation to interview, regardless of gender, in order to better contextualize the possible importance of wording differences.

SLOE narratives for all applicants to the residency for the 2016-2017 application cycle were downloaded from ERAS by the program coordinators and converted to Microsoft Word format. We included the narrative portion of the SLOE in the analysis. The narrative is limited to 250 words and asks the writer to “Please concisely summarize this applicant’s candidacy including… Areas that will require attention, Any low rankings from the SLOE, and Any relevant non-cognitive attributes such as leadership, compassion, positive attitude, professionalism, maturity, self-motivation, likelihood to go above and beyond, altruism, recognition of limits, conscientiousness, etc.” If applicants submitted more than one SLOE, the SLOE from the first chronological clinical EM rotation was included in the analysis. We analyzed first-rotation SLOEs, as opposed to all SLOEs, to provide a uniform evaluation of student performance and limit word differences based on varying experiences in time. Additionally, not every applicant had more than one SLOE. Exclusion criteria included applicants from non-Liaison Committee on Medical Education schools, as well as applicants with a first-rotation SLOE that was not available to be downloaded from ERAS.
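The selection rule described above (one first-rotation SLOE per applicant, with the two stated exclusions) can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Sloe` record, its field names, and the function `select_first_rotation_sloes` are hypothetical and are not part of the study's actual workflow, which relied on manual download from ERAS.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Sloe:
    # Hypothetical record layout, assumed for illustration only.
    applicant_id: str
    rotation_start: date        # start date of the clinical EM rotation
    lcme_school: bool           # True if the applicant's school is LCME-accredited
    narrative: Optional[str]    # None if the SLOE could not be downloaded


def select_first_rotation_sloes(sloes: List[Sloe]) -> Dict[str, Sloe]:
    """Return one SLOE per included applicant: the chronologically first one."""
    by_applicant: Dict[str, List[Sloe]] = {}
    for sloe in sloes:
        by_applicant.setdefault(sloe.applicant_id, []).append(sloe)

    selected: Dict[str, Sloe] = {}
    for applicant_id, letters in by_applicant.items():
        first = min(letters, key=lambda s: s.rotation_start)
        # Exclusion 1: applicants from non-LCME schools.
        if not first.lcme_school:
            continue
        # Exclusion 2: first-rotation SLOE unavailable. The applicant is
        # excluded rather than falling back to a later letter.
        if first.narrative is None:
            continue
        selected[applicant_id] = first
    return selected
```

Note that an applicant whose first-rotation letter is missing is dropped entirely in this sketch, which matches the exclusion criterion as worded above rather than substituting a later SLOE.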
Analysis began after all NRMP decisions had been made and finalized and did not affect an applicant’s invitation to interview or placement on the rank list. Prior to analysis, each letter was read by two reviewers who screened for “stock” language.

These "stock" or standardized sentences were not related to applicant characteristics. They fell into categories such as statements regarding waiving the right to see the letter; stock opening statements; stock closing statements; descriptors of the rotation; descriptors of grade calculation; and descriptors of the letter writer. Pronouns were not made plural or deidentified prior to analysis.

This analysis found small but quantifiable differences in word frequency between genders in the language used in the SLOE. In this study, differences between genders were present in two categories: social words and ability words, with women having higher word frequency in both categories. Our prior investigation found differences of similar magnitude in affiliation words and ability words, with letters for women applicants having higher word frequency in both categories. For both studies, the differences in word frequency were statistically significant, but it is difficult to draw conclusions about the practical significance of these small wording differences for application or educational outcomes. What is perhaps more notable than the presence of differences in two categories is the lack of difference in the remaining 14 categories. Looking specifically at the categories that had gender differences, our finding that ability words were used more frequently to describe women applicants than men applicants is in contrast to previous studies, while our other finding, that women are more frequently described with social words than men, is in alignment with previous studies.
In the medical literature, letters of recommendation for men applying for faculty positions contain more ability attributes, such as standout adjectives and research descriptors, than letters for women, and women in medical school applying for residency positions are more frequently described by non-ability attributes such as being caring, compassionate, empathetic, bright, and organized. Looking specifically at ability words, this word category had statistically significant differences in both this investigation and our prior study, with ability words occurring more frequently for women than men. Ability words include descriptors such as talented, skilled, brilliant, proficient, adept, intelligent, and competent.
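To make the notion of word frequency within a category concrete, the following is a minimal sketch of one way such a measure could be computed. It is not the software or dictionary used in this study. The word set contains only the example ability words listed above, and two choices are assumptions made for illustration: normalizing as a percentage of all words in the narrative, and matching whole words exactly without stemming.

```python
import re

# Only the example ability words named in the text; the study's full
# category dictionary is not reproduced here.
ABILITY_WORDS = {
    "talented", "skilled", "brilliant", "proficient",
    "adept", "intelligent", "competent",
}


def category_frequency(text, category_words):
    """Percentage of words in `text` that belong to `category_words`.

    Assumption: exact, case-insensitive whole-word matching with no
    stemming, so "skills" would not count as "skilled".
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in category_words)
    return 100.0 * hits / len(tokens)
```

Under these assumptions, each narrative yields one percentage per word category, and those per-letter percentages are the quantities that would then be compared between letters written for men and women applicants.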

This consistency of findings between the two samples suggests that letter writers employ multiple descriptors within the ability category to convey the proficiency of women applicants. However, the reason for this difference is unclear. Notably, the word “bright” is one of the ability words for which there was no gender difference found, counter to findings from prior research wherein women applicants were more often described as bright [6,18]. While the descriptor “bright” is often considered a compliment, it has also been suggested that its use “subtly undermines the recipient of the praise in ways that pertain to youth and, often, gender” stemming from its association with the phrase “bright young thing.” The finding that women were more frequently described with social words aligns with previous studies of letters of recommendation. Studies of letters of recommendation for psychology and chemistry faculty positions have shown that women are often described as communal, while men are described as agentic and have more standout adjectives. Other studies have found women to be described as more communicative.

We employed a secondary analysis to determine whether small differences in word categories were associated with invitation to interview. The adjusted analysis showed an association between more standout words and invitation to interview; however, this analysis did not account for other factors that may influence invitations to interview. Although these findings represent an association and not causation, and are not conclusive, they help to contextualize the potential importance of small differences in word use. Notably, neither social words nor ability words were associated with invitation to interview, and there was an equitable frequency of standout words between genders. Despite the small word differences in the categories of social and ability words, we did not find a difference in the 14 other word categories queried.
There are several possible explanations for this lack of a finding. It is possible that the sample was underpowered to detect small wording differences in the 14 word categories. Another explanation is that the SLOE format itself may be driving the lack of a difference. The short word format of the SLOE and the specific, detailed instructions noted above may reduce bias. Other explanations include the increasing use of group authorship, which may introduce less bias than individual authorship. In 2012, a sampling of three EM residencies calculated that 34.9% of SLORs were created by groups [24]. In 2014, 60% of EM program directors (PDs) participated in group SLORs, 85.3% of departments provided a group SLOR, and 84.7% of PDs preferred a group SLOR. Although the sample size and lack of a standard comparator limit the ability to determine why we did not find a difference for the majority of word categories, we hypothesize that it is related to the format and hope to further support that hypothesis through future work examining paired SLOE and full-length letters for candidates. To our knowledge, a recently published study by Friedman and colleagues in the otolaryngology literature is the only study other than our own that evaluates a standardized letter for gender bias. In this 2017 study, the SLOR and the more traditional NLOR in otolaryngology residency applications were compared by gender, concluding that the SLOR format reduced bias compared to the traditional NLOR format.
Although some differences persisted in both letter formats, the SLOR format resulted in less frequent mention of women’s appearance and more frequent descriptions of women as “bright.” Although their analysis strategy differed from the one we used in this study, their findings parallel ours in that there are minimal differences by gender in a restricted letter format, and they highlight the need for further study of how the question stem and word limitations may be intentionally built to minimize bias. Lastly, of note, our study focused specifically on differences in language use in the SLOE. This study does not evaluate the presence or absence of gender bias in the quantitative aspects of the SLOE, nor does our multivariable model include other factors that would influence the invitation to interview such as rotation grades, test scores, school rank, or AOA status. Such analyses were beyond the scope of our study, which was focused on the SLOE narrative itself. Other studies have evaluated this but have not evaluated the narrative portion of the SLOE. Additionally, many other forms of evaluation in medical training besides the SLOE, both numerical and narrative, have been analyzed for gender bias. Recent studies have suggested that bias persists in other forms of evaluation. Specifically, Dayal and colleagues’ recent publication notes lower scores for women residents in EM Milestones ratings compared to male peers as they progress through residency. Evaluations of narrative comments from shift evaluations are another area of interest; we are aware of two investigations currently underway in EM programs.
Additionally, a study by Heath and colleagues of evaluations of medical faculty by physician trainees also showed gender disparities. As this body of literature continues to grow and interventions are developed to minimize bias in all narrative performance evaluations, we believe it will be important to think carefully about the question stems and response length allowed.

