

Performance gaps and improvement plans from a 5-hospital simulation programme for anaesthesiology providers: a retrospective study
Samuel DeMaria Jr (1), Adam Levine (1), Philip Petrou (2), David Feldman (3), Patricia Kischak (3), Amanda Burden (4), Andrew Goldberg (1)

  1. Department of Anesthesiology, Icahn School of Medicine at Mount Sinai Hospital, New York, New York, USA
  2. Icahn School of Medicine at Mount Sinai Hospital, New York, New York, USA
  3. Hospitals Insurance Company, New York, New York, USA
  4. Department of Anesthesiology, Cooper Medical School of Rowan University, Camden, New Jersey, USA

Correspondence to Dr Samuel DeMaria Jr, Department of Anesthesiology, Mount Sinai Health System, New York, NY 10029, USA; demarisa{at}


Background Simulation is increasingly employed in healthcare provider education, but usage as a means of identifying system-wide practitioner gaps has been limited. We sought to determine whether practice gaps could be identified, and if meaningful improvement plans could result from a simulation course for anaesthesiology providers.

Methods Over a 2-year cycle, 288 anaesthesiologists and 67 certified registered nurse anaesthetists (CRNAs) participated in a 3.5 hour, malpractice insurer-mandated simulation course, encountering 4 scenarios. 5 anaesthesiology departments within 3 urban academic healthcare systems were represented. A real-time rater scored each individual on 12 critical performance items (CPIs) representing learning objectives for a given scenario. Participants completed a course satisfaction survey, a 1-month postcourse practice improvement plan (PIP) and a 6-month follow-up survey.

Results All recorded course data were retrospectively reviewed. Course satisfaction was generally positive (88–97% positive rating by item). 4231 individual CPIs were recorded (of a possible 4260 rateable), with a majority of participants demonstrating remediable gaps in medical/technical and non-technical skills (97% of groups had at least one instance of a remediable gap in communication/non-technical skills during at least one of the scenarios). 6 months following the course, 91% of respondents reported successfully implementing 1 or more of their PIPs. Improvements in equipment/environmental resources or personal knowledge domains were most often successful, and several individual reports demonstrated a positive impact on actual practice.

Conclusions This professional liability insurer-initiated simulation course for 5 anaesthesiology departments was feasible to deliver and well received. Practice gaps were identified during the course and remediation of gaps, and/or application of new knowledge, skills and resources was reported by participants.



Simulation-based education is currently widespread in the healthcare industry. Anaesthesiologists, for example, have opportunities through continuing medical education offerings and the American Board of Anesthesiology (ABA) Maintenance of Certification in Anesthesiology (MOCA),1 Part 4: Improvement in Medical Practice Program.2–6 Results from the MOCA programme have been published elsewhere.3 ,5 In general, courses garner positive ratings and practice improvement plans (PIPs) target gaps in practitioners' work environments, teamwork skills and individual knowledge. Similar programmes for certified registered nurse anaesthetists (CRNAs) have also been reported.7 ,8

Simulation has been identified by professional liability insurers as an effective educational strategy9–14 and a way to probe patient-care environments for hazards and failures.15 ,16 Insurers continue to move beyond providing legal defence and compensation for damages resulting from malpractice, to adopting strategies in an effort to avoid incidents altogether.17 ,18 Subsequently, several insurers have incentivised simulation through sponsored educational programmes in anaesthesiology9 ,19 and other specialties.20–23 While educational courses are widely available, real-time performance data from those courses are rarely reported. This represents a missed opportunity.

As evidence that simulation-based courses are effective for formative assessment and identification of gaps continues to mount,5 ,11 ,24 ,25 these programmes may address important departmental or liability-carrier needs by allowing for education and assessment in a more convenient fashion (ie, at practitioners' home institutions or with no added individual cost). While questions remain about ideal content and frequency, these courses present a unique opportunity for stakeholders when education, formative assessment and identification of aggregated practitioner or systems-based gaps are desired. We describe a malpractice carrier-supported anaesthesia simulation programme and present findings from the first course cycle, with the primary focus of this report being the practice gaps observed and retrospectively analysed from a large cohort of anaesthesia providers. We also present the PIPs submitted by participants. In doing so, we hope to build on similar data reported from the MOCA courses by reporting observations of what faculty anaesthesiologists and CRNAs actually do during simulation-based encounters, to illustrate gaps potentially present in one large group of anaesthesia providers.


The reporting of these data was approved by the Mount Sinai Institutional Review Board and granted exemption from the need for written consent given the retrospective nature of the data. The IRB was informed that anonymised survey and performance data (see details below) were going to be reviewed and analysed for this retrospective study.

Hospitals Insurance Company (HIC) is the professional liability insurer for three major New York healthcare entities (The Mount Sinai Health System, Montefiore Health System and Maimonides Medical Center), encompassing two medical schools and five hospitals at the inception of this course. Previous authors have described a HIC-sponsored programme in obstetrics.22 This work was supported by HIC Grant: Anaesthesia Simulation Program 001. This grant provided US$710 000 divided over 2 years for the development and administration of a mandatory simulation initiative for anaesthesiologists and CRNAs in the system. The grant was initiated by HIC as a way of improving risk avoidance for anaesthesiology programmes in the system. Participation and completion of all materials (ie, attending the course, filling out a course survey and doing a PIP) were mandatory to maintain credentialing, and departmental expenses associated with lost practitioner time due to course attendance were covered. No members of HIC, particularly those authoring this manuscript (PK, DF), played any role in data collection, analysis or course instruction.

HIC did not initiate this course as a result of poor outcomes, or with specific topics in mind. Instead, the course was implemented as a presumed way of bolstering patient safety and in particular to encourage routine and ongoing anaesthesia provider participation in simulation programmes. Simulation faculty were subsequently tasked with addressing topics they felt would fulfil this mission. To ensure quality and standardisation, all participants attended the course provided at the Department of Anesthesiology of the Icahn School of Medicine at Mount Sinai simulation centre. This programme was the only programme among the HIC hospitals that had American Society of Anesthesiologists (ASA) endorsement for MOCA (as of 2010).

On the basis of quality improvement data at The Mount Sinai Hospital and the site-specific MOCA simulation course experience (ie, participant performance, course evaluations and MOCA PIPs), four of the eight pre-existing scenarios used for the MOCA course were chosen as templates for the 3.5-hour course. A modified Delphi process was used among the simulation faculty to determine the learning objectives for each scenario (using an estimate-talk-estimate approach),26 and simulation course directors (AL, SD) finalised the scenarios prior to pilot testing.

The four scenarios were designed to incorporate aspects of crisis resource management (eg, team work, leadership and membership, communication, resource allocation) and medical/technical skills (ie, jet ventilation (JV) and needle cricothyrotomy, advanced cardiac life support (ACLS) knowledge/management and use of a defibrillator) (see online supplementary appendix 1). Equipment emphasised in each scenario was standardised to devices common across the hospitals (eg, defibrillators and Datex-Ohmeda Anaesthesia Machines, GE Healthcare, Chicago, Illinois, USA) so that practitioners could work with familiar devices. The pilot course was conducted for the department Chairs of the associated hospitals, who served as the first four participants, while HIC leadership observed in real time through a closed-circuit audio-video system. Scenarios were then modified based on feedback from these participants and from the HIC leadership who observed the pilot course (eg, an updated timeout protocol was incorporated, and scenarios were made shorter by ∼2–3 min). Finally, the 11 faculty members of the Mount Sinai simulation group attended a 1-day training programme regarding delivery and debriefing of the scenarios before the programme started. This training was for the instructor portion of the course and not for training faculty to be raters. The first iteration of the course occurred from January 2013 to December 2014.

All participants were notified at the start of the course that performance data would be recorded to guide the debriefings, for reporting out to departmental leadership for performance improvement purposes and for research purposes. All data would be anonymised, and no individual performance data would be reported out in an identifiable manner. Groups of six course participants were mixed by departmental membership but not by degree (ie, MD or CRNA) to form two triads in each course that stayed together for each of the four scenarios. Successful completion of the course required participants to (1) attend the course (ie, participate in four scenarios in assigned triads), (2) complete an anonymous individual satisfaction survey (rating several statements about the course on a Likert scale with 1=strongly disagree and 5=strongly agree) and (3) complete a PIP 1 month after the initial course (describing their plans for incorporation into practice as a result of the course material and whether they were successful or not in implementing these plans). The course director (AL) reviewed all plans for content and decided on credit. Participants knew that performance data were being recorded for reporting purposes, but that no individual data would be sent in an identifiable format to their employer.

Each participant was required to submit three PIPs, for a total of 1065 PIPs to be received by cycle-end. Course components are described in online supplementary appendix 2. PIPs were open-ended responses to the following statement: “Please describe your three performance improvement plans briefly, and provide detail regarding whether you were or were not successful, and whether you encountered any barriers to implementation.” All participants were briefed on the fact that PIPs would be reviewed and needed to be related to the scenarios they encountered. If PIPs were made for other aspects of practice, or for scenarios that were not encountered, credit would not be given.

During each course, a real time rater (RTR) was present to grade individual participants and groups on critical performance items (CPIs) (also identified by the Delphi method in the course development process and scored simply as ‘yes’ or ‘no’) (see online supplementary appendix 2). For each scenario, there were two to four CPIs rated for each individual participant. These data were to be collected as a means of reporting back to stakeholders the sort of gaps, if any, which were present in this population of practitioners, in aggregate (ie, not for research purposes, per se). All participants were informed that anonymous performance data were being recorded and that their Chairs would not be informed of any individual failures. Four RTRs, in total, were trained. Two RTRs attended each course. One RTR rated each group of three participants in the first two scenarios, and one RTR rated the second two scenarios. As two scenarios were occurring simultaneously in separate locations, only one RTR could be available for any given scenario. Each RTR, therefore, rated each of their assigned scenarios twice (ie, once for each group) in a given course. Each RTR was a board-certified anaesthesiologist member of the simulation team at Mount Sinai, an ACLS instructor and actively involved in simulation-based education of medical students, residents and attending anaesthesiologists. Training consisted of real-time rating in pairs during six MOCA courses where these four scenarios were already being used. As these data were being collected for quality reporting purposes to HIC, no further validation of raters was performed (eg, inter-rater reliability).

Individual CPIs were generic enough so as to be applicable for each individual. For example, ‘reliably operates a biphasic defibrillator’ was something each member in the team would be required to demonstrate during the scenario due to its design. However, if a rater was unclear on the performance of a certain CPI, they could ask individuals to demonstrate certain skills in the debriefing (eg, show me how to operate the defibrillator).

Two different group CPIs were also developed, where the performance of the group of three participants was recorded as one holistic rating: one for the ACLS scenario (in order to evaluate the team's management of a simulated cardiac arrest as a team), and one for the team's overall communication/teamwork throughout the course itself (overall course performance to determine if at least one major remediable gap in communications/teamwork could be identified for a given group so that this could be highlighted in the debriefing of the entire course). A major remediable communication/teamwork gap was considered any failure in teamwork or communication that could lead to or has been shown to play a role in patient harm (eg, failure to use closed-loop communication, unclear leadership during crises, poor resource utilisation, division of workload, etc).

Forty de-identified PIPs were randomly selected for coding analysis, based on methodology in previously published literature,5 in order to train and calibrate the PIP raters (who were separate and distinct from the RTRs). Briefly, four raters not involved in the study (board-certified anaesthesiologists in practice >3 years) each took the 40 PIPs and categorised them into one of five distinct categories (by consensus): improvements in communication/teamwork, environment/equipment, airway management, ACLS or dual antiplatelet therapy (DAPT) management. These categories were to be used for coding of the PIPs. After PIP rater training and category creation, 1002 de-identified PIPs (63 were lost) were included for final analysis and coded. Any discrepancies were resolved by a senior investigator (SD or AL). Six months after completion of the course, each participant received an anonymous survey that ascertained PIP implementation and included several post-test knowledge questions.
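The PIP counts reported in the Methods can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are ours, and the loss percentage is derived from the reported counts rather than stated in the text.

```python
# Consistency check of the PIP counts reported in the Methods.
# Variable names are illustrative; all input figures come from the text.
participants = 355          # 288 anaesthesiologists + 67 CRNAs
pips_per_participant = 3    # each participant was required to submit three PIPs

expected_pips = participants * pips_per_participant   # total due by cycle-end
lost_pips = 63                                        # reported as lost
analysed_pips = expected_pips - lost_pips             # included in final coding

# Share of expected PIPs that were lost (derived, not reported in the text)
lost_share = lost_pips / expected_pips

print(expected_pips, analysed_pips, f"{lost_share:.1%}")
```

Under these figures, the expected and analysed totals agree with the 1065 and 1002 PIPs stated in the text.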

Statistical analysis

Data were entered into an Excel spreadsheet (Microsoft Corp., Redmond, Washington, USA) and transferred to an SAS file (SAS Institute, Cary, North Carolina, USA) for data description and analysis. Descriptive data are presented as N (percentage) or mean (95% CI). For simple group comparison, Fisher's exact test was used for categorical variables. All statistical analyses were performed using SAS 9.2 (SAS Institute) with a 0.05 two-sided significance level.
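For readers unfamiliar with Fisher's exact test, the following is a minimal sketch of a two-sided test on a 2×2 table, written in plain Python rather than the SAS 9.2 procedures used in the study. The function name and the example counts are hypothetical and are not study data; the two-sided p value is computed by the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Illustrative stand-in for a statistics package, not the study's SAS code.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is at most as probable as the observed table
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts (NOT from the study): rows are two provider groups,
# columns are CPI met / CPI not met.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4), p < 0.05)
```

With a 0.05 two-sided significance level, as in the text, a p value below 0.05 would be read as a difference between groups on that categorical item.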


Of the 355 participants, 288 (81%) were anaesthesiologists and 67 (19%) were CRNAs (table 1). Most anaesthesiologists (94.4%) were board certified; 16 individuals were not and 12 of these were in practice <2 years (ie, enrolled in the board process at the time of participation). All participants were ACLS certified with active certification at the time of the course. Seventy courses were conducted in total.
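The demographic percentages above follow directly from the reported counts. A minimal check, using illustrative variable names:

```python
# Reproduce the demographic percentages from the reported counts.
anaesthesiologists = 288
crnas = 67
not_board_certified = 16    # anaesthesiologists without board certification

total = anaesthesiologists + crnas
share_anaesthesiologists = anaesthesiologists / total
share_crnas = crnas / total
share_board_certified = (anaesthesiologists - not_board_certified) / anaesthesiologists

print(total,
      f"{share_anaesthesiologists:.0%}",
      f"{share_crnas:.0%}",
      f"{share_board_certified:.1%}")
```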

Table 1

Demographic data

The anonymous course evaluation survey was completed by 97% of participants immediately after the simulation course. The range of ‘agree’ and ‘strongly agree’ responses for each satisfaction survey item was 88–97%.

The coded responses to the immediate postcourse open-ended survey question, “What is the most important thing you will take away from this course?” referred to communication/teamwork (65% of responses), the environment/equipment (17%), DAPT guidelines (8%), airway management (6%) and ACLS (4%). The 1002 de-identified PIPs received 30 days postcourse were categorised by topic, with the following proportional representation: communication/teamwork (61%), the environment/equipment (20%), DAPT guidelines (5%), airway management (8%) and ACLS (6%). Only one participant had an inadequate plan (having submitted one PIP item) but was able to amend the PIP by addition of two other plans.

Of the 4260 possible CPIs recorded (12 CPIs per participant), there were 29 non-rateable instances for 16 different practitioners (<1%), all from anaesthesiologists. The performance data are reported by group (ie, anaesthesiologist and CRNA) in table 2.
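The CPI totals can be reconciled in the same way. This short sketch uses illustrative names; the exact non-rateable share is derived from the reported counts (the text states only "<1%").

```python
# Reconcile the CPI totals reported in the Results.
participants = 288 + 67         # anaesthesiologists + CRNAs
cpis_per_participant = 12       # CPIs rated per individual across four scenarios
non_rateable = 29               # instances that could not be rated

possible_cpis = participants * cpis_per_participant
recorded_cpis = possible_cpis - non_rateable
non_rateable_share = non_rateable / possible_cpis   # derived; text says "<1%"

print(possible_cpis, recorded_cpis, f"{non_rateable_share:.2%}")
```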

Table 2

Critical performance items (CPIs) recorded by real-time raters (RTRs)

In the JV scenario, few participants recognised the inadequacy of pressures delivered by improvised devices, and less than one-third of either group could properly operate an actual jet ventilator. In the ACLS scenario, less than one-fourth of all participants performed ACLS at a level deemed completely correct in terms of diagnosis, drug dose and/or sequence. Despite a majority of individual participants demonstrating at least minor remediable gaps in their ACLS knowledge and skills (ie, wrong diagnosis, wrong drug, dose or sequence), 81% of all groups were rated as performing ACLS to a standard the RTR felt was acceptable management in keeping with current ACLS guidelines.

During the DAPT scenario, less than one-third of all participants could demonstrate how to access the ASA practice alert for perioperative coronary stent management and less than half appropriately delayed the elective surgery presented. Overall, 97% of all groups were rated as demonstrating at least one remediable gap in communication or teamwork skills during the course.

The response rate to the 6-month follow-up survey was 73%. The oxygen pipeline swap scenario and DAPT guidelines were cited as most impactful (45% and 20%, respectively). A large majority of respondents were successful in implementing at least 1 PIP (91%), and many reported implementing all three PIPs (78%). The PIPs most often reported as successfully completed were improvements in the work environment/equipment (45%) or personal knowledge (21%) domains. For those respondents who were unsuccessful in completing PIPs, 90% reported institutional constraints (eg, budgetary issues, lack of support from leadership) as being their major barriers to implementation.

In the open-ended feedback section, several respondents positively attributed individual clinical events or outcomes to the course in the 6 months after participation:

  1. one instance of successful needle cricothyrotomy/JV,

  2. one instance of successful intraoperative synchronised cardioversion for unstable ventricular tachycardia,

  3. 23 delayed elective cases after discovery of inappropriate cessation of DAPT and

  4. 10 successful equipment requests at off-site locations (including a defibrillator, JV and video laryngoscopes).

An equipment failure regarding JV resources was discovered when several participants, as part of their PIPs, examined their departmental difficult airway carts. In three cases at one institution, participants found that the JVs had the oxygen diameter index safety system (DISS) connectors removed, rendering the devices unusable. Only one participant submitted a negative review of the course in the 6-month survey.


Professional liability insurer-driven education has an established precedent, with simulation increasingly becoming a component of that process.17–20 ,23 This is largely driven by the success of simulation-based programmes for faculty level education and the use of simulation for formative assessment and systems improvements.2 ,3 ,5 ,6 The results from the first cycle of this mandatory, professional liability insurer-supported initiative illustrate that a simulation-based programme can also be useful to identify practice gaps (in aggregate) in a large cohort of anaesthesia providers and may directly improve their clinical practice (at least by self-report).

Participant satisfaction and practice improvement plans

Course satisfaction was high, and 91% of respondents reported being successful in implementing at least one PIP. This compares well with the 94% PIP success rate reported by Steadman et al for the MOCA cohort.3 ,5 Our participants most often chose PIPs related to improved communication/teamwork skills. However, they most often reported successful completion of PIPs in the environment/equipment domains.

While work environment and technical improvements are of general importance to hospital physicians,27–29 this discordance likely represents the challenge in making improvements in communication and teamwork domains when compared to environmental/equipment or personal knowledge gaps. It is clear that teamwork and communication are important in mitigating medical errors and patient harm,30–34 yet opportunities for formal training or to make cultural and policy changes (eg, improved perioperative timeouts) are likely limited for the average practitioner. This perhaps suggests a need for more widespread availability of such programmes as practitioners certainly desire these opportunities35 ,36 and in our cohort, most often submitted these as PIPs. The PIP and satisfaction data we report are indeed very similar to data from the MOCA cohort where environment/equipment improvements were reported most frequently, with teamwork improvements as the second most prevalent PIP. What cannot be gleaned, however, from previous work (but which we have shed light on in the present manuscript) is what the participants were actually doing during simulation, not just what they reported doing afterwards.

Critical performance items and translation to practice

Arriaga et al19 reported success in their pilot model for insurer-driven, multicentre simulation training for operating room teams, but they did not present performance data during their scenarios. The real-time CPIs we collected were valuable to course stakeholders but also may provide insight regarding practice gaps among anaesthesia providers, in general, and illustrate how educational programmes can be used to observe aggregate performance in large groups of practitioners, and hopefully help to fill gaps.

We found that several CPIs presented challenges for participants, with some important examples:

  1. JV scenario: The overwhelming majority of practitioners created makeshift JV devices. While many reports encourage improvised devices,37–41 it has been shown that they are unlikely to be useful and can even cause harm.42 Competent needle cricothyrotomy and JV operation were not uniformly observed in our cohort overall, which is consistent with previously reported data.42 From a resource standpoint, participants found three JVs that had been rendered unusable, which led to departmental improvements in these emergency devices at one hospital. We were pleased also to see many participants' requests for airway management devices for off-site locations as a result of this scenario.

  2. ACLS scenario: The majority of practitioners deviated from ACLS guidelines in at least one domain (eg, drug, dose, sequence). While this rate is comparable to previously reported deviations from ACLS,43 ,44 the observed failures do emphasise the importance of continuous review and active ACLS certification in order to decrease loss of these decay-prone skills.45–47 Group performance, however, was generally good as rated by the RTRs. One participant reported successful cardioversion of unstable ventricular tachycardia, which they directly attributed to the course.

  3. DAPT scenario: To the best of our knowledge, no reliable data on the general level of knowledge regarding DAPT management among healthcare providers have been reported, but our data would suggest that potentially serious confusion exists among anaesthesia providers. While the ASA has practice parameters regarding DAPT,48 it is possible that not all practitioners are availing themselves of this information or that this constantly evolving information confuses practitioners. Participants reported delaying 23 elective cases in which DAPT had been inappropriately stopped preoperatively, and it is not known how many other participants also did so as a result of the course.

  4. Pipeline scenario: In the pipeline contamination scenario, <20% of participants identified the delivery of a hypoxic gas mixture in <3 min. This may be explained by the workload experienced in the scenario we employed,10 though previous studies illustrate participants in pipeline contamination scenarios generally exhibit low levels of competency irrespective of added workload.24 ,25 ,49 ,50 While oxygen pipeline contamination is rare, gas delivery equipment is still a source of anaesthesia claims51 and thorough reviews of the workstation periodically are an important facet of continuing education.

While we noted some differences in CPIs between anaesthesiologist and CRNA groups, we did not compare the groups statistically given the lack of an a priori research design or CRNA volunteer rater. The gaps for anaesthesiologists may very well be different from the gaps for CRNAs, but this should be investigated further in larger studies with more detailed assessment goals and likely over multiple scenarios.


There are several limitations in this study. First, these scenarios were designed primarily for education, and assessment aspects were formative, with no control group and a rating process reliant on a streamlined set of CPIs predominantly chosen for ease of use and practicality within an educational programme rather than an a priori research design approach. We could not staff each scenario with two RTRs, meaning that we do not have inter-rater reliability data for each CPI. Also, we did not compare RTR to post hoc video due to resource limitations. Owing to these limitations, the results of this study should be interpreted only at the level of observed retrospective data from an educational course. The simulations were deliberately challenging and in one case exceedingly rare (ie, oxygen pipeline swap). This may have led to falsely elevated failure rates that are likely not applicable to real life (ie, in actual practice, these providers exhibit few failures and most patients do well). Also, given the number and diversity of participants, it was impractical to assess for lasting individual improvements, and most of the resultant data rely on self-report and were not compared to performance ratings during simulations.

Another major limitation is that the scenarios were performed in groups, not as individuals, and results from strong performers may have contaminated those of weaker performers (or vice versa). Designing and implementing a course with an individual simulation for so many participants would have been impractical, however, without large amounts of funding. Best efforts were made to query each group member for the specific CPI where applicable or to have them individually demonstrate competence with certain medical/technical skills. We do not know the effect, if any, of group assignment, sequence of performing a CPI and whether being grouped with two other practitioners helped or impaired the performance of individuals. Further, RTRs were physicians, and no CRNA raters were employed due to staffing and employment (ie, union) restrictions. Had this programme been designed solely for research purposes, a CRNA rater would have also been used and a research design employed to stringently compare participants from either group; as it stands, little can be drawn from any comparisons between groups.

As the same course was provided for all participants over the course of 2 years, there was also no way to account for potential contamination between groups. Participants were instructed not to share the scenarios or outcomes with colleagues, but we cannot rule this out. Finally, we have no institutional data to support whether this course truly made patients safer or improved their outcomes. The low occurrence of rare and critical events in anaesthesiology makes this problematic. For similar reasons we also did not attempt to measure changes in incidence of malpractice claims among the participating hospitals and anaesthesia providers. However, given the reported improvements made at the systems level and reports from participants, we do believe the overall impact was positive, and the course has been renewed for another 2-year cycle. Further, results from a HIC-sponsored audit involving 1610 live observations over the last 6 months of this course showed improved overall communication and compliance with measures of effective teamwork in the actual ORs (though these data would naturally be confounded by many other elements and due to efforts in other departments as well).52


Simulation-based educational opportunities for practitioners can be limited and costly. A professional liability insurer-funded simulation initiative was feasible and useful for large-scale implementation for a diverse group of anaesthesia providers in our system. Besides providing an educational foray (as with MOCA offerings), this course also offered an opportunity to examine aggregate performance gaps by scoring what practitioners actually do during simulation. We hope other institutions will use this model to initiate similar programmes and report observed gaps in participant performance or environmental/equipment resources so readers at large can address these and other important, untested gaps in anaesthesia knowledge, skills, attitudes and resources. This potential for translation to actual patient care is crucial to protecting patients from harm.


The authors acknowledge the assistance of the chairs in preparing this manuscript and presenting the course, as well as the RTRs and PIP raters for agreeing to participate.




  • This paper is attributed to the Icahn School of Medicine at Mount Sinai Hospital.

  • Contributors All authors meet the following criteria for authorship equally. SD, AL, PP, DF, PK, AB and AG made substantial contributions to the conception or design of the work; or the acquisition, analysis or interpretation of data for the work; drafting the work or revising it critically for important intellectual content; final approval of the version to be published; agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

  • Funding This work was supported by Hospitals Insurance Company (HIC grant number HIC-001).

  • Competing interests None declared.

  • Ethics approval Mount Sinai PPHS.

  • Provenance and peer review Not commissioned; externally peer reviewed.
