Tag Archives: evaluation

Planning Learning

When Australian GPs hear this term they think about the dreaded Learning Plan that has been compulsory in GP training for decades despite a remarkable lack of evidence. However, it does sound like a sensible idea that ought to work and we see some remarkably self-directed registrars with very organised learning plans.

However, this post is about the components of one of the educational roles which ought to be in the skillset of the average educator / trainer / supervisor.  We don’t just “teach”, we plan and organise learning – and hopefully facilitate and enhance it.

This means that when we have the task of teaching registrars about a particular topic – say, at a workshop or in a practice teaching session – we don’t stop at preparing and delivering a talk. As stated before, education (and learning) is not about throwing information at someone and hoping some of it sticks.  It’s not just about content.  So what else should we be planning, organising, anticipating and generally being pro-active about?  Here are some educational questions to ask yourself.

What is the most appropriate way to address this topic? Is it content dense or does it require practice of skills?  Could it be more effective to organise registrars to do some pre-reading of content and then to apply this knowledge in problem-solving cases during the session (the flipped classroom)?  Does the subject really need small group discussion rather than large group presentation?

Part of planning can include some micromanaging, such as engineering small groups (same levels together, or experienced mixed with less experienced) or being aware of different ways of arranging the room (chairs and tables, breakout rooms) for maximum educational effect. Is there anyone in the organising and planning group who is able to pay attention to these issues?  It may make the difference between an effective session and one that is ho-hum.

When you decide to run a session in a particular way it is reasonable to want to know if it actually worked or if it could have been done better. This is what evaluation is about and it is important at a program level and at the level of the individual educational activity.  There are of course many levels at which you can try to evaluate.  Many education providers routinely measure learner feedback on either their ratings of the presenter or their impression of whether learning objectives were achieved (or at least addressed).  A learning objective is generally more ambitious than something that can be measured by the end of a workshop.  In Kirkpatrick’s Hierarchy measuring satisfaction is the lowest level and measuring the ultimate effect on society’s health would be at the apex of the triangle.  It is rare to non-existent for this to occur in education.  However, it is not out of reach to be able to measure knowledge or performance (in the workplace).  You might administer a knowledge test soon after the session or some months down the track.  It’s nice to have a control group – or even comparison with another cohort.  It’s difficult but possible to perhaps get assessments of aspects of practice performance.  Of course all these attempts are limited by numbers and various confounding factors.

It is often more meaningful and useful to use more than one method and not to limit yourself to an online survey with a Likert scale. Qualitative feedback can inform future planning (what was missed, what can be done better) and focus groups can produce useful perspectives.  It is also important that the questions asked are those that matter to the educators on the ground.

Another form of evaluation occurs when educators get together to discuss how it all went and what can be done better next time. This implies that there is not just one isolated educator in the room for the session.  At a minimum, they should all have timely access to the written feedback. It can be a learning process for educators as well.  This completes the planning circle, and the pooling of educational knowledge and expertise enhances the educational planning process for the next iteration.  This is how we get quality improvement and head toward our goal of excellence in education and training.  It also makes the educator role much more fulfilling.

The Training Environment – micro and macro

Education and training do not just depend on the teacher / learner dyad in isolation. They are just part of a bigger training environment.  We are probably well aware of the micro environment of the practice or clinical setting, which includes the attitudes and involvement of the non-supervisor medical staff, the busyness of the service (in either direction), the variety of clinical cases, the supportiveness of non-medical staff and so forth.  These can be even more variable in community settings (compared to hospital) and can be harder to control.  However, they may often need to be accounted for.  If a particular practice has a patient load that is largely acute presentations, repeat scripts and medical certificates with little continuity of care (not uncommon in some settings), then educators should be aware of this and able to direct the registrar to a different type of experience in a later term.  It can be more subtle within a practice, where "female problems" are directed to a female registrar who then gains less experience in other areas.  A registrar may feed back that a supervisor is not very helpful, yet the environment is conducive to learning because office staff are supportive and other medical staff are knowledgeable and involved.  The one thing you can say is that the issues are complex and a training system needs to take account of this.

There are a few points made in the following article (about education in residency training) regarding the importance of the "intangibles of the learning environment".  The author claims that "At its best, the residency experience must be conducted as professional education, not as vocational training." It goes further than mere training or credentialling and should focus on things that are obvious to many good supervisors: the assumption of responsibility, reflective learning, primacy of education and continuity of care.  http://www.jgme.org/userimages/ContentEditor/1481138241158/06_jgme-09-01-01_Ludmerer.pdf  However, I do not agree with the negative interpretation of the limiting of work hours and suspect the principle of continuity should be addressed in other ways. A positive training environment can certainly encourage learners to be curious about the outcomes of patients they see, in the context of good handovers and teamwork.

He suggests there is a need to prepare “residents to adapt to the future, not merely learn for the here and now…excellency in residency training is not a matter of curricula, lectures, conferences, or books and journals…. Nor is it a matter of compliance with rules and regulations. Excellence depends on the intangibles of the learning environment: the skill and dedication of the faculty, the ability and aspirations of trainees, the opportunity to assume responsibility, the freedom to pursue intellectual interests, and the presence of high standards and high expectations.” You can sense his frustration at the increasing bureaucratisation in learning environments.  I am aware of many great supervisors in general practice who do all of this almost intuitively and we rely on their skills and commitment when broader systems are not adequate.

It is not so immediately obvious that the macro environments also have a significant influence on the learning culture.  These can include the ethos of a hospital, training organisation or government policy frameworks.

If the varying stakeholders (government, colleges, standard setters, accreditors, funders) emphasise outcomes and competencies, this can move the learning environment towards one that focusses on assessment and box ticking.  This may have benefits but there may be intangible losses which are not acknowledged.  If efficiencies are sought through larger institutions and faculty mergers, then the interpersonal nature of education may be lost.

Standardisation may increase the quality of training or lead to a lowest common denominator approach and the implementation of IT platforms  is extremely unpredictable in its outcomes.  At its worst, educational quality ends up at the mercy of unresponsive systems and learners and teachers feel they are part of an industrial process.  At its best, resources become more accessible to learners and reflective and self-directed learning can be enhanced.

In the clinic setting a positive learning environment is encouraged when the learner feels free to ask questions and when they observe a culture of learning in their colleagues;  where all staff acknowledge the importance of education and the learning task; where the supervisor is able to admit when she doesn’t know something and where the learner is treated with respect.  Learning is facilitated when there is sufficient challenge matched by the appropriate level of support – the concept of “flow” (another topic of its own) – which is not always easy to achieve and is a shifting dynamic.  The learning environment must also be safe for learner and patient and this often relates to the quality of supervision.

There are other learning environments, which include the "workshop" setting. There is more to it than standing up in front and presenting relevant or required content to a group of learners.  The focus of evaluation is often on the presenter, but a fantastic performer or an attractive collection of slides does not always ensure the most effective learning. Similarly, pre-prepared learning objectives may have limited relevance to the learning that is actually occurring. The size of the group will affect how active or passive the process is (300 is very different to 30).  Consider the members of the particular group of learners – are they at the same level, do they already know each other, do the presenters know them, have they travelled far?  What about the size of the rooms and the acoustics?  Are the small group facilitators well prepared?  Which of the educational staff takes note of (or has the power to influence) these "small" but important issues?

In the bigger picture, consider the effect on the learning environment if service delivery is always prioritised over teaching or if the educational staff have minimal professional development to develop their skills. The “intangibles” of the learning environment that lead to excellence include the unintended consequences of policies and rules.   Learners are enthused to extend their knowledge and skills when they are inspired by mentors, when they can communicate with their educators and interact with their peers, when they feel supported by their supervisors and when the parameters of training include sufficient flexibility to allow for individual needs and rates of progression.

Over the last couple of decades there has been talk of both vertical and horizontal integration in teaching and learning environments. Some of this has been ideological, idealistic or pragmatic. It is affected by the size of institutions, the remoteness of training locations and the training requirements of various health professions. It has been influenced somewhat in Australia by the waxing and waning of funding for the PGPPP (pre-vocational general practice placement program) and it is no doubt also affected by practice economics, student numbers and reimbursement (or otherwise) for teaching.  The GP supervisors group has written about this from a supervisor viewpoint http://gpsupervisorsaustralia.org.au/wp-content/uploads/GPSA-Vertical-and-Horizontal-Learning-Integration-in-General-Practice-Apr2014.pdf  (before the more recent significant changes to the structure of Australian GP training) and there are some notable examples of practices who make a conscious effort to create a learning environment.

Consider the learning environments that you are part of and the factors that are influencing their educational quality. I suggest discussing these with colleagues and considering the broader issues when you are evaluating your teaching sessions and the experience of learners.  We want learners to bloom (not shrivel up like the pot plants on my windy and salty balcony) and for that they need the right environment!  Bear in mind that you can make assumptions about the factors that create a positive learning environment but, ideally, it would be best practice to actually try to measure this.  The validity of educational methods is very context dependent.

Educators may have limited power to influence decision making at many levels but we have a professional responsibility to inform decision makers when the learning environment can be improved and, especially, when it is under threat.


Early identification of the struggling learner

The holy grail and silver bullets

Early identification of learning needs is of course the holy grail of much education and vocational training. It has become even more pertinent in GP training since timelines for completion of training have been tightened and rigidly enforced.  Gone are the days of relatively leisurely acquisition and reinforcement of knowledge and skills, with multiple opportunities for nurturing the best possible GP skillset.

Consequently there is an even more urgent search for the silver bullet – that one test that will accurately predict potential exam failures whilst avoiding over-identifying those who will make it through regardless (effort and funds need to be targeted). If it all sounds a bit impersonal, well … there's a challenge.


Often the term "at risk registrar" is used, but I have limited this discussion to the academic and educational issues in training. Discussion of predictors also often strays into the area of selection, in an effort to predict both who will succeed in training and who will be an appropriate practitioner beyond the end of training – but that is beyond the scope of this post, although it does suggest utilising existing selection measures.

The literature occasionally comes up with interesting predictors (most of it is in the undergraduate sphere; vocational training is less conducive to research).  There are suggestions, for instance, that students who fail to complete paperwork and whose immunisations are not up to date are likely to have worse outcomes.  This is not totally surprising, and rings true in vocational training as, perhaps, the Sandra/Michelle/fill-in-a-name test: the admin staff who first encounter applicants for training are often noted to predict who will have issues in training. This is no doubt based on a composite of attributes of the trainees, and the experienced admin person's assessment is akin to a doctor's informed clinical judgment.  However, it is not numerical and would not stand up on appeal; it is often an implicit flag.  Obviously undergraduate predictors may be different from postgraduate predictors, but there is always a tendency to implement tools validated at another level of training. They should then be validated in context.

Note that, once in training, the reason for identifying these “at risk” learners is in order to implement some sort of effective intervention in order to improve the outcomes. This requires diagnosis of the specific problems.

Thus, there is interest in finding the one test that correlates with exam outcomes – and there may be mention of P values, ROC curves, etc. Given that different exams test different collections of skills, it is not surprising that one predictor never quite does the job.  As an educator I’m not happy that something just reaches statistical significance but is too ambiguous to apply on the ground.  I want to feel confident that a set of results can effectively detect at least the extremes of learner progression through training: those who will sail through regardless and those who are highly likely to fail something (if no extra intervention occurs).

The "triple test"

If an appropriate collection of warning flags is implemented, then the number of flags tends to correlate with exam outcomes (our only current gold standard). It is possible to identify a small number of measures that do this best, and work has been done on this.  This measure + that measure + a third measure can predict exam outcomes with a higher degree of accuracy.  My colleague, Tony Saltis, interpreted this as "like a triple test".  It appeared to me that this analogy might cut through to educators who are primarily doctors.  In the educational sphere this analogy can be extended (although one should not push analogies too far).  Combining separate tests can provide extra predictive accuracy.  In prenatal testing there has been a double test and, in some places, a quadruple test, and now there is the more expensive cell-free foetal DNA test, which is not yet universally used. There are pros and cons of different approaches.  Extra sensitivity and specificity for one condition does not mean that a test detects all conditions and, of course, in Australia, the different modality of ultrasound added to that particular mix.
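To make the screening analogy concrete, here is a minimal sketch (in Python, with entirely invented flag counts and exam results – no real registrar data) of how a count of warning flags could be checked against exam outcomes at different thresholds:

```python
# Hypothetical illustration only: all names and numbers below are invented.
# We treat "threshold or more warning flags" as a positive screen for exam failure.

def sensitivity_specificity(flag_counts, passed_exam, threshold):
    """Sensitivity and specificity of the flag-count screen for exam failure."""
    tp = fn = tn = fp = 0
    for flags, passed in zip(flag_counts, passed_exam):
        screened_positive = flags >= threshold
        if not passed:                      # exam failure is the "condition"
            if screened_positive:
                tp += 1
            else:
                fn += 1
        else:
            if screened_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

# Invented cohort: flags accumulated from, say, an MCQ ranking, an interview
# score and an early in-training assessment; True = passed the exam.
flag_counts = [0, 0, 1, 0, 2, 3, 1, 0, 2, 3]
passed_exam = [True, True, True, True, False, False, True, True, False, False]

for threshold in (1, 2, 3):
    sens, spec = sensitivity_specificity(flag_counts, passed_exam, threshold)
    print(f"{threshold}+ flags: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

As with any screening test, where you set the threshold trades sensitivity against specificity, and the resulting numbers only mean something for the cohort they were derived from.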

Any chosen collection of tests will not be the final answer. Each component of any “triple (or quadruple) test” should have the usual constraints of being performed in the equivalent of accredited and reliable labs, in a consistent fashion and results of screening tests should be interpreted in the context of the population on which they are performed.  They also need to be performed at the most appropriate time.

Hints in the evidence

I have previously found that rankings in a pre-entry exam-standard MCQ are highly predictive of final exam results. However, to apply this in different contexts there is a proviso that it be administered in exam conditions and the significance of specific rankings can only confidently be applied to the particular cohort. The addition of data from interview scores, possibly selection bands, from types of early in-training assessments and patient feedback scores appear to add to this accuracy, in the data examined – particularly for the OSCE exam.  (Regan C Identifying at risk registrars: how useful are components of a Commencement Assessment? GPTEC Hobart August 2015).  Research is also ongoing in Australian GP training in other regions by Neil Spike and Rebecca Stewart et al (see GPTEC 2016).  I would suggest that the pattern of results is important.


The way forward

Now that GP training in Australia has been reorganised geographically it is up to the new organisations (and perhaps the colleges) to start collecting all the relevant data anew and to ensure it is accessible for relevant analysis. There is much data that can potentially be used but there needs to be a commitment to this sort of evaluation over the long term. It should not be siloed off from the day to day work of educators who understand the implementation and significance of these data.

Utilising data already collected would obviously be cost-effective and time-efficient, alongside any new tools devised for the purpose. I suspect there is a useful "triple test" in your particular training context, but you need to generate the evidence by follow-up. Validity does not reside in a specific tool but includes the context and how it is administered.  There needs to be an openness to future change depending on the findings.  The pace of this change (or innovation) can, ironically, be slowed by the need to work through IT systems, which develop their own rigidity.

This is an exciting area for evidence-based education and the additional challenge is for collegiality, learning from each other and sharing between training organisations. Only then can we claim to be aiming for best practice.

Of course the big question is, having identified those at risk – what and how much extra effort can you put in to modify the outcomes and what interventions have proven efficacy?

Evaluation – How do we know we are doing a good job?

There are multiple approaches to evaluation and many are related to predicting outcomes in training or to issues of Quality Improvement.  As professionals, medical educators aim to do their job well and benefit from evaluating what they do.  At a higher level, Program Evaluation is an important issue.  At all levels, evaluation helps you decide where to focus energy and resources and when to change or develop new approaches. It also prevents you from becoming stale. However, it needs curiosity, access to the data, expertise in interpreting it and a commitment to acting on it and there needs to be organisational support.

Doing it better next time

So, at the micro level, I get asked to give a lecture on a particular topic, to run a small group or to produce some practice quiz questions for exam preparation.  How do I know if I do it well or even adequately?  How can I know how to do it better next time?

There are many models of evaluation, particularly at the higher levels of program evaluation (if you are keen you could look at AMEE guides 27 and 29, or this http://europepmc.org/articles/PMC3184904 or https://www.researchgate.net/publication/49798288_BEME_Guide_No_1_Best_Evidence_Medical_Education ).  They include the straightforward Kirkpatrick hierarchy (a good example of how a 1950s PhD thesis in industry went a long way), which places learner satisfaction at the bottom, followed by increased knowledge, then behaviour in the workplace and, finally, impact on society – or the health of the population in our context.  As you can imagine, there are very few studies able to look at the final level.

Some methods of evaluation

The simplest evaluation is a tick-box Likert scale of learner satisfaction.  Even this has variable usefulness depending on the way questions are structured, the response rate of the survey and the timeliness of the feedback.  The conclusions drawn from a survey sent out two weeks after the event with a response rate of 20% are unlikely to be very valid.  Another issue with learner satisfaction is the difference between measuring the presenter's performance and the educational utility of the session.  I well recall a workshop speaker who got very high ratings as a "brilliant speaker", but none of the learners could list anything they had learnt that was relevant to their practice.  You could try to relate the questions to required "learning objectives", but these can sometimes sound rather formulaic or generic.  It is certainly best if the objectives are the same as those intended by the presenter and are geared towards what you actually intended to happen as a result of the session. When evaluating you need to be clear about your question: what do you want to know?

If you add free comments to the ratings with a request for constructive suggestions you are likely to get a higher quality response and one that may influence future sessions.  It is also possible to ask reflective questions at the end of a semester about what learners recall as the main learning points of a session.  After all, we really want education that sticks!

Another crucial form of evaluation is review with your peers. Ask a colleague to sit in if this is not a routine happening in your context.  Feedback from informed colleagues is very helpful because we can all improve how we do things.  It is hard to be self-critical when you have poured a large amount of effort into preparing a session and outside eyes may see things we cannot.

To progress up the hierarchy you could administer a relevant knowledge test at a point down the track or ask supervisors a couple of pertinent questions about the relevant area of practice.

Trying out something new

If you want to try an innovative education method or implement something you heard at a conference it is good practice to build in some evaluation so that you can have a hint as to whether the change was worth making.

An example

A couple of years ago I decided to change my Dermatology and Aged Care sessions into what is called a flipped classroom, so I put my PowerPoint presentations and a pre-workshop quiz online as pre-viewing for registrars.  I then wrote several detailed discussion cases, with facilitator notes, for discussion in small groups.  I did a similar thing with a Multimorbidity session, where I turned a presentation into several short videos with voice-over and wrote several cases to be worked through at the workshop.

I wanted to compare these with the established method so I compared the ratings to those of the previous year’s lecture session (the learning objectives were very similar).  Bear in mind there is always the problem of these being different cohorts.  I also asked specific questions about the usefulness of the quiz and the small group sessions and checked on how many registrars had accessed the online resources prior to the session.  It was interesting to me that the quiz and the small groups were rated as very useful and the new session had slightly higher ratings in the achievement of learning objectives.  Prior access to the online material made little difference to the ratings.  I also assessed confidence levels at different points in subsequent terms. In an earlier trial of a new method of teaching I also assessed knowledge levels.
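This kind of cohort comparison can be sketched in code. The following is a hypothetical illustration (the Likert ratings below are invented, not my actual workshop data): it compares two cohorts' ratings of "learning objectives achieved" and uses a simple permutation test to ask how easily the observed difference could arise by chance – remembering that different cohorts remain a confounder no statistics can remove.

```python
# Hypothetical illustration: invented 1-5 Likert ratings for last year's
# lecture versus this year's flipped-classroom session.
import random
from statistics import mean

lecture = [3, 4, 3, 4, 4, 3, 5, 3, 4, 3]
flipped = [4, 4, 5, 4, 5, 3, 5, 4, 4, 5]

observed = mean(flipped) - mean(lecture)

# Permutation test: shuffle the cohort labels many times and count how often
# a difference at least as large as the observed one appears by chance.
random.seed(1)
pooled = lecture + flipped
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = mean(pooled[len(lecture):]) - mean(pooled[:len(lecture)])
    if diff >= observed:
        count += 1

print(f"observed difference: {observed:.2f}")
print(f"approx one-sided p-value: {count / trials:.3f}")
```

With small groups like these, a permutation test avoids distributional assumptions that Likert data rarely satisfy, but even a "significant" result here only tells you the ratings differ, not why.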

Education research is often “action research”.  There is much you can’t control and you just do the best you can. However, if you read up on the theory, discuss it with colleagues and see changes made in practice then it all contributes to your professional development.  Sharing it with colleagues at a workshop adds further value.

Some warnings

Sometimes evaluations are done just because they are required to tick a box and sometimes we measure only what is easy to measure.  Feedback needs to be collected and reviewed in a timely fashion so that relevant changes can be made and it is not just a paper exercise. There is no point having the best evaluation process if future sessions are planned and prepared without reference to the feedback.  It would be good if we applied some systematic evaluation to new online learning methodologies and didn’t just assume they must be better!

Evaluation is integral to the Medical Educator role

A readable article on the multiple roles of The Good Teacher is found in AMEE guide number 20 at http://njms.rutgers.edu/education/office_education/community_preceptorship/documents/TheGoodTeacher.pdf

Evaluation is a crucial part of the educator role; when the two (education and evaluation) are separated, the educator's role is diminished and the usefulness of any evaluation is curtailed.  Many things have an influence on training outcomes, including selection into training, the content and assessment of training, and the processes and rules around training. As an educator you may have increasingly less influence over decisions about selection processes and even over the content of the syllabus.  However, you may still have some say in what happens during training.  I would suggest that the less influence educators have in any of these decisions, the less engaged they are likely to be.

At the level of program evaluation by funders, these tasks are more likely to be outsourced to external consultants with a consequent limitation in the nature of the questions asked, a restriction in the data utilised and conclusions which are less useful.  “Statistically significant” results may be educationally irrelevant in your particular context..  Our challenge is to evaluate in a way which is both useful and valid and helps to advance our understanding as a community of educators.  A well thought out study is worth presenting or publishing.