AI: Desirable Imperfection?

Might there be benefits in generative AI solutions that hallucinate, make things up and show bias?

We live in a world of convenience. Once upon a time we had to do research in a library, going through card indexes and looking at the bibliography of one book to identify further reading, which would then necessitate hunting in the library for additional books, before summarising everything we had read into our piece of work. Then Google came along and we could search far faster, getting instant lists of articles or websites. We still needed to look at the content our searches yielded, identify the best source information and then mould this into our own final piece of work. Things had become more convenient, which was good, but with this came some drawbacks.

As users we tended to look only at the first page of search results rather than at subsequent pages, meaning we lost some of the opportunities for accidental learning where, in a library, your search for one book might lead you to stumble upon other books which add to your learning. Our searches were also now being partially shaped by algorithms: the search wasn't a simple lookup like that of a card index, but one in which an algorithm predicted what we might want, what was popular, and so on, before returning results. These algorithms reduced the transparency of the searching process, potentially meaning our eventual work had been partially influenced by unknown algorithmic hands. Next came the push for "voice-first", where rather than a list of search items our new voice assistant would boil the answer to our request down to a single response, spoken with some artificial authority.

So roll in generative AI, ChatGPT and Bard. Now we have a tool which will not only search for content but will also attempt to synthesise it into a new piece of work. It doesn't just find the sources; it summarises, expands and explains. Further convenience, combined with further challenges and risks. But what if there are benefits to some of these challenges, such as the hallucinations and the bias? Is that possible?

Let's step back to the library. My search was based on my decisions as to which books to select, with my reading and book selections then influencing the further reading I did. Now, bias and error may have been present in the books, but I could focus on thinking about such bias and error, with error generally a low risk due to the editorial review processes associated with publishing a book. In the modern world, however, my information might come to me via social media platforms, where an algorithm is at play in what I see, choosing what to surface and what not to. Additionally, content might be written by individuals or groups without any editorial process, meaning a greater risk of error or bias. And with generative AI now widely available, we might find content awash with subtle bias, or simply containing errors and misunderstandings presented confidently as fact. As an individual trying to do some research, I have more to think about than just the content. I need to think about who wrote it, how it came to me, what the motivation of the writer was, whether generative AI may have been used, and so on. In effect, I need to be more critical than I might have been back in the library.

And maybe this is where the obvious hallucinations and bias are useful, as they highlight our need for criticality when dealing with generative AI content, but also with the wider content available in this digital world, such as the content we are constantly bombarded with via social media. In a world of ever-increasing content, increasing division between groups and nations, and an increasing number of individuals contributing for positive or sometimes malicious reasons, being critical of content may now be the most important skill.

If it weren't for these imperfections, would we see the need to be critical, in a world where I suspect a critical view is all the more important? And can we humans claim to be without some imperfections? Could it therefore be that the issues or challenges of generative AI, its hallucinations and bias, may actually be a desirable imperfection?

Exams and AI: A look at the current system

I recently presented at a conference on AI and assessment. I think this was reasonably good timing, given JCQ had just released further guidance in relation to student coursework and AI, plus AQA had announced they were going to use online testing as part of their exam suite for the Italian and Polish GCSEs starting from 2026. I think this is a positive step forward in both cases; however, I think it is important that we see this journey as more than simply replacing pencil and paper exams with a hall full of students completing the same exams in an online/digital format. There is significant potential here to ask ourselves what we are seeking to assess, why we are seeking to assess it, and how best to assess it.

The SAMR model

The SAMR model is useful when looking at technology change programmes. The first element of SAMR is simple substitution, similar to the example I gave in the introduction. The concern for me is that this might be the goal being aimed at, when technology and AI present such significant potential beyond mere substitution, and when the world has moved at a fast, technologically driven pace, yet our education system has changed little and our key assessment methodologies, coursework and terminal exams, have barely changed at all.

In looking to progress beyond substitution it might be useful to unpick some of the limitations of the current system.  For this purpose I am going to focus purely on terminal exams given they are such a significant part of the current formal education system in the UK.   So what are the limitations of the currently accepted system?

Logistics

One of the key drawbacks of the current system, as I see it, is the massive logistical challenge it presents. Students have to be filed into exam halls across the country and the world, all at the same time, to complete exam papers which have been securely delivered to exam centres. It's quite an undertaking, even more so when you consider trying to keep the papers and questions secure. In a world of technology where content can quickly and easily be shared, it doesn't take much before questions are out in the open ahead of the exam, advantaging those who have seen the information over those who have missed it. Then you have the issue of gathering all the completed papers up, sharing them with assessors to mark, quality assurance of marking, and the eventual release of results to students some months later. This is a world where technology supports the instant sharing of information: written, audio, video and more. Why can't the exams process be quicker and more streamlined, making use of technology to achieve this?

Diversity

Another key drawback has to be that of diversity. We, more than ever, recognise the individual differences which exist in us all. Discussion of neurodiversity is common at the moment, but despite this we still file all students into a hall to complete the same exam paper. Now, there are exam concessions which can be provided to students, but in my opinion this barely scratches the surface. Where is the valuing of diversity in all of this?

Methodology

We also need to acknowledge that the current exams system very much favours those students who are able to memorise facts, processes, etc. Memorisation is key to exam success, yet out in the real world we have access to ChatGPT and Google to find the information we need when we need it, with the key then being how we interpret, validate and apply this information to the challenges or work in front of us. Shouldn't the assessment methodology align with the requirements of the world we live in? Now, I will acknowledge the importance of key foundational knowledge, so I am not suggesting we stop teaching basic knowledge, but knowledge and memorisation should be less of a focus than they are now.

Conclusion

I believe technology could address a lot of the drawbacks listed above. I note the use of technology will present its own challenges, but how often do we find the "perfect" solution? Wouldn't a solution which is easier for schools to administer, quicker and more efficient, more student-centred and more in line with the world we now live in be a good thing?

AI and assessment (Part 2)

Following on from my last post looking at AI and assessment (see here), where I focussed very much on the high-stakes world of terminal exams and coursework, I would now like to look towards formative assessment and the learning process. As with my last post, this post shares some of the points I made at a recent conference where I spoke on AI and assessment, presenting some questions which I believe we need to increasingly consider in a world of AI and generative AI solutions.

AI Supported Learning

Learning platforms and computer-based learning have existed for some time. I remember having to do some maths learning during my teaching degree using a computer-based learning platform, and that was in the mid-to-late 90s. At the time I wasn't that fond of these learning platforms, and this feeling stayed with me. My issue was that the platforms, although offering differing routes through the broad content, were largely linear in relation to each topic or even the smaller units of learning. This couldn't compare to a teacher delivering content, where they could see students struggling and instantly adjust the learning content accordingly.

We have come a long way from there, with AI and generative AI now able to provide far superior learning platforms. My sense is that these platforms tend to break into two types: one where the AI analyses usage and interaction data to direct learning content creators, and one, the more recent and emerging type, where generative AI provides an AI-based support, teaching or coaching agent.

In the model where the platform analyses usage and interaction data, the key benefit is that this data is gathered from all users, looking for common patterns or anomalies across factors such as gender, language, nationality and a variety of others, to find which learning content works and which does not. This allows the creation of effective learning content based on a huge amount of data across many schools and many learners, far beyond the data a teacher may have to hand. As such, the content in these platforms progressively improves over time, based on data rather than on the intuition or other less tangible factors, which may be wrong, that a teacher may rely on.
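
To make this concrete, here is a minimal sketch of the kind of aggregation such a platform might perform; the record format, content names and the use of raw completion rates are all invented for illustration, and a real platform would track far richer interaction signals:

```python
from collections import defaultdict

# Hypothetical interaction records: (learner_group, content_id, completed)
interactions = [
    ("school_a", "fractions_01", True),
    ("school_a", "fractions_01", False),
    ("school_b", "fractions_01", True),
    ("school_b", "fractions_02", False),
    ("school_a", "fractions_02", False),
]

def completion_rates(records):
    """Aggregate completion rates per content item across all learners."""
    totals = defaultdict(lambda: [0, 0])  # content_id -> [completed, attempts]
    for _, content_id, completed in records:
        totals[content_id][1] += 1
        if completed:
            totals[content_id][0] += 1
    return {cid: done / n for cid, (done, n) in totals.items()}

print(completion_rates(interactions))
# "fractions_02" is never completed by anyone, across schools,
# flagging that unit for review by the content creators
```

The point of the sketch is the scale: a single teacher sees one class, whereas pooling records across every school on the platform surfaces weak content that no individual teacher's intuition could reliably identify.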

Where generative AI is used, students get a chatbot which prompts and supports them as they work through the learning content, with the AI trying to mirror the supportive and coaching role of a teacher, but individualised for each student and available any time, anywhere, assuming access to a device and an internet connection. I feel it is here that there is the greatest potential, especially in relation to more fundamental skills and knowledge development, freeing up teachers to focus on more advanced concepts and also on wider issues such as resilience, leadership, interpersonal skills, wellbeing, etc. I recently read a post about a school which uses AI and doesn't have "teachers", instead having "guides". I suspect this sounds more radical than it is in practice, especially the reported comment by the co-founder that "we don't have teachers". My view is that AI learning platforms won't replace teachers; however, through AI learning platforms working with teachers we may be able to achieve more, and more quickly, with our students. I suspect the school is more akin to this partnership than the report would suggest, however I have no first-hand experience of the school so cannot be sure.

Challenges

AI as a tool to assist, and maybe guide and deliver, learning brings a number of benefits; however, I think it is important to acknowledge some of the challenges and risks. We may not have solutions at this point, but at the very least we need to be aware.

Bias is a clear challenge and something which has been widely reported in relation to AI. In my session I asked a generative AI solution for a picture of a nurse and a picture of a doctor, with the solution returning images where the doctors were all male, the nurses all female, and everyone depicted white. This experiment clearly shows bias; however, the challenge in AI-powered learning platforms is that the bias may not be so easily visible. What if the platform decides, based on statistics, that students from a particular area, nation, gender, preference, age or other characteristic generally do worse than average? The platform may then present them content it believes appropriate to this ability level, in doing so limiting the challenge they receive, impacting their ability to achieve, and possibly causing a self-fulfilling prophecy. And when a parent asks about a student's learning path, is it ethical to use learning platforms if it means we may not be able to explain the decisions taken in the child's learning experience and journey, where those decisions were taken by AI?

Data is another challenge we need to consider, given the huge and growing wealth of data learning platforms might gather in relation to students. This isn't just the data a school might provide, such as name, email and age, but the data produced through each and every interaction with the platform, plus diagnostic data such as the device being used, IP address, etc. And then there is the data a platform might be able to infer from the data gathered. Could an IP address, which suggests a rough geographic location, a device type and an internet speed allow you to infer the wealth of a user or the user's family? I suspect it could. Now consider the massive amount of data gathered over time, across different curriculum subjects and each use of the platform; the potential for inference grows with each additional data point. How do we manage the risks here in relation to data protection, cyber risk, and accidental or purposeful misuse of the data? If we are to use AI-assisted learning solutions, I think we need to ensure we have considered how we might do so safely.
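
To illustrate how such inference could work, here is a deliberately crude sketch. Every device name, threshold and weight here is invented; the point is not that this particular rule is accurate, but that two innocuous diagnostic data points can be combined into a guess about something sensitive:

```python
# A toy illustration of inference from diagnostic data. All categories
# and weights are invented; a real platform would have far more signals
# to draw on, making the inference problem correspondingly worse.

def affluence_signal(device: str, downlink_mbps: float) -> str:
    """Combine a device type and a connection speed into a rough
    (and unreliable) proxy for household affluence."""
    score = 0
    if device in {"iPad Pro", "MacBook"}:       # premium hardware
        score += 2
    elif device in {"budget_android", "shared_pc"}:
        score -= 1
    if downlink_mbps > 100:                      # fast home broadband
        score += 1
    elif downlink_mbps < 10:
        score -= 1
    return "higher" if score > 1 else "lower" if score < 0 else "unclear"

print(affluence_signal("iPad Pro", 250.0))   # "higher"
print(affluence_signal("shared_pc", 4.0))    # "lower"
```

A two-signal rule like this is obviously weak, but with hundreds of data points per student gathered over years, the inferences become both more confident and harder to audit, which is exactly the data protection concern.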

Conclusion

Education has had its challenges for some time, including teacher recruitment, teacher workload and wellbeing, and equity of access. Maybe AI can help with some of this, and maybe AI risks making things worse in some areas; it is difficult to tell. The one thing we can tell is that AI is here and here to stay, so I think we need to make the most of it and shape its use to be as positive and powerful as it potentially can be. A difficulty here, however, is the slow pace with which education changes (little has changed in almost 100 years!). Now, the pandemic did cause some change in my view, but some of that has rubber-banded back to pre-Covid setups. The question now is: is AI the next catalyst for education change, will it impact education as much as or more than the pandemic, and will its impact persist beyond the initial "shiny new thing" period? Only time will tell, although my sense is there is potential for AI to answer all three questions in the affirmative.


References:

A. Garcia (Oct 2023), "A Texas private school is using AI technology to teach core subjects", CHRON: Texas private school replaces teachers with AI technology (chron.com)

Should AI be held to higher standards than humans?

Darren White posted an interesting question on Twitter the other day in relation to the standards we hold AI to: should AI be held to higher standards than humans? This is something I have given some thought to, having an interest both in human heuristics and bias and in artificial intelligence.

Discussions on AI

There is already a lot of discussion regarding issues and challenges related to AI, including bias and inaccuracy or "hallucinations". I have been able to recreate these two issues reasonably easily within generative AI solutions, firstly asking an image generation solution to create a picture of a nurse in a hospital setting and then a doctor in a hospital setting. In this case the images were all of white individuals, with the nurses all female and the doctors all male. The evidence of bias was clear to see. And in a separate experiment with a tool to help with report writing, the developer forgot to provide any data in relation to the fictitious student for whom a report was being created, but the tool simply made the report content up. These issues are clear to see, and it is easy to jump to a standpoint where bias needs to be removed and inaccuracies or hallucinations stopped.

A human view

One of the issues here is that I believe we need to take a cold, hard look at ourselves, at human beings, and at how we might respond to prompts if such prompts were directed at us rather than an AI. Would we fare so much better than an AI? I have a lovely poster in my office showing the cognitive biases which impact human decision-making, and there has been plenty written about this and about heuristics, with Daniel Kahneman's book, Thinking, Fast and Slow, being one of my favourites. A key issue is that we are often not aware of the internal or "fast" bias which impacts us, and may therefore assess our biased decisions as being absent of bias. In terms of hallucinations, we humans suffer the same issue, often stating facts based on memory and holding to these facts even when presented with contradictory evidence; the availability and confirmation biases may be at play here. Another challenge when comparing ourselves with AI is that our own biases and hallucinations are not clear for us to see, albeit they may be clear to others, yet with AI bias and hallucinations, at least in the form of the examples raised above, they are clear for all to see.

End point?

I would suggest that in both AI and human intelligence our ideal would be to remove bias and inaccuracy. I would also suggest that although this is a laudable aim, it is impossible. As such, rather than focussing on the end we need to focus on the journey, and on how we might reduce bias and inaccuracy both in humans and in AI. It may be that reducing bias in humans benefits AI; it may equally be that things work the other way, and discoveries which help reduce bias in AI help with bias in humans. I note that a lot of human thinking, especially our fast thinking, can be reduced to heuristics, "generalisations" or "rules of thumb". How is this much different from the quick processing of a generative AI solution? Does generative AI's probabilistic nature not tend towards the quick creation of generalisations, just based on huge data sets?

The future

So far I have avoided getting pulled into the future and artificial general intelligence, and I mention it for completeness only. It will likely arrive at some point, and most who claim to be AI experts seem to agree, though there is much disagreement as to when. As such, our immediate challenge is the generative AI we have now and its advancement, ahead of the creation of an AI capable of more generally out-thinking us across different domains. That said, I would suggest that in a number of ways generative AI can already outperform us across many domains.

Conclusion

So back to the question at hand: should we seek to hold AI to higher standards? We should seek to avoid outcomes which have a negative impact on humankind, so bias and inaccuracy, along with the other challenges in relation to intelligence, such as equality of access to education, are all things we should seek to reduce. This, I think, is a common aim and can be applied to both humans and AI. In terms of the accepted standard, I think it is currently difficult to hold AI to a higher standard than we hold humanity, given the solutions are created by humans, trained on human-supplied data and used by humans. It may be that in AI solutions you get a glimpse of how entrenched some of our human biases actually are. That said, I also think it might be easier to remove bias and inaccuracy from an AI solution than from a human; I doubt the AI will seek to hold onto its position or counter-argue a viewpoint, at least not yet.

AI and assessment (Part 1)

I recently spoke at an AI event for secondary schools, in which one of my topics related to AI and its impact on assessment. As such, I thought I would share some of my thoughts, with this being the first of two blogs on the first of the sessions I delivered.

Exams

Exams, in the form of terminal GCSE and A-Level exams, still form a fairly large part of our focus in schools. We might talk about curriculum content and learning, but at the end of the day, for students in Years 10 and 11, lower sixth and upper sixth, the key thing is preparing them for their terminal exams, as the results will determine the options available to them in the next stage of their educational journey. The issue, though, is that these terminal exams have changed little. I showed a photo of an exam being taken by students in 1940 alongside a similar exam in recent times, and there is little difference between the photos, other than one being black and white and the other colour. The intervening period has seen the invention of DNA sequencing, the mobile phone, the internet and social media, and more recently public access to generative AI, but in terms of education and terminal exams little has changed.

One of the big challenges in terms of exams is scalability: any new solution needs to scale to exams taken in schools across the world. Paper and pencil exams, sat by students across the world at the same time, accommodate this. If we found life on Mars and wanted them to do a GCSE, we would simply need to translate the papers into Martian, stick the exams along with paper and pencils on a rocket, and fire them to Mars. But being the way we have always done things, and the most easily scalable solution, doesn't make paper and pencil exams the best solution. So what is the alternative?

I think we need to acknowledge that a technology solution has to be introduced at some point, and the key issue is scalability across schools with differing resources. We need a solution which can be delivered in schools with only one or two IT labs, rather than requiring enough PCs to accommodate 200 students being examined at once, as is the case with paper-based exams. So we need a solution which allows students to sit exams in groups, but without compromising the academic integrity of the exams when students share the questions they were presented with. The solution, in my view, is adaptive testing, as used for ALIS and MIDYIS testing by the CEM. Here students complete the test online but are presented with different questions which adapt to their performance as they progress. This means the testing experience is adapted to the student, rather than being one-size-fits-all as with paper exams. This helps keep students motivated and within what CEM describe as the "learning zone". It also means that, as students receive different questions, they can sit the exam at different times, which solves the logistical issue of access to school devices. Taken a step further, it might allow students to complete their exams when they are ready, rather than on a date and time set for all students irrespective of their readiness.
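
A toy sketch of the adaptive idea, assuming a simple up/down difficulty rule and an invented probability model (real adaptive tests such as CEM's use proper psychometric models, not this staircase):

```python
import random

def run_adaptive_test(ability: float, n_questions: int = 10, seed: int = 0) -> list:
    """Simulate an adaptive test: difficulty rises after a correct answer
    and falls after an incorrect one, keeping the student near the edge
    of their ability (the "learning zone")."""
    rng = random.Random(seed)
    difficulty = 5  # start mid-range on a 1-10 scale
    history = []
    for _ in range(n_questions):
        # Chance of a correct answer falls as difficulty exceeds ability
        p_correct = 1 / (1 + 2 ** (difficulty - ability))
        correct = rng.random() < p_correct
        history.append(difficulty)
        difficulty = min(10, difficulty + 1) if correct else max(1, difficulty - 1)
    return history

# Two students sitting "the same test" see different question sequences,
# each pitched near their own level
print(run_adaptive_test(ability=8.0))
print(run_adaptive_test(ability=3.0))
```

Because each student's sequence is generated on the fly from a large question bank, sharing "the questions" after sitting the test reveals little, which is what makes staggered sittings in small IT labs viable.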

AI also raises the question of our currently limited pathways through education, with students doing GCSEs and then A-Levels, BTECs or T-Levels, and then going on to university. I believe there are around 60 GCSE options available, however most schools will offer only a fraction of these. So what's the alternative? Well, Caltech may provide a possible solution. They require calculus as an entry requirement, yet lots of US schools don't offer calculus, possibly due to lack of staff or other reasons. Caltech's solution has been to allow students to evidence their mastery of calculus through completion of an online Khan Academy programme. What if we were more accepting of online platforms as evidence of learning and subject mastery? There is also the question of the size of courses. GCSEs, A-Levels and BTEC qualifications are all two years long, but why couldn't we recognise smaller qualifications and thereby support more flexibility and personalisation in learning programmes? In working life we might complete a short online course to develop a skill or piece of knowledge on a "just-in-time" basis, so why couldn't this work for schools and formal education? The Open University already does this through micro-credentials, so there is evidence as to how it might work. I suspect the main challenges here are logistical, in terms of managing a larger number of courses at exam board level, plus agreeing the equivalence between courses: is introductory calculus the same as digital number systems, for example?

Coursework

Coursework is also a staple part of the current education system and summative assessment. Ever since generative AI made its big entrance in terms of public accessibility, we have worried about students cheating in relation to homework and coursework. I suspect the challenge runs deeper, as a key part of coursework is its originality, the fact that it is the student's own work; but what does that look like in a world of generative AI? If a student has special educational needs and struggles to get started, so uses ChatGPT to help begin, but then adjusts and modifies the work over a period of time based on their own learning and views, is this the student's own work? And what about the student who does the work independently, but then, before submitting, asks ChatGPT for feedback and advice, adjusting the work before submission? Again, is this the student's own work?

There is a significant challenge in relation to originality of work, and independent of AI this challenge has been growing. As the speed of new content generation, in the form of blogs, YouTube videos, TikTok, etc, has increased year on year, and as world populations continue to grow, it becomes all the more difficult to be original. Consider being original in a room of 2 people compared with a room of 1,000 people: the more people and the more content, the more difficult it is to create something original. So what does it really mean for a piece of work to be truly original, or a student's own work?

The challenge of originality and students' own work relates to our choice of coursework as a proxy for learning. It isn't necessarily the best method of measuring learning, but it is convenient and scalable, allowing for easy standardisation and moderation to ensure equality across schools all over the world. It is easy to look at ten pieces of work and ensure they have been marked fairly and in a similar fashion; having been a moderator myself, this was part of my job, visiting schools and carrying out moderation of coursework in relation to IT qualifications. If, however, generative AI means that submitted content is no longer suitable evidence of student learning, maybe we need to look at the process students go through in creating their coursework. This, however, has its own challenges, in terms of how we would record our assessment of process and how we would standardise or moderate this across schools.

Questions

I don't have solutions to the concerns or challenges I have outlined; however, the purpose of my session was to stimulate some thought and to pose some questions to consider. The key questions I posed during the first part of my session were:

  1. Do we need an annual series of terminal exams?
  2. Does there need to be [such] a limited number of routes through formal education?
  3. Why are courses 2+ years long?
  4. Should we assess the process rather than product [in relation to coursework]?
  5. How can we assess the process in an internationally scalable form?

These are all pretty broad questions; however, as we start to explore the impact of AI in education I think we need to look broadly to the future. In terms of technology, the future has a tendency to come upon us quickly due to rapid technological advancement and change, while education tends to be slow to adapt. The sooner we seek to answer these broad questions, or at least think about them, the better.

Autumn term blues

We are now in the second half of the autumn term and I can't believe where the time has gone. We had the usual build-up ahead of the start of the new academic year, followed by the unsurprisingly manic start of term. The start of term in schools and colleges is normally manic, as new students and staff join and everyone tries to get quickly back up to speed following the summer break, trying to establish the positive habits which should underpin the year ahead. For me, the first half of this year's autumn term was made all the busier by a number of events which I had agreed to attend or contribute to, such as a couple of industry cyber security events and speaking engagements in Leeds, London and Amsterdam. Each of these events was really useful, however the travel and preparation work related to them added to the stress and pressure. It's worthwhile, and I certainly took much from each of the events, the ANME/Elementary Technology AI and EduTech Europe events in particular, but it isn't half tiring.

It was therefore no surprise that I reached half term feeling very drained and run down, with quite a bit to catch up on before the planned period of rest towards the end of the break. And this is where sod's law kicks in. Just as I get the time to regroup and rest, illness rears its head. Why is it that just when you get time to enjoy yourself and relax, you end up ill? Now, I suspect part of the answer is that, when busy, adrenaline carries you through and keeps you going; however, as soon as you see the light at the end of the tunnel, as soon as you take your foot off the gas and your body and mind relax a little, the bugs, the viruses and the general malaise set in. And so it was that I spent a fair amount of the half-term period working, as we IT people need to do in school holiday periods, while feeling less than 100%. When I did get a few days off to relax, the time was largely spent in bed or crashed out in front of the TV, with little energy and a persistent cough.

Before I knew it, the second half of the term had begun and the opportunity to spend some proper time on wellbeing and mental health had passed me by. So, with the second half of the term now fully in swing, it is once again time to put the foot to the floor and proceed towards Christmas (bah humbug 😉). At this point I still don't quite feel 100%, but I am definitely better than I was during half term, and for now I hope I can get to Christmas and pass into the festive holiday period without any further illness. But only time will tell.

The challenge we all have is in accepting that life and work are not linear. There will be periods where things are manic and busy, and where mental health and wellbeing take second or maybe third place; equally, we need to seek a balance, which means there will be times when mental health and wellbeing come first, even at the expense of other things. For me, the manic autumn term just means I need to ensure I put time aside for myself, either at Christmas or at some point in the spring or summer terms, putting myself first over other pressures.

Onwards and upwards, as they say. And let me share an important message with all my colleagues in schools and colleges: make sure to look after yourself, as unless you are well, physically, mentally, cognitively, etc, you won't be able to effectively help, look after, teach or otherwise support others. Take care and good luck for what remains of the autumn term!