Initiative by IMDA, AI Verify Foundation tests AI accuracy, trustworthiness in real-world scenarios

SINGAPORE – Doctors at Changi General Hospital (CGH) are testing the use of generative artificial intelligence (GenAI) to summarise medical reports and provide recommendations on clinical surveillance.
But are these recommendations accurate?
Meanwhile, regulatory technology firm Tookitaki uses GenAI to investigate potential money laundering and fraud cases.
Are its findings trustworthy?
Earlier in 2025, the Infocomm Media Development Authority (IMDA) and the AI Verify Foundation rolled out an initiative focused on real-world uses of GenAI to encourage the safe adoption of AI across various industries.
The AI Verify Foundation is a not-for-profit subsidiary of IMDA that tackles pressing issues arising from AI.
Between March and May, 17 organisations across 10 sectors – including human resources, healthcare and finance – had their GenAI applications assessed by specialist GenAI testing firms.
The findings were published on May 29, underscoring Singapore’s commitment to spearheading the development of global standards for the safe deployment of GenAI apps.
The Global AI Assurance Pilot, as the initiative is called, has allowed organisations to see how their GenAI applications perform under practical conditions, said Senior Minister of State for Digital Development and Information Tan Kiat How on May 29.
He was speaking on the last day of the Asia Tech x Singapore conference, held at Capella Singapore.
Clinical Associate Professor Chow Weien, chief data and digital officer at CGH, told The Straits Times that taking part in the initiative helped the hospital design a more robust and reliable way of testing its AI models.
“For example, we could assess whether our GenAI application was extracting the clinical information accurately from the doctor’s colonoscopy report, and if the application was providing the correct recommendation, in line with the clinical guidelines,” he said.
Tookitaki founder and chief executive Abhishek Chatterjee told ST the experience helped make the firm’s AI model more auditable and allowed the company to incorporate guardrails against AI hallucinations.
These are inaccurate or nonsensical results generated due to factors such as insufficient training data.
While earlier initiatives had focused on the testing of AI models, the Global AI Assurance Pilot aimed to test the reliability of GenAI in real-world scenarios, said AI Verify Foundation executive director Shameek Kundu.
This is important as the information fed to AI can be flawed, he said, giving the example of poor-quality scans from a patient provided to a hospital’s AI.
The aim is to make the use of GenAI “boring and predictable”, to ensure the technology’s reliability for day-to-day use, he said.
In a statement, IMDA and AI Verify Foundation said the initiative also showed that human experts were essential at every stage of testing, from designing the right tests to interpreting test results.
While the technology may improve in the future, a human touch is still needed for now, said Mr Kundu.
“The technology is not good enough for us to blindly trust and say it’s working,” he said.
A report detailing the findings is available on AI Verify Foundation’s website.
Building on the pilot, a testing starter kit for GenAI applications has also been developed, serving as a set of voluntary guidelines for businesses that want to adopt GenAI responsibly.
“It draws on insights from the Global AI Assurance Pilot, tapping the experience of practitioners to ensure the guidance is practical and useful,” said Mr Tan.
He added that the kit includes emerging best practices and methodologies for testing GenAI applications, as well as practical guidance on how to conduct such testing.
The guidelines will be complemented by testing tools to help developers conduct these tests. The tools will be made progressively available via IMDA and AI Verify Foundation’s Project Moonshot, a toolkit targeted at AI app developers.
IMDA is conducting a four-week public consultation on the starter kit, which can be found online. The consultation will end on June 25.
Feedback can be e-mailed to aigov@imda.gov.sg with the e-mail header “Comments on the draft Starter Kit for Safety Testing of LLM-Based Applications”.
Mr Tan also announced that AI Singapore (AISG) – a national initiative to build the Republic’s capabilities in AI – will sign a memorandum of understanding with the United Nations Development Programme (UNDP) to advance AI literacy in developing countries.
This partnership will see AISG’s AI for Good programme, launched in 2024 to bolster national AI capabilities, expand to an international scale, he said.
“AISG and UNDP will explore initial AI for Good pilots in South-east Asia, the Caribbean and the Pacific Islands, so that we can support more inclusive participation in AI-driven growth together,” he added.
This article was first published in The Straits Times. Permission required for reproduction.