AI Development: Embracing Transparency in Machine Learning Applications
Sep 26, 2023
Saki Lagopoulos
A previous blog post argued that ensuring fairness in AI requires investing in transparency. In this article, we look at the challenges of building modern applications with transparency in mind by comparing a traditional application with an AI one.
Traditional vs. AI algorithms
Traditionally, algorithms are designed by engineers as a series of steps and explicit instructions that solve a problem. Those steps and instructions are then translated into code by developers, and stakeholders subsequently decide whether an application is fit for deployment. From design to production, and maybe even to the end user, the algorithm’s design and logic can travel easily in a transparent and understandable way.
However, the growth of artificial intelligence (AI) and machine learning (ML) has produced systems that make decisions largely on their own, not by following explicit instructions but by learning from patterns and relationships in data. The complexity and often black-box nature of such systems have created new challenges for the transparency and interpretability of modern computer systems.
Why is there a transparency issue with machine learning systems?
By comparing two movie review categorization systems, a traditional one and a machine learning one, we can easily pinpoint the inherent transparency problem of the latter. A simple traditional system categorizes a review as positive or negative based on a list of predefined, hard-coded words. For example, words such as funny, exciting, and moving indicate a positive review, while words such as boring, disappointing, and incoherent indicate a negative one. Such a simple rule makes the system transparent and straightforward.
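As a rough illustration, the traditional system described above fits in a few lines of Python. This is only a sketch: the word lists reuse the examples from the text, while the function name `classify_review` and the tie-breaking rule are assumptions made for the example.

```python
import string

# Hard-coded word lists, taken from the examples in the text.
POSITIVE_WORDS = {"funny", "exciting", "moving"}
NEGATIVE_WORDS = {"boring", "disappointing", "incoherent"}


def classify_review(text: str) -> str:
    """Label a review by counting hard-coded positive and negative words."""
    # Lowercase and strip punctuation so that "Funny!" matches "funny".
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    words = cleaned.split()
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    # Ties fall back to "negative"; this is an arbitrary choice for the sketch.
    return "positive" if positive_hits > negative_hits else "negative"


print(classify_review("A funny and moving film!"))
print(classify_review("Boring, incoherent plot."))
```

Every decision can be traced back to a word in one of the two lists, which is exactly what makes this kind of system easy to audit.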
On the other hand, an ML system reaches a decision through a series of steps that are often hard to grasp and interpret. A simplified version of such a system first removes punctuation and tokenizes the input text in a preprocessing step, then transforms the words into numerical vectors based on their frequency and semantic relationships, and finally assigns weights to those vectors to produce a prediction (positive or negative review) based on the data the model was trained on. Even this simplified version is far more complex than a traditional word-based categorization system. The initial text is transformed and a decision is reached not through explicitly coded instructions but through patterns learned from data, which makes it challenging for designers, developers, and stakeholders to interpret how the model reaches a conclusion. In that sense, machine learning systems inherently lack transparency. However, such systems have often proven to be more accurate.
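The simplified pipeline described above can be sketched in plain Python. The sketch makes several assumptions: the four training reviews are made up, the vectors use word frequency only (no semantic relationships), and the weights are learned with a basic logistic regression trained by gradient descent. All function names are hypothetical.

```python
import math
import string
from collections import Counter

# Tiny made-up training set (1 = positive, 0 = negative); illustrative only.
TRAIN = [
    ("funny and moving story", 1),
    ("exciting plot and great cast", 1),
    ("boring and incoherent story", 0),
    ("disappointing plot and dull cast", 0),
]


def tokenize(text):
    """Preprocessing: lowercase, remove punctuation, split into words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


# The vocabulary fixes the position of each training word in the vector.
VOCAB = sorted({word for text, _ in TRAIN for word in tokenize(text)})


def vectorize(text):
    """Turn text into word-frequency counts over the vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in VOCAB]


def predict_proba(weights, bias, vector):
    """Weighted sum of the vector, squashed to a probability of 'positive'."""
    score = bias + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-score))


def train(epochs=200, learning_rate=0.5):
    """Learn the weights from the data with full-batch gradient descent."""
    weights, bias = [0.0] * len(VOCAB), 0.0
    examples = [(vectorize(text), label) for text, label in TRAIN]
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(VOCAB), 0.0
        for vector, label in examples:
            error = predict_proba(weights, bias, vector) - label
            grad_w = [g + error * x for g, x in zip(grad_w, vector)]
            grad_b += error
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


WEIGHTS, BIAS = train()
print(predict_proba(WEIGHTS, BIAS, vectorize("A funny story.")))
```

Nothing in this code states which words are positive or negative. That knowledge exists only in the learned numbers in `WEIGHTS`, which is why the reasoning is harder to read off than in a rule-based system.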
In the insurance world, transparency is often overlooked during the transition to a machine learning application. This is because traditional fraud detection systems rely primarily on rules derived from the experience and expertise of claim handlers, which makes them explainable out of the box. However, algorithmic transparency is of utmost importance, as it helps ensure fairness, avoid bias, and catch errors. Moreover, regulators mandate transparent and easily auditable algorithms.
Inevitably, this raises the question of how a machine learning application can become transparent without reducing its predictive performance.
How to overcome the transparency issue within machine learning?
Currently, researchers and engineers put a lot of effort into developing methods for making machine learning applications more interpretable and comprehensible. Lately, the most popular of those methods are model-agnostic. A model-agnostic method can be applied to any type of predictive model without relying on the model’s internal architecture.
Looking back at our movie review example, we can apply one of the most popular model-agnostic techniques for interpreting machine learning models, called SHAP (SHapley Additive exPlanations). A simplified version of how SHAP can make our movie review application more transparent and interpretable is described in the following steps:
Hide a random word from the movie review text
Make a prediction using the machine learning model
Keep track of the decisions the model made
Repeat the process multiple times by randomly hiding words or phrases
Combine all the predictions by checking which words had the most impact on the prediction
By following the above steps we end up with a list of words and phrases and their impact on the final prediction, which together explain how the machine learning model reached its decision. The same steps can be applied to any kind of model and any type of data: for tabular data the words become columns, and for images the words become pixels.
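The steps above can be sketched as follows. This is a simplified illustration of the idea, not the algorithm used by the SHAP library, and it does not compute exact Shapley values. `toy_model` is a hypothetical stand-in for a trained model, and `word_impacts` is an assumed helper name. Because the function treats the model as an opaque `predict` callable, it stays model-agnostic.

```python
import random
from collections import defaultdict


def toy_model(words):
    """Hypothetical stand-in for a trained model: returns P(positive)."""
    score = 0.5 + 0.2 * words.count("funny") - 0.3 * words.count("boring")
    return min(1.0, max(0.0, score))


def word_impacts(predict, text, rounds=200, seed=0):
    """Estimate each word's impact by randomly hiding words and re-predicting."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    words = text.lower().split()
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(rounds):
        # Randomly hide words: each position stays visible with probability 0.5.
        visible = [rng.random() < 0.5 for _ in words]
        for i, word in enumerate(words):
            # Predict with and without word i while the other words stay fixed.
            with_word = [w for j, w in enumerate(words) if visible[j] or j == i]
            without_word = [w for j, w in enumerate(words) if visible[j] and j != i]
            totals[word] += predict(with_word) - predict(without_word)
            counts[word] += 1
    # Combine: average change in the prediction attributable to each word.
    return {word: totals[word] / counts[word] for word in totals}


impacts = word_impacts(toy_model, "a funny but boring film")
for word, impact in sorted(impacts.items(), key=lambda item: item[1]):
    print(f"{word:>8}: {impact:+.2f}")
```

Words with a positive value pushed the prediction toward "positive", words with a negative value pushed it toward "negative", and values near zero mark words the model effectively ignored.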
Embracing transparency in AI systems
As the world of artificial intelligence and machine learning continues to advance, the need for transparency in these technologies becomes increasingly evident. While traditional algorithms offer a clear and straightforward path from design to deployment, modern machine learning systems often operate as enigmatic black boxes, making it challenging for stakeholders to understand how decisions are reached.
Yet, the accuracy and effectiveness of these systems cannot be denied. The solution lies in embracing model-agnostic techniques like SHAP (SHapley Additive exPlanations), which allow us to demystify machine learning models and make them more transparent without compromising their predictive performance. In our trust automation platform, this is at the core of all AI-powered systems.
As we navigate this journey towards transparency, we ensure fairness, mitigate bias, and uphold the standards set by regulators. In the ever-evolving landscape of AI development, transparency isn't just a choice – it's a necessity. It's a commitment to building a future where technology serves us with clarity and integrity.