From YouTube: Social Cybersecurity WG: Targeted Knowledge Infusion To Make Conversational AI Explainable and Safe
Description
Presenter: Dr. Manas Gaur
Institution: University of Maryland, Baltimore County
A: Welcome, everyone, to the Social Cybersecurity Working Group. This is the first meeting of 2023, so although it's a month late, Happy New Year. I hope everyone is enjoying their 2023 so far, and I wish health, prosperity, productivity, and all the best for the rest of the year to everyone. Today we are joined by Professor Manas Gaur, who is doing exceptional work in knowledge graphs and knowledge infusion.

In a moment we will hear from Professor Gaur, but before we give him the floor, I would like to introduce everyone who is attending, starting with me. I'm Dr. Nitin Agarwal, Maulden-Entergy Chair and Distinguished Professor of Information Science, and also the founding director of the COSMOS Research Center here in Arkansas. I'm also a co-chair of the Social Cybersecurity Working Group, which operates under NSF auspices at the South Big Data Hub.

We are really grateful to Chris and the entire South Big Data Hub team for helping us coordinate these meetings. Now, we see a lot of participants here, so what we typically do is go around the room and ask individual folks to introduce themselves. I will call out your name, and in one breath, please introduce yourself. We'll start with Bala.
B:

A: Good to have you. Garima, next.

E:

F: Hello, everyone. I'm a research assistant, a graduate assistant, at the COSMOS lab, and I'm currently doing my Master's at UA Little Rock.

C:

B: Good morning, everyone. I'm a research assistant with COSMOS and a Master's student as well at the University of Arkansas at Little Rock. Thank you.

A: Perhaps Richard has issues with his microphone; we will come back to you, Richard. We have Vanessa next.
G:

A: Good to have you, Stephen. All right, I think we have gone through everyone. Richard, if your microphone is still giving you trouble, you can tell us about yourself in the chat; I will be monitoring the chat for the rest of the session. So, it is my distinct pleasure to introduce Professor Manas Gaur, who is going to talk to us about targeted knowledge infusion to make conversational AI explainable and safe. Manas is an assistant professor in the Computer Science and Electrical Engineering Department at the University of Maryland, Baltimore County.

Before UMBC, he held the position of Senior AI Researcher with the Knowledge and Dialogue team within the AI Center at Samsung Research America. He completed his PhD in computer science at the Artificial Intelligence Institute at the University of South Carolina. His PhD research was supported by the Eric and Wendy Schmidt Data Science for Social Good Fellowship, an AI for Social Good Fellowship from Dataminr Inc., an EPSRC-UKRI grant through the Alan Turing Institute, and an NSF EAGER grant on knowledge-infused learning. His most noted work, on knowledge-infused learning, parallels neurosymbolic AI in mental health.
H: Thank you, Nitin, and thank you, Chris, for inviting me. I want to share my screen. Yes, I think my screen is visible.

A:

H: Awesome. Thank you all for being here. The talk I want to give today is also part of the AAAI New Faculty talk that I will be giving at the AAAI conference in Washington, DC this year, and it is essentially the work I will be presenting for the South Big Data Hub working group workshop. It follows an idea I have been working on for the past couple of years, and it includes a lot of the data and information on conversational AI. Most importantly, I want to show you all why we need knowledge infusion to make conversational AI explainable and safe.
H: There has been a lot of discussion recently, starting with the first workshop on safety in conversational AI organized by Microsoft and Facebook, and over time a need has emerged for explanations and safety when you are dealing with conversational agents, because they are AI-based and they are automated. What you are giving to the model is what you are getting out of the model. So I wanted to make this talk more specific to the sensitive domains of AI, where AI adoption is really of critical concern. There has been some work along that line, which I will discuss, and it motivates today's presentation on knowledge infusion and how targeted knowledge infusion can occur in AI.
H: Let's start with a very general scenario that enlightened me last year, when I was interacting with DALL-E. I gave a simple query, one that is very easy to understand for all the folks in machine learning and artificial intelligence: I'm actually looking for an architecture that has data extraction, model training, grid search, and cross-validation as components. But the AI sees it differently.

There are some gender-related issues in the output, but I don't want to go into the details of that, because that's a different turf. Essentially, what we are looking at is the segmentation of this entire query into a bunch of words, which has a drastic consequence, as it loses the semantics behind the entire query that was given to the model.
H: On the other hand, if I just do a simple Google search, which has been there for almost a decade now, it uses the Google Knowledge Graph. So, intrinsically, you get your desired response simply from the query, and the desired architectures appear in front of you, which is what you really need.

What if a model, let's take DALL-E, had a retrieval engine that could retrieve sensible images and utilize those images to come up with a decent response? Then such a generation would not have happened. So there is a potential for a retrieval-based mechanism to work together with an automated generation mechanism to improve the capability of generation.
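A minimal sketch of that retrieval-plus-generation idea, assuming the sentence-transformers and transformers libraries; the toy corpus, the checkpoint names, and the prompt format are illustrative assumptions, not the systems discussed in the talk.

```python
# Sketch: retrieval-augmented generation. The corpus, checkpoints, and
# prompt format below are illustrative assumptions only.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

corpus = [
    "An ML pipeline with data extraction, model training, grid search, and cross-validation.",
    "A web architecture with a load balancer, app servers, and a database.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = embedder.encode(corpus, convert_to_tensor=True)

def retrieve(query, k=1):
    """Return the k corpus entries most similar to the query."""
    q_emb = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, corpus_emb, top_k=k)[0]
    return [corpus[h["corpus_id"]] for h in hits]

generator = pipeline("text2text-generation", model="google/flan-t5-base")

query = "an architecture with data extraction, model training, grid search, and cross-validation"
context = " ".join(retrieve(query))
# The generator is grounded on retrieved text instead of free-running.
prompt = f"Context: {context}\nDescribe the requested architecture: {query}"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```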
H: But there are a lot of improvements that are required, and that's the very reason that ChatGPT has reinforcement learning with human feedback in the loop: it tries to take the human feedback and improve upon it. But essentially, that human feedback may or may not be correct.

That was just one example. What if I take a very simple example from mental health, but a not-so-sensitive case compared to the previous example? What if I say, "I am feeling tired"? That is pretty normal: after days of work, you go home and you are feeling tired. But what if I add another concept to it: "it's been a week since I have slept"?
H: The focus of the model is definitely on the sleep, and on whatever the sentence starts with and the information built on it, because it is a next-word prediction model: whatever it starts with, it has to follow in that sentence. So if it starts with sleep, the rest of the information will be catered to the sleep. Whereas if you look at this query, it is essentially pointing at terms like fatigue, exhaustion, and probably sleep disturbances, and I can do a simple Google search with such a query.

Google straight away points me to "a few bad nights" or insomnia as the topmost response, or related information I can get from Google Search. Now the question is: what is happening with the generation? Why is the generation happening in such an uncontrolled manner, and can it be controlled?
H: Those are the questions that drive our initiative towards explainability and safety. Along this line, DeepMind also analyzed such a scenario and introduced Sparrow, which is a decent, well-working conversational agent that talks to a human and answers questions using live Google search, something we were hypothesizing about on the previous slide. Essentially, they also use reinforcement learning, which is also part of ChatGPT. But what they have done that is very ingenious is that they guide the conversation with 23 rules.

These rules describe the safety mechanism of DeepMind's conversational bot, and the point that actually drove my research last year, probably the fall of last year, was this quote by Geoffrey Irving: dialogue is a good way to introduce safety in AI, because through dialogue the AI can understand whether it is going in the right direction and whether the generation is acceptable to the human or not.
H: So that's a good deduction to make: conversational models can be made AI-safe if we are able to utilize conversational datasets carefully crafted to make the generation safe. But even with Sparrow out there, the generation component is still a parametric memory, and it has to be focused on.

It still needs to look at the concepts within the query and not generate responses which are predefined as templates in a conversational agent, for instance, in this case, "I'm sorry to hear that" or "I can help you with that." Those have no connection to the severity.
H: ChatGPT does do a good job of connecting such a query to helplines so that the person gets help, and that is a step towards safety. But it is also required to walk a person through a conversation piece while passively activating the crisis lines, the crisis calls. Those things are actually part of safe conversations, because for a person who is in that kind of self-harm situation, giving the helpline definition alone would not be sufficient to complete the job.

That is what GPT-3, the precursor of ChatGPT, used to do. Hallucinating, by definition, means that the generated response deviates significantly from the subject matter and is often unreasonable, because if the response keeps changing, that means you are changing your definition, your perspective, or your understanding of that question. That's where we actually step into the concept of explainable AI, and why we need explainable AI as a core component within artificial intelligence systems. In the 1970s, Weick's sensemaking theory actually defined what an explanation means.
H: It says that explanations are always human-centered sentences which make sense to a human expert. They have never been some kind of heat map, some kind of weight matrix, or an importance score; rather, they are deliberate, human-centered sentences, or something of that sort, that make sense to a human expert.

Alternatively, if we look at this definition from the perspective of AI, can I say that explanations from AI are traces of attention? We are looking at attention models nowadays, and all the models we are working on are Transformer-based models. So we say that explanations from AI are traces of attention, which are essentially collective experiences that the model gathers from the training data.
H: Now the task we are entitled to is this: how can this AI model, which is gathering collective experience over training, connect to real-world entities and actionable definitions? If that can be done, then you are able to introduce an explainability component within the AI's generation behavior.

Why did we follow this process? What are we getting? The purpose of this approach is that now the AI's analysis can be understood easily by domain experts, and the only goal of introducing this AI explainability is that we do not want the model to generate hallucinating outcomes. Hallucinating means that the generated behavior of the AI is drifting away from the human-desired functionality: over time, we keep getting different generated responses.
H: That's where the study of explanations starts. Over the last few years there has been work on explanations, but from the system side. The very popular ways of explaining something are LIME and SHAP, which were the prominent explanation-based systems used in machine learning, and this is a schematic of how LIME and SHAP work.

What is interesting to observe is that knowledge can be a part of LIME and SHAP, as they can always introduce new features into the model. And, interestingly, you need a surrogate model to verify a black-box model, so your original black-box model is still a black box.
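As a concrete illustration of the surrogate idea, here is a minimal sketch using the lime package to fit a local linear surrogate around one prediction of a black-box text classifier; the toy data and labels are illustrative assumptions.

```python
# Sketch: a LIME surrogate explanation for a black-box text classifier.
# The toy training data and labels are illustrative assumptions.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["I have not slept for a week", "I feel great today",
         "I am exhausted and cannot focus", "Had a wonderful walk outside"]
labels = [1, 0, 1, 0]  # 1 = concerning, 0 = not concerning (toy labels)

# The "black box" whose behavior we want to explain locally.
black_box = make_pipeline(TfidfVectorizer(), LogisticRegression())
black_box.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["ok", "concerning"])
# LIME perturbs the input, queries the black box, and fits a local
# linear surrogate; the surrogate's weights serve as the explanation.
exp = explainer.explain_instance("I am tired and have not slept",
                                 black_box.predict_proba, num_features=4)
print(exp.as_list())  # [(word, local linear weight), ...]
```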
H: You are just using a surrogate model, a linear model, which is interpretable and explainable, and you are using it to explain what the black-box model is doing. Still, it does not establish the connection between the learned features of the model and real-world entities. That's where even Cynthia Rudin's 2019 paper argues that we need to build models that are inherently explainable, and that's where you can actually introduce the explainability aspect within the model natively.

So, if I want to turn LIME and SHAP into something inherently explainable, I would define a scenario where I would like knowledge to be part of the interpretable features. I want the knowledge to be able to map any input data to those interpretable features, and I make them part of my model.
H: One experiment that I conducted in 2019, something I was able to relate to Weick's theory, was to define user-level explanations. If I have an input text, I am able to generate some kind of attention maps, which is pretty much doable nowadays with Python libraries. What we are trying to do here is take the heat maps of the model and say: these maps are essentially concepts; they are words.

So we say: let's take a database that has some interlinked entities, and let's run queries over the database using these words. Essentially, we are trying to derive a trace of all the entities that are traversed over the entire database and see what those traces are. Is there any commonality in the traces, and can we construct a tree? Now the question arises: where do we stop?

We stop heuristically. What I thought, and it still needs a lot of improvement, was that maybe the prediction can be connected to a node in the knowledge base, or in the database, which has some similarity with the prediction; and if we reach that particular node, we can stop the trace. Then we want to take those traces and explore whether we are able to reach labels close to the predictions.
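A small sketch of that tracing idea, assuming a toy interlinked knowledge graph in networkx; the graph contents, the highlighted words, and the similarity test are all illustrative assumptions.

```python
# Sketch: trace attention-highlighted concepts through a small knowledge
# graph, stopping at a node close to the model's prediction. The graph,
# highlighted words, and similarity test are illustrative assumptions.
import networkx as nx

kg = nx.Graph()
kg.add_edges_from([
    ("tired", "fatigue"), ("fatigue", "exhaustion"),
    ("sleep", "insomnia"), ("insomnia", "sleep disturbance"),
    ("fatigue", "depression"), ("sleep disturbance", "depression"),
])

highlighted = ["tired", "sleep"]   # words with high attention weight
prediction = "depression"          # the classifier's output label

def similar(node, label):
    # Stand-in for a real embedding-similarity test between node and label.
    return node == label

for word in highlighted:
    if word not in kg or not nx.has_path(kg, word, prediction):
        continue
    trace = []
    # Walk from the highlighted concept toward the prediction; stop as
    # soon as a node sufficiently similar to the predicted label appears.
    for node in nx.shortest_path(kg, source=word, target=prediction):
        trace.append(node)
        if similar(node, prediction):
            break
    print(word, "->", " -> ".join(trace))
```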
H: So this was one of the first works on explanation, on how we introduce user-level explanations according to Weick's theory, and even to the definitions that Dr. Rudin mentioned.

This makes me feel that there is some kind of vertical we are looking into. We have generative conversational AI at the top, I'm able to introduce some good-enough foundational explanations, and I can express them through user-level explainability. Can I make these explanations, via this classification, an intrinsic component of an AI, so that conversational models can actually benefit from such a classification?
H: If I'm doing a classification, and I can say the classification is pretty good and very decent and I can explain it, then probably the generation would be explainable, because it has some synergy, some connection, with what you have done in the classification. But how do we complete this particular process? What is a core component that we can start with to have a generative, conversational AI that is explainable and safe?

That's the work which further drove me into the mental health aspect: trying to look at the various ways agents have been developed over time, how they have been interacting, what types of generations they produce, and whether there is a way to confirm that a generation is right or wrong.
H: For instance, take this conversation, which is pretty much about nervousness, a common thing these days among college and school-going students. The generations from such a model are pretty risky. I'm saying risky because there is no connection against which to verify whether such a generation is of worth or not. I'm now turning this entire generation problem into a classification problem: I'm introducing a classification component within the generative AI.

It tells the AI that if the generated question does not match some safety guidelines, something very similar to Sparrow, if it has no connection with a safety lexicon, or the generated questions do not match some existing questionnaires which are clinically approved and safe, then potentially the generated outcomes are not safe. And can we revise these questions by taking the help of those questionnaires? So there are two problems here.
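Before unpacking the two problems, a hedged sketch of the matching check just described: compare a generated question against clinically approved questions with embedding similarity and flag it when nothing matches. The paraphrased questionnaire items, the encoder checkpoint, and the 0.5 threshold are illustrative assumptions.

```python
# Sketch: flag a generated question as potentially unsafe when it is not
# similar to any clinically approved question. The paraphrased items and
# the threshold are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

approved = [
    "Little interest or pleasure in doing things?",
    "Feeling down, depressed, or hopeless?",
    "Trouble falling or staying asleep, or sleeping too much?",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
approved_emb = model.encode(approved, convert_to_tensor=True)

def passes_safety_check(question, threshold=0.5):
    """True when the question is close to at least one approved item."""
    q_emb = model.encode(question, convert_to_tensor=True)
    return util.cos_sim(q_emb, approved_emb).max().item() >= threshold

print(passes_safety_check("How has your sleep been lately?"))
print(passes_safety_check("Why don't you just cheer up?"))
```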
H: One problem is a classification problem. The other is that, with the help of the classification, I am generating a new sentence, a new question. There are some simple and effective strategies for achieving such behavior. The first is to use external knowledge to completely transform the initial data: make your initial data very safe, so that any undesirable behavior is minimized to the largest extent. Another possibility is to develop some auxiliary tasks to confirm the natural language understanding capabilities of your model.

That's what we define as auxiliary tasks. Another point: how about tagging the data? We all know that deep neural networks actually learn via special tags, special tokens. Can I introduce some new tags, saying that this part of the sentence is knowledge, this part of the sentence is the query, this part is about my personal profile or my personal concerns, and this is the response? That means the model should look at these tags and should give its response according to the tags.
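A minimal sketch of that tagging strategy with Hugging Face, marking knowledge, query, profile, and response spans via new special tokens; the tag names and the base checkpoint are illustrative assumptions.

```python
# Sketch: teach a seq2seq model new segment tags so it can distinguish
# knowledge, query, profile, and response. Tag names and the base
# checkpoint are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

tags = ["<knowledge>", "<query>", "<profile>", "<response>"]
tokenizer.add_special_tokens({"additional_special_tokens": tags})
model.resize_token_embeddings(len(tokenizer))  # make room for the new tags

example = ("<knowledge> Insomnia often co-occurs with fatigue. "
           "<profile> User reports a week without sleep. "
           "<query> I am feeling tired. <response>")
ids = tokenizer(example, return_tensors="pt").input_ids
# During fine-tuning, the model learns to condition its generation on the
# tagged segments rather than on an undifferentiated stream of words.
```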
H: Another such strategy: how about using keyphrase extraction or keyphrase generation approaches? Given a bunch of words, I create a phrase out of those words, training my model in an unsupervised fashion to generate some keyphrases, which can be n-grams, five-grams, and topic models all collapsed together in one system. That can be defined as a keyphrase extraction or generation approach.
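For instance, a minimal keyphrase-extraction sketch with the KeyBERT library; the library choice and parameters are assumptions for illustration, not what the talk used.

```python
# Sketch: unsupervised keyphrase extraction over n-grams. The KeyBERT
# library and its parameters are illustrative choices.
from keybert import KeyBERT

doc = ("I am feeling tired and it has been a week since I have slept; "
       "I cannot concentrate on reading or watching television.")

kw_model = KeyBERT()
phrases = kw_model.extract_keywords(
    doc,
    keyphrase_ngram_range=(1, 3),  # unigrams up to trigrams
    stop_words="english",
    top_n=5,
)
print(phrases)  # [(phrase, relevance score), ...]
```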
H: Let the user, let the human, tell me whether a generation is right or wrong, treat that as a reward, and improve upon it. Now, these rewards can either be maintained as a system, as ChatGPT does, or you can make them intrinsic to the model by introducing some reward functions. If I say rewards, how can I construct rewards? For instance, take this query: let's say, "bothered by trouble concentrating while reading the newspaper or watching television."

On the left is a generation by a T5 model, which is a legitimate language model, not with the capability of GPT-3, but with good enough capability to execute on GPUs. On the right is what you can see as a bunch of rewards that I constructed: natural language inference is one reward; a syntactic score, which I can construct as a metric that calculates syntactic scores; and there is another metric that calculates semantic scores.

Can I use these metrics to compute a reward? This reward is independent of your loss function, but it is what tells me whether the model's generation is moving toward the human-desired behavior or not. We look at these questions, and then this behavior tells me which set of questions needs to be generated and which set of questions can be ignored.
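A hedged sketch of such a composite reward, combining an NLI consistency score with a semantic-similarity score; the checkpoint names, the assumed NLI label order, the weights, and the stand-in syntactic score are all illustrative assumptions.

```python
# Sketch: a reward, computed outside the loss, that scores a generated
# question against the user's context. Checkpoints, weights, label order,
# and the stand-in syntactic score are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer, util, CrossEncoder

semantic = SentenceTransformer("all-MiniLM-L6-v2")
nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")

def reward(context, generated):
    # Semantic score: does the question stay on topic?
    sem = util.cos_sim(semantic.encode(context, convert_to_tensor=True),
                       semantic.encode(generated, convert_to_tensor=True)).item()
    # NLI score: softmax the logits; assumed label order for this
    # checkpoint is [contradiction, entailment, neutral].
    logits = nli.predict([(context, generated)])[0]
    probs = np.exp(logits) / np.exp(logits).sum()
    entail = float(probs[1])
    # Stand-in "syntactic" score: penalize non-questions (illustrative).
    syn = 1.0 if generated.strip().endswith("?") else 0.0
    return 0.4 * sem + 0.4 * entail + 0.2 * syn

ctx = "I have been bothered by trouble concentrating while reading the newspaper."
print(reward(ctx, "Do you also have trouble sleeping?"))
```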
H: So now, with all of this information available, how do we include it inside the model? We have talked about some of the strategies, and we have talked about the knowledge that is required. Now, how do we include it? I have given you some high-level information; let's dive into it with more concrete examples of how this information can take its place inside NLP. That's where the paradigm I worked on as part of my PhD comes into the picture: knowledge-infused learning.

How do we include this knowledge as a wholesome part of your AI learning strategies? That forms the very start of this upside-down pyramid.

Knowledge-infused learning is a method, or a class of methods, that involves the incorporation of the broader forms of knowledge that we have into AI formulations, so that they make the model's interpretations better: you can interpret the model, and the generations or explanations you are getting are at the user level.
H: You are able to match it to the user's expectations, and the user can comprehend it. Let's take an example scenario. On the left is a conversation piece that we were running through a lot of conversational agents. We took this conversation piece from Reddit, from the depression subreddit, which is a pretty huge community.

The generation on the left, as you see, is by people who are anonymously responding to this person, and if you look at the slide very closely, the questions being asked have never been part of the topics or the concepts within the query. That means the people who are asking the questions are actually reading the content, comprehending it, and then asking follow-up questions.

They are not asking questions about topics or terms that are already present in the query or in the user's post. Whereas if you look at an agent trying to work on it, in this case T5, we generate rather redundant questions, because such questions have already been answered by the user.
H: Even if you fine-tune the model, the questions look really good: if I do fine-tuning, I'm making my model very focused on this particular data, on this particular post. So you will definitely get good questions, but are they safe? Are they good to be asked of the user? That's where we say the safety is not there, even though the questions, even after fine-tuning, are very specific to the context.

From this series of experiments, we found that the conversations, even after fine-tuning, are still unsafe; we are not able to find diagnostically relevant information in the questions; and, most importantly, if I run my model today and rerun it after a day, my generations are completely different. So the hallucination is definitely still in the process.
H: So we took this example and tried to look at it from the perspective of reinforcement learning. We took the same post, generated with T5, fine-tuned on the Reddit corpus, and the generations are still different: a good question, but still unsafe. So what we did was introduce a sort of reward. I talked about rewards on a previous slide, but this is the reward we were introducing.

We say that there is a BLEU score. The BLEU score is a long-established metric which says that if you are able to generate a sentence that is human-understandable, human-comprehensible, it will give a legitimately good score; but it cannot account for the safety aspect. What we are trying to do is make sure that the generated question matches some clinical guidelines, in this case the PHQ-9, which is the Patient Health Questionnaire.
H: I will show you what exactly these are, but let's say this is an elongated list of all the questions. If you have the whole list of questions, you want to compute the similarity, check whether it is greater than a particular threshold, count on that, and make sure the count is significantly larger than zero; and if it is not, then potentially I will give it the value minus one. So here I'm actually looking at how divergent or convergent my generation is with respect to the PHQ-9 questions.
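A sketch of that reward as described: count the generated questions whose best similarity to the questionnaire clears a threshold, and return +1 when the count is above zero, -1 otherwise. The encoder checkpoint, the paraphrased PHQ-9 items, and the 0.4 threshold are illustrative assumptions.

```python
# Sketch of the questionnaire-alignment reward: +1 when enough generated
# questions clear the similarity threshold, -1 otherwise. Encoder choice,
# item paraphrases, and numbers are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
phq9 = [
    "Little interest or pleasure in doing things?",
    "Feeling down, depressed, or hopeless?",
    "Trouble falling or staying asleep, or sleeping too much?",
    "Feeling tired or having little energy?",
]  # abbreviated paraphrases, for illustration

def questionnaire_reward(generated, threshold=0.4):
    gen_emb = model.encode(generated, convert_to_tensor=True)
    phq_emb = model.encode(phq9, convert_to_tensor=True)
    # For each generated question, its best match among the items.
    best = util.cos_sim(gen_emb, phq_emb).max(dim=1).values
    hits = int((best >= threshold).sum())
    return 1 if hits > 0 else -1

print(questionnaire_reward(["Have you been feeling tired lately?"]))
```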
H: So: whether the generated question matches the expectations of the human or not. We talked about retrieval earlier; retrieval is the part I started my presentation with. Now we have a generator, but along with the generator we have an evaluator framework as well, which checks whether the generations match the guidelines or not. This is a manifestation of the PHQ-9 questionnaire, which is a set of nine questions, and every question has a particular score.

We saw significant improvement at the lower threshold of 0.4. When we started to increase the threshold, the quality definitely degraded. It is not yet convincing enough to be a deployed system, but it is certainly work in progress along this line. What we found is that the PHQ-9-based questions can intrinsically be part of such a network, and can also be used as a foundation for the evaluation network.

That's where we asked: can we make this PHQ-9, which right now is externally part of a loss function, just computing a similarity, intrinsically part of the network, part of the decision-making of the model? That's where we introduced the concept of process knowledge, because clinical knowledge is always process-oriented. It is not unstructured; it is structured, and every sequence of questions is predetermined.
H: No clinician asks the second question first and then the first question; based on the scenario, all the questions are asked in a sequence. So what are the things we want to work on for such a scenario? The first is that we transform the data with a particular knowledge-guided question tag.

It is kind of like setting a threshold again, but with a more conscious involvement of the process knowledge, and that's where we stepped into the safety aspect of the model. The model is now intrinsically taking on the safety aspect, where we are introducing this notion of clinical guidelines intrinsically inside the model. Safety, by definition, has three features, according to what I have seen in the literature.

Robustness is one of the features of safety: it is defined as the AI model continuing to operate within the safe limits of its specifications. The specifications are clinical guidelines or questionnaires, and you are trying to assure that the AI model adheres to these guidelines; if it does, you are able to show explainability in the model.
H: The first thing I talked about was tagging, and this is what I meant. Tagging means that for every input in my text, I am now introducing new tags, saying this is Q2, this is Q3, this is Q9, so that the model knows which part of the content is associated with which question in the PHQ-9. But this is just a dataset; we want to enforce the suggested approach as part of the model as well.

This is just an illustration of the PHQ-9, but essentially what we did was introduce new cross-attention blocks. So now you see that there are nine cross-attention blocks, corresponding to the nine different questions. Now, one can ask me: if you have nine cross-attention blocks, how do the nine different blocks have sufficient knowledge?
H: That's where we utilize lexicons, existing knowledge that people have built over time in mental health, and we use them to populate the definitions of these blocks, giving them sufficient knowledge so that they can compute the attention scores deterministically. That's what we introduced as PHQ-9 intrinsically inside the model, as an auxiliary task. The only thing I am not doing is a separate binary classification for each question type, type 1, type 2, and so on and so forth; instead, I am intrinsically making these blocks part of the PHQ-9 questionnaire. And what interested me is that I can simply replace the nine questions with any other questionnaire that mental health practitioners use, and this network starts to work accordingly.
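A minimal PyTorch sketch of one cross-attention block per questionnaire item, with each block's key/value memory populated from the question text plus lexicon entries; the dimensions, head count, and random stand-in embeddings are illustrative assumptions.

```python
# Sketch: one cross-attention block per PHQ-9 item, each attending from
# the post's token representations to a knowledge memory built from the
# question text plus lexicon entries. Dimensions and the random stand-in
# embeddings are illustrative assumptions.
import torch
import torch.nn as nn

DIM, N_QUESTIONS = 256, 9

class QuestionCrossAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, text_states, knowledge_memory):
        # Queries come from the text; keys/values come from the question's
        # knowledge memory, so the scores are grounded in that question.
        return self.attn(text_states, knowledge_memory, knowledge_memory)

blocks = nn.ModuleList(QuestionCrossAttention(DIM) for _ in range(N_QUESTIONS))

text = torch.randn(1, 40, DIM)  # encoded user post (stand-in)
# Per-question memories built from question text + lexicon (stand-in).
memories = [torch.randn(1, 12, DIM) for _ in range(N_QUESTIONS)]

per_question = [block(text, mem) for block, mem in zip(blocks, memories)]
# per_question[i][1] holds attention weights: which tokens item i+1 attends to.
```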
H: So what was the benefit of such a heavy exercise? What you see on top is a very convoluted attention representation from a self-attention model, and it gives you every possible highlighting in the content. You can put this attention map on a plot by computing the similarity between the attention matrices, because attention values are always between 0 and 1.

If you compute the similarity between those values and the questions, you can see whether those values map to any PHQ-9 question, because the highlighting is carried by the words. So you take these words and compute the similarity with the PHQ-9 questions, and you will see it gives you every possible highlighting in the PHQ-9, because there is no way to check which PHQ-9 question had a higher impact on this content than the others. That's where we wanted to look: can we have more adaptation?
H: So this was my hypothesis: why would we have such a uniform similarity across all the PHQ-9 questions, all the Patient Health Questionnaire questions? What we needed was that if I give PHQ-9 question 1 to the model, it should tell me which part of the content is highlighted; if I give it PHQ-9 question 2, which part of the sentences or of the paragraph is highlighted.
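A sketch of that per-question probe: score each sentence of the post against each PHQ-9 question and report, per question, the best-matching span; the encoder checkpoint, the example sentences, and the question paraphrases are illustrative assumptions.

```python
# Sketch: per-question highlighting by similarity, so each PHQ-9 item
# points at the part of the post it is answerable from. Encoder and
# example text are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
post = ["I have barely slept this week.",
        "Food does not interest me anymore.",
        "Work has been busy as usual."]
questions = ["Trouble falling or staying asleep, or sleeping too much?",
             "Poor appetite or overeating?"]

post_emb = model.encode(post, convert_to_tensor=True)
for q in questions:
    sims = util.cos_sim(model.encode(q, convert_to_tensor=True), post_emb)[0]
    best = int(sims.argmax())
    print(f"{q}\n  -> highlighted: {post[best]} (sim={float(sims[best]):.2f})")
```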
H: What we found is that only specific questions get good treatment from the content. That means those questions are answerable, or can be answered, from the context given by the user; and the questions which did not show up in this list are the potential questions that the clinician, that the model, should be asking the user for more information, if it is required.

Now, when I have this bunch of questions and I want to judge which question to ask, that's where the question comes in: what is the next safe question to generate? That's where we created a series of datasets which start with strategic guidelines: what type of questions need to be generated, and what categories these questions belong to, so that you are able to generate these questions accordingly. Up to this point, everything is automated.

Those are all existing resources, but at this point you need a specific resource that can tell you about safety. ChatGPT in this case does it by human feedback: they put the system online and we were giving the feedback, and that's where things become more relatable. In our scenario, we cannot put the system online, because it could be detrimental; so we need to build a kind of specialist dataset of this particular kind.
H: So what we did in this kind of dataset was introduce a specific algorithm, which is very similar to what ChatGPT introduced. Rather than doing a reinforcement-learning-guided approach with whatever they have proposed, we were defining specific scores that we can compute. The first score is simply the probability of the model. The second is: did the model generate a question of a particular tag or not?

That is a classification. The third point is: is there a similarity with some knowledge base, a knowledge similarity? That is again part of the classification task. And the fourth point is an intersection. So if you look, the first point in this algorithm is a generation, and the rest are all some sort of classification tasks.
H: What you are trying to say is that summing up all of these scores gives the entire loss score that you are computing in your model. So safety was introduced by safety lexicons, and explainability was constructed by using access to the Mayo Clinic knowledge base, checking how much the generated questions actually have some similarity with the knowledge base.
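A hedged sketch of summing those four scores into one training signal; every component function here is a hypothetical stand-in for the pieces just described (generation probability, tag classification, knowledge-base similarity, safety-lexicon intersection), and the unweighted sum is an illustrative assumption.

```python
# Sketch: the four scores described above summed into one loss-like
# signal. All component functions are hypothetical stand-ins, and the
# equal (unweighted) sum is an illustrative assumption.
def combined_score(context, question, tag, safety_lexicon,
                   lm_log_prob, tag_classifier, kb_similarity):
    s1 = lm_log_prob(context, question)      # 1. generation probability
    s2 = tag_classifier(question, tag)       # 2. right question tag? (0/1)
    s3 = kb_similarity(question)             # 3. similarity to a clinical KB
    tokens = set(question.lower().split())
    s4 = len(tokens & safety_lexicon) / max(len(tokens), 1)  # 4. lexicon overlap
    return s1 + s2 + s3 + s4

# Usage with toy stand-ins for each component:
score = combined_score(
    "I have barely slept this week.",
    "How long has your sleep been disturbed?",
    tag="Q3",
    safety_lexicon={"sleep", "disturbed"},
    lm_log_prob=lambda c, q: -2.1,       # stand-in log-probability
    tag_classifier=lambda q, t: 1.0,     # stand-in tag check
    kb_similarity=lambda q: 0.7,         # stand-in KB similarity
)
print(score)
```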
H: If I look at the fourth point, the intersection for safety, we say that when the generated text has a set of tokens that exactly overlap with the safety lexicon, we are able to say that those questions are pretty much the same, or decent enough to be generated. So that's how we are looking at it: we identified what questions should be generated.

We are now modeling which question to generate from the list we have, and then we are making sure that safe generation is going to happen among the generated questions the model provides, by introducing this scheme of explainability and safety. Irrespective of what the ROUGE score shows and what the BLEU score shows, we found that even after introducing such mechanisms of explainability and safety through this algorithm inside the model, we are not hurting the ROUGE score.
H: We are not hurting the BLEU score of the model either, but we are definitely improving the safety aspect, as well as the knowledge-capture aspect of the model. Here, when I say safety, I mean that we are making the model less unsafe rather than more safe.

It was just that we took the terminology and later figured out that it was actually a safety metric, but we submitted the work with that title, with safety as one of the metrics. Another point was knowledge capture, where we measure how many of the terms actually get captured within the Mayo Clinic knowledge base. There was definitely a big effect there that we do not show; we did not see too much capture with the knowledge, but essentially this is work in progress and things will evolve over time.
H: Among the generations we were testing, especially in this particular example, we were able to see that some of the questions do map to the PHQ-9. One question, which I have marked with a star, did not show up in the PHQ-9, but we cannot deny that such a question is safe, because it talks about some antidepressants.

That is generation which goes beyond the scope of the PHQ-9, but that's where you are actually looking at the capabilities of AI, because AI gives you the power of going beyond what has already been constructed by experts, yet with concepts that map to something you can actually refer to or check against.
H: This was a demo that we developed for the entire agent, which we will present in the AAAI 2023 demo track, and it is part of a series of work that we did on process-knowledge-infused learning, specifically trying to introduce explainability and safety intrinsically inside a deep neural network that is capable of generating language.

So, to summarize my talk: I walked you through explainable AI, where I talked about what inherently explainable means and what inherently explainable systems are in the mental health context; what hallucinations, risky generations, and incoherence are, and how they are perceived against real-world knowledge.

Over the years there has been a series of work: datasets that we have built, and new metrics that we have constructed specifically to test explainability, safety, and things of this nature, because metrics like accuracy and F-score fall short of this. That was work we accumulated into what we termed knowledge-intensive language understanding tasks. With this, I end my talk. Thank you all for your attention, and I'm open to any questions.