With transformative technologies such as AI and machine learning, government agencies can achieve their goals, detect fraud, and create data-driven strategies. Chakib Chraibi, Chief Data Scientist and ODS Acting Associate Director at NTIS, US Department of Commerce, joins Tech Transforms to share his insights on helping US federal agencies and citizens use data to enhance any mission.
Carolyn: Today, we get to talk to Dr. Chakib Chraibi. He’s the Chief Data Scientist at the US Department of Commerce's National Technical Information Service, or NTIS, and Acting Associate Director for the Office of Data Services. He provides expertise and assistance to government agencies in harnessing innovative technologies and delivering data-driven solutions to achieve mission impact within the NTIS framework. Chakib, welcome to Tech Transforms.
Let's start with a brief overview of your role at NTIS as well as the role of NTIS within government agencies.
Chakib: NTIS is a bureau within the US Department of Commerce. We like to think of NTIS as the best-kept secret in government. What I'm going to say about NTIS is going to resonate with a lot of our listeners. NTIS is a very interesting agency that is focused on data science and data innovation. It was created shortly after the Second World War.
The main task at that point was to gather all the information collected from the Second World War that dealt with technical research, et cetera. It became a repository of information for the government. It dealt with any technical papers or publications from the civilian side. But in the 1990s, the internet happened. We're still doing that work. We have one of the largest libraries, and we continue to collect that information, but Congress thought about focusing us on a different mission at that time. It was actually a great idea, and that mission is about data science.
Chakib: Currently, that's our main focus at NTIS. We provide a unique pathway for federal agencies towards innovation and digital transformation. We have an authority from Congress that allows us to seek out partners from industry, academic institutions, and nonprofits to help federal agencies address national data center challenges.
It's available to all federal agencies seeking agile capacity to scale and quick access to private-sector ingenuity and expertise to meet critical mission data priorities. We also use a very innovative framework. It’s based on agile methodology so that we can harness emerging and cutting-edge technologies. We operate outside the Federal Acquisition Regulation, outside of FAR.
It’s in the innovation space, and it's really exciting. Whenever you want to innovate, you are not sure how to go about it. All federal agencies want to be effective and efficient in accomplishing their missions and addressing data priorities. But sometimes they don't know how to go about it. They have an idea about the business problems and what they want to achieve, but they don't have all the details and the steps to get there. That's part of any innovative work. That is where we can help them.
We have a very agile framework where they can come and discuss their business problems with us at a very high level, along with what they want to achieve. What is their mission? What's the most important thing that they want to accomplish? Based on that conversation, we can actually develop a problem statement. It’s a very high-level scope statement that tries to address the data innovation goals they want to achieve.
Chakib: Once that's done, we reach out to our partners. We actually have a whiteboard session that is open to anyone. It's a very free-flowing discussion based on design thinking, where we discuss what the objectives are, what they want to accomplish, what problems they want to solve, et cetera. That helps us refine the scope statement, and then complete that statement with our partners. We have a selection process, like a review, to finalize the partners that will be involved in conducting that project.
We already have more than 45 partners from the industry. Of course, we have the big ones like IBM, Booz Allen, and universities like Stanford. There are smaller enterprise companies from Silicon Valley that are focused on very specific aspects of data science and artificial intelligence. They are part of our list of partners. The good thing is that these partners can still partner with others. We don't expect everyone to know everything about everything. As long as they are the major players, they can partner up with others to complete the project.
Mark: How does industry engage with your organization? Do you and your organization determine what technologies and industry partners are getting involved with some of the things that you're working on?
Chakib: We have a notice in the Federal Register inviting any company to apply. We also encourage minority-owned businesses to apply to become a partner, and we have specific criteria. Our criteria are related to data innovation. If you can provide some value to the government in terms of data science, machine learning, artificial intelligence, and the related emerging technologies, we encourage you to apply.
Chakib: We have a process where applicants are evaluated by people from NTIS, as well as by volunteers from elsewhere in the federal government. They help us evaluate whether the applicants provide specific value and meet the criteria that we define in the notice. After that process, they become partners. Once they are a partner, they are able to compete for a project that we bring in.
Carolyn: You guys are like a technology innovation playground. You get an idea, you help the agency or group scope it, and then you put it into action. When you put it into action, do you actually implement it within the agency or do you have a lab that you model it in first?
Chakib: We are a very small agency. We are more like a broker. What we do is reach out to federal agencies and let them know about our services. We tell them, "If you're having difficulties innovating with data in a data framework, then come to us."
The advantage of working with us is that we have already been involved in several projects with several departments and federal agencies. We have experience with what federal agencies are trying to achieve, the challenges that they encounter, and the type of solutions they should seek out.
When we talk with them, we always ask them about what their mission is and what they're trying to accomplish. We ask about the challenges that they are encountering and the support they need from us. That's how we start. We start with a specific problem. We can discuss later how we advise federal agencies to go about artificial intelligence. That’s an amazing topic there. It’s an interesting area to discuss.
Chakib: AI is a very transformative technology. It does require a big investment from federal agencies. As an AI expert, I tell them, "Get involved right away because the data doesn't stop, it keeps coming. Then it needs to be processed, accessed, cleaned, and prepared to be used." The models that you develop based on that data, if they are predictive models, are also iterative models. They need to be deployed, monitored, and refined over the years. The sooner, the better.
Carolyn: When you say the sooner, the better for AI engagement, what specifically are you talking about with AI? Is there a specific technology that you recommend that they start with?
Chakib: I wrote a paper recently about the challenges and opportunities in the federal government, and the type of model they should use. I'm not saying that is the perfect model, but let me outline some of the guidelines that I identified. In that model, I try to identify the challenges that all federal agencies encounter. The first challenge, the most important and the most immediate one, is about data. Federal agencies are going through a shift in their IT infrastructure.
We used to have legacy applications that were siloed. The data was basically fed into a specific application, and things were going well at that time. That was really what you needed to do to accomplish a specific task. But what if you want to really take advantage of data? That's one of our roles at NTIS: to foster and encourage agencies to use data as a strategic asset for evidence-based decisions, to be more efficient, and to enhance their processes.
Chakib: All of these need an aggregation of data from different sources, internal and external. The first thing is to switch the model of thinking and the model of operation from siloed applications to a more integrated data framework, where the data comes together whenever it is needed and with the flexibility that is required to address specific problems within the federal government.
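The move from siloed applications to an integrated view that Chakib describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "applications", the `org_id` key, and every field name are hypothetical and do not come from any NTIS or agency system.

```python
from collections import defaultdict

# Hypothetical records from two siloed applications; the field names are
# illustrative assumptions, not taken from any agency system.
grants_app = [
    {"org_id": "A1", "grant_total": 120000},
    {"org_id": "B2", "grant_total": 45000},
]
audits_app = [
    {"org_id": "A1", "open_findings": 2},
    {"org_id": "C3", "open_findings": 1},
]


def integrate(*sources):
    """Merge records from several sources into one view keyed by org_id."""
    combined = defaultdict(dict)
    for source in sources:
        for record in source:
            # Records that share an org_id are folded into a single entry.
            combined[record["org_id"]].update(record)
    return dict(combined)


integrated = integrate(grants_app, audits_app)
for org_id, view in sorted(integrated.items()):
    print(org_id, view)
```

Each organization ends up with one record that carries whatever any source knows about it, which is the kind of combined view a single siloed application cannot provide on its own.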
Mark: Do you typically need a government agency or a customer to engage, to drive some of the work and innovation that you are working on? Or do you have the flexibility or the autonomy to do the innovation on your own and then take it to federal agencies where they might be doing some stuff like that?
Chakib: It is the former. The agency actually leads the project. That’s a great thing because they are really the subject matter experts. They're the ones that are really working on the specific aspects, and they know what works and what doesn't work. So they do it in collaboration, of course, with the partner, but they have to lead the project.
Carolyn: Without revealing national secrets, is there a favorite project that you've worked on that you can think of?
Chakib: I can tell you the departments and agencies that we worked with. But, of course, I cannot be very specific about what we do. Some of them have agreed to allow us to publicize that. So, I'll focus more on those. One of the projects is the USAID President's Malaria Initiative. The goal of the project is to control and eliminate malaria.
We are now going through a pandemic that has been one of the worst calamities in the history of humanity. But malaria has been around for several centuries.
Chakib: It’s a subtle, life-threatening disease as well. It is caused by parasites that are transmitted to people through the bites of infected female mosquitoes. It's preventable and curable. In 2019, there were an estimated 409,000 deaths from malaria, and 67% of them were children under five years old.
This is really an important project for us because we want to help USAID and the US government. We want to help the countries that are affected by malaria throughout the world and save lives. In our work, we have helped USAID design and build a platform that they call the Malaria Data Integration and Visualization for Eradication platform, with the help of a partner. It includes storage and organization of malaria and related data, literally in different formats, different structures, and different languages.
The goal is to aggregate all that information, collect it, integrate it, and then help them do better evidence-based decision-making: to predict and identify high-risk areas based on geospatial and weighted data, and to better manage preventive supplies, such as mosquito nets or insect repellents. Eventually, the aim is to implement low-cost, competitive malaria solutions in resource-strapped areas through machine learning tools. This is an important project that ultimately has an amazing outcome.
At the same time, it's also a model for other agencies: it shows how important it is to combine the data, how even geospatial or geographic data can be combined, protected, and secured, and how to provide self-service capabilities. That is what we always try to encourage federal agencies to do within a program or within a region where the program is applied: to provide users with user-centered, self-service capabilities that allow them to enter the data, process the data, and do the initial cleaning and preparation of the data.
Mark: It sounds like we need to connect you up with some of our friends over at CMS because they're dealing with this thing. They probably could use your help, Chakib.
Carolyn: What your department is doing is just what I love to talk about. Technology, literally, is transforming and saving lives. We’re using technology and data to do better.
Chakib: The pandemic has been a calamity and one of the worst disasters. But the silver lining is that it has helped accelerate digital transformation across all industries, and definitely in the government. That is definitely momentum for using technology to achieve our goals. Actually, we're working with HHS-OIG on one of the projects that we are developing. This is our eighth year working with them. We started many years back. It is an interesting example to discuss because it shows the progression of what federal agencies are bound to do to benefit from artificial intelligence and machine learning capabilities.
The issue that they had is that they were working in silos. They had the data in different areas of the country. But fraud exists in California as well as in Florida, and the fraudsters usually use similar patterns. So, why don't we bring that knowledge together, aggregate it, and help everyone take advantage of others' experiences? One of the aspects that we helped them with is sentiment analysis: including social media posts as part of the information that the investigators and auditors need to better identify fraud patterns.
Carolyn: You look at their social media to detect their mood, where they might be leaning politically or otherwise?
Chakib: We try to identify if there is any pattern. One thing that the fraudsters do is that they keep moving. They create a company and they keep moving. The underlying patterns are there, they keep repeating them. But they have become more and more sophisticated.
The advantage of using machine learning is that we can actually stay on top of those. You can predict what type of new patterns they're going to develop instead of just using the simple rules that were used before, where they say, "Okay, if you see this and that, then let's look into it." Now, we can actually have the machine learning model tell us in advance, "Okay, this is the type of pattern that I see. I believe, with a certain level of confidence, that this might be fraudulent."
That's how we see the future of machine learning and artificial intelligence: working very closely with human beings. It's an augmented artificial intelligence. What we do with HHS is that, when an investigator or an auditor comes in, we try to make their life easier. We spend a lot of time automating the tasks that should be automated, because they're tedious. There's no need for a human being to do them, except to check whether the results are correct.
Then, we try to surface which cases can be the most beneficial for the American people, for the taxpayers. Which one will give us a better return for the time that the investigative units spend? We have limited resources. All of those are really important in terms of trying to address fraud, waste, and abuse within the government in general.
Chakib: But in HHS, we’re using these incredible tools that help us basically sift through the data, identify the patterns, and surface some areas that we need to focus more time on as human beings.
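The contrast Chakib draws between fixed rules and a model that reports a confidence level, followed by surfacing the cases with the best return on investigator time, can be sketched as follows. This is a minimal sketch and not the HHS-OIG system: the features, the hand-set weights standing in for a trained model, and the priority formula are all assumptions made for illustration.

```python
import math

# Hypothetical features and weights; a real model would learn the weights
# from labeled historical cases rather than hard-coding them.
WEIGHTS = {"address_changes": 0.8, "billing_spike": 1.5, "shared_owner": 1.1}
BIAS = -3.0


def rule_flag(case):
    """Old-style fixed rule: flag only when one threshold is crossed."""
    return case["billing_spike"] >= 2


def fraud_confidence(case):
    """Stand-in for a trained model: a logistic score between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * case[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def triage(cases):
    """Rank cases by expected recovery per estimated investigator hour."""
    def priority(case):
        return fraud_confidence(case) * case["amount_at_risk"] / case["est_hours"]
    return sorted(cases, key=priority, reverse=True)


cases = [
    {"id": "case-1", "address_changes": 3, "billing_spike": 1, "shared_owner": 1,
     "amount_at_risk": 500000, "est_hours": 40},
    {"id": "case-2", "address_changes": 0, "billing_spike": 2, "shared_owner": 0,
     "amount_at_risk": 80000, "est_hours": 10},
    {"id": "case-3", "address_changes": 0, "billing_spike": 0, "shared_owner": 0,
     "amount_at_risk": 900000, "est_hours": 60},
]

for case in triage(cases):
    print(case["id"], rule_flag(case), round(fraud_confidence(case), 2))
```

In this toy data the fixed rule does not flag the first case, while the combined score rates it as the most likely fraud and ranks it first. A human investigator still makes the final call; the ranking only decides what is looked at first.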
Mark: Those are complex issues that you guys are working through in a lot of different areas. It's fascinating work that you all are doing. You have a very long and distinguished academic career. How do you perceive some of the differences in working in the academic world, compared with the government world?
Chakib: I started in the industry with one of the largest computer companies. Then I moved to academia, which I really enjoyed a lot. The difference between academia and government is very small because both are about service-oriented, mission-based activities. I joined the government because I really love the NTIS model, what we do. I'm fascinated by these fields that are exploding, which are data science, machine learning, and artificial intelligence.
I strongly believe that this is not just another technology. It's a transformative technology that's going to directly affect us in many ways. I see my goal, and it's why I'm very active on social media, as informing the public about this awesome power that is AI.
One related aspect, since I started in academia, is that I'm also looking into this on the side. I participate in some forums that try to foster collaboration between government, industry, and academia, for instance one called Responsible AI. We're trying to look at how we can make sure that AI is actually applied the way we intend it to be, without bias and without being harmful to us in some way, short term or long term.
Chakib: It's such a complex issue. Then, at the Department of Commerce, we actually published a Commerce Data Strategy late last year. One of our action plans is to develop some data ethics that we're going to use at the Department of Commerce. As part of the Federal Data Strategy, there is a push to make sure that we understand how AI and the data we use can impact any application we do. We try to prevent and mitigate any issue that relates to equity or bias. So, that's very important as well.
Carolyn: You've reminded me of something that I've heard other data scientists and other AI experts say. AI is going to make us more human because it will free us from the menial, over-repetitive tasks that we don't really need to do. If we can be freed from those, it gives us more space and more time to devote to innovation and ideas. I would love to hear your thoughts on...