It only takes a single drop of pigment to change the colour of paint. The change might not be significant or even noticeable, but it’s still a change. The same can be true of data, and it’s the reason so many people still get caught out by phishing and other online scams. They’re fed just enough information to make them think a request is genuine, and they fall for it.
Scammers know that small bits of misinformation hidden among accurate data are easily overlooked, and that’s exactly what they’re counting on. Some believe that AI used in cybersecurity tools can help identify these nuggets of misinformation. But that comes with the proviso that the data it’s been trained on is accurate, and that, in itself, opens up a whole new can of worms. We can use AI to check data, but can we keep up?
Given that humans aren’t capable of checking data at the speed that AI can, who or what is checking the accuracy of the data that AI is processing? Do we genuinely know whether all of it is true? We can assume it is, much like companies often assume that the lists of email addresses they market to are accurate. And we all know how that goes: despite every message carrying an unsubscribe link, most people don’t bother. They either ignore the email completely or simply trash it, and the company is none the wiser.
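To make that point concrete, here’s a minimal sketch (the addresses and the regex are illustrative, not from any real mailing list) of why automated data checks only go so far: a format check weeds out obviously malformed entries, but it can’t tell whether a well-formed address is real, current, or truthful.

```python
import re

# Hypothetical sample list: two plausible addresses, two junk entries.
addresses = [
    "alice@example.com",
    "not-an-email",
    "bob@example",
    "carol@example.org",
]

# A purely syntactic check: something@something.something.
# It says nothing about whether the mailbox exists or is still in use.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

valid, invalid = [], []
for addr in addresses:
    (valid if EMAIL_RE.match(addr) else invalid).append(addr)

print(valid)    # structurally plausible entries
print(invalid)  # caught by the format check
```

The check happily passes `alice@example.com` even if that mailbox was abandoned years ago — which is the gap between verifying a datum’s shape and verifying its truth.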
Most databases contain much more than email addresses. In fact, it’s the sheer volume of data that’s driving the development of more AI use cases. Should more of these focus on ensuring data is accurate rather than on optimizing processing? Especially given just how much chaos mistakes or misinformation can cause when they go unidentified.
Today’s developers have many angles to take into consideration, especially when data is to be processed with AI. Any mistake, oversight or piece of misinformation can scale bias or vulnerabilities exponentially.
We’ve talked about AI bias before: how assumptions made during programming can impact data outcomes. Those working to resolve AI bias are not only looking at how systems are programmed; they’re also looking at how to improve the quality of the data itself.
If it sounds a bit like a catch-22, that’s because it is. The more we build and create, the more complex things become. And the more complex they become, the more tools and applications we build to help us manage data quality and processing accuracy.
In cybersecurity, companies are always trying to pre-empt vulnerabilities and identify what’s true or false. All the while, threat actors are doing the opposite, working to pass off stolen identities as authentic, or misinformation as true. And these threat actors are happily using AI to help develop that misinformation. It’s almost becoming a game: how much information can be passed off as true before it’s identified as false? Except in this game the cost for companies can be high, and they’re being forced to play whether they want to or not.