Trust is fragile, and that's one problem with artificial intelligence, which is only as good as the data behind it. Data integrity concerns -- which have vexed even the savviest organizations for decades -- are rearing their heads again, and industry experts are sounding the alarm. Users of generative AI may be fed incomplete, duplicative, or erroneous information that comes back to bite them -- thanks to the weak or siloed data underpinning these systems.
"AI and gen AI are raising the bar for quality data," according to a recent analysis published by Ashish Verma, chief data and analytics officer at Deloitte US, and a team of co-authors. "GenAI strategies may struggle without a clear data architecture that cuts across types and modalities, accounting for data diversity and bias and refactoring data for probabilistic systems," the team stated.
An AI-ready data architecture is a different beast than traditional approaches to data delivery. AI is built on probabilistic models -- meaning output will vary, based on probabilities and the supporting data underneath at the time of query. This strains conventional data system design, Verma and his co-authors wrote. "Data systems may not be designed for probabilistic models, which can make the cost of training and retraining high, without data transformation that includes data ontologies, governance and trust-building actions, and creation of data queries that reflect real-world scenarios."
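The point about probabilistic models can be seen in a toy sketch: the same query can return different outputs because the system samples from a distribution rather than looking up a fixed answer. The "model" below is just a weighted random choice, purely for illustration -- the completions and weights are invented.

```python
import random

def probabilistic_answer(rng):
    # Two plausible completions with different probabilities --
    # a stand-in for a generative model's sampled output.
    return rng.choices(["answer A", "answer B"], weights=[0.7, 0.3])[0]

rng = random.Random(42)
answers = [probabilistic_answer(rng) for _ in range(1000)]
# Both completions appear; neither is guaranteed for any single query.
print(answers.count("answer A"), answers.count("answer B"))
```

Because no single run is authoritative, testing and data design have to reason about distributions of outputs rather than one deterministic result.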
To the challenges, add hallucinations and model drift, they noted. All these are reasons to keep human hands in the process -- and step up efforts to align and assure consistency in data.
This potentially cuts into trust, perhaps the most valuable commodity in the AI world, said Ian Clayton, chief product officer of Redpoint Global.
"Creating a data environment with robust data governance, data lineage, and transparent privacy regulations helps ensure the ethical use of AI within the parameters of a brand promise," said Clayton. Building a foundation of trust helps prevent AI from going rogue, which can easily lead to uneven customer experiences."
Across the industry, concern is mounting over data readiness for AI.
"Data quality is a perennial issue that businesses have faced for decades," said Gordon Robinson, senior director of data management at SAS. There are two essential questions on data environments for businesses to consider before starting an AI program, he added. First, "Do you understand what data you have, the quality of the data, and whether it is trustworthy or not?" Second, "Do you have the right skills and tools available to you to prepare your data for AI?"
There is an enhanced need for "data consolidation and data quality" to face AI headwinds, Clayton said. "These entail bringing all data together and out of silos, as well as intensive data quality steps that include deduplication, data integrity, and ensuring consistency."
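The consolidation steps Clayton describes -- pulling records out of silos, deduplicating, and enforcing consistency -- can be sketched minimally as follows. The record layout, field names, and normalization rules here are hypothetical assumptions, not a prescribed method.

```python
def normalize(record):
    """Normalize formatting so equivalent records compare equal."""
    return {
        "email": record["email"].strip().lower(),
        "name": " ".join(record["name"].split()).title(),
    }

def consolidate(*silos):
    """Merge records from several sources, deduplicating by email."""
    seen = {}
    for silo in silos:
        for record in silo:
            clean = normalize(record)
            seen.setdefault(clean["email"], clean)  # first occurrence wins
    return list(seen.values())

# Two silos holding the same customer with inconsistent formatting.
crm = [{"email": "Ada@Example.com ", "name": "ada  lovelace"}]
billing = [{"email": "ada@example.com", "name": "Ada Lovelace"}]

merged = consolidate(crm, billing)
print(merged)  # a single consolidated record
```

Real pipelines layer fuzzy matching and survivorship rules on top of this, but the core idea is the same: normalize first, then deduplicate on a stable key.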
Data security also takes on a new dimension as AI is introduced. "Shortcutting security controls in an attempt to rapidly deliver AI solutions leads to a lack of oversight," said Omar Khawaja, field chief information security officer at Databricks.
Industry observers point to several essential elements needed to ensure trust in the data behind AI:
An AI-ready data architecture should enable IT and data teams to "measure a variety of outcomes covering data quality, accuracy, completeness, consistency, and AI model performance," said Clayton. "Organizations should take steps to continually verify that AI is paying dividends versus just implementing AI for AI's sake."
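The kind of ongoing measurement Clayton suggests can be approximated with simple scoring over a dataset -- here, completeness (required fields populated) and consistency (values matching an expected format). The fields, patterns, and metric definitions are illustrative assumptions, not an industry standard.

```python
import re

def quality_report(rows, required, patterns):
    """Score a dataset for completeness and format consistency."""
    total = len(rows) * len(required)
    filled = sum(1 for r in rows for f in required
                 if r.get(f) not in (None, ""))
    checked = sum(1 for r in rows for f in patterns if r.get(f))
    consistent = sum(1 for r in rows for f, pat in patterns.items()
                     if r.get(f) and re.fullmatch(pat, r[f]))
    return {
        "completeness": filled / total if total else 1.0,
        "consistency": consistent / checked if checked else 1.0,
    }

rows = [
    {"email": "a@example.com", "country": "US"},
    {"email": "not-an-email", "country": ""},
]
report = quality_report(rows, required=["email", "country"],
                        patterns={"email": r"[^@\s]+@[^@\s]+\.[^@\s]+"})
print(report)  # completeness and consistency scores between 0 and 1
```

Tracking metrics like these over time -- rather than checking once before launch -- is what turns "is our data AI-ready?" from a one-off audit into the continual verification the article describes.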