If software is the engine of the 21st-century business landscape, data is its oil.
And the oil is only becoming more valuable — and important — as generative artificial intelligence bursts onto the scene.
Generative AI systems are trained on large quantities of data so that they can identify and learn the patterns in it, which lets them generate new content when a query matches those learned patterns.
Still, for enterprise applications of AI within security-critical processes, there is no such thing as too safe — making effective data governance and a compliant technical infrastructure both a must and a bottleneck.
Now, a consortium of companies called the Data and Trust Alliance has developed data provenance standards for describing the origin and history of data, as well as the legal rights to it.
Announced Thursday (Nov. 30), the new set of standards is intended to help enterprise businesses feel more comfortable and secure about the data being used when embracing AI and building bespoke programs.
“The standards are designed to be used both within a company and with the company’s ecosystem of data providers and data partners for use cases across the enterprise,” the group said when announcing the standards. “They are less applicable to large language models trained with public, web-scale datasets.”
The alliance is made up of more than two dozen companies, including American Express, Humana, IBM, Pfizer, UPS, Warby Parker, CVS Health, General Motors, Mastercard, Meta, Nike, the NFL and Walmart.
The announcement comes as organizational leaders, interested in capturing the game-changing efficiencies AI offers, are increasingly looking to gain further clarity on data lineage before making the jump and integrating AI solutions into their workflows.
More information about the data being used in AI systems, and easier access to that information, is a crucial way to give enterprise confidence in the innovation a shot in the arm.
For organizational decision-makers, comprehending data lineage is essential for making informed decisions about AI integration. It provides a comprehensive view of how data flows through the organization, offering insights into data quality, security and compliance.
As PYMNTS has reported, the generative AI industry is expected to grow to $1.3 trillion by 2032, but key to its success will be tailoring AI solutions by industry.
And key to tailoring AI solutions by industry is the availability of vast amounts of clean and structured industry-specific data for industry-specific LLMs to process.
Per the Data and Trust Alliance, today’s data scientists spend almost 40% of their time on data preparation and cleansing tasks.
“Artificial intelligence is based on data,” Prateek Kathpal, president and CEO of SymphonyAI Industrial, told PYMNTS in an interview posted Friday (Dec. 1). “The more data you have, the better data you have, the cleaner data you have, then the better solution or a better output you’re going to get. And the better and more relevant the output, the more actionable insights for better decision making you’re provided with.”
Data lineage, the end-to-end tracking of data as it moves through the stages of its lifecycle within an organization, can cover the data's origin, its transformations and every touchpoint it encounters until its final destination.
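To make that idea concrete, the sketch below shows one way such tracking might be represented in Python: a record that stores a dataset's origin and an ordered list of the steps that touched it. It is purely illustrative. The class names, field names and sample values are hypothetical and are not drawn from the Data and Trust Alliance standards.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class LineageEvent:
    """One touchpoint in a dataset's history (hypothetical schema)."""
    step: str            # what happened, e.g. "deduplicate"
    actor: str           # team or system that touched the data
    timestamp: datetime  # when the step was recorded (UTC)


@dataclass
class LineageRecord:
    """Origin plus ordered history for a single dataset (hypothetical schema)."""
    dataset: str
    origin: str
    events: List[LineageEvent] = field(default_factory=list)

    def record(self, step: str, actor: str) -> None:
        """Append a new touchpoint, stamped with the current UTC time."""
        self.events.append(LineageEvent(step, actor, datetime.now(timezone.utc)))

    def trail(self) -> List[str]:
        """Return the origin followed by each recorded step, in order."""
        return [self.origin] + [event.step for event in self.events]


# Illustrative values only.
rec = LineageRecord("card_transactions", origin="core-banking-export")
rec.record("deduplicate", "data-eng")
rec.record("train-fraud-model", "ml-platform")
print(rec.trail())  # ['core-banking-export', 'deduplicate', 'train-fraud-model']
```

Keeping the history as an append-only list means the path from origin to final use can be read back in order, which is the kind of visibility the article describes.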
The success of AI models hinges on the quality of input data. By understanding data lineage, employees tasked with owning AI initiatives can identify and rectify potential bottlenecks, inconsistencies or inaccuracies in the data pipeline.
“When you ask people, a lot of them don’t know much about AI — only that it is a technology that will change everything,” Akli Adjaoute, founder and general partner at venture capital fund Exponion, told PYMNTS in an interview posted Thursday (Nov. 30).
“But if you go into a field where the data is real, particularly in the payments industry, whether it’s credit risk, whether it’s delinquency, whether it’s AML [anti-money laundering], whether it’s fraud prevention, anything that touches payments … AI can bring a lot of benefit,” he added.
The quest for clarity on data lineage is not just a technical detail but a strategic imperative. By ensuring data quality, transparency, risk mitigation and operational efficiency, CEOs are laying the groundwork for a successful and sustainable integration of AI solutions into their workflows.
“In the payments industry, we’ve had a very data-rich environment, but we’ve operated in an insight-poor environment,” Enigma Technologies CEO Hicham Oudghiri told PYMNTS Nov. 14.
AI can help change that.