It’s clear that the way in which companies and individuals share their data with other entities is changing. What’s more complex is that there are many factors catalyzing this technological and economic shift. Industry experts have observed multiple causes for commercial data sharing’s development in recent months, from increased concerns about compliance to the global surge in demand for external data, for which we have generative AI to thank.
In this article, we’ll predict five ways that commercial data sharing will change in 2024 and examine the reasons for each change. Whether you’re a data analyst sharing data between tools and departments, a chief data officer tasked with exchanging data securely, or a data provider sharing data in order to monetize it, make sure you’re prepared for a commercial data sharing ecosystem with ever more SaaS solutions, regulatory scrutiny, and commercial considerations.
In both the technology industry and general news outlets globally, it’s been emphasized that without data, there’s no AI. Generative AI has catalyzed a data market boom, with companies requiring masses of external data to train their models.
More pressingly, many cases have shown that a generative AI model is only as trustworthy as the data it’s fed. Numerous instances of bias and obscenity in chatbot outputs have been a result of flawed datasets used to train the AI system.
As such, it’s not just that there’s a scramble for masses of data for AI; there’s a scramble for masses of accurate, high-quality data. AI has demonstrated more compellingly than ever that it’s imperative to work with trusted data providers. Commercial data providers selling data for AI training should always be able to prove that they take the required steps to remove inaccuracies, empty fields, and outdated information from their datasets. Indeed, the preliminary step before commercial data sharing - that is, sharing a data sample free of charge as part of the data evaluation process - has become even more important, because it lets the end user test whether the data can train a reliable generative AI model.
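To make the evaluation step concrete, here is a minimal sketch of how a buyer might profile a free data sample for empty fields and outdated records. The function name, field names, sample records, and the 365-day freshness cutoff are all assumptions chosen for illustration; they don't come from any particular provider or tool.

```python
from datetime import date


def profile_sample(rows, required_fields, date_field, max_age_days, today):
    """Return simple quality metrics for a list of dict records (illustrative only)."""
    total = len(rows)
    incomplete = 0
    stale = 0
    for row in rows:
        # A record counts as incomplete if any required field is missing or blank.
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                incomplete += 1
                break
        # A record counts as stale if its date is missing or older than the cutoff.
        updated = row.get(date_field)
        if updated is None or (today - updated).days > max_age_days:
            stale += 1
    return {
        "rows": total,
        "incomplete_share": incomplete / total if total else 0.0,
        "stale_share": stale / total if total else 0.0,
    }


# Hypothetical sample records, invented for this example.
sample = [
    {"company": "Acme", "country": "DE", "updated": date(2024, 1, 10)},
    {"company": "", "country": "US", "updated": date(2023, 12, 1)},
    {"company": "Globex", "country": "FR", "updated": date(2021, 6, 30)},
    {"company": "Initech", "country": None, "updated": date(2024, 1, 2)},
]

report = profile_sample(sample, ["company", "country"], "updated", 365, date(2024, 2, 1))
print(report)
```

In this made-up sample, two of the four records have an empty required field and one is older than a year. A real evaluation would likely add accuracy checks against a trusted reference, which this sketch doesn't attempt.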
And the easiest way for providers to share said data sample? That brings us to our next prediction for commercial data sharing in 2024: there’ll be increased adoption of cloud-agnostic sharing technology.
Cloud-agnostic data sharing technologies have made it easier for organizations to share data across diverse cloud environments. In 2024, it’s likely that the demand for cloud-agnostic data sharing technologies will surge. They simplify data integration across diverse cloud ecosystems and contribute to greater data accessibility and collaboration.
For that reason, it’s likely that investment in data sharing tech will also increase, as Bobsled’s recent $17M Series A round attests. On the whole, growing demand and growing investment will both contribute to data sharing that’s faster and more cost-effective.
Bobsled, for instance, facilitates data sharing by offering a standardized framework that operates independently of specific cloud providers, enabling organizations to transfer and process data with flexibility and efficiency.
Flatfile, another player in this space, allows users to integrate and cleanse data regardless of the underlying cloud infrastructure, fostering interoperability and reducing friction in data sharing processes.
Additionally, Weld offers a data integration platform that seamlessly connects disparate data sources across multiple clouds, ensuring a cohesive and unified approach to data management.
Following the success of these three example SaaS companies, we could see more cloud-agnostic data sharing companies being created in 2024. There’s certainly demand for the solution they offer. However, it’ll be interesting to see how new players in the field differentiate themselves.
The exchange of data is the final part of the end-to-end commercial data sharing process. Before this fulfillment step, there’s the question of how the data deal is initiated. There are software solutions for that too, and we predict that there’ll be more.
Where there’s a user pain, there’s a software solution (or several). The same is the case for commercial data sharing. Data providers and buyers have long complained about many bottlenecks when it comes to monetizing and purchasing data. In response, we’ve seen new SaaS solutions being developed by companies large and small, aimed at making commercial sharing easier.
For example, data providers have long complained about the shortcomings of data marketplaces. To begin, there’s the struggle of integrating with a data marketplace in order to publish data products on it. Such integrations usually require a lot of time and engineering heavy lifting, sometimes months to get listed on just one marketplace. Once providers are finally published on various data marketplaces, there’s the overhead of managing business across these disparate channels.
Software like Data Commerce Cloud (DCC) emerged to relieve data providers of both pains. Following Shopify’s value proposition as an omni-channel commerce solution, DCC enables providers to sync their data products to multiple data marketplaces with a click, sparing the engineering effort. Leads from these data marketplaces land centrally in the provider’s DCC inbox. For providers, DCC is a SaaS tool that makes commercial data sharing easier. For buyers, there’s a greater variety of data providers to choose from, whichever data marketplace they’re using.
Commercial data sharing will become an even smoother process with the emergence of more SaaS solutions facilitating it. That matters because the stakes are higher: regulation and compliance are ever-growing considerations for commercial data sharing.
Heightened concerns surrounding security and regulation have impacted commercial data sharing between companies. A Lowenstein survey published in January 2024 found that 57% of financial firms were concerned about data breaches, with 20-25% concerned about the increased compliance burden and privacy issues surrounding personally identifiable information (PII).
Companies are more apprehensive about sharing sensitive information, fearing the potential repercussions for both their reputation and the security of their customers. This has led to a paradigm shift in the dynamics of data sharing, with organizations becoming more vigilant in choosing their partners.
Moreover, data protection regulations have intensified this caution. Stringent measures such as the General Data Protection Regulation (GDPR) and other global data privacy laws have compelled companies to reevaluate their data-sharing practices to ensure compliance. Failure to adhere to these regulations can result in substantial fines and legal consequences, creating a deterrent for companies engaged in data sharing. This regulatory environment has not only affected current practices but is also influencing the trajectory of future data-sharing initiatives, with companies now prioritizing robust data governance frameworks to navigate the regulatory landscape.
Looking ahead, commercial data sharing in 2024 and beyond will likely be shaped by an ongoing tug-of-war between the necessity for collaboration and the imperative to safeguard data. Companies will need to strike a balance between the benefits of shared data for innovation and the need to implement robust security measures and adhere to evolving regulations.
As new regulations - and new security threats - emerge, data sharing will continue to evolve, prompting organizations to adopt adaptive strategies that prioritize security and compliance in an increasingly interconnected business environment.
Year over year, companies have been allocating more budget towards investing in external data. This changed in 2023, when a Lowenstein survey found just 28% of respondents believed their budgets would increase by more than 25%, down from 65% in the previous year. As such, the general economic downturn has - and will likely continue to - put some limitations on commercial data spending.
With tightened budgets, organizations may find themselves constrained in their ability to acquire high-quality external data sets, hindering the depth and breadth of information available for sharing. This financial constraint could limit the scope of data collaborations and partnerships, especially for smaller companies that heavily rely on external data to complement their internal analytics.
Moreover, the decreased investment in external data may lead to a more conservative approach when it comes to sharing existing datasets. Companies, facing financial pressures, may become hesitant to engage in data-sharing agreements or collaborations with external entities, fearing the costs of compliance, security, and governance.
As a result, data sharing may experience a slowdown, as companies prioritize internal cost-cutting measures over external investments, potentially stalling the growth and innovation that arise from collaborative data initiatives. Striking a balance between cost considerations and the strategic value of external data will be crucial for organizations aiming to capitalize on data sharing within constrained budgets and tough economic conditions.
About the author
Lucy Kelly is a researcher at Datarade, the company facilitating the exchange of Big Data. She writes about the various use cases for external data, leading data providers, and developments in the tech industry, with a focus on data monetization trends.