There is an Arms Race to Gather Data to Train AI Models But Organizations Do Not Recognize That Their Data is Often Being Used for Free

Sunil Soares, Co-Founder & CEO, Tavro

Arms Race for AI Training Data

There is an arms race to gather training data for AI models, but organizations do not always recognize that their data is being used for free. This typically happens through so-called Shadow AI: applications with embedded AI that use company data to train their models.

Monetizing AI Training Data

There are multiple approaches to “monetizing” AI training data:

  1. Data Licensing
    License data to model providers. Companies like Reddit, Stack Overflow, News Corp, and Shutterstock have adopted this playbook.
  2. Litigation
    File lawsuits to compel some type of compensation. The New York Times filed a lawsuit against OpenAI. In similar fashion, plaintiffs have sued OpenAI and GitHub for compensation.

Software Company Playbooks

Software companies have generally adopted two approaches:

  1. Exclude Data From AI Training
    Some SaaS providers specifically exclude data from AI training in their public policies. These include Zoom, Adobe, and Microsoft.
  2. Stay Silent on the Subject
    Most SaaS public policies, however, do not specifically exclude customer data from AI training.
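
The two playbooks above suggest a simple first-pass triage of vendor policies. The following is a minimal sketch, assuming the policy text has already been collected. The function name `classify_policy`, the labels, and the keyword patterns are hypothetical illustrations. They are a crude heuristic, not a legal test and not a description of any vendor's or Tavro's tooling.

```python
import re

# Hypothetical keyword patterns; real policy language varies widely
# and any result should be confirmed by legal review.
AI_TRAINING = re.compile(r"\btrain(?:s|ed|ing)?\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:not|never|without)\b", re.IGNORECASE)


def classify_policy(text: str) -> str:
    """Label a policy as 'excludes', 'mentions', or 'silent' on AI training."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    mentions = [s for s in sentences if AI_TRAINING.search(s)]
    if not mentions:
        # No sentence refers to training at all.
        return "silent"
    if any(NEGATION.search(s) for s in mentions):
        # A training sentence contains a negation, e.g. "do not use ... to train".
        return "excludes"
    # Training is mentioned without a negation in the same sentence.
    return "mentions"


if __name__ == "__main__":
    sample = "We do not use customer content to train our AI models."
    print(classify_policy(sample))
```

A policy labeled "silent" or "mentions" would be a candidate for a closer read, since the heuristic cannot tell an opt-out from an opt-in or distinguish AI training from other uses of the word.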

Responses by Enterprises

Some companies have added “AI Training” clauses in their Vendor Master Services Agreements (MSAs). These clauses specifically exclude the use of company data by vendors to train their AI models. However, companies lack the appropriate mechanisms to enforce these clauses.

Chief Data Officers Need to Explore Avenues to Unlock Data Value

It may be time for Chief Data Officers to explore another avenue to unlock the value of their data. Start by doing the following:

  1. Shine a spotlight on “Shadow AI” – Tavro has developed Shadow AI Governance agents to automate the research process.
  2. Improve negotiating posture with procurement teams.
  3. Get vendors to formally license AI training data.
  4. Get something back, even if it is only free tickets to the vendor’s user conference.