• FaceDeer
    -25 months ago

    If the AI trainers have the original text then “poisoning” the live site’s content isn’t going to do anything at all.

    You can’t touch the original text. It’s already been archived.

    • @tehciolo@lemm.ee
      75 months ago

      If they scrape the updated comments again and ingest copyrighted text, you are poisoning the data.

      • FaceDeer
        25 months ago

        That’s my point. They won’t.

        And even if they did, it’s unclear that copyright has anything to say about AI training anyway.

        • @InternetPerson
          65 months ago

          The NYT is currently suing over copyright infringement.

          https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html

          “it’s unclear that copyright has anything to say about AI training anyway”

          Although lawmakers worldwide slept while AI advanced and so failed to pass some important laws, they are catching up. The EU recently passed its first AI Act. As far as I’ve seen, it also requires companies to disclose a detailed summary of their training data.

          https://www.ml6.eu/blogpost/ai-models-compliance-eu-ai-act

          • FaceDeer
            15 months ago

            You can sue over anything you want in the United States; it remains to be seen whether the courts will side with the NYT. I think it’s unlikely they’ll get much of a win out of it.

            A law that requires disclosing a summary of training data isn’t going to stop anyone from using that training data.