• zinderic@programming.dev · 8 months ago

    It’s almost impossible to audit what data got into an AI model. As long as that remains true, companies can scrape and use whatever they like, and no one will be the wiser about what data got used or misused in the process. That makes it hard to hold such companies accountable for what data they are using and how.

    • po-lina-ergi@kbin.social · 8 months ago

      Then the burden needs to be on companies to prove their audit trail, and until they can, all development should be required to be open source.

      • zinderic@programming.dev · 8 months ago

        That would be amazing, but it won’t happen any time soon, if ever. Just think about all that investment in GPU compute and the need to realize good profit margins. Until there are laws and legislation that require AI companies to open their data pipelines and make public all details about their data sources, I don’t think much will happen. They’ll just keep feeding in any data they can get their hands on, and nothing can stop that today.

        • ipkpjersi@lemmy.ml · edited · 8 months ago

          > Until there are laws and legislation that require AI companies to open their data pipelines and make public all details about their data sources, I don’t think much will happen.

          I don’t expect those laws to ever happen. They don’t benefit large corporations, so sadly there’s no reason lawmakers would ever prioritize or even consider them.

        • InputZero@lemmy.ml · edited · 8 months ago

          Maybe not today, and maybe not every AI, but some AI in the near future may have its data sources made explainable. There are a lot of applications where deploying AI would be an improvement over what we have now. One example is in-silico toxicology: there has been a huge push to replace as many in-vivo experiments as possible with in-vitro or, better yet, in-silico ones, to minimize the number of live animals tested on, both for ethical reasons and for cost savings. AI has been proposed as a new tool to accomplish this, but it’s not there yet. One of the biggest challenges is making the AI models used in-silico explainable, because we cannot effectively regulate what we cannot explain. Regardless, there is a profit incentive for AI developers to make at least some AI explainable; it’s just not where the big money is. To what extent that will apply to all AI, I haven’t the slightest idea. I can’t imagine OpenAI would do anything to expose their data.
