OpenAI’s first open source language model since GPT-2

  • CyberSeeker@discuss.tchncs.de
    7 days ago

    Yes, but 20 billion parameters is too much for most GPUs, regardless of quantization. You would need at least 14 GB of VRAM, and even that is unlikely to be enough without offloading major parts to the CPU and system RAM (which kills the token rate).

    • fuckwit_mcbumcrumble@lemmy.dbzer0.com
      6 days ago

      I tried it out last night and it ran quite well on my heavily thermally limited i9-11950H/RTX 3080 laptop. I had maybe 6 or 7 gigs of main RAM used in total, with Docker running. It was only using about 12 gigs of VRAM in my very limited testing.
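
For anyone wanting to sanity-check the memory figures in this thread, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights and computes parameters × bits per weight ÷ 8. The function name and the bit widths are illustrative assumptions, not the model's actual storage format, and the estimate ignores the KV cache, activations, and runtime overhead, so real usage will be somewhat higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights only, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# The "20 billion parameters" figure quoted in the thread.
N_PARAMS = 20e9

# Illustrative bit widths (assumptions, not the model's real format).
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

Under these assumptions, weights at around 4 bits each come to about 10 GB, which is in the same neighborhood as the 12 to 14 GB figures mentioned above once some overhead is added.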