• Lucy :3@feddit.org
    arrow-up
    112
    arrow-down
    2
    ·
    5 months ago

    Anyone who actually believes that a generic word autocompleter would beat classic algorithms wherever possible probably belongs in psychiatric care.

    • realitista@lemm.ee
      arrow-up
      57
      ·
      5 months ago

      There are a lot of people out there who think LLMs are somehow reasoning. Even reasoning models aren’t really doing it. It is important to do demonstrations like this in the hopes that the general public will understand the limitations of this tech.

      • coyotino [he/him]@beehaw.org
        English
        arrow-up
        17
        ·
        5 months ago

        It is important to do demonstrations like this in the hopes that the general public will understand the limitations of this tech.

        THIS is the thing. The general public’s perception of ChatGPT is basically whatever OpenAI’s marketing department tells them to believe, plus their single memory of that one time they tested out ChatGPT and it was pretty impressive. Right now, OpenAI is telling everyone that they are a few years away from Artificial General Intelligence. Tests like this one demonstrate how wrong OpenAI is in that assertion.

        • P03 Locke@lemmy.dbzer0.com
          English
          arrow-up
          2
          ·
          5 months ago

          It’s almost as bad as the opposition’s comparison of it to Skynet. People are never going to understand technology without applying some fucking nuance.

          Stop hyping new technology… in either direction.

      • ByteSorcerer@beehaw.org
        arrow-up
        9
        ·
        5 months ago

        I think the problem is that, while the model isn’t actually reasoning, it’s very good at convincing people it actually is.

        I see current LLMs kinda like an RPG character build with all ability points put into Charisma. It’s actually not that good at most tasks, but it’s so good at convincing people that they start to think it’s actually doing a great job.

        • realitista@lemm.ee
          arrow-up
          11
          ·
          5 months ago

          This is definitely part of the issue, not sure why people are downvoting this. That’s also why tests like this are important, to illustrate that thinking in the way we know it isn’t happening in these models.

            • smeg@feddit.uk
              English
              arrow-up
              1
              ·
              5 months ago

              Downvotes aren’t federated but you still see all the downvotes sent from just your own instance

              • coyotino [he/him]@beehaw.org
                English
                arrow-up
                1
                ·
                5 months ago

                Interesting. I figured since this post is in a Beehaw community they would be invisible to everyone, but good to know.

        • jmcs@discuss.tchncs.de
          arrow-up
          4
          ·
          5 months ago

          We understand reasoning enough to know humans (and other animals with complex brains) reason in a way that LLMs cannot.

          While our reasoning also works with pattern matching, it incorporates immeasurably more signals than language; language is almost peripheral to it, even in humans. More importantly, we experience things: everything we do acts as a small training round, not just in language but in every aspect of the task we are performing, and gives us a myriad of patterns to match later.

          Until AI can match a fraction of this, we are not going to have AGI. And for the experience aspect there’s no economic incentive under capitalism to achieve it; if it happens, it will come out of an underfunded university.

    • jjjalljs@ttrpg.network
      arrow-up
      25
      ·
      5 months ago

      I think I remember some doge goon asking online about using an LLM to parse JSON. Many people don’t understand things.

        • Lucy :3@feddit.org
          arrow-up
          5
          ·
          5 months ago

          For us? Not as much, luckily most have the sentiment of rejecting anything LLM made and supported. But externals still have a lot of impact unfortunately, just ask @bagder@mastodon.social

  • Showroom7561@lemmy.ca
    arrow-up
    44
    arrow-down
    3
    ·
    5 months ago

    In a quite unexpected turn of events, it is claimed that OpenAI’s ChatGPT “got absolutely wrecked on the beginner level” while playing Atari Chess.

    Who the hell thought this was “unexpected”?

    What’s next? ChatGPT vs. Microwave to see which can make instant oatmeal the fastest? 😂

  • thefartographer@lemm.ee
    arrow-up
    34
    arrow-down
    1
    ·
    5 months ago

    Atari game programmed to know chess moves: knight to B4

    Chat-GPT: many Redditors have credited Chesster A. Pawnington with inventing the game when he chased the queen across the palace before crushing the king with a castle tower. Then he became the king and created his own queen by playing “The Twist” and “Let’s Twist Again” at the same time.

  • Wytch@lemmy.zip
    English
    arrow-up
    31
    arrow-down
    1
    ·
    5 months ago

    This article makes ChatGPT sound like a deranged blowhard, blaming everything but its own ineptitude for its failure.

    So yeah, that tracks.

  • Opinionhaver@feddit.uk
    English
    arrow-up
    28
    arrow-down
    5
    ·
    5 months ago

    Isn’t this kind of like ridiculing that same Atari for not being able to form coherent sentences? It’s not all that surprising that a system not designed to play chess loses to a system designed specifically for that purpose.

  • Arthur Besse@lemmy.ml
    English
    arrow-up
    21
    arrow-down
    1
    ·
    edit-2
    5 months ago

    This article buries the lede so much that many readers probably miss it completely: the important takeaway here, which is clearer in The Register’s version of the story, is that ChatGPT cannot actually play chess:

    “Despite being given a baseline board layout to identify pieces, ChatGPT confused rooks for bishops, missed pawn forks, and repeatedly lost track of where pieces were.”

    To actually use an LLM as a chess engine without the kind of manual intervention that this person did, you would need to combine it with some other software to automate continuing to ask it for a different next move every time it suggests an invalid one. And, if you did that, it would still mostly lose, even to much older chess engines than Atari’s Video Chess.
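The "ask again until the move is valid" wrapper described above can be sketched roughly as follows. This is only an illustration under assumptions: `next_legal_move`, `fake_model`, and the move strings are hypothetical names invented here, the model is a canned stand-in rather than a real LLM call, and the list of legal moves is assumed to come from some separate rules engine.

```python
from typing import Callable, Iterable, List, Optional


def next_legal_move(
    suggest: Callable[[List[str]], str],
    legal_moves: Iterable[str],
    max_attempts: int = 5,
) -> Optional[str]:
    """Keep asking `suggest` for a move until it names a legal one.

    `suggest` receives the moves rejected so far, so a real prompt could
    tell the model which of its earlier suggestions were invalid.
    """
    legal = set(legal_moves)
    rejected: List[str] = []
    for _ in range(max_attempts):
        move = suggest(rejected).strip()
        if move in legal:
            return move
        rejected.append(move)
    return None  # give up; the caller decides what happens next


def fake_model(replies: List[str]) -> Callable[[List[str]], str]:
    """Hypothetical stand-in for an LLM: replays a fixed list of answers."""
    it = iter(replies)
    return lambda rejected: next(it)


# Assumed to be produced by a separate rules engine, not by the model.
legal = ["e2e4", "d2d4", "g1f3"]
chosen = next_legal_move(fake_model(["e2e5", "a1a8", "g1f3"]), legal)
print(chosen)
```

Note that the wrapper only enforces legality. It does nothing to make the chosen move a good one, which is the point made above.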

    edit: i see now that numerous people have done this; you can find many websites where you can “play chess against chatgpt” (which actually means: with chatgpt and also some other mechanism to enforce the rules). and if you know how to play chess you should easily win :)

    • MagicShel@lemmy.zip
      English
      arrow-up
      13
      ·
      5 months ago

      You probably could train an AI to play chess and win, but it wouldn’t be an LLM.

      In fact, let’s go see…

      • Stockfish: Open-source and regularly ranks at the top of computer chess tournaments. It uses advanced alpha-beta search and a neural network evaluation (NNUE).

      • Leela Chess Zero (Lc0): Inspired by DeepMind’s AlphaZero, it uses deep reinforcement learning and plays via a neural network with Monte Carlo tree search.

      • AlphaZero: Developed by DeepMind, it reached superhuman levels using reinforcement learning and defeated Stockfish in high-profile matches (though not under perfectly fair conditions).

      Hmm. Neural networks and reinforcement learning. So non-LLM AI.

      you can play chess against something based on chatgpt, and if you’re any good at chess you can win

      You don’t even have to be good. You can just flat out lie to ChatGPT because fiction and fact are intertwined in language.

      “You can’t put me in check because your queen can only move 1d6 squares in a single turn.”

  • oce 🐆@jlai.lu
    arrow-up
    19
    arrow-down
    5
    ·
    5 months ago

    A PE teacher got absolutely wrecked by a former Olympic sprinter at a sprint competition.

  • Chozo@fedia.io
    arrow-up
    13
    arrow-down
    4
    ·
    5 months ago

    Well… yeah. That’s not what LLMs do. That’s like saying “A leafblower got absolutely wrecked by a 1998 Dodge Viper in beginner’s drag race”. It’s only impressive if you don’t understand what a leafblower is.

    • misk@sopuli.xyzOP
      arrow-up
      7
      arrow-down
      1
      ·
      edit-2
      5 months ago

      People write code with LLMs. A programming language is just a language specialised for precise logic. That’s what „AI” is advertised to be good at. How can you do that and not the other?

      • TimeSquirrel@kbin.melroy.org
        arrow-up
        9
        arrow-down
        1
        ·
        5 months ago

        It’s not very good at it though, if you’ve ever used it to code. It automates and eases a lot of mundane tasks, but still requires a LOT of supervision and domain knowledge to not have it go off the rails or hallucinate code that’s either full of bugs or will never work. It’s not a “prompt and forget” thing, not by a long shot. It’s just an easier way to steal code it picked up from Stackoverflow and GitHub.

        As a human, I will know to check how much data is going into a fixed-size buffer somewhere and break out of the code if it exceeds it. The LLM will have no qualms about putting buffer overflow vulnerabilities all over your shit because it doesn’t care, it only wants to fulfill the prompt and get something to work.
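The kind of check described above looks roughly like this. It is a sketch only: Python is memory-safe, so a real overflow of this sort would happen in a language like C, and `BUFFER_SIZE` and `write_bounded` are hypothetical names used purely to show the "measure before you copy" habit.

```python
BUFFER_SIZE = 16  # hypothetical fixed capacity


def write_bounded(buffer: bytearray, offset: int, data: bytes) -> int:
    """Copy `data` into `buffer` at `offset`, refusing writes that don't fit.

    Returns the offset just past the written bytes.
    """
    if offset < 0 or offset > len(buffer):
        raise ValueError("offset outside buffer")
    if len(data) > len(buffer) - offset:
        # Bail out instead of writing past the end of the buffer.
        raise ValueError("data does not fit in remaining buffer space")
    # Same-length slice assignment, so the buffer never grows.
    buffer[offset:offset + len(data)] = data
    return offset + len(data)


buf = bytearray(BUFFER_SIZE)
end = write_bounded(buf, 0, b"hello")
print(end, len(buf))
```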

        • misk@sopuli.xyzOP
          arrow-up
          8
          ·
          5 months ago

          I’m not saying it’s good at coding, I’m saying it’s specifically advertised as being very good at it.

      • MagicShel@lemmy.zip
        English
        arrow-up
        4
        arrow-down
        3
        ·
        edit-2
        5 months ago

        “Precise logic” is specifically what AI is not any good at whatsoever.

        AI might be able to write a program that beats an A2600 in chess, but it should not be expected to win at chess itself.

        • misk@sopuli.xyzOP
          arrow-up
          4
          ·
          edit-2
          5 months ago

          I shall await the moment when AI pretends to be as confident about communicating that it can’t do something as it is about the opposite, because right now that looks like it’s my job somehow.

          • MagicShel@lemmy.zip
            English
            arrow-up
            1
            ·
            5 months ago

            Yeah, LLMs seem pretty unlikely to do that, though if they figure it out that would be great. That’s just not their wheelhouse. You have to know enough about what you’re attempting to ask the right questions and recognize bad answers. The thing you’re trying to do needs to be within your reach without AI or you are unlikely to be successful.

            I think the problem is more the over-promising of what AI can do (or people who don’t understand it at all making assumptions because it sounds human-like).