• 276 Posts
  • 13 Comments
Joined 1 year ago
Cake day: June 9th, 2023

  • Going by my own statistics of how many articles I feel are worth posting/linking on Lemmy, the most direct alternative to Kotaku is Eurogamer. PCGamer, PCGamesN and Rock Paper Shotgun are occasionally OK, but you have to cut through a lot of spam and clickbait (i.e. exactly this “50 guides per week” type of corporate guidance). Not sure if this is also the state that Kotaku will end up in. The Verge sometimes also has good articles, but the flood of gadget-consumerism articles there is obnoxious.

  • The original comment is dismissive and clearly meant to be trivializing of the capacity of LLMs.

    The trivializing is clearly your personal interpretation. In my response, I was even careful to separate the arguments about the autoregressive structure of LLMs from those about training for plausibility vs. truthfulness.

    You’re the one being dishonest in your response. Your whole post, and a large class of arguments about the capacity of these systems rest on it is designed to do something

    My “whole post” is evidently not all about capacity. It had five paragraphs; only a single one discussed model capacity, versus, for instance, two about the loss functions. So who is being “dishonest” here?

    […] emergent behavior exists. Is that the case here? Maybe.

    So you have zero proof but still happily conjecture that “emergent behavior” exists, without caring to elaborate on how you would even demonstrate it. How unsurprising.

    “Emergent behavior” is a worthless claim if the company that trains the model is now even secretive about what training data was used. Moreover, it has become known through research that OpenAI nowadays basically overtrains straight away on books (notably copyrighted ones, which explains the secrecy) to make their LLM sound “smart.”

    https://www.theregister.com/2023/05/03/openai_chatgpt_copyright/


  • …as if saying that somehow makes what chatGPT does trivial.

    That is moving the goalpost. @RickyRigatoni is quite correct that the structure of an autoregressive LLM like (Chat)GPT is, well, autoregressive, i.e. it predicts the next word. That is not a statement about triviality until you shift the goalpost.

    What genuinely got lost in the conversation is that the loss function of an LLM is not truthfulness. The loss function is, for the most part and as you noted below, “coherence,” i.e. whether the output could have been a plausible completion of the text. Only with RLHF is there some weak guidance toward truthfulness, and it is far weaker than the training signal for pure plausibility.
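
    To make that concrete, here is a minimal toy sketch (plain Python, invented numbers, no real model) of what the autoregressive pretraining loss actually looks at: the cross-entropy of the next token against whatever token follows in the corpus. Nothing in it checks factual truth.

        import math

        # Hypothetical model output: probabilities for the token following
        # "The capital of Australia is" (numbers invented for illustration).
        predicted = {"Sydney": 0.60, "Canberra": 0.30, "Melbourne": 0.10}

        def next_token_loss(probs, actual_next_token):
            """Cross-entropy contribution of one position: -log p(actual token)."""
            return -math.log(probs[actual_next_token])

        # If the training text happens to continue with "Sydney", the plausible
        # but false continuation gets the *lower* loss, i.e. is rewarded:
        print(next_token_loss(predicted, "Sydney"))    # ~0.51
        print(next_token_loss(predicted, "Canberra"))  # ~1.20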

    You’re never going to get coherent text from autocomplete and nor can it understand any arbitrary English phrase.

    Because those are small models. GPT-3 was already trained on a text volume that would require well over 100 years of reading by a human, which is a good amount of data for building the statistical model, but a ridiculous basis for any sign of “intelligence” or of “knowing” what is correct.
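
    As a back-of-envelope check (my own assumptions, not from the thread: roughly 300 billion training tokens as reported for GPT-3, ~0.75 words per token, ~250 words per minute of reading), “over 100 years” is if anything an understatement:

        # Rough reading-time estimate under the assumptions stated above.
        tokens = 300e9                      # reported GPT-3 training token count
        words = tokens * 0.75               # ~225e9 words
        minutes = words / 250               # nonstop reading at 250 words/min
        years = minutes / (60 * 24 * 365)
        print(f"~{years:,.0f} years of nonstop reading")  # on the order of 1,700 years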

    Also, “coherence” is not the goal of a normal input autocomplete, which simply offers the next word ranked by frequency and does not play “the long game” of reaching coherence (e.g. working in a few rarer words to keep the text flowing). Though both are autoregressive, the training losses are absolutely not the same.
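
    For contrast, a frequency-ranked autocomplete can be sketched in a few lines (toy corpus, greedy pick, purely for illustration); there is simply no term in it that pushes the suggestion toward a coherent sentence:

        from collections import Counter, defaultdict

        corpus = "the cat sat on the mat the cat ate the fish".split()

        # Count which word most often follows each word (bigram frequencies).
        follows = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            follows[prev][nxt] += 1

        def autocomplete(word):
            # Greedy pick of the most frequent next word: no look-ahead,
            # no "long game", no whole-sequence plausibility.
            candidates = follows.get(word)
            return candidates.most_common(1)[0][0] if candidates else None

        print(autocomplete("the"))  # -> 'cat'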

    And if you had not veered off-topic from text generation with your 1970s reference, you might know that the Turing test was demonstrably passable back then even without neural networks, let alone plausible text generation:

    https://en.wikipedia.org/wiki/PARRY