tln 1 day ago

The dragon image has issues like one eye, weird tail etc, but the pelican is imo perfect -- the best I've seen!

  • vunderba 1 day ago

    Yeah the dragon one is just a complete mess. The car is sideways but the WHEEL is oriented in a first-person perspective.

    Seems like a case of overfitting with regard to the thousands of pelican bike SVG samples on the internet already.

yrds96 1 day ago

I wonder if this has become such a well-known "benchmark" that models have already been trained on it.

  • Marciplan 1 day ago

    Every model release, Simon comes with his pelican and then this comment follows.

    Can we stop both? It's so boring.

    • refulgentis 1 day ago

      I really appreciate you speaking up. Happened yesterday on GPT Image 2, bit my tongue b/c people would see it as fun policing, and same thing today. And it happens on every. single. LLM. release. thread.

      It's disruptive to the commons, doesn't add anything to knowledge of a model at this point, and it's way out of hand when people are not only engaging with the original and creating screenfuls to wade through before on-topic content, but now people are creating the thread before it exists to pattern-match on the engagement they see for the real thing. So now we have 2x.

      • jszymborski 1 day ago

        No more disruptive than this comment. If you don't like it, downvote and move on. It's on topic and doesn't contradict the rules. The reason you see Simon's comment on the top is because people like it and upvote it.

        • refulgentis 1 day ago

          Our comments are no more disruptive, so we shouldn't write them. The other comments are at most as disruptive & fine.

          Something seems off when I combine those premises.

          You also make a key observation here: the root comment is fine and on-topic. The replies spin off into something that has nothing to do with the headline, only with the example in the comment. Makes it really hard to critique without coming across as fun police.

          Also, worth noting there's a distinction here, we're not in simonw's thread: we're in a brand new account's imitation of it.

      • Mashimo 1 day ago

        > and creating screenfuls to wade through before on-topic content,

        It's often just a single root comment that you can collapse.

        I find it interesting how SVG drawing skills improve over time. It's a very simple and very small data point, but I still find value in it.

  • HotHotLava 1 day ago

    Given that the pelican looks way better than the dragon, it almost seems like a certainty.

  • sietsietnoac 1 day ago

    Given the likeness of the sky between the 2 examples, the overall similarities and the fact that the pelican is so well done, there is zero doubt that the benchmark is in the training data of these models by now.

    That doesn't make it any less of an achievement given the model size or the time it took to get the results.

    If anything, it shows there's still much to discover in this field and things to improve upon, which is really interesting to watch unfold.