• 0 Posts
  • 28 Comments
Joined 1 year ago
Cake day: June 12th, 2023







  • Yeah, I generally agree there. And you’re right. Nobody knows if they’ll really be the starting point for AGI because nobody knows how to make AGI.

    In terms of usefulness, I do use it for knowledge retrieval and have a very good success rate with that. Yes, I have to double check certain things to make sure it didn’t make them up, but on the whole, GPT4 is right a large percentage of the time. Just yesterday I’d been Googling to find a specific law or regulation on whether airlines were required to refund passengers. I spent half an hour with no luck. ChatGPT with GPT4 pointed me to the exact document, down to the right subsection, on the first try. If you try that with GPT3.5 or really anything else out there, there’s a much higher rate of failure, and I suspect a lot of people who use the “it gets stuff wrong” argument probably haven’t spent much time with GPT4. Not saying it’s perfect: it still confidently says incorrect things and will even double down if you press it, but 4 is really impressive.

    Edit: Also agree, anyone saying LLMs are AGI or sentient or whatever doesn’t understand how they work.


  • As I see it, anybody who is not skeptical towards “yet another ‘world changing’ claim from the usual types” is either dumb as a doorknob, young and naive or a greedy fucker invested in it trying to make money out of any “suckers” that jump into that hype train.

    I’ve been working on AI projects on and off for about 30 years now. Honestly, for most of that time I didn’t think neural nets were the way to go, so when LLMs and transformers got popular, I was super skeptical. After learning the architecture and using them myself, I’m convinced they’re part of, but not the whole, solution to AGI. As they are now, yes, they are world changing. They’re capable of improving productivity in a wide range of industries. That seems pretty world changing to me. There are already products out there proving this (GitHub Copilot, Jasper, even ChatGPT). You’re welcome to downplay it and be skeptical, but I’d highly recommend giving it an honest try. If you’re right, then you’ll have more to back up your opinion, and if you’re wrong, you’ll have learned to use the tech and won’t be left behind.


  • extraordinary claims without extraordinary proof

    What are you looking for here? Do you want it to be self-aware, and anything less than that is hot garbage? The latest advances in AI have many uses. Sure, Bitcoin was overhyped and so is AI, but Bitcoin was always a solution with no problem. AI (as in AGI) offers literally a solution to all problems (or maybe the end of humans, but hopefully not, hah). The current tech, though, is widely useful. With GPT4 and GitHub Copilot, I can write good working code at multiple times my normal speed. It’s not going to replace me as an engineer yet, but it can enhance my productivity by a huge amount. I’ve heard similar from many others in different jobs.



  • Oh ok! Got it. I read it as you saying ChatGPT doesn’t use GPT 4. It’s still unclear what they used for part of it because of the bit before the part you quoted:

    For each of the 517 SO questions, the first two authors manually used the SO question’s title, body, and tags to form one question prompt and fed that to the Chat Interface [45] of ChatGPT.

    It doesn’t say if it’s 4 or 3.5, but I’m going to assume 3.5. Anyway, in the end they got the same result for GPT 3.5 that it gets on HumanEval, which isn’t anything interesting. Also, GPT 4 is much better, so I’m not really sure what the point is. Their stuff on the analysis of the language used in the questions was pretty interesting though.

    Also, thanks for finding their mention of 3.5. I obviously missed that in my skim.





  • Wait a second here… I skimmed the paper and GitHub and didn’t find an answer to a very important question: is this GPT3.5 or 4? There’s a huge difference in code quality between the two, and either they made a giant accidental omission or they are being intentionally misleading. Please correct me if I missed where they specified that. I’m assuming they were using GPT3.5, so yeah, those results would be as expected. On the HumanEval benchmark, GPT4 gets 67%, and that goes up to 90% with Reflexion prompting. GPT3.5 gets 48.1%, which is exactly what this paper is saying. (source).



  • Yeah, I think that’s a big part of it. I also wonder if people are getting tired of the hype and seeing every company advertise AI enabled products (which I can sort of get because a lot of them are just dumb and obvious cash grabs).

    At this point, it’s pretty clear to me that there’s going to be a shift in how the world works over the next 2 to 5 years, and people will have a choice of whether to embrace it or get left behind. I’ve estimated that for some programming tasks, I’m about 7 to 10x faster when using Copilot and ChatGPT4. I don’t see how someone who isn’t using AI could compete with that. And before anyone asks, I don’t think the error rate in the code is any higher.


  • I’ve been making the same or similar arguments as you are here in a lot of places. I use LLMs every day for my job, and it’s quite clear that beyond a certain scale, there’s definitely more going on than “fancy autocomplete.”

    I’m not sure what’s up with people hating on AI all of a sudden, but there seem to be quite a few who are confidently giving out incorrect information. I find it most amusing when they’re doing that at the same time as bashing LLMs for also confidently giving out wrong information.