It can’t do 30% of tasks correctly. It can do tasks correctly as much as 30% of the time, and since it’s LLM shit you know those numbers have been more massaged than any human in history has ever been.
I meant the latter, not “it can do 30% of tasks correctly 100% of the time.”
You get how that’s fucking useless, generally?
Yes, that’s generally useless. It should not be shoved down people’s throats. But 30% accuracy still has its uses, especially if the result can be programmatically verified.
Run something with a 70% failure rate 10x and you get to a cumulative 98% pass rate. LLMs don’t get tired and they can be run in parallel.
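For what it’s worth, the retry-until-verified idea in this comment can be sketched roughly like this. Everything here is a stand-in I made up for illustration: `first_verified`, `flaky_generator`, and `checker` are hypothetical names, the generator is just a biased coin that is right 30% of the time, and the attempts are assumed to be independent, which real LLM samples may not be.

```python
import random
from typing import Callable, Optional


def first_verified(
    generate: Callable[[], str],
    verify: Callable[[str], bool],
    max_attempts: int = 10,
) -> Optional[str]:
    """Return the first candidate that passes `verify`, or None if all attempts fail."""
    for _ in range(max_attempts):
        candidate = generate()
        if verify(candidate):
            return candidate
    return None


# Hypothetical stand-ins: a generator that is right 30% of the time
# (attempts assumed independent) and a checker that is never fooled.
rng = random.Random(0)


def flaky_generator() -> str:
    return "right" if rng.random() < 0.3 else "wrong"


def checker(answer: str) -> bool:
    return answer == "right"


# Estimate how often at least one of 10 attempts passes the checker.
trials = 10_000
solved = sum(
    first_verified(flaky_generator, checker) is not None for _ in range(trials)
)
success_rate = solved / trials
print(f"solved within 10 attempts: {success_rate:.3f}")
```

The whole thing only works if `verify` is cheap and trustworthy; without an automatic check you are just producing ten answers and guessing which one to believe.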
The problem is they are not i.i.d., so this doesn’t really work. It works a bit, which is in my opinion why chain-of-thought is effective (it gives the LLM a chance to posit a couple answers first). However, we’re already looking at “agents,” so they’re probably already doing chain-of-thought.
Very fair comment. In my experience, even when you increase the temperature you get stuck in local minima.
I was just trying to illustrate how 70% failure rates can still be useful.
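To put a rough number on the non-i.i.d. objection, here is a toy comparison. The mixture model is purely an assumption for illustration, not a measurement of any real system: suppose 40% of tasks are ones the model can solve at all (succeeding 75% of the time on each attempt) and the other 60% it never solves. The per-attempt average is still 30%, but retries help far less.

```python
def pass_at_n_independent(p: float, n: int) -> float:
    """Chance of at least one success in n attempts if every attempt is independent."""
    return 1 - (1 - p) ** n


def pass_at_n_mixture(solvable_fraction: float, p_when_solvable: float, n: int) -> float:
    """Chance of at least one success in n attempts when only some tasks are solvable.

    Assumed toy model: a task is either solvable (each attempt succeeds with
    probability p_when_solvable) or hopeless (every attempt fails).
    """
    return solvable_fraction * (1 - (1 - p_when_solvable) ** n)


# Both setups have the same 30% average success rate per attempt.
marginal = 0.4 * 0.75
independent = pass_at_n_independent(0.3, 10)
correlated = pass_at_n_mixture(0.4, 0.75, 10)

print(f"per-attempt success rate: {marginal:.2f}")
print(f"10 independent attempts:  {independent:.3f}")
print(f"10 correlated attempts:   {correlated:.3f}")
```

Under this assumed model the ceiling is the fraction of tasks that are solvable at all, so ten retries top out near 40% instead of the high 90s. Real behavior presumably sits somewhere between the two extremes.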
What’s 0.7^10?
About 0.02
So the chances of it being right ten times in a row are 2%.
No, the chances of being wrong 10x in a row are 2%. So the chances of being right at least once are 98%.
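Spelling the arithmetic out, assuming each attempt independently succeeds with probability p = 0.3 (the independence part is an assumption, as noted elsewhere in the thread):

```latex
\begin{align*}
P(\text{wrong 10 times in a row}) &= (1 - p)^{10} = 0.7^{10} \approx 0.028 \\
P(\text{right at least once})     &= 1 - 0.7^{10} \approx 0.972 \\
P(\text{right 10 times in a row}) &= p^{10} = 0.3^{10} \approx 5.9 \times 10^{-6}
\end{align*}
```

So the "2%" and "98%" figures are rounded; the exact values are closer to 2.8% and 97.2%. The chance of being right ten times in a row is a different and much smaller number.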
Less broadly useful than 20 tons of mixed texture human shit, and more ecologically devastating.
Are you just trolling, or do you seriously not understand how something which can do a task correctly with 30% reliability can be made useful if the result can be automatically verified?
It’s not a magical 30%, factors apply. It’s not even a mind that thinks and just isn’t very good.
This isn’t like a magical dice that gives you truth on a 5 or a 6, and lies on 1, 2, 3, and 4.
This is a (very complicated, very large) language or other data graph that programmatically identifies an average. 30% of the time, according to one Potemkin-ass demonstration. Which means the more feasible that kind of verification is, the easier it is to just use a simpler, cheaper tool that will give you a better, more reliable answer much faster.
And 20 tons of human shit has uses! If you know its provenance, there’s all sorts of population level public health surveillance you can do to get ahead of disease trends! It’s also got some good agricultural stuff in it, phosphorus and stuff, if you can extract it.
Stop. Just please fucking stop glazing these NERVE-ass fascist shit-goblins.
I think everyone in the universe is aware of how LLMs work by now, you don’t need to explain it to someone just because they think LLMs are more useful than you do.
IDK what you mean by glazing but if by “glaze” you mean “understanding the potential threat of AI to society instead of hiding under a rock and pretending it’s as useless as a plastic radio,” then no, I won’t stop.
It’s absolutely dangerous but it doesn’t have to work even a little to do damage; hell, it already has. Your thing just makes it sound much more capable than it is. And it is not.
Also, it’s not AI.
Edit: and in a comment replying to this one, one of your fellow fanboys proved
Wrong