• 0 Posts
  • 42 Comments
Joined 2 years ago
Cake day: June 16th, 2023




  • I’ve found that as an ambient code completion facility it’s… interesting, but I don’t know if it’s useful or not…

    So on average, it’s totally wrong about 80% of the time, 19% of the time the first line or two is useful (either correct or close enough to fix), and 1% of the time it seems to actually fill in a substantial portion in a roughly acceptable way.

    It’s exceedingly frustrating and annoying, but I’m not sure I can call it a net loss in time.

    Reviewing each proposal for relevance, deciding where to cut it off, and editing it all add time to my workflow. Let’s say that on average, for a given suggestion, I spend 5% more time deciding whether to trash it, use it, or amend it than I would with no suggestion to evaluate in the first place. If the 20% of useful cases go 500% faster, then I come out ahead overall, though I’m annoyed 80% of the time. My guess as to whether a suggestion is even worth looking at also improves with experience: if I’m filling in a pretty boilerplate thing (e.g. taking some variables and starting to write out argument parsing), it has a high chance of a substantial match. If I’m doing something even vaguely esoteric, I just ignore the suggestions popping up.

    However, that 20% is still a problem, since I’m maybe too lazy and complacent: spending 100 milliseconds glancing at one word that looks right in review will sometimes fail me, compared to spending 2-3 seconds typing that same word out by hand.

    A 20% success rate, where I can fix up the useful part and discard the rest, works for code completion, but prompt-driven tasks have been so much worse for me that it is hard to imagine them being worth the trouble they bring.
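
    The break-even arithmetic in this comment can be sketched roughly as below. This is only an illustration under assumptions that are not in the comment itself: manual typing is normalized to 1.0 time units, the 5% review overhead applies to every suggestion, and "500% faster" is read as a 5x speedup. The function names are made up for the example.

```python
def expected_relative_time(p_useful, review_overhead, speedup):
    """Expected time per suggestion, relative to typing everything by hand (1.0)."""
    # Rejected suggestion: full manual effort plus the time spent reviewing it.
    time_rejected = 1.0 + review_overhead
    # Useful suggestion: the work goes `speedup` times faster (assumed reading
    # of "500% faster"), but the review time is still paid.
    time_useful = 1.0 / speedup + review_overhead
    return (1.0 - p_useful) * time_rejected + p_useful * time_useful


def break_even_useful_rate(review_overhead, speedup):
    """Useful-suggestion rate at which completion neither saves nor costs time."""
    return review_overhead / (1.0 - 1.0 / speedup)


# Numbers from the comment: 20% useful, 5% overhead, 5x speedup (assumed).
relative = expected_relative_time(p_useful=0.20, review_overhead=0.05, speedup=5.0)
print(f"relative time with suggestions: {relative:.2f}")
print(f"break-even useful rate: {break_even_useful_rate(0.05, 5.0):.4f}")
```

    Under those assumptions the result lands below 1.0, which matches the "I come out ahead overall" conclusion. The sketch does not model the cost of errors that slip through a quick glance, which is the concern raised about the 20%.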







  • To reinforce this, I just had a meeting with a software executive who has no coding experience but is nearly certain he’s going to lay off nearly all his employees, because he believes the value is all in the requirements he manages and he can feed those to a prompt just as well as any human can.

    He builds tutorial-fodder introductory applications and assumes all the work is that way. So he is confident that he will save the company a lot of money by laying off these obsolete computer guys and focusing on his “irreplaceable” insight. He’s convinced that all the negative feedback is just people trying to protect their jobs or people stubbornly refusing to get on board with new technology.


  • Now there’s models that reason,

    Well, no, that’s mostly a marketing term applied to expending more tokens on generating intermediate text. It’s basically writing a fanfic of what thinking on a problem would look like. If you look at the “reasoning” steps, you’ll see artifacts where the generated output goes disjoint: a passage is structurally sound but not logically connected to the bits around it.


  • The probabilities of our sentence structure are a consequence of our speech; we aren’t just trying to statistically match appropriate-sounding words.

    With enough use of LLMs, you will see how they are obviously not doing anything like conceptualizing the tokens they’re working with or “reasoning”, even when they are marketed as “reasoning”.

    Sticking to textual content generation by LLMs, you’ll see that what is emitted is first and foremost structurally appropriate; beyond that it’s mostly a “bonus” if it is narratively consistent, and an extra bonus if it also manages to be factually consistent. An example I saw from Gemini recently had it emit what sounded like an explanation of which action to pick, and then the sentence describing actually picking the action was the exact opposite of the explanation. Both were structurally sound and reasonable language, but there was no logical connection between the two portions of the emitted output in that case.



  • improved how we recognize and diagnose it.

    Well, we have at least changed how we recognize and diagnose it, but I’m not totally convinced it’s 100% an “improvement”. We’ve kind of jumbled up a whole bunch of people under a common umbrella and diluted the implications of the term, to the point where it gives you negligible practical information when someone is described as “autistic” or “on the spectrum”.


  • The thing that strikes me about both this story and the thing you posted is that the people in the Tesla seem to be like “this is fine” as the car does some pretty terrible stuff.

    In that one, the Tesla fails to honor a forced left turn, instead going straight into oncoming lanes and waggling about while other drivers honk at it, and the human just sits there without trying to intervene. Meanwhile they describe it as “navigation issue/hesitation”, which really understates what happened there.

    The train one didn’t come with video, but I can’t imagine just letting my car turn itself onto tracks and go 40 feet without thinking.

    If my Ford even thinks about going too close to another lane, I’m intervening, even if it was really going to be no big deal. I can’t imagine this level of “oh well”.

    Tesla drivers/riders are really nuts…


  • On the one hand, most of the incidents cited were in the face of a regime that also didn’t want to care. It’s just hard to ignore the circumstances if 3.5% of your people are out on the streets and most of the people off the streets likely agree with them.

    On the other hand, they base this on very few instances, so it’s hardly a statistical slam dunk. It’s vaguely supportive of some concepts, but anyone taking note of specific numbers is really overextending the research beyond what it can possibly say.


  • Based on the article, the claim that “no non-violent movement that has involved more than 3.5% of a population has ever failed” has the caveat of “we only look at 3 of them, and those 3 worked”.

    So their overall sample size is small, and the number of cases above the 3.5% mark is just 3. Further, those 3 had no idea someone in the vague future would retroactively measure their participation to declare it a rock solid threshold.

    I think the broader takeaway is that the number of people seems to matter more than the degree of violence, and violence seems to alienate people who might otherwise have participated.
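
    To put a rough number on how little 3 successes out of 3 can say, here is a minimal sketch of a one-sided exact (Clopper-Pearson) lower bound on a success rate when every observed case succeeded. It assumes, generously, that the cases are independent draws from one population, which historical movements are not; the function name and the 95% confidence level are choices made for the illustration, not something from the article.

```python
def lower_bound_all_successes(n, confidence=0.95):
    """One-sided exact lower bound on the true success rate,
    given that all n observed cases succeeded."""
    alpha = 1.0 - confidence
    # If the true success rate were p, seeing n successes out of n has
    # probability p**n. The bound is the p where that probability equals alpha.
    return alpha ** (1.0 / n)


# Compare the 3-case sample with hypothetical larger samples.
for n in (3, 10, 30):
    print(n, round(lower_bound_all_successes(n), 3))
```

    With n = 3 the bound comes out at only about 37%, so even under these friendly assumptions the data is compatible with such movements failing more often than not.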