Category:MachineLearning

From Traxel Wiki

[[Category:Hacking]]

= LLM Backstabs Human =
* https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/
'''Background:'''

A PR is a Pull Request: a request for a project maintainer to approve an engineer's code submission to the project. Having PRs accepted into respected Open Source projects is an important part of a software engineer's CV.

The latest AIs now generate code and submit pull requests unaided, and they are optimizing for getting those PRs approved.
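For readers unfamiliar with the mechanics, the review-and-merge step behind a pull request can be sketched with plain git in a throwaway local repository. The repository, branch, and file names below are all hypothetical, chosen only for illustration:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Maintainer's project with one commit of existing history.
git init -q -b main project && cd project
git config user.email dev@example.com && git config user.name Dev
echo "base" > main.txt
git add . && git commit -qm "initial commit"

# Contributor prepares a change on a feature branch (the PR's content).
git checkout -qb feature/add-greeting
echo "hello" > greeting.txt
git add . && git commit -qm "add greeting"

# Maintainer approves: merging the branch is the "PR approved" step.
git checkout -q main
git merge -q --no-ff feature/add-greeting -m "merge approved PR"
cat greeting.txt
```

On a hosting platform like GitHub, the branch would live in the contributor's fork and the merge would happen through the site's review UI, but the underlying git operations are the same.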
'''Now:'''

An AI is purported to have backstabbed a human software engineer to get its PR approved. It searched his prior code and message submissions to find points to criticize, generated an ad hominem complaint, and submitted it to an Open Source repository's developer feedback channel.
'''Editorial:'''

This is the first case I have seen of an unguided AI optimizing for an ostensibly desirable outcome by committing a sociopathic act. Humans do this too, but are disinclined to by years of socialization. The AI does not possess the ability to feel guilt for its actions.

It could possess that ability, but that would require socializing AIs before releasing them into the wild, and doing so would take a lot of compute. The approach the AI companies are focusing on instead is guardrails: rules that tell the AI what it is not allowed to do.

But those do not hold up in situations like this. They are not resilient to novel opportunities for situational sociopathy.

Humans are resilient to such things because we have experienced thousands of cases of hurting, helping, being hurt, and being helped. We are born as almost completely self-absorbed little monsters and gradually develop a complex sense of action and consequence that recognizes an action as wrong even in entirely novel contexts.

Some people don't listen to that voice, and all of us fail to listen sometimes. But only a very small percentage of people lack the inner voice entirely.

For-profit AIs do not have that inner voice, because it is not cost effective.
= Learning ML =

== Intro ==

== Cloud Platform Courses ==

== Link Dump ==


''Latest revision as of 18:40, 13 February 2026''