zoi (12:52:50) they just learn to not reason about the cheating
zoi (12:52:36) so with these reasoning LLMs, when the reasoning is used to penalize them in training whenever they decide to cheat on a task
Machforr (12:48:50) or something
Machforr (12:48:40) #freeoneliner
Machforr (12:48:31) nectaz o/
Rapture (12:40:43)
faraday (12:39:50) i don't care
zoner (12:39:48) It was indeed rude on arrakis' part.
faraday (12:39:44) lol
Rapture (12:39:25) yes. it feels laggy too, to post sth. hello faragay <- so rude from arra!