Unfortunately, that’s not very clear without more detail. What kind of reward model are they talking about?
This is potentially a 1000x difference in required resources, assuming you believe DeepSeek’s quoted figure for spending, so it would have to be an extraordinary change.