DeepSeek or DeepFake

Distillation is basically "compressing" a model: you train a smaller "student" model to mimic the outputs of a bigger "teacher" model. Perhaps someone else's model. And that's far cheaper than training a model from scratch.
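To make that concrete, here's a minimal sketch of the classic soft-label distillation idea: the student is trained to match the teacher's temperature-softened probability distribution, typically via a KL-divergence loss. This is the textbook setup, not anything specific to DeepSeek; the function names and numbers below are mine.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature T > 1 softens the distribution, exposing the
    # teacher's "dark knowledge" about relative class similarities.
    z = logits / T
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between softened teacher and student distributions.
    # Training the student to minimize this is the core of distillation.
    p = softmax(np.asarray(teacher_logits, dtype=float), T)
    q = softmax(np.asarray(student_logits, dtype=float), T)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits
student = [3.5, 1.2, 0.4]   # hypothetical student logits
loss = distillation_loss(teacher, student)
# The loss shrinks toward 0 as the student's distribution matches the teacher's.
```

The key point for the cost argument: the teacher's outputs stand in for expensive human-labeled or curated data, so the student gets the benefit of the teacher's training run without paying for it.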

Here’s my post from a couple of days ago about why DeepSeek was cheap to make: because it used select datasets and OTHER MODELS.

OpenAI agrees. It claims DeepSeek was trained on OpenAI’s outputs. The way OpenAI was trained on the New York Times…?
