Couldn’t they just insert a preprocessor that looks for variants of “Thank you” against a list, and returns “You’re welcome” without running it through the LLM?
If I understand correctly this is essentially how condensed models like Deepseek work and how they’re able to attain similar performance on much cheaper hardware. If all still goes through the LLM but LLM is a lot lighter because it has this sort of thing built in. That’s all a vast oversimplification.
Couldn’t they just insert a preprocessor that looks for variants of “Thank you” against a list, and returns “You’re welcome” without running it through the LLM?
If I understand correctly this is essentially how condensed models like Deepseek work and how they’re able to attain similar performance on much cheaper hardware. If all still goes through the LLM but LLM is a lot lighter because it has this sort of thing built in. That’s all a vast oversimplification.