

I’m not sure I can give a satisfying answer. There are a lot of moving parts here, and a big issue is definitions, which you also touch upon with your reference to Searle.
I agree with the sentiment that there must be some objective measure of reasoning ability. To me, reasoning is more than following logical rules. It’s also about interpreting the intent of the task. The reasoning models are very sensitive to initial conditions and tend to drift when the question is not super precise or when they don’t have sufficient context.
The AI models are, in a sense, very fragile with respect to their input. Organic intelligence, on the other hand, is resilient and also heuristic. I don’t have a specific idea for the test, but it should test the ability to solve a very ill-posed problem.
I think that’s a very generous use of the word “superintelligent”. They aren’t anything like what I associate with that word anyhow.
I also don’t really think they are knowledge retrieval engines. I use them extensively in my daily work, for example to write emails and generate ideas. But when it comes to facts they are flaky at best. It’s more of a free association game than knowledge retrieval IMO.