Introducing LLMTest: The Deterministic Benchmark for True AI Reasoning

🚀 Introducing LLMTest: The Infinite AI Reasoning Test

We’re excited to announce LLMTest, a new program designed to evaluate the core intelligence modes of Large Language Models (LLMs). This isn’t your standard comprehension test; it’s a deep dive into an LLM’s capacity for pure reasoning, independent of its vast training data.

👉 Visit the web application now: llmtest-online


What is LLMTest?

LLMTest uses a linear-time algorithm to generate infinitely many unique test instances. Each test is a simple logical puzzle presented in a token-efficient format, built on a domain and steps that come naturally to a computer agent.
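
To make the idea of linear-time generation concrete, here is a minimal sketch of a seeded puzzle generator. It is a toy illustration only: the swap puzzle, the `Puzzle` type, and the `generatePuzzle` function are hypothetical and do not reflect LLMTest's actual puzzle format or the `node-llm-test` API. Each step costs constant time, so generation grows linearly with the number of steps, and a given seed always reproduces the same instance.

```typescript
// Toy illustration only: NOT LLMTest's puzzle format or API.
interface Puzzle {
  prompt: string; // compact text shown to the model
  answer: string; // the single correct answer
}

// Small seeded PRNG (mulberry32): the same seed yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds a puzzle in O(steps): one constant-time swap per step.
function generatePuzzle(seed: number, steps: number): Puzzle {
  const rand = mulberry32(seed);
  const slots = ["A", "B", "C", "D"]; // contents of slots 0..3
  const lines: string[] = ["slots: A B C D"];
  for (let i = 0; i < steps; i++) {
    const x = Math.floor(rand() * slots.length);
    // Offset of 1..3 from x, so y is always a different slot.
    const y = (x + 1 + Math.floor(rand() * (slots.length - 1))) % slots.length;
    [slots[x], slots[y]] = [slots[y], slots[x]];
    lines.push(`swap ${x} ${y}`);
  }
  lines.push("question: what is in slot 0?");
  return { prompt: lines.join("\n"), answer: slots[0] };
}

// Example: print a five-step instance for seed 7.
console.log(generatePuzzle(7, 5).prompt);
```

Different seeds will typically produce different instances, although this toy version makes no uniqueness guarantee. The answer is tracked while the puzzle is built, so no separate solver is needed to know the expected result.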

The design is deliberate:

  • It requires reasoning, not computation.
  • It eliminates reliance on memorization or prior training.
  • The solution is a deterministic answer: it’s either correct or wrong, providing unambiguous results.
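
Because each puzzle has exactly one correct answer, grading can be a plain comparison rather than a rubric or a judge model. The sketch below shows one way that could look. The `grade` and `passRate` helpers are hypothetical names for illustration and are not part of the `node-llm-test` package. The whitespace and case normalisation is also an assumption about what a lenient comparison might allow.

```typescript
// Hypothetical grader for illustration: NOT the node-llm-test API.

// Assumed normalisation: ignore surrounding whitespace and letter case.
function normalise(text: string): string {
  return text.trim().toLowerCase();
}

// A reply is either correct or wrong; there is no partial credit.
function grade(expected: string, modelReply: string): boolean {
  return normalise(expected) === normalise(modelReply);
}

// Fraction of correct replies across a batch of graded puzzles.
function passRate(results: boolean[]): number {
  if (results.length === 0) return 0;
  const passed = results.filter((ok) => ok).length;
  return passed / results.length;
}

// Example batch: three replies checked against their expected answers.
const batch = [grade("B", " b\n"), grade("B", "C"), grade("D", "D")];
console.log(passRate(batch));
```

A stricter setup could drop the normalisation and require a byte-for-byte match; either way, the result for each puzzle stays binary.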

The puzzles may be simple or complex, involving few or many steps, and can be configured to permit simple tool calls. In each case, the model has to ‘think’ its way to the solution rather than rely on statistical pattern matching.


See It in Action

Ready to put an LLM’s reasoning to the test? The best way to understand the puzzles is to interact with them yourself.

👉 Visit the web application now: llmtest-online


📰 Stay Informed

Don’t want the hassle of continually running tests? Stay up-to-date with the performance of leading LLM agents without spinning up your own infrastructure.

📧 Sign up for the periodic newsletter containing the latest test results: llmtest@snowdon.dev


💻 Installation & Use

For developers and researchers who want to run the tests locally, the package is available on the NPM registry.

  • Install with NPM:
    npm install node-llm-test
    
  • Install globally as a command:
    npm install --global node-llm-test
    

Note on Commercial Use: Commercial use of the code package requires permission. Please contact hello@snowdon.dev if you intend to use it for such purposes. The web app, however, is freely available for your convenience.


📒 We Welcome Feedback

Encounter a bug or have a suggestion? Issue reports are always welcome. Just let us know!