12/18/2024 | Press release | Distributed by Public on 12/18/2024 09:43
High-Performance Agent Surpasses Leading AI Model in Accuracy, Speed, and Cost Efficiency
Vancouver, BC - 16 December 2024 - (GLOBE NEWSWIRE) - VERSES AI Inc. (CBOE:VERS) (OTCQB:VRSSF) ("VERSES" or the "Company"), a cognitive computing company, today revealed performance highlights of its flagship product Genius winning the code-breaking game Mastermind in a side-by-side comparison with a leading generative AI model, OpenAI's o1-preview, which is positioned as an industry-leading reasoning model. Over one hundred test runs, Genius consistently outperformed o1-preview, solving games about one hundred and forty (140) times faster and at more than five thousand (5,000) times lower cost.
"Today we're showcasing Genius' advanced reasoning performance against state-of-the-art deep learning-based methods that LLMs are based on," said Hari Thiruvengada, VERSES Chief Technology Officer. "Mastermind was the perfect choice for this test because it requires reasoning through each step logically, predicting the cause-and-effect outcomes of its decisions, and dynamically adapting to crack the code. This exercise demonstrates how Genius outperforms tasks requiring logical and cause-effect reasoning, while exposing the inherent limitations of correlational language-based approaches in today's leading reasoning models.
"This is just a preview of what's to come. We're excited to show how additional reasoning capabilities, available in Genius today and demonstrated with Mastermind, will be further showcased in our upcoming Atari 10k benchmark results," Thiruvengada continued.
The comparison involved 100 games of Mastermind, a reasoning task requiring the models to deduce a hidden code through logical guesses informed by feedback hints. Key metrics included success rate, computation time, number of guesses, and total cost.
In the exercise, VERSES compared OpenAI's advanced reasoning model o1-preview to Genius. Each model played 100 games of Mastermind, with up to ten guesses per game to crack the code. After each guess the model is given a hint and must reason about the missing part of the correct answer; all four code positions must be correct to crack the code. For perspective, you can play the game at mastermindgame.org.
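For readers unfamiliar with the game, the hint mechanics can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard Mastermind rules (one count for pegs of the right color in the right position, one for pegs of the right color in the wrong position) and the 4-position, 6-color setup described in this release. The function names and the integer color encoding are hypothetical, and this is not the test harness VERSES used.

```python
from collections import Counter

NUM_POSITIONS = 4  # code length stated in the release
NUM_COLORS = 6     # colors encoded as integers 0..5 (an assumption for this sketch)


def score_guess(secret, guess):
    """Return (exact, color_only) feedback for one guess.

    exact:      pegs with the right color in the right position
    color_only: pegs with a right color but in the wrong position
    """
    if len(secret) != NUM_POSITIONS or len(guess) != NUM_POSITIONS:
        raise ValueError("codes must have exactly NUM_POSITIONS entries")
    exact = sum(s == g for s, g in zip(secret, guess))
    # Colors shared by both codes regardless of position (multiset intersection).
    shared = sum((Counter(secret) & Counter(guess)).values())
    return exact, shared - exact


def is_solved(secret, guess):
    """A game is won only when every position matches."""
    return score_guess(secret, guess) == (NUM_POSITIONS, 0)
```

For example, `score_guess((0, 1, 2, 3), (0, 2, 1, 5))` returns `(1, 2)`: one peg is exactly right and two more colors are present but misplaced.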
A highlight of the results is below. You can find a more detailed description and results of the tests on our blog at verses.ai.
The exercise: VERSES' team conducted 100 games for each AI model, using the same secret code parameters: 4 positions and 6 possible colors. Results were measured by success rate, computation time, number of guesses, and total cost. The comparison is summarized below:
Metric | Genius™ | o1-preview
Success Rate | 100% | 71% (29% fail rate)
Total Compute Time | 5 minutes, 18 seconds (Avg 3.1s per game) | 12.5 hours (Avg 345s per game)
Total Cost for 100 Games | $0.05 USD (est.) | $263 USD
Hardware Requirements | Standard laptop (M1) | GPU-based Cloud
Performance Highlights:
● Accuracy and Reliability. Genius solved the code every time in a consistent number of steps.
● Speed. Genius consistently solved games in 1.1-4.5 seconds, while o1-preview's solve times ranged from 7.9 to 889 seconds (approximately 15 minutes).
● Efficiency. Genius' total compute time for 100 games was just over 5 minutes, compared to o1-preview's 12.5 hours.
● Cost. Genius' compute cost was estimated at $0.05 USD for all 100 games, compared to $263 USD for o1-preview.
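The headline speed and cost multipliers can be checked against the reported totals. The sketch below uses only the figures stated in this release; the variable names are illustrative.

```python
# Totals as reported in the release.
genius_seconds = 5 * 60 + 18   # 5 minutes, 18 seconds
o1_seconds = 12.5 * 3600       # 12.5 hours
genius_cost_usd = 0.05         # estimated by VERSES
o1_cost_usd = 263.0

# Ratios of o1-preview's totals to Genius' totals.
speed_ratio = o1_seconds / genius_seconds
cost_ratio = o1_cost_usd / genius_cost_usd

print(f"speed ratio: {speed_ratio:.1f}x")
print(f"cost ratio: {cost_ratio:.0f}x")
```

The speed ratio comes out to roughly 141.5 and the cost ratio to 5,260, in line with the rounded "140 times faster" and "more than 5,000 times cheaper" figures.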
In summary, Genius solved Mastermind 100% of the time and was roughly 140 times faster and 5,260 times cheaper than o1-preview.
"These impressive results highlight a critical gap in today's AI landscape: the limitations of language-based models like OpenAI's o1 to handle logical reasoning tasks precisely and reliably," said Gabriel René, founder and CEO of VERSES. "Mastermind code-breaking is an indicative test that showcases the class of logical reasoning and understanding of cause and effect needed for real-world applications like cybersecurity, fraud detection, and financial forecasting - domains where causality, accuracy, and efficiency are non-negotiable. Genius not only excels at these tasks but does so faster, cheaper, and with unparalleled consistency, making it ideal for addressing complex business challenges."
Mastermind™ is a registered trademark of Pressman Inc.
About VERSES
VERSES is a cognitive computing company building next-generation intelligent software systems modeled after the wisdom and genius of Nature. Designed around first principles found in science, physics and biology, our flagship product, Genius, is a suite of tools for machine learning practitioners to model complex dynamic systems and generate autonomous intelligent agents that continuously reason, plan, and learn. Imagine a Smarter World that elevates human potential through technology inspired by Nature. Learn more at verses.ai, LinkedIn, and X.
On behalf of the Company
Gabriel René, Founder & CEO, VERSES AI Inc.
Press Inquiries: [email protected]
Investor Relations Inquiries
U.S., Matthew Selinger, Partner, Integrous Communications, [email protected] 415-572-8152
Canada, Leo Karabelas, President, Focus Communications, [email protected] 416-543-3120
Cautionary Note Regarding Forward-Looking Statements
When used in this press release, the words "estimate", "project", "belief", "anticipate", "intend", "expect", "plan", "predict", "may" or "should" and the negative of these words or such variations thereon or comparable terminology are intended to identify forward-looking statements and information. Although VERSES believes, in light of the experience of their respective officers and directors, current conditions and expected future developments and other factors that have been considered appropriate, that the expectations reflected in the forward-looking statements and information in this press release are reasonable, undue reliance should not be placed on them because the parties can give no assurance that such statements will prove to be correct. The forward-looking statements and information in this press release include, among other things, statements regarding the Company's goals and plans for future testing of Genius, including the Atari 10K benchmark.
There are risks and uncertainties that may cause actual results to differ materially from those contemplated in those forward-looking statements and information. In making the forward-looking statements in this news release, the Company has applied various material assumptions. By their nature, forward-looking statements involve known and unknown risks, uncertainties and other factors which may cause our actual results, performance or achievements, or other future events, to be materially different from any future results, performance or achievements expressed or implied by such forward-looking statements. There are a number of important factors that could cause VERSES' actual results to differ materially from those indicated or implied by forward-looking statements and information. Such factors may include, among other things, the ability of the Company to complete further testing of Genius as anticipated, or at all, and that such further testing will achieve the intended results. The Company undertakes no obligation to comment on analyses, expectations or statements made by third parties in respect of its securities or its financial or operating results (as applicable).
Additionally, forward-looking statements involve a variety of known and unknown risks, uncertainties and other factors which may cause the actual plans, intentions, activities, results, performance or achievements of the Company to be materially different from any future plans, intentions, activities, results, performance or achievements expressed or implied by such forward-looking statements. Such risks include, without limitation: the risk that the Company will be unable to complete further testing of Genius as anticipated, or at all; and risks that the Company will not achieve the intended results in such further testing. VERSES cautions that the foregoing list of material factors is not exhaustive. When relying on VERSES' forward-looking statements and information to make decisions, investors and others should carefully consider the foregoing factors and other uncertainties and potential events. VERSES has assumed that the material factors referred to in the previous paragraph will not cause such forward-looking statements and information to differ materially from actual results or events. However, the list of these factors is not exhaustive and is subject to change and there can be no assurance that such assumptions will reflect the actual outcome of such items or factors. The forward-looking information contained in this press release represents the expectations of VERSES as of the date of this press release and, accordingly, is subject to change after such date. VERSES does not undertake to update this information at any particular time except as required in accordance with applicable laws.