166 changes: 158 additions & 8 deletions benchmarks.shtml
@@ -50,19 +50,169 @@ Released for free under a Creative Commons Attribution 3.0 License
<div class="content_resize">
<div class="mainbar">
<h2>Benchmarks</h2>
<h3>Availability</h3>
<p>
The latest release of the SMT-LIB benchmark library is hosted on Zenodo under the <a href="https://zenodo.org/communities/smt-lib">SMT-LIB community</a>.
The SMT-LIB initiative provides a large collection of SMT-LIB benchmarks.
These benchmarks are used for research on SMT solving and by the SMT Competition.
</p>
<p>
The latest release of the SMT-LIB benchmark library is hosted on Zenodo
under the <a href="https://zenodo.org/communities/smt-lib">SMT-LIB community</a>.
All previous releases from 2023 onward are also available there.
</p>

<h3>StarExec</h3>
<p>Historical releases of the benchmark library, from 2013 to 2023, are available on <a href="https://www.starexec.org/starexec/secure/explore/spaces.jsp?id=239">StarExec</a>.
<p>
Some benchmarks that are not compliant with the SMT-LIB standard, such as
those that use theories that are not yet specified, are available on
<a href="https://github.com/SMT-LIB/pending-benchmarks.git">GitHub</a>.
</p>
<h3>Submitting Benchmarks</h3>
<p>For details on how to submit benchmarks to SMT-LIB, please refer to the README file in
the <a href="https://github.com/SMT-LIB/benchmark-submission">benchmark-submission</a>
SMT-LIB GitHub repository.
<h4>Benchmark Catalog</h4>
<p>
To make the benchmark collection more accessible, we provide an SQLite
database that collects benchmark metadata including past competition
results. This database and its documentation can also be found on
<a href="https://zenodo.org/records/16290040">Zenodo</a>.
</p>
<h4>StarExec</h4>
<p>Historical releases of the benchmark library, from 2013 to 2023, were
available on <a href="https://www.starexec.org/starexec/secure/explore/spaces.jsp?id=239">StarExec</a>.
</p>
<h3>Submitting Benchmarks</h3>
<p>
The SMT-LIB benchmark collection would not exist without the countless
benchmarks submitted by users and developers of SMT solvers.
We welcome all kinds of benchmarks, from individual benchmarks that
test semantic edge cases to large benchmark sets generated from
verification problems.
</p>
<p>
For details on how to submit benchmarks to SMT-LIB, please refer to the README file in
the <a href="https://github.com/SMT-LIB/benchmark-submission">benchmark-submission</a>
SMT-LIB GitHub repository.
</p>
<h3>Archive Structure</h3>
<p>
On Zenodo, benchmarks are grouped in two collections: <a href="https://zenodo.org/records/16740866">non-incremental</a> and <a href="https://zenodo.org/records/15493096">incremental</a> benchmarks.
The incremental benchmarks contain more than one call to <code>check-sat</code>.
After extracting a benchmark archive, the benchmarks are stored in
folders with the structure
</p>
<pre><code>[non-]incremental/&lt;logic&gt;/&lt;set-name&gt;/.../&lt;name&gt;.smt2</code></pre>
<p>
Non-incremental benchmarks are always separated from the incremental
benchmarks, even if a benchmark set consists mostly of incremental ones.
</p>
<p>
The benchmarks are separated according to their logic.
</p>
<p>
The <code>&lt;set-name&gt;</code> is either
<code>&lt;date&gt;-&lt;application&gt;-&lt;submitter&gt;</code>,
<code>&lt;date&gt;-&lt;application&gt;</code>, or
<code>&lt;date&gt;-&lt;submitter&gt;</code>. The
<code>&lt;date&gt;</code> is written as <code>YYYYMMDD</code>.
The application or submitter strings are typically chosen by the
person submitting the benchmarks.
</p>
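<p>
As a purely illustrative example (all names invented), a benchmark set
submitted on 1 April 2023 for a bounded model checking application by a
submitter named Mustermann could be stored under
</p>
<pre><code>non-incremental/QF_BV/20230401-BMC-Mustermann/queue/enqueue_01.smt2</code></pre>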
<p>
Benchmarks can be nested within a deeper directory structure below
the set directory. This nesting should represent the structure
of the benchmarks and is application dependent.
</p>
<p>
Benchmarks are always valid SMT-LIB2 files. They contain at least
one <code>check-sat</code> command and end with an <code>exit</code>
command.
</p>
<p>
Benchmarks contain one <code>set-info :status</code> command for each
<code>check-sat</code> command. In non-incremental benchmarks, the
<code>set-info :status</code> command is part of the header. In
incremental benchmarks, each <code>set-info :status</code> command is
placed on the line directly above the corresponding
<code>check-sat</code> command.
</p>
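<p>
As a hypothetical illustration (logic and assertions invented for the
example), an incremental benchmark with two <code>check-sat</code>
calls could look as follows:
</p>
<pre><code>
(set-info :smt-lib-version 2.6)
(set-logic QF_LIA)
(set-info :source |An invented example for illustration.|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &quot;crafted&quot;)
(declare-const x Int)
(assert (&gt; x 0))
(set-info :status sat)
(check-sat)
(assert (&lt; x 0))
(set-info :status unsat)
(check-sat)
(exit)
</code></pre>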
<h3>Benchmark Header</h3>
<p>
All benchmarks contain a metadata header.
For new benchmarks, the header must have the following structure; for
older benchmarks, some fields may be missing.
</p>
<pre><code>
(set-info :smt-lib-version &lt;version&gt;)
(set-logic &lt;logic&gt;)
(set-info :source |&lt;description&gt;|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &lt;category&gt;)
(set-info :status &lt;status&gt;)
</code></pre>
<p>
where:
<ul>
<li><code>&lt;version&gt;</code> is the SMT-LIB version number, usually 2.6 or 2.7.</li>
<li><code>&lt;logic&gt;</code> is one of the accepted SMT-LIB logics.</li>
<li><code>&lt;description&gt;</code> is a textual description that provides additional metadata. See below for a description of this field.</li>
<li><code>&lt;category&gt;</code> is either <code>"crafted"</code>, indicating that it was hand-made,
<code>"random"</code>, indicating that it was generated randomly,
or <code>"industrial"</code> (everything else).</li>
<li><code>&lt;status&gt;</code> is either <code>sat</code> or <code>unsat</code> according to the status of the benchmark, or <code>unknown</code> if not known.</li>
</ul>
</p>
<p>
Most benchmarks are distributed under the <a
href="https://creativecommons.org/licenses/by/4.0/">Creative Commons
Attribution 4.0 International License</a>, but contributors can specify
their own license in the benchmark itself using the
<code>(set-info :license "license string")</code> command.
</p>
<p>
The <code>&lt;description&gt;</code> field is used to provide general
information about the benchmark. The first lines are of the form
<code>&lt;field&gt;: &lt;data&gt;</code> where the
<code>&lt;field&gt;</code> values are
<ol>
<li><code>Generated by</code>: the name(s) of those who generated the benchmark; </li>
<li><code>Generated on</code>: generation date with format <code>YYYY-MM-DD</code>; </li>
<li><code>Generator</code>: tool which generated the benchmark (if any); </li>
<li><code>Application</code>: the general goal; </li>
<li><code>Target solver</code>: the solvers that were initially used to check the benchmarks; </li>
<li><code>Publications</code>: references to related publications; </li>
<li><code>Time limit</code>: the typical time limit in seconds imposed by the intended application of the benchmark. </li>
</ol>
</p>
<p>
Fields without meaningful values can be omitted. For example, a
manually written benchmark will usually not have a
<code>Generator</code>.
</p>
<p>
After these structured lines, the <code>&lt;description&gt;</code>
field can contain additional information as free text.
We encourage benchmark contributors to use this free text to describe
characteristics of the benchmark that might be of interest to solver
authors: for example, observed performance discrepancies between solver
configurations, or structures in the benchmark that are particularly
difficult.
</p>
<p>A fictitious example of a typical header:</p>
<pre><code>
(set-info :smt-lib-version 2.6)
(set-logic QF_UFLIA)
(set-info :source |
Generated by: Sabrina Samplescientist
Generated on: 2016-12-31
Generator: Sledgehammer
Application: Isabelle proof of Goedel theorem
Target solver: CVC4
Time limit: 2.0
Benchmarks generated by the proof search tool Sledgehammer for Isabelle/HOL.
We observed that the `app` operator introduced by Sledgehammer to eliminate
higher-order function application is especially difficult for SMT solvers.
|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &quot;industrial&quot;)
(set-info :status unsat)
</code></pre>
</div>

<div class="sidebar">
2 changes: 1 addition & 1 deletion virtual/SSI-free.sh
@@ -1,4 +1,4 @@
#!/bin/bash
#!/usr/bin/env bash

replace() {
IFS=
166 changes: 158 additions & 8 deletions virtual/benchmarks.shtml
@@ -27,19 +27,169 @@ Released for free under a Creative Commons Attribution 3.0 License
<div class="content_resize">
<div class="mainbar">
<h2>Benchmarks</h2>
<h3>Availability</h3>
<p>
The latest release of the SMT-LIB benchmark library is hosted on Zenodo under the <a href="https://zenodo.org/communities/smt-lib">SMT-LIB community</a>.
The SMT-LIB initiative provides a large collection of SMT-LIB benchmarks.
These benchmarks are used for research on SMT solving and by the SMT Competition.
</p>
<p>
The latest release of the SMT-LIB benchmark library is hosted on Zenodo
under the <a href="https://zenodo.org/communities/smt-lib">SMT-LIB community</a>.
All previous releases from 2023 onward are also available there.
</p>

<h3>StarExec</h3>
<p>Historical releases of the benchmark library, from 2013 to 2023, are available on <a href="https://www.starexec.org/starexec/secure/explore/spaces.jsp?id=239">StarExec</a>.
<p>
Some benchmarks that are not compliant with the SMT-LIB standard, such as
those that use theories that are not yet specified, are available on
<a href="https://github.com/SMT-LIB/pending-benchmarks.git">GitHub</a>.
</p>
<h4>Benchmark Catalog</h4>
<p>
To make the benchmark collection more accessible, we provide an SQLite
database that collects benchmark metadata including past competition
results. This database and its documentation can also be found on
<a href="https://zenodo.org/records/16290040">Zenodo</a>.
</p>
<h4>StarExec</h4>
<p>Historical releases of the benchmark library, from 2013 to 2023, were
available on <a href="https://www.starexec.org/starexec/secure/explore/spaces.jsp?id=239">StarExec</a>.
</p>
<h3>Submitting Benchmarks</h3>
<p>
The SMT-LIB benchmark collection would not exist without the countless
benchmarks submitted by users and developers of SMT solvers.
We welcome all kinds of benchmarks, from individual benchmarks that
test semantic edge cases to large benchmark sets generated from
verification problems.
</p>
<p>
For details on how to submit benchmarks to SMT-LIB, please refer to the README file in
the <a href="https://github.com/SMT-LIB/benchmark-submission">benchmark-submission</a>
SMT-LIB GitHub repository.
</p>
<h3>Archive Structure</h3>
<p>
On Zenodo, benchmarks are grouped in two collections: <a href="https://zenodo.org/records/16740866">non-incremental</a> and <a href="https://zenodo.org/records/15493096">incremental</a> benchmarks.
The incremental benchmarks contain more than one call to <code>check-sat</code>.
After extracting a benchmark archive, the benchmarks are stored in
folders with the structure
</p>
<pre><code>[non-]incremental/&lt;logic&gt;/&lt;set-name&gt;/.../&lt;name&gt;.smt2</code></pre>
<p>
Non-incremental benchmarks are always separated from the incremental
benchmarks, even if a benchmark set consists mostly of incremental ones.
</p>
<p>
The benchmarks are separated according to their logic.
</p>
<p>
The <code>&lt;set-name&gt;</code> is either
<code>&lt;date&gt;-&lt;application&gt;-&lt;submitter&gt;</code>,
<code>&lt;date&gt;-&lt;application&gt;</code>, or
<code>&lt;date&gt;-&lt;submitter&gt;</code>. The
<code>&lt;date&gt;</code> is written as <code>YYYYMMDD</code>.
The application or submitter strings are typically chosen by the
person submitting the benchmarks.
</p>
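<p>
As a purely illustrative example (all names invented), a benchmark set
submitted on 1 April 2023 for a bounded model checking application by a
submitter named Mustermann could be stored under
</p>
<pre><code>non-incremental/QF_BV/20230401-BMC-Mustermann/queue/enqueue_01.smt2</code></pre>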
<p>
Benchmarks can be nested within a deeper directory structure below
the set directory. This nesting should represent the structure
of the benchmarks and is application dependent.
</p>
<p>
Benchmarks are always valid SMT-LIB2 files. They contain at least
one <code>check-sat</code> command and end with an <code>exit</code>
command.
</p>
<p>
Benchmarks contain one <code>set-info :status</code> command for each
<code>check-sat</code> command. In non-incremental benchmarks, the
<code>set-info :status</code> command is part of the header. In
incremental benchmarks, each <code>set-info :status</code> command is
placed on the line directly above the corresponding
<code>check-sat</code> command.
</p>
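<p>
As a hypothetical illustration (logic and assertions invented for the
example), an incremental benchmark with two <code>check-sat</code>
calls could look as follows:
</p>
<pre><code>
(set-info :smt-lib-version 2.6)
(set-logic QF_LIA)
(set-info :source |An invented example for illustration.|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &quot;crafted&quot;)
(declare-const x Int)
(assert (&gt; x 0))
(set-info :status sat)
(check-sat)
(assert (&lt; x 0))
(set-info :status unsat)
(check-sat)
(exit)
</code></pre>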
<h3>Benchmark Header</h3>
<p>
All benchmarks contain a metadata header.
For new benchmarks, the header must have the following structure; for
older benchmarks, some fields may be missing.
</p>
<h3>Submitting Benchmarks</h3>
<p>For details on how to submit benchmarks to SMT-LIB, please refer to the README file in
the <a href="https://github.com/SMT-LIB/benchmark-submission">benchmark-submission</a>
SMT-LIB GitHub repository.
<pre><code>
(set-info :smt-lib-version &lt;version&gt;)
(set-logic &lt;logic&gt;)
(set-info :source |&lt;description&gt;|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &lt;category&gt;)
(set-info :status &lt;status&gt;)
</code></pre>
<p>
where:
<ul>
<li><code>&lt;version&gt;</code> is the SMT-LIB version number, usually 2.6 or 2.7.</li>
<li><code>&lt;logic&gt;</code> is one of the accepted SMT-LIB logics.</li>
<li><code>&lt;description&gt;</code> is a textual description that provides additional metadata. See below for a description of this field.</li>
<li><code>&lt;category&gt;</code> is either <code>"crafted"</code>, indicating that it was hand-made,
<code>"random"</code>, indicating that it was generated randomly,
or <code>"industrial"</code> (everything else).</li>
<li><code>&lt;status&gt;</code> is either <code>sat</code> or <code>unsat</code> according to the status of the benchmark, or <code>unknown</code> if not known.</li>
</ul>
</p>
<p>
Most benchmarks are distributed under the <a
href="https://creativecommons.org/licenses/by/4.0/">Creative Commons
Attribution 4.0 International License</a>, but contributors can specify
their own license in the benchmark itself using the
<code>(set-info :license "license string")</code> command.
</p>
<p>
The <code>&lt;description&gt;</code> field is used to provide general
information about the benchmark. The first lines are of the form
<code>&lt;field&gt;: &lt;data&gt;</code> where the
<code>&lt;field&gt;</code> values are
<ol>
<li><code>Generated by</code>: the name(s) of those who generated the benchmark; </li>
<li><code>Generated on</code>: generation date with format <code>YYYY-MM-DD</code>; </li>
<li><code>Generator</code>: tool which generated the benchmark (if any); </li>
<li><code>Application</code>: the general goal; </li>
<li><code>Target solver</code>: the solvers that were initially used to check the benchmarks; </li>
<li><code>Publications</code>: references to related publications; </li>
<li><code>Time limit</code>: the typical time limit in seconds imposed by the intended application of the benchmark. </li>
</ol>
</p>
<p>
Fields without meaningful values can be omitted. For example, a
manually written benchmark will usually not have a
<code>Generator</code>.
</p>
<p>
After these structured lines, the <code>&lt;description&gt;</code>
field can contain additional information as free text.
We encourage benchmark contributors to use this free text to describe
characteristics of the benchmark that might be of interest to solver
authors: for example, observed performance discrepancies between solver
configurations, or structures in the benchmark that are particularly
difficult.
</p>
<p>A fictitious example of a typical header:</p>
<pre><code>
(set-info :smt-lib-version 2.6)
(set-logic QF_UFLIA)
(set-info :source |
Generated by: Sabrina Samplescientist
Generated on: 2016-12-31
Generator: Sledgehammer
Application: Isabelle proof of Goedel theorem
Target solver: CVC4
Time limit: 2.0
Benchmarks generated by the proof search tool Sledgehammer for Isabelle/HOL.
We observed that the `app` operator introduced by Sledgehammer to eliminate
higher-order function application is especially difficult for SMT solvers.
|)
(set-info :license &quot;https://creativecommons.org/licenses/by/4.0/&quot;)
(set-info :category &quot;industrial&quot;)
(set-info :status unsat)
</code></pre>
</div>
<!--#include virtual="sidebar.shtml" -->
</div>