CASC-JC Entrants' System Descriptions
Bliksem 1.12
H. de Nivelle
Max Planck Institut für Informatik,
Germany
nivelle@mpi-sb.mpg.de
Architecture
Bliksem is a theorem prover implementing resolution and paramodulation.
It implements many different orderings, including many orders that
define decision procedures. The automatic selection of strategies
is poor. A special feature of Bliksem is that it is able to output
totally specified proofs. After processing, these proofs can be
checked by Coq.
Implementation
Bliksem is written in portable C. High priority has been given
to portability. Bliksem has been compiled succcessfully using
cc, gcc, and djpcc. We plan to rewrite Bliksem, but at this
moment we are not certain about the direction in which Bliksem
should develop. Bliksem is available from:
http://www.mpi-sb.mpg.de/~bliksem
Expected Competition Performance
Bliksem did not change significantly in the last 1.5 years.
But the competition has changed, and the other systems as well.
We will see ...
References
DCTP 0.1
R. Letz, G. Stenz
Institut für Informatik, Technische Universität München,
Germany
{letz,stenz}@informatik.tu-muenchen.de
Architecture
DCTP 0.1 [LS01] is an automated theorem prover
for first order clause logic.
It is an implementation of the disconnection calculus described in
[Bil96].
The disconnection calculus is a proof confluent and inherently
cut-free tableau calculus with a weak connectedness condition.
Initially, an initial active path is selected through the input
clause set, which is then expanded by repeated application of the single
inference rule, the linking rule: Two clauses that are linked via
the active path are unified and appended to the bottom of the active path.
The new path containing the linked literals is closed,
The remaining new paths are solved in the same manner, until no open path
remains in the tableau, in which case the input set is considered proven,
or no further links can be applied, then a model for the input set can
be derived from the open path.
The concept of branch closure is different from most other tableau calculi.
A branch is considered closed iff it is ground closed, i.e. it contains a
pair of contradicting ground literals when all variables on the branch are
instantiated to the same new constant symbol.
The inherently depth-first proof search is guided by a literal selection
based on literal instantiatedness and a heavily parameterized link selection.
The pruning mechanisms indispensable for a proof confluent calculus mostly
rely on different forms variant deletion and unit based
strategies. Additionally the calculus has been augmented by relevancy
based backtracking.
The current version has no builtin equality treatment, that is,
equality problems must be augmented by the full set of equality axioms to
preserve completeness.
Currently a version of ordered paramodulation, different from the proposals
in Billon's paper, is being integrated into DCTP, but will not be ready for
the competition.
Implementation
DCTP has been implemented in the Bigloo dialect of the Scheme language.
Major emphasis has been placed on memory preserving techniques, as a
proof confluent calculus otherwise easily gobbles up all available
resources.
All parts of the tableau no longer needed are immediately discarded.
DCTP uses a kind of weak term sharing via bound variables to prevent
excessive replication of terms during unification.
Expected Competition Performance
As DCTP is a very new system and little experience could be gathered on the
effects of the existing guidance heuristics, generally DCTP is not expected to
perform on par with the current state of the art systems. Slightly better
performance might possibly be expected in the new EPR division.
References
- Bil96
- Billon J-P. (1996),
The Disconnection Method: A Confluent Integration of Unification
in the Analytic Framework,
Miglioli P., Moscato U., Mundici D., Ornaghi M.,
Proceedings of TABLEAUX'96: the 5th Workshop on Theorem Proving with
Analytic Tableaux and Related Methods
(Palermo, Italy),
pp.110-126,
Lecture Notes in Artificial Intelligence 1071,
Springer-Verlag.
- LS01
- Letz R., Stenz G. (2001),
System Description: DCTP - A Disconnection Calculus Theorem
Prover,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
E 0.6
S. Schulz
Institut für Informatik, Technische Universität München,
Germany
schulz@informatik.tu-muenchen.de
Architecture
E 0.6 [Sch99] is a purely equational
theorem prover. The calculus used by E combines superposition (with
selection of negative literals) and rewriting. For the special case
that only unit clauses are present, it degenerates into unfailing
completion. For the Horn case E can simulate various positive
unit strategies. No special rules for non-equational literals have
been implemented, i.e., resolution is simulated via paramodulation and
equality resolution.
Proof search in E is controlled by a literal selection strategy and an
evaluation heuristic. A variety of selection strategies (including
e.g., no selection, or select all negative literals)
is available. Evaluation heuristics can be constructed on the fly by
combining various parameterized primitive evaluation functions. A
simple automatic mode can select strategy, term ordering, and
heuristic based on simple problem characteristics.
New features since last year include a wider variety of literal
selection strategies, and an improved automatic mode that now also
switches between two different Knuth-Bendix orderings. A proof checker
is now available as well.
Implementation
E is implemented in ANSI C, using the GNU C compiler. The most unique
feature of the implementation is the maximally shared term
representation. This includes parallel rewriting for all instances of
a particular subterm. A second important feature is the use of
perfect discrimination trees with age and size constraints
for rewriting and unit-subsumption.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, and various versions of
Linux. Sources are available freely from:
http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
References
- Sch99
- Schulz S. (1999),
System Abstract: E 0.3,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.297-301,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
E 0.62 and EP 0.62
S. Schulz
Institut für Informatik, Technische Universität München,
Germany
schulz@informatik.tu-muenchen.de
Architecture
E 0.62 [Sch01] is a purely equational
theorem prover. The calculus used by E combines superposition (with
selection of negative literals) and rewriting. No special rules for
non-equational literals have been implemented, i.e. resolution is simulated
via paramodulation and equality resolution. New additions to the calculus since
E 0.6 include AC redundancy elimination and AC simplification for
dynamically recognized associative and commutative equational theories, as well
as simulated clause splitting (inspired by Vampire).
E is based on the DISCOUNT-loop variant of the given-clause algorithm,
i.e. a strict separation of active and passive facts. Proof search in E is
primarily controlled by a literal selection strategy, a clause evaluation
heuristic, and a simplification ordering. The prover supports a large number of
preprogrammed literal selection strategies, many of which are only
experimental. Clause evaluation heuristics can be constructed on the fly by
combining various parameterized primitive evaluation functions, or can be
selected from a set of predefined heuristics. Supported term orderings are
several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic
Path Ordering (LPO).
An automatic mode can select literal selection strategy, term ordering
(different versions of KBO and LPO), and search heuristic based on simple
problem characteristics.
EP 0.62 is just a combination of E 0.62 in full verbose mode
and a proof generation tool.
Proof object generation is handled by complex postprocessing
of the full search protocol.
Implementation
E is implemented in ANSI C, using the GNU C compiler. The most unique
feature of the implementation is the maximally shared term
representation. This includes parallel rewriting for all instances of
a particular subterm. A second important feature is the use of
perfect discrimination trees with age and size constraints
for rewriting and unit-subsumption.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, and various versions of
Linux. Sources of the latest released version and a current snapshot are
available freely from:
http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.62 is a simple Bourne shell script calling E and the postprocessor in
a pipeline.
Expected Competition Performance
In the last year, E won the MIX category of CASC and came in fourth in the UEQ
division. We believe that E will again be among the strongest provers in the
MIX category, in particular due to its good performance for Horn problems. We
also hope to improve in the UEQ category, although we probably will not nearly
reach the performance of Waldmeister.
EP 0.62 will be hampered by the fact that it has to reconstruct proofs in a
post-processing step that typically usually uses at least the same amount of
time as the proof search itself. This is particularly grave because E is able to
find a relatively large number of proofs late in the proof search.
E's auto mode is optimized for performance on the TPTP 2.3.0. The optimization
is based on a fairly large number of test runs and consists of the selection of
one of about 50 different strategies for each problem.
- Sch01
- Schulz S. (2001),
System Abstract: E 0.61,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
E-SETHEO csp01
R. Letz, S. Schulz, G. Stenz
Technische Universität München,
Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de
Architecture
E-SETHEO is a compositional theorem prover for formulae in first order
clause logic, combining the systems E [Sch99]
and SETHEO [MI+97].
It incorporates different calculi and proof procedures
like superposition, model elimination and semantic trees (the DPLL
procedure).
Furthermore, the system contains transformation techniques
which may split the formula into independent subparts or which may
perform a ground instantiation.
Finally, advanced methods for controlling and optimizing the combination
of the subsystems are applied.
E-SETHEO additionally includes the program Flotter
[WGR96] as a
preprocessing module for transforming non-clausal formulae to clausal form.
Since version 99csp of E-SETHEO, the different strategies
are run sequentially, one after the other, depending on the allocation
of resources to the different strategies, so-called schedules,
which have been computed from experimental data using machine learning
techniques as described in [SW99].
Schedule selection depends on syntactic characteristics of the input formula.
E-SETHEO csp01 incorporates the disconnection prover DCTP
[LS01] as a new strategy as well as a
new version of the E prover.
Implementation
According to the diversity of the contained systems, the modules of
E-SETHEO are implemented in different programming languages like C,
Prolog, Scheme, and Shell tools.
The program runs under Solaris 2.6. Sources are available freely.
Expected Competition Performance
We expect E-SETHEO to perform well in all categories it participates in.
References
- LS01
- Letz R., Stenz G. (2001),
System Description: DCTP - A Disconnection Calculus Theorem
Prover,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- MI+97
- Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J.,
Mayr K. (1997),
SETHEO and E-SETHEO: The CADE-13 Systems,
Journal of Automated Reasoning 18(2),
pp.237-246.
- Sch99
- Schulz S. (1999),
System Abstract: E 0.3,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.297-301,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
- SW99
- Stenz G., Wolf A. (1999),
E-SETHEO: Design, Configuration and Use of a Parallel Automated
Theorem Prover,
Foo N.,
Proceedings of AI'99: The 12th Australian Joint Conference on
Artificial Intelligence
(Sydney, Australia),
pp.231-243,
Lecture Notes in Artificial Intelligence 1747,
Springer-Verlag.
- WGR96
- Weidenbach C., Gaede B., Rock G. (1996),
SPASS and FLOTTER,
McRobbie M., Slaney J.K.,
Proceedings of the 13th International Conference on Automated
Deduction
(New Brunswick, USA),
pp.141-145,
Lecture Notes in Artificial Intelligence 1104,
Springer-Verlag.
Gandalf and GandalfFOF c-2.3
T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee
Architecture
Gandalf version "c" stands for the Gandalf for classical logic. Special
versions for intuitionistic logic and type theory exist but are not
covered here.
Gandalf is a reolsution prover which implements a number of different
basic strategies: binary ordered resolution for several orderings,
versions of set-of-support
resolution, binary unit resolution, hyperresolution. Enhancements are
used for cutting off literals and combining different strategies, in
particular forward and
backward reasoning, into one run. Equality is handled by ordered
paramodulation and demodulation.
The basic idea of the automatic mode in Gandalf is time-slicing: Gandalf
selects a set of different search strategies, allocates time to these
and finally runs the
strategies one after another. The motivation is the following: since it
is very hard to determine a suitable strategy by some heuristic
procedure, it pays off to try a
number of possibly suitable strategies. The selection of strategies and
the percentage of time they receive is in a some what ad hoc way based
on the characteristics
of the problem: percentage of Horn and almost-Horn clauses, the number
of clauses, percentage of clauses where literals can be ordered using
term and variable depths.
GandalfFOF c-2.3 is a pipeline of Otter
[McC94], which converts the
problem to CNF, and Gandalf c-2.3.
Some relevant publications are: [Tam97,
Tam98,
TS98].
Implementation
Gandalf is implemented in Scheme, using the scm interpreter developed by
A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for
the scm system.
Expected Competition Performance
Gandalf is likely to do well in the non-Horn-with-equality subclass of
problems. It is likely to have mediocre performance in the general mixed
class, and poor performance in the pure equality class.
References
- McC94
- McCune W.W. (1994),
Otter 3.0 Reference Manual and Guide,
ANL-94/6,
Argonne National Laboratory, Argonne, USA.
- Tam97
- Tammet T. (1997),
Gandalf,
Journal of Automated Reasoning 18(2),
pp.199-204.
- Tam98
- Tammet T. (1998),
Towards Efficient ATPProgress,
Kirchner C., Kirchner H.,
Proceedings of the 15th International Conference on Automated
Deduction
(Lindau, Germany),
pp.427-440,
Lecture Notes in Artificial Intelligence 1421,
Springer-Verlag.
- TS98
- Tammet T., Smith J. (1998),
Optimised Encodings of Fragments of Type Theory in First
Order Logic,
Journal of Logic and Computation 8(6),
pp.713-744.
GandalfSat 1.0
T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee
Architecture
GandalfSat is a special version of the
Gandalf
theorem prover, optimised for satisifiability checking. Essentially
it contains the same code as Gandalf, but uses different default search
strategies to Gandalf.
Ordinary Gandalf is a resolution prover which implements a number of
different basic strategies: binary ordered
resolution for several orderings, versions of set-of-support resolution,
binary unit resolution, hyperresolution.
Enhancements are used for cutting off literals and combining different
strategies, in particular forward and backward
reasoning, into one run. Equality is handled by ordered paramodulation
and demodulation.
GandalfSat uses time-slicing similarly to ordinary Gandalf.
It attempts hyperresolution, ordered resolution based on
[FL+93] results, and two kinds of
finite model building algorithms (Falcon-like and MACE-like) for attempting
to show satisfiability.
Some relevant publications are:
[FL+93,
Tam97,
TS98].
Implementation
Gandalf is implemented in Scheme, using the scm interpreter developed
by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet
for the scm system.
References
- FL+93
- Fermüller C., Leitsch A., Tammet T., Zamov N. (1993),
Resolution Methods for the Decision Problem,
Lecture Notes in Computer Science 679,
Springer-Verlag.
- Tam97
- Gandalf,
Journal of Automated Reasoning 18(2),
pp.199-204.
- Tam98
- Towards Efficient Subsumption,
Kirchner C., Kirchner H.,
Proceedings of the 15th International Conference on Automated
Deduction
(Lindau, Germany),
pp.427-440,
Lecture Notes in Artificial Intelligence 1421,
Springer-Verlag.
GandalfSat 1.1
T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee
Architecture
GandalfSat is a special version of the Gandalf
theorem prover, optimised for satisifiability checking.
Essentially it contains the same code as Gandalf, but uses different
default search strategies to Gandalf.
Ordinary Gandalf is a resolution prover which implements a number of
different basic strategies: binary ordered resolution for several
orderings, versions of set-of-support resolution, binary unit resolution,
hyperresolution.
Enhancements are used for cutting off literals and combining different
strategies, in particular
forward and backward reasoning, into one run. Equality is handled by
ordered paramodulation and demodulation.
GandalfSat uses time-slicing similarly to ordinary Gandalf.
It attempts hyperresolution, ordered resolution based on
[FL+93] results, and two
kinds of finite model
building algorithms (Falcon-like and MACE-like) for attempting to show
satisfiability.
Some relevant publications are: [FL+93,
Tam97,
TS98].
Implementation
Gandalf is implemented in Scheme, using the scm interpreter developed by
A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for
the scm system.
Expected Competition Performance
We expect it to be among the better provers for satisfiability checking.
References
- FL+93
- Fermüller C., Leitsch A., Tammet T., Zamov N. (1993),
Resolution Methods for the Decision Problem,
Lecture Notes in Computer Science 679,
Springer-Verlag.
- Tam97
- Tammet T. (1997),
Gandalf,
Journal of Automated Reasoning 18(2),
pp.199-204.
- TS98
- Tammet T., Smith J. (1998),
Optimised Encodings of Fragments of Type Theory in First
Order Logic,
Journal of Logic and Computation 8(6),
pp.713-744.
MUSCADET 2.3
D. Pastre
Université René Descartes (Paris 5),
France
pastre@math-info.univ-paris5.fr
Architecture
The MUSCADET theorem prover is a knowledge-based system.
It is based on Natural Deduction, following the terminology of
[Ble71] and
[Pas78],
and uses methods which resembles those used by humans.
It is composed of an inference engine, which interprets and executes rules,
and of one or several bases of facts,
which are the internal representation of "theorems to be proved".
Rules are either universal and put into the system, or built by the system
itself by metarules from data (definitions and lemmas).
Rules may add new hypotheses, modify the conclusion, create objects,
split theorems into two or more subtheorems
or build new rules which are local for a (sub-)theorem.
There are specific strategies for existential, universal, conjonctive or
disjunctive hypotheses and conclusions.
Functional symbols may be used, but an automatic creation of intermediate
objects allows deep subformulae to be flattened and treated as if the
concepts were defined by predicate symbols.
The successive steps of a proof may be forward deduction (deduce new hypotheses
from old ones), backward deduction (replace the conclusion by a new one) or
refutation (only if the conclusion is a negation).
The system is also able to work with second order statements. It
may also receive knowledge and know-how for a specific domain from a
human user; see [Pas89] and
[Pas93]. These two possibilities
are not used while working with the TPTP Library.
Implementation
MUSCADET 2 [Pas01]
is implemented in SWI-Prolog.
Rules are written as declarative Prolog clauses.
Metarules are written as sets of Prolog clauses, more or less declarative.
The inference engine includes the Prolog interpreter and some procedural Prolog
clauses.
MUSCADET 2.3 can be obtained from:
http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html
Expected Competition Performance
The best performances of MUSCADET will be for problems
manipulating many concepts in which all statements (conjectures,
definitions, axioms) are expressed in a manner similar to the
practice of humans, especially of mathematicians. It will have poor
performances for problems using few concepts but large and deep
formulas leading to many splittings.
References
- Ble71
- Bledsoe W.W. (1971),
Splitting and Reduction Heuristics in Automatic Theorem
Proving,
Artificial Intelligence 2,
pp.55-77.
- Pas78
- Pastre D. (1978),
Automatic Theorem Proving in Set Theory,
Artificial Intelligence 10,
pp.1-27.
- Pas89
- Pastre D. (1989),
MUSCADET : An Automatic Theorem Proving System using Knowledge
and Metaknowledge in Mathematics,
Artificial Intelligence 38,
pp.257-318.
- Pas93
- Pastre D. (1993),
Automated Theorem Proving in Mathematics,
Annals of Mathematics and Artificial Intelligence 8,
pp.425-447.
- Pas01
- Pastre D. (2001),
MUSCADET version 2.3 : User's Manual,
http://www.math-info.univ-paris5.fr/~pastre/muscadet/manual-en.ps.
Otter-MACE 3.2-2.0
W. McCune
Argonne National Laboratory,
USA
mccune@mcs.anl.gov
Architecture
Otter 3.2 [McC94,
McC00b,MW97]
and
MACE 2.0 [McC01,
McC00a]
are introduced in this section. It describes
the calculus, inference rules, and search control used.
Both programs
apply to statements in first-order (unsorted) logic with equality. The
two programs are independent and work by completely different
methods. (In practice, the programs are complementary.)
Otter is based on resolution and
paramodulation applied to clauses; an Otter search uses the ``given
clause algorithm'' and typically involves a large database of clauses;
subsumption and demodulation play an important role.
Roughly speaking, MACE works as follows. For a
given domain size, an equivalent propositional problem is constructed
and a Davis-Putnam-Loveland-Logeman decision procedure is applied. MACE
typically cannot handle clauses with many variables or deeply nested
terms.
Implementation
The programs are written in C and accept the same input files.
Otter uses shared data structures for
clauses and terms, and it uses indexing for resolution,
paramodulation, forward and backward subsumption, forward and backward
demodulation, and unit conflict. MACE is much simpler than Otter; its
Davis-Putnam-Loveland-Logeman procedure uses a counting algorithm to do the
unit resolutions, and forward subsumption is not applied.
Expected Competition Performance
Otter's original automatic mode, which reflects no tuning to the TPTP
problems, will be used. Otter is especially ineffective on non-Horn
problems, so it is not expected to do well on the NNE or NEQ
categories. Otter's performance in the other categories should be acceptable.
MACE won the SAT division at the 1999 CASC and should do well on the
SAT problems this year.
References
- McC94
- McCune W.W. (1994),
Otter 3.0 Reference Manual and Guide,
ANL-94/6,
Argonne National Laboratory, Argonne, USA.
- MW97
- McCune W.W., Wos L. (1997),
Otter: The CADE-13 Competition Incarnations,
Journal of Automated Reasoning 18(2),
pp.211-220.
- McC00a
- McCune W.W. (2000),
MACE: Models and Counterexamples,
http://www-unix.mcs.anl.gov/AR/mace/.
- McC00b
- McCune W.W. (2000),
Otter: An Automated Deduction System,
http://www-unix.mcs.anl.gov/AR/otter/.
- McC01
- McCune W.W. (2001),
MACE 2.0 Reference Manual and Guide,
ANL/MCS-TM-249,
Argonne National Laboratory, Argonne, USA.
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and
Rusty Lusk contributed to the development of Otter. MACE was
motivated by previous work of John Slaney, Mark Stickel, and Hantao
Zhang.
PizEAndSATO 0.1
G. Sutcliffe^{1}, S. Schulz^{2}
^{1}University of Miami,
USA,
^{2}Technische Universität München
geoff@cs.miami.edu, schulz@informatik.tu-muenchen.de
Architecture
PizEAndSATO is a system for CNF first order problems with a finite
Herbrand universe. It is implemented by combining the ground-instance
generating program PizE with the propositional prover
SATO.
PizE is currently a very simple program. It systematically
enumerates all ground instances of a clause set with a finite Herbrand
universe. The program is not yet very much optimized. There are only
two techniques implemented to cut down on the number of instances
generated. First, PizE will split clause input clauses that can be
partitioned into variable-disjoint subsets. Secondly, it performs
forward unit subsumption and forward unit resolution on the generated
ground instances to simplify the formula and to reduce the amount of
memory used.
SATO [ZS00], is a highly optimized
implementation of the Davis-Putnam-Loveland-Logemann procedure.
SATO is well known and has been sucessfully applied in a variety of
challenging propositional domains.
PizE and SATO are combined with a shell script called And.
Implementation
PizE is implemented in ANSI C.
It is based on the E [Sch01]
libraries, shares a large amount of code with E, and will be bundled
with future releases of E. Since E is optimized for the first order
case, clause representation in PizE is far from optimal for
propositional clauses. SATO is implemented in C. It implements its
own highly efficient data structures and algorithms.
Expected Competition Performance
We expect a reasonable performance in the EPR division.
Testing indicates that the only problems that have very many clauses
instances are beyond PizEAndSATO's reach, because PizE runs out of
memory.
References
- Sch01
- System Abstract: E 0.61,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- ZS00
- Zhang H., Stickel M. (2000),
Implementing the Davis-Putnam Method,
Journal of Automated Reasoning 24(1/2),
pp.277-296.
SCOTT 6
K. Hodgson, J. Slaney
Australian National University,
Australia
{kahlil,jks}@arp.anu.edu.au
Architecture
SCOTT (Semantically Constrained OTTER) [HS99]
is an extension of OTTER [McC94] and uses the
same inference rules and overall proof search structure. This extension
incorporates FINDER [Sla94] (a finite model
generator) as a subsystem of OTTER in order to determine consistent subsets
of the clause space. These consistent subsets are used to guide the proof
search by increasing the likelihood that certain necessary inferences are
made. (This can be seen to loosely generalise the set of support strategy).
Not all consistent subsets are created equal, with different collections
providing different levels of guidance, and so a number of strategies for
determining which collections provide the best guidance have been
developed. These incorporate theoretical estimates of the guidance
provided and the corresponding cost incurred, enabling us to manage the
trade-off much more effectively than in previous versions of SCOTT.
Implementation
SCOTT is written in C. It uses the existing program OTTER to perform the
proof search and additional code to determine which clauses should be
chosen for inference. This additional code calls the existing program
FINDER in order to witness the consistency of particular sets of
clauses. The system may be obtained from:
http://discus.anu.edu.au/software/scott/
Expected Competition Performance
In principle SCOTT should perform at least as well as OTTER across all
parts of the MIX and UEQ divisions, if only because it has OTTER as a
subroutine. However, searching for models and testing clauses against them
can be costly, and so SCOTT may perform poorly if the time-limit is too
severe. Preliminary results show that, SCOTT-6 out performs SCOTT-5, so we
expect an improvement on the performance in CASC-17. Overall, we expect
good performance in comparison with cognate systems that employ relatively
naive guidance.
References
- HS99
- Hodgson K., Slaney J.K. (1999),
Semantic Guidance with SCOTT,
TR-ARP-01-99,
Automated Reasoning Project, Australian National University, Canberra,
Australia.
- McC94
- McCune W.W. (1994),
Otter 3.0 Reference Manual and Guide,
ANL-94/6,
Argonne National Laboratory, Argonne, USA.
- Sla94
- Slaney J.K. (1994),
FINDER: Finite Domain Enumerator, System Description,
Bundy A.,
Proceedings of the 12th International Conference on Automated
Deduction
(Nancy, France),
pp.798-801,
Lecture Notes in Artificial Intelligence 814,
Springer-Verlag.
Vampire 1.0
A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk
Architecture
Vampire 1.0 is an automatic theorem prover for first-order classical logic.
It implements the calculi of ordered binary resolution,
hyperresolution and superposition for handling equality.
The splitting rule is simulated by introducing new predicate symbols.
A number of standard redundancy criteria and simplification techniques
is used for pruning the search space: subsumption,
tautology deletion, subsumption resolution and rewriting by ordered
unit equalities. The only term ordering used in Vampire at the moment
is a special version of the Knuth-Bendix ordering which allows
efficient approximation algorithms for solving ordering constraints.
The competition version will exploit a number of simple preprocessing
techniques, such as elimination of simple predicate and function
definitions. A number of efficient indexing techniques is used
to implement all major operations on sets of terms and clauses,
such as an improved version of code trees
[Vor95]
for forward subsumption and a combination of path indexing
[Sti89] and database joins for backward
subsumption.
Implementation
Vampire 1.0 is implemented in C++ and can be compiled by 2.91 or newer
versions of gcc. The binaries for Solaris and Linux are
available from the authors, for details see:
http://www.cs.man.ac.uk/~riazanoa/Vampire/
References
- Sti89
- Stickel M.E. (1989),
The Path Indexing Method for Indexing Terms,
Technical Note 473,
SRI International, Menlo Park, USA.
- Vor95
- Voronkov A. (1995),
The Anatomy of Vampire,
Journal of Automated Reasoning 15(2),
pp.237-265.
Vampire 2.0 and Vampire CASC-JC
A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk
Architecture
Vampire 2.0 is an automatic theorem prover for first-order classical logic.
It implements the calculi of ordered binary
resolution, hyperresolution, and superposition for handling equality.
The splitting rule is simulated by introducing new
predicate symbols. A number of standard redundancy criteria and
simplification techniques are used for pruning the
search space: subsumption, tautology deletion, subsumption
resolution and rewriting by ordered unit equalities. The only
term ordering used in Vampire at the moment is a special version
of the Knuth-Bendix ordering which allows efficient
approximation algorithms for solving ordering constraints.
A number of efficient indexing techniques are used to implement all major
operations on sets of terms and clauses, such as an improved version
[RV00]
of code trees [Vor95] for forward
subsumption, and a combination
of path indexing [Sti89] and database
joins for backward subsumption.
Compared to Vampire 1.0 that participated in the previous competition,
this version has many more literal selection functions, more flexible
splitting without backtracking and improved memory management.
The automatic mode of Vampire 2.0 is primitive, it
recognises only very simple syntactic properties of the input
problem, like the presence of equality or non-Horn clauses.
In the preprocessing stage it exploits a number of primitive techniques,
such as elimination of simple predicate and function definitions.
Vampire CASC-JC is a version based on Vampire 2.0, specialised to
win the competition. To adjust the strategy it estimates the problem
by checking some syntactic properties, such as
presence of multiliteral, non-Horn and ground clauses, equations and
nonequational literals. Additionally we consider such quantitative
characteristics as the number of axioms, literals, small and large
terms.
Implementation
Vampire 2.0 is implemented in C++ and can be compiled
by 2.91 or newer versions of gcc. The binaries for Solaris and
Linux are available from the authors, for details see:
http://www.cs.man.ac.uk/~riazanoa/Vampire
Expected Competition Performance
We expect Vampire CASC-JC to perform much better than Vampire 2.0.
References
- RV00
- Riazanov A., Voronkov A. (2000),
Partiallly Adaptive Code Trees,
Ojeda-Aciego M., de Guzman I., Brewka G., Pereira L.,
Proceedings of the 7th European Workshop on Logics in Artificial
Inteligence
(Malaga, Spain),
pp.209-223,
Lecture Notes in Artificial Intelligence 1919,
Springer-Verlag.
- Sti89
- Stickel M.E. (1989),
The Path Indexing Method for Indexing Terms,
Technical Note 473,
SRI International, Menlo Park, USA.
- Vor95
- Voronkov A. (1995),
The Anatomy of Vampire,
Journal of Automated Reasoning 15(2),
pp.237-265.
Waldmeister 600
T. Hillenbrand, B. Löchner, A. Jaeger, and A. Buch
FB Informatik, Universität Kaiserslautern
waldmeister@informatik.uni-kl.de
Architecture
Waldmeister 600 [HJL99] is an
implementation of unfailing
Knuth-Bendix completion [BDP89].
The system architecture is
built on a stringent distinction of active facts (selected
rewrite rules) and passive facts (critical pairs): Firstly,
only the active facts induce the normalization relation.
Secondly, the passive facts are normalized not permanently,
but on generation only. The saturation cycle is
parameterized by a reduction ordering and a heuristic
assessment of passive facts. The control parameters are
chosen according to the algebraic structure given in the
problem specification
[HJL99]. Recent developments include
the integration of the kernel strategy
[HL00] as well as
extensions towards ordered completion, but are still in
progress.
Implementation
The system is implemented in ANSI-C and runs under SunOS,
Solaris and Linux. Refined perfect discrimination trees are
employed to index the active facts, whereas the passive
facts are stored in a compressed representation. The
hypotheses are blown up to sets of rewrite successors and
sometimes predecessors. A step-by-step proof object is
constructed at run-time. Waldmeister can be found on the Web
at:
http://www-avenhaus.informatik.uni-kl.de/waldmeister
References
- BDP89
- Bachmair L., Dershowitz N., Plaisted D.A. (1989),
Completion Without Failure,
Ait-Kaci H., Nivat M.,
Resolution of Equations in Algebraic Structures,
pp.1-30,
Academic Press.
- HJL99
- Hillenbrand T., Jaeger A., Löchner B. (1999),
Waldmeister - Improvements in Performance and Ease of
Use,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction (Trento, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- HL00
- Hillenbrand T., Löchner B. (2000),
Shorter Fixed Point Combinators with Waldmeister and the
Kernel Strategy,
AAR Newsletter 46.
Waldmeister 601
T. Hillenbrand^{1}, B. Löchner^{2}, A. Jaeger, and
A. Buch
^{1}Max-Planck-Institut für Informatik Saarbrücken,
^{2}Universität Kaiserslautern
waldmeister@informatik.uni-kl.de
Architecture
Waldmeister is a system for unit equational deduction.
Its theoretical basis is unfailing completion in the sense of
[BDP89] with refinements towards
ordered completion. The prover saturates the input axiomatization in a repeated
cycle that works on a set of active resp. passive facts. The selection of the
reduction ordering and the heuristical guidance of the proof search are
described in [HJL99].
Since last year's version, stronger redundancy criteria have been integrated,
including ground joinability tests with ordering constraints on variables
[AHL00].
In several problem domains this technique is helpful especially
for harder proof tasks. Some restructuring of the prover is on the way to
also include an implementation of confluence trees with full ordering
constraints [AL01]; but further work
is necessary to have them speed up the proof search.
Implementation
The prover is coded in ANSI-C and available for SunOS, Solaris and Linux. The
set of active facts is represented by perfect discrimination trees.
Stimulated by the evaluation of indexing techniques reported on in
[NH+01], the implementation of these
has somewhat been improved. The storage of passive
facts, and the treatment of hypotheses have remained unchanged. The
Waldmeister Web page is located at:
http://www-avenhaus.informatik.uni-kl.de/waldmeister
Expected Competition Performance
Once again, the Waldmeister system being that powerful already, we hope to
improve over last year's version in some problem domains.
References
- AHL00
- Avenhaus J., Hillenbrand T., Löchner B. (2000),
On Using Ground Joinable Equations in Equational Theorem
Proving,
Baumgartner P., Zhang H.,
Proceedings of the 3rd International Workshop on First Order Theorem
Proving
(St Andrews, Scotland),
pp.33-43.
- AL01
- Avenhaus J., Löchner B. (2001),
System Description: CCE: Testing Ground Joinability,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- BDP89
- Bachmair L., Dershowitz N., Plaisted D.A. (1989),
Completion Without Failure,
Ait-Kaci H., Nivat M.,
Resolution of Equations in Algebraic Structures,
pp.1-30,
Academic Press.
- HJL99
- Hillenbrand T., Jaeger A., Löchner B. (1999),
Waldmeister - Improvements in Performance and Ease of Use,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.232-236,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
- NH+01
- Nieuwenhuis R., Hillenbrand T., Riazanov A., Voronkov A. (2001),
On the Evaluation of Indexing Techniques for Theorem
Proving,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.