**
Sheraton Breakwater Hotel, Townsville, Australia
16th July, 1997
**

The CADE-14 ATP System Competition will evaluate the performance of sound, fully automatic, 1st order ATP systems, in terms of:

- the number of problems solved, and
- the average runtime for successful solutions;

- a bounded number of eligible problems, chosen from the TPTP Problem Library, and
- a specified CPU time limit for each solution attempt,

The competition machines are SUN Ultra 140s, each with 64MB memory and running Solaris 2.5.1.

The competition is being organized by Christian Suttner and Geoff Sutcliffe. If you have any questions about the competition, please email the organizers. The competition will be overseen by a panel of knowledgeable researchers who are not participating in the event. It is planned to publish the competition results in a form that includes contributions written by entrants.

This document contains information about the:

- Competition Divisions
- Problem Selection and Preparation
- Time Limits and Timing
- System Execution
- Performance Evaluation
- FOF Demonstration Division
- Special Hardware Demonstration Divisions
- Conditions for Participation
- Competition Dinner
- System Registration Form

- The
**MIX Division**: Mixed CNF Non-Propositional Theorems

*Mixed*means Horn and non-Horn problems, with or without equality, but not unit equality problems (see the UEQ division below).

The MIX division is divided into four categories:- Horn with No Equality
- Non-Horn with No Equality
- Horn with Equality
- Non-Horn with Equality

- The
**UEQ Division**: Unit Equality CNF Theorems

Theorems containing only unit equality clauses. - The
**SAT Division**: Mixed CNF Non-Propositional Non-theorems

- ATP systems may be entered only at the division level.
- ATP systems may be entered into more than one division (a system that is not entered into a particular division is assumed to be useless for that type of problem).
- All problems will be presented in CNF. ATP systems that accept FOF problems may enter the FOF Demonstration division.
- All systems must run on a single locally provided standard UNIX workstation. ATP systems that cannot run on the locally provided standard UNIX workstations may enter the Special Hardware Demonstration divisions.

The TPTP difficulty rating scheme identifies problems as:

- Solvable by all state-of-the-art ATP systems (easy)
- Solvable by some state-of-the-art ATP systems (difficult)
- Solvable by no state-of-the-art ATP systems (unsolved)
- Theorem-hood unknown (open)

The TPTP distinguishes versions of problems as one of standard, non-standard, incomplete, or special. The competition will use standard TPTP problems.

The eligible problems for each competition division and category can be extracted from the TPTP using the tptp1T script that is distributed with the TPTP, as follows:

- MIX Division, Horn with No Equality

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn NoEq` - MIX Division, Non-Horn with No Equality

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard NonHorn NoEq` - MIX Division, Horn with Equality

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn SomeEq`

and

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn NonUnitEq` - MIX Division, Non-Horn with Equality

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard NonHorn SomeEq` - UEQ Division

`tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard UnitEq` - SAT Division

`tptp1T CNF NonProp Satisfiable Rating 0.00 0.00 Standard`

and

`tptp1T CNF NonProp Satisfiable Rating 0.00 0.00 Special`*Note that all the problems are "easy". We have had to accept this to get some eligible problems. Further,*`Special`*problems have been accepted, as there are too few*`Standard`*problems. Finally, there have been a few changes to problem version status since the release of TPTP v2.0.0, so check the explicit eligible problems list.* - FOF Divisions

`tptp1T FOF Standard NonProp Theorem`*There is no*`Rating`*information available for FOF problems.*

The number of problems to be used will be chosen between a minimum that ensures sufficient confidence in the competition results (the competition organizers will ensure that there are sufficient resources available), and a maximum determined from the number of workstations available, the time allocated to the competition, the number of ATP systems, and the minimal time limit:

Number of workstations * Time for competition Maximum = --------------------------------------------- Number of systems * Minimal time limit

The problems to be used will be randomly selected on the day of the competition, from the most recent TPTP release. The tptp2X utility, distributed with the TPTP, will be used to:

- replace predicate and function symbols by short, meaningless symbols,
- randomly reorder the clauses and literals,
- randomly reverse the unit equalities in the UEQ division,
- remove equality axioms that are not needed by an ATP system,
- output the problems in the format required. (If the tptp2X utility does not support the format required, then it is the entrant's responsibility to provide a reformating program. The most acceptable form for this program is a tptp2X format module. If the ATP system is publically available, then the competition organizers may be able to produce such a module.)

Number of workstations * Time for competition Maximum = ---------------------------------------------- Number of systems * Minimal number of problems

The timing will be done in units of 1 second, and the minimal time to find a solution is 1 second. If an ATP system cannot solve a problem, the runtime will be set to the time limit.

The ATP systems will be executed by a shell script that invokes each system by a single command line, with the TPTP file name, the time limit (if required by the entrant), and entrant specified system switches (the same for all problems) as command line arguments. The command line may not use UNIX shell features, e.g., redirection and piping cannot be used.

The time limit will be imposed by sending a SIGXCPU (signal 30) to the ATP system. The ATP systems must be interruptable by SIGXCPU.

When terminating, the ATP system must output a distinguished string (specified by the entrant) to stdout, indicating the result:

- A proof exists, or
- No proof exists (clauses are satisfiable), or
- No solution found.

For every problem solved, the solution process must be reproducible by running the system again.

Note: If only one ATP system is registered for a particular division or category, no winner can be announced for that division or category, but the results for that system will still be presented.

The infrastructure for a FOF competition division will not be ready for CASC-14. However, systems that can deal with FOF syntax can demonstrate their abilities in the FOF Demonstration division (the tptp2X utility will contain a clausifer that may be prepended to a CNF system to form a FOF system; the tptp2X runtime is included in the total runtime). Ideally the systems will run on locally provided standard UNIX workstations, but use of any hardware supplied by the entrant or accessed via the Internet is acceptable.

The FOF Demonstration division will use non-trivial FOF theorems randomly
selected from the TPTP.
A CPU time limit, equal to the one in the competition divisions, will be
imposed on each solution attempt.
The system execution will be controlled by a `perl` script, provided
by the competition organizers.

The results will be presented, but no winner assessment will be made.

The rules for entry are the same as for the competition divisions, and
the same problems will be used.
A wall clock time limit, equal to the CPU time limit in the competition
divisions, will be imposed on each solution attempt.
The system execution will be controlled by a `perl` script, provided
by the competition organizers.

The results will be presented along with the competition divisions' results, but no winner assessment will be made.

- A person must be nominated to handle all issues (including execution difficulties) arising before and during the competition.
- It is not necessary for entrants to physically attend the competition.
- It is assumed that each entrant has read all the documents related to the competition (they will be made available to entrants) and will comply with the competition rules. Non-compliance with the rules could lead to disqualification.
- Entering many similar versions of the same system is deprecated. Entrants may be required to limit the number of system versions that they enter.
- The precomputation and storage of any information for individual TPTP problems for usage during the competition is contrary to the spirit of the competition. The panel may disqualify unfairly doped systems.
- Winners are expected to provide public access to their system's source code. If you want to participate without releasing your source code, please email the organizers.
- Catch-all rule: no cheating is allowed; the panel may disqualify entrants due to unfairness and may adjust competition rules in case of misuse.

Note: You have to be directly associated with an entered ATP system to attend this dinner. It's an exclusive event. (Spouses are welcome to come along; please indicate this in your registration.)

Registration Deadline: 15th June 1997 |
---|

Register as early as possible, so that the organizers can ensure
that

sufficient resources are available. Do it now!