Introduction

The TPTP (Thousands of Problems for Theorem Provers) is a library of problems for Automated Theorem Proving (ATP) systems. The principal motivation for the TPTP project is to move the testing and evaluation of ATP systems from the previously ad hoc situation onto a firm footing. This became necessary, as results being published do not always accurately reflect the capabilities of the ATP system being considered. A common library of problems is necessary for meaningful system evaluations, meaningful system comparisons, repeatability of testing, and the production of statistically significant results. The TPTP is such a library. The TPTP provides a simple, unambiguous, source and reference mechanism for ATP problems. It is comprehensive and up-to-date, and thus provides an overview of the current application of ATP. The TPTP problems are stored in a specifically designed, easy to understand format. Utilities for manipulating the problems, for converting the problems to other known ATP formats, and for finding problems with certain characteristics, are provided. Since its first release in 1993, many researchers have used the TPTP as an appropriate and convenient basis for ATP system evaluation.

This technical report explains the motivations and reasoning behind the development of the TPTP, and thus implicitly explains the design decisions made. It also serves as a manual explaining the structure and use of the TPTP: it provides a full description of the TPTP contents and organization, details of the utility programs, and guidelines for obtaining and using the TPTP.

A Quick Installation Guide for the TPTP is given in Section  Getting and Using the TPTP. Please be sure to read the guidelines for using TPTP problems and presenting results, given in Section Using the TPTP.

What's New in TPTP v2.7.0 :

Previous Problem Collections

A large number of interesting problems have accumulated over the years in the ATP community. Besides publishing particularly interesting individual problems, from early on researchers collected problems in order to obtain a basis for experimentation. Problems in FOF, published by mathematicians and logicians prior to the mechanization of reasoning (e.g., [
Chu56]) provided the first source for ATP researchers. The first major publication in this regard was [MOW76], which provides an explicit listing of clauses for 63 CNF problems, many of which are still relevant today (to our knowledge, the first circulation of problems for testing ATP systems was due to Larry Wos in the late sixties). In the same year [WM76] documented 86 CNF problems which have commonly been used for ATP testing. However, the problem clauses are not supplied in [WM76]. A second major thrust was provided by [Pel86], which lists 75 problems in both FOF and CNF. Other more recent papers are [BL+86], [Qua92], [MW92], and [McC93], to name a few. The Journal of Automated Reasoning's Problem Corner also provided interesting challenge problems. Problems published in hardcopy form are, however, often not suitable for testing ATP systems, because they have to be transcribed to electronic form. This is a cumbersome, error-prone process, and is feasible for only small numbers of small problems.

The sparsness of research into ATP systems for FOF problems meant that no electronic collections of FOF test problems have previously been commonly available. A CNF problem collection in electronic form was made publicly available by Argonne National Laboratories (in Otter format [McC94]) in 1988 [ANL]. This collection was a major source of problems for ATP researchers. Other electronic collections of CNF problems are available, but have not been announced officially (e.g., that distributed with the SPRFN ATP system [SPRFN]). Although some of these collections provide significant support to researchers, and formed the early core of the TPTP library, none (with the possible exception of the ANL collection) was specifically designed to serve as a common basis for ATP research. Rather, these collections typically were built in the course of research into a particular ATP system. As a result there are several factors that limit their usefulness as a common basis for research. In particular, previously existing problem collections:

The problem of meaningfully interpreting results can be even worse than indicated. A few problems may be selected and hand-tuned (formulae arranged in a special way, irrelevant formulae omitted, lemmas added in, etc) specifically for the ATP system being tested. The presentation of a problem can significantly affect the nature of the problem, and changing the formulae clearly makes a different problem altogether. Nevertheless the problem may be referenced under the same name as it was presented elsewhere. As a consequence the experimental results reveal little. Some researchers avoid this ambiguity by listing the formulae explicitly, but obviously this usually cannot be done for a large number of problems or for large individual problems. The only satisfactory solution to these issues is a common and stable library of problems. The TPTP is such a library.

What is Required?

The goal for building the TPTP has been to overcome previous drawbacks, and to centralize the burden of problem collection and maintenance. The TPTP tries to address all relevant issues. In particular, the TPTP:

The development of the TPTP problem library is an ongoing project, with the aim to provide all of the desired properties.

Current Limitations of the TPTP.
The current release of the TPTP library is limited to problems expressed in 1st order logic. There are no problems for induction, or for non-classical theorem proving. However, see Section  Present and Future Activity for upcoming and planned extensions.