# Research Projects for Students

Project Level Description Prerequisites
Reasoning over Theories PhD Automated Theorem Proving (ATP) systems are designed for reasoning in symbolic logic. An ATP problem is consists of a set of axioms that describe a theory, and a conjecture to prove. Normally the axioms are written as a finite set of logic formulae, but there are significant cases where this is impossible (e.g., arithmetic) or inefficient (e.g., reasoning about arrays). It is therefore necessary to have alternative ways of writing theories, e.g., as done in SMT.

The ATP community often uses the TPTP language for writing problems, and recently the TPTP language has been extended to include a built-in theory of arithmetic. However, it seems impractical to extend the TPTP language for every theory that needs to be written without using a set of axioms. This project is to investigate alternative ways of writing theories in ways that are suitable for use by ATP systems, choose one and support its integration into the TPTP framework, and extend the E prover ATP system to reason over theories written in this way.

A course in automated reasoning. Ability to program in C.
Command Language for ATP

Undergraduate/Masters Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. Typically ATP systems are used to solve one problem at a time, where each problem is self-contained. For users doing interactive development, there is a need for an interface that allows them to construct problems from components, apply various preprocessing tools, apply ATP systems, and apply various ppostprocessing tools. As the work is interactive, it must be possible to stop and save, then later restart and restore the state. The project is to implement a suite of commands that provide such an interactive development environment. Some basic knowledge of automated reasoning. Ability to program in Perl or Python.
Encoding Puzzles in Logic Undergraduate Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. The TPTP is a library of test problems for ATP systems. One of the TPTP domains is "Puzzles". There are many books and web sites that describe puzzles in natural language. The project is to express lots of such puzzles in first-order (or other) logic, so that they can be added to the TPTP. Knowledge of first-order logic.
TPTP Lint Undergraduate Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. The TPTP is a library of test problems for ATP systems. The TPTP is curated by hand, and as such it is possible for minor bugs to creep in. Examples are: duplicate formula names, misuse of predicate names like equal true false, ambiguous parenthesizing, free variables, etc. The project is to write a "lint" program that scans TPTP problems to find such errors, and wheree possible suggest corrections. Ability to write shell or perl scripts, program in Prolog or C. Some knowledge of, or interest in, logics.
Solving OWL test cases by ATP Undergraduate/Masters Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. Semantic web is a growing technology to provide reasoning capability for web data and web applications. Semantic web applications typically use a reasonably weak logic, but have to deal with very large quantities of data. Therefore specialized reasoning tools have been developed for semantic web applications. This project will apply general purpose ATP systems to semantic web data, to determine to what extent such systems can deal with the semantic web data, and to find ways that such systems can be tuned to improve their performance on semantic web data. The project will start with the 700+ OWL 2 Test Cases. These have to be converted to the TPTP format (that is used by many ATP systems), and then submitted to various ATP systems. Based on the results from that initial testing it will be known which ATP systems are best suited to semantic web data. Further data sets will then be processed using the best ATP systems. The most interesting data sets and associated tests will be added to the TPTP problem library. Ability to write shell or perl scripts, stream editors, and data analysis tools (spreadsheets, etc.). it would be helpful, but not absolutely necessary, to have done a course in logic or artificial intelligence or automated reasoning.
Typed First-order Logic for the TPTP

Taken by Michael Schick

Undergraduate/Masters Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. The TPTP is a library of test problems for ATP systems. Up to now the TPTP (and also almost all first-order ATP systems) has dealt with untyped first-order logic. Over the next year the TPTP (and hopefully some ATP systems) will be extended to include problems in typed first-order logic (to be precise, it is "many sorted" first order logic). This in turn will be used to support problems that include arithmetic. The project is to provide development support for these extensions to the TPTP, collecting problems, testing ATP systems, and extending the TPTP software infrastructure to provide processing of such problems Course in logic or artificial intelligence or automated reasoning. Ability to program in C or Java or Prolog (those are inclusive "or"s (and if you don't know what that means, this project might not be for you :-)).
Can ATP play Trivial Pursuit?

ACE to logic taken by Nelson Dellis

Undergraduate/Masters/PhD (different students could work on different aspects) Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. SPASS is a well known ATP system, and SPASS-XDB is an extended version of SPASS that that retrieves external axioms asynchronously, on demand. SPASS-XDB can prove some pretty funky stuff, much of which is based on the general knowledge it retrieves from large external databases. The project is to see if this scales up to playing Trivial Pursuit. In order to make this "human like", the questions need to be posed in a natural language that is also easy enough to parse into logic - one possibility is Attempto Controlled English (ACE). Aspects of the project will include:
• Translating a Trivial Pursuit question set to ACE.
• Converting ACE to logic, and that logic to a form that matches the style used by SPASS-XDB.
• Finding appropriate knowledge bases for answering the questions, and making them available to SPASS-XDB. One particular project will be to convert general first-order facts into SQL database data.
• Using SPASS-XDB to answer the questions. A general project to answer any general knowledge questions using SPASS-XDB is also available.
• Linking the proofs output by SPASS-XDB to a natural language output system (I have a plan!).
• Challenging humans to play the system.
• Getting commercial interest.
Advanced course in artificial intelligence or automated reasoning.
SPASS and SPASS-XDB in SInE and SigmaKEE

Some parts taken by Jose Hernandez

Undergraduate Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. SPASS is a well known ATP system, and SPASS-XDB is an extended version of SPASS that that retrieves external axioms asynchronously, on demand. SInE is an ATP meta-system designed to work on large theories (LTs), by selecting axioms that are expected to be useful for proving a given conjecture, and then calling the E ATP system to try find the proof. SigmaKEE is an system for developing, viewing and debugging theories in first order logic, specifically the Suggested Upper Merged Ontology (SUMO). The project is to integrate these systems' capabilities, in a cascading fashion: (i) replace E by SPASS in SInE; (ii) test the combined system on SUMO problems; (iii) replace SPASS by SPASS-XDB; (iv) test the combined system on SUMO problems with external axioms; (v) make the combination available for reasoning in SigmaKEE. Course in artificial intelligence or automated reasoning. Java and JSP programming. perl programming.
Adding Context to SUMO in SigmaKEE Masters SigmaKEE is an system for developing, viewing and debugging theories in first order logic, specifically the Suggested Upper Merged Ontology (SUMO). Much commonsense reasoning uses a notion of context - a state of the world that does not change during the reasoning episode, e.g., "in USA in June 2008". The project is to extend SigmaKEE to enable contextual reasoning over SUMO. The initial work will be to dynamically add a context to all facts before reasoning - this is easy. The next phase will be to use existing SUMO binary-predicates as contexts, and to enable reasoning with super- and sub-contexts, e.g., "in Florida in 2008". Such predicates have to be reflexive, anti-symmetric, transitive, and acyclic. The first three properties can be found in SUMO, while the last will have to be declared and (as far as possible) verified. Course in automated reasoning. Java and JSP programming.
Multi-modal Logic ATP by THF Masters Propositional multi-modal logic can be translated to classical higher-order logic, in several different ways. Multi-modal logic theorems can then be proved by translation and application of a higher-order Automated Theorem Proving (ATP) program. The project is to develop such an ATP system, and test various translation options to establish which provides the best general purpose multi-modal logic ATP system. Course in automated reasoning. Prolog programming. Statistical analysis.
ATP Access to External Data, in E Masters or PhD Automated Theorem Proving (ATP) system have traditionally been used in a mode where all the axioms are read in before deduction starts. This is not possible for very (VERY, even infinitely) large axiom sets, and also gives "out-of-date" results for dynamic axioms sets (e.g., mined dynamicaly from the web). Recently a version of SPASS ( SPASS-XDB) has been implemented, that provides a paradigm for ATP to request and receive axioms from external sources, dynamically, on-demand, during deduction. This enables the system to access effectively infinite numbers of axioms from a range of non-traditional sources, e.g., SQL databases, web services, computer algebra systems, etc. SPASS-XDB is a prototype for this new paradigm in ATP. The project is to continue this line of research and development, ending with an implementation based on the E ATP system. Course in automated reasoning. C programming. UNIX system programming.
HOL ATP via FOL Masters Automated Theorem Proving (ATP) in Higher-Order Logic (HOL) has been in progress for as long as that for First-Order Logic (FOL). It has become clear that the computational issues in the higher-order setting are significantly harder than those in first-order. Thus there are few ATP systems for higher-order logic. One way to build an ATP system for HOL is simply to provide a translation of the HO problem to FOL, and then apply a FOL ATP system. See this paper and this paper and this paper for example translations. Different translations are effective for different problems, therefore one way to increase coverahe is to do several different translations and run a FOL ATP system on them all (either sequentially or in parallel). The project is to build a HOL ATP system based on translation to FOL. Course in automated reasoning or logic. Prolog and Perl programming.
ATP for Bioinformatics Masters or PhD Automated Theorem Proving (ATP) has the potential to be a useful tool in bioinformatics. The project is to use ATP for modeling signaling networks (see this, this, and this), and metabolic pathways (see this). Course in automated reasoning. Course in bioinformatics. UM students only.
ML ATP in LT Masters Automated Theorem Proving (ATP) in Large Theories (LTs) can benefit from the use of Machine Learning (ML) for selecting axioms that are likely to be relevant for proving a given conjecture. The ML is done from existing proofs in the same theory, and is updated when new proofs are found. The full details are explained in this document. The project is to implement the ideas described in Section 6 of the document. Course in automated reasoning preferred, but not required. Knowledge of ML techniques. Perl programming.
ATP Flow Tool Masters The use of Automated Theorem Proving (ATP) systems has largely focussed on proving a conjecture from a set of consistent axioms. In real applications users face situations where the properties of the axioms and conjecture are unknown. The project is to build a flow tool that controls the processing of user formulae by ATP tools, as shown in this picture. The picture can be extended to the case when there is no conjecture, to to into account user expectations and hopes, provide feedback when things do not meet the user's expectations and hopes, etc. The system can be implemented as a sequential process, or can run the tasks in parallel with appropriate task control. Course in automated reasoning preferred, but not required. Perl programming. UNIX system programming.
Theory Development Masters The use of Automated Theorem Proving (ATP) systems in developing theories is currently hampered by a lack of tools for controlling the application of the systems and tracking dependencies between theorems proved and the axioms used, with regard to the subsequent use of such theorems as lemmas in proving later theorems. The project is to build a control environment that will allow a user to interactively select and use axioms, previously proved theorems, and a conjecture, and apply an appropriate ATP system. (See some ideas for functionality.) The controller will have to maintain a database of axiom-theorem dependency, and ensure that the use of previously proved theorems does not lead to any circular dependencies. The controller will also have to provide an interface to the ATP systems, and will provide guidance to the user regarding the use of the ATP systems. Course in automated reasoning. Programming in Perl, Javascript, Prolog
Learning SSCPA Masters Early Automated Theorem Proving (ATP) research focussed largely on the development of inference rules and search guidance heuristics. However, in recent years there has been an increased emphasis on the implementation of high performance ATP systems. There are now several such systems available from research groups around the world. In 1998 the SSCPA (Smart Selective Competition Parallelism ATP) system was built. When given a problem, SSCPA determines syntactic characteristics of the problems, and from an existing database of knowledge determines which ATP systems are most likely to be able to solve the problem. These chosen ATP systems are then executed in parallel, and all are stopped when a solution is found by any one of them. The Learning SSCPA project is to build a version of SSCPA that starts with no knowledge about the strengths and weaknesses of the available ATP systems, and through experience learns which are most likely to solve given problems, based on their syntactic characteristics. Course in automated reasoning. Programming in Perl, Prolog. Use of ML tools.
Cooperative SSCPA Masters Early Automated Theorem Proving (ATP) research focussed largely on the development of inference rules and search guidance heuristics. However, in recent years there has been an increased emphasis on the implementation of high performance ATP systems. There are now several such systems available from research groups around the world. In 1998 the SSCPA (Smart Selective Competition Parallelism ATP) system was built. When given a problem, SSCPA determines syntactic characteristics of the problems, and from an existing database of knowledge determines which ATP systems are most likely to be able to solve the problem. These chosen ATP systems are then executed in parallel, and all are stopped when a solution is found by any one of them. The Cooperative SSCPA project is to build a version of SSCPA that improves the performance of the individual ATP systems by exchanging information between them. The CSSCPA will not make any changes to the ATP systems, but instead will rely on standard output options of the systems. This project can be extended to use semantic information to help decide what information should be exchanged between which of the systems. Course in automated reasoning. Programming in Perl, Prolog. Distributed systems.
Semantic Resolution ATP Masters In 1967 John Slagle developed an Automated Theorem Proving (ATP) technique called semantic resolution, which uses semantic information about the axioms and conjecture to guide the actions of the ATP system. A simple instance of semantic resolution, called hyperresolution, has been very successfully used in several ATP systems. The project is to build an ATP system that fully implements semantic resolution. The basic implementation will need to be reasonably high performance, and then the semantic guidance will be imposed on the basic system. This will also require investigating how the required semantic information can be acquired or generated. Course in automated reasoning. Programming in Prolog, C, C++.
WWW Interface for SPASS-XDB

Taken by Alex Teyssandier

Undergraduate Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. SPASS is a well known ATP system, and SPASS-XDB is an extended version of SPASS that that retrieves external axioms asynchronously, on demand. While SPASS-XDB can prove some pretty funky stuff using very large external databases, currently the user is required to have some specific knowledge of the information stored in the databases, and how to formulate logical conjectures appropriately. The project is to build an interactive web interface for users to comfortably specify what they want to prove (know), generate the corresponding logical conjecture, submit the conjecture to SPASS-XDB, and present the result to the user as a web page. Course in artificial intelligence or automated reasoning. Web programming, e.g., Javascript, cgi-bin, PHP.
Survey of arithmetic in ATP

Taken by Peter Watson

Masters (UWI style) or undergraduate Automated Theorem Proving (ATP) systems are designed for reasoning with symbols. Users of ATP systems often have a need for integrated arithmetic capabilities, and some ATP systems provide some of the required capabilities. Some systems that provide some arithmetic capabilities are Otter, SPASS+T SPASS(T), SPASS-XDB, Mace4, SNARK, MetaTarski. There is a real need to understand and standardize arithmetic in ATP, and the TPTP proposal is one step in that direction. The project is to survey the field and provide a comprehensive overview of the state of the art of arithmetic in ATP. The survey will cover issues such as syntax, data types, logic, operators, formulations. A corpus of examples will be collected, illustrating the types of arithmetic that the different systems can deal with. Course in automated reasoning or artificial intelligence. Software collection, installation, and use. Writing.