## News

##### Dagstuhl Seminar on Inductive Programming

The 7th AAIP Workshop on Approaches and Applications of Inductive Programming has been accepted as Dagstuhl Seminar 17382 and will take place September 17 to 20, 2017. The workshop is jointly organized by Ute Schmid (University of Bamberg), Stephen Muggleton (Imperial College London), and Rishabh Singh (MIT and Microsoft Research).

For more information, see the Dagstuhl Seminar Page.

##### RuleML Blog Report about Dagstuhl Seminar AAIP'16

AAIP'16 in the RuleML Blog

http://via.aayo.ws/YmI4J

##### CACM on Inductive Programming

The review article “Inductive programming meets the real world” by
Gulwani,
Hernández-Orallo,
Kitzelmann,
Muggleton,
Schmid, and
Zorn
has been published in the Communications of the ACM, Vol. 58, No. 11, Pages 90-99, DOI: 10.1145/2736282.

See the full text.

##### Wikipedia Page on Inductive Programming

José Hernández-Orallo and Ute Schmid created Wikipedia articles for Inductive Programming and Inductive Functional Programming.

##### Dagstuhl Seminar "Approaches and Applications of Inductive Programming"

José Hernández-Orallo (Polytechnic University of Valencia, ES), Stephen H. Muggleton (Imperial College London, GB), Ute Schmid (Universität Bamberg, DE) and Benjamin Zorn (Microsoft Research - Redmond, US) organize Dagstuhl Seminar 15442 "Approaches and Applications of Inductive Programming" scheduled for October 25 to 30, 2015.

The seminar is a continuation of the AAIP workshop series.

Please visit the AAIP 15 Homepage.

##### Report of Dagstuhl Seminar

We're pleased to inform you that the report of Dagstuhl Seminar 13502 is now published as part of the periodical Dagstuhl Reports.

The report is available online at the DROPS Server.

##### Dagstuhl Seminar "Approaches and Applications of Inductive Programming"

Ute Schmid (University of Bamberg), Emanuel Kitzelmann (University of Duisburg-Essen), Sumit Gulwani (Microsoft Research) and Marcus Hutter (Australian National University) organize Dagstuhl Seminar 13502 "Approaches and Applications of Inductive Programming" scheduled for Monday, December 09 to December 11, 2013. The seminar is a continuation of the AAIP workshop series.

Please visit the AAIP 13 Homepage.

##### 4th Workshop AAIP 2011

Ute Schmid and Emanuel Kitzelmann organize the 4th Workshop on Approaches and Applications of Inductive Programming. It will take place on July 19, 2011, in Odense, Denmark. Co-located events are the 13th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP 2011) and the 21st International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2011).

Details can be found on the AAIP 2011 Homepage.

## Systems

### ADATE

**A**utomatic

**D**esign of

**A**lgorithms

**T**hrough

**E**volution

- Roland Olsson's ADATE homepage

Language: ADATE ML

Year: 1994

Category: Genetic Programming

#### Short Description

Automatic Design of Algorithms Through Evolution (ADATE) is an automatic programming system that synthesizes function definitions in a subset of the programming language Standard ML. ADATE is primarily intended as a system that automatically improves a part of an existing program, but is also useful for generating programs from scratch provided that they are fairly small.

The evolution of programs is guided by an evaluation function that tests a given program and says how good it is. Note that the function generated by ADATE might be only a small part of this given program. Thus, ADATE is especially suitable for so-called reinforcement learning, but also useful in more traditional machine learning where input-output pairs are employed to define the evaluation function.
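The evaluation-function-guided evolution described above can be illustrated with a deliberately minimal Python sketch. This is not ADATE (which mutates Standard ML program trees); the candidates here are just coefficient pairs for a linear function, and all names are our own:

```python
import random

def evaluate(program, cases):
    # Evaluation function: fraction of input-output pairs the candidate
    # gets right. In ADATE the evaluation function may score any behaviour
    # of the given program, not just input-output pairs.
    score = 0
    for x, y in cases:
        try:
            if program(x) == y:
                score += 1
        except Exception:
            pass
    return score / len(cases)

def mutate(params):
    # Candidates here are coefficient pairs for x -> a*x + b;
    # ADATE instead applies program transformations to SML code.
    a, b = params
    return (a + random.choice([-1, 0, 1]), b + random.choice([-1, 0, 1]))

def evolve(cases, generations=500, seed=1):
    random.seed(seed)
    make = lambda p: (lambda x: p[0] * x + p[1])
    best = (0, 0)
    best_score = evaluate(make(best), cases)
    for _ in range(generations):
        cand = mutate(best)
        score = evaluate(make(cand), cases)
        if score >= best_score:  # keep candidates that are no worse
            best, best_score = cand, score
    return best

# target behaviour: f(x) = 3*x + 1
cases = [(x, 3 * x + 1) for x in range(5)]
```

The essential point is that the search never inspects the target program, only the evaluation function's verdict on each candidate.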

For example, one possible application of ADATE is to automatically improve image processing algorithms. One may then start with a possibly advanced and complex algorithm, select a part of it to be improved and then define an evaluation function that measures the performance of the new versions of the original algorithm that result from plugging in ADATE generated functions instead of the selected part. ADATE has been used in this way to improve one of the best known image segmentation algorithms.

Automatic algorithm improvement as outlined above appears to be practically useful in applications where programs written by human beings contain subroutines or functions that are designed through experimental feedback rather than being optimally defined using purely theoretical considerations.

Depending on the final size of the function to be improved and its algorithmic complexity, ADATE may need to test millions or even billions of candidates. However, we have found experimentally that the number of evaluations needed is typically determined by a low-degree polynomial in program size, provided that the training inputs and the evaluation function are well defined. The overall run time of ADATE approximately equals the number of evaluations multiplied by the time required to run a program on all the training inputs.

Therefore, ADATE is most easily used in applications where the overall program has a short run time and where the part of it selected for improvement is reasonably small, say smaller than 100 lines of Standard ML code.

ADATE can automatically generate non-trivial and novel algorithms that contain one or more help functions that are created by ADATE when and where needed. For example, ADATE can easily synthesize a functional program that generates a list of lists containing all permutations of a given input list even if it is not given any help functions to start with except the necessary data type constructors. In this case, the best synthesized program contains two invented auxiliary functions that perform combinations of list rotations and concatenations in non-trivial ways.
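The permutations example can be mirrored in Python (ADATE itself emits Standard ML, and its invented help functions are unnamed; the names `rotations` and `perms` here are our own illustration of a rotation-based program of the kind described):

```python
def rotations(xs):
    # All cyclic rotations of xs, e.g. [1,2,3] -> [1,2,3], [2,3,1], [3,1,2].
    # Plays the role of an invented auxiliary function.
    return [xs[i:] + xs[:i] for i in range(len(xs))] or [[]]

def perms(xs):
    # All permutations of xs, built by placing the head in front of each
    # permutation of the tail and taking every rotation of the result.
    if not xs:
        return [[]]
    return [q for p in perms(xs[1:]) for q in rotations([xs[0]] + p)]
```

For a three-element list this yields all six orderings; the recursion and the rotation helper together cover the whole permutation space without explicit backtracking.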

ADATE uses large-scale combinatorial search that employs sophisticated program transformations. Examples of transformations include synthesising expressions that do not make a program worse, and adding extra parameters to local help functions in order to make them more general.

The system currently runs on Linux clusters with MPI and can effectively utilize a few hundred CPU cores, but a rewrite to obtain practically unlimited scalability with an increasing number of cores per CPU and number of CPUs is planned.

*(Roland Olsson)*

### ALEPH

**A**

**L**earning

**E**ngine for

**P**roposing

**H**ypotheses

- Ashwin Srinivasan's Aleph homepage

Language: PROLOG

Year: 2007

Category: Inductive Logic Programming

#### Short Description

Aleph is intended to be a prototype for
exploring ideas. Earlier incarnations (under the
name P-Progol) originated in 1993 as part of a
fun project undertaken by Ashwin Srinivasan and
Rui Camacho at Oxford University. The main
purpose was to understand ideas of inverse
entailment which eventually appeared in Stephen
Muggleton's 1995 paper: *Inverse Entailment
and Progol*, New Gen. Comput., 13:245-286,
available
at ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Papers/InvEnt.ps.gz. Since
then, the implementation has evolved to emulate
some of the functionality of several other ILP
systems. Some of these of relevance to Aleph
are: CProgol, FOIL, FORS, Indlog, MIDOS, SRT,
Tilde, and WARMR.

During routine use, Aleph follows a very simple procedure that can be described in four steps:

1. **Select example.** Select an example to be generalised. If none exist, stop; otherwise proceed to the next step.
2. **Build most-specific-clause.** Construct the most specific clause that entails the example selected and is within the language restrictions provided. This is usually a definite clause with many literals, and is called the "bottom clause." This step is sometimes called the "saturation" step. Details of constructing the bottom clause can be found in Stephen Muggleton's 1995 paper *Inverse Entailment and Progol*, New Gen. Comput., 13:245-286, available at ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Papers/InvEnt.ps.gz.
3. **Search.** Find a clause more general than the bottom clause. This is done by searching for some subset of the literals in the bottom clause that has the "best" score. Two points should be noted. First, confining the search to subsets of the bottom clause does not produce all the clauses more general than it, but is good enough for this thumbnail sketch. Second, the exact nature of the score of a clause is not really important here. This step is sometimes called the "reduction" step.
4. **Remove redundant.** The clause with the best score is added to the current theory, and all examples made redundant are removed. This step is sometimes called the "cover removal" step. Note here that the best clause may make clauses other than the examples redundant. Again, this is ignored here. Return to step 1.
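This covering loop can be sketched in a few lines of Python. The sketch is schematic: the `saturate`, `search`, and `covers` callbacks stand in for Aleph's actual saturation, reduction, and coverage test, and the toy instantiation below (integers as "examples", a parity class as a "clause") is ours, not Aleph's:

```python
def induce(examples, saturate, search, covers):
    # Generic covering loop behind the four steps above.
    theory = []
    remaining = list(examples)
    while remaining:
        example = remaining[0]                # 1. select example
        bottom = saturate(example)            # 2. build most-specific clause
        clause = search(bottom, remaining)    # 3. search for the best generalisation
        theory.append(clause)
        remaining = [e for e in remaining     # 4. remove covered examples
                     if not covers(clause, e)]
    return theory

# Toy instantiation: each "clause" is a parity class covering all examples
# of the same parity, so two clauses suffice for this example set.
theory = induce([1, 3, 5, 2, 4],
                saturate=lambda e: e,
                search=lambda bottom, rem: bottom % 2,
                covers=lambda clause, e: e % 2 == clause)
```

The loop structure, not the toy callbacks, is the point: each iteration generalises one example and discards everything the new clause already explains.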

*(from the Aleph Manual)*

### ATRE

**A**pprendimento di

**T**eorie

**R**icorsive da

**E**sempi

- ATRE homepage

Language:

Year: 1998

Category: Inductive Logic Programming

#### Short Description

coming soon...

### De-Typechecker

- Oleg Kiselyov's homepage

Language: Haskell

Year: 2005

Category: Typechecking and theorem proving

#### Short Description

Unlike Djinn, which is a separate, standalone system, de-typechecker uses Haskell's own type-checker to create terms from types; the terms can then be evaluated by the same Haskell system.

*(Oleg Kiselyov)*

### DIALOGS

**D**ialogue-based

**I**nductive and

**A**bductive

**LOG**ic program

**S**ynthesiser

- Pierre Flener's homepage
- DIALOGS
- DIALOGS II

Language: PROLOG

Year: 1997

Category: Inductive Logic Programming

#### Short Description

The DIALOGS (Dialogue-based Inductive/Abductive LOGic program Synthesiser) technique resulted from an effort at building a fully interactive version of SYNAPSE, and at extending its power while at the same time simplifying its machinery.

The main objective was to take all burden from the specifier by having the technique ask for exactly and only the information it needs. As a result, no evidence needs to be prepared in advance, since the technique invents its own evidence and queries an oracle (usually a human specifier) about it.

This is suitable for all levels of expertise of
human users, as the queries are formulated in
the specifier's (initially unknown) conceptual
language, in a program-independent way, and such
that the specifier must know the answers if
s/he really feels the need for the
program. For instance, the query "When does
*sort([X,Y],R)* hold?" can be answered
by

*(R=[X,Y] ∧ X ≤ Y) ∨ (R=[Y,X] ∧ X > Y)*

thereby expanding the perceived subset of the
conceptual language of the specifier, and giving
the same information as in a *property*
provided to SYNAPSE. The evidence language
implicitly amounts to (non-recursive) normal
logic programs. Type declarations are provided
as language bias. Mode and determinism
information are not required, because the focus
is on synthesising the logic component of logic
programs. The technique is schema-guided, and
currently has two schemas
(*divide-and-conquer* and
*accumulate*) where multiple base clauses
and multiple recursive clauses are possible. The
hypothesis language is normal logic programs,
where negation is restricted to the case
discriminants and appears there by extraction
from the query answers (i.e., it can only be
applied to primitive predicates and could thus
be avoided by using the complementary primitives
in the query answers, as we did above by writing
*X > Y* instead of *¬ X ≤
Y*).

*(Pierre Flener)*

### Djinn

Language: Haskell

Year: 2005

Category: Typechecking and theorem proving

#### Short Description

Djinn takes a (Haskell) type and gives back a function of that type if one exists. The system uses a decision procedure for intuitionistic propositional calculus due to Roy Dyckhoff. It's a variation of Gentzen's LJ system. This means that (in theory) Djinn will always find a function if one exists, and if one doesn't exist Djinn will terminate telling you so.

*(Lennart Augustsson)*

### FLIP

- The homepage of the MIP Group

Language: C++

Year: 1999

Category: Induction of Functional Logic Programs

#### Short Description

FLIP implements a framework for the Induction of Functional Logic Programs (IFLP) from facts. This can be seen as an extension of the now consolidated field of Inductive Logic Programming (ILP). Inspired by the inverse resolution operator of ILP, the system is based on the reversal of narrowing, the usual operational mechanism of Functional Logic Programming. The main advantages of the FLIP system over the most widely used ILP systems are a natural handling of functions, without the use of mode or determinism declarations, and its power for inducing short recursive programs. Its applications are mainly program synthesis, program debugging, and data mining of small, highly structured documents.

The FLIP system accepts both positive and negative examples. Examples consist of sets of equations whose right-hand sides are normalised with respect to the program to be induced.

Background knowledge can also be used as a conditional functional logic program. Note that any logic program can be expressed as a conditional functional logic program. Therefore, virtually any logic program can be used as background knowledge as well as functional definitions. FLIP must be given the function symbols from the background knowledge that should be used in the definition of the function to be learned.

The FLIP system can induce 'correct' programs from very few examples. There is no need for good examples (in the sense of Ling), nor for Basic Representative Sets (BRS), which systems such as FOIL, GOLEM, and Progol require.

*(Cèsar Ferri)*

### FOIL/FFOIL

- Ross Quinlan's personal homepage

Language: C

Year: 1996

Category: Inductive Logic Programming

#### Short Description

coming soon...

### GOLEM

- Stephen Muggleton's personal homepage

Language: PROLOG

Year: 1992

Category: Inductive Logic Programming

#### Short Description

Golem (Muggleton and Feng, 1990) is an ILP system, written in C, which iteratively constructs a set of definite clauses from groups of atomic examples based on Plotkin's Relative Least General Generalisation (RLGG). Plotkin's approach provides a method of constructing a unique clause which covers a set of examples relative to given background knowledge. However, such a clause can in the worst case contain infinitely many literals, or at best grow exponentially with the number of examples involved. By bounding the depth of variables and imposing a determinacy constraint, the Golem system was guaranteed to build clauses whose length was bounded by a polynomial function of various features of the background knowledge.

Golem is given positive and negative examples together with extensional background knowledge. It starts by randomly choosing pairs of positive examples. For each pair the ij-determinate RLGG is constructed and then maximally generalised by dropping body atoms while maintaining consistency with respect to the negative examples. All reduced clauses are evaluated with respect to the degree to which they compress the information in the examples. The pair of examples leading to the most compressive clause is then extended by randomly adding examples to produce triples and later n-tuples, for which further RLGG clauses are constructed and reduced in a similar fashion. The single clause construction terminates once further extensions of the RLGG no longer lead to improved compression. Once a single clause has been asserted, all covered examples are removed and the clause construction process continues on all remaining examples. The process terminates when no remaining pair of examples has a consistent RLGG.
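The core operation behind Plotkin's RLGG, anti-unification of two terms into their least general generalisation, can be sketched in Python. This shows only the lgg of plain first-order terms; Golem computes RLGGs of whole clauses relative to background knowledge, which this sketch does not attempt:

```python
def lgg(t1, t2, subst=None):
    # Least general generalisation (anti-unification) of two first-order
    # terms. A term is either a constant or a tuple ('functor', arg1, ...).
    if subst is None:
        subst = {}
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        # Same functor and arity: generalise argument-wise.
        return (t1[0],) + tuple(lgg(a, b, subst) for a, b in zip(t1[1:], t2[1:]))
    # Differing subterms are replaced by a variable; the same pair of
    # subterms always maps to the same variable, which is what makes the
    # generalisation least general.
    if (t1, t2) not in subst:
        subst[(t1, t2)] = 'V%d' % len(subst)
    return subst[(t1, t2)]

# lgg of p(a, f(a)) and p(b, f(b)) is p(V0, f(V0))
g = lgg(('p', 'a', ('f', 'a')), ('p', 'b', ('f', 'b')))
```

Note how both occurrences of the differing pair (a, b) are mapped to the same variable V0, preserving the shared structure of the two atoms.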

Golem was applied to a variety of relational learning problems including the learning of simple Prolog programs as well as real-world applications involving protein-fold prediction, drug-activity prediction, learning of computer-aided design rules and various chess problems.

The use of inverse tabling methods allowed the efficiency of Golem to be comparable to the contemporary ILP system FOIL. FOIL greedily searches the space guided by an information measure similar to that used in ID3. This measure supports the addition of a literal in the body of a clause on the basis of its ability to discriminate between positive and negative examples. This gains efficiency at the expense of completeness. For instance the literal

*partition(Head,Tail,List1,List2)*

in the recursive quick-sort clause does not produce any discrimination between positive and negative examples. As a result quick-sort cannot be learned by FOIL. The problem recurs throughout a large class of predicates in which new variables in the body are functionally derived from terms in the head of the clause. Such predicates include arithmetic multiply, list reverse and various real-world domains.

Golem's key weaknesses with respect to the later-developed Progol system were associated with the required use of ground unit clause background knowledge and its limitation to the construction of determinate clauses. The determinacy constraint was particularly awkward for applications involving learning the properties of chemical compounds described in the form of atom and bond relationships.

*(S.H. Muggleton)*

### IGOR I

Language: LISP

Year: 2006

Category: Analytical Inductive Functional Programming

#### Short Description

IGOR I is a modern extension of Summers' seminal THESYS system. The underlying program template describes the set of all functional programs with the following restrictions: built-in functions can only be first-order, and no nested or mutual recursion is allowed. IGOR I adopts the two-step approach of THESYS. Synthesis is still restricted to structural problems, where only the structure of the arguments matters, but not their contents, such as in list reversing. Nevertheless, the scope of synthesisable programs is considerably larger. For instance, tree-recursive functions and functions with hidden parameters can be induced. Most notably, programs consisting of a calling function and an arbitrary set of further recursive functions can be induced. The first step of synthesis (trace construction) is therefore expanded such that traces can contain nestings of conditions. The second step is expanded such that the synthesis of a function can rely on the invention and synthesis of other functions (that is, IGOR I uses a technique of function invention in correspondence to the concept of predicate invention).

*(Ute Schmid)*

### IGOR II

Language: MAUDE

Year: 2008

Category: Analytical Inductive Functional Programming

#### Short Description

IGOR II relies on constructor-term rewriting
techniques and thereby covers characteristics of modern
functional programming languages:
Functions are defined with respect to algebraic
datatypes and control flow is guided by pattern
matching. Specifications are presented as sets
of example equations. Several dependent
functions, e.g., the mutual recursive solution
for *even* and *odd*, can be specified
and learned simultaneously. The example
equations can contain variables, which is
especially convenient for specifying more
complex problems where large sets of ground
examples would be necessary to cover all
possible cases of the desired behaviour of the
program to be learned.

IGOR II performs synthesis in one step from a given set of example equations together with the algebraic datatype. Background knowledge can be provided in the form of additional example equations. Due to the generality of user-defined datatypes, list problems as well as problems on natural numbers, trees, or any other algebraic type can be solved. Since the Boolean values can also be defined as an algebraic type, functions yielding trees, lists, numbers, etc., as well as predicates, e.g., equality and order relations, can be induced.

The patterns are assured to be non-unifying such
that the equations constituting the induced
program can be regarded as a *set* of
equations, where order does not matter. As a
consequence, e.g., *member* cannot be
defined based on the patterns *(X,[])*,
*(X,[X|Xs])* and *(X,[Y|Xs])* since
these patterns unify and hence cannot be
considered in an arbitrary order. Thus, for
solving problems where pattern variables need to
be compared regarding equality or any other
predicate, patterns alone do not suffice and
*if-then-else* conditionals are
needed. Currently, IGOR II can introduce such
conditionals if needed when the condition is an
equality or order test on the pattern variables
such that, e.g., *member* or inserting a
number at the right place into an ordered list
can be induced. When providing background
knowledge, a broader class of semantic problems
becomes feasible. For example, *insertion
sort* can be synthesised providing
*insert* as background knowledge; *quick
sort* can be synthesised providing
*append* and *partition*.

Induction in IGOR II relies on calculating least general generalisations (lggs) over the example equations and refining these lggs using the following three techniques: splitting rules by pattern refinement, introducing function calls, and function invention.
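The first of these refinement techniques, splitting rules by pattern refinement, can be sketched in Python. The data layout and function name below are our own illustration, not IGOR II's (which works on MAUDE equations): example equations are partitioned by the outermost constructor of one left-hand-side argument, which yields the refined, non-unifying patterns:

```python
def split_by_constructor(examples, arg_index):
    # Pattern refinement by splitting: partition example equations by the
    # outermost constructor of the chosen left-hand-side argument.
    # Each example equation is (lhs_args, rhs); a term is a constant or a
    # tuple ('constructor', subterm1, ...).
    groups = {}
    for lhs_args, rhs in examples:
        term = lhs_args[arg_index]
        constructor = term[0] if isinstance(term, tuple) else term
        groups.setdefault(constructor, []).append((lhs_args, rhs))
    return groups

# Example equations for even/1 over Peano numbers 0, s(0), s(s(0)), ...
examples = [
    (('0',), 'true'),
    ((('s', '0'),), 'false'),
    ((('s', ('s', '0')),), 'true'),
]
# Splitting on the first argument separates the base-case pattern 0
# from the recursive-case pattern s(X).
groups = split_by_constructor(examples, 0)
```

Each resulting group gets its own rule with a refined pattern, and the refinement can be applied again inside a group until the right-hand sides can be explained uniformly.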

*(Ute Schmid)*

### Inverse typechecking and theorem proving in intuitionistic and classical logics

- Oleg Kiselyov's homepage

Language: Kanren Project

Year: 2006

Category: Typechecking and theorem proving

#### Short Description

This program enumerates terms given a particular type. It uses a search strategy that seems better than BFS and certainly better than DFS: it seems complete (like BFS), that is, if a term with a given type exists, it will be found; on the other hand, it takes far fewer resources than breadth-first search. The advantage of using a logic-programming system is that we can give the system the overall shape of expected terms: a term with 'blanks'. The system will fill in the blanks. We can also restrict the set of generated terms to normal terms.

*(Oleg Kiselyov)*

### Magic Haskeller

- Susumu Katayama's Magic Haskeller homepage

Language: Haskell

Year: 2006

Category: Exhaustive Search-based Inductive Programming

#### Short Description

coming soon...

### PROGOL

- Stephen Muggleton's personal homepage

Language: PROLOG

Year: 1999

Category: Inductive Logic Programming

#### Short Description

coming soon...

### SYNAPSE

**SYN**thesis of

**A**lgorithms from

**P**ropertie**S** and

**E**xamples

- Download at Pierre Flener's homepage

Language: PROLOG

Year: 1992

Category: Inductive Logic Programming

#### Short Description

The SYNAPSE (SYNthesis of Algorithms from
PropertieS and Examples) technique takes as
evidence a set of (non-recursive) Horn clauses
describing a single intended relation. Ground
unit clauses are called (positive)
*examples* and data-drive the synthesis;
all other clauses are called *properties*
and are used to find instances of the
place-holders of the schema. For instance, the
clauses

*sort([X,Y],[X,Y]) ← X ≤ Y*

*sort([X,Y],[Y,X]) ← X > Y*

are properties. No other bias is given, though
types are inferred from the examples. Mode and
determinism information are not required,
because the focus is on synthesising the logic
component of logic programs. Synthesis is
passive, although there is an *expert* mode
where the technique asks for a preference among
the possible instances of some place-holders,
rather than non-deterministically choosing each
from a repository. These problem-independent
repositories form the (partitioned) background
knowledge. The technique is based on a
divide-and-conquer schema where multiple base
clauses and multiple recursive clauses are
possible. The hypothesis language is normal
logic programs, where negation is restricted to
the case discriminants and appears there by
extraction from the properties (i.e., it can
only be applied to primitive predicates and
could thus be avoided by using the complementary
primitives in the properties).

*(Pierre Flener)*