Consortium

Default

Name

A. Srinivasan

Address

Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, UK (email: ashwin@comlab.ox.ac.uk)

Materials

Data

The data is partitioned into 3 groups: (A) compounds in PTE-1; (B) compounds in PTE-2; and (C) all remaining compounds. As of 14/04/97, these had the following distribution:

Category    Carcinogenic    Non-carcinogenic
A           20              19
B           at least 7      at least 6
C           162             136

Program

Given a training set, the algorithm MajClass calculates the most frequently occurring class. The theory constructed by this algorithm classifies all compounds as this most frequently occurring class. MajClass uses the Prolog data representation followed in experiments with Progol.
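The behaviour described above can be sketched in a few lines. This is a minimal illustration in Python, not the Prolog program used for the submission; the function names `maj_class` and `make_theory` and the label strings are assumptions made for the example. The source does not say how ties are broken, so the sketch simply keeps whichever tied class was seen first.

```python
from collections import Counter


def maj_class(training_labels):
    """Return the most frequently occurring class label in the training set."""
    if not training_labels:
        raise ValueError("training set must not be empty")
    counts = Counter(training_labels)
    # most_common(1) returns [(label, count)]; ties keep first-seen order
    # (tie handling is an assumption, the source does not specify it).
    label, _ = counts.most_common(1)[0]
    return label


def make_theory(training_labels):
    """Build a 'theory' that assigns the majority class to every compound."""
    majority = maj_class(training_labels)

    def classify(compound):
        # The compound itself is never inspected.
        return majority

    return classify


if __name__ == "__main__":
    # Hypothetical toy training set, not the PTE data.
    theory = make_theory(["carcinogenic", "carcinogenic", "non_carcinogenic"])
    print(theory("any_compound"))
```

Note that `classify` ignores its argument entirely: the prediction depends only on the class counts in the training set.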

Method

  1. Compounds in Category C are provided as the training set to MajClass.
  2. The resulting theory is used to classify compounds in Categories A and B.

Results

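Based on the distribution of compounds in Category C (162 carcinogenic against 136 non-carcinogenic), the theory constructed by MajClass classifies all compounds as carcinogenic. Because the theory depends only on class frequencies in the training set, it has no explanatory power.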
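As a worked check of this outcome, the sketch below takes the Category C and Category A counts from the distribution reported under Data and derives the predicted class and the fraction of Category A compounds it matches. The variable names are assumptions for illustration, and the accuracy figure is plain arithmetic on those counts, not a result reported in the submission. Category B is left out because only lower bounds ("at least 7", "at least 6") are available for it.

```python
# Class counts as of 14/04/97, taken from the distribution table.
category_c = {"carcinogenic": 162, "non_carcinogenic": 136}  # training set
category_a = {"carcinogenic": 20, "non_carcinogenic": 19}    # PTE-1 compounds

# The majority class in the training set is the prediction for every compound.
predicted = max(category_c, key=category_c.get)

# Every Category A compound receives the same label, so the number of
# correct predictions is the count of that label in Category A.
correct_on_a = category_a[predicted]
total_a = sum(category_a.values())
accuracy_a = correct_on_a / total_a

print(predicted, f"{correct_on_a}/{total_a}", round(accuracy_a, 3))
# prints: carcinogenic 20/39 0.513
```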

Comments

References