# Binary Classification¶

This is a basic example in which we learn a grounding for the unary predicate $A$, defined on the space $[0,1]^2$.

We define the predicate $A$ to apply to points that are close to the center point $c=(.5,.5)$. To obtain training data, we randomly sample points from the domain and split them into two sets based on their Euclidean distance to $c$. We then state two facts about the predicate $A$: all points the predicate applies to are given as positive examples, and all points it does not apply to as negative examples.
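The threshold used below, $0.09$, is a squared radius, so the positive examples are exactly the points within distance $0.3$ of the center. A minimal stand-alone sketch of that labeling rule (plain NumPy; `is_A` is an illustrative helper, not part of the LTN API):

```python
import numpy as np

center = np.array([0.5, 0.5])

def is_A(point, radius=0.3):
    # A point is a positive example iff its squared Euclidean distance
    # to the center is below radius**2 (0.09 for radius 0.3).
    return bool(np.sum(np.square(point - center)) < radius**2)

# (0.5, 0.6) lies 0.1 from the center -> positive;
# (0.9, 0.9) lies well outside the circle -> negative.
```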

In [1]:
import logging; logging.basicConfig(level=logging.INFO)
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import logictensornetworks as ltn

plt.rcParams['font.size'] = 12
plt.rcParams['axes.linewidth'] = 1

Sample random data from $[0,1]^2$. Our ground-truth positive training data for $A$ lies close to the center $(.5,.5)$; all other data is treated as negative examples.

In [2]:
batch_size=64
nr_samples = 100
nr_samples_train = 50
data = np.random.uniform([0,0],[1,1],(nr_samples,2))
labels = np.sum(np.square(data-[.5,.5]),axis=1)<.09

# 50 examples for training; 50 examples for testing
ds_train = tf.data.Dataset\
.from_tensor_slices((data[:nr_samples_train],labels[:nr_samples_train]))\
.batch(batch_size)
ds_test = tf.data.Dataset\
.from_tensor_slices((data[nr_samples_train:],labels[nr_samples_train:]))\
.batch(batch_size)

plt.figure(figsize=(4,4))
plt.scatter(data[labels][:,0],data[labels][:,1],label='A')
plt.scatter(data[np.logical_not(labels)][:,0],data[np.logical_not(labels)][:,1],label='~A')
plt.title("Groundtruth")
plt.legend()
plt.show()


Define the predicate $A$. $A$ has arity 1 (single argument). The dimension of the argument is 2 (since the domain is $[0,1]^2$).

In [3]:
A = ltn.Predicate.MLP([2],hidden_layer_sizes=(16,16))
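`ltn.Predicate.MLP` wraps a multilayer perceptron whose final activation is a sigmoid, so its outputs can be read as truth degrees in $[0,1]$. A rough stand-alone NumPy sketch of such a forward pass (assumed layer sizes and ELU activations; not necessarily LTN's exact architecture, and the weights here are untrained):

```python
import numpy as np

rng = np.random.default_rng(0)

def elu(z):
    return np.where(z > 0, z, np.exp(z) - 1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Randomly initialized weights for layer sizes 2 -> 16 -> 16 -> 1
# (illustrative only; training would adjust these).
W1, b1 = rng.normal(size=(2, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 16)), np.zeros(16)
W3, b3 = rng.normal(size=(16, 1)), np.zeros(1)

def predicate_A(x):
    h = elu(x @ W1 + b1)
    h = elu(h @ W2 + b2)
    return sigmoid(h @ W3 + b3)  # truth degree in (0, 1)
```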


Import some operators to write the axioms.

In [4]:
Not = ltn.Wrapper_Connective(ltn.fuzzy_ops.Not_Std())
And = ltn.Wrapper_Connective(ltn.fuzzy_ops.And_Prod())
Or = ltn.Wrapper_Connective(ltn.fuzzy_ops.Or_ProbSum())
Implies = ltn.Wrapper_Connective(ltn.fuzzy_ops.Implies_Reichenbach())
Forall = ltn.Wrapper_Quantifier(ltn.fuzzy_ops.Aggreg_pMeanError(p=2),semantics="forall")
Exists = ltn.Wrapper_Quantifier(ltn.fuzzy_ops.Aggreg_pMean(p=2),semantics="exists")
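These wrappers apply fuzzy-logic operators to truth values in $[0,1]$. As a rough plain-Python sketch of the semantics named above (standard negation, product t-norm, probabilistic sum, Reichenbach implication), not the LTN implementations themselves:

```python
def not_std(a):
    # standard negation: 1 - a
    return 1.0 - a

def and_prod(a, b):
    # product t-norm: a * b
    return a * b

def or_probsum(a, b):
    # probabilistic sum, the dual of the product t-norm
    return a + b - a * b

def implies_reichenbach(a, b):
    # Reichenbach implication: 1 - a + a*b
    return 1.0 - a + a * b
```

Note that these operators agree with classical logic on the crisp values 0 and 1 and interpolate smoothly in between, which is what makes the satisfaction level differentiable.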


Now we add some facts to the knowledge base. We express that $A$ should hold for all positively labeled points, and should not hold for all negatively labeled points.

In [5]:
formula_aggregator = ltn.Wrapper_Formula_Aggregator(ltn.fuzzy_ops.Aggreg_pMeanError(p=2))

@tf.function
def axioms(data, labels):
    x_A = ltn.Variable("x_A",data[labels])
    x_not_A = ltn.Variable("x_not_A",data[tf.logical_not(labels)])
    axioms = [
        Forall(x_A, A(x_A)),
        Forall(x_not_A, Not(A(x_not_A)))
    ]
    sat_level = formula_aggregator(axioms).tensor
    return sat_level
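The universal quantifier is grounded here by the pMeanError aggregator, which in the LTN literature is defined as $1-\left(\frac{1}{n}\sum_i(1-a_i)^p\right)^{1/p}$ over truth values $a_i$; larger $p$ gives more weight to the worst-satisfied instances. A small NumPy sketch of this aggregator (illustrative, not the LTN implementation):

```python
import numpy as np

def pmean_error(truth_values, p=2):
    # Smooth "forall": 1 minus the p-mean of the errors (1 - a_i).
    # Equals 1 exactly when every truth value is 1.
    a = np.asarray(truth_values, dtype=float)
    return 1.0 - np.mean((1.0 - a) ** p) ** (1.0 / p)
```

For example, `pmean_error([1.0, 0.0])` is lower than `pmean_error([0.5, 0.5])` even though both lists have the same mean, reflecting the outlier-penalizing behavior.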


Initialize all layers and the static graph.

In [6]:
for _data, _labels in ds_test:
    print("Initial sat level %.5f"%axioms(_data, _labels))
    break

Initial sat level 0.49834



Train on the knowledge base by minimizing $1 - \mathrm{sat\_level}$.

In [7]:
mean_metrics = tf.keras.metrics.Mean()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

trainable_variables = A.trainable_variables
for epoch in range(2000):
    for _data, _labels in ds_train:
        with tf.GradientTape() as tape:
            loss = 1. - axioms(_data, _labels)
        grads = tape.gradient(loss, trainable_variables)
        optimizer.apply_gradients(zip(grads, trainable_variables))
    if epoch%100 == 0:
        mean_metrics.reset_states()
        for _data, _labels in ds_test:
            mean_metrics(axioms(_data, _labels))
        print("Epoch %d: Sat Level %.3f"%(epoch, mean_metrics.result()))
mean_metrics.reset_states()
for _data, _labels in ds_test:
    mean_metrics(axioms(_data, _labels))
print("Training finished at Epoch %d with Sat Level %.3f"%(epoch, mean_metrics.result()))


Epoch 0: Sat Level 0.499
Epoch 100: Sat Level 0.500
Epoch 200: Sat Level 0.520
Epoch 300: Sat Level 0.584
Epoch 400: Sat Level 0.585
Epoch 500: Sat Level 0.617
Epoch 600: Sat Level 0.645
Epoch 700: Sat Level 0.640
Epoch 800: Sat Level 0.627
Epoch 900: Sat Level 0.616
Epoch 1000: Sat Level 0.608
Epoch 1100: Sat Level 0.602
Epoch 1200: Sat Level 0.597
Epoch 1300: Sat Level 0.594
Epoch 1400: Sat Level 0.592
Epoch 1500: Sat Level 0.590
Epoch 1600: Sat Level 0.588
Epoch 1700: Sat Level 0.587
Epoch 1800: Sat Level 0.586
Epoch 1900: Sat Level 0.585
Training finished at Epoch 1999 with Sat Level 0.585


The following queries the knowledge base on training data and test data. The visualizations show the extent of generalization.

In [8]:
fig = plt.figure(figsize=(9, 11))

ax = plt.subplot2grid((3,8),(0,2),colspan=4)
ax.set_title("groundtruth")
ax.scatter(data[labels][:,0],data[labels][:,1],label='A')
ax.scatter(data[np.logical_not(labels)][:,0],data[np.logical_not(labels)][:,1],label='~A')
ax.legend()

# Training data
x = ltn.Variable("x",data[:nr_samples_train])

result=A(x)
plt.subplot2grid((3,8),(1,0),colspan=4)
plt.title("A(x) - training data")
plt.scatter(data[:nr_samples_train,0],data[:nr_samples_train,1],c=result.tensor.numpy().squeeze())
plt.colorbar()

result=Not(A(x))
plt.subplot2grid((3,8),(1,4),colspan=4)
plt.title("~A(x) - training data")
plt.scatter(data[:nr_samples_train,0],data[:nr_samples_train,1],c=result.tensor.numpy().squeeze())
plt.colorbar()

# Test data
x = ltn.Variable("x",data[nr_samples_train:])

result=A(x)
plt.subplot2grid((3,8),(2,0),colspan=4)
plt.title("A(x) - test data")
plt.scatter(data[nr_samples_train:,0],data[nr_samples_train:,1],c=result.tensor.numpy().squeeze())
plt.colorbar()