In this notebook we see how to call Gobnilp from within an R session and how to use Gobnilp and bnlearn together. First of all let's attach the bnlearn package (which I will assume is already installed):
library(bnlearn)
Attaching package: ‘bnlearn’

The following object is masked from ‘package:stats’:

    sigma
To communicate with Gobnilp we will use the excellent reticulate
package (which is also assumed installed):
library(reticulate)
Now that reticulate is in use we can use its "import" command to import the gobnilp module:
gob <- import("pygobnilp.gobnilp")
Now we can use the newly created gob to access attributes of the gobnilp module. Usually the class Gobnilp is all we are interested in: we just want to use it to create a Gobnilp object. Here's how to do that:
m <- gob$Gobnilp()
Note that instead of Python's '.' for attribute access we use R's '$'. Although above we created the variable gob, there is usually no need to do so: if all we want is the Gobnilp object, this is simpler:
m <- import("pygobnilp.gobnilp")$Gobnilp()
Now that we have the object m
we can use it to, for example, learn a BN from a data file:
m$learn("discrete.dat",plot=FALSE)
I have put plot=FALSE since, although the learned BN is indeed plotted when running Gobnilp from R, errors are also generated. Note also that there is no textual output about the learned BN here. This is a Jupyter issue: the textual output is generated when using a normal R session. But we can still get a report on what was learned as follows:
m$learned_bn
**********
BN has score -24028.0947783535
**********
A<- -5502.137377150637
B<-A -3688.9395212202216
C<- -3501.5105385969146
D<-A,C -3555.014444236549
E<-B,F -4310.304956470649
F<- -3470.18794067853
**********
bnlearn modelstring =
[A][B|A][C][D|C:A][E|B:F][F]
writeLines(m$learned_bn$cpdag_str())
**********
CPDAG:
Vertices: A,B,C,D,E,F
A-B
A->D
B->E
C->D
F->E
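Since Gobnilp reports a bnlearn modelstring, one convenient way to move the learned structure into bnlearn is via bnlearn's model2network function. Here is a sketch, where the modelstring is copied by hand from the output above:

```r
library(bnlearn)

# Rebuild the Gobnilp-learned DAG as a bnlearn object from its modelstring
bn_from_gobnilp <- model2network("[A][B|A][C][D|C:A][E|B:F][F]")

# All of bnlearn's machinery is now available, e.g. its arcs and its own CPDAG
arcs(bn_from_gobnilp)
cpdag(bn_from_gobnilp)
```

This keeps the two systems loosely coupled: the structure crosses over as a plain string rather than as a Python object.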
We can also use Gobnilp to learn a BN from data supplied as an R data frame. Since we have bnlearn available we can grab its built-in dataset learning.test and then have a look at it.
data(learning.test)
summary(learning.test)
 A        B        C        D        E        F
 a:1668   a:2362   a:3717   a:1755   a:1941   a:2509
 b:1670   b: 568   b:1024   b:1570   b:1493   b:2491
 c:1662   c:2070   c: 259   c:1675   c:1566
So learning.test is a data frame with 6 discrete variables. Here's how to use Gobnilp to learn from this data frame (with default parameters):
m2 <- gob$Gobnilp()
m2$learn(learning.test,plot=FALSE)
m2$learned_bn
**********
BN has score -24028.0947783535
**********
A<- -5502.137377150637
B<-A -3688.939521220218
C<- -3501.5105385969146
D<-A,C -3555.014444236549
E<-B,F -4310.3049564706525
F<- -3470.18794067853
**********
bnlearn modelstring =
[A][B|A][C][D|C:A][E|B:F][F]
So we get the same BN as before, which is no surprise since learning.test and the data in the file discrete.dat are exactly the same! Since we have only 6 variables there is no need to use Gobnilp's default parent set limit of 3. So let's learn again without it.
m2 <- gob$Gobnilp()
m2$learn(learning.test,plot=FALSE,palim=99)
m2$learned_bn
**********
BN has score -24028.0947783535
**********
A<- -5502.137377150637
B<-A -3688.939521220218
C<- -3501.5105385969146
D<-A,C -3555.014444236549
E<-B,F -4310.3049564706525
F<- -3470.18794067853
**********
bnlearn modelstring =
[A][B|A][C][D|C:A][E|B:F][F]
We get the same BN yet again, so this really is a globally optimal BN. For comparison, we can use one of bnlearn's structure learning algorithms on learning.test. Here's hill-climbing, for example:
m3 <- hc(learning.test)
m3
  Bayesian network learned via Score-based methods

  model:
   [A][C][F][B|A][D|A:C][E|B:F]
  nodes:                                 6
  arcs:                                  5
    undirected arcs:                     0
    directed arcs:                       5
  average markov blanket size:           2.33
  average neighbourhood size:            1.67
  average branching factor:              0.83

  learning algorithm:                    Hill-Climbing
  score:                                 BIC (disc.)
  penalization coefficient:              4.258597
  tests used in the learning procedure:  40
  optimized:                             TRUE
So hill-climbing with the BIC score finds the same network.
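We can also check this agreement programmatically in bnlearn. The sketch below rebuilds Gobnilp's result from its printed modelstring (rather than converting the Python object directly) and compares it to the hill-climbing result:

```r
library(bnlearn)
data(learning.test)

m3 <- hc(learning.test)                                  # hill-climbing result
gob_bn <- model2network("[A][B|A][C][D|C:A][E|B:F][F]")  # Gobnilp's result

all.equal(m3, gob_bn)  # TRUE when the two DAGs are identical
compare(gob_bn, m3)    # arc-level agreement: true/false positives and negatives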
We can use Gobnilp to learn from continuous data in the same way. Here is an example using the bnlearn dataset gaussian.test:
data(gaussian.test)
m4 <- gob$Gobnilp()
m4$learn(gaussian.test,data_type="continuous",score="BGe",plot=FALSE,palim=99)
m4$learned_bn
**********
BN has score -53258.94161814129
**********
A<- -7124.782936593145
B<-D 487.26955605315015
D<- -14692.560410628308
C<-A,B -3743.043565645814
E<- -10545.851006239516
F<-A,D,E,G -7109.807736959381
G<- -10530.165518128275
**********
bnlearn modelstring =
[A][B|D][D][C|B:A][E][F|G:E:D:A][G]
And again, we can see how hill-climbing does (using BGe scoring) ...
m5 <- hc(gaussian.test,score="bge")
m5
score(m5,gaussian.test,type="bge")
  Bayesian network learned via Score-based methods

  model:
   [A][B][E][G][C|A:B][D|B][F|A:D:E:G]
  nodes:                                 7
  arcs:                                  7
    undirected arcs:                     0
    directed arcs:                       7
  average markov blanket size:           4.00
  average neighbourhood size:            2.00
  average branching factor:              1.00

  learning algorithm:                    Hill-Climbing
  score:                                 Bayesian Gaussian (BGe)
  graph prior:                           Uniform
  imaginary sample size (normal):        1
  imaginary sample size (Wishart):       9
  tests used in the learning procedure:  75
  optimized:                             TRUE
Hill-climbing and Gobnilp's exact algorithm find BNs in the same Markov equivalence class, and since BGe is score-equivalent they have the same score. In this particular example, the only advantage of exact learning is that we know we have the optimal BN.
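The Markov-equivalence claim can itself be checked in bnlearn by comparing CPDAGs. The following sketch again rebuilds Gobnilp's network from the modelstring it printed above:

```r
library(bnlearn)
data(gaussian.test)

m5 <- hc(gaussian.test, score = "bge")
gob_bn <- model2network("[A][B|D][D][C|B:A][E][F|G:E:D:A][G]")

# Two DAGs are Markov equivalent iff their CPDAGs coincide
all.equal(cpdag(m5), cpdag(gob_bn))

# Score equivalence: both DAGs receive the same BGe score
score(m5, gaussian.test, type = "bge")
score(gob_bn, gaussian.test, type = "bge")
```

Here the two DAGs differ only in the orientation of the B-D arc, which is not compelled, so they collapse to the same CPDAG.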