In [1]:

```
require 'daru'
```

Out[1]:

This is animal shelter data taken from a Kaggle competition.

The animals were given up by their owners to a shelter. Let's gain some insight into this data.

In [2]:

```
shelter_data = Daru::DataFrame.from_csv '../data/animal_shelter_train.csv'
shelter_data.head(3)
```

Out[2]:

In [3]:

```
shelter_data.shape
```

Out[3]:

We are not interested in `DateTime`, `AnimalID` and `OutcomeSubtype`, so we will delete them.

Since `OutcomeType`, `AnimalType`, `SexuponOutcome`, `Breed` and `Color` are qualitative variables, we'll convert them to type category.

In [4]:

```
shelter_data.delete_vectors 'DateTime', 'AnimalID', 'OutcomeSubtype'
shelter_data.to_category 'OutcomeType', 'AnimalType', 'SexuponOutcome', 'Breed', 'Color'
shelter_data.first 5
```

Out[4]:

We'll categorize `AgeuponOutcome(Weeks)` to get a quick summary of the ages (as we will see later).

In [5]:

```
shelter_data['AgeuponOutcome'] = shelter_data['AgeuponOutcome(Weeks)'].cut(
  [0, 1, 4, 52, 260, 1500],
  labels: [:less_than_week, :less_than_month, :less_than_year, :one_to_five_years, :more_than__five_years]
)
shelter_data.delete_vector 'AgeuponOutcome(Weeks)'
nil
```
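
To make the binning step concrete, here is a minimal plain-Ruby sketch of sorting ages into labelled ranges. It is not Daru's implementation: the helper `age_bin` and the sample ages are hypothetical, and the sketch assumes each bin includes its lower edge and excludes its upper edge, which may differ from how `cut` treats boundaries.

```ruby
# Plain-Ruby sketch of binning numeric ages into labelled ranges.
# Assumption: a bin covers low <= age < high; Daru's boundary rule may differ.
edges  = [0, 1, 4, 52, 260, 1500]
labels = %i[less_than_week less_than_month less_than_year
            one_to_five_years more_than__five_years]

# Hypothetical helper: returns the label of the bin containing the age,
# or nil when the age falls outside every bin.
def age_bin(age_in_weeks, edges:, labels:)
  edges.each_cons(2).with_index do |(low, high), i|
    return labels[i] if age_in_weeks >= low && age_in_weeks < high
  end
  nil
end

sample_ages = [0, 3, 30, 104, 520] # hypothetical ages in weeks
binned = sample_ages.map { |age| age_bin(age, edges: edges, labels: labels) }
```

Six edges produce five bins, which is why the `labels:` array has one entry fewer than the list of partitions.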

Let's look at the categories we have formed.

In [6]:

```
shelter_data['AgeuponOutcome'].frequencies.sort ascending: false
```

Out[6]:

Say we are interested in the percentage of each type of animal in the shelter.

In [7]:

```
shelter_data['AnimalType'].frequencies :percentage
```

Out[7]:

This tells us that dogs make up 58% and cats 41% of our dataset. Let's explore further.
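
For readers who want to see what a percentage frequency table boils down to, the following plain-Ruby sketch counts each value and divides by the total. The helper `percentage_frequencies` and the sample array are hypothetical and only illustrate the arithmetic; they are not how Daru computes its result.

```ruby
# Plain-Ruby sketch of a percentage frequency table (not Daru's implementation).
def percentage_frequencies(values)
  # Count how often each distinct value occurs.
  counts = values.each_with_object(Hash.new(0)) { |value, acc| acc[value] += 1 }
  # Convert each count into a percentage of all values, rounded to two decimals.
  counts.transform_values { |count| (100.0 * count / values.size).round(2) }
end

animal_types = %w[Dog Dog Cat Dog Cat] # hypothetical sample, not the shelter data
shares = percentage_frequencies(animal_types)
```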

Let's look at the possible outcomes along with their frequencies.

In [8]:

```
shelter_data['OutcomeType'].frequencies
```

Out[8]:

So, a large number of these animals are adopted, which is great.

Let's get some insight into the animals that died.

In [9]:

```
died = shelter_data.where shelter_data['OutcomeType'].eq('Died')
died['AnimalType'].frequencies :percentage
```

Out[9]:
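
The filtering pattern used here, comparing a column to a value and passing the result to `where`, can be pictured as a two-step mask. The sketch below reproduces that idea in plain Ruby on a few hypothetical rows; it illustrates the concept rather than Daru's internals.

```ruby
# Plain-Ruby sketch of mask-based row filtering, the idea behind `where`.
rows = [ # hypothetical rows, not the shelter data
  { 'OutcomeType' => 'Died',     'AnimalType' => 'Cat' },
  { 'OutcomeType' => 'Adoption', 'AnimalType' => 'Dog' },
  { 'OutcomeType' => 'Died',     'AnimalType' => 'Dog' },
  { 'OutcomeType' => 'Transfer', 'AnimalType' => 'Cat' }
]

# Step 1: compare one column against a value to get a true/false mask.
mask = rows.map { |row| row['OutcomeType'] == 'Died' }

# Step 2: keep only the rows whose mask entry is true.
died_rows = rows.zip(mask).select { |_row, keep| keep }.map(&:first)
```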

Let's look at the ages of the cats and dogs that died.

In [10]:

```
died.where(died['AnimalType'].eq 'Dog')['AgeuponOutcome'].frequencies :percentage
```

Out[10]:

In [11]:

```
died.where(died['AnimalType'].eq 'Cat')['AgeuponOutcome'].frequencies :percentage
```

Out[11]:

Also, younger cats are more prone to die.

Let's turn our attention to the animals that were adopted.

In [12]:

```
adopted = shelter_data.where shelter_data['OutcomeType'].eq('Adoption')
adopted['AnimalType'].frequencies :percentage
```

Out[12]:

Hmm... dogs are more likely to be adopted; maybe that explains why so many cats die.

Let's now look at the animals that were returned to their owner.

In [13]:

```
owner = shelter_data.where shelter_data['OutcomeType'].eq('Return_to_owner')
owner['AnimalType'].frequencies :percentage
```

Out[13]:

Astonishingly, 90% of the animals returned to their owner are dogs, while only 10% are cats.
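
One caveat when reading these numbers: filtering by outcome and then taking `AnimalType` percentages gives the share of each species among the returned animals, which is not the same as the share of each species that gets returned. The plain-Ruby sketch below shows the two denominators side by side; the rows are hypothetical and chosen only for illustration.

```ruby
# Sketch: the same counts give different percentages depending on the denominator.
animals = [ # hypothetical [type, outcome] pairs, not the shelter data
  ['Dog', 'Return_to_owner'], ['Dog', 'Return_to_owner'], ['Dog', 'Return_to_owner'],
  ['Dog', 'Adoption'], ['Dog', 'Adoption'], ['Dog', 'Transfer'],
  ['Cat', 'Return_to_owner'], ['Cat', 'Adoption'], ['Cat', 'Transfer'], ['Cat', 'Transfer']
]

returned = animals.select { |_type, outcome| outcome == 'Return_to_owner' }
dogs     = animals.select { |type, _outcome| type == 'Dog' }

# Share of returned animals that are dogs (filter by outcome, then by type).
dogs_among_returned = 100.0 * returned.count { |type, _| type == 'Dog' } / returned.size

# Share of dogs that were returned (filter by type, then by outcome).
returned_among_dogs = 100.0 * dogs.count { |_, outcome| outcome == 'Return_to_owner' } / dogs.size
```

To answer the second question on the real data, one would filter on `AnimalType` first and then look at the `OutcomeType` frequencies.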