This notebook describes indexing in Daru::DataFrame with the newly added Categorical Index and other index classes.
require 'daru'
true
Helper function to get a sample dataframe.
def sample_df(idx)
  Daru::DataFrame.new({
    a: 1..5,
    b: 'a'..'e',
    c: 11..15
  }, index: idx)
end
:sample_df
idx = Daru::CategoricalIndex.new [:a, :b, :a, :b, :c]
#<Daru::CategoricalIndex(5): {a, b, a, b, c}>
df = sample_df idx
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | a | 11 |
| b | 2 | b | 12 |
| a | 3 | c | 13 |
| b | 4 | d | 14 |
| c | 5 | e | 15 |
Retrieve rows by category or position.
Note: When a key is both a valid category and a valid position, it is treated as a category.
df.row[:a, :c]
Daru::DataFrame(3x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | a | 11 |
| a | 3 | c | 13 |
| c | 5 | e | 15 |
df.row[0, 1]
Daru::DataFrame(2x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | a | 11 |
| b | 2 | b | 12 |
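The precedence rule in the note above can be modelled in plain Ruby. The sketch below is only an illustration, not Daru's implementation: `resolve_positions` is a hypothetical helper that reads a key as a category when that category exists and falls back to reading it as a position otherwise.

```ruby
# Hypothetical model of the lookup rule; this is NOT Daru's actual code.
# A key that names a category wins over the same key read as a position.
def resolve_positions(categories, *keys)
  keys.flat_map do |key|
    if categories.include?(key)
      # Category match: collect every position holding that category.
      categories.each_index.select { |i| categories[i] == key }
    elsif key.is_a?(Integer) && key.between?(0, categories.size - 1)
      # No such category: fall back to reading the key as a position.
      [key]
    else
      raise IndexError, "#{key.inspect} is neither a category nor a valid position"
    end
  end
end

labels = [:a, :b, :a, :b, :c]
resolve_positions(labels, :a, :c)  # => [0, 2, 4]
resolve_positions(labels, 0, 1)    # => [0, 1]

# With integer categories, 0 is read as a category first.
resolve_positions([1, 0, 1], 0)    # => [1]
```

The last call shows the ambiguous case: 0 is a valid position, but because it is also a category, the row labelled 0 (at position 1) is selected.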
DataFrame#[] fetches vectors and works similarly to #row[].
df[:a, :b]
Daru::DataFrame(5x2)
| | a | b |
|---|---|---|
| a | 1 | a |
| b | 2 | b |
| a | 3 | c |
| b | 4 | d |
| c | 5 | e |
To retrieve rows by position:
df.row.at 0, 1, 2
Daru::DataFrame(3x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | a | 11 |
| b | 2 | b | 12 |
| a | 3 | c | 13 |
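A purely positional accessor matters most when the categories are themselves integers, because a label-style lookup would then read an integer key as a category. The plain-Ruby sketch below models that difference; it does not use Daru, and the `categories` and `rows` arrays are made-up illustration data.

```ruby
# Hypothetical model; this is NOT Daru's code. Rows sit alongside integer categories.
categories = [1, 0, 1]
rows       = [%w[p q], %w[r s], %w[t u]]

# Label-style lookup: 0 is a category, so it selects the row labelled 0.
by_category = rows.each_with_index
                  .select { |_row, i| categories[i] == 0 }
                  .map(&:first)

# Position-style lookup: 0 always means the first row, whatever its label.
by_position = rows.values_at(0)

p by_category  # [["r", "s"]]
p by_position  # [["p", "q"]]
```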
To retrieve vectors by position:
df.at 0, 1
Daru::DataFrame(5x2)
| | a | b |
|---|---|---|
| a | 1 | a |
| b | 2 | b |
| a | 3 | c |
| b | 4 | d |
| c | 5 | e |
Set rows by categories or positions.
Note: When a key is both a valid category and a valid position, it is treated as a category.
df.row[:a] = ['x', 'y', 'z']
df
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | x | y | z |
| b | 2 | b | 12 |
| a | x | y | z |
| b | 4 | d | 14 |
| c | 5 | e | 15 |
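As the output shows, assigning to a category overwrites every row that carries it. A minimal plain-Ruby model of that behaviour follows; `assign_by_category!` is a hypothetical helper written for illustration and is not part of Daru.

```ruby
# Hypothetical model; this is NOT Daru's code.
# Overwrite every row whose category matches, leaving the other rows untouched.
def assign_by_category!(rows, categories, category, new_row)
  categories.each_with_index do |cat, i|
    rows[i] = new_row.dup if cat == category # dup so the rows stay independent
  end
  rows
end

categories = [:a, :b, :a, :b, :c]
rows = [[1, 'a', 11], [2, 'b', 12], [3, 'c', 13], [4, 'd', 14], [5, 'e', 15]]

assign_by_category!(rows, categories, :a, %w[x y z])
rows.each { |row| p row }
```

Both rows labelled :a (positions 0 and 2) end up holding x, y, z, mirroring the dataframe above.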
DataFrame#[]= works similarly to #row[]= and sets vectors.
df[:a] = [1]*5
df
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | y | z |
| b | 1 | b | 12 |
| a | 1 | y | z |
| b | 1 | d | 14 |
| c | 1 | e | 15 |
Set the rows at the given positions to a given vector.
# reset dataframe
df = sample_df idx
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | 1 | a | 11 |
| b | 2 | b | 12 |
| a | 3 | c | 13 |
| b | 4 | d | 14 |
| c | 5 | e | 15 |
df.row.set_at [0, 4], ['x', 'y', 'z']
df
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | x | y | z |
| b | 2 | b | 12 |
| a | 3 | c | 13 |
| b | 4 | d | 14 |
| c | x | y | z |
DataFrame#set_at works similarly to #row.set_at, but sets vectors by position.
df.set_at [0, 1], [nil]*5
df
Daru::DataFrame(5x3)
| | a | b | c |
|---|---|---|---|
| a | | | z |
| b | | | 12 |
| a | | | 13 |
| b | | | 14 |
| c | | | z |