Chapter 1: GemPy Basic¶

In this first example, we will show how to construct a first basic model and the main objects and functions. First we import gempy:

In [1]:

# These two lines are necessary only if gempy is not installed
import sys, os
sys.path.append("../")

# Importing gempy
import gempy as gp

# Embedding matplotlib figures into the notebooks
%matplotlib inline

# Aux imports
import numpy as np

All data get stored in a python object InputData. This object can be easily stored in a Python pickle. However, these files have the limitation that all dependecies must have the same versions as those when the pickle were created. For these reason to have more stable tutorials we will generate the InputData from raw data---i.e. csv files exported from Geomodeller.

These csv files can be found in the input_data folder in the root folder of GemPy. These tables contains uniquely the XYZ (and poles, azimuth and polarity in the foliation case) as well as their respective formation name (but not necessary the formation order).

In [8]:

# Importing the data from csv files and settign extent and resolution
geo_data = gp.create_data([0,2000,0,2000,-2000,0],[ 50,50,50],
                         path_f = os.pardir+"/input_data/FabLessPoints_Foliations.csv",
                         path_i = os.pardir+"/input_data/FabLessPoints_Points.csv")

With the command get data is possible to see all the input data. However, at the moment the (depositional) order of the formation and the separation of the series (More explanation about this in the next notebook) is totally arbitrary.

In [1]:

gp.get_data(geo_data).head()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-ceab18c1653c> in <module>()
----> 1 gp.get_data(geo_data).head()

NameError: name 'gp' is not defined

To set the number of (depositional) series and which formation belongs to which, it is possible to use the function set_series. Here, there are two important things to notice:

set_series requires a dictionary. In Python dictionaries are not order (keys dictionaries since Python 3.6 are) and therefore there is not guarantee of having the right order.
- The order of the series are vital for the method (from younger to older).
- The order of the formations (as long they belong to the correct series) are only important for the color code. If the order of the pile differs from the final result the color of the interfaces and input data will be different
Every fault is treated as an independent series and have to be at the top of the pile. The relative order between the distinct faults represent the tectonic relation between them (from younger to older as well).

The order of the series (for Python < 3.6, otherwise passing the correct order of keys in the dictionary is enough) can be given as attribute as in the cell below. For the order of formations can be passed as attribute as well or using the specific function set_order_formations.

In [10]:

# Assigning series to formations as well as their order (timewise)
gp.set_series(geo_data, {"fault":'MainFault', 
                      "Rest": ('SecondaryReservoir','Seal', 'Reservoir', 'Overlying')},
                       order_series = ["fault", 'Rest'],
                       order_formations=['MainFault', 
                                         'SecondaryReservoir', 'Seal','Reservoir', 'Overlying',
                                         ]) 

Out[10]:

<gempy.strat_pile.StratigraphicPile at 0x7f6e21b43048>

As an alternative the stratigraphic pile is interactive given the right backend (try %matplotlib notebook or %matplotlib qt5). These backends sometimes give some trouble though. Try to execute the cell twice:

In [7]:

%matplotlib notebook
gp.get_stratigraphic_pile(geo_data)

Out[7]:

<gempy.strat_pile.StratigraphicPile at 0x7f6e21c1d470>

Notice that the colors depends on the order and therefore every time the cell is executed the colors are always in the same position. Be aware of the legend to be sure that the pile is as you wish!! (In the future every color will have the annotation within the rectangles to avoid confusion)

This geo_data object contains essential information that we can access through the correspondent getters. Such a the coordinates of the grid.

In [6]:

print(gp.get_grid(geo_data))

[[    0.             0.         -2000.        ]
 [    0.             0.         -1959.18371582]
 [    0.             0.         -1918.36730957]
 ..., 
 [ 2000.          2000.           -81.63265228]
 [ 2000.          2000.           -40.81632614]
 [ 2000.          2000.             0.        ]]

The main input the potential field method is the coordinates of interfaces points as well as the orientations. These pandas dataframes can we access by the following methods:

Interfaces Dataframe¶

In [7]:

gp.get_data(geo_data, 'interfaces').head()

Out[7]:

	X	Y	Z	formation	series	annotations
0	1000	1000	-1000	MainFault	fault	${\bf{x}}_{\alpha \,{\bf{1}},0}$
1	800	1000	-1600	MainFault	fault	${\bf{x}}_{\alpha \,{\bf{1}},1}$
2	1200	1000	-400	MainFault	fault	${\bf{x}}_{\alpha \,{\bf{1}},2}$
3	1100	1000	-700	MainFault	fault	${\bf{x}}_{\alpha \,{\bf{1}},3}$
4	900	1000	-1300	MainFault	fault	${\bf{x}}_{\alpha \,{\bf{1}},4}$

Foliations Dataframe¶

Now the formations and the series are correctly set.

In [8]:

gp.get_data(geo_data, 'foliations').head()

Out[8]:

	X	Y	Z	dip	azimuth	polarity	formation	series	annotations
0	917.45	1000.0	-1135.398	71.565	270.0	1	MainFault	fault	${\bf{x}}_{\beta \,{\bf{1}},0}$
1	1450.00	1000.0	-1150.000	18.435	90.0	1	Reservoir	Rest	${\bf{x}}_{\beta \,{\bf{4}},0}$

It is important to notice the columns of each data frame. These not only contains the geometrical properties of the data but also the formation and series at which they belong. This division is fundamental in order to preserve the depositional ages of the setting to model.

A projection of the aforementioned data can be visualized in to 2D by the following function. It is possible to choose the direction of visualization as well as the series:

In [11]:

%matplotlib inline
gp.plot_data(geo_data, direction='y')

GemPy supports visualization in 3D as well trough vtk. These plots are interactive. Try to drag and drop a point or interface! In the perpendicular views only 2D movements are possible to help to place the data where is required.

In [12]:

gp.plot_data_3D(geo_data)

The ins and outs of Input data objects¶

As we have seen objects DataManagement.InputData (usually called geo_data in the tutorials) aim to have all the original geological properties, measurements and geological relations stored.

Once we have the data ready to generate a model, we will need to create the next object type towards the final geological model:

In [13]:

interp_data = gp.InterpolatorInput(geo_data, u_grade=[3,3])
print(interp_data)

Level of Optimization:  fast_compile
Device:  cpu
Precision:  float32
<gempy.DataManagement.InterpolatorInput object at 0x7f6e219505c0>

In [14]:

interp_data.get_formation_number()

Out[14]:

{'DefaultBasement': 0,
 'MainFault': 1,
 'Overlying': 5,
 'Reservoir': 4,
 'Seal': 3,
 'SecondaryReservoir': 2}

By default (there is a flag in case you do not need) when we create a interp_data object we also compile the theano function that compute the model. That is the reason why takes long.

gempy.DataManagement.InterpolatorInput (usually called interp_data in the tutorials) prepares the original data to the interpolation algorithm by scaling the coordinates for better and adding all the mathematical parametrization needed.

In [15]:

gp.get_kriging_parameters(interp_data)

range 0.8882311582565308 3464.1015172
Number of drift equations [2 2]
Covariance at 0 0.01878463476896286
Foliations nugget effect 0.009999999776482582

These later parameters have a default value computed from the original data or can be changed by the user (be careful of changing any of these if you do not fully understand their meaning).

At this point, we have all what we need to compute our model. By default everytime we compute a model we obtain:

Lithology block model
- with the lithology values in 0
- with the potential field values in 1
Fault block model
- with the faults zones values (i.e. every divided region by each fault has one number) in 0
- with the potential field values in 1

In [16]:

lith_block, fault_block = gp.compute_model(interp_data)

This solution can be plot with the correspondent plotting function. Blocks:

In [17]:

%matplotlib inline
gp.plot_section(geo_data, lith_block[0], 25, plot_data=True)

Potential field:

In [18]:

gp.plot_potential_field(geo_data, lith_block[1], 25)

From the potential fields (of lithologies and faults) it is possible to extract vertices and simpleces to create the 3D triangles for a vtk visualization.

In [19]:

ver, sim = gp.get_surfaces(interp_data,lith_block[1], fault_block[1], original_scale=True)

In [20]:

gp.plot_surfaces_3D(geo_data, ver, sim, alpha=1)

Additionally is possible to update the model and recompute the surfaces in real time. To do so, we need to pass the data rescaled. To get an smooth response is important to have the theano optimizer flag in fast_run and run theano in the gpu. This can speed up the modeling time in a factor of 20.

In [21]:

ver_s, sim_s = gp.get_surfaces(interp_data,lith_block[1],
                               fault_block[1],
                               original_scale=False)

In [22]:

gp.plot_surfaces_3D_real_time(interp_data, ver_s, sim_s)

In the same manner we can visualize the fault block:

In [23]:

gp.plot_section(geo_data, fault_block[0], 25)

In [24]:

gp.plot_potential_field(geo_data, fault_block[1], 25)