To start using pyglotaran in your analysis, you only have to import the Project class and open a project.
from glotaran.project import Project
quickstart_project = Project.open("quickstart_project")
quickstart_project
If the project does not already exist, this will create a new project and its folder structure for you. In our case, only the models and parameters folders already existed, so the data and results folders were created when the project was opened.
%ls quickstart_project
Let us get some example data to analyze:
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as my_dataset
my_dataset
Like all data in pyglotaran, the dataset is an xarray.Dataset. You can find more information about the xarray library on the xarray homepage.
The loaded dataset is a simulated sequential model.
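The plots below use xarray's label-based indexing with `method="nearest"`. As a minimal sketch of how that selection behaves (a toy array, not the quickstart data):

```python
import numpy as np
import xarray as xr

# Toy illustration: label-based selection with method="nearest" snaps to the
# closest coordinate value, which is how the spectral and time traces are
# picked out of the dataset below.
da = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("time", "spectral"),
    coords={"time": [0.0, 1.0, 2.0], "spectral": [600, 620, 640, 660]},
)
trace = da.sel(spectral=625, method="nearest")  # snaps to the 620 column
print(trace.spectral.item())  # → 620
```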
Now let's plot some time traces.
plot_data = my_dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5);
We can also plot spectra at different times.
plot_data = my_dataset.data.sel(time=[1, 10, 20], method="nearest")
plot_data.plot.line(x="spectral", aspect=2, size=5);
As long as you can read your data into an xarray.Dataset or xarray.DataArray, you can import it directly into your project.
This will save your data as a NetCDF (.nc) file in the data folder inside your project, under the name you gave it (here quickstart_project/data/my_data.nc).
If the data format you are using is supported by a plugin, you can simply copy the file to the data folder of the project (here quickstart_project/data).
quickstart_project.import_data(my_dataset, dataset_name="my_data")
quickstart_project
After importing, our quickstart_project is aware of the dataset we named my_data.
To get an idea of how to model your data, you should inspect its singular value decomposition. As a convenience, the load_data method has the option to add the SVD data on the fly.
dataset_with_svd = quickstart_project.load_data("my_data", add_svd=True)
dataset_with_svd
First, take a look at the first 10 singular values:
plot_data = dataset_with_svd.data_singular_values.sel(singular_value_index=range(10))
plot_data.plot(yscale="log", marker="o", linewidth=0, aspect=2, size=5);
This tells us that our data contain at least three components that we need to model.
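The reasoning behind this plot can be illustrated with a toy numpy example (not pyglotaran code): data built from three components plus a little noise has exactly three singular values above the noise floor.

```python
import numpy as np

# Toy illustration: a 50x40 data matrix assembled from three rank-one
# components (concentration profile times spectrum) plus tiny noise.
rng = np.random.default_rng(0)
time_traces = rng.random((50, 3))   # 3 concentration profiles over time
spectra = rng.random((3, 40))       # 3 component spectra
data = time_traces @ spectra + 1e-6 * rng.standard_normal((50, 40))

# Only three singular values rise above the noise floor, so an SVD reveals
# the number of components needed to model the data.
singular_values = np.linalg.svd(data, compute_uv=False)
print(int((singular_values > 1e-3).sum()))  # → 3
```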
To analyze our data, we need to create a model. Create a file called my_model.yaml in your project's models directory and fill it with the following content.
quickstart_project.show_model_definition("my_model")
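For orientation, a sequential-decay model definition in pyglotaran's YAML format can look roughly like the sketch below. All labels and schema details here are illustrative assumptions (they vary between pyglotaran versions); the authoritative content for this tutorial is what show_model_definition prints.

```yaml
# Illustrative sketch only; labels and schema details differ between versions.
default_megacomplex: decay
initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]
k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3
megacomplex:
  m1:
    k_matrix: [k1]
dataset:
  my_data:
    initial_concentration: input
    megacomplex: [m1]
```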
You can check your model for problems with the validate
method.
quickstart_project.validate("my_model")
Now define some starting parameters. Create a file called my_parameters.yaml in your project's parameters directory with the following content.
quickstart_project.show_parameters_definition("my_parameters")
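For orientation, a starting-parameters file in pyglotaran's YAML format can look roughly like this sketch; the labels and values are illustrative assumptions, and the authoritative content is what show_parameters_definition prints.

```yaml
# Illustrative sketch only; labels and values are placeholders.
input:
  - ["1", 1, { "vary": False }]
  - ["0", 0, { "vary": False }]
kinetic:
  - 0.5
  - 0.3
  - 0.1
```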
Note the { "vary": False }, which tells pyglotaran that those parameters should not be changed. You can also use the validate method to check for missing parameters.
quickstart_project.validate("my_model", "my_parameters")
Since not all problems in the model can be detected automatically, it is wise to visually inspect the model. For this purpose, you can simply load the model and inspect its markdown-rendered version.
quickstart_project.load_model("my_model")
In the same way, you should inspect your parameters.
quickstart_project.load_parameters("my_parameters")
Now we have everything together to optimize our parameters.
result = quickstart_project.optimize("my_model", "my_parameters")
result
Each time you run an optimization, the result will be saved in the project's results folder.
%ls "quickstart_project/results"
To visualize how quickly the optimization converged, we can plot the optimality of the optimization_history.
result.optimization_history.data["optimality"].plot(logy=True)
result.optimized_parameters
You can inspect the data of your result by accessing its data attribute. In our example it only contains our single my_data dataset, but it can contain as many datasets as your analysis needs.
result.data
The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.
result_dataset = result.data["my_data"]
residual_left = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)
residual_right = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)
residual_left.plot.line(x="time", aspect=2, size=5)
residual_right.plot.line(x="spectral", aspect=2, size=5);