What is this notebook

TODO

In [1]:
from brightway2 import *

Basic setup

Start a new project, and install base data

In [2]:
projects.set_current("student-project-SimaPro-import")
In [5]:
bw2setup()
Creating default biosphere

Applying strategy: drop_unspecified_subcategories
Writing activities to SQLite3 database:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 1.329 sec
Title: Writing activities to SQLite3 database:
  Started: 05/22/2015 11:08:36
  Finished: 05/22/2015 11:08:38
  Total time elapsed: 1.329 sec
  CPU %: 49.500000
  Memory %: 0.259149
Created database: biosphere3
Creating default LCIA methods

Applying strategy: set_biosphere_type
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_iterable_by_fields
Wrote 692 LCIA methods with 170915 characterization factors
Creating core data migrations

We also want the ecoinvent database

In [6]:
ei = SingleOutputEcospold2Importer(
    "/Users/cmutel/Documents/LCA Documents/Ecoinvent/3.1/cutoff/datasets", 
    "ecoinvent 3.1 cutoff"
)
ei.apply_strategies()
ei.write_database()
Extracting ecospold2 files:
0%                          100%
[##############################] | ETA[sec]: 0.000 | Item ID: fff527b1-0fe4-4
Total time elapsed: 102.789 sec
Title: Extracting ecospold2 files:
  Started: 05/22/2015 11:09:41
  Finished: 05/22/2015 11:11:24
  Total time elapsed: 102.789 sec
  CPU %: 85.200000
  Memory %: 2.959931
Extracted 11301 datasets in 103.76 seconds
Applying strategy: remove_zero_amount_coproducts
Applying strategy: remove_zero_amount_inputs_with_no_activity
Applying strategy: es2_assign_only_product_with_amount_as_reference_product
Applying strategy: assign_single_product_as_activity
Applying strategy: create_composite_code
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_biosphere_by_flow_uuid
Applying strategy: link_internal_technosphere_by_composite_code
Applying strategy: delete_exchanges_missing_activity
Applying strategy: delete_ghost_exchanges
Writing activities to SQLite3 database:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 86.280 sec
Title: Writing activities to SQLite3 database:
  Started: 05/22/2015 11:11:29
  Finished: 05/22/2015 11:12:56
  Total time elapsed: 86.280 sec
  CPU %: 79.200000
  Memory %: 3.224015
Created database: ecoinvent 3.1 cutoff
Out[6]:
Brightway2 SQLiteBackend: ecoinvent 3.1 cutoff

First attempt

In [7]:
fp = "data/caes.CSV"
sp = SimaProCSVImporter(fp, name="CAES")
sp.statistics()
NameError
   AirMass
name 'AirMass' is not defined
---------------------------------------------------------------------------
DuplicateName                             Traceback (most recent call last)
<ipython-input-7-abd31e09665d> in <module>()
      1 fp = "data/caes.CSV"
----> 2 sp = SimaProCSVImporter(fp, name="CAES")
      3 sp.statistics()

/Users/cmutel/local34/bw3dev/lib/python3.4/site-packages/bw2io/importers/simapro_csv.py in __init__(self, filepath, name, delimiter, encoding, normalize_biosphere, biosphere_db)
     37                 delimiter=delimiter,
     38                 name=name,
---> 39                 encoding=encoding,
     40             )
     41         print(u"Extracted {} unallocated datasets in {:.2f} seconds".format(

/Users/cmutel/local34/bw3dev/lib/python3.4/site-packages/bw2io/extractors/simapro_csv.py in extract(cls, filepath, delimiter, name, encoding)
    102                     filepath,
    103                     global_parameters,
--> 104                     project_metadata
    105                 )
    106                 datasets.append(ds)

/Users/cmutel/local34/bw3dev/lib/python3.4/site-packages/bw2io/extractors/simapro_csv.py in read_data_set(cls, data, index, db_name, filepath, gp, pm)
    512         ps = ParameterSet(
    513             ds['parameters'],
--> 514             {key: value['amount'] for key, value in gp.items()}
    515         )
    516         # Changes in-place

/Users/cmutel/local34/bw3dev/lib/python3.4/site-packages/bw2parameters/parameter_set.py in __init__(self, params, global_params)
     13         self.params = params
     14         self.global_params = global_params or {}
---> 15         self.basic_validation()
     16         self.references = self.get_references()
     17         for name, references in self.references.items():

/Users/cmutel/local34/bw3dev/lib/python3.4/site-packages/bw2parameters/parameter_set.py in basic_validation(self)
     86             elif key in EXISTING_SYMBOLS:
     87                 raise DuplicateName(
---> 88                     u"Parameter name {} is a built-in symbol".format(key)
     89                 )
     90         for key, value in self.global_params.items():

DuplicateName: Parameter name pi is a built-in symbol

The formula parser raises an error: it predefines a number of symbols, and it refuses any parameter that reuses one of those names. Here the CSV file defines a parameter called `pi`, but $\pi$ is already a built-in symbol, so we can simply delete that parameter from the CSV file by hand.
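The manual deletion could also be scripted. The following is a minimal sketch, not part of the importer: `drop_reserved_parameters` is a hypothetical helper, the semicolon delimiter is an assumption, and the sample rows are invented for illustration (they do not come from `caes.CSV`). It assumes each parameter sits on its own delimited row with the parameter name in the first field.

```python
# Hypothetical helper: drop rows whose first field is a reserved symbol name.
RESERVED = {"pi"}  # assumption: only the name reported in the traceback


def drop_reserved_parameters(lines, reserved=RESERVED, delimiter=";"):
    """Return the lines whose first delimited field is not a reserved name."""
    kept = []
    for line in lines:
        first_field = line.split(delimiter, 1)[0].strip()
        if first_field in reserved:
            continue  # skip the clashing parameter definition
        kept.append(line)
    return kept


# Invented sample rows, for illustration only.
sample = [
    "Input parameters",
    "pi;3.14159;Undefined;1;0;0;No;",
    "AirMass;100;Undefined;1;0;0;No;",
]
cleaned = drop_reserved_parameters(sample)
```

Applied to the lines of the original file, the result would be written to a new file such as `data/caes-no-pi.CSV`, leaving the original export untouched.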

In [9]:
fp = "data/caes-no-pi.CSV"
sp = SimaProCSVImporter(fp, name="CAES")
sp.statistics()
Extracted 4 unallocated datasets in 0.03 seconds
4 datasets
1312 exchanges
1312 unlinked exchanges
  Type biosphere: 1294 unique unlinked exchanges
  Type production: 4 unique unlinked exchanges
  Type technosphere: 12 unique unlinked exchanges
Out[9]:
(4, 1312, 1312)

Apply default strategies

In [10]:
sp.apply_strategies()
sp.statistics()
Applying strategy: assign_only_product_as_production
Applying strategy: drop_unspecified_subcategories
Applying strategy: sp_allocate_products
Applying strategy: split_simapro_name_geo
Applying strategy: strip_biosphere_exc_locations
Applying strategy: link_technosphere_based_on_name_unit_location
Applying strategy: normalize_biosphere_categories
Applying strategy: normalize_simapro_biosphere_categories
Applying strategy: normalize_biosphere_names
Applying strategy: normalize_simapro_biosphere_names
Applying strategy: link_iterable_by_fields
4 datasets
1312 exchanges
12 unlinked exchanges
  Type biosphere: 2 unique unlinked exchanges
  Type technosphere: 10 unique unlinked exchanges
Out[10]:
(4, 1312, 12)

Look at each unlinked exchange

In [11]:
for exc in sp.unlinked:
    print(exc['name'])
Air compressor, screw-type compressor, 300kW {GLO}| market for | Alloc Def, U
Electricity, medium voltage {CH}| market for | Alloc Def, U
Gas turbine, 10MW electrical {GLO}| market for | Alloc Def, U
Concrete block {GLO}| market for | Alloc Def, U
Reinforcing steel {GLO}| market for | Alloc Def, U
Clay
Water, unspecified natural origin/m3
Steel, chromium steel 18/8 {GLO}| market for | Alloc Def, U
Aluminium, cast alloy {GLO}| market for | Alloc Def, U
Silicon, metallurgical grade {GLO}| market for | Alloc Def, U
Casting, brass {CH}| processing | Alloc Def, U
Selective coat, aluminium sheet, nickel pigmented aluminium oxide {GLO}| market for | Alloc Def, U

This is a two-step process. First, we have to change the names to the ones that ecoinvent uses (no, there is no ecoinvent process called Steel, chromium steel 18/8 {GLO}| market for | Alloc Def, U!).

Then we can try to link against the ecoinvent 3.1 database.

In [12]:
sp.migrate("simapro-ecoinvent-3")
Applying strategy: migrate_datasets
Applying strategy: migrate_exchanges
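To see why the SimaPro names cannot match ecoinvent directly, it helps to take one apart. The sketch below is only an illustration of the name layout seen in the unlinked list above; it is not how `sp.migrate` works, and both `split_simapro_name` and the regular expression are assumptions made for this example.

```python
import re

# Assumed layout: "<product> {<location>}| <activity> | <system model>"
PATTERN = re.compile(
    r"^(?P<product>.+?) \{(?P<location>[^}]+)\}\| "
    r"(?P<activity>.+?) \| (?P<model>.+)$"
)


def split_simapro_name(name):
    """Split a SimaPro-style process name into its parts, or return None."""
    match = PATTERN.match(name)
    if match is None:
        return None  # e.g. plain biosphere flow names such as "Clay"
    return match.groupdict()


parts = split_simapro_name(
    "Steel, chromium steel 18/8 {GLO}| market for | Alloc Def, U"
)
```

The single SimaPro string bundles product, location, activity type, and system model, whereas the linking step later matches on separate fields such as `reference product`, `name`, `unit`, and `location`.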

Because we are importing from SimaPro, the categories are unreliable (i.e. ecoinvent categories aren't cleanly imported). Tell the matching algorithm to ignore categories and only use name, reference product, unit, and location.

In [13]:
import functools
from bw2io.strategies import link_iterable_by_fields

sp.apply_strategy(functools.partial(
        link_iterable_by_fields, 
        other=Database("ecoinvent 3.1 cutoff"),
        kind="technosphere",
        fields=["reference product", "name", "unit", "location"]
))
sp.statistics()
Applying strategy: link_iterable_by_fields
4 datasets
1312 exchanges
2 unlinked exchanges
  Type biosphere: 2 unique unlinked exchanges
Out[13]:
(4, 1312, 2)
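The idea behind matching on a chosen set of fields can be sketched in plain Python. This is a simplified illustration, not the `bw2io` implementation: `link_by_fields` is a hypothetical function, the sample records and the code `"hypothetical-code"` are invented, and it assumes an exchange is linked once it carries an `input` key of the form `(database, code)`.

```python
def link_by_fields(exchanges, candidates, fields):
    """Link exchanges to candidates that agree on every listed field."""
    # Index candidates by the tuple of their values for the chosen fields.
    # A real implementation would also need to handle duplicate keys.
    index = {}
    for cand in candidates:
        key = tuple(cand.get(field) for field in fields)
        index[key] = (cand["database"], cand["code"])

    for exc in exchanges:
        if exc.get("input"):
            continue  # already linked
        key = tuple(exc.get(field) for field in fields)
        if key in index:
            exc["input"] = index[key]
    return exchanges


FIELDS = ["reference product", "name", "unit", "location"]

# Invented records: categories differ, but they are not among FIELDS.
candidates = [{
    "database": "ecoinvent 3.1 cutoff", "code": "hypothetical-code",
    "reference product": "concrete block", "name": "market for concrete block",
    "unit": "kilogram", "location": "GLO", "categories": ("a",),
}]
exchanges = [
    {"reference product": "concrete block", "name": "market for concrete block",
     "unit": "kilogram", "location": "GLO", "categories": ("b",)},
    {"reference product": None, "name": "Clay", "unit": "kilogram",
     "location": None},
]
link_by_fields(exchanges, candidates, FIELDS)
```

Leaving `categories` out of the key is what lets the first exchange match even though its categories disagree with the candidate's.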

What is left?

In [15]:
for exc in sp.unlinked:
    print(exc)
{'amount': 0.0001942, 'type': 'biosphere', 'name': 'Clay', 'categories': ('natural resource', 'in ground'), 'comment': '', 'loc': 0.0001942, 'uncertainty type': 0, 'unit': 'kilogram'}
{'amount': 3.6721e-05, 'type': 'biosphere', 'name': 'Water, unspecified natural origin/m3', 'categories': ('natural resource', 'in water'), 'comment': '', 'loc': 3.6721e-05, 'uncertainty type': 0, 'unit': 'cubic meter'}

What to do when the matching is "good enough"

In this case, we are going to ignore these missing exchanges. We do this (naturally) through another strategy, though this one has a shortcut:

In [16]:
sp.drop_unlinked()
Warning: This is the nuclear weapon of linking, and should only be used in extreme cases. Must be called with the keyword argument ``i_am_reckless=True``!

Oops, we got a complaint. But we proceed anyway:

In [17]:
sp.drop_unlinked(i_am_reckless=True)
Applying strategy: drop_unlinked
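Conceptually, dropping unlinked exchanges just filters each dataset's exchange list. The sketch below is a hedged illustration rather than the library code: `drop_unlinked_exchanges` is a hypothetical name, the sample dataset is invented, and it assumes that a linked exchange is one with a non-empty `input` key.

```python
def drop_unlinked_exchanges(datasets):
    """Remove every exchange that has no 'input' link (assumed convention)."""
    for ds in datasets:
        ds["exchanges"] = [
            exc for exc in ds.get("exchanges", []) if exc.get("input")
        ]
    return datasets


# Invented dataset: one linked exchange and one unlinked biosphere flow.
datasets = [{
    "name": "example process",
    "exchanges": [
        {"name": "linked flow", "amount": 1.0,
         "input": ("biosphere3", "hypothetical-code")},
        {"name": "Clay", "amount": 0.0001942, "type": "biosphere"},
    ],
}]
drop_unlinked_exchanges(datasets)
```

The dropped amounts vanish from the inventory without a trace, which is why the shortcut insists on `i_am_reckless=True`.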

Write an Excel sheet with all exchanges

Just to get a better view of what was imported

In [8]:
sp.write_excel()
Wrote matching file to:
/Users/cmutel/Library/Application Support/Brightway3/student-project-SimaPro-import.679e4acaf664ea565578a3224feee904/export/db-matching-CAES.xlsx

Write the database

In [18]:
sp.write_database()
Writing activities to SQLite3 database:
0%  100%
[####] | ETA[sec]: 0.000 
Total time elapsed: 0.206 sec
Title: Writing activities to SQLite3 database:
  Started: 05/22/2015 11:33:00
  Finished: 05/22/2015 11:33:00
  Total time elapsed: 0.206 sec
  CPU %: 88.700000
  Memory %: 3.946972
Created database: CAES
Out[18]:
Brightway2 SQLiteBackend: CAES