Usage guide

THIS CHAPTER IS WORK IN PROGRESS…

DeflexScenario

The scenario class DeflexScenario is a central element of deflex.

All input data is stored as a dictionary in the input_data attribute of the DeflexScenario class. The keys of the dictionary are names of the data table and the values are pandas.DataFrame or pandas.Series with the data.

https://raw.githubusercontent.com/reegis/deflex/master/docs/images/deflex_scenario_class.svg

Load input data

At the moment, there are two methods to populate this attribute from files:

  • read_csv() - read a directory with all needed csv files.
  • read_xlsx() - read a spreadsheet in the .xlsx format.

To learn how to create a valid input data set see “REFERENCE”.

from deflex import scenario
sc = scenario.DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
# OR
sc.read_csv("path/to/csv/dir")

Solve the energy system

A valid input data set describes an energy system. To optimise the dispatch of the energy system, an external solver is needed. By default the CBC solver is used, but other solvers are possible (see: solver).

The simplest way to solve a scenario is the compute() method.

sc.compute()

To use a different solver one can pass the solver parameter.

sc.compute(solver="glpk")

Store and restore the scenario

The dump() method can be used to store the scenario. A solved scenario will be stored together with its results. The scenario is stored in a binary format that is not human readable.

sc.dump("path/to/store/results.dflx")

To restore the scenario use the restore_scenario function:

sc = scenario.restore_scenario("path/to/store/results.dflx")

Analyse the scenario

Most analyses require a solved scenario. However, the merit order can be derived from the input data alone:

from matplotlib import pyplot as plt
from deflex import DeflexScenario
from deflex import analyses

# The merit order needs only the input data, so compute() is not called.
sc = DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
pp = analyses.merit_order_from_scenario(sc)

# Step plot of the marginal costs over the cumulative capacity.
ax = plt.figure(figsize=(15, 4)).add_subplot(1, 1, 1)
ax.step(pp["capacity_cum"].values, pp["costs_total"].values, where="pre")
ax.set_xlabel("Cumulative capacity [GW]")
ax.set_ylabel("Marginal costs [EUR/MWh]")
ax.set_ylim(0)
ax.set_xlim(0, pp["capacity_cum"].max())
plt.show()

With the de02_co2-price_var-costs.xlsx from the examples the code above will produce the following plot:

_images/merit_order_example_plot_simple.svg

Filling the area between the line and the x-axis with colours according to the fuel of each power plant gives the following plot:

_images/merit_order_example_plot_coloured.svg

IMPORTANT: This is just an example and not a source for the actual merit order in Germany.
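
The idea behind a merit order can also be sketched without deflex. The following minimal example sorts a few fictitious plants by their marginal costs and accumulates the capacity. The plant data, the function name build_merit_order and the cost formula (fuel and CO2 costs divided by the efficiency, plus variable costs) are assumptions made for illustration. They are not necessarily what merit_order_from_scenario() computes.

```python
def build_merit_order(plants, co2_price):
    """Sort plants by marginal costs and add the cumulative capacity."""
    rows = []
    for name, p in plants.items():
        # Assumed formula: fuel and CO2 costs per MWh of electricity
        # plus the variable costs of the plant.
        costs = (
            p["fuel_costs"] + p["emission"] * co2_price
        ) / p["efficiency"] + p["variable_costs"]
        rows.append(
            {"name": name, "capacity": p["capacity"], "costs_total": costs}
        )
    # Cheapest plant first
    rows.sort(key=lambda r: r["costs_total"])
    cumulated = 0.0
    for row in rows:
        cumulated += row["capacity"]
        row["capacity_cum"] = cumulated
    return rows


# Fictitious data: capacity in MW, costs in EUR/MWh, emission in t/MWh
plants = {
    "gas": {"capacity": 400.0, "fuel_costs": 20.0, "emission": 0.2,
            "efficiency": 0.5, "variable_costs": 2.0},
    "lignite": {"capacity": 600.0, "fuel_costs": 4.0, "emission": 0.4,
                "efficiency": 0.4, "variable_costs": 4.0},
}
merit_order = build_merit_order(plants, co2_price=25.0)
for row in merit_order:
    print(row["name"], row["capacity_cum"], round(row["costs_total"], 2))
```

The columns capacity_cum and costs_total are named after the columns used in the plot code above, so the same step plot could be drawn from this list.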

Results

All results are stored in the results attribute of the Scenario class. It is a dictionary with the following keys:

  • main – Results of all variables
  • param – Input parameter
  • meta – Meta information and tags of the scenario
  • problem – Information about the linear problem such as lower bound, upper bound etc.
  • solver – Solver results
  • solution – Information about the found solution and the objective value

The deflex package provides some analysis functions as described below, but it is also possible to write your own post-processing. See the results chapter of the oemof.solph documentation to learn how to access the results.

Fetch results

To find results files on your hard disk you can use the search_results() function. This function provides a filter parameter which can be used to filter by your own meta tags. The meta attribute of the Scenario class stores these meta tags in a dictionary with the tag name as key and the tag value as value.

meta = {
    "regions": 17,
    "heat": True,
    "tag": "value",
    }

The filter for these tags will look as follows. The values in the filter have to be strings regardless of the original type:

search_results(path=TEST_PATH, regions=["17", "21"], heat=["true"])

There is always an AND connection between all filters and an OR connection within a list. So the filter above will only return results with 17 or 21 regions and with the heat tag set to true. The returned list can be used as an input parameter to load the results and get a list of results dictionaries.

my_result_files = search_results(path=my_path)
my_results = restore_results(my_result_files)

If a single file name is passed to the restore_results() function a single result will be returned, otherwise a list.
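
The AND/OR logic of the filters can be sketched in plain Python. This is not the implementation of search_results(). The function name matches and the case-insensitive string comparison are assumptions that merely reproduce the behaviour described above.

```python
def matches(meta, **filters):
    """Return True if the meta tags pass all filters.

    Filters are combined with AND, the values within one list with OR.
    """
    for key, allowed in filters.items():
        if key not in meta:
            return False
        # Filter values are strings, so the tag is compared as a string.
        # Ignoring the case (True -> "true") is an assumption.
        allowed_lower = [str(value).lower() for value in allowed]
        if str(meta[key]).lower() not in allowed_lower:
            return False
    return True


meta = {"regions": 17, "heat": True, "tag": "value"}
print(matches(meta, regions=["17", "21"], heat=["true"]))
print(matches(meta, regions=["21"], heat=["true"]))
```

The first call passes both filters, the second one fails because the number of regions is not in the list.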

Get common values from results

Common values are emissions, costs and energy of the flows. The function get_flow_results() returns a MultiIndex DataFrame with the costs, emissions and the energy of all flows. The values are absolute and specific. The specific values are divided by the power so that the specific power gives you the status (on/off).

At the moment this works only with hourly time steps. The units are as follows:

  • absolute emissions -> tons
  • specific emissions -> tons/MWh
  • absolute costs -> EUR
  • specific costs -> EUR/MWh
  • absolute energy -> MWh
  • specific energy -> –

The resulting table of the function can be stored as a .csv or .xlsx file. The input is one results dictionary:

from deflex import postprocessing as pp
from deflex.analyses import get_flow_results

my_result_files = pp.search_results(path=my_path)
# Passing a single file name returns a single results dictionary.
my_result = pp.restore_results(my_result_files[0])
flow_results = get_flow_results(my_result)
flow_results.to_csv("/my/path/flow_results.csv")

The resulting table can be used to calculate other key values in your own functions but you can also use some ready-made functions. Follow the link to get information about each function:

  • calculate_market_clearing_price()
  • calculate_emissions_most_expensive_pp()

We are planning to add more calculations in the future. Please open an issue if you have any ideas. All the functions above are integrated in the get_key_values_from_results() function. This function takes a list of results and returns one MultiIndex DataFrame. It contains all the return values from the functions above for each scenario. The first column level contains the value names and the second level the names of the scenarios. The value names are:

  • mcp
  • emissions_most_expensive_pp

The name of the scenario is taken from the name key of the meta attribute. If this key is not available you have to set it for each scenario, otherwise the function will fail. The resulting table can be stored as a .csv or .xlsx file.

from deflex import postprocessing as pp
from deflex.analyses import get_key_values_from_results

my_result_files = pp.search_results(path=my_path)
# Passing the whole list returns a list of results dictionaries.
my_results = pp.restore_results(my_result_files)
kv = get_key_values_from_results(my_results)
kv.to_csv("/my/path/key_values.csv")

If you have many scenarios, the resulting table may become quite big. Therefore, you can skip values you do not need in your resulting table. If you need only the emissions and not the market clearing price, you can exclude the mcp.

kv = get_key_values_from_results(my_results, mcp=False)

Parallel computing of scenarios

For the typical workflow (creating a scenario, loading the input data, computing the scenario and storing the results) the model_scenario() function can be used.

To collect all scenarios from a given directory, the function fetch_scenarios_from_dir() can be used. The function searches for .xlsx files or paths that end with _csv. It cannot distinguish between a valid scenario and any other .xlsx file or a path that accidentally contains _csv.

No matter how you collect the list of scenario input data files, the batch_model_scenario() function makes it easier to run each scenario and get back the relevant information about the run. It is possible to ignore exceptions so that the script continues with the remaining scenarios if one scenario fails.
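
The idea of ignoring exceptions in a batch run can be sketched as follows. The functions run_batch and fake_run are hypothetical and only illustrate the pattern. They do not reflect the signature or the return value of batch_model_scenario().

```python
def run_batch(scenario_files, run_one, ignore_errors=True):
    """Run every scenario and collect one log entry per file."""
    log = []
    for path in scenario_files:
        try:
            result = run_one(path)
            log.append({"file": path, "success": True, "info": result})
        except Exception as error:
            if not ignore_errors:
                raise
            # Keep the error message and go on with the next scenario.
            log.append({"file": path, "success": False, "info": repr(error)})
    return log


def fake_run(path):
    """Hypothetical stand-in for modelling a single scenario."""
    if "broken" in path:
        raise ValueError("invalid input data")
    return "results stored for " + path


log = run_batch(["a.xlsx", "broken.xlsx", "b_csv"], fake_run)
for entry in log:
    print(entry["file"], entry["success"])
```

With ignore_errors=False the first failing scenario would stop the whole batch.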

If you have enough memory and CPU capacity on your computer or server, you can optimise your scenarios in parallel. Use the model_multi_scenarios() function for this task. You can pass a list of scenario files to this function. A CPU fraction limits the number of processes to a fraction of the available CPU cores. Keep in mind that for large models the memory will be the limit, not the CPU capacity. If a memory error occurs the script will stop immediately. It is not possible to catch a memory error. A log file records all failing and successful runs.
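
How a CPU fraction translates into a number of processes can be sketched with the standard library. The function name number_of_processes, rounding down and the minimum of one process are assumptions for illustration. The exact rule used by model_multi_scenarios() may differ.

```python
import os


def number_of_processes(cpu_fraction, cores=None):
    """Translate a fraction of the available cores into a process count."""
    if cores is None:
        # os.cpu_count() may return None on some platforms
        cores = os.cpu_count() or 1
    # Assumption: round down, but always use at least one process
    return max(1, int(cores * cpu_fraction))


print(number_of_processes(0.5, cores=8))
print(number_of_processes(0.1, cores=4))
```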

Input data

The input data is stored in the input_data attribute of the DeflexScenario class (see DeflexScenario). It is a dictionary with the name of the data set as key and the data table itself as value (pandas.DataFrame or pandas.Series).

The input data is divided into four main topics: High-level-inputs, electricity sector, heating sector (optional) and mobility sector (optional).

Download a fictitious input data example to get an idea of the structure. Then go on with the following chapter to learn everything about how to define the data for a deflex model.

Overview

A Deflex scenario can be divided into regions. Each region must have an identifier number and be named after it as DEXX, where XX is the number. To refer to the Deflex scenario as a whole (i.e. the sum of all regions) use DE only.

At the current state the distribution of fossil fuels is neglected. Therefore, to keep the computing time low, it is recommended to define them supra-regionally using DE without a number. It is still possible to define them regionally, for example to add a specific limit for each region.

In most cases it is also sufficient to model the fossil part of the mobility and the decentralised heating sector supra-regionally. It is assumed that a gas boiler or a filling station is always supplied with enough fuel, so that only the annual values affect the model. This does not apply to electrical heating systems or cars.

In most spreadsheet software it is possible to connect cells to increase readability. Such tables are interpreted correctly. In csv files the values have to appear in every cell. So the following two tables will be interpreted identically.

Connected cells

    value
DE01 F1  
  F2  
DE02 F1  

Unconnected cells

    value
DE01 F1  
DE01 F2  
DE02 F1  
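
Connected cells leave the first index level empty in all but the first row of a group. A minimal sketch of how such rows can be completed is given below. The function fill_connected is hypothetical and only illustrates why both tables carry the same information.

```python
def fill_connected(rows):
    """Repeat the last region for rows in which the region cell is empty."""
    filled = []
    last_region = None
    for region, name in rows:
        if region is not None:
            last_region = region
        filled.append((last_region, name))
    return filled


# Index of the table with connected cells (None = empty cell)
connected = [("DE01", "F1"), (None, "F2"), ("DE02", "F1")]
print(fill_connected(connected))
```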

Note

NaN-values are not allowed in any table. Some columns are optional and can be left out, but if a column is present there have to be values in every row. Neutral values can be 0, 1 or inf.

High-level-input (mandatory)

General

key: ‘general’, value: pandas.Series()

This table contains basic data about the scenario.

year  
number of time steps  
co2 price  
name  

INDEX

year: int, [-]
A time index will be created starting on January 1 at 00:00 of this year, with the number of hours given in number of time steps.
number of time steps: int, [-]
The number of hourly time steps.
co2 price: float, [€/t]
The average price for CO2 over the whole time period.
name: str, [-]
A name for the scenario. This name will be used to compare key values between different scenarios. Therefore, it should be unique within a group of scenarios. It does not have to be intuitive. Use the info table for a human readable description of your scenario.
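
The time index that results from year and number of time steps can be sketched with the standard library. The function hourly_index is a hypothetical helper for illustration. It is not how deflex creates its index internally.

```python
from datetime import datetime, timedelta


def hourly_index(year, number_of_time_steps):
    """Create hourly time stamps starting on January 1 at 00:00."""
    start = datetime(year, 1, 1, 0, 0)
    return [start + timedelta(hours=h) for h in range(number_of_time_steps)]


index = hourly_index(2014, 8760)
print(index[0], index[-1])
```

For a year without a leap day, 8760 time steps cover the whole year.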

Info

key: ‘info’, value: pandas.Series()

On this sheet, additional information that characterizes the scenario can be added. The idea behind Info is that the user can filter stored scenarios using the search_results() function.

You can create any key-value pair which is suitable for your group of scenarios.

e.g. key: scenario_type value: foo / bar / foobar

Afterwards you can search for all scenarios where the scenario_type is foo using:

search_results(path=my_path, scenario_type=["foo"])

or with other keys and multiple values:

search_results(path=my_path, scenario_type=["foo", "bar"], my_key=["v1"])

The second code line will return only files with (foo or bar) and v1.

key1  
key2  
key3  

Commodity sources

key: ‘commodity sources’, value: pandas.DataFrame()

This sheet requires data from all the commodities used in the scenario. The data can be provided either supra-regionally under DE, regionally under DEXX or as a combination of both, where some commodities are global and some are regional. Regionalised commodities are especially useful for commodities with an annual limit, for example bioenergy.

    costs emission annual limit
DE F1      
  F2      
DE01 F1      
DE02 F2      

INDEX

level 0: str
Region (e.g. DE01, DE02 or DE).
level 1: str
Fuel type.

COLUMNS

costs: float, [€/MWh]
The fuel production cost.
emission: float, [t/MWh]
The fuel emission factor.
annual limit: float, [MWh]
The annual maximum energy generation (if there is one, otherwise just use inf). If the annual limit is inf in every line the column can be left out.

Data sources

key: ‘data sources’, value: pandas.DataFrame()

Highly recommended. Here the type of data, the source name and the URL from which the data were obtained can be listed. The format is free and additional columns can be added. This table helps to make your scenario as transparent as possible.

  source url v1
cost data Institute http1 a1
pv plants Organisation http2 a2

Electricity sector (mandatory)

Electricity demand series

key: ‘electricity demand series’, value: pandas.DataFrame()

This sheet requires the electricity demand of the scenario as a time series. One summarised demand series for each region is enough, but it is possible to distinguish between different types. This will not have any effect on the model results but may help to distinguish the different flows in the results.

  DE01 DE02 DE03
  all industry buildings rest all
Time step 1          
Time step 2          

INDEX

time step: int
Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Specification of the series e.g. “all” for an overall series.

Power plants

key: ‘power plants’, value: pandas.DataFrame()

The power plants will feed into the electricity bus of the region in which they are located. The data must be divided by region and subdivided by fuel. Each row can indicate one power plant or a group of power plants. It is possible to add additional columns for information purposes.

    capacity fuel efficiency annual electricity limit variable_cost downtime_factor source_region
DE01 N1              
  N2              
  N3              
DE02 N2              
  N3              

INDEX

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Name, arbitrary. The combination of region and name is the unique identifier for the power plant or the group of power plants.

COLUMNS

capacity: float, [MW]
The installed capacity of the power plant or the group of power plants.
fuel: str, [-]
The used fuel of the power plant or group of power plants. The combination of source_region and fuel must exist in the commodity sources table.
efficiency: float, [-]
The average overall efficiency of the power plant or the group of power plants.
annual limit: float, [MWh]
The absolute maximum limit of produced electricity within the whole modeling period.
variable_costs: float, [€/MWh]
The variable costs per produced electricity unit.
downtime_factor: float, [-]
The time fraction of the modeling period in which the power plant or the group of power plants cannot produce electricity. The installed capacity will be reduced by this factor capacity * (1 - downtime_factor).
source_region: str, [-]
The source region of the fuel source. Typically this is the region of the index or DE if it is a global commodity source. The combination of source_region and fuel must exist in the commodity sources table.
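
Two of the rules above can be checked with a few lines of Python: the reduction of the capacity by the downtime factor and the requirement that each combination of source_region and fuel exists in the commodity sources table. The function names and the example data are hypothetical. The tables are represented as plain dictionaries and sets instead of pandas objects to keep the sketch self-contained.

```python
def available_capacity(capacity, downtime_factor):
    """Capacity reduced by the downtime: capacity * (1 - downtime_factor)."""
    return capacity * (1 - downtime_factor)


def missing_fuel_sources(power_plants, commodity_sources):
    """Return all plants whose (source_region, fuel) pair is not defined."""
    missing = []
    for (region, name), plant in power_plants.items():
        key = (plant["source_region"], plant["fuel"])
        if key not in commodity_sources:
            missing.append((region, name))
    return missing


# Fictitious index of the commodity sources table: (region, fuel)
commodity_sources = {("DE", "F1"), ("DE01", "F2")}
power_plants = {
    ("DE01", "N1"): {"capacity": 500.0, "fuel": "F1",
                     "downtime_factor": 0.1, "source_region": "DE"},
    ("DE02", "N2"): {"capacity": 300.0, "fuel": "F2",
                     "downtime_factor": 0.0, "source_region": "DE02"},
}
print(available_capacity(500.0, 0.1))
print(missing_fuel_sources(power_plants, commodity_sources))
```

The second plant is reported because F2 is only defined for DE01 in this example, not for DE02.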

Volatile plants

key: ‘volatile plants’, value: pandas.DataFrame()

Examples of volatile power plants are solar, wind, hydro, geothermal. Data must be provided divided by region and subdivided by energy source. Each row can indicate one plant or a group of plants. It is possible to add additional columns for information purposes.

    capacity
DE01 N1  
  N2  
DE02 N1  
DE03 N1  
  N3  

INDEX

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Name, arbitrary. The combination of the region and the name has to exist as a time series in the volatile series table.

COLUMNS

capacity: float, [MW]
The installed capacity of the plant.

Volatile series

key: ‘volatile series’, value: pandas.DataFrame()

This sheet provides the normalised feed-in time series in MW/MW installed. So each time series will be multiplied by its installed capacity to get the absolute feed-in. Therefore, the combination of region and name has to exist in the volatile plants table.

  DE01 DE02 DE03
  N1 N2 N1 N1 N3
Time step 1          
Time step 2          

INDEX

time step: int
Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW/MW installed]

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Name of the energy source specified in the previous sheet.
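
The relation between the volatile plants table and the volatile series table can be sketched as follows. The function absolute_feed_in and the example values are hypothetical. Plain dictionaries stand in for the two pandas tables.

```python
def absolute_feed_in(volatile_plants, volatile_series):
    """Multiply each normalised series by its installed capacity."""
    feed_in = {}
    for key, series in volatile_series.items():
        if key not in volatile_plants:
            # Each series needs a capacity in the volatile plants table.
            raise KeyError(f"{key} is missing in the volatile plants table")
        capacity = volatile_plants[key]
        feed_in[key] = [value * capacity for value in series]
    return feed_in


# Installed capacity in MW for each (region, name)
volatile_plants = {("DE01", "N1"): 200.0, ("DE01", "N2"): 50.0}
# Normalised feed-in in MW per MW installed for three time steps
volatile_series = {
    ("DE01", "N1"): [0.0, 0.5, 1.0],
    ("DE01", "N2"): [0.2, 0.2, 0.0],
}
feed_in = absolute_feed_in(volatile_plants, volatile_series)
print(feed_in[("DE01", "N1")])
```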

Electricity storages

key: ‘electricity storages’, value: pandas.DataFrame()

All types of electricity storages can be defined in this table. All different storage technologies (pumped hydro, batteries, compressed air, etc.) have to be entered in a general way. Each row can indicate one storage or a group of storages. It is possible to add additional columns for information purposes.

    energy content energy inflow charge capacity discharge capacity charge efficiency discharge efficiency loss rate
DE01 S1              
  S2              
DE02 S2              

INDEX

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Name, arbitrary.

COLUMNS

energy content: float, [MWh]
The maximum energy content of a storage or a group of storages.
energy inflow: float, [MWh]
The amount of energy that will feed into the storage over the model period in MWh. For example a river into a pumped hydroelectric energy storage.
charge capacity: float, [MW]
Maximum capacity to charge the storage or the group of storages.
discharge capacity: float, [MW]
Maximum capacity to discharge the storage or the group of storages.
charge efficiency: float, [-]
Charging efficiency of the storage or the group of storages.
discharge efficiency: float, [-]
Discharging efficiency of the storage or the group of storages.
loss rate: float, [-]
The relative loss of the energy content of the storage. For example a loss rate of 0.01 means that the energy content of the storage will be reduced by 1% in each time step.
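
The effect of the loss rate can be illustrated with a short calculation. The function energy_content_after is a hypothetical helper. It deliberately ignores charging, discharging and the energy inflow, so it shows only the loss of an idle storage.

```python
def energy_content_after(initial_content, loss_rate, time_steps):
    """Energy content of an idle storage after a number of time steps."""
    content = initial_content
    for _ in range(time_steps):
        # The content is reduced by the loss rate in each time step.
        content *= 1 - loss_rate
    return content


# 100 MWh with a loss rate of 0.01 after one day of hourly time steps
print(round(energy_content_after(100.0, 0.01, 24), 2))
```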

Power lines

key: ‘power lines’, value: pandas.DataFrame()

The power lines table defines the connection between the electricity buses of each region of the scenario. There is no default connection. If no connection is defined the regions will be self-sufficient.

  capacity efficiency
DE01-DE02    
DE01-DE03    
DE02-DE03    

INDEX

Name: str
Names of the two connected regions separated by a dash. Define only one direction. In the model one line for each direction will be created. If both directions are defined in the table, two lines for each direction will be created in the model, so that the capacity will be the sum of both lines.

COLUMNS

capacity: float, [MW]
The maximum transmission capacity of the power lines.
efficiency: float, [-]
The transmission efficiency of the power line.
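
The rule for the line names can be sketched as follows. The function directed_lines is hypothetical and ignores the efficiency. It only shows why defining both directions in the table doubles up the lines and sums the capacities.

```python
def directed_lines(power_lines):
    """Create one line per direction and sum up capacities of duplicates."""
    directed = {}
    for name, capacity in power_lines.items():
        start, end = name.split("-")
        # Each table row results in a line for both directions.
        for pair in ((start, end), (end, start)):
            directed[pair] = directed.get(pair, 0.0) + capacity
    return directed


# DE01-DE02 is defined in both directions, DE01-DE03 only once
lines = directed_lines(
    {"DE01-DE02": 1000.0, "DE02-DE01": 500.0, "DE01-DE03": 700.0}
)
for pair, capacity in sorted(lines.items()):
    print(pair, capacity)
```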

Heating sector (optional)

Heat demand series

key: ‘heat demand series’, value: pandas.DataFrame()

The heat demand can be entered regionally under DEXX or supra-regionally under DE. The only type of demand that must be entered regionally is district heating. As a recommendation, coal, gas, or oil demands should be treated supra-regionally.

  DE01 DE02   DE
  district heating N1 district heating N1 N2 N3 N4 N5
Time step 1                  
Time step 2                  

INDEX

time step: int
Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str
Region (e.g. DE01, DE02 or DE).
level 1: str
Name. Specification of the series e.g. district heating, coal, gas. Except for district heating each combination of region and name must exist in the decentralised heat table.

Decentralised heat

key: ‘decentralised heat’, value: pandas.DataFrame()

This sheet covers all heating technologies that are used to generate decentralised heat. In this context decentralised does not mean regional; it represents the large group of independent heating systems. If there is no specific reason to define a heating system regionally, it should be defined supra-regionally.

    efficiency source source region
DE01 N1     DE01
DE02 N1     DE02
  N2     DE02
     
DE N3     DE
  N4     DE
  N5     DE

INDEX

level 0: str
Region (e.g. DE01, DE02 or DE).
level 1: str
Name, arbitrary.

COLUMNS

efficiency: float, [-]
The efficiency of the heating technology.
source: str, [-]
The source that the heating technology uses. Examples are coal, oil for commodities, but it could also be electricity in case of a heat pump. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
source region: str
The region where the source comes from (see source).

Chp - heat plants

key: ‘chp-heat plants’, value: pandas.DataFrame()

This sheet covers CHP and heat plants. Each plant will feed into the district heating bus of the region in which it is located. The demand of district heating is defined in the heat demand series table with the name district heating. All plants of the same region with the same fuel can be defined in one row, but it is also possible to divide them by additional categories such as efficiency etc.

    limit heat chp capacity heat chp capacity elec chp limit hp capacity hp efficiency hp efficiency heat chp efficiency elec chp fuel source region
DE01 N1                   DE01
  N3                   DE
  N4                   DE
DE02 N1                   DE02
  N2                   DE02
  N3                   DE
  N4                   DE
  N5                   DE

INDEX

level 0: str
Region (e.g. DE01, DE02).
level 1: str
Name, arbitrary.

COLUMNS

limit heat chp: float, [MWh]
The absolute maximum limit of heat produced by chp within the whole modeling period.
capacity heat chp: float, [MW]
The installed heat capacity of all chp plants of the same group in the region.
capacity elec chp: float, [MW]
The installed electricity capacity of all chp plants of the same group in the region.
limit hp: float, [MWh]
The absolute maximum limit of heat produced by the heat plant within the whole modeling period.
capacity hp: float, [MW]
The installed heat capacity of all heat plants of the same group in the region.
efficiency hp: float, [-]
The average overall efficiency of the heat plant.
efficiency heat chp: float, [-]
The average overall heat efficiency of the chp.
efficiency elec chp: float, [-]
The average overall electricity efficiency of the chp.
fuel: str, [-]
The used fuel of the plants. The fuel name must be equal to the fuel type of the commodity sources. The combination of fuel and source region has to exist in the commodity sources table.
source region: str, [-]
The source region of the fuel source. Typically this is the region of the index or DE if it is a global commodity source.

Mobility sector (optional)

Mobility demand series

key: ‘mobility series’, value: pandas.DataFrame()

The mobility demand can be entered regionally or supra-regionally. However, it is recommended to define the mobility demand supra-regionally, except for electricity. The demand for electric mobility has to be defined regionally because it will be connected to the electricity bus of each region. The combination of region and name has to exist in the mobility table.

  DE01 DE02 DE
  electricity electricity   N1
Time step 1        
Time step 2        

INDEX

time step: int
Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str
Region (e.g. DE01, DE02 or DE).
level 1: str
Specification of the series e.g. “electricity” for each region or “diesel”, “petrol” for DE.

Mobility

key: ‘mobility’, value: pandas.DataFrame()

This sheet covers the technologies of the mobility sector.

    efficiency source source region
DE01 electricity   electricity DE01
DE02 electricity   electricity DE02
       
DE N1   oil/biofuel/H2/etc DE

INDEX

level 0: str
Region (e.g. DE01, DE02 or DE).
level 1: str
Name, arbitrary.

COLUMNS

efficiency: float, [-]
The efficiency of the fuel production. If a diesel demand is defined in the mobility demand series table the efficiency represents the efficiency of diesel production from the commodity source e.g. oil. For a biofuel demand the efficiency of the production of biofuel from biomass has to be defined.
source: str, [-]
The source that the technology uses. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
source region: str, [-]
The region where the source comes from.
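
The role of the efficiency in the mobility table can be illustrated with a short calculation. It assumes that the commodity needed from the source equals the fuel demand divided by the efficiency of the fuel production. The function commodity_demand and the numbers are hypothetical and only follow the description above.

```python
def commodity_demand(fuel_demand, efficiency):
    """Commodity needed to cover a fuel demand series (assumed relation)."""
    return [value / efficiency for value in fuel_demand]


# Fictitious diesel demand in MW for two time steps,
# produced from oil with an assumed efficiency of 0.8
diesel_demand = [80.0, 40.0]
oil_demand = commodity_demand(diesel_demand, efficiency=0.8)
print([round(value, 1) for value in oil_demand])
```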