Usage guide¶
THIS CHAPTER IS WORK IN PROGRESS…
DeflexScenario¶
The scenario class DeflexScenario is a central element of deflex.
All input data is stored as a dictionary in the input_data attribute of the DeflexScenario class. The keys of the dictionary are the names of the data tables and the values are pandas.DataFrame or pandas.Series objects with the data.
Load input data¶
At the moment, there are two methods to populate this attribute from files:
- read_csv() - read a directory with all needed csv files.
- read_xlsx() - read a spreadsheet in the .xlsx format.
To learn how to create a valid input data set see “REFERENCE”.
from deflex import scenario
sc = scenario.DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
# OR
sc.read_csv("path/to/csv/dir")
Solve the energy system¶
A valid input data set describes an energy system. To optimise the dispatch of the energy system an external solver is needed. By default the CBC solver is used, but other solvers are possible (see: solver).
The simplest way to solve a scenario is the compute()
method.
sc.compute()
To use a different solver one can pass the solver
parameter.
sc.compute(solver="glpk")
Store and restore the scenario¶
The dump() method can be used to store the scenario. A solved scenario will be stored together with its results. The scenario is stored in a binary format and is not human readable.
sc.dump("path/to/store/results.dflx")
To restore the scenario use the restore_scenario
function:
sc = scenario.restore_scenario("path/to/store/results.dflx")
Analyse the scenario¶
Most analyses cannot be performed unless the scenario is solved. However, the merit order can be shown based on the input data alone:
from matplotlib import pyplot as plt
from deflex import DeflexScenario
from deflex import analyses
sc = DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
pp = analyses.merit_order_from_scenario(sc)
ax = plt.figure(figsize=(15, 4)).add_subplot(1, 1, 1)
ax.step(pp["capacity_cum"].values, pp["costs_total"].values, where="pre")
ax.set_xlabel("Cumulative capacity [GW]")
ax.set_ylabel("Marginal costs [EUR/MWh]")
ax.set_ylim(0)
ax.set_xlim(0, pp["capacity_cum"].max())
plt.show()
With the de02_co2-price_var-costs.xlsx from the examples the code above will produce the following plot:
Filling the area between the line and the x-axis with colors according to the fuel of the power plant, one gets the following plot:
IMPORTANT: This is just an example and not a source for the actual merit order in Germany.
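To make the idea of a merit order concrete, the following is a minimal sketch in plain Python. It is not the deflex implementation. It assumes that the marginal costs of a plant are (fuel costs + CO2 price × emission factor) / efficiency + variable costs. The helper name merit_order and the plant data are hypothetical; only the keys costs_total and capacity_cum mirror the column names used in the plot code above.

```python
def merit_order(plants, co2_price):
    """Sort plants by marginal costs and add the cumulative capacity.

    Assumption (not taken from deflex): marginal costs [EUR/MWh] =
    (fuel_costs + co2_price * emission) / efficiency + variable_costs
    """
    rows = []
    for plant in plants:
        costs_total = (
            plant["fuel_costs"] + co2_price * plant["emission"]
        ) / plant["efficiency"] + plant["variable_costs"]
        rows.append(
            {
                "name": plant["name"],
                "capacity": plant["capacity"],
                "costs_total": costs_total,
            }
        )
    # Cheapest plants come first in a merit order.
    rows.sort(key=lambda row: row["costs_total"])
    capacity_cum = 0.0
    for row in rows:
        capacity_cum += row["capacity"]
        row["capacity_cum"] = capacity_cum
    return rows


# Hypothetical example data, not a real power plant fleet.
plants = [
    {"name": "gas", "capacity": 10.0, "fuel_costs": 20.0,
     "emission": 0.2, "efficiency": 0.5, "variable_costs": 2.0},
    {"name": "lignite", "capacity": 5.0, "fuel_costs": 4.0,
     "emission": 0.4, "efficiency": 0.4, "variable_costs": 4.0},
]
order = merit_order(plants, co2_price=25.0)
for row in order:
    print(row["name"], row["capacity_cum"], round(row["costs_total"], 2))
```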
Results¶
All results are stored in the results attribute of the Scenario class. It is a dictionary with the following keys:
- main – Results of all variables
- param – Input parameter
- meta – Meta information and tags of the scenario
- problem – Information about the linear problem such as lower bound, upper bound etc.
- solver – Solver results
- solution – Information about the found solution and the objective value
The deflex package provides some analysis functions as described below, but it is also possible to write your own postprocessing. See the results chapter of the oemof.solph documentation to learn how to access the results.
Fetch results¶
To find results files on your hard disk you can use the search_results() function. This function provides a filter parameter which can be used to filter by your own meta tags. The meta attribute of the Scenario class can store these meta tags in a dictionary with the tag name as key and the tag value as value.
meta = {
"regions": 17,
"heat": True,
"tag": "value",
}
The filter for these tags will look as follows. The values in the filter have to be strings regardless of the original type:
search_results(path=TEST_PATH, regions=["17", "21"], heat=["true"])
There is always an AND connection between all filters and an OR connection within a list. So the filter above will only return results with 17 or 21 regions and with the heat tag set to true. The returned list can be used as an input parameter to load the results and get a list of results dictionaries.
my_result_files = search_results(path=my_path)
my_results = restore_results(my_result_files)
If a single file name is passed to the
restore_results()
function a single result will
be returned, otherwise a list.
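The AND/OR logic of the filter described above can be sketched in plain Python. This is only an illustration of the rule, not the code of search_results(). The helper name matches_filter is hypothetical, and the case-insensitive string comparison is an assumption made so that the boolean True matches the filter value "true".

```python
def matches_filter(meta, **filters):
    """Return True if the meta tags satisfy all filters.

    Filters are combined with AND. The values inside one filter list
    are combined with OR. All values are compared as strings.
    """
    for key, accepted in filters.items():
        if key not in meta:
            return False
        # Assumption: compare case-insensitively, so True matches "true".
        value = str(meta[key]).lower()
        if value not in [str(a).lower() for a in accepted]:
            return False
    return True


meta = {"regions": 17, "heat": True, "tag": "value"}

# 17 OR 21 regions AND heat is true
print(matches_filter(meta, regions=["17", "21"], heat=["true"]))
# The region filter fails, so the whole filter fails.
print(matches_filter(meta, regions=["21"], heat=["true"]))
```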
Get common values from results¶
Common values are emissions, costs and energy of the flows. The function
get_flow_results()
returns a MultiIndex
DataFrame with the costs, emissions and the energy of all flows. The values
are absolute and specific. The specific values are divided by the power so
that the specific power gives you the status (on/off).
At the moment this works only with hourly time steps. The units are as follows:
- absolute emissions -> tons
- specific emissions -> tons/MWh
- absolute costs -> EUR
- specific costs -> EUR/MWh
- absolute energy -> MWh
- specific energy -> –
The resulting table of the function can be stored as a .csv
or .xlsx
file. The input is one results dictionary:
from deflex import postprocessing as pp
from deflex.analyses import get_flow_results
my_result_files = pp.search_results(path=my_path)
my_result = pp.restore_results(my_result_files[0])
flow_results = get_flow_results(my_result)
flow_results.to_csv("/my/path/flow_results.csv")
The resulting table can be used to calculate other key values in your own functions but you can also use some ready-made functions. Follow the link to get information about each function:
calculate_market_clearing_price()
calculate_emissions_most_expensive_pp()
We are planning to add more calculations in the future. If you have any ideas, please let us know by opening an issue.
All these functions above are integrated in the
get_key_values_from_results()
function. This function
takes a list of results and returns one MultiIndex DataFrame. It contains all
the return values from the functions above for each scenario. The first column
level contains the value names and the second level the names of the scenario.
The value names are:
- mcp
- emissions_most_expensive_pp
The name of the scenario is taken from the name
key of the meta attribute.
If this key is not available you have to set it for each scenario, otherwise
the function will fail. The resulting table can be stored as a .csv
or
.xlsx
file.
from deflex import postprocessing as pp
from deflex.analyses import get_key_values_from_results
my_result_files = pp.search_results(path=my_path)
my_results = pp.restore_results(my_result_files)
kv = get_key_values_from_results(my_results)
kv.to_csv("/my/path/key_values.csv")
If you have many scenarios, the resulting table may become quite big. Therefore, you can skip values you do not need in your resulting table. If you only need the emissions and not the market clearing price you can exclude the mcp.
kv = get_key_values_from_results(my_results, mcp=False)
Parallel computing of scenarios¶
For the typical work flow (creating a scenario, loading the input data,
computing the scenario and storing the results) the
model_scenario()
function can be used.
To collect all scenarios from a given directory the fetch_scenarios_from_dir() function can be used. The function will search for .xlsx files or paths that end in _csv. It cannot distinguish between a valid scenario and any other .xlsx file or a path that accidentally contains _csv.
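The search rule described above can be illustrated with the standard library. The sketch below is not the code of fetch_scenarios_from_dir(); the helper name find_scenario_candidates is hypothetical and it only looks at the top level of the directory, which is an assumption. It shows why any .xlsx file or any path ending in _csv is picked up, whether or not it is a valid scenario.

```python
import os
import tempfile


def find_scenario_candidates(path):
    """Return sorted paths of .xlsx files and entries ending with _csv."""
    candidates = []
    for name in sorted(os.listdir(path)):
        if name.endswith(".xlsx") or name.endswith("_csv"):
            candidates.append(os.path.join(path, name))
    return candidates


# Build a temporary directory with one xlsx file, one csv directory
# and one unrelated file to try the function.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "scenario_a.xlsx"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "scenario_b_csv"))
    found = [os.path.basename(p) for p in find_scenario_candidates(tmp)]

print(found)
```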
No matter how you collect a list of scenario input data files, the batch_model_scenario() function makes it easier to run each scenario and get back the relevant information about the run. It is possible to ignore exceptions so that the script will go on with the following scenarios if one scenario fails.
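The idea of ignoring exceptions in a batch run can be sketched as follows. This is a generic pattern and not the implementation of batch_model_scenario(); the names run_batch, fake_model and the structure of the returned log are hypothetical.

```python
def run_batch(paths, model_fn, ignore_errors=True):
    """Run model_fn for every path and collect information about each run."""
    log = []
    for path in paths:
        try:
            model_fn(path)
            log.append({"scenario": path, "success": True, "error": None})
        except Exception as exc:
            if not ignore_errors:
                raise
            # Remember the failure and go on with the next scenario.
            log.append({"scenario": path, "success": False, "error": repr(exc)})
    return log


def fake_model(path):
    """Stand-in for a real model run; fails for one specific file name."""
    if "broken" in path:
        raise ValueError("invalid input data")


log = run_batch(["a.xlsx", "broken.xlsx", "c.xlsx"], fake_model)
for entry in log:
    print(entry["scenario"], entry["success"])
```

Note that such a pattern catches ordinary exceptions only. As stated below, a memory error during a real optimisation cannot be handled this way.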
If you have enough memory and cpu capacity on your computer/server you can
optimise your scenarios in parallel. Use the
model_multi_scenarios()
function for this task. You can
pass a list of scenario files to this function. A cpu fraction will limit the
number of processes as a fraction of the maximal available number of cpu cores.
Keep in mind that for large models the memory will be the limit not the cpu
capacity. If a memory error occurs the script will stop immediately. It is not
possible to catch a memory error. A log-file will log all failing and
successful runs.
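How a cpu fraction translates into a number of processes can be sketched like this. The helper name number_of_processes is hypothetical, and rounding down with a minimum of one process is an assumption, not necessarily the behaviour of model_multi_scenarios().

```python
import os


def number_of_processes(cpu_fraction, cpu_total=None):
    """Translate a cpu fraction into a number of worker processes."""
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in the interval (0, 1].")
    if cpu_total is None:
        # os.cpu_count() may return None on some platforms.
        cpu_total = os.cpu_count() or 1
    # Assumption: round down, but always use at least one process.
    return max(1, int(cpu_total * cpu_fraction))


print(number_of_processes(0.5, cpu_total=8))
print(number_of_processes(0.1, cpu_total=4))
```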
Input data¶
The input data is stored in the input_data attribute of the DeflexScenario class (see DeflexScenario). It is a dictionary with the name of the data set as key and the data table itself as value (pandas.DataFrame or pandas.Series).
The input data is divided into four main topics: High-level-inputs, electricity sector, heating sector (optional) and mobility sector (optional).
Download a fictitious input data example to get an idea of the structure. Then continue with the following chapter to learn how to define the data for a deflex model.
Overview¶

A Deflex scenario can be divided into regions. Each region must have an identifier number and be named after it as DEXX, where XX is the number. To refer to the Deflex scenario as a whole (i.e. the sum of all regions) use DE only.
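The naming rule can be checked with a small regular expression. This is a sketch under the assumption that the region number always has exactly two digits (as in DE01 or DE17); the helper name parse_region is hypothetical and not part of deflex.

```python
import re

# "DE" optionally followed by a two-digit region number (assumption).
REGION_PATTERN = re.compile(r"DE(\d{2})?")


def parse_region(name):
    """Return the region number, or None for the supra-regional name DE."""
    match = REGION_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"Invalid region name: {name!r}")
    number = match.group(1)
    return None if number is None else int(number)


print(parse_region("DE01"))
print(parse_region("DE"))
```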
At the current state the distribution of fossil fuels is neglected. Therefore,
in order to keep the computing time low it is recommended to define them
supra-regional using DE
without a number. It is still possible to define
them regional for example to add a specific limit for each region.
In most cases it is also sufficient to model the fossil part of the mobility and the decentralised heating sector supra-regional. It is assumed that a gas boiler or a filling station is always supplied with enough fuel, so that only the annual values affect the model. This does not apply to electrical heating systems or cars.
In most spread sheet software it is possible to connect cells to increase readability. These lines are interpreted correctly. In csv files the values have to appear in every cell. So the following two tables will be interpreted equally!
Connected cells
| | | value |
| DE01 | F1 | |
| | F2 | |
| DE02 | F1 | |
Unconnected cells
| | | value |
| DE01 | F1 | |
| DE01 | F2 | |
| DE02 | F1 | |
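The equivalence of the two tables can be illustrated with a forward fill of the index columns: an empty index cell takes the value of the cell above it. The following sketch uses plain Python lists; the helper name fill_connected_cells and the values in the last column are hypothetical.

```python
def fill_connected_cells(rows, n_index_columns=1):
    """Fill empty index cells with the value of the row above."""
    filled = []
    previous = [None] * n_index_columns
    for row in rows:
        row = list(row)
        for i in range(n_index_columns):
            if row[i] in ("", None):
                row[i] = previous[i]
        previous = row[:n_index_columns]
        filled.append(row)
    return filled


# The region cell of the second row is empty (connected cell).
connected = [["DE01", "F1", 1.0], ["", "F2", 2.0], ["DE02", "F1", 3.0]]
unconnected = [["DE01", "F1", 1.0], ["DE01", "F2", 2.0], ["DE02", "F1", 3.0]]

print(fill_connected_cells(connected) == unconnected)
```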
Note
NaN-values are not allowed in any table. Some columns are optional and can be left out, but if a column is present there have to be values in every row. Neutral values can be 0, 1 or inf.
High-level-input (mandatory)¶
General¶
key:
‘general’, value:
pandas.Series()
This table contains basic data about the scenario.
year | |
number of time steps | |
co2 price | |
name |
INDEX
- year: int, [-] - A time index will be created starting on January 1 at 00:00 with the number of hours given in number of time steps.
- number of time steps: int, [-] - The number of hourly time steps.
- co2 price: float, [€/t] - The average price for CO2 over the whole time period.
- name: str, [-] - A name for the scenario. This name will be used to compare key values between different scenarios. Therefore, it should be unique within a group of scenarios. It does not have to be intuitive. Use the info table for a human readable description of your scenario.
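How year and number of time steps define the time index can be sketched with the standard library. This is an illustration of the rule stated above, not the deflex code; the helper name create_time_index is hypothetical.

```python
from datetime import datetime, timedelta


def create_time_index(year, number_of_time_steps):
    """Create hourly time steps starting on January 1 at 00:00."""
    start = datetime(year, 1, 1, 0, 0)
    return [start + timedelta(hours=h) for h in range(number_of_time_steps)]


# A full non-leap year has 8760 hourly time steps.
index = create_time_index(2014, 8760)
print(index[0], index[-1])
```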
Info¶
key:
‘info’, value:
pandas.Series()
On this sheet, additional information that characterizes the scenario can be
added. The idea behind Info is that the user can filter stored scenarios using
the search_results()
function.
You can create any key-value pair that is suitable for your group of scenarios, e.g. key: scenario_type, value: foo / bar / foobar.
Afterwards you can search for all scenarios where the scenario_type is foo using:
search_results(path=my_path, scenario_type=["foo"])
or with other keys and multiple values:
search_results(path=my_path, scenario_type=["foo", "bar"], my_key=["v1"])
The second code line will return only files with (foo or bar) and v1.
key1 | |
key2 | |
key3 | |
… | … |
Commodity sources¶
key:
‘commodity sources’, value:
pandas.DataFrame()
This sheet requires data for all the commodities used in the scenario. The data can be provided either supra-regional under DE, regional under DEXX or as a combination of both, where some commodities are global and some are regional. Regionalised commodities are especially useful for commodities with an annual limit, for example bioenergy.
| | | costs | emission | annual limit |
| DE | F1 | | | |
| | F2 | | | |
| DE01 | F1 | | | |
| DE02 | F2 | | | |
| … | … | … | … | … |
INDEX
- level 0: str - Region (e.g. DE01, DE02 or DE).
- level 1: str - Fuel type.
COLUMNS
- costs: float, [€/MWh] - The fuel production cost.
- emission: float, [t/MWh] - The fuel emission factor.
- annual limit: float, [MWh] - The annual maximum energy generation (if there is one, otherwise just use inf). If the annual limit is inf in every line the column can be left out.
Data sources¶
key:
‘data sources’, value:
pandas.DataFrame()
Highly recommended. Here the type of data, the source name and the url from which the data was obtained can be listed. It is a free format and additional columns can be added. This table helps to make your scenario as transparent as possible.
source | url | v1 | … | |
cost data | Institute | http1 | a1 | … |
pv plants | Organisation | http2 | a2 | … |
… | … | … | … | … |
Electricity sector (mandatory)¶
Electricity demand series¶
key:
‘electricity demand series’,
value:
pandas.DataFrame()
This sheet requires the electricity demand of the scenario as a time series. One summarised demand series for each region is enough, but it is possible to distinguish between different types. This will not have any effect on the model results but may help to distinguish the different flows in the results.
DE01 | DE02 | DE03 | … | |||
all | industry | buildings | rest | all | … |
Time step 1 | … | |||||
Time step 2 | … | |||||
… | … | … | … | … | … | … |
INDEX
- time step:
int
- Number of time step. Must be uniform in all series tables.
COLUMNS
unit: [MW]
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Specification of the series e.g. “all” for an overall series.
Power plants¶
key:
‘power plants’, value:
pandas.DataFrame()
The power plants will feed into the electricity bus of the region in which they are located. The data must be divided by region and subdivided by fuel. Each row can indicate one power plant or a group of power plants. It is possible to add additional columns for information purposes.
| | | capacity | fuel | efficiency | annual electricity limit | variable_cost | downtime_factor | source_region |
| DE01 | N1 | | | | | | | |
| | N2 | | | | | | | |
| | N3 | | | | | | | |
| DE02 | N2 | | | | | | | |
| | N3 | | | | | | | |
| … | … | … | … | … | … | … | … | … |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Name, arbitrary. The combination of region and name is the unique identifier for the power plant or the group of power plants.
COLUMNS
- capacity: float, [MW] - The installed capacity of the power plant or the group of power plants.
- fuel: str, [-] - The used fuel of the power plant or group of power plants. The combination of source_region and fuel must exist in the commodity sources table.
- efficiency: float, [-] - The average overall efficiency of the power plant or the group of power plants.
- annual limit: float, [MWh] - The absolute maximum limit of produced electricity within the whole modeling period.
- variable_costs: float, [€/MWh] - The variable costs per produced electricity unit.
- downtime_factor: float, [-] - The time fraction of the modeling period in which the power plant or the group of power plants cannot produce electricity. The installed capacity will be reduced by this factor: capacity * (1 - downtime_factor).
- source_region: [-] - The source region of the fuel source. Typically this is the region of the index, or DE if it is a global commodity source. The combination of source_region and fuel must exist in the commodity sources table.
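Two of the rules above lend themselves to a quick check before running a model: the capacity reduction by the downtime factor and the requirement that every combination of source_region and fuel exists in the commodity sources table. The sketch below uses plain dictionaries instead of the pandas tables. All names (available_capacity, missing_sources) and all data values are hypothetical.

```python
# Index of a commodity sources table as (region, fuel) tuples.
commodity_sources = {("DE", "natural gas"), ("DE01", "bioenergy")}

# Rows of a power plants table, keyed by (region, name).
power_plants = {
    ("DE01", "gas plants"): {
        "capacity": 100.0,
        "fuel": "natural gas",
        "downtime_factor": 0.1,
        "source_region": "DE",
    },
    ("DE01", "bio plants"): {
        "capacity": 20.0,
        "fuel": "bioenergy",
        "downtime_factor": 0.0,
        "source_region": "DE01",
    },
}


def available_capacity(plant):
    """Installed capacity reduced by the downtime factor."""
    return plant["capacity"] * (1 - plant["downtime_factor"])


def missing_sources(plants, sources):
    """Return plants whose (source_region, fuel) is not a commodity source."""
    return [
        key
        for key, plant in plants.items()
        if (plant["source_region"], plant["fuel"]) not in sources
    ]


for key, plant in power_plants.items():
    print(key, round(available_capacity(plant), 2))
print(missing_sources(power_plants, commodity_sources))
```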
Volatile plants¶
key:
‘volatile plants’, value:
pandas.DataFrame()
Examples of volatile power plants are solar, wind, hydro, geothermal. Data must be provided divided by region and subdivided by energy source. Each row can indicate one plant or a group of plants. It is possible to add additional columns for information purposes.
capacity | ||
DE01 | N1 | |
N2 | ||
DE02 | N1 | |
DE03 | N1 | |
N3 | ||
… | … | … |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Name, arbitrary. The combination of the region and the name has to exist as a time series in the volatile series table.
COLUMNS
- capacity:
float
, [MW] - The installed capacity of the plant.
Volatile series¶
key:
‘volatile series’, value:
pandas.DataFrame()
This sheet provides the normalised feed-in time series in MW/MW installed. Each time series will be multiplied by its installed capacity to get the absolute feed-in. Therefore, the combination of region and name has to exist in the volatile plants table.
DE01 | DE02 | DE03 | … | |||
N1 | N2 | N1 | N1 | N3 | … | |
Time step 1 | … | |||||
Time step 2 | … | |||||
… | … | … | … | … | … | … |
INDEX
- time step:
int
- Number of time step. Must be uniform in all series tables.
COLUMNS
unit: [MW]
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Name of the energy source specified in the previous sheet.
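The relation between the volatile plants and the volatile series table can be sketched with plain dictionaries: each normalised series is multiplied by the installed capacity with the same combination of region and name. The helper name absolute_feed_in and the example values are hypothetical.

```python
# Installed capacity in MW, keyed by (region, name).
volatile_plants = {("DE01", "wind"): 200.0, ("DE01", "solar"): 100.0}

# Normalised feed-in in MW per MW installed for three time steps.
volatile_series = {
    ("DE01", "wind"): [0.0, 0.5, 1.0],
    ("DE01", "solar"): [0.25, 0.5, 0.0],
}


def absolute_feed_in(plants, series):
    """Multiply each normalised series by its installed capacity."""
    missing = set(series) - set(plants)
    if missing:
        raise KeyError(f"No capacity defined for: {sorted(missing)}")
    return {
        key: [value * plants[key] for value in values]
        for key, values in series.items()
    }


feed_in = absolute_feed_in(volatile_plants, volatile_series)
print(feed_in[("DE01", "wind")])
```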
Electricity storages¶
key:
‘electricity storages’, value:
pandas.DataFrame()
All types of electricity storages can be defined in this table. The different storage technologies (pumped hydro, batteries, compressed air, etc.) have to be entered in a general way. Each row can indicate one storage or a group of storages. It is possible to add additional columns for information purposes.
energy content | energy inflow | charge capacity | discharge capacity | charge efficiency | discharge efficiency | loss rate | ||
DE01 | S1 | |||||||
S2 | ||||||||
DE02 | S2 | |||||||
… | … | … | … | … | … | … | … | … |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Name, arbitrary.
COLUMNS
- energy content: float, [MWh] - The maximum energy content of a storage or a group of storages.
- energy inflow: float, [MWh] - The amount of energy that will feed into the storage over the model period in MWh. For example a river into a pumped hydroelectric energy storage.
- charge capacity: float, [MW] - Maximum capacity to charge the storage or the group of storages.
- discharge capacity: float, [MW] - Maximum capacity to discharge the storage or the group of storages.
- charge efficiency: float, [-] - Charging efficiency of the storage or the group of storages.
- discharge efficiency: float, [-] - Discharging efficiency of the storage or the group of storages.
- loss rate: float, [-] - The relative loss of the energy content of the storage. For example a loss rate of 0.01 means that the energy content of the storage will be reduced by 1% in each time step.
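The effect of the loss rate can be illustrated with a few lines of Python. The sketch assumes an idle storage (no charging, discharging or inflow) and applies the relative loss once per time step as described above. The helper name energy_content_after is hypothetical and the calculation is not taken from the model code.

```python
def energy_content_after(initial_content, loss_rate, time_steps):
    """Energy content of an idle storage after a number of time steps."""
    content = initial_content
    for _ in range(time_steps):
        # The content is reduced by the relative loss in each time step.
        content -= content * loss_rate
    return content


# 1000 MWh with a loss rate of 0.01 after two time steps.
print(round(energy_content_after(1000.0, 0.01, 2), 2))
```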
Power lines¶
key:
‘power lines’, value:
pandas.DataFrame()
The power lines table defines the connection between the electricity buses of each region of the scenario. There is no default connection. If no connection is defined the regions will be self-sufficient.
capacity | efficiency | |
DE01-DE02 | ||
DE01-DE03 | ||
DE02-DE03 | ||
… | … | … |
INDEX
- Name: str - Name of the 2 connected regions separated by a dash. Define only one direction. In the model one line for each direction will be created. If both directions are defined in the table two lines for each direction will be created for the model, so that the capacity will be the sum of both lines.
COLUMNS
- capacity: float, [MW] - The maximum transmission capacity of the power lines.
- efficiency: float, [-] - The transmission efficiency of the power line.
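The rule for the index names can be sketched as follows: every entry creates one line per direction, so defining both DE01-DE02 and DE02-DE01 adds up the capacities. This is a plain Python illustration that ignores the efficiency column; the helper name create_directed_lines and the capacities are hypothetical.

```python
from collections import defaultdict


def create_directed_lines(power_lines):
    """Create one directed line per direction for every table entry."""
    directed = defaultdict(float)
    for name, capacity in power_lines.items():
        region_a, region_b = name.split("-")
        # Each entry results in a line for both directions.
        directed[(region_a, region_b)] += capacity
        directed[(region_b, region_a)] += capacity
    return dict(directed)


# DE01-DE02 is defined in both directions, so the capacities add up.
lines = create_directed_lines(
    {"DE01-DE02": 1000.0, "DE02-DE01": 500.0, "DE01-DE03": 300.0}
)
print(lines[("DE01", "DE02")], lines[("DE03", "DE01")])
```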
Heating sector (optional)¶
Heat demand series¶
key:
‘heat demand series’, value:
pandas.DataFrame()
The heat demand can be entered regionally under DEXX or supra-regional under DE. The only type of demand that must be entered regionally is district heating. As a recommendation, coal, gas, or oil demands should be treated supra-regional.
DE01 | DE02 | DE | |||||||
district heating | N1 | district heating | N1 | N2 | … | N3 | N4 | N5 | |
Time step 1 | |||||||||
Time step 2 | |||||||||
… | … | … | … | … | … | … | … | … | … |
INDEX
- time step:
int
- Number of time step. Must be uniform in all series tables.
COLUMNS
unit: [MW]
- level 0:
str
- Region (e.g. DE01, DE02 or DE).
- level 1:
str
- Name. Specification of the series e.g. district heating, coal, gas. Except for district heating each combination of region and name must exist in the decentralised heat table.
Decentralised heat¶
key:
‘decentralised heat’, value:
pandas.DataFrame()
This sheet covers all heating technologies that are used to generate decentralised heat. In this context decentralised does not mean regional; it represents the large group of independent heating systems. If there is no specific reason to define a heating system regional, it should be defined supra-regional.
efficiency | source | source region | ||
DE01 | N1 | DE01 | ||
DE02 | N1 | DE02 | ||
N2 | DE02 | |||
… | … | |||
DE | N3 | DE | ||
N4 | DE | |||
N5 | DE |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02 or DE).
- level 1:
str
- Name, arbitrary.
COLUMNS
- efficiency:
float
, [-] - The efficiency of the heating technology.
- source:
str
, [-] - The source that the heating technology uses. Examples are coal, oil for commodities, but it could also be electricity in case of a heat pump. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
- source region:
str
- The region where the source comes from (see source).
Chp - heat plants¶
key:
‘chp-heat plants’, value:
pandas.DataFrame()
This sheet covers CHP and heat plants. Each plant will feed into the district heating bus of the region in which it is located. The demand of district heating is defined in the heat demand series table with the name district heating. All plants of the same region with the same fuel can be defined in one row, but it is also possible to divide them by additional categories such as efficiency etc.
limit heat chp | capacity heat chp | capacity elec chp | limit hp | capacity hp | efficiency hp | efficiency heat chp | efficiency elec chp | fuel | source region | ||
DE01 | N1 | DE01 | |||||||||
N3 | DE | ||||||||||
N4 | DE | ||||||||||
DE02 | N1 | DE02 | |||||||||
N2 | DE02 | ||||||||||
N3 | DE | ||||||||||
N4 | DE | ||||||||||
N5 | DE | ||||||||||
… | … | … | … | … | … | … | … | … | … | … | … |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02).
- level 1:
str
- Name, arbitrary.
COLUMNS
- limit heat chp: float, [MWh] - The absolute maximum limit of heat produced by chp within the whole modeling period.
- capacity heat chp: float, [MW] - The installed heat capacity of all chp plants of the same group in the region.
- capacity elect chp: float, [MW] - The installed electricity capacity of all chp plants of the same group in the region.
- limit hp: float, [MWh] - The absolute maximum limit of heat produced by the heat plant within the whole modeling period.
- capacity hp: float, [MW] - The installed heat capacity of all heat plants of the same group in the region.
- efficiency hp: float, [-] - The average overall efficiency of the heat plant.
- efficiency heat chp: float, [-] - The average overall heat efficiency of the chp.
- efficiency elect chp: float, [-] - The average overall electricity efficiency of the chp.
- fuel: str, [-] - The used fuel of the plants. The fuel name must be equal to the fuel type of the commodity sources. The combination of fuel and source region has to exist in the commodity sources table.
- source_region: [-] - The source region of the fuel source. Typically this is the region of the index, or DE if it is a global commodity source.
Mobility sector (optional)¶
Mobility demand series¶
key:
‘mobility series’, value:
pandas.DataFrame()
The mobility demand can be entered regionally or supra-regional. However, it is recommended to define the mobility demand supra-regional except for electricity. The demand for electric mobility has to be defined regional because it will be connected to the electricity bus of each region. The combination of region and name has to exist in the mobility table.
DE01 | DE02 | … | DE | |
electricity | electricity | N1 | ||
Time step 1 | ||||
Time step 2 | ||||
… | … | … | … | … |
INDEX
- time step:
int
- Number of time step. Must be uniform in all series tables.
COLUMNS
unit: [MW]
- level 0:
str
- Region (e.g. DE01, DE02 or DE).
- level 1:
str
- Specification of the series e.g. “electricity” for each region or “diesel”, “petrol” for DE.
Mobility¶
key:
‘mobility’, value:
pandas.DataFrame()
This sheet covers the technologies of the mobility sector.
efficiency | source | source region | ||
DE01 | electricity | electricity | DE01 | |
DE02 | electricity | electricity | DE02 | |
… | ||||
DE | N1 | oil/biofuel/H2/etc | DE |
INDEX
- level 0:
str
- Region (e.g. DE01, DE02 or DE).
- level 1:
str
- Name, arbitrary.
COLUMNS
- efficiency:
float
, [-] - The efficiency of the fuel production. If a diesel demand is defined in the mobility demand series table the efficiency represents the efficiency of diesel production from the commodity source e.g. oil. For a biofuel demand the efficiency of the production of biofuel from biomass has to be defined.
- source:
str
, [-] - The source that the technology uses. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
- source region:
str
, [-] - The region where the source comes from.
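The role of the efficiency column can be made concrete with a small calculation. The sketch assumes that the commodity input is the fuel demand divided by the efficiency of the fuel production, e.g. the oil needed to cover a diesel demand. The helper name commodity_input and the numbers are hypothetical and not taken from the model code.

```python
def commodity_input(demand_series, efficiency):
    """Commodity needed per time step to cover a fuel demand series."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1].")
    # Assumption: input = demand / efficiency for each time step.
    return [demand / efficiency for demand in demand_series]


# Diesel demand in MW for three time steps, produced from oil with an
# efficiency of 0.9.
oil_input = commodity_input([90.0, 45.0, 0.0], efficiency=0.9)
print([round(value, 2) for value in oil_input])
```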