Usage guide¶

THIS CHAPTER IS WORK IN PROGRESS…

DeflexScenario
Results
- Fetch results
- Get common values from results
Parallel computing of scenarios
Input data

DeflexScenario ¶

The scenario class DeflexScenario is a central element of deflex.

All input data is stored as a dictionary in the input_data attribute of the DeflexScenario class. The keys of the dictionary are names of the data table and the values are pandas.DataFrame or pandas.Series with the data.

Load input data ¶

At the moment, there are two methods to populate this attribute from files:

read_csv() - read a directory with all needed csv files.
read_xlsx() - read a spreadsheet in the .xlsx format.

To learn how to create a valid input data set see “REFERENCE”.

from deflex import scenario
sc = scenario.DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
# OR
sc.read_csv("path/to/csv/dir")

Solve the energy system ¶

A valid input data set describes an energy system. To optimise the dispatch of the energy system an external solver is needed. By default the CBC solver is used, but other solvers can be used as well (see: solver).

The simplest way to solve a scenario is the compute() method.

sc.compute()

To use a different solver one can pass the solver parameter.

sc.compute(solver="glpk")

Store and restore the scenario ¶

The dump() method can be used to store the scenario. A solved scenario will be stored together with its results. The scenario is stored in a binary format that is not human-readable.

sc.dump("path/to/store/results.dflx")

To restore the scenario use the restore_scenario function:

sc = scenario.restore_scenario("path/to/store/results.dflx")

Analyse the scenario ¶

Most analyses cannot be performed if the scenario is not solved. However, the merit order can be derived from the input data alone:

from matplotlib import pyplot as plt

from deflex import DeflexScenario
from deflex import analyses

# The merit order only needs the input data, so the scenario is not solved.
sc = DeflexScenario()
sc.read_xlsx("path/to/xlsx/file.xlsx")
pp = analyses.merit_order_from_scenario(sc)

# Plot the marginal costs over the cumulative capacity as a step function.
ax = plt.figure(figsize=(15, 4)).add_subplot(1, 1, 1)
ax.step(pp["capacity_cum"].values, pp["costs_total"].values, where="pre")
ax.set_xlabel("Cumulative capacity [GW]")
ax.set_ylabel("Marginal costs [EUR/MWh]")
ax.set_ylim(0)
ax.set_xlim(0, pp["capacity_cum"].max())
plt.show()

With the de02_co2-price_var-costs.xlsx from the examples the code above will produce the following plot:

_images/merit_order_example_plot_simple.svg

Filling the area between the line and the x-axis with colours according to the fuel of the power plant, one gets the following plot:

_images/merit_order_example_plot_coloured.svg

IMPORTANT: This is just an example and not a source for the actual merit order in Germany.

Results ¶

All results are stored in the results attribute of the Scenario class. It is a dictionary with the following keys:

main – Results of all variables

param – Input parameter

meta – Meta information and tags of the scenario

problem – Information about the linear problem such as lower bound, upper bound etc.

solver – Solver results

solution – Information about the found solution and the objective value

The deflex package provides some analyse functions as described below but it is also possible to write your own post processing. See the results chapter of the oemof.solph documentation to learn how to access the results.

Fetch results ¶

To find result files on your hard disk you can use the search_results() function. This function provides filter parameters which can be used to filter by your own meta tags. The meta attribute of the Scenario class stores these meta tags in a dictionary with the tag name as key and the tag value as value.

meta = {
    "regions": 17,
    "heat": True,
    "tag": "value",
    }

The filter for these tags will look as follows. The values in the filter have to be strings regardless of the original type:

search_results(path=TEST_PATH, regions=["17", "21"], heat=["true"])

There is always an AND connection between all filters and an OR connection within a list. So the filter above will only return results with 17 or 21 regions and with the heat tag set to true. The returned list can be used as an input parameter to load the results and get a list of results dictionaries.

my_result_files = search_results(path=my_path)
my_results = restore_results(my_result_files)

If a single file name is passed to the restore_results() function a single result will be returned, otherwise a list.
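The AND/OR logic of the filter can be illustrated with a small stand-alone sketch. The helper matches() is hypothetical and not part of deflex; it also assumes that tag values are compared as lower-case strings, which is consistent with the example above but not confirmed by the deflex source.

```python
def matches(meta, **filters):
    """Return True if `meta` passes every filter (AND) with any listed value (OR)."""
    for key, allowed in filters.items():
        if key not in meta:
            return False
        # Assumption of this sketch: values are compared as lower-case strings.
        if str(meta[key]).lower() not in [a.lower() for a in allowed]:
            return False
    return True


meta = {"regions": 17, "heat": True, "tag": "value"}

# 17 is one of the allowed region numbers AND the heat tag is true.
print(matches(meta, regions=["17", "21"], heat=["true"]))
# The region filter fails, so the whole filter fails.
print(matches(meta, regions=["21"], heat=["true"]))
```

The first call prints True and the second one False, because every keyword has to match while only one value per list has to match.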

Get common values from results ¶

Common values are emissions, costs and energy of the flows. The function get_flow_results() returns a MultiIndex DataFrame with the costs, emissions and the energy of all flows. The values are absolute and specific. The specific values are divided by the power so that the specific power gives you the status (on/off).

At the moment this works only with hourly time steps. The units are as follows:

absolute emissions -> tons

specific emissions -> tons/MWh

absolute costs -> EUR

specific costs -> EUR/MWh

absolute energy -> MWh

specific energy -> –

The resulting table of the function can be stored as a .csv or .xlsx file. The input is one results dictionary:

from deflex import postprocessing as pp
from deflex.analyses import get_flow_results

my_result_files = pp.search_results(path=my_path)
my_result = pp.restore_results(my_result_files[0])
flow_results = get_flow_results(my_result)
flow_results.to_csv("/my/path/flow_results.csv")

The resulting table can be used to calculate other key values in your own functions but you can also use some ready-made functions. Follow the link to get information about each function:

calculate_market_clearing_price()

calculate_emissions_most_expensive_pp()

We are planning to add more calculations in the future. Please open an issue to let us know if you have any ideas. All the functions above are integrated in the get_key_values_from_results() function. This function takes a list of results and returns one MultiIndex DataFrame. It contains all the return values from the functions above for each scenario. The first column level contains the value names and the second level the names of the scenarios. The value names are:

mcp

emissions_most_expensive_pp

The name of the scenario is taken from the name key of the meta attribute. If this key is not available you have to set it for each scenario, otherwise the function will fail. The resulting table can be stored as a .csv or .xlsx file.

from deflex import postprocessing as pp
from deflex.analyses import get_key_values_from_results

my_result_files = pp.search_results(path=my_path)
my_results = pp.restore_results(my_result_files)
kv = get_key_values_from_results(my_results)
kv.to_csv("/my/path/key_values.csv")

If you have many scenarios, the resulting table may become quite big. Therefore, you can skip values you do not need in your resulting table. If you do need only the emissions and not the market clearing price you can exclude the mcp.

kv = get_key_values_from_results(my_results, mcp=False)

Parallel computing of scenarios ¶

For the typical workflow (creating a scenario, loading the input data, computing the scenario and storing the results) the model_scenario() function can be used.

To collect all scenarios from a given directory the function fetch_scenarios_from_dir() can be used. The function searches for .xlsx files and for paths that end in _csv. It cannot distinguish between a valid scenario and an arbitrary .xlsx file or a path that happens to end in _csv.

No matter how you collect the list of scenario input data files, the batch_model_scenario() function makes it easier to run each scenario and get back the relevant information about the run. It is possible to ignore exceptions so that the script continues with the following scenarios if one scenario fails.

If you have enough memory and cpu capacity on your computer/server you can optimise your scenarios in parallel. Use the model_multi_scenarios() function for this task. You can pass a list of scenario files to this function. A cpu fraction will limit the number of processes as a fraction of the maximal available number of cpu cores. Keep in mind that for large models the memory will be the limit not the cpu capacity. If a memory error occurs the script will stop immediately. It is not possible to catch a memory error. A log-file will log all failing and successful runs.
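How a cpu fraction could translate into a number of processes is sketched below. The function number_of_processes() is hypothetical and the rounding rule (round down, but use at least one process) is an assumption; the actual behaviour of model_multi_scenarios() may differ.

```python
import multiprocessing


def number_of_processes(cpu_fraction, available_cores=None):
    """Derive a process count from a fraction of the available cpu cores."""
    if available_cores is None:
        available_cores = multiprocessing.cpu_count()
    # Assumption: round down, but always keep at least one process.
    return max(1, int(available_cores * cpu_fraction))


# With 8 cores and a fraction of 0.5, four scenarios run at the same time.
print(number_of_processes(0.5, available_cores=8))
```

For large models it may be sensible to choose a smaller fraction than the cpu would allow, because each process needs its own share of memory.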

Input data ¶

The input data is stored in the input_data attribute of the DeflexScenario class (see DeflexScenario). It is a dictionary with the name of the data set as key and the data table itself as value (pandas.DataFrame or pandas.Series).

The input data is divided into four main topics: High-level-inputs, electricity sector, heating sector (optional) and mobility sector (optional).

Download a fictitious input data example to get an idea of the structure. Then continue with the following chapter to learn how to define the data for a deflex model.

Overview
High-level-input (mandatory)
Electricity sector (mandatory)
Heating sector (optional)
Mobility sector (optional)

Overview ¶

https://raw.githubusercontent.com/reegis/deflex/master/docs/images/spreadsheet_examples.png

A Deflex scenario can be divided into regions. Each region must have an identifier number and be named after it as DEXX, where XX is the number. To refer to the Deflex scenario as a whole (i.e. the sum of all regions) use DE only.
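The naming rule can be checked with a regular expression. This is only a sketch: the helper is_valid_region() is hypothetical, and it assumes that the number always has exactly two digits, which the text suggests but does not state explicitly.

```python
import re

# Assumption of this sketch: regional identifiers use exactly two digits.
REGION_PATTERN = re.compile(r"DE(\d{2})?")


def is_valid_region(name):
    """Check whether `name` is 'DE' or 'DE' followed by two digits."""
    return REGION_PATTERN.fullmatch(name) is not None


for candidate in ["DE", "DE01", "DE1", "FR01"]:
    print(candidate, is_valid_region(candidate))
```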

At the current state the distribution of fossil fuels is neglected. Therefore, in order to keep the computing time low, it is recommended to define them supra-regionally using DE without a number. It is still possible to define them regionally, for example to add a specific limit for each region.

In most cases it is also sufficient to model the fossil part of the mobility and the decentralised heating sector supra-regionally. It is assumed that a gas boiler or a filling station is always supplied with enough fuel, so that only the annual values affect the model. This does not apply to electrical heating systems or electric cars.

In most spreadsheet software it is possible to connect (merge) cells to increase readability. Such connected cells are interpreted correctly. In csv files the values have to appear in every cell. So the following two tables will be interpreted identically.

Connected cells

		value
DE01	F1
	F2
DE02	F1

Unconnected cells

		value
DE01	F1
DE01	F2
DE02	F1
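When a spreadsheet with connected cells is converted to csv, the region label has to be repeated in every row. A minimal sketch of this expansion is shown below; fill_merged_cells() is a hypothetical helper and not part of deflex.

```python
def fill_merged_cells(rows):
    """Repeat the last non-empty region label in rows where it is blank."""
    filled = []
    last_region = None
    for region, name in rows:
        if region:  # a non-empty cell starts a new connected block
            last_region = region
        filled.append((last_region, name))
    return filled


# Index columns as they appear with connected cells (blank = connected).
connected = [("DE01", "F1"), ("", "F2"), ("DE02", "F1")]
print(fill_merged_cells(connected))
```

The result carries the label DE01 in both of the first two rows, which is the form required for csv files.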

Note

NaN-values are not allowed in any table. Some columns are optional and can be left out, but if a column is present there have to be values in every row. Neutral values can be 0, 1 or inf.
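A table can be checked for missing values before it is passed to the model. The sketch below uses plain dictionaries instead of pandas objects to stay self-contained; find_missing_values() is a hypothetical helper and the numbers are made up for illustration.

```python
import math


def find_missing_values(table):
    """Return (row, column) pairs whose value is None or NaN."""
    missing = []
    for row, columns in table.items():
        for column, value in columns.items():
            if value is None or (isinstance(value, float) and math.isnan(value)):
                missing.append((row, column))
    return missing


# Made-up example: inf is a valid neutral value, NaN is not allowed.
table = {
    ("DE", "F1"): {"costs": 10.0, "emission": 0.2, "annual limit": float("inf")},
    ("DE01", "F2"): {"costs": float("nan"), "emission": 0.0, "annual limit": 1000.0},
}
print(find_missing_values(table))
```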

General ¶

key: ‘general’, value: pandas.Series()

This table contains basic data about the scenario.

year
number of time steps
co2 price
name

INDEX

year: int, [-]: A time index will be created starting on January 1 at 00:00 with the number of hours given in number of time steps.
number of time steps: int, [-]: The number of hourly time steps.
co2 price: float, [€/t]: The average price for CO₂ over the whole time period.
name: str, [-]: A name for the scenario. This name will be used to compare key values between different scenarios. Therefore, it should be unique within a group of scenarios. It does not have to be intuitive. Use the info table for a human readable description of your scenario.
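The relation between year and number of time steps can be sketched with the standard library. The function hourly_index() is hypothetical; deflex creates its own time index internally, presumably with pandas.

```python
from datetime import datetime, timedelta


def hourly_index(year, number_of_time_steps):
    """Create hourly time stamps starting on January 1 at 00:00."""
    start = datetime(year, 1, 1, 0, 0)
    return [start + timedelta(hours=i) for i in range(number_of_time_steps)]


# A full non-leap year has 8760 hourly time steps.
index = hourly_index(2014, 8760)
print(index[0], index[-1])
```

With 8760 time steps the last entry is December 31 at 23:00, so the index covers exactly one non-leap year.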

Info ¶

key: ‘info’, value: pandas.Series()

On this sheet, additional information that characterizes the scenario can be added. The idea behind Info is that the user can filter stored scenarios using the search_results() function.

You can create any key-value pair which is suitable for your group of scenarios.

e.g. key: scenario_type value: foo / bar / foobar

Afterwards you can search for all scenarios where the scenario_type is foo using:

search_results(path=my_path, scenario_type=["foo"])

or with other keys and multiple values:

search_results(path=my_path, scenario_type=["foo", "bar"], my_key=["v1"])

The second code line will return only files with (foo or bar) and v1.

key1
key2
key3
…	…

Commodity sources ¶

key: ‘commodity sources’, value: pandas.DataFrame()

This sheet requires data for all the commodities used in the scenario. The data can be provided either supra-regionally under DE, regionally under DEXX or as a combination of both, where some commodities are global and some are regional. Regionalised commodities are especially useful for commodities with an annual limit, for example bioenergy.

		costs	emission	annual limit
DE	F1
DE	F2
DE01	F1
DE02	F2
…	…	…	…	…

INDEX

level 0: str: Region (e.g. DE01, DE02 or DE).
level 1: str: Fuel type.

COLUMNS

costs: float, [€/MWh]: The fuel production cost.
emission: float, [t/MWh]: The fuel emission factor.
annual limit: float, [MWh]: The annual maximum energy generation (if there is one, otherwise just use inf). If the annual limit is inf in every line the column can be left out.

Data sources ¶

key: ‘data sources’, value: pandas.DataFrame()

Highly recommended. The type of data, the source name and the URL from which the data was obtained can be listed here. The format is free and additional columns can be added. This table helps to make your scenario as transparent as possible.

	source	url	v1	…
cost data	Institute	http1	a1	…
pv plants	Organisation	http2	a2	…
…	…	…	…	…

Electricity sector (mandatory)¶

Electricity demand series
Power plants
Volatile plants
Volatile series
Electricity storages
Power lines

Electricity demand series ¶

key: ‘electricity demand series’, value: pandas.DataFrame()

This sheet requires the electricity demand of the scenario as a time series. One summarised demand series for each region is enough, but it is possible to distinguish between different types. This will not have any effect on the model results but may help to distinguish the different flows in the results.

	DE01	DE02			DE03	…
	all	industry	buildings	rest	all	…
Time step 1						…
Time step 2						…
…	…	…	…	…	…	…

INDEX

time step: int: Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Specification of the series e.g. “all” for an overall series.

Power plants ¶

key: ‘power plants’, value: pandas.DataFrame()

The power plants will feed into the electricity bus of the region in which they are located. The data must be divided by region and subdivided by fuel. Each row can indicate one power plant or a group of power plants. It is possible to add additional columns for information purposes.

		capacity	fuel	efficiency	annual electricity limit	variable_cost	downtime_factor	source_region
DE01	N1
	N2
	N3
DE02	N2
DE02	N3
…	…	…	…	…	…	…	…	…

INDEX

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Name, arbitrary. The combination of region and name is the unique identifier for the power plant or the group of power plants.

COLUMNS

capacity: float, [MW]: The installed capacity of the power plant or the group of power plants.
fuel: str, [-]: The used fuel of the power plant or group of power plants. The combination of source_region and fuel must exist in the commodity sources table.
efficiency: float, [-]: The average overall efficiency of the power plant or the group of power plants.
annual electricity limit: float, [MWh]: The absolute maximum limit of produced electricity within the whole modeling period.
variable_costs: float, [€/MWh]: The variable costs per produced electricity unit.
downtime_factor: float, [-]: The time fraction of the modeling period in which the power plant or the group of power plants cannot produce electricity. The installed capacity will be reduced by this factor capacity * (1 - downtime_factor).
source_region: str, [-]: The source region of the fuel source. Typically this is the region of the index or DE if it is a global commodity source. The combination of source_region and fuel must exist in the commodity sources table.
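Two of the rules above lend themselves to a short illustration: the reduction of the installed capacity by the downtime factor and the requirement that every combination of source_region and fuel exists in the commodity sources table. The sketch uses plain Python containers with made-up values; missing_sources() and available_capacity() are hypothetical helpers, not deflex functions.

```python
# Made-up index of the commodity sources table: (region, fuel).
commodity_sources = {("DE", "lignite"), ("DE", "natural gas"), ("DE01", "bioenergy")}

# Made-up rows of the power plants table.
power_plants = [
    {"region": "DE01", "name": "N1", "capacity": 1000.0, "fuel": "lignite",
     "downtime_factor": 0.1, "source_region": "DE"},
    {"region": "DE01", "name": "N2", "capacity": 500.0, "fuel": "bioenergy",
     "downtime_factor": 0.0, "source_region": "DE02"},
]


def missing_sources(plants, sources):
    """List plants whose (source_region, fuel) pair has no commodity source."""
    return [(p["region"], p["name"]) for p in plants
            if (p["source_region"], p["fuel"]) not in sources]


def available_capacity(plant):
    """Installed capacity reduced by the downtime factor."""
    return plant["capacity"] * (1 - plant["downtime_factor"])


print(missing_sources(power_plants, commodity_sources))
print(available_capacity(power_plants[0]))
```

In this example the plant N2 is reported, because bioenergy is only defined for DE01 while the plant refers to DE02 as its source region.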

Volatile plants ¶

key: ‘volatile plants’, value: pandas.DataFrame()

Examples of volatile power plants are solar, wind, hydro, geothermal. Data must be provided divided by region and subdivided by energy source. Each row can indicate one plant or a group of plants. It is possible to add additional columns for information purposes.

		capacity
DE01	N1
	N2
DE02	N1
DE03	N1
	N3
…	…	…

INDEX

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Name, arbitrary. The combination of the region and the name has to exist as a time series in the volatile series table.

COLUMNS

capacity: float, [MW]: The installed capacity of the plant.

Volatile series ¶

key: ‘volatile series’, value: pandas.DataFrame()

This sheet provides the normalised feed-in time series in MW/MW_installed. Each time series will be multiplied by its installed capacity to get the absolute feed-in. Therefore, the combination of region and name has to exist in the volatile plants table.

	DE01		DE02	DE03		…
	N1	N2	N1	N1	N3	…
Time step 1						…
Time step 2						…
…	…	…	…	…	…	…

INDEX

time step: int: Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Name of the energy source specified in the previous sheet.
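The scaling of the normalised series with the installed capacity can be sketched as follows. Plain dictionaries stand in for the two tables, the values are made up, and absolute_feed_in() is a hypothetical helper rather than a deflex function.

```python
# Made-up volatile plants table: (region, name) -> installed capacity in MW.
capacity = {("DE01", "N1"): 200.0, ("DE01", "N2"): 40.0}

# Made-up normalised series in MW/MW_installed for three time steps.
normalised = {
    ("DE01", "N1"): [0.0, 0.5, 1.0],
    ("DE01", "N2"): [0.25, 0.5, 0.0],
}


def absolute_feed_in(normalised_series, capacities):
    """Multiply each normalised series by the capacity of its plant."""
    result = {}
    for key, series in normalised_series.items():
        if key not in capacities:
            raise KeyError(f"{key} is missing in the volatile plants table")
        result[key] = [value * capacities[key] for value in series]
    return result


print(absolute_feed_in(normalised, capacity))
```

A series without a matching entry in the capacity table raises an error, which mirrors the requirement that each combination of region and name exists in both tables.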

Electricity storages ¶

key: ‘electricity storages’, value: pandas.DataFrame()

All types of electricity storages can be defined in this table. All storage technologies (pumped hydro, batteries, compressed air, etc.) have to be entered in a general way. Each row can indicate one storage or a group of storages. It is possible to add additional columns for information purposes.

		energy content	energy inflow	charge capacity	discharge capacity	charge efficiency	discharge efficiency	loss rate
DE01	S1
	S2
DE02	S2
…	…	…	…	…	…	…	…	…

INDEX

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Name, arbitrary.

COLUMNS

energy content: float, [MWh]: The maximum energy content of a storage or a group of storages.
energy inflow: float, [MWh]: The amount of energy that will feed into the storage over the model period, for example from a river into a pumped hydroelectric energy storage.
charge capacity: float, [MW]: Maximum capacity to charge the storage or the group of storages.
discharge capacity: float, [MW]: Maximum capacity to discharge the storage or the group of storages.
charge efficiency: float, [-]: Charging efficiency of the storage or the group of storages.
discharge efficiency: float, [-]: Discharging efficiency of the storage or the group of storages.
loss rate: float, [-]: The relative loss of the energy content of the storage. For example, a loss rate of 0.01 means that the energy content of the storage will be reduced by 1% in each time step.
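The effect of the loss rate can be illustrated with a short calculation. The sketch below looks at an idle storage only, so charging, discharging and inflow are ignored; energy_content_after() is a hypothetical helper.

```python
def energy_content_after(initial_content, loss_rate, time_steps):
    """Energy content of an idle storage after a number of time steps."""
    content = initial_content
    for _ in range(time_steps):
        # In each time step the content is reduced by the relative loss.
        content *= 1 - loss_rate
    return content


# 100 MWh with a loss rate of 0.01, evaluated after one day of hourly steps.
print(round(energy_content_after(100.0, 0.01, 24), 2))
```

Because the loss applies to the current content in every step, the decay is exponential: after n steps the content is the initial content multiplied by (1 - loss rate) to the power of n.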

Power lines ¶

key: ‘power lines’, value: pandas.DataFrame()

The power lines table defines the connection between the electricity buses of each region of the scenario. There is no default connection. If no connection is defined the regions will be self-sufficient.

	capacity	efficiency
DE01-DE02
DE01-DE03
DE02-DE03
…	…	…

INDEX

Name: str: Name of the 2 connected regions separated by a dash. Define only one direction. In the model one line for each direction will be created. If both directions are defined in the table two lines for each direction will be created for the model, so that the capacity will be the sum of both lines.

COLUMNS

capacity: float, [MW]: The maximum transmission capacity of the power lines.
efficiency: float, [-]: The transmission efficiency of the power line.
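The rule for the line directions can be sketched as follows. Each table entry creates one line per direction, so defining both directions doubles the lines and the capacities add up. The helper directed_capacities() is hypothetical and only illustrates the bookkeeping, not the way deflex builds the model.

```python
from collections import defaultdict


def directed_capacities(lines):
    """Map line names such as 'DE01-DE02' to capacities per direction in MW."""
    capacities = defaultdict(float)
    for name, capacity in lines.items():
        start, end = name.split("-")
        # One line is created for each direction of every table entry.
        capacities[(start, end)] += capacity
        capacities[(end, start)] += capacity
    return dict(capacities)


# Only one direction defined: 1000 MW in both directions.
print(directed_capacities({"DE01-DE02": 1000.0}))
# Both directions defined: the capacities of both entries are summed.
print(directed_capacities({"DE01-DE02": 1000.0, "DE02-DE01": 500.0}))
```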

Heating sector (optional)¶

Heat demand series
Decentralised heat
Chp - heat plants

Heat demand series ¶

key: ‘heat demand series’, value: pandas.DataFrame()

The heat demand can be entered regionally under DEXX or supra-regionally under DE. The only type of demand that must be entered regionally is district heating. As a recommendation, coal, gas and oil demands should be treated supra-regionally.

	DE01		DE02				DE
	district heating	N1	district heating	N1	N2	…	N3	N4	N5
Time step 1
Time step 2
…	…	…	…	…	…	…	…	…	…

INDEX

time step: int: Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str: Region (e.g. DE01, DE02 or DE).
level 1: str: Name. Specification of the series e.g. district heating, coal, gas. Except for district heating each combination of region and name must exist in the decentralised heat table.

Decentralised heat ¶

key: ‘decentralised heat’, value: pandas.DataFrame()

This sheet covers all heating technologies that are used to generate decentralised heat. In this context decentralised does not mean regional; it refers to the large group of independent heating systems. If there is no specific reason to define a heating system regionally, it should be defined supra-regionally.

		efficiency	source	source region
DE01	N1			DE01
DE02	N1			DE02
DE02	N2			DE02
	…			…
DE	N3			DE
	N4			DE
	N5			DE

INDEX

level 0: str: Region (e.g. DE01, DE02 or DE).
level 1: str: Name, arbitrary.

COLUMNS

efficiency: float, [-]: The efficiency of the heating technology.
source: str, [-]: The source that the heating technology uses. Examples are coal, oil for commodities, but it could also be electricity in case of a heat pump. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
source region: str: The region where the source comes from (see source).

Chp - heat plants ¶

key: ‘chp-heat plants’, value: pandas.DataFrame()

This sheet covers CHP and heat plants. Each plant will feed into the district heating bus of the region in which it is located. The demand of district heating is defined in the heat demand series table with the name district heating. All plants of the same region with the same fuel can be defined in one row, but it is also possible to divide them by additional categories such as efficiency.

		limit heat chp	capacity heat chp	capacity elec chp	limit hp	capacity hp	efficiency hp	efficiency heat chp	efficiency elec chp	fuel	source region
DE01	N1										DE01
	N3										DE
	N4										DE
DE02	N1										DE02
	N2										DE02
	N3										DE
	N4										DE
	N5										DE
…	…	…	…	…	…	…	…	…	…	…	…

INDEX

level 0: str: Region (e.g. DE01, DE02).
level 1: str: Name, arbitrary.

COLUMNS

limit heat chp: float, [MWh]: The absolute maximum limit of heat produced by chp within the whole modeling period.
capacity heat chp: float, [MW]: The installed heat capacity of all chp plants of the same group in the region.
capacity elec chp: float, [MW]: The installed electricity capacity of all chp plants of the same group in the region.
limit hp: float, [MWh]: The absolute maximum limit of heat produced by the heat plant within the whole modeling period.
capacity hp: float, [MW]: The installed heat capacity of all heat plants of the same group in the region.
efficiency hp: float, [-]: The average overall efficiency of the heat plant.
efficiency heat chp: float, [-]: The average overall heat efficiency of the chp.
efficiency elec chp: float, [-]: The average overall electricity efficiency of the chp.
fuel: str, [-]: The used fuel of the plants. The fuel name must be equal to the fuel type of the commodity sources. The combination of fuel and source region has to exist in the commodity sources table.
source region: str, [-]: The source region of the fuel source. Typically this is the region of the index or DE if it is a global commodity source.

Mobility sector (optional)¶

Mobility demand series
Mobility

Mobility demand series ¶

key: ‘mobility series’, value: pandas.DataFrame()

The mobility demand can be entered regionally or supra-regionally. However, it is recommended to define the mobility demand supra-regionally except for electricity. The demand for electric mobility has to be defined regionally because it will be connected to the electricity bus of each region. The combination of region and name has to exist in the mobility table.

	DE01	DE02	…	DE
	electricity	electricity		N1
Time step 1
Time step 2
…	…	…	…	…

INDEX

time step: int: Number of time step. Must be uniform in all series tables.

COLUMNS

unit: [MW]

level 0: str: Region (e.g. DE01, DE02 or DE).
level 1: str: Specification of the series e.g. “electricity” for each region or “diesel”, “petrol” for DE.

Mobility ¶

key: ‘mobility’, value: pandas.DataFrame()

This sheet covers the technologies of the mobility sector.

		efficiency	source	source region
DE01	electricity		electricity	DE01
DE02	electricity		electricity	DE02
…
DE	N1		oil/biofuel/H2/etc	DE

INDEX

level 0: str: Region (e.g. DE01, DE02 or DE).
level 1: str: Name, arbitrary.

COLUMNS

efficiency: float, [-]: The efficiency of the fuel production. If a diesel demand is defined in the mobility demand series table the efficiency represents the efficiency of diesel production from the commodity source e.g. oil. For a biofuel demand the efficiency of the production of biofuel from biomass has to be defined.
source: str, [-]: The source that the technology uses. Except for electricity the combination of source and source region has to exist in the commodity sources table. The electricity source will be connected to the electricity bus of the region defined in source region.
source region: str, [-]: The region where the source comes from.