Time Domain Reduction (TDR)

GenX.RemoveConstColsFunction
RemoveConstCols(all_profiles, all_col_names)

Remove and store the columns that do not vary during the period.

source
GenX.check_conditionMethod
check_condition(Threshold, R, OldColNames, ScalingMethod, TimestepsPerRepPeriod)

Check whether the greatest Euclidean deviation in the input data and the clustered representation is within a given proportion of the "maximum" possible deviation.

(1 for Normalization covers 100%, 4 for Standardization covers ~95%)

source
GenX.cluster_inputsFunction
cluster_inputs(inpath, settings_path, mysetup, stage_id=-99, v=false; random=true)

Use kmeans or kmedoids to cluster raw demand profiles and resource capacity factor profiles into representative periods. Use Extreme Periods to capture noteworthy periods or periods with notably poor fits.

In Demand_data.csv, include the following:

  • Timesteps_per_Rep_Period - Typically 168 timesteps (e.g., hours) per period, this designates the length of each representative period.
  • UseExtremePeriods - Either 1 or 0, this designates whether or not to include outliers (by performance or demand/resource extreme) as their own representative periods. This setting automatically includes the periods with maximum demand, minimum solar cf and minimum wind cf as extreme periods.
  • ClusterMethod - Either 'kmeans' or 'kmedoids', this designates the method used to cluster periods and determine each point's representative period.
  • ScalingMethod - Either 'N' or 'S', this designates directs the module to normalize ([0,1]) or standardize (mean 0, variance 1) the input data.
  • MinPeriods - The minimum number of periods used to represent the input data. If using UseExtremePeriods, this must be at least three. If IterativelyAddPeriods if off, this will be the total number of periods.
  • MaxPeriods - The maximum number of periods - both clustered periods and extreme periods - that may be used to represent the input data.
  • IterativelyAddPeriods - Either 1 or 0, this designates whether or not to add periods until the error threshold between input data and represented data is met or the maximum number of periods is reached.
  • Threshold - Iterative period addition will end if the period farthest (Euclidean Distance) from its representative period is within this percentage of the total possible error (for normalization) or ~95% of the total possible error (for standardization). E.g., for a threshold of 0.01, every period must be within 1% of the spread of possible error before the clustering iterations will terminate (or until the max number of periods is reached).
  • IterateMethod - Either 'cluster' or 'extreme', this designates whether to add clusters to the kmeans/kmedoids method or to set aside the worst-fitting periods as a new extreme periods.
  • nReps - The number of times to repeat each kmeans/kmedoids clustering at the same setting.
  • DemandWeight - Default 1, this is an optional multiplier on demand columns in order to prioritize better fits for demand profiles over resource capacity factor profiles.
  • WeightTotal - Default 8760, the sum to which the relative weights of representative periods will be scaled.
  • ClusterFuelPrices - Either 1 or 0, this indicates whether or not to use the fuel price time series in Fuels_data.csv in the clustering process. If 'no', this function will still write Fuels_data_clustered.csv with reshaped fuel prices based on the number and size of the representative weeks, assuming a constant time series of fuel prices with length equal to the number of timesteps in the raw input data.
  • MultiStageConcatenate - (Only considered if MultiStage = 1 in genx_settings.yml) If 1, this designates that the model should time domain reduce the input data of all model stages together. Else if 0, [still in development] the model will time domain reduce only the first stage and will apply the periods of each other model stage to this set of representative periods by closest Eucliden distance.

For co-located VRE-STOR resources, all capacity factors must be in the Generatorsvariability.csv file in addition to separate Vreandstorsolarvariability.csv and Vreandstorwind_variability.csv files. The co-located solar PV and wind profiles for co-located resources will be separated into different CSV files to be read by loading the inputs after the clustering of the inputs has occurred.

source
GenX.get_absolute_extremeMethod
get_absolute_extreme(DF, statKey, col_names, ConstCols)

Get the period index of the single timestep with the minimum or maximum demand or capacity factor.

source
GenX.get_demand_multipliersFunction
get_demand_multipliers(ClusterOutputData, ModifiedData, M, W, DemandCols, TimestepsPerRepPeriod, NewColNames, NClusters, Ncols)

Get multipliers to linearly scale clustered demand profiles L zone-wise such that their weighted sum equals the original zonal total demand. Scale demand profiles later using these multipliers in order to ensure that a copy of the original demand is kept for validation.

Find $k_z$ such that:

\[\sum_{i \in I} L_{i,z} = \sum_{t \in T, m \in M} C_{t,m,z} \cdot \frac{w_m}{T} \cdot k_z \: \: \: \forall z \in Z\]

where $Z$ is the set of zones, $I$ is the full time domain, $T$ is the length of one period (e.g., 168 for one week in hours), $M$ is the set of representative periods, $L_{i,z}$ is the original zonal demand profile over time (hour) index $i$, $C_{i,m,z}$ is the demand in timestep $i$ for representative period $m$ in zone $z$, $w_m$ is the weight of the representative period equal to the total number of hours that one hour in representative period $m$ represents in the original profile, and $k_z$ is the zonal demand multiplier returned by the function.

source
GenX.get_extreme_periodFunction
get_extreme_period(DF, GDF, profKey, typeKey, statKey,
   ConstCols, demand_col_names, solar_col_names, wind_col_names)

Identify extreme week by specification of profile type (Demand, PV, Wind), measurement type (absolute (timestep with min/max value) vs. integral (period with min/max summed value)), and statistic (minimum or maximum). I.e., the user could want the hour with the most demand across the whole system to be included among the extreme periods. They would select "Demand", "System, "Absolute, and "Max".

source
GenX.get_integral_extremeMethod
get_integral_extreme(GDF, statKey, col_names, ConstCols)

Get the period index with the minimum or maximum demand or capacity factor summed over the period.

source
GenX.get_worst_period_idxMethod
get_worst_period_idx(R)

Get the index of the period that is farthest from its representative period by Euclidean distance.

source
GenX.parse_dataMethod
parse_data(myinputs)

Get demand, solar, wind, and other curves from the input data.

source
GenX.rmse_scoreMethod
rmse_score(y_true, y_pred)

Calculates Root Mean Square Error.

\[RMSE = \sqrt{\frac{1}{n}\Sigma_{i=1}^{n}{\Big(\frac{d_i -f_i}{\sigma_i}\Big)^2}}\]

source
GenX.scale_weightsFunction
scale_weights(W, H)

Linearly scale weights W such that they sum to the desired number of timesteps (hours) H.

\[w_j \leftarrow H \cdot \frac{w_j}{\sum_i w_i} \: \: \: \forall w_j \in W\]

source
GenX.run_timedomainreduction!Function

Run the GenX time domain reduction on the given case folder

case - folder for the case stage_id - possibly something to do with MultiStage verbose - print extra outputs

This function overwrites the time-domain-reduced inputs if they already exist.

source
GenX.full_time_series_reconstructionFunction
full_time_series_reconstruction(path::AbstractString, setup::Dict, DF::DataFrame)

Internal function for performing the reconstruction. This function returns a DataFrame with the full series reconstruction.

Arguments

  • path (AbstractString): Path input to the results folder
  • setup (Dict): Case setup
  • DF (DataFrame): DataFrame to be reconstructed

Returns

  • reconDF (DataFrame): DataFrame with the full series reconstruction
source