Product Manuals

These manuals are designed to help you use the OpenEEmeter and associated products.

Desktop OpenEEmeter CLI

The eemeter package contains a set of routines to load energy and project data, enrich it with weather data, and estimate savings realized from a given project. Many users of eemeter take advantage of the datastore application--which calls eemeter internally--to store energy data and execute savings calculations. However, eemeter can be run on its own from the command line using properly-formatted CSV files as inputs. This is a useful way of testing and validating the code, and for running one-off analyses of archival data.

This document describes the basic eemeter command line tool provided as part of the package. The command line interface is implemented in eemeter/cli.py and is intended to serve as a template that analysts can modify to support their specific use case.

Input data format

Two input files are required:

  • projects.csv contains project metadata for all projects, and includes the following columns:
    • project_id, a string that serves as a unique identifier.
    • zipcode, a 5-digit ZIP code for the project site used to select a weather station.
    • project_start, a timestamp in the form YYYY-MM-DD HH:MM:SS specifying the date at which work started at the site (and thus the end date of the baseline period).
    • project_end, a timestamp in the same form specifying the date at which work ended at the site (and thus the beginning of the reporting period).
  • traces.csv contains the metered gas and/or electric usage for all projects, and includes the following columns:
    • start, a timestamp in the form YYYY-MM-DD HH:MM:SS that identifies the beginning of the measurement period.
    • value, the metered usage for the measurement period, assumed to be unestimated. Electric usage should be provided in kWh, and gas usage in Therms.
    • project_id identifies the project, which should be one of those included in projects.csv.
    • interpretation identifies the usage type, which should be “gas” for natural gas and “electricity” for electricity.

Note:
The eemeter package can handle projects which cannot be fully specified in this simplified format; for example, it supports projects with more than one trace per project per interpretation, and can handle estimated meter reads. If this functionality is needed, users should refer to the documentation to modify the example cli.py as needed, or make use of datastore.

Sample data

For purposes of testing, a sample data set is provided in eemeter/sample_data. The sample data set contains a single project undertaken at a single site, and includes two-year-long hourly electricity and natural gas “traces” (energy usage timestreams) synthesized from typical usage patterns provided by the Department of Energy.
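
As an illustration of the projects.csv and traces.csv layout, the following minimal Python sketch writes a tiny pair of input files. The column names come from the format description; the column order, the ZIP code, and every timestamp and usage value are assumptions made up for illustration and are not taken from the bundled sample data.

```python
import csv
import os
import tempfile

# Column names are taken from the format description; the column order and
# every value below are illustrative assumptions, not real project data.
PROJECT_COLUMNS = ["project_id", "zipcode", "project_start", "project_end"]
TRACE_COLUMNS = ["start", "value", "project_id", "interpretation"]

projects = [
    ["ABC", "91104", "2013-01-01 00:00:00", "2013-02-01 00:00:00"],
]
traces = [
    ["2012-12-31 22:00:00", "1.2", "ABC", "electricity"],   # kWh
    ["2012-12-31 23:00:00", "1.1", "ABC", "electricity"],   # kWh
    ["2012-12-31 22:00:00", "0.05", "ABC", "gas"],          # therms
]


def write_csv(path, header, rows):
    """Write a header row followed by data rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


# Write both files into a scratch directory.
example_dir = tempfile.mkdtemp()
write_csv(os.path.join(example_dir, "projects.csv"), PROJECT_COLUMNS, projects)
write_csv(os.path.join(example_dir, "traces.csv"), TRACE_COLUMNS, traces)
```

A directory produced this way has the shape that the command line tool expects, although three records are far too few for a meaningful savings estimate.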

Running eemeter

Note:
The eemeter core library and CLI tool are installed as a Python package. If you do not have an existing Python installation, we recommend downloading and installing the free, open-source Anaconda Python distribution before proceeding with eemeter installation.

If you have an existing installation and would prefer not to alter it, you may wish to isolate your Python environment. This is not strictly necessary, but can be accomplished using either a virtualenv or a conda environment.

Installation

Start by installing the eemeter package:

$ pip install eemeter

You can verify that it has been installed successfully by starting an interactive python session and running:

>>> import eemeter; eemeter.get_version()
'0.5.13'

Running with sample data

From the shell, type:

$ eemeter sample

(The eemeter sample command is a shortcut for eemeter analyze sample_data/.)

Outputs

You should see output that looks like this:

$ eemeter sample
Running a meter for ABC NATURAL_GAS_CONSUMPTION_SUPPLIED
Normal year savings estimate:
  1216.948071
  68% confidence interval: (1163.898845, 1269.997296)
Reporting period savings estimate:
  1170.606076
  68% confidence interval: (1117.637227, 1223.574925)

Running a meter for ABC ELECTRICITY_CONSUMPTION_SUPPLIED
Normal year savings estimate:
  10465.511272
  68% confidence interval: (10297.508645, 10633.513899)
Reporting period savings estimate:
  10394.802426
  68% confidence interval: (10227.045114, 10562.559737)

These outputs correspond to the annualized weather normalized savings (i.e., the baseline period model applied to the site’s TMY3 normal year weather, minus the reporting model applied to the site’s TMY3 normal year weather) and to the realized savings over the reporting year (i.e., the baseline period model applied to the site’s reporting period observed weather, minus the observed reporting period usage).
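
The two definitions above can be made concrete with a toy calculation. This is only a sketch of the arithmetic: the two model functions, their coefficients, the temperatures, and the observed usage are all hypothetical stand-ins, and eemeter's fitted models are considerably more involved.

```python
# Hypothetical fitted models: daily usage as a function of outdoor
# temperature (degrees F), with heating demand below a 65 F balance point.
def baseline_model(temp_f):
    return 30.0 + 0.5 * max(65.0 - temp_f, 0.0)


def reporting_model(temp_f):
    return 25.0 + 0.4 * max(65.0 - temp_f, 0.0)


# Stand-in for TMY3 normal year weather (four "days" instead of 365).
normal_year_temps = [45.0, 55.0, 65.0, 75.0]

# Stand-ins for observed weather and observed usage in the reporting period.
reporting_temps = [40.0, 50.0, 70.0]
reporting_observed_usage = [34.0, 31.5, 24.0]

# Normal year savings: baseline model minus reporting model, both evaluated
# on the same normal year weather.
normal_year_savings = sum(
    baseline_model(t) - reporting_model(t) for t in normal_year_temps
)

# Reporting period savings: baseline model evaluated on observed reporting
# period weather, minus the usage actually observed in that period.
reporting_period_savings = (
    sum(baseline_model(t) for t in reporting_temps)
    - sum(reporting_observed_usage)
)
```

The first quantity compares two models under identical weather, so it does not depend on the weather that actually occurred; the second compares a counterfactual prediction with metered usage.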

In addition to these two cumulative savings estimates, many additional outputs can be reported by eemeter, including daily time series and best-fit model parameters. To see a full menu of options, refer to the documentation, or inspect the meter_output dictionary returned from the _analyze() routine:

>>> from eemeter import cli
>>> retval = cli._analyze('sample_data')

Pay particular attention to the list of derivatives, i.e. derived outputs from the meter estimation:

>>> print([i['series'] for i in retval[0]['derivatives']])
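
When more than one trace is analyzed, it can help to collect the distinct series names across all meter outputs. The sketch below uses a hand-made stand-in for the returned list: only the 'derivatives' key and each derivative's 'series' key are taken from the snippet above, and the series names shown are hypothetical placeholders rather than real eemeter output.

```python
# Hand-made stand-in for the list returned by cli._analyze(). The series
# names are placeholders; real outputs contain additional keys.
retval = [
    {"derivatives": [{"series": "series_a"}, {"series": "series_b"}]},
    {"derivatives": [{"series": "series_a"}]},
]


def series_names(meter_outputs):
    """Return the sorted, de-duplicated derivative series names."""
    return sorted(
        {d["series"] for output in meter_outputs for d in output["derivatives"]}
    )


names = series_names(retval)
```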

Running with your own data

From the shell, type:

$ eemeter analyze <directory_name>

The directory name should point to a directory with trace and project data in CSV files formatted according to the description above. It may also be helpful to inspect the sample data.
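
Before running an analysis on your own directory, a quick consistency check can catch formatting problems early. The following is a minimal sketch, not part of eemeter: the function names are hypothetical, and the checks simply mirror the column descriptions in the input data format section (required columns, timestamp form, known project_id, and allowed interpretation values).

```python
import csv
import os
import tempfile
from datetime import datetime

REQUIRED_COLUMNS = {
    "projects.csv": {"project_id", "zipcode", "project_start", "project_end"},
    "traces.csv": {"start", "value", "project_id", "interpretation"},
}
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def read_table(directory, filename):
    """Return (set of column names, list of row dicts) for one CSV file."""
    with open(os.path.join(directory, filename), newline="") as f:
        reader = csv.DictReader(f)
        columns = set(reader.fieldnames or [])
        return columns, list(reader)


def is_timestamp(text):
    """True if text parses as YYYY-MM-DD HH:MM:SS."""
    try:
        datetime.strptime(text, TIMESTAMP_FORMAT)
        return True
    except (TypeError, ValueError):
        return False


def find_problems(directory):
    """Return a list of readable problems; an empty list means none found."""
    problems, tables = [], {}
    for filename, required in REQUIRED_COLUMNS.items():
        columns, rows = read_table(directory, filename)
        missing = sorted(required - columns)
        if missing:
            problems.append("%s: missing columns %s" % (filename, missing))
        tables[filename] = rows
    if problems:
        return problems  # row-level checks need all columns present
    project_ids = set()
    for row in tables["projects.csv"]:
        project_ids.add(row["project_id"])
        for column in ("project_start", "project_end"):
            if not is_timestamp(row[column]):
                problems.append("projects.csv: bad %s %r" % (column, row[column]))
    for row in tables["traces.csv"]:
        if row["project_id"] not in project_ids:
            problems.append("traces.csv: unknown project_id %r" % row["project_id"])
        if row["interpretation"] not in ("gas", "electricity"):
            problems.append("traces.csv: bad interpretation %r" % row["interpretation"])
        if not is_timestamp(row["start"]):
            problems.append("traces.csv: bad start %r" % row["start"])
    return problems


# Tiny demonstration with one deliberately broken trace row (made-up values).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "projects.csv"), "w", newline="") as f:
    f.write("project_id,zipcode,project_start,project_end\n"
            "ABC,91104,2013-01-01 00:00:00,2013-02-01 00:00:00\n")
with open(os.path.join(demo_dir, "traces.csv"), "w", newline="") as f:
    f.write("start,value,project_id,interpretation\n"
            "2012-12-31 23:00:00,1.1,ABC,electricity\n"
            "2012-12-31 23:00:00,0.05,XYZ,steam\n")
demo_problems = find_problems(demo_dir)
```

In the demonstration, the second trace row is reported twice: once for a project_id that is absent from projects.csv and once for an interpretation other than "gas" or "electricity".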

Enterprise charting backend

Enterprise deployments of the OpenEEmeter provide a data warehouse organized to make querying and chart-making as simple as possible. This manual describes the structure of these tables and how they are used to create new charts or analyses.

Data warehouse tables: a quick tour

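The data warehouse consists of a set of tables organized roughly by object type. They are:

  • Trace Summary Mart
  • Project Summary Mart
  • Meter Result Mart
  • Derivative Summary Mart
  • Derivative Record Mart
  • County Info
  • ZCTA Info

The columns of these tables are as follows: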

Trace Summary Mart

The Trace Summary Mart contains data about loaded traces.

Column name | Column description
trace_id | A unique trace identifier.
interpretation | Type of values stored in trace. E.g., ELECTRICITY_CONSUMPTION_SUPPLIED.
n_records | The number of raw records in the trace.
n_records_start_date_duplicate | The number of records in the trace with duplicate start dates.
n_records_value_nan | The number of records in the trace with NaN values.
n_records_value_null | The number of records in the trace with null values.
n_records_estimated_true | The number of records in the trace marked as estimated.
n_records_estimated_false | The number of records in the trace marked as not estimated.
n_unique_intervals | The number of unique intervals (between start dates) in the trace.
unique_interval_descriptions | A comma-separated list of strings representing unique intervals in the trace.
start_date_min | The minimum record start date/time in the trace.
start_date_max | The maximum record start date/time in the trace.
value_min | The minimum record value in the trace.
value_mean | The mean of record values in the trace.
value_max | The maximum record value in the trace.
value_sum | The sum of record values in the trace.
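
To make the duplicate and interval columns concrete, here is a minimal sketch that derives several of them from raw (start, value) records. It illustrates the column meanings only and is not the warehouse implementation; in particular, counting every record whose start date appears more than once, and rendering intervals with Python's default timedelta string form, are assumptions.

```python
from collections import Counter
from datetime import datetime


def summarize_trace(records):
    """records: list of (start datetime, numeric value) pairs for one trace."""
    starts = [start for start, _ in records]
    values = [value for _, value in records]
    start_counts = Counter(starts)
    unique_starts = sorted(start_counts)
    # Gaps between consecutive distinct start dates, de-duplicated.
    intervals = sorted({b - a for a, b in zip(unique_starts, unique_starts[1:])})
    return {
        "n_records": len(records),
        # Assumption: count every record that shares its start date.
        "n_records_start_date_duplicate": sum(
            c for c in start_counts.values() if c > 1
        ),
        "n_unique_intervals": len(intervals),
        "unique_interval_descriptions": ",".join(str(i) for i in intervals),
        "start_date_min": min(starts),
        "start_date_max": max(starts),
        "value_min": min(values),
        "value_mean": sum(values) / len(values),
        "value_max": max(values),
        "value_sum": sum(values),
    }


# Made-up hourly records with one duplicated start and one two-hour gap.
example_records = [
    (datetime(2016, 1, 1, 0), 1.0),
    (datetime(2016, 1, 1, 1), 2.0),
    (datetime(2016, 1, 1, 2), 3.0),
    (datetime(2016, 1, 1, 2), 3.0),
    (datetime(2016, 1, 1, 4), 5.0),
]
trace_summary = summarize_trace(example_records)
```

A trace with more than one unique interval, as in this example, often signals missing reads or mixed sampling frequencies and is worth inspecting before trusting model results.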

Project Summary Mart

The Project Summary Mart contains savings numbers grouped by trace interpretation and project. It also contains custom project metadata fields as requested.

Column name | Column description
project_id | A unique project identifier.
electricity_savings_kwh | Total weather-normalized annual savings across all electricity traces for this project.
natural_gas_savings_thm | Total weather-normalized annual savings across all natural gas traces for this project.
net_solar_savings_kwh | Total weather-normalized annual net metered solar production decrease across all solar traces for this project.
electricity_savings_std_kwh | Standard deviation of weather-normalized annual savings across all electricity traces for this project.
natural_gas_savings_std_thm | Standard deviation of weather-normalized annual savings across all natural gas traces for this project.
net_solar_savings_std_kwh | Standard deviation of weather-normalized annual net metered solar production increase across all solar traces for this project.
electricity_baseline_kwh | Total weather-normalized baseline usage across all electricity traces for this project.
electricity_reporting_kwh | Total weather-normalized reporting usage across all electricity traces for this project.
natural_gas_baseline_thm | Total weather-normalized baseline usage across all natural gas traces for this project.
natural_gas_reporting_thm | Total weather-normalized reporting usage across all natural gas traces for this project.
net_solar_baseline_kwh | Total weather-normalized baseline net metered production across all solar traces for this project.
net_solar_reporting_kwh | Total weather-normalized reporting net metered production across all solar traces for this project.
electricity_reporting_std_kwh | Standard deviation of weather-normalized reporting usage across all electricity traces for this project.
electricity_baseline_std_kwh | Standard deviation of weather-normalized baseline usage across all electricity traces for this project.
natural_gas_reporting_std_thm | Standard deviation of weather-normalized reporting usage across all natural gas traces for this project.
natural_gas_baseline_std_thm | Standard deviation of weather-normalized baseline usage across all natural gas traces for this project.
net_solar_baseline_std_kwh | Standard deviation of weather-normalized baseline net metered production across all solar traces for this project.
net_solar_reporting_std_kwh | Standard deviation of weather-normalized reporting net metered production across all solar traces for this project.
electricity_trace_count | The number of electricity traces associated with this project.
natural_gas_trace_count | The number of natural gas traces associated with this project.
net_solar_trace_count | The number of solar traces associated with this project.
electricity_trace_ids | A comma-separated list of electricity trace trace_ids associated with this project.
natural_gas_trace_ids | A comma-separated list of natural gas trace trace_ids associated with this project.
net_solar_trace_ids | A comma-separated list of solar trace trace_ids associated with this project.
electricity_trace_count_baseline_success_reporting_success | Count of electricity traces with model success in baseline and reporting periods for this project.
natural_gas_trace_count_baseline_success_reporting_success | Count of natural gas traces with model success in baseline and reporting periods for this project.
net_solar_trace_count_baseline_success_reporting_success | Count of solar traces with model success in baseline and reporting periods for this project.
electricity_trace_ids_baseline_success_reporting_success | A comma-separated list of electricity trace trace_ids with model success in baseline and reporting periods for this project.
natural_gas_trace_ids_baseline_success_reporting_success | A comma-separated list of natural gas trace trace_ids with model success in baseline and reporting periods for this project.
net_solar_trace_ids_baseline_success_reporting_success | A comma-separated list of solar trace trace_ids with model success in baseline and reporting periods for this project.
electricity_trace_count_baseline_failure_reporting_success | Count of electricity traces with model failure in baseline and model success in reporting periods for this project.
natural_gas_trace_count_baseline_failure_reporting_success | Count of natural gas traces with model failure in baseline and model success in reporting periods for this project.
net_solar_trace_count_baseline_failure_reporting_success | Count of solar traces with model failure in baseline and model success in reporting periods for this project.
electricity_trace_ids_baseline_failure_reporting_success | A comma-separated list of electricity trace trace_ids with model failure in baseline and model success in reporting periods for this project.
natural_gas_trace_ids_baseline_failure_reporting_success | A comma-separated list of natural gas trace trace_ids with model failure in baseline and model success in reporting periods for this project.
net_solar_trace_ids_baseline_failure_reporting_success | A comma-separated list of solar trace trace_ids with model failure in baseline and model success in reporting periods for this project.
electricity_trace_count_baseline_success_reporting_failure | Count of electricity traces with model success in baseline and model failure in reporting periods for this project.
natural_gas_trace_count_baseline_success_reporting_failure | Count of natural gas traces with model success in baseline and model failure in reporting periods for this project.
net_solar_trace_count_baseline_success_reporting_failure | Count of solar traces with model success in baseline and model failure in reporting periods for this project.
electricity_trace_ids_baseline_success_reporting_failure | A comma-separated list of electricity trace trace_ids with model success in baseline and model failure in reporting periods for this project.
natural_gas_trace_ids_baseline_success_reporting_failure | A comma-separated list of natural gas trace trace_ids with model success in baseline and model failure in reporting periods for this project.
net_solar_trace_ids_baseline_success_reporting_failure | A comma-separated list of solar trace trace_ids with model success in baseline and model failure in reporting periods for this project.
electricity_trace_count_baseline_failure_reporting_failure | Count of electricity traces with model failure in baseline and reporting periods for this project.
natural_gas_trace_count_baseline_failure_reporting_failure | Count of natural gas traces with model failure in baseline and reporting periods for this project.
net_solar_trace_count_baseline_failure_reporting_failure | Count of solar traces with model failure in baseline and reporting periods for this project.
electricity_trace_ids_baseline_failure_reporting_failure | A comma-separated list of electricity trace trace_ids with model failure in baseline and reporting periods for this project.
natural_gas_trace_ids_baseline_failure_reporting_failure | A comma-separated list of natural gas trace trace_ids with model failure in baseline and reporting periods for this project.
net_solar_trace_ids_baseline_failure_reporting_failure | A comma-separated list of solar trace trace_ids with model failure in baseline and reporting periods for this project.
project_start_date | Latest known pre-project date.
project_end_date | Earliest known post-project date.
project_zipcode | Project ZIP code.
project_climate_zone | Project climate zone.
project_weather_station_usaf_id | Project weather station ID (USAF six-digit code).
project_weather_normal_station_usaf_id | Project weather normal station ID (TMY3, USAF six-digit code).
project_county_fips | FIPS county code for the project.
[... custom project metadata fields] | Any custom project information as requested and supplied by user.
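
The trace count and trace ID columns follow a regular naming pattern over the four baseline/reporting status combinations. The sketch below shows one way such columns could be derived from per-trace model statuses for a single project and a single interpretation. It is an illustration under assumptions: the function name is hypothetical, the input rows borrow the trace_id, baseline_model_status, and reporting_model_status column names from the Meter Result Mart, and the actual warehouse build process is not described in this manual.

```python
def status_columns(prefix, meter_results):
    """Build count and ID-list columns for the four status combinations.

    prefix: interpretation prefix such as "electricity" or "natural_gas".
    meter_results: list of dicts with trace_id, baseline_model_status and
    reporting_model_status (values "SUCCESS" or "FAILURE").
    """
    columns = {}
    for baseline in ("success", "failure"):
        for reporting in ("success", "failure"):
            trace_ids = [
                row["trace_id"]
                for row in meter_results
                if row["baseline_model_status"] == baseline.upper()
                and row["reporting_model_status"] == reporting.upper()
            ]
            suffix = "baseline_%s_reporting_%s" % (baseline, reporting)
            columns["%s_trace_count_%s" % (prefix, suffix)] = len(trace_ids)
            columns["%s_trace_ids_%s" % (prefix, suffix)] = ",".join(trace_ids)
    return columns


# Made-up meter results for three electricity traces of one project.
example_results = [
    {"trace_id": "t1", "baseline_model_status": "SUCCESS",
     "reporting_model_status": "SUCCESS"},
    {"trace_id": "t2", "baseline_model_status": "SUCCESS",
     "reporting_model_status": "FAILURE"},
    {"trace_id": "t3", "baseline_model_status": "SUCCESS",
     "reporting_model_status": "SUCCESS"},
]
electricity_columns = status_columns("electricity", example_results)
```

For any project, the four count columns of one interpretation should add up to that interpretation's trace count whenever every trace has exactly one meter result, which gives a simple sanity check when charting model failure rates.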

Meter Result Mart

The Meter Result Mart contains data about model parameters, associated data, model fits, and data sufficiency requirements.

Column name | Column description
trace_id | The unique identifier of the trace data used in this meter run.
project_id | The unique identifier of the project data used in this meter run.
directive_label | A label for this meter run batch, null if blank.
digest | A hex digest of meter declaration settings.
meter_input_url | The cloud storage URL of serialized meter input JSON.
meter_output_url | The cloud storage URL of serialized meter output JSON.
eemeter_version | The version number for the eemeter software package used for this meter run.
datastore_version | The version number for the datastore application used for this meter run.
model_class | The model class used in the meter run.
model_kwargs | The keyword arguments used in model class initialization for this meter run.
formatter_class | The formatter class used in the meter run.
formatter_kwargs | The keyword arguments used in the formatter class initialization for this meter run.
weather_source_station | The six-digit USAF ID for the weather station from which temperature data was gathered for this meter run.
weather_normal_source_station | The six-digit USAF ID for the TMY3 weather station from which temperature data was gathered for this meter run.
baseline_model_status | The SUCCESS or FAILURE status of the baseline period modeling.
baseline_model_traceback | The error traceback, if any, from an exception raised during the baseline period modeling.
baseline_data_start_date | The start date of the earliest data point in the baseline period.
baseline_data_end_date | The start date of the latest data point in the baseline period.
baseline_period_end_date | The latest known pre-project date, which marks the end of the baseline period.
baseline_n_rows | The number of data points used during the baseline model fitting step.
baseline_r2 | The R-squared value of the baseline model fit.
baseline_cvrmse | The coefficient of variation of root mean squared error of the baseline model fit.
baseline_rmse | The root mean squared error of the baseline model fit.
reporting_model_status | The SUCCESS or FAILURE status of the reporting period modeling.
reporting_model_traceback | The error traceback, if any, from an exception raised during the reporting period modeling.
reporting_data_start_date | The start date of the earliest data point in the reporting period.
reporting_data_end_date | The start date of the latest data point in the reporting period.
reporting_period_start_date | The earliest known post-project date, which marks the start of the reporting period.
reporting_n_rows | The number of data points used during the reporting model fitting step.
reporting_r2 | The R-squared value of the reporting model fit.
reporting_cvrmse | The coefficient of variation of root mean squared error of the reporting model fit.
reporting_rmse | The root mean squared error of the reporting model fit.
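
For readers unfamiliar with the fit statistics, the following sketch computes R-squared, RMSE, and CVRMSE using their common textbook definitions. This is an assumption about what the columns represent: eemeter's own calculation may differ in details such as degrees-of-freedom adjustments or whether CVRMSE is expressed as a fraction or a percentage, and the example numbers are made up.

```python
import math


def fit_metrics(observed, predicted):
    """Textbook R-squared, RMSE and CVRMSE for paired observations."""
    n = len(observed)
    mean_observed = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        # RMSE normalized by mean observed usage, as a fraction.
        "cvrmse": rmse / mean_observed,
    }


# Made-up daily usage and model predictions.
metrics = fit_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

Because CVRMSE is normalized by mean usage, it allows fits to be compared across traces of very different magnitude, which RMSE alone does not.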

Derivative Summary Mart

The Derivative Summary Mart contains data about official time-series OpenEEmeter outputs (called "derivatives") for each meter run.

Column name | Column description
meter_declaration_digest | The hex digest of the meter declaration inputs in which this derivative series was computed.
series_name | The series name for the meter declaration input.
derivative_key | A surrogate key identifying this derivative series.
n_records | The number of records in this derivative series.
n_records_orderable_duplicate | The number of records in this derivative series with a duplicate orderable value.
n_records_value_nan | The number of records in this derivative series with a NaN value.
n_records_value_null | The number of records in this derivative series with a null value.
n_records_value_pos_inf | The number of records in this derivative series with a positive infinity value.
n_records_value_neg_inf | The number of records in this derivative series with a negative infinity value.
n_records_variance_nan | The number of records in this derivative series with a NaN variance.
n_records_variance_null | The number of records in this derivative series with a null variance.
n_records_variance_pos_inf | The number of records in this derivative series with a positive infinity variance.
n_records_variance_neg_inf | The number of records in this derivative series with a negative infinity variance.
orderable_min | The minimum orderable in the derivative series.
orderable_max | The maximum orderable in the derivative series.
value_min | The minimum value in the derivative series.
value_mean | The mean value of the derivative series.
value_median | The median value of the derivative series.
value_max | The maximum value in the derivative series.
value_sum | The sum of values in the derivative series.
variance_min | The minimum variance in the derivative series.
variance_mean | The mean variance of the derivative series.
variance_median | The median variance of the derivative series.
variance_max | The maximum variance in the derivative series.
variance_sum | The sum of variances in the derivative series.
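
The null, NaN, and infinity counts distinguish four different kinds of unusable record. The sketch below shows how such counts could be computed for the value column of one series. It is an illustration only: the function name is hypothetical, and computing the min, mean, median, max, and sum over finite values alone is an assumption, since this manual does not state how the warehouse treats non-finite records in those statistics.

```python
import math
import statistics


def summarize_values(values):
    """values: list of floats, possibly containing None, NaN or infinities."""
    present = [v for v in values if v is not None]
    finite = [v for v in present if math.isfinite(v)]
    return {
        "n_records": len(values),
        "n_records_value_null": len(values) - len(present),
        "n_records_value_nan": sum(1 for v in present if math.isnan(v)),
        "n_records_value_pos_inf": sum(1 for v in present if v == math.inf),
        "n_records_value_neg_inf": sum(1 for v in present if v == -math.inf),
        # Assumption: statistics are taken over finite values only.
        "value_min": min(finite) if finite else None,
        "value_mean": statistics.mean(finite) if finite else None,
        "value_median": statistics.median(finite) if finite else None,
        "value_max": max(finite) if finite else None,
        "value_sum": sum(finite),
    }


# Made-up series containing one of each kind of unusable record.
series_summary = summarize_values(
    [1.0, None, float("nan"), math.inf, 3.0, 2.0, -math.inf]
)
```

The same function could be applied to the variance column to obtain the corresponding variance counts and statistics.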

Derivative Record Mart

The Derivative Record Mart contains the individual time-series records from OpenEEmeter derivative outputs, and references the Derivative Summary Mart, which contains metadata about each series.

Column name | Column description
derivative_key | A surrogate key referring to the Derivative Summary Mart identifying the values in a particular derivative series.
orderable | The orderable of this value; used for sorting items lexicographically.
value | The value of this derivative record.
variance | The variance of this derivative record.
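
A typical chart query joins the two derivative tables on derivative_key and sorts by orderable. The following sketch demonstrates that join against an in-memory SQLite database. The table names derivative_summary_mart and derivative_record_mart, the column types, the series names, and all row values are assumptions for illustration; the actual table names and database engine of an enterprise deployment may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table names and types; only the column names come from the
# mart descriptions, and only a subset of summary columns is created.
conn.executescript("""
    CREATE TABLE derivative_summary_mart (
        derivative_key INTEGER PRIMARY KEY,
        series_name TEXT
    );
    CREATE TABLE derivative_record_mart (
        derivative_key INTEGER,
        orderable TEXT,
        value REAL,
        variance REAL
    );
""")
conn.executemany(
    "INSERT INTO derivative_summary_mart VALUES (?, ?)",
    [(1, "series_a"), (2, "series_b")],
)
conn.executemany(
    "INSERT INTO derivative_record_mart VALUES (?, ?, ?, ?)",
    [
        (1, "2016-01-02", 2.0, 0.1),
        (1, "2016-01-01", 1.0, 0.1),
        (2, "2016-01-01", 9.0, 0.5),
    ],
)

# Fetch one series in plotting order: join on the surrogate key, filter by
# series name, and sort lexicographically by orderable.
chart_rows = conn.execute(
    """
    SELECT r.orderable, r.value, r.variance
    FROM derivative_record_mart AS r
    JOIN derivative_summary_mart AS s
      ON s.derivative_key = r.derivative_key
    WHERE s.series_name = ?
    ORDER BY r.orderable
    """,
    ("series_a",),
).fetchall()
conn.close()
```

Sorting by orderable works for dates here because ISO-style date strings sort lexicographically in chronological order.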

County Info

The County Info tables contain county-level geolocation data (for dashboard map support).

Column name | Column description
geoid | The FIPS code of the county.
name | The name of the county.
latitude | The latitude at the centroid of the county polygon.
longitude | The longitude at the centroid of the county polygon.
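
For a dashboard map, project-level figures can be aggregated by county and placed at the county centroid. The sketch below sums electricity savings per county and attaches coordinates. It assumes that project_county_fips in the Project Summary Mart and geoid in the County Info table use the same code format; the function name, county codes, names, coordinates, and savings values are all made up for illustration.

```python
from collections import defaultdict

# Made-up rows shaped like a subset of the Project Summary Mart columns.
project_summary = [
    {"project_id": "p1", "project_county_fips": "99001",
     "electricity_savings_kwh": 1200.0},
    {"project_id": "p2", "project_county_fips": "99001",
     "electricity_savings_kwh": 800.0},
    {"project_id": "p3", "project_county_fips": "99002",
     "electricity_savings_kwh": 500.0},
]

# Made-up rows shaped like the County Info table.
county_info = [
    {"geoid": "99001", "name": "Example County A",
     "latitude": 40.0, "longitude": -100.0},
    {"geoid": "99002", "name": "Example County B",
     "latitude": 41.0, "longitude": -101.0},
]


def savings_by_county(projects, counties):
    """Sum electricity savings per county and attach the county centroid."""
    totals = defaultdict(float)
    for project in projects:
        totals[project["project_county_fips"]] += project["electricity_savings_kwh"]
    by_geoid = {county["geoid"]: county for county in counties}
    return [
        {
            "name": by_geoid[geoid]["name"],
            "latitude": by_geoid[geoid]["latitude"],
            "longitude": by_geoid[geoid]["longitude"],
            "electricity_savings_kwh": total,
        }
        for geoid, total in sorted(totals.items())
        if geoid in by_geoid  # skip projects whose county is not listed
    ]


map_points = savings_by_county(project_summary, county_info)
```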

ZCTA Info

The ZCTA Info tables contain ZIP code Tabulation Area level geolocation data (for dashboard map support).

Column name | Column description
zcta | The ZIP code Tabulation Area ID.
latitude | The latitude at the centroid of the ZCTA polygon.
longitude | The longitude at the centroid of the ZCTA polygon.