
Tips on Time Series Data

October 16, 2017

Over the years as a software engineer I have repeatedly come into contact with time series data.  Some of the useful time series lessons I’ve learned have become increasingly relevant in our current age of IoT, the Cloud, and Big Data.  This blog article sketches a few of them.  To start, please review Wikipedia’s definition of time series data.

Race Conditions and Queues Can Disorder Time Series Data

As time series data flows through a software system, like an IoT system, it typically passes through a number of components, each performing one step of processing in what can be viewed as a data pipeline or data stream.  Downstream components often assume that incoming data is in proper time series sequence (temporal order), which means that all upstream components must keep the data in that order.  When this assumption is violated, bad things can happen.

For example, moving averages are typically used to smooth time series data, removing spiky, short-term fluctuations so that the underlying trends in the data can be more clearly seen, measured, and acted upon.  Alerts and alarms in IoT systems are often based on moving averages to avoid the spurious trigger conditions frequently present in the more rapidly fluctuating raw data.

If time series data becomes disordered, then moving averages, plus any statistics based upon them (like the rate of change of the moving average), can become meaningless, depending on the magnitude and duration of the temporal disorder.  And, critically, any alerts and alarms based on the moving average of the disordered series also become meaningless.  At that point the system has lost its ability to notify its operators of dangerous conditions.  This can result in serious trouble.
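To make the effect concrete, here is a minimal Python sketch (not from the original article) showing how a trailing moving average smooths a spike, and how the very same readings arriving out of order produce a different smoothed series, so a threshold alert on the moving average would fire at a different time, or not at all:

```python
def moving_average(values, window=3):
    """Trailing moving average over a fixed-size window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# The same temperature readings, once in order and once with two
# values swapped in transit (a small temporal disorder).
ordered    = [20, 21, 22, 35, 22, 21, 20]
disordered = [20, 21, 35, 22, 22, 21, 20]

# The smoothed series differ, so any alert keyed to the moving
# average of the disordered stream is no longer trustworthy.
print(moving_average(ordered))
print(moving_average(disordered))
```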

Disordering of time series data can be prevented by the following means:

  • Calculate time-series-dependent measures, like moving averages, as close to the data source as possible. Reducing the number of components that process or contain the data on its way to its final destination reduces the opportunity for disordering.
  • Don’t use Azure Service Bus Queues to contain time series data, since they are not guaranteed to be first-in-first-out. An Azure Event Hub, which preserves order within a partition, is generally a better choice for time-ordered data.  If you must use Service Bus Queues, reorder the data downstream of the queue, or take other measures to ensure the data stays in proper time series order.
  • A potential race condition exists when using Azure Functions to read time series data from an IoT Hub, Event Hub, or blob, or to do any kind of processing of a data stream that must remain time ordered. Why?  Azure Functions scale out, running multiple instances of the same function in parallel, with no guarantee of the order in which those instances complete.  A simple transient fault (very common in the cloud) can easily produce a burst of time series data, causing the runtime to fan out across parallel instances to handle the load.  Each instance then completes in its own time, rather than in a way that preserves the temporal order of the stream.  The ordering downstream of the Azure Functions will often happen to be correct, but it cannot be guaranteed under all conditions, and you cannot count on it.  If you must use Azure Functions like this, reorder the data downstream of their output.
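One way to do the downstream reordering mentioned above, sketched here in Python with hypothetical names (the article does not prescribe an implementation), is a bounded reordering buffer: hold arriving events in a min-heap keyed by timestamp, and release an event only once it is older than the newest arrival by some maximum tolerated lateness. The sketch assumes no event arrives more than `max_lateness` behind the newest timestamp seen; anything later than that would be emitted out of order.

```python
import heapq

def reorder(events, max_lateness):
    """Re-emit (timestamp, value) events in timestamp order, tolerating
    events that arrive up to max_lateness out of order."""
    heap = []
    newest = None
    for ts, value in events:
        heapq.heappush(heap, (ts, value))
        newest = ts if newest is None else max(newest, ts)
        # Release events old enough that nothing earlier can still arrive.
        while heap and heap[0][0] <= newest - max_lateness:
            yield heapq.heappop(heap)
    while heap:  # end of stream: drain whatever remains, in order
        yield heapq.heappop(heap)
```

For example, `list(reorder([(1, 'a'), (3, 'c'), (2, 'b'), (4, 'd')], max_lateness=2))` yields the events back in timestamp order.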

The Display of Missing Data

What happens when an IoT device stops transmitting its operational telemetry data for a period of time?  And how can this scenario be presented to a user in a helpful way?  This is especially important for a mission-critical measurement, for example an equipment high-temperature condition that endangers the equipment or people’s safety.

There are three key concepts in this area to be aware of:

  • The distinction between Operational Data and Health Monitoring Data:
    • Operational data is the main data of interest emitted from an IoT device or a sensor (like temperature, pressure, etc.). Operational data is closely related to why that device or sensor is there.
    • Health monitoring data is about the health state of a device or sensor, including its ability to transmit and/or process data. For example, health data can include “IsAlive” signals showing that the device is responsive, exceptions encountered by the device or sensor while doing its job, and so on.  Health monitoring data can also cover the health of other components in the data pipeline downstream of devices and sensors.
  • LKV – The Last Known Value of a data stream. This will either be a valid measurement or “MISSING DATA”. It applies to both operational and health monitoring data.
  • LGV – The Last Good Value of a data stream. This will always be a valid measurement and will never be “MISSING DATA”.  This also applies to both operational and health monitoring data.
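A minimal Python sketch of tracking LKV and LGV for one stream, using hypothetical names not taken from the article:

```python
MISSING = "MISSING DATA"

class StreamState:
    """Tracks the Last Known Value and Last Good Value of one data stream."""

    def __init__(self):
        self.lkv = MISSING   # Last Known Value: a measurement or MISSING DATA
        self.lgv = None      # Last Good Value: always a valid measurement, if any
        self.lgv_time = None # timestamp of the LGV, so staleness can be shown

    def update(self, timestamp, value):
        if value is None:    # no reading arrived for this interval
            self.lkv = MISSING
        else:                # a good reading updates both LKV and LGV
            self.lkv = value
            self.lgv = value
            self.lgv_time = timestamp
```

Note that when readings flow normally, LKV and LGV are equal; they diverge only while data is missing.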

In a user interface displaying mission critical operational data, my experience has shown that the display is vastly more useful when the following data items appear near each other on a dashboard or control panel.  Note this applies only to key mission critical data items, since the display takes up a lot of space.

  • The LKV of the mission critical operational data item. This may be a valid measurement value like a number, or it may display “MISSING DATA” indicating the time series data is out of whack.
  • The LGV of the mission critical operational data item. This will always be a valid measurement value, specifically the last good measurement value of the mission critical operational LKV data item.  If there are no problems, then LKV will equal LGV.
    • The LGV turns out to be immensely helpful for dealing with emergencies since at least you have some information, even if it is out of date.
    • And it is also highly useful to display the time of the LGV as well, so the user will know how stale the LGV is.  That information could greatly aid the user in making effective decisions and reduce uncertainty when mission critical measurements have missing LKV data.
  • The health monitoring data related to the device or sensor that is doing the measurements of the operational data making up the LKV and LGV.
    • It turns out to be very helpful to know the health status of a device or sensor, or another relevant component of a data pipeline, when the LKV starts displaying “MISSING DATA”.
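Pulling the three items above together, a single dashboard line might look like the following hypothetical rendering (the function and field names are my own, assuming LKV, LGV, and the LGV timestamp are tracked elsewhere):

```python
def render_panel(name, lkv, lgv, lgv_time, is_healthy):
    """Format one mission-critical item: the LKV, the LGV with its
    timestamp (so staleness is visible), and the sensor's health."""
    lkv_text = "MISSING DATA" if lkv is None else str(lkv)
    health = "OK" if is_healthy else "UNHEALTHY"
    return f"{name}: LKV={lkv_text} | LGV={lgv} at {lgv_time} | sensor {health}"

# A device has gone quiet: LKV is missing, but the operator still
# sees the last good reading, how old it is, and that the sensor is sick.
print(render_panel("Pump temp", None, 87.2, "2017-10-16 09:41", False))
```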

Using the above ideas will make your time series data processing systems more robust and more effective for humans to operate.

George Stevens

Software Architect, Sr. Software Engineer at Solid Value Software, LLC.

Creative Commons License

dotnetsilverlightprism blog by George Stevens is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Based on a work at dotnetsilverlightprism.wordpress.com.
