Perspective from the edge: What sensor data analytics applications need from cloud infrastructure
The Internet of Things (IoT) begins and ends at the “leaf” or “edge” nodes of the network. This is where sensors and actuators, under the control of resource-constrained embedded processors, interact with the real, physical world. Raw sensor measurements like pressure, temperature, vibration, and motion are extracted and interpreted into actionable information that is transmitted to intelligent process controllers upstream.
The rate at which sensor data is acquired at edge nodes varies with the physical phenomenon of interest: from once every few minutes for ambient barometric pressure to tens of thousands of samples per second for vibration monitors. Furthermore, multiple edge nodes can collaborate so that upstream intelligence can make a better-informed decision. For example, the degradation of a motor inside a printer could be detected from the vibration of the printer, ambient sound, and infrared imaging. To allow each edge node to provide corroborating evidence of a pending failure, samples from each edge node must be accurately time-stamped and aligned to the same time base.
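As a minimal sketch of what aligning to the same time base can involve, the Python below linearly interpolates each node's time-stamped samples onto a shared set of instants. The function name `resample`, the sample values, and the choice of linear interpolation are illustrative assumptions, not a method prescribed by this article; it also assumes the node clocks are already synchronized.

```python
from bisect import bisect_left

def resample(samples, timebase):
    """Linearly interpolate sorted (timestamp, value) samples onto timebase.

    Instants outside the sampled interval yield None instead of an
    extrapolated guess.
    """
    times = [t for t, _ in samples]
    out = []
    for t in timebase:
        if not times or t < times[0] or t > times[-1]:
            out.append(None)
            continue
        i = bisect_left(times, t)
        if times[i] == t:
            # Exact hit: no interpolation needed.
            out.append(samples[i][1])
            continue
        t0, v0 = samples[i - 1]
        t1, v1 = samples[i]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out

# Hypothetical streams from two edge nodes with offset sampling instants.
vibration = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
sound = [(0.5, 10.0), (1.5, 20.0), (2.5, 30.0)]
timebase = [0.5, 1.0, 1.5, 2.0]

aligned_vibration = resample(vibration, timebase)  # [1.0, 2.0, 3.0, 4.0]
aligned_sound = resample(sound, timebase)          # [10.0, 15.0, 20.0, 25.0]
```

Once both streams share the same instants, each row of aligned values can be handed to the upstream analytics as a single observation.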
Edge nodes may also be aggregated by “gateways” or “agents” that accept and correlate time-stamped data. Gateways typically run a general-purpose operating system like Linux and provide multiple ways of connecting to edge nodes. They serve several functions: fusion of underlying raw data; management of child devices; translation from proprietary to standard protocols; and updating edge node programming, either in the course of a system upgrade or more frequently in the context of controlled system changes. In some IoT systems the gateway function can be implemented virtually as a cloud-based service, but this comes with the tradeoff of requiring edge node devices to consume more power and computing resources.
When sensors in edge nodes are part of a real-time information system, it is important for sensor data analytics to be robust enough to compensate for the inevitable loss of sensor data. Quality of service (QoS) limitations or delays in the overall network could cause sensor data from some of the edge nodes involved to arrive late or not at all.
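One simple way to tolerate late or missing data, sketched below in Python, is to fuse only the readings that arrived within a latency budget and to report which nodes contributed. The function `fuse_available`, the `max_age` parameter, and the plain averaging are hypothetical choices made for illustration; a real system would pick a fusion rule suited to its sensors.

```python
def fuse_available(readings, now, max_age):
    """Average the readings that are fresh enough; skip late or missing nodes.

    readings maps a node id to (timestamp, value), or to None if nothing
    arrived. Returns (estimate, nodes_used); estimate is None when no node
    reported in time.
    """
    fresh = {
        node: reading[1]
        for node, reading in readings.items()
        if reading is not None and now - reading[0] <= max_age
    }
    if not fresh:
        return None, []
    return sum(fresh.values()) / len(fresh), sorted(fresh)

# Hypothetical snapshot: node "b" is one second late, node "c" never reported.
readings = {"a": (9.9, 20.0), "b": (9.0, 30.0), "c": None}
estimate, used = fuse_available(readings, now=10.0, max_age=0.5)
# estimate is 20.0 and used is ["a"]
```

Returning the list of contributing nodes lets the upstream controller judge how much confidence to place in an estimate built from partial data.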
The design of sensor analytics algorithms places some demands on the network architecture. For example, a support vector machine can be robust against intermittent data loss, so the network protocol should favor delivering data with shorter latency over retransmitting packets that were not received successfully. Or, a Kalman filter could be designed to accommodate uneven data transmission latencies from different edge nodes. Analytics that are intolerant of missing data may elect to use physical links like CAN instead of relying on wireless connectivity.
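To illustrate how a filter can absorb gaps in its input, the following Python is a minimal sketch of a scalar Kalman filter that runs its prediction step at every time step and applies the measurement update only when a sample actually arrived. The random-walk state model, the noise values `q` and `r`, and the use of `None` to mark a missing sample are assumptions made for this example, not a design taken from a specific product.

```python
def kalman_1d(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter over a random-walk state.

    measurements is a sequence of floats, with None marking a sample that
    was lost or arrived too late. q is the process noise variance and r the
    measurement noise variance. Returns a list of (estimate, variance).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        p = p + q
        if z is not None:
            # Update: blend prediction and measurement using the Kalman gain.
            k = p / (p + r)
            x = x + k * (z - x)
            p = (1 - k) * p
        estimates.append((x, p))
    return estimates

# The middle sample is missing; the estimate holds and its variance widens.
track = kalman_1d([1.0, None, 1.0])
```

Because the variance grows during a gap, the next measurement that does arrive is weighted more heavily, which is the behavior that makes low-latency, no-retry delivery acceptable for this kind of analytic.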
At the same time, not all data should find its way into the cloud. There may be sensitive data (e.g., images or personal information) that must be secured on local site servers. Or, an application may have real-time demands that require local processing and low-latency, high-speed data exchange that cannot be provided by cloud-based IoT frameworks. These functions are typically hosted on generic server hardware, with an amalgam of open source and proprietary software implementing the application-specific features.
In the cloud, various infrastructure providers deliver specific support for IoT, such as device management, access control, IoT protocols, or virtual gateways. These systems, known as platforms-as-a-service (PaaS), support generic cloud software-as-a-service (SaaS) ecosystems of proprietary and open source tools for data storage, on-demand capacity, and analytics. However, these services and infrastructures still fall short of a holistic framework that encompasses both development and deployment.
Today, developing sensor data analytics using resources in the cloud and distributing real-time workloads among the cloud, gateways, and edge nodes often means replicating development efforts for each layer of the implementation. Infrastructure and hardware vendors should collaborate on tools and enablement to greatly improve development efficiency.