IoT and smart devices generate a data glut that exacts a speed penalty on intelligent automation: what’s the solution?

Intelligent home devices such as smart switches, smart thermostats and smart speakers with voice assistants are fast going mainstream. Revenue for the smart home market in Asia is projected to reach US$34.6 billion in 2020, according to Statista, and is expected to grow at an annual rate of close to 16% over the next four years.

The popularity of smart home devices is no surprise, as they are becoming more intuitive by the day. Increasingly, these devices gather huge amounts of data and use that information to learn and perform tasks, from controlling home entertainment systems to directing robots tasked with household chores.

At the same time, in today's sophisticated smart home environments, it is impractical for systems to collate all of this information into one central repository, analyze it, and then push recommendations back to the devices in a timely manner.

Nonetheless, we expect the technology to evolve in 2020 and beyond so that devices can perform computation on site, sparing systems from having to process the data centrally. By computing at the edge (a.k.a. edge computing), these devices will become more efficient, learning to adjust in real time rather than being slowed down by the transfer of information to and from a central system.

The smart get smarter

It was only a few years ago that traditional devices became smarter and new smart devices started emerging. Now, devices with smart voice assistants like Alexa, Google Assistant and Siri are common in homes across the developed world. They are also becoming more intelligent, with the ability to adjust to conditions on the fly.

For instance, devices like Google's Nest thermostat are equipped with machine learning capabilities. This means they can learn users' habits, drawing on, say, weeks of manual temperature adjustments to figure out each user's preferences.
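
To make this concrete, here is a deliberately simplified sketch in Python of how a thermostat might distill a history of manual adjustments into an hourly preference schedule. The data and function are invented for illustration; Nest's actual models are considerably more sophisticated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log of manual set-point changes: (hour_of_day, temperature).
# Purely illustrative data, not from any real device.
adjustments = [
    (7, 21.5), (7, 22.0), (7, 21.5),     # mornings
    (18, 23.0), (18, 22.5), (18, 23.0),  # evenings
    (23, 19.0), (23, 19.5),              # bedtime
]

def learn_schedule(history):
    """Average the user's manual set-points by hour of day."""
    by_hour = defaultdict(list)
    for hour, temp in history:
        by_hour[hour].append(temp)
    return {hour: round(mean(temps), 1) for hour, temps in by_hour.items()}

schedule = learn_schedule(adjustments)
print(schedule)  # {7: 21.7, 18: 22.8, 23: 19.2}
```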

Traditionally, these systems have been designed to analyze data and derive intelligence from it at central repositories. That is, data is extracted from operational systems, transformed into the appropriate format, and then loaded into a data warehouse. Data warehouses are the workhorses of business intelligence, serving as central repositories where data is turned into insights.
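
This extract-transform-load (ETL) pattern can be sketched in a few lines of Python; the table names are invented, and in-memory SQLite databases stand in for the operational system and the warehouse:

```python
import sqlite3

# Operational source system (illustrative schema)
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sensor_log (device_id TEXT, fahrenheit REAL, ts TEXT)")
source.execute("INSERT INTO sensor_log VALUES ('thermostat-1', 71.6, '2020-01-01T07:00')")

# Central data warehouse (illustrative schema)
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_readings (device_id TEXT, celsius REAL, ts TEXT)")

# Extract: pull raw rows from the operational system
rows = source.execute("SELECT device_id, fahrenheit, ts FROM sensor_log").fetchall()

# Transform: convert into the warehouse's format (here, Fahrenheit to Celsius)
rows = [(d, round((f - 32) * 5 / 9, 2), ts) for d, f, ts in rows]

# Load: insert into the central repository queried for business intelligence
warehouse.executemany("INSERT INTO fact_readings VALUES (?, ?, ?)", rows)
warehouse.commit()
print(warehouse.execute("SELECT * FROM fact_readings").fetchall())
# [('thermostat-1', 22.0, '2020-01-01T07:00')]
```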

However, data warehouses are losing their appeal as the 'single source of truth' for various reasons. Firstly, data warehouses can only store structured data, whereas the bulk of data these days is unstructured. Secondly, the volume of data has become so vast that it is no longer economically feasible to store all of it in a single data warehouse, which is usually situated in the cloud.

Companies have tried to adopt alternatives like Hadoop, which can store unstructured data, as central repositories. However, it is still not possible to collect all of the information generated by multiple devices across various locations into one central repository, analyze it for intelligence, and then send smart recommendations back to the devices quickly enough for optimal performance.

Edge computing as the solution

Given the impracticality of constantly sending data to and receiving it from a centralized data warehouse for computation, the solution is to send captured or generated data to edge nodes, located near the devices, where the analysis and computation are handled. This is edge computing: it allows devices such as Internet of Things (IoT) gadgets to meet their users' needs much faster, since they only need to communicate with a nearby edge node.
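
As a rough sketch of how an edge node might serve a device locally (the class and its methods are assumptions for illustration, not any particular product's API):

```python
import statistics

class EdgeNode:
    """A toy edge node that answers queries from local state,
    avoiding a round trip to a central system."""

    def __init__(self, window=100):
        self.window = window
        self.readings = []  # recent sensor data held locally

    def ingest(self, value):
        self.readings.append(value)
        self.readings = self.readings[-self.window:]  # bounded local buffer

    def average(self):
        # Answered on site; no wide-area network hop required
        return statistics.mean(self.readings)

node = EdgeNode()
for temp in (21.0, 21.4, 20.8):
    node.ingest(temp)
print(round(node.average(), 2))  # 21.07
```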

However, such independence does not mean that the devices and edge nodes function entirely autonomously. The edge nodes remain connected to central systems and transmit whatever information those systems need to perform analysis across multiple devices.

In other words, there is a duality of computation: some analysis happens at the edge, to the extent needed for local operation, while data is also transmitted to central analytical systems for more holistic analysis.
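
A sketch of that duality: the edge node acts on every raw reading locally, but forwards only a compact summary upstream. The payload format here is an assumption, not a standard.

```python
def summarize(readings):
    """Reduce a batch of raw readings to a small summary for
    the central analytical system."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }

raw = [20.9, 21.3, 21.1, 22.0, 21.7]  # processed locally, in real time
upstream_payload = summarize(raw)      # only this crosses the network
print(upstream_payload)
# {'count': 5, 'min': 20.9, 'max': 22.0, 'mean': 21.4}
```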

Filtering data at the source and transmitting only the required information to a central system is not a new concept. In data virtualization, a method of data integration, this selective processing and delivery happens in real time, without replicating the data itself.
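
As a loose analogy to that idea (not how any data virtualization product is actually implemented), a virtual view can answer queries by reaching into the underlying sources on demand, copying nothing into a central store:

```python
class VirtualView:
    """A toy virtual view: queries are delegated to live sources
    at query time; nothing is replicated or persisted centrally."""

    def __init__(self, sources):
        self.sources = sources  # callables that fetch live data

    def query(self, predicate):
        # Pull only the matching rows from each source, on demand
        return [row for fetch in self.sources for row in fetch() if predicate(row)]

# Two illustrative sources with invented records
crm = lambda: [{"id": 1, "region": "APAC"}, {"id": 2, "region": "EMEA"}]
billing = lambda: [{"id": 3, "region": "APAC"}]

view = VirtualView([crm, billing])
print(view.query(lambda r: r["region"] == "APAC"))
# [{'id': 1, 'region': 'APAC'}, {'id': 3, 'region': 'APAC'}]
```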

Democratizing data processing

Two aspects of edge computing technology have evolved much faster than others in recent years: compute power and storage. For instance, today's mobile phones have compute power and memory that dwarf those of desktop computers from 30 years ago.

However, bandwidth limitations and data transmission are holding the technology back, as it can still take minutes, even hours, for data to move from one location to another. The challenge only grows as devices are carried physically further away from cloud-based central systems. Edge computing allows devices to compute, adjust and learn in real time rather than being slowed down by the transfer of information to and from a central system.

With the proliferation of connected and smart devices, data volumes have exploded and will continue to build up, further reducing the efficacy of centralized computation and analysis. Edge computing solves this problem by making smart devices even smarter: they can process data locally instead of at a centralized repository.