Are Mobile Networks Ready for the Streaming Data Tsunami?


In our latest contribution, Simon Crosby, CTO of SWIM.AI, asks whether mobile networks are ready for the forthcoming data deluge, examines the impact it might have on event processing, and considers how best to handle all that streaming data in real time.

I bet you think this piece is about the rise of Netflix, YouTube, Disney+ and the like. It isn’t. Sure, mobile and terrestrial networks are being swamped with video traffic consumed by young and old alike. But I’m referring to the rise of streaming data from the edge – data from every consumer and industrial product imaginable.

By many estimates, over 20 billion smart devices enter the market each year (more than 2 million per hour), and they all have something to say. A lot to say, all of the time. The stream of data heading from the edge into the mobile network, to carriers, and on to Internet-facing enterprise apps and SaaS vendors is growing at an enormous rate. So in fact, it isn’t a stream at all – it’s a tsunami that won’t end.

Data Deluge

For mobile operators, there are two opportunities that this data deluge presents. The first is to use device data and network status data to gain real-time insights into network performance, handset performance, user experience, network traffic issues and outages to deliver a more robust network and to improve customer satisfaction.

The second is to use their position at the edge to deliver “Edge Cloud” services that help tame the flood of data before it hits cloud service providers. In this scenario, operators will host edge cloud services on computers close to data sources. The opportunity has never been greater than with the introduction of 5G networking – where operators can offer enterprise customers secure, private slices of network capacity with access to real-time edge computing capabilities with low latency, enabling them to deliver smart cities, smart grids, and tailored enterprise-focused offerings. Vendors have spotted this opportunity too: Ericsson “Edge Gravity” is one example.

To succeed in both areas, operators need to become fluent in two things: the language of cloud-native messaging services – such as AWS Kinesis, Azure Event Hubs and Service Bus, Google Cloud Pub/Sub, and Apache Kafka and Pulsar – and the open source platforms for real-time stream analysis, such as Spark.

Delivering Insights

The key to deriving insights from the edge may be in supporting pub/sub messaging. One can argue that pub/sub messaging is a new “dial tone” for both consumer and enterprise-focused service providers. Delivering a platform that helps companies securely scale messaging from edge devices is an important service offering, and just as importantly, adopting cloud-native software architectures is crucial for operators to master in order to deliver customer service and understand the state of their networks, in real-time.

Pub/sub messaging enables an unknown number of publishers to deliver asynchronous messages to subscribers – the applications that process them – without either side needing to know the identity of the other. In pub/sub, sources publish events for a topic to a broker, which stores them in the order in which they are received. An application subscribes to one or more topics, and the broker forwards matching events.
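The decoupling described above can be sketched in a few lines. This is a minimal, in-memory illustration only – real brokers such as Kafka or Pulsar add persistence, partitioning and delivery guarantees – and the topic name and event fields are invented for the example:

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory pub/sub broker: stores events per topic in
    arrival order and forwards each one to the matching subscribers."""

    def __init__(self):
        self.log = defaultdict(list)          # topic -> ordered event log
        self.subscribers = defaultdict(list)  # topic -> subscriber callbacks

    def subscribe(self, topic, callback):
        # The subscriber knows only the topic, never the publishers.
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        # The publisher knows only the topic, never the subscribers.
        self.log[topic].append(event)
        for deliver in self.subscribers[topic]:
            deliver(event)

broker = Broker()
received = []
broker.subscribe("cell-kpis", received.append)   # hypothetical topic
broker.publish("cell-kpis", {"cell": "A1", "rsrp_dbm": -95})
broker.publish("cell-kpis", {"cell": "A1", "rsrp_dbm": -97})
```

Note that the publisher and subscriber never reference each other – only the topic – which is what lets app teams subscribe independently.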


Apache Kafka and Pulsar, and the Cloud Native Computing Foundation’s NATS, are pub/sub frameworks that are rapidly becoming de facto standards for messaging in the cloud era. Pub/sub is offered as a cloud service by all major clouds, so one might question whether mobile operators should enter the fray at all. I think there are a couple of reasons for them to invest:

  1. First, mobile operators have points of presence closer to the network edge, and can therefore offer infrastructure to process events and respond in real time.
  2. Second, pub/sub messaging will be a key component of a future real-time operations platform within every mobile operator.

For use cases in traffic prediction, routing and any interactive service, the response time is critical. Using real-time messaging to “Edge Cloud” application micro-services can save hundreds of milliseconds of event processing time. For a real-time stream processing framework such as Apache Samza or SwimOS, getting hold of events fast is key to real-time analysis, learning and prediction to drive visualizations and automated responses.

On the second point, subscribing to events at a broker can be considered the streaming equivalent of the database-era “SELECT”. App dev teams can independently subscribe to and write apps for different event topics. All apps, from customer care to predicting outages in network equipment, become feasible when all events are reported in real time.

Streaming data contains events that are updates to the state of applications or infrastructure. When choosing an application architecture to process it, the role of a data distribution system, like Kafka or Pulsar, is limited. Take into consideration:

  1. Data is often noisy – Many real-world systems are noisy and repetitive, and large numbers of data sources add up to a huge amount of data. If raw data is delivered from the edge as a stream of published events, the transport cost can be huge.
  2. State matters, not data – Streaming environments never stop producing data – typically real-world events – but analysis depends on the meaning of those events, or the state changes that the data represents. Even de-duplicating repetitive updates requires a stateful processing model. This means that the stream processor must manage the state of every object in the data stream.
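Both points above come down to the same mechanism: keeping per-source state and forwarding an event only when that state actually changes. The sketch below illustrates the idea under invented source names and values; it is not any particular stream processor’s API:

```python
class DedupProcessor:
    """Stateful de-duplication: remembers the last known state per
    source and emits an event only when an update changes that state."""

    def __init__(self):
        self.state = {}  # source id -> last observed value

    def process(self, source, value):
        if self.state.get(source) == value:
            return None            # repetitive update: suppress it
        self.state[source] = value
        return (source, value)     # genuine state change: forward it

proc = DedupProcessor()
raw = [("meter-7", "OK"), ("meter-7", "OK"), ("meter-7", "FAULT"),
       ("meter-7", "FAULT"), ("meter-7", "OK")]
out = [e for e in (proc.process(s, v) for s, v in raw) if e]
# five raw events in, only the three genuine state changes out
```

A chatty sensor that repeats its status every second thus costs almost nothing downstream, because only transitions are forwarded.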

So how do enterprises write apps that consume pub/sub events? Smart service providers will support application frameworks that take much of the pain out of application delivery. They will need these capabilities internally anyway, so whether they offer them as service capabilities depends very much on their appetite for competing with the major cloud vendors.


Now For Stream Processing

The category of software platform that enables developers to quickly create, deploy, scale and manage applications that consume data from the edge is called “stream processing”.

Stream processors are application runtime platforms whose applications consume events from brokers, from the real world, and even from the change-logs of database systems, process them, and deliver real-time insights to users and to other applications.

Stream processing can involve both the simplest and the most complex kinds of analysis. At the simplest, streaming “transform and load” (STL) – the streaming equivalent of “extract, transform and load” from the store-then-analyze era – simply takes events, transforms and labels them, and delivers them to a cloud data lake like Azure Data Lake Service (ADLS). This is not necessarily even stateful. 
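A streaming “transform and load” pipeline of the kind described above is essentially a stateless per-event function. The sketch below illustrates this; the label, field names, and list-based sink are invented stand-ins (a real deployment would write to a data-lake service such as ADLS):

```python
import json
import time

def stl(events, sink):
    """Streaming transform-and-load: take each raw event, transform and
    label it, and hand it straight to a data-lake sink. No state is
    carried between events."""
    for raw in events:
        record = {
            "ingested_at": time.time(),
            "label": "tower-telemetry",   # hypothetical labeling scheme
            "payload": raw,
        }
        sink.append(json.dumps(record))  # sink stands in for a lake writer

lake = []  # placeholder for a cloud data-lake writer
stl([{"tower": 42, "load_pct": 81}], lake)
```

Because each event is handled independently, this style scales trivially by adding workers – the stateful analysis discussed below is where things get harder.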

“Organizations need to look for solutions, from third parties or delivered through service providers, that support stream-centric unsupervised learning and prediction that avoids the complexity of model training and deployment in the cloud.”

At the other end of the spectrum, stream processors drive complex analytical processes including real-time analysis, accumulation, learning and prediction. Some analytical frameworks rely on Apache Spark or Flink; leading stream processors, by contrast, offer a powerful set of analytical functions “in the box” but can also drive more complex analysis using other frameworks such as Spark.

Organizations need to look for solutions, from third parties or delivered through service providers, that support stream-centric unsupervised learning and prediction that avoids the complexity of model training and deployment in the cloud. By utilizing this technology, service providers and businesses across industries can derive important insights that save time and expense and deliver far better customer service.

Simon Crosby

Simon Crosby is the CTO of SWIM.AI, an edge intelligence company transforming fast data into big insights.
