The Web of Things (WoT), together with mashup-like applications, is gaining popularity as the Internet develops towards a network of interconnected objects, ranging from cars and transport cargo to electrical appliances. Here I provide a brief architectural overview of technologies that can be used in Web of Things mashups, with emphasis on artificial intelligence technologies such as conceptualization and stream processing, and a look at data sources and existing Web of Things mashups.
The Web of Things is an emerging concept which extends already existing concepts such as the Sensor Web, where all sensor data and metadata would be published and available to anyone. The things themselves are everyday objects (e.g. coffee mug, chair, truck, robotic arm) containing a small computing and communicating device. This device is most often a sensor node; however, it can also be an active or passive RFID tag, in which case computing is done at the server. The things currently form isolated networks, controlled by different entities, and most often the data remain closed and are rarely used to their full potential. Connecting (or federating) the islands of things using web standards is referred to as the Web of Things (WoT).
The mashups for the Web of Things, also referred to as physical mashups, use raw or processed data coming from things, as well as already existing web data and services, to build new applications. The development of such technology is expected to have a high impact on humanity, for instance by efficiently supplying increasingly urbanized cities with food, transport, electricity and water in an environmentally sustainable way.
One way of looking at the Web of Things is to see things as organs which detect stimuli. These stimuli are then sent via wireless or wired technology, typically on an IP/HTTP network, to processing and storage engines. These engines then crunch the received information and generate knowledge. Sometimes they can also trigger an action, such as sending a tweet. This is somewhat similar to how we humans function: we have five senses, each served by a corresponding organ; the stimuli are sent to the brain via the nerves, and finally the brain processes them. The result is most often knowledge, and sometimes actions are triggered as well: the brain transmits commands via the nerves to the muscles, which then contract and cause movement of the hands and legs, talking, etc. One distinction is that while in humans the sensors and processors are spatially close to each other (e.g. nose and brain or ears and brain), in the case of the WoT we may be looking at a globally distributed system.
In the technological pipeline for the WoT, the raw data and metadata coming from the network of things can be annotated and enriched (we refer to this as conceptualization), stored using specific approaches for streaming data, and processed using techniques such as stream mining and event and anomaly detection. WoT mashups can take and use the data at any of these stages.
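The stages of this pipeline can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the stream identifier, the metadata registry and the sliding-window deviation rule are hypothetical choices, not a description of any particular WoT platform.

```python
from collections import deque
from statistics import mean, pstdev

# Hypothetical metadata registry: what each raw stream measures.
STREAM_METADATA = {
    "node-7/temp": {"phenomenon": "temperature", "unit": "degC"},
}


def conceptualize(stream_id, value):
    """Annotate a raw reading with the metadata known for its stream."""
    return {"stream": stream_id, "value": value, **STREAM_METADATA[stream_id]}


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield readings that deviate strongly from the recent window.

    A reading is flagged when it lies more than `threshold` standard
    deviations away from the mean of the previous `window` values.
    """
    recent = deque(maxlen=window)
    for reading in readings:
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(reading["value"] - mu) > threshold * sigma:
                yield reading
        recent.append(reading["value"])


if __name__ == "__main__":
    raw = [20.0, 20.5, 19.5, 20.0, 20.5, 35.0, 20.0]
    annotated = [conceptualize("node-7/temp", v) for v in raw]
    for anomaly in detect_anomalies(annotated):
        print("anomaly:", anomaly)
```

A mashup could consume the data at either stage: the annotated readings as they arrive, or only the detected anomalies.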
Network of Things
The things are objects that can be digitally identified by some code or technology such as Electronic Product Code (EPC), Radio Frequency IDentification (RFID), Near Field Communication (NFC), Internet Protocol (IP) v4 or v6, etc. Using these digital identities, things can then be observed by tracking them in production plants, warehouses, etc., by observing usage patterns, by observing their context, and so on. We focus on things that feature sensors and an embedded device, mostly because the mashup we develop addresses environmental intelligence based on sensor data streams.
The embedded device typically contains four modules: the central processing unit and memory, the communication module, the sensor/actuator and the power source. The CPU controls the embedded device: it tells the sensors to capture data, and it sends the data to the storage and/or to the communication module, which then transmits them to the destination. A sensor is a device that measures physical phenomena and converts them to a signal that can be read by an observer, or, in our context, by a computer. The communication module typically uses wireless transmission (e.g. IEEE 802.15.4). The operation of the embedded device is constrained by the available power.
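The interplay of the four modules can be sketched in a few lines of Python. This is a toy model rather than firmware: the class name, the abstract energy units and the per-operation costs are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EmbeddedDevice:
    """Toy model of a sensor node; all names and costs are hypothetical."""

    read_sensor: Callable[[], float]    # sensor module
    transmit: Callable[[float], None]   # communication module
    energy_budget: int                  # power source, in abstract units
    storage: List[float] = field(default_factory=list)  # local memory

    SENSE_COST = 1  # assumed cost of taking one measurement
    SEND_COST = 5   # assumed cost of one radio transmission

    def step(self) -> bool:
        """Run one CPU control cycle; return False when out of power."""
        if self.energy_budget < self.SENSE_COST:
            return False
        value = self.read_sensor()
        self.energy_budget -= self.SENSE_COST
        self.storage.append(value)
        # Transmit only if enough energy remains for the radio.
        if self.energy_budget >= self.SEND_COST:
            self.transmit(value)
            self.energy_budget -= self.SEND_COST
        return True


if __name__ == "__main__":
    sent = []
    device = EmbeddedDevice(lambda: 21.5, sent.append, energy_budget=13)
    while device.step():
        pass
    print(len(device.storage), "readings stored,", len(sent), "transmitted")
```

The sketch makes the power constraint explicit: once the budget no longer covers a transmission, readings are only stored locally, and eventually sensing stops as well.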
Conceptualization of the domain
For small and medium size isolated projects it can be relatively straightforward to know which stream of data measures a given property. Traditional database tables can work well in such situations. However, if we are talking about web scale and are aiming for interoperability, some conceptualization of the WoT domain is needed.
Knowledge about sensors needs to be encoded and structured so that it can be used to its full potential. Additional information, such as the phenomena the sensors are measuring, the units of measurement, and the location of the sensor node, is needed to accompany the numbers. For instance, if we wanted to know the amount of rain, we should be able to recognize that raindrop, rainfall, and precipitation refer to the same physical phenomenon and that all such sensors are a good source for our query. If we were interested in the outside temperatures in the morning, we should be able to infer that a sensor node positioned in a stable is not a good source for us, because it is measuring the temperature inside. If we wanted to find out the air pressure in our city, we would need the system to be able to tell which sensor nodes have geographical coordinates that fall within the area (reverse geocoding). The conceptualization of the domain refers to modeling all this knowledge in a standard way. Using standards also makes interoperability between different systems achievable.
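The three examples above (synonymous labels, indoor versus outdoor placement, and location within an area) can be sketched as a small lookup in Python. This is a deliberately simplified stand-in for a real ontology: the synonym table, the `SensorDescription` fields and the bounding-box test used in place of proper reverse geocoding are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: different labels for the same phenomenon.
PHENOMENON_SYNONYMS = {
    "raindrop": "precipitation",
    "rainfall": "precipitation",
    "precipitation": "precipitation",
    "temperature": "temperature",
}


@dataclass(frozen=True)
class SensorDescription:
    sensor_id: str
    label: str     # label as published by the data owner
    unit: str
    lat: float
    lon: float
    indoor: bool


def in_bounding_box(sensor, box):
    """Crude stand-in for reverse geocoding: box = (south, west, north, east)."""
    south, west, north, east = box
    return south <= sensor.lat <= north and west <= sensor.lon <= east


def find_sources(sensors, phenomenon, box, outdoor_only=False):
    """Return sensors that measure `phenomenon` inside `box`."""
    return [
        s for s in sensors
        if PHENOMENON_SYNONYMS.get(s.label.lower()) == phenomenon
        and in_bounding_box(s, box)
        and not (outdoor_only and s.indoor)
    ]


if __name__ == "__main__":
    city = (46.0, 14.4, 46.1, 14.6)  # made-up bounding box
    sensors = [
        SensorDescription("s1", "rainfall", "mm", 46.05, 14.50, False),
        SensorDescription("s2", "temperature", "degC", 46.04, 14.52, True),
    ]
    print(find_sources(sensors, "precipitation", city))
```

In a web-scale setting the synonym table and the location test would be replaced by shared vocabularies and a geocoding service, so that independent systems resolve the same query in the same way.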
As a growing number of observers realize, one of the most important aspects of the emerging Internet of Things is its incredible breadth and scope. Within a few years, devices on the IoT will vastly outnumber human beings on the planet, and the number of devices will continue to grow. Billions of devices worldwide will form a network unprecedented in history. Devices as varied as soil moisture sensors, street lights, diesel generators, video surveillance systems, and even the legendary Internet-enabled toasters will all be connected in one fashion or another.
Some pundits have focused only on the myriad addresses necessary for the sheer arithmetic count of devices and have pronounced IPv6 sufficient for the IoT. But this mistakes address space for addressability. No central address repository or existing address translation scheme can possibly deal with the frontier aspects of the IoT. Nor can addresses alone create the costly networking “horsepower” needed within the appliances, sensors, and actuators.
Devices from millions of manufacturers based in hundreds of countries will appear on the IoT (and disappear) completely unpredictably. This creates one of the greatest challenges of the IoT: management. This is a matter both of scope and device capabilities.
These devices incorporate the processors, memory, and human interfaces necessary for traditional networking protocol stacks (typically IPv6 today), the human interfaces necessary for control, and an infrastructure for management (unique addresses, management servers, and so on).
Data exchanged by Internet of Things Devices
The kinds of information these hundreds of billions of IoT devices exchange will also be very different from the traditional Internet. Much of today’s Internet traffic is primarily human-to-machine oriented. Applications such as e-mail, web browsing, and video streaming consist of relatively large chunks of data generated by machines and consumed by humans.
But the typical IoT data flow will be nearly diametrically opposed to this model. Machine-to-machine communications require minimal packaging and presentation overhead. For example, a moisture sensor in a farmer’s field may have only a single value to send of volumetric water content. It can be communicated in a few characters of data, perhaps with the addition of a location/identification tag. This value might change slowly throughout the day, but the frequency of meaningful updates will be low. Similar terse communication forms can be imagined for millions of other types of IoT sensors and devices. Many of these IoT devices may be simplex or nearly simplex in data flows, simply broadcasting a state or reading over and over while switched on without even the capacity to “listen” for a reply.
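To make the "few characters of data" concrete, here is a minimal Python sketch of how such a reading could be packed into four bytes. The wire format (a 16-bit sensor identifier followed by the water content in hundredths of a percent) is an assumption invented for this example and does not correspond to any standard.

```python
import struct

# Assumed wire format (not from any standard): big-endian unsigned 16-bit
# sensor id, then unsigned 16-bit water content in steps of 0.01 percent.
READING_FORMAT = ">HH"


def encode_reading(sensor_id: int, vwc_percent: float) -> bytes:
    """Pack an identification tag and one moisture value into 4 bytes."""
    return struct.pack(READING_FORMAT, sensor_id, round(vwc_percent * 100))


def decode_reading(payload: bytes):
    """Recover (sensor_id, volumetric water content in percent)."""
    sensor_id, raw = struct.unpack(READING_FORMAT, payload)
    return sensor_id, raw / 100


if __name__ == "__main__":
    payload = encode_reading(1042, 23.57)
    print(len(payload), "bytes:", payload.hex())
    print(decode_reading(payload))
```

A simplex device would only ever need the encoding half; decoding happens elsewhere in the network, where the readings are collected.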
This raises another aspect of the typical IoT message: it’s individually unimportant. For simple sensors and state machines, the variations in conditions over time may be small. Thus, any individual transmission from the majority of IoT devices is likely completely uncritical. These messages are being collected and interpreted elsewhere in the network, and a gap in data will simply be ignored or extrapolated.
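One simple way a collector might bridge such a gap is sketched below in Python. Note that this example uses linear interpolation between the neighbouring readings, which is one possible estimation strategy chosen for illustration; the function name and the fixed reporting interval are assumptions.

```python
def fill_gaps(samples, step):
    """Fill missing time slots by linear interpolation.

    `samples` is a list of (time, value) pairs sorted by time, where times
    are multiples of `step` (the assumed reporting interval).
    """
    filled = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        filled.append((t0, v0))
        missing = (t1 - t0) // step - 1
        for k in range(1, missing + 1):
            t = t0 + k * step
            # Estimate the lost reading from its two neighbours.
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
    if samples:
        filled.append(samples[-1])
    return filled


if __name__ == "__main__":
    # Readings every 15 minutes; the ones at t=30 and t=45 never arrived.
    received = [(0, 20.0), (15, 20.5), (60, 22.0)]
    for t, v in fill_gaps(received, step=15):
        print(t, v)
```

Because conditions change slowly, the estimated values are usually good enough, which is why no individual message needs to be recovered.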
Even more complex devices, such as a remotely monitored diesel generator, should generate little more traffic, again in terse formats unintelligible to humans, but gathered and interpreted by other devices in the IoT. Overall, the meaningful amount of data generated from each IoT device is vanishingly small—nearly exactly the opposite of the trends seen in the traditional Internet. For example, a temperature sensor might generate only a few hundred bytes of useful data per day, about the same as a couple of smartphone text messages. Because of this, very low bandwidth connections might be utilized for savings in cost, battery life, and other factors.
Loss of Data
Today’s traditional Internet is extremely reliable, even if labeled “best effort.” Overprovisioning of bandwidth (for normal situations) and backbone routing diversity have created an expectation of high service levels among Internet users. “Cloud” architectures and the structure of modern business organizations are built on this expectation of Internet quality and reliability.
But at the extreme edges of the network that will make up the vast statistical majority of the IoT, connections may often be intermittent and inconsistent in quality. Devices may be switched off at times or powered by solar cells with limited battery back-up. Wireless connections may be of low bandwidth or shared among multiple devices.
Traditional protocols such as TCP/IP are designed to deal with lossy and inconsistent connections by resending data. Even though the data flowing to or from any individual IoT device may be exceedingly small, it will grow quite large in aggregate IoT traffic. The inefficiencies of resending vast quantities of mostly individually unimportant data are clearly an unnecessary redundancy.
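The cost of resending can be estimated with a back-of-the-envelope model, sketched here in Python. It assumes that each transmission is lost independently with a fixed probability and that a message is resent until it gets through, so the expected number of sends per message is 1 / (1 - p). Real TCP behaviour is considerably more involved; the device count and payload size below are made-up figures.

```python
def expected_sends(loss_probability: float) -> float:
    """Expected transmissions per message when resending until success,
    assuming independent losses (a simplified geometric model)."""
    if not 0 <= loss_probability < 1:
        raise ValueError("loss_probability must be in [0, 1)")
    return 1 / (1 - loss_probability)


def daily_traffic_bytes(devices, bytes_per_device, loss_probability, resend):
    """Aggregate bytes put on the network per day under the model above."""
    sends = expected_sends(loss_probability) if resend else 1.0
    return devices * bytes_per_device * sends


if __name__ == "__main__":
    # Hypothetical figures: one million devices, 300 bytes each per day.
    for resend in (False, True):
        total = daily_traffic_bytes(1_000_000, 300, 0.5, resend)
        print("resend" if resend else "no resend", total / 1e6, "MB/day")
```

Under these assumptions a link that drops half of its packets doubles the transmitted volume when everything is resent, while skipping retransmission keeps the volume constant at the price of gaps in the data.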
Sensors play an integral role in numerous modern industrial applications, including food processing and everyday monitoring of activities such as transport, air quality, medical therapeutics, and many more. While sensors have been with us for more than a century, modern sensors with integrated information and communications technology (ICT) capabilities, known as smart sensors, have been around for little more than three decades. Remarkable progress has been made in computational capabilities, storage, energy management, and a variety of form factors, connectivity options, and software development environments. These advances have occurred in parallel to a significant evolution in sensing capabilities. We have witnessed the emergence of biosensors that are now found in a variety of consumer products, such as tests for pregnancy, cholesterol, allergies, and fertility.
The development and rapid commercialization of low-cost microelectromechanical systems (MEMS) sensors, such as 3D accelerometers, has led to their integration into a diverse range of devices extending from cars to smartphones. Affordable semiconductor sensors have catalyzed new areas of ambient sensing platforms, such as those for home air-quality monitoring. The diverse range of low-cost sensors fostered the emergence of pervasive sensing. Sensors and sensor networks can now be worn or integrated into our living environment or even into our clothing with minimal effect on our daily lives. Data from these sensors promises to support new proactive healthcare paradigms with early detection of potential issues, for example, heart disease risk (elevated cholesterol levels), liver disease (elevated bilirubin levels in urine), anemia (ferritin levels in blood), and so forth. Sensors are increasingly used to monitor daily activities, such as exercise, with instant access to our performance through smartphones.
The relationship between our well-being and our ambient environment is undergoing significant change. Sensor technologies now empower ordinary citizens with information about air and water quality and other environmental issues, such as noise pollution. Sharing and socializing this data online supports the evolving concepts of citizen-led sensing. As people contribute their data online, crowdsourced maps of parameters such as air quality over large geographical areas can be generated and shared.
Sensors utilize a wide spectrum of transducer and signal transformation approaches with corresponding variations in technical complexity. These range from relatively simple temperature measurement based on a bimetallic thermocouple, to the detection of specific bacteria species using sophisticated optical systems. Within the healthcare, wellness, and environmental domains, there are a variety of sensing approaches, including microelectromechanical systems (MEMS), optical, mechanical, electrochemical, semiconductor, and biosensing. The proliferation of sensor-based applications is growing across a range of sensing targets such as air, water, bacteria, movement, and physiology. As with any form of technology, sensors have both strengths and weaknesses. Operational performance may be a function of the transduction method, the deployment environment, or the system components.
Key Sensor Modalities
Each sensor type offers different levels of accuracy, sensitivity, specificity, or ability to operate in different environmental conditions. There are also cost considerations. More expensive sensors typically have more sophisticated features that generally offer better performance characteristics. Sensors can be used to measure quantities of interest in three ways:
• Contact: This approach requires physical contact with the quantity of interest. There are many classes to sense in this way—liquids, gases, objects such as the human body, and more. Deployment of such sensors obviously perturbs the state of the sample or subject to some degree. The type and the extent of this impact is application-specific. Let us look at the example of human body-related applications in more detail.
Comfort and biocompatibility are important considerations for on-body contact sensing. For example, sensors can cause issues such as skin irritation when left in contact for extended periods of time. Fouling of the sensor may also be an issue, and methods to minimize these effects are critical for sensors that have to remain in place for long durations. Contact sensors may have restrictions on size and enclosure design. Contact sensing is commonly used in healthcare- and wellness-oriented applications, particularly where physiological measurements are required, such as in electrocardiography (ECG), electromyography (EMG), and electroencephalography (EEG). The response time of contact sensors is determined by the speed at which the quantity of interest is transported to the measurement site. For example, sensors such as ECGs that measure an electrical signal have a very fast response time. In comparison, the response time of galvanic skin response (GSR) is slower, as it requires the transport of sweat to an electrode, a slower process. Contact surface effects, such as the quality of the electrical contact between an electrode and the subject’s skin, also play a role. Poor contact can result in signal noise and the introduction of signal artifacts.
On-body contact sensing can be further categorized in terms of the degree of “invasion” or impact. Invasive sensors are those, for example, introduced into human organs through small incisions or into blood vessels, perhaps for in vivo glucose sensing or blood pressure monitoring. Minimally invasive sensing includes patch-type devices on the skin that monitor interstitial fluids. Non-invasive sensors simply have contact with the body without effect, as with pulse oximetry.
• Noncontact: This form of sensing does not require direct contact with the quantity of interest. This approach has the advantage of minimum perturbation of the subject or sample. It is commonly used in ambient sensing applications, that is, applications based on sensors that are ideally hidden from view and that, for example, track daily activities and behaviors of individuals in their own homes. Such applications must have minimum impact on the environment or subject of interest in order to preserve state. Sensors used in noncontact modes, such as passive infrared (PIR) sensors, generally have fast response times.
• Sample removal: This approach involves an invasive collection of a representative sample by a human or automated sampling system. Sample removal commonly occurs in healthcare and environmental applications, to monitor E. coli in water or glucose levels in blood, for example. Such samples may be analyzed using either sensors or laboratory-based analytical instrumentation.
With sensor-based approaches, small, hand-held, perhaps disposable sensors are commonly used, particularly where rapid measurements are required. The sensor is typically in close proximity to the sample collection site, as is the case with a blood glucose sensor. Such sensors are increasingly being integrated with computing capabilities to provide sophisticated features, such as data processing, presentation, storage, and remote connectivity. Analytical instruments, in contrast, generally have no size limitations and typically contain a variety of sophisticated features, such as autocalibration or inter-sample auto-cleaning and regeneration. Sample preparation is normally required before analysis. Some instruments include sample preparation as an integrated capability. Results for nonbiological samples are generally fast and very accurate. Biological analysis, such as bacteria detection, is usually slower, taking hours or days.
Web Real-Time Communication (WebRTC) is a new standard and industry effort that extends the web browsing model. For the first time, browsers are able to directly exchange real-time media with other browsers in a peer-to-peer fashion.
The standardization goal is to define a WebRTC API that enables a web application running on any device, through secure access to the input peripherals (such as webcams and microphones), to exchange real-time media and data with a remote party in a peer-to-peer fashion.
The classic web architecture semantics are based on a client-server paradigm, where browsers send an HTTP (Hypertext Transfer Protocol) request for content to the web server, which replies with a response containing the information requested.
The resources provided by a server are closely associated with an entity known by a URI (Uniform Resource Identifier) or URL (Uniform Resource Locator).
WebRTC extends the client-server semantics by introducing a peer-to-peer communication paradigm between browsers.
In the WebRTC Trapezoid model, both browsers are running a web application, which is downloaded from a different web server. Signaling messages are used to set up and terminate communications. They are transported by the HTTP or WebSocket protocol via web servers that can modify, translate, or manage them as needed. It is worth noting that the signaling between browser and server is not standardized in WebRTC, as it is considered to be part of the application. As to the data path, a PeerConnection allows media to flow directly between browsers without any intervening servers. The two web servers can communicate using a standard signaling protocol such as SIP or Jingle (XEP-0166). Otherwise, they can use a proprietary signaling protocol.
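The separation between the signaling path and the media path in the Trapezoid can be illustrated with a purely in-memory Python simulation. It does not use any real WebRTC API: the `Browser` and `WebServer` classes, the JSON message layout and the placeholder SDP strings are all hypothetical, reflecting the fact that browser-to-server signaling is left to the application.

```python
import json


class Browser:
    """Stand-in for a browser running a WebRTC web application."""

    def __init__(self, name):
        self.name = name
        self.inbox = []            # signaling messages relayed by servers
        self.media_received = []   # media arriving directly from the peer

    def receive_signal(self, raw):
        self.inbox.append(json.loads(raw))


class WebServer:
    """Relays signaling between its own browser and the peer web server."""

    def __init__(self, name):
        self.name = name
        self.browser = None
        self.peer_server = None
        self.relayed = 0

    def from_browser(self, raw):
        self.relayed += 1
        self.peer_server.from_server(raw)   # e.g. over SIP or Jingle

    def from_server(self, raw):
        self.relayed += 1
        self.browser.receive_signal(raw)    # e.g. over HTTP or WebSocket


def run_call():
    """Set up the trapezoid, exchange offer/answer, then send media."""
    alice, bob = Browser("alice"), Browser("bob")
    server_a, server_b = WebServer("A"), WebServer("B")
    server_a.browser, server_b.browser = alice, bob
    server_a.peer_server, server_b.peer_server = server_b, server_a

    # Signaling path: browser -> own server -> peer server -> peer browser.
    offer = {"type": "offer", "from": "alice", "sdp": "<placeholder>"}
    server_a.from_browser(json.dumps(offer))
    answer = {"type": "answer", "from": "bob", "sdp": "<placeholder>"}
    server_b.from_browser(json.dumps(answer))

    # Media path: directly between the browsers, bypassing both servers.
    bob.media_received.append("audio-frame-from-alice")
    alice.media_received.append("audio-frame-from-bob")
    return alice, bob, server_a, server_b


if __name__ == "__main__":
    alice, bob, server_a, server_b = run_call()
    print("bob got:", bob.inbox[0]["type"], "| alice got:", alice.inbox[0]["type"])
    print("messages relayed by servers:", server_a.relayed + server_b.relayed)
```

Only the offer and answer touch the servers in this sketch; the media frames never do, which mirrors the role of the PeerConnection in the data path.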
The most common WebRTC scenario is likely to be the one where both browsers are running the same web application, downloaded from the same web page.
WebRTC in the browser
The web application also interacts with the browser, using both WebRTC and other standardized APIs, both proactively (e.g., to query browser capabilities) and reactively (e.g., to receive browser-generated notifications).
The WebRTC API must therefore provide a wide set of functions, like connection management (in a peer-to-peer fashion), encoding/decoding capabilities negotiation, selection and control, media control, firewall and NAT element traversal, etc.
Let us imagine a real-time audio and video call between two browsers. Communication, in such a scenario, might involve direct media streams between the two browsers, with the media path negotiated and instantiated through a complex sequence of interactions involving the following entities: