Specific Patterns of Web 2.0: Chapter 7 - Web 2.0 Architectures

by James Governor, Duane Nickull, and Dion Hinchcliffe

“Art is the imposing of a pattern on experience, and our aesthetic enjoyment is recognition of the pattern.”

--Alfred North Whitehead

This excerpt is from Web 2.0 Architectures. This fascinating book puts substance behind Web 2.0. Using several high-profile Web 2.0 companies as examples, authors Duane Nickull, Dion Hinchcliffe, and James Governor have distilled the core patterns of Web 2.0 coupled with an abstract model and reference architecture. The result is a base of knowledge that developers, business people, futurists, and entrepreneurs can understand and use as a source of ideas and inspiration.


It’s been a long climb, but it’s finally time to explore some Web 2.0 patterns. This chapter could not feasibly contain an exhaustive list of all patterns associated with Web 2.0; new patterns are likely evolving as you read this sentence. Nonetheless, the patterns presented here should continue to provide a foundation for applications well into the future, even as the bigger picture continues to change.

Unlike the rest of this book, the pattern descriptions in this chapter are meant as reference material. We do not expect that you will read the chapter from start to finish, but rather that you’ll refer to sections on certain patterns as the need arises. You should feel welcome to read, re-read, and circle back over the individual pattern discussions.

Finally, a note about ordering. This chapter presents the more foundational patterns first, so that patterns on which other patterns depend appear before the patterns that rely on them. For example, the Mashup pattern relies upon the Service-Oriented Architecture (SOA) pattern, so we discuss the SOA pattern first.

The Service-Oriented Architecture Pattern

Also Known As

Other terms you may see in conjunction with the Service-Oriented Architecture pattern include:

Services Oriented Architecture

The extra s, often used by nontechnical users, doesn’t change the meaning.

Services

Developers who work outside of the enterprise development world often just think of SOA as “services,” building them into individual applications without worrying about the connecting architecture between applications. The lessons of SOA, especially those of opacity, still apply even in simpler contexts.

Event-Driven Architecture (EDA)

All SOAs are, by design, event-driven; however, several new architectures are emerging that actually take note of the events and even add the concept of event causality (the relationship between two or more events). The pattern of Event-Driven Architecture, defined by the analyst firm Gartner,[70] is the most prominent of these.

Business Problem (Story)

The SOA story is most easily told within the context of a single organization, though once the problems have been broken down into more manageable pieces, those pieces can be more widely distributed and shared with ease, in particular over the Web.

Consider an enterprise that has a set of systems or applications that its employees use to accomplish various tasks. Each system is large and encapsulates a lot of functionality. Some of the functionality is common to more than one system, such as authenticating users before granting them access privileges to the system. Figure 7.1, “A view of the business problem” shows a set of systems, each containing its own module for logging in and authenticating users, a persistent record of the users’ names and addresses, functionality for changing their names and addresses, and some human resources (HR) information. Each vertical grouping represents an application that fulfills a process. The only thing that differs between them is their central focus: payroll, insurance, or employee relations (ER).

Figure 7.1. A view of the business problem


The systems also sometimes communicate with each other. They are arranged in a stovepipe fashion: communication between specific systems takes place according to rules specified for those systems. The nature of the relationships between connected systems varies greatly, but the relationships are often bound to the specific application environment used to create the systems. In some cases, the integrations are tightly coupled, meaning that the systems have certain dependencies on each other’s environments.

Maintaining these systems has become an expensive task for the enterprise, and the arrangement is both repetitive and fragile. Each time an employee changes her password, she has to log into every system and reset that password. Replacing a system or upgrading it to a newer version requires an extensive series of studies to understand what impact the change might have on other connected systems.

The enterprise is tired of the repetition and fragility, and is starting to realize another cost: it implements a variety of functions that other organizations provide at lower cost. By creating IT “hooks” into which external companies can tie their systems, the enterprise can take advantage of those companies’ services.

Corporate acquisitions and mergers raise similar issues. When two companies come together, it makes little sense for the combined enterprise to maintain two customer relationship management (CRM) systems, two authentication directories, two payroll systems, two enterprise resource planning (ERP) systems, and so on.

Context

This pattern occurs in any context in which IT systems or functionality must be made usable for a large group of potential consumers (those with needs), both inside and outside a specific domain of ownership.

This pattern also occurs anywhere enterprises need their IT to be agile in support of their business strategies, such as outsourcing, acquisitions and mergers, and repurposing functionality to save IT budgets.

Derived Requirements

Each system that the enterprise owns and maintains must have a clearly defined interface so that it can be repurposed and used by multiple other applications, including external systems. Each interface should be linked to a specific policy whose terms a consumer must agree to comply with in order to use the service, thereby entering into a contract of sorts.

Because the exposed services may exist in different domains of ownership, several key artifacts must be present to facilitate interoperation with the enterprise’s guidelines. These include the ability to declare policies on services, the ability to describe what services do in both technical and business terms, security, governance, and several other related functions.

To facilitate coupling to the services, a common set of protocols and standards must be used alongside a standard model for the services. Ideally, use of common protocols and standards will create a service bus within the organization, enabling most applications to talk to each other without being hardwired together. Agility also arises from being able to orchestrate multiple services to facilitate an automated workflow or business process.

It’s also important that all consumers use the same data formats as the service providers. This requirement logically encompasses the artifacts that describe various aspects of a service (e.g., the use of WSDL as the format for describing a web service).

To alleviate problems with managing the functionality behind the services, the service interfaces should be as opaque as possible, preventing service consumers from tying into the application delivering the services any more deeply than required. This makes it easier to manage the functionality and capabilities behind the services, including replacing systems or changing external providers, without endangering the functionality of the systems using the services. Ideally, service consumers should know only about the service interface, and nothing about how the functionality is being fulfilled.

Generalized Solution

SOA is an architectural paradigm used to organize capabilities and functionality under different domains of ownership and to allow interactions between the consumers of capabilities and the capabilities themselves via a service. It can also be defined as an abstract action boundary that facilitates interactions between those with needs and those with capabilities. Organizations adopting this strategy can make all of their core functionality that is shared by two or more consumers available as opaque services. Opacity isolates the service consumer from the internal details of how a capability is fulfilled and helps keep the components of a system from becoming too tightly bound to each other. (Most public services will want to be opaque in any case, for security reasons.)

Static Structure

The Organization for the Advancement of Structured Information Standards (OASIS) Reference Model for Service-Oriented Architecture depicts SOA as an abstract pattern—not tied to any specific technologies, standards, or protocols—built from the components shown in Figure 7.2, “The core reference model for SOA”.

Figure 7.2. The core reference model for SOA


The service is the core action boundary that enables consumption or use of the capabilities lying behind it by consumers or users in need of the functionality it provides. Those who deploy services manage opacity and transparency, meaning that the services fulfill invocation requests without necessarily allowing the consumer to know the details of how the request was carried out behind the service interface.

Because services fall under various domains of ownership, each service has a set of policies in place that govern its use. Failure to comply with the terms of those policies may result in service invocation requests being denied, whereas agreement and compliance with them implies that a contract has been built between the consumer and the service provider, possibly involving one or more proxies in the middle. Examples involving proxies include services that are consumed and then offered to another consumer. We describe such a scenario in our discussion of the peer-to-peer (P2P) models in Chapter 3, Dissecting Web 2.0 Examples (see Figure 3.15, “Ad hoc P2P network” in that chapter).

For a service to be consumed, it must be reachable by and visible to the consumer. This is typically accomplished via some common network fabric or bus, such as the Internet, although it can be implemented in other manners, such as via Bluetooth, radio broadcast/multicast/unicast/anycast, and many other means. The most common implementations of SOA use web services over HTTP, Asynchronous JavaScript and XML (AJAX), and several variations of Representational State Transfer (REST)-style (sometimes referred to as “RESTful SOA”) services using different technologies, including plain old HTTP.

A service provider must describe a service, including all its relevant details (properties, location, policies, data models, behavior models, and so on), in such a manner that potential consumers can inspect and understand it. Web services are commonly described using a W3C recommendation called the Web Services Description Language (WSDL), an XML language for declaring relevant aspects of a service’s operations. Simple HTTP services might, in contrast, be described via a URL and/or simply through documentation.

The real-world effect of consuming a service is a tangible change of state based on interactions with that service. For example, the real-world effect of invoking an Amazon.com web service to purchase a book might be that you create a real-world contract to complete the purchase by paying for a book that will arrive at the delivery destination. Real-world effects are, of course, dependent upon the type of service with which a consumer interacts. In some cases, a real-world effect may happen even when the service invocation is not completed successfully. For example, you might not complete the book ordering process, but due to some internal error the service might deduct the book from its internal inventory, or your credit card might be billed even though the book was never scheduled for delivery. (Or perhaps in a nicer situation, you might receive a notification that the process wasn’t completed.)

The service interaction model is much more complex than we’ve shown so far and is worth decomposing further. We can subdivide it into two smaller components: the information model and the behavior model (see Figure 7.3, “SOA interaction model decomposition”).

Figure 7.3. SOA interaction model decomposition


The information model governs all the data passed into and out of a service. Typically, the data is constrained using some form of declaration. An example might be an XML schema that constrains data going into a service to only two signed integers, serialized as XML.
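To make this concrete, here is a minimal Perl sketch, using the CPAN module XML::LibXML, of a service validating its input against such a schema. The schema, element names, and sample payload are illustrative, not part of any particular service:

use XML::LibXML;

# An illustrative schema: the service accepts exactly two signed integers.
my $xsd = <<'XSD';
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="addRequest">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="a" type="xs:int"/>
        <xs:element name="b" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

my $schema = XML::LibXML::Schema->new( string => $xsd );
my $doc    = XML::LibXML->new->parse_string(
    '<addRequest><a>2</a><b>40</b></addRequest>' );

# validate() dies on invalid input, so wrap it in eval
eval { $schema->validate($doc) };
print $@ ? "rejected: $@" : "accepted\n";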

The behavior model governs all the patterns of interaction or service invocation, on both sides of the service interface boundary. The model might be a simple idempotent request/response pair or a longer-running subscribe/push pattern where the service keeps sending responses back to the consumer until the consumer sends another message to unsubscribe.

The service’s execution context is the overall set of circumstances in which the service invocation life cycle plays out. A service’s execution may be affected by business rules, legislative rules, the role of the service consumer, and the state of the service provider. We used Starbucks as an example in Chapter 4, Modeling Web 2.0 to illustrate this concept (see the section called “Services”).

Dynamic Behavior

The dynamic pattern of interaction with a service describes the behavior of all actors working with the service at runtime. Potential consumers must first be aware of the service’s existence and purpose, including its real-world effects, before they can consume the service. In the electronic world, it’s best to use a specific, standard set of protocols and technologies across all service and consumer interfaces, creating an ad hoc service bus. There are many interaction models with services, but those described here are the most common.

Request/Response

Request/Response is a pattern in which the service consumer uses configured client software to issue an invocation request to a service provided by the service provider. The request results in an optional response (see Figure 7.4, “SOA Request/Response pattern”). Request/Response is by far the most common interaction pattern on the public Web.
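As a minimal illustration, the following Perl sketch uses the CPAN module LWP::UserAgent to play the consumer’s role in a Request/Response exchange over HTTP. The service endpoint URL is hypothetical:

use LWP::UserAgent;

my $ua  = LWP::UserAgent->new( timeout => 10 );
my $res = $ua->get('http://example.com/services/getAddress?user=jdoe');

if ( $res->is_success ) {
    print $res->decoded_content;    # the service's response payload
}
else {
    warn 'Request failed: ' . $res->status_line;
}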

Figure 7.4. SOA Request/Response pattern


Request/Response via service registry

An optional service registry can help the service consumer automatically configure certain aspects of its service client. The service provider pushes changes regarding the service’s details to the registry to which the consumer has subscribed, which sends the consumer notifications about the changes. The service consumer can then reconfigure the service client to talk to the service. Figure 7.5, “SOA Request/Response pattern with a service registry in the equation” represents this process conceptually.

Figure 7.5. SOA Request/Response pattern with a service registry in the equation


Subscribe/Push

A third pattern for interaction is called Subscribe/Push. In this pattern, one or more clients register subscriptions with a service to receive messages based on some criteria. Regardless of the criteria, the externally visible pattern remains the same. The Subscribe/Push pattern can be as simple as clients subscribing to a mailing list, but it supports a wide range of much more complicated functionality.

Note

Readers who are familiar with Message-Oriented Middleware (MOM) may wish to contrast this pattern with those systems. MOM is usually configured to push asynchronous messages rather than always adhering to the Request/Response pattern. These types of patterns are becoming more common within SOA infrastructure, and you can read about them in David A. Chappell’s Enterprise Service Bus (O’Reilly).[71]

Subscriptions may remain in effect over long periods before being canceled or revoked. A subscription may, in some cases, also register additional service endpoints to receive notifications. For example, an emergency management system may notify all fire stations in the event of a major earthquake using a common language such as the OASIS Common Alerting Protocol (CAP).[72] An example of the Subscribe/Push pattern appears in Figure 7.6, “SOA Subscribe/Push pattern”.

Figure 7.6. SOA Subscribe/Push pattern

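In web terms, a subscription is often established by registering a callback endpoint with the service, which then pushes notifications to that endpoint until the subscription is canceled. The following Perl sketch shows a consumer doing exactly that; the service URL, callback URL, and parameter names are all hypothetical:

use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

my $ua = LWP::UserAgent->new;

# Register a callback endpoint to receive pushed alerts.
my $res = $ua->request(
    POST 'http://example.com/services/alerts/subscribe',
    [ callback => 'http://consumer.example.org/notify',
      topic    => 'earthquake' ] );
die 'Subscription failed: ' . $res->status_line unless $res->is_success;

# Later, cancel the subscription.
$ua->request(
    POST 'http://example.com/services/alerts/unsubscribe',
    [ callback => 'http://consumer.example.org/notify' ] );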

Probe and Match

The Probe and Match pattern is used for discovery of services. In this variation, shown in Figure 7.7, “SOA Probe and Match pattern”, a single client may multicast or broadcast a message to several endpoints on a single network fabric, prompting them to respond based on certain criteria. For example, this pattern may be used to determine whether large numbers of servers on a server farm are capable of handling more traffic, based on whether they are all running at less than 50% capacity. This variation of the SOA message exchange pattern may also be used to locate specific services. There are caveats to using such a pattern, as it may become bandwidth-intensive if used often. Using a registry or another centralized metadata facility may be a better option, because the registry interaction does not require sending probe() messages to all the endpoints to find one; by convention, registries allow a query to locate the desired endpoint using a filter query or other search algorithm.

Figure 7.7. SOA Probe and Match pattern


In the Probe and Match scenario in Figure 7.7, “SOA Probe and Match pattern”, the service client probes three services, yet only the middle one returns a successful match message. A hybrid approach could use the best of both the registry and probe and match models for locating service endpoints. For example, registry software could implement a probe interface to allow service location without requiring wire transactions going to all endpoints, and the searching mechanism could probe multiple registries at the same time.
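A minimal Perl sketch of the probing side, using a UDP broadcast on the local subnet; the port, payload, and reply format are all illustrative:

use IO::Socket::INET;
use IO::Select;
use Socket qw(sockaddr_in inet_aton);

my $sock = IO::Socket::INET->new(
    Proto     => 'udp',
    Broadcast => 1,
) or die "socket: $!";

# Broadcast the probe to every endpoint on the subnet.
my $dest = sockaddr_in( 9999, inet_aton('255.255.255.255') );
send( $sock, 'probe: capacity-below-50%?', 0, $dest ) or die "send: $!";

# Collect match responses, waiting up to two seconds for each reply.
my $sel = IO::Select->new($sock);
while ( $sel->can_read(2) ) {
    $sock->recv( my $reply, 1024 );
    print "match: $reply\n" if $reply =~ /^match/;
}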

There are several other variations of SOA message exchange patterns; however, an exhaustive list would be beyond the scope of this book.

Implementation

When implementing SOA, architects have to deal with several key issues:

Protocols and standards

Implementers need to make sure that a common set of protocols and standards is employed to ensure that users can communicate with all services without requiring custom client software for each service. Such noninteroperability would constitute an anti-pattern of SOA. For example, if all of a provider’s services use Simple Object Access Protocol (SOAP) over HTTP to communicate, it’s much easier to wire together multiple services for a common purpose. If each service uses a different protocol and the service responses come back in differing data formats (e.g., one in XML, another in EDI, another in CSV, and another in plain ASCII), the work required to make the service bus operate might be disproportionately greater than the work involved in building a self-contained application from the ground up. On the public Web, the cost of variations tends to fall on consumers. Applying common practices and protocols makes sense and will help adoption rates.

Security

Some services are open to anyone, but most have a limited number of acceptable users and roles, and service providers try to limit use to that list.

Denial-of-service (DoS) and other attacks

Implementers must work to minimize the impact of potential DoS attacks on mission-critical services. Keeping the impact of a failed connection as small as possible is a good foundation. In more intricate systems, one safeguard may be to require every service consumer to use a registry to get the endpoint coordinates for the service and to configure its service client appropriately before talking to the service. In the event of a DoS attack, this will allow the service provider to dynamically redirect legitimate traffic to a new service endpoint that is unknown to the attacker in order to avoid interruptions in service provision. It will also deny DoS attackers the ability to even send a message to the new endpoint; a ring of proxies on the perimeter will drop their traffic. Of course, a smart attacker might then target the registry, so a hybrid approach would be more secure.

Governance

Service providers monitor service invocation requests during their life cycles to make sure they can scale the number of active service endpoints to meet demand in peak times. This is particularly important if the services perform some mission-critical function (like routing a 911 telephone call during a major emergency such as Hurricane Katrina). Additionally, service providers need to monitor the real-world effects of what their services are allowing consumers to do. For example, if you typically build a product at a rate of 1,000 units per month and you receive a purchase request for 30,000 units via one of your services, you’ll need to carefully consider the impact of the request because you will not be able to deliver the product in a timely manner.

Business Problem (Story) Resolved

The architects can apply the SOA pattern and refactor their IT system to be service-oriented, as illustrated in Figure 7.8. The core service platform contains several components, each with specialized tasks to fulfill. The core service container governs service invocation requests during their life cycles. This container keeps track of the state of each service invocation request and monitors any conditions that may affect service fulfillment. If a service request must use some capabilities in another system, such as the database (depicted at the bottom of Figure 7.8), the service container may route the service request to the relevant component and track timeouts or other errors that might result in unsuccessful service invocation.

The invocation layer is where all service requests are started. It can contain multiple types of mechanisms for kicking off service requests. The human actor may simply use a form to log into a system or systems. Note that, unlike in Figure 7.1, a single authentication service is now shared across all applications within the enterprise’s platform. This is known as single sign-on (SSO). Likewise, there is only one data persistence component. This saves the IT staff money as well as time, because it now has to look after only one system rather than several.

Figure 7.8. The enterprise’s systems, previously shown in Figure 7.1, refactored as an SOA


The runtime service registry/repository can also aid in directing requests to the appropriate core enterprise systems to handle the functional aspects of service fulfillment. Any number of enterprise applications can be added to the core service platform via the service provider interface, which uses a common set of protocols and standards to enable interoperability with the service platform. This, in effect, forms a special type of bus often called an enterprise service bus (ESB).

In our example, the enterprise has also outsourced one function, payroll, by allowing a third party to hook into its system via a messaging endpoint. Similar hooks could be added should the company decide to outsource other functions or act as a resource for other enterprises in the future.

The SOA infrastructure is very nimble, and new services can be added with relative ease. If an application has to be replaced, as long as the new application supports the existing service interface, there will be no interruptions to the rest of the system.

Specializations

There are too many specializations of the SOA architecture to list in this book. Some add service aggregation (also known as service composition) to combine services or build composite applications. Others use workflow engines to orchestrate multiple services into a business workflow. Business process automation is another consideration that is built on SOA, but it is not part of SOA itself.

Known Uses

There are several known uses of the SOA pattern. For instance, Microsoft’s .NET architecture exemplifies many of the concepts we’ve discussed. IBM, Oracle, JBoss,[73] Sun Microsystems, Red Hat, TIBCO, and BEA also have similar ESB SOA functionality built into their application servers.[74] JBossESB is based on a loosely coupled message pattern approach in which contracts between endpoints are defined within the messages and not at the service interface boundary. It uses SOA principles within the ESB.

J2EE application servers offered by various vendors can be used alongside secondary web applications (such as Adobe’s LiveCycle Enterprise Suite) as a core SOA platform to provide business-related services and aggregate them into processes. Although many software vendors offer SOA infrastructures or components, it is also important to note that a thriving open source community is rapidly building many components for SOA, and many standards bodies are making protocols, standards, and technologies available to the masses.

The technology family most often used for implementing SOA is referred to as “web services.” However, simply using web services standards does not necessarily mean that a service-oriented architecture has been built.

Options for the payloads are too numerous to list; most are based on XML. The more commonly used payload formats are binary formats and specialized XML vocabularies from groups such as OASIS, including the Universal Business Language (UBL) and the Common Alerting Protocol (CAP).

On the public Web, the SOA pattern is also used to create public services. While public services typically act to connect users outside of a process boundary, the same best practices apply. Consistency, ease of use, and reliability are just as important (if not more so) for public services as they are inside of an enterprise.

Consequences

SOA architectures allow the elimination of many dependencies between systems, because the service acts as an action boundary to cleanly separate the consumer from the provider. Duplication of systems can be reduced or possibly eliminated if services can be repurposed among multiple consumers. IT management should become easier if SOA is used to architect systems that can be cleanly separated so that functionality can be easily isolated for testing when things go wrong.

Many other Web 2.0 patterns in this book depend on SOA. The Mashup and Software as a Service (SaaS) patterns, for example, rely on a layer of services that can be mixed and matched to create new and powerful applications and experiences for users.

Unauthorized access to key systems is a real risk that service providers take when exposing their services via the Internet. Overuse of a single service might also become a factor that results in a service no longer being functional for its intended users.

Architects have to consider which interfaces or systems (capabilities) are potential candidates for becoming services. If a capability is used by only one other system, it may not be a suitable candidate. On the other hand, if it is used by more than one process (for example, login/authentication), it might be ideal for implementing as a service that can be repurposed for several processes or clients.

The Software as a Service (SaaS) Pattern

Also Known As

Terms often associated with the Software as a Service pattern include:

Utility Computing and Cloud Computing

Cloud Computing is not the same as SaaS; rather, it is a specialized pattern of virtualization. Utility and Cloud Computing refer to treating computing resources as virtualized, metered services, similar from a consumer’s perspective to how we consume other utilities (such as water, gas, electricity, and pay-per-view cable).

On-demand applications

On-demand applications provide access to computing resources on an ad hoc basis when the functionality is required. Using the http://createpdf.adobe.com service to create a single PDF document online rather than having to download and install a version of Acrobat is a good example.

Software Above the Level of a Single Device

This pattern relates to software that spans Internet-connected devices and builds on the growing pervasiveness of the online experience. It touches on various aspects of the SaaS pattern; in particular, the concept of distributing computing tasks via network connections.

Model-View-Controller (MVC)

Some people consider SaaS a specialization of the MVC pattern that distributes the Model, View, and Controller (or parts thereof) over multiple resources located on the Internet. MVC is strongly related to SaaS, and most SaaS providers follow the MVC pattern when implementing SaaS.

Business Problem (Story)

Consider a software vendor that wishes to develop spam removal software to keep spam from reaching its clients’ inboxes. This can be achieved by writing algorithms that analyze incoming local email messages, detect possible spam messages, and flag them in such a way that the mailbox owners can filter them out automatically without having to manually sort through all the messages.

The business problem arises from the fact that spam is constantly changing, which makes it difficult to detect and flag. Spam is typically sent from a variety of spoofed email addresses, and the specific text patterns are changed frequently. For example, if you wanted to detect any spam that had the word “Viagra” in it, you could simply use Perl’s regular expression matching syntax:

if ($emailString =~ m/viagra/)
{
    $SpamScore += 1;
}

However, all the spammer would have to do is alter the case of some of the letters to thwart this detection, as in the following:

"ViAGRA"

You could counter this in Perl by adding an i flag to ignore case, as follows:

if ($emailString =~ m/viagra/i)
{
    $SpamScore += 1;
}

However, the spammer could then substitute an exclamation point, the number 1, or the letter l for the letter i, capitalizing on the fact that the human mind will perceive “Viagra” if it sees “V!AGRA,” “V1AGRA,” or “VlAGRA.” To a human these terms might semantically be the same, but changing one byte from an i to another character will render useless the efforts of a computer trying to filter spam based on a fixed string of characters. Each possible mutation would require the software vendor to write and distribute new patches to detect the latest variations to each client, possibly on a daily or even hourly basis. In this case, the Perl syntax could be changed to:

if ($emailString =~ m/v.agra/i)   # "." matches any single character in place of the "i"
{
    $SpamScore += 1;
}

The ballet between those who create and send spam and those who try to detect and delete it is a constantly morphing work in progress, with new steps being introduced every day. Each individual the company serves could attempt to create these rules by himself for his own mailbox, but this would be both ineffective and inefficient. Users would each sample only a small subset of all spam, would not be able to easily create heuristic filters to detect spam, and would likely spend an inordinately large amount of time on this activity.

Context

The SaaS pattern is useful any time a customer base has needs that could be addressed more efficiently or reliably by creating a service all of them can share across organizational boundaries.

This pattern occurs whenever a person or organization is building an application whose model, control, or view aspects must be refreshed based on dynamic circumstances or instances in which specialized functionality of the application must be delivered. The pattern could apply anywhere a static application does not easily lend itself to frequent specialization of the model, view, or control aspects required to make it function properly.

The pattern is useful in situations in which users need more computer resources than they can easily support on their local systems and in those situations where users need particular computing resources only occasionally.

Derived Requirements

Computing resources should be architected to be reachable (as discussed in the section on SOA) over whatever network or fabric the architect designs the application to work with. For example, most web-enabled SaaS applications use a common transport protocol, and most ham radio operators use a common frequency to broadcast information or pass it along in a chain.

Functional components of the core computing resources must be usable via a well-defined interface. Such an interface should not be bound to a single client or single model for delivery (such as installation of an application) and should support multiple options for building the user interface (e.g., web-based or client application interface).

Generalized Solution

SaaS is a model of software delivery in which the manufacturer is responsible for the daily technical operation of the software provided to the clients (including maintenance and support), while the clients enjoy the full benefits of the software from remote locations. SaaS is a model of “functionality delivery” rather than “software distribution.” Most of the functionality can be delivered over the Internet or made available in such a way that the end user can interact with the application to get that functionality without having to install the software on her machine. This approach can deliver functionality to any market segment, from home consumers to corporations, and hybrid solutions can deliver small pieces of client-side software that make certain tasks easier.

Static Structure

The basic deployment pattern for SaaS involves deploying different aspects of the model, view, and control components of an application to multiple physical locations. The deployment approach may vary greatly depending on the software and its complexity and dependence on other aspects. Figure 7.9, “Deployment patterns contrasted (SaaS versus conventional approach)” shows how the basic deployment pattern for SaaS differs from traditional software distribution.

Figure 7.9. Deployment patterns contrasted (SaaS versus conventional approach)


The service should also be able to learn from its users when appropriate. This concept of “software that gets better as more people use it,” a hallmark of Web 2.0, has many advantages. For example, in the business story shown in Figure 7.10, “Spam filter software as a service”, if enough email flows through a pattern detector, spam recognition becomes much more accurate based on the collective interactions of thousands of users. As more and more users flag the same messages as spam, the server will begin to recognize those messages as spam, and the global filter will then prevent them from being delivered to other end users. Many readers probably use this type of functionality already without really knowing it.
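As a rough sketch of how such collective flagging might work, assume the provider keeps a count of flags per message signature and promotes a message to the global filter once the count crosses a threshold. The digest function, threshold, and in-memory storage here are all illustrative:

use Digest::MD5 qw(md5_hex);

my %flag_count;       # message signature => number of times users flagged it
my %global_filter;    # signatures the service now treats as spam for everyone
my $THRESHOLD = 100;

sub flag_as_spam {
    my ($message_body) = @_;
    my $sig = md5_hex( lc $message_body );    # naive content signature
    $global_filter{$sig} = 1 if ++$flag_count{$sig} >= $THRESHOLD;
}

sub is_spam {
    my ($message_body) = @_;
    return exists $global_filter{ md5_hex( lc $message_body ) };
}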

Google’s Gmail is a prime example of this pattern in action. Google Search is another dynamic example of Software as a Service that gets better the more that people use it. Google actually tracks the links users click on to determine how many people seek the same resource for the same search term. This system is much more sophisticated than a simple adaptive algorithm, yet the principal benefit of large-scale use is that the system learns and adapts based on users’ behaviors.

This functionality is a side benefit of SaaS rather than a core aspect of the pattern.

Figure 7.10. Spam filter software as a service


Dynamic Behavior

The dynamic behavior of the SaaS pattern can vary greatly depending on which protocols, standards, and architectural approaches are chosen. Figure 7.11, “A dynamic view of one way to visualize SaaS” shows a common depiction of the pattern.

Figure 7.11. A dynamic view of one way to visualize SaaS


First, a user identifies his requirements for consuming computing resources. This pattern can be implemented at many levels of complexity, from a secure web service to a simple user interface such as an HTML web page in a browser. The user will interact with the service (the service in this case is a proxy), which will then invoke the core functionality. The responses are appropriately directed back to the user, as required.

Note that this pattern becomes very interesting when multiple users employ the resources and implementers have capabilities that do not exist for their non-SaaS counterparts. First, the functionality provider can detect and act on patterns in runtime interactions. An example of this might be the detection of some error state that is occurring for multiple users of the system (e.g., the email clients are crashing because of a nefarious script contained within some emails). Rather than waiting for enough users to contact the software provider with enough information to enable it to fix the error, the provider can detect the condition itself at an early stage and has access to sufficient information to enable it to trace the source of the error. Ultimately, all users of the software will have a much better user experience if problems are mitigated sooner rather than later and before they feel compelled to complain about them.

Second, a provider may want to consider scaling the system in the backend to handle large numbers of requests. Sudden traffic spikes can adversely impact the user experience, making the system seem unresponsive. Service providers may want to investigate other services, notably those of Cloud Computing providers, if they need to support widely varying usage levels. Most Cloud Computing providers offer automatic scaling to support the amount of processing power and storage needed by an application.

Implementation

As shown in Figure 7.12, “Distinctions of the SaaS pattern”,[75] designers of software provided as a service rather than as a static, installed entity must consider several new nuances of the industry, as these changes in how users interact with software vendors are affecting the way we should design and architect SaaS.

Figure 7.12. Distinctions of the SaaS pattern


When implementing SaaS, you may need to ensure that no one can take advantage of your license model. For example, if you license only one user, what keeps that user from simply sharing her username and password and reselling your service? There are various types of license models for SaaS. Some are single-enterprise licenses, with the cost based on the size of the enterprise. Spam software and Microsoft Exchange Server are reported to use this model. By contrast, Adobe Systems uses a “per use” model for http://createpdf.acrobat.com, where users can either create PDFs one at a time or protect them with a persistent policy. Other software business models (e.g., for Google Search and other widgets) are built around advertising revenue models.

A software vendor implementing SaaS also can react swiftly to bugs in the system. The vendor can monitor all users concurrently to detect software glitches that may require immediate attention and may be able to fix them before most users even notice.

Business Problem (Story) Resolved

The spam detection software is housed in a federated server environment, and users’ incoming email can be automatically passed through the filters to detect spam. Any spam that sneaks through can be recognized and trapped by secondary mechanisms (including human users) and reported back to the spam detection infrastructure, enabling the system to adapt to the latest spammer tactics. There is no discernible delay or lag in incoming email, and most spam email gets eliminated.

Sampling a very large cross section of all mail makes it easier to detect patterns indicating spam emails. This results in the entire system performing better, to the benefit of all users.

Specializations

SaaS may be specialized by using advanced computer algorithms to perform reasoning tasks such as inference. These ultra-advanced systems may one day be able to use cognitive skills to recognize and act on certain patterns of events. For example, these hybrid systems could use a combination of Bayes’ theorem (conditional probability) and lexical heuristic filtering (finding evidence to support a hypothesis) to dynamically change their functionality. Such specializations will also benefit from adoption of the Collaborative Tagging (a.k.a. folksonomy) pattern, discussed later in this chapter, as it will help foster computational intelligence applications that can reason and infer hypotheses.
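As a toy illustration of the Bayesian side of such a hybrid, the following Perl sketch combines per-token spam probabilities (which a real system would learn from user-flagged mail) into a single score. The token probabilities and the default value for unseen tokens are made up:

# Hypothetical per-token probabilities learned from flagged mail.
my %p_spam_given_token = (
    viagra  => 0.99,
    free    => 0.80,
    meeting => 0.05,
);

sub spam_probability {
    my (@tokens) = @_;
    my ( $spam, $ham ) = ( 1, 1 );
    for my $t (@tokens) {
        my $p = $p_spam_given_token{ lc $t } // 0.4;   # default for unseen tokens
        $spam *= $p;
        $ham  *= 1 - $p;
    }
    return $spam / ( $spam + $ham );    # naive Bayes combination of the evidence
}

printf "%.3f\n", spam_probability(qw(free viagra));   # prints a score close to 1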

Known Uses

Postini, Inc. (which Google acquired in 2007[76]) figured out quite a while ago that centralization and offering its security, compliance, and productivity solutions as a service would result in better spam detection for end users. Recently, many other email companies have begun to use similar SaaS models. Apple’s iTunes music application is perhaps one of the most prominent examples of a hybrid approach to the Software as a Service pattern. The iTunes application has some predetermined functionality and user interfaces (the “V” and “C” components of “Model-View-Controller”); however, much of the information presented to the user is based on information (the “M” in “MVC”) the application receives from the iTunes servers during runtime. This hybrid approach can link user profiles to service calls to offer users better experiences.

Adobe Systems recently launched a service that allows people to manually create small numbers of PDF documents online and optionally attach other functionality, such as rights management. Google continues to expand its SaaS offerings. Initially, its search service used algorithms to vary search result rankings based on user interaction patterns. Recently, Google has added multiple other SaaS offerings, including creation and manipulation of online documents, spreadsheets, and more. Note that most of Gmail has always been provided as a service rather than as an application.

Consequences

The negative consequences of using the SaaS pattern are minimal, but they need to be addressed from the outset. Offering software as a service may create additional complexity in supporting computing resources for large numbers of users. Dynamically scaling an offering to handle surges in the number of users requesting the functionality is also a challenge.

In addition, authentication—especially when used to force compliance with software licensing—can be difficult to implement and to police.

The most noteworthy consequence of implementing this pattern is that the software may have a dependency on an Internet connection. In many cases, if the connection does not work, neither will the software. When such a mechanism is possible and makes sense, the best way to avoid such issues is to employ client-side caching. The appropriateness of this strategy varies by application. Caching Google Search is difficult, and would be mostly useless in an environment where readers couldn’t go to the linked articles anyway. On the other hand, technologies like the Adobe Integrated Runtime (AIR) and Google Gears are steps in the right direction.

Denial-of-service attacks can also be a threat. A malicious user may be able to overpower the bandwidth or computational capabilities of software implemented as a service and effectively deny or greatly slow down other users’ experiences.

End users often prefer to keep their software in a controlled and secure environment. Because SaaS applications are not hosted or controlled by the user, the user might be subjected to occasional outages when service upgrades or other events take place on the provider side.

Also, personal security risks are inherent in passing information back and forth on the open Internet. Users should carefully assess what risks are acceptable for their purposes.

The Participation-Collaboration Pattern

Also Known As

The Participation-Collaboration pattern is related to two other patterns:

Harnessing Collective Intelligence

This pattern, from the O’Reilly Radar report Web 2.0 Principles and Best Practices, discusses aspects of the Participation-Collaboration pattern where inputs from many users over time are combined into composite works.

Innovation in Assembly

This pattern, also from the O’Reilly Radar report Web 2.0 Principles and Best Practices, is focused on the ability to build platforms where remixing of data and services creates new opportunities and markets. The Participation-Collaboration pattern and the data aggregation patterns in this book cover some aspects of this and of the Harnessing Collective Intelligence pattern.

Business Problem (Story)

Until recently, the easiest way to compose and distribute textual content was to have a small group of authors write some material, print it on paper, bind that paper, and sell it or give it away. Printing is a much cheaper process for large runs than copying by hand: its setup costs are considerable, but the additional costs of printing an extra copy are relatively small. The publishing business structured itself around those costs, trying to find ways of ensuring that projects that made it to the printing press could reach the largest market possible. Traditionally, publishing processes have emphasized uniformity, quality, and stability at the expense of openness, preferring controlled conversations between authors, leading predictably through a complex process that culminated at the printing press and resulted in sales.

Once something is published, if the material becomes obsolete or errors are found, there is no quick or inexpensive way to change it. The high costs of new print runs also mean that once material is published, it isn’t generally possible to append contributions to the material until enough demand accumulates for a new revision to supersede the earlier edition. If a publisher prints a large number of copies of a book and then some new facts that challenge its content are uncovered, this presents a huge problem for the publisher. Pulping books (or destroying software or any other material produced this way) is an expensive waste of resources.

Consider a small company that wants to create a manual covering the use of one of its products. The traditional approach is to gather a small set of experts to write it, hopefully minimizing the potential for costly errors. Manuals face a market of readers with different skill levels, though, and the company’s writers may not always get everything right. For example, they may assume that everyone reading the manual has a deep technical understanding about how alternating current works in relation to direct current, and fail to give the average reader enough information to make informed decisions. Customers often know what they need better than the company does, but the flow of information has traditionally gone from the publisher to the customer, rather than the other way around.

Context

The ease of distributing information on the Web makes it possible to discard most of the constraints of earlier publishing processes, opening them up to include contributions from many more participants. The Participation-Collaboration pattern can appear wherever a group of people has a common interest in sharing and appending to information about a specific subject. This pattern recognizes that an open process may provide better results than having only a few people present their knowledge. This pattern lets a wider group of people collaborate on and contribute to a work, so that it reflects a wider set of experiences and opinions. This many-to-many participation model has worked very well for the development and maintenance of open source software for years. It has also been applied to website and portal development, as well as publishing.

Derived Requirements

To operate, this pattern requires a system or platform where collaborative input can be collected and shared in a way that enables people to interact with it. Participating users generally need to be able to write, not just read, the material. These systems must also have some form of overrule mechanism to guard against common threats such as spam and vandalism, as well as mechanisms allowing users to validate the points of view contributed by other participants.

A community must self-organize to police the collective works and ensure that common interests are protected. This is a nontechnical issue, but it has far-reaching technical implications. For example, Wikipedia.org has been widely criticized by many people for its lack of a common set of standards implemented across all articles. People often contribute to articles on Wikipedia only to later find that the editors have removed their contributions. It can be extremely frustrating for people who consider themselves experts on a given subject to have an editor who is not familiar with the subject unilaterally decide that the information they’ve contributed is not worthwhile. Sometimes errant information also gets published on Wikipedia, and when knowledgeable people try to correct it, the editors keep the accurate information from being published.

Version control over content is also a common requirement for those implementing this pattern. Keeping track of multiple versions of shared edits helps participants to see and discuss the process, and adds accountability.
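A bare-bones sketch of such version control, assuming an in-memory store (a real system would persist revisions and expose diffs between them):

my @revisions;    # each entry is one immutable revision record

sub save_edit {
    my ( $author, $text ) = @_;
    push @revisions, {
        version => scalar(@revisions) + 1,
        author  => $author,
        time    => time,
        text    => $text,
    };
}

sub revert_to {
    my ($version) = @_;
    my ($rev) = grep { $_->{version} == $version } @revisions;
    # A revert is recorded as a new revision, preserving the audit trail.
    save_edit( 'reviewer-revert', $rev->{text} ) if $rev;
}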

Generalized Solution

The generalized solution for the Participation-Collaboration pattern is to implement a collaborative platform through which humans or application actors can contribute knowledge, code, facts, and other material relevant to a certain topic. A seed discussion or starter text can be made available, possibly in more than one format or language, and participants can then modify or append to that core content.

Static Structure

In the static structure depicted in Figure 7.13, “The Participation-Collaboration pattern”, a contributor can come in and review published content, possibly aggregated from a content management repository. The contributor can add to or even change some of the published material. The magnitude of the changes can vary greatly, from as minor as adding a small “tag” (see the discussion of the Collaborative Tagging pattern, later in this chapter) or fixing a typo to as major as erasing or rewriting a complete section of content. The actions are fully auditable, and in most such solutions, the users’ identities can be verified. This helps to prevent contributors from deleting material and replacing it with content that is abusive or misleading in nature (e.g., a Wikipedia user erasing a page that discusses online poker and replacing it with an advertisement for his online poker website).

Figure 7.13. The Participation-Collaboration pattern


Each time content is changed, it generates an auditable event. In systems such as Wikipedia, volunteer reviewers are notified of these events and may subsequently inspect the content and, if they see fit, change it back to its original form. Most such solutions provide mechanisms to lock content so that two collaborators cannot edit it concurrently.

Figure 7.13, “The Participation-Collaboration pattern” uses the term “Topic 1,” but you can replace this with almost any other type of artifact, whether it be text, code, audio, video, or something else entirely. The number of Contributor actors can range from one to infinity.

Dynamic Behavior

The simplified UML sequence diagram shown in Figure 7.14, “The sequence of the Participation-Collaboration pattern” indicates the order of activity during the Participation-Collaboration process. There are many nuances to each step of the sequence. For example, when a logged-in user first requests the content, by retrieving a copy of the content she may place a “lock” upon it that prevents other participants from concurrently accessing it with write privileges. In the case of wikis, this lock is acquired as soon as a participant clicks the Edit button and begins to edit a section of content.

Figure 7.14. The sequence of the Participation-Collaboration pattern

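A minimal sketch of such a lock, assuming an in-memory store and a fixed expiry so that abandoned editing sessions cannot block other contributors indefinitely:

my %locks;                  # content id => { user, expires }
my $LOCK_TTL = 15 * 60;     # locks expire after 15 minutes

sub acquire_lock {
    my ( $content_id, $user ) = @_;
    my $lock = $locks{$content_id};
    return 0 if $lock && $lock->{expires} > time && $lock->{user} ne $user;
    $locks{$content_id} = { user => $user, expires => time + $LOCK_TTL };
    return 1;               # caller may now edit the content
}

sub release_lock {
    my ( $content_id, $user ) = @_;
    delete $locks{$content_id}
        if $locks{$content_id} && $locks{$content_id}{user} eq $user;
}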

The reviewer may not necessarily accept or reject the user’s changes wholesale; in fact, he may take on a role similar to that of the collaborator and further change the content. During this process, most implementations normally keep a copy of the original content in case the changed copy becomes corrupted or otherwise unusable.

Implementation

The Participation-Collaboration pattern can operate on any type of content. The pattern is certainly not limited to text, although wiki and publishing examples might give that impression. Several websites, for example, allow video content to be remastered, or allow other changes whereby a user can reproduce the original content, add new video and audio, and create an entirely new piece of work. Several collaborative music sites use the same pattern to allow amateur musicians to mix their tracks with those of professionals, creating unique audio content.[79]

As an implementer of this pattern, you will probably want to have some form of editorial control over content within your domain, but you should realize that some interactions may result in content being moved outside your domain. Having clearly stated and scoped licenses for all contributions is a prime concern that many legal experts have expressed. Legal jurisdiction is another primary consideration. Some countries have not signed or agreed to, or do not enforce the terms of, various international legal instruments (such as treaties from the World Intellectual Property Organization, or WIPO[80]) to mutually respect intellectual property rights. This could create legal problems around ownership and derivative works.

Implementers will likely want some way to identify their contributors and participants. Those who demonstrate consistent wise judgment (as judged by the collective community) in content participation may be good candidates for reviewers. Those who repeatedly abuse such systems should be barred from further participation until they can demonstrate a better understanding of the activity. Of course, much of this is subjective, and implementers should also place a great level of trust on the user community as a whole. Communities can generally police themselves when their common interest is challenged.

One criticism of Wikipedia is that a select few editors end up in control of the collective knowledge that was originally contributed by many diverse contributors. Some Wikipedia pages have disappeared completely, much to the dismay of subjects of those pages, and some content has been ruled unworthy or improper. Trusting one or two people to control the content related to a specific topic may lead to contributor frustration. Proponents of this argument claim that mechanisms like Wikipedia represent a clear danger to society: if considered authoritative and beyond reproach, they could effectively be used to rewrite portions of history, at least for people who don’t look beyond these sources.

Archiving older copies of content is also a consideration for implementers. The ability to examine audit trails and older versions of content can be critical for custodians of mutable digital content. Some watch for a pattern of change and reversion to the original, followed by the same change and reversion over and over again. Think about the Wikipedia problems we just discussed. Multiple users logging in and repeatedly attempting to add the same reference to a specific page could indicate that content really needs to be added. Alternatively, if the editor repeatedly rejects the content, it might be because the same user is logging in under different names, attempting to make the same change. An archived content set might also be of interest to those who study public perceptions as they change regarding certain topics. For example, the American cyclist Lance Armstrong became a hero and won millions of fans by coming back from a deadly form of cancer to win the world’s toughest cycling race a record seven times. The public’s attitude regarding this feat and their admiration for Lance went back and forth on several occasions, and a historical perspective might be a crucial part of any analysis of his career. Edits, though, likely reflect perspectives current at the time they were made.

This pattern has also assisted the rise of microformats, or small snippets of syntax that can be used to mark up existing web pages to declare that certain pieces of content are formally structured using the syntax and semantics defined by those microformats. An example might be a vCard (a formal syntax for describing a business card), or a declaration to help attribute an author to a specific piece of content. We describe the microformats phenomenon in more detail later in this chapter, in the section on the Structured Information pattern.
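
As a minimal sketch of the idea (the person, organization, and the crude regex extraction below are purely illustrative; real microformat parsers are far more robust), an hCard is ordinary HTML whose reserved class names declare structured contact data:

    // An hCard: plain HTML in which the reserved class names (vcard, fn,
    // org, email) mark the content as structured vCard data. The person
    // and address here are hypothetical.
    const hcard: string = `
      <div class="vcard">
        <span class="fn">Jane Example</span>,
        <span class="org">Example Corp</span>,
        <a class="email" href="mailto:jane@example.com">jane@example.com</a>
      </div>`;

    // Microformat-aware tools key on those class names to extract the
    // data; a deliberately crude illustration of the principle:
    const formattedName = hcard.match(/class="fn">([^<]+)</)?.[1];
    console.log(formattedName); // "Jane Example"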

When working with non-text content, implementers will have to employ extra metadata to ensure that the content can be mixed and matched. Such metadata might include the beats per minute (BPM) on audio files, the codec on video content, and the timestamps on aggregate audio-visual content.
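
A sketch of what such metadata might look like as a record attached to each contributed clip (every field name here is our own assumption, not any particular site's schema):

    // Hypothetical mixing metadata for a clip on a collaborative
    // audio/video site.
    interface ClipMetadata {
      codec: string;            // e.g. "h264", so players can decode video
      bpm?: number;             // beats per minute, for matching audio loops
      sampleRateHz?: number;    // audio sampling rate
      startOffsetMs?: number;   // timestamp within an aggregate work
      durationMs: number;       // clip length, for alignment when mixing
    }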

Business Problem (Story) Resolved

The online manual for a product becomes far more useful if customers and users can contribute material (e.g., filling in details that the manual’s original authors may have left out because they made faulty assumptions about the readers’ level of technical expertise). Instead of publishing a static manual for its customers, the company could allow existing users to participate in creating a more comprehensive and useful information set, to help new users who have purchased the product as well as people considering purchasing it. Free and open information carries with it much more weight when it comes from users who are outside the company, and reading positive feedback from existing customers often encourages others to buy the company’s products. Another benefit is that users can include useful workarounds to potential problems or even describe alternative uses of the product that its designers may not have foreseen (say, how to use an ordinary steak knife to perform an emergency tracheotomy on someone who is choking).[81]

To achieve this, the company might set up a special online manual using a wiki that allows contributions by many people. In this scenario, all users must register and log in each time they visit. There are also forums where users can help others with questions pertaining to products and services. Note that such an infrastructure might be implemented by the user community itself rather than by the company that manufactures or distributes the devices, although, generally speaking, the company would be wise to participate. Users may gain status points each time they help other users and append materials that are accepted into the wiki. The company can track these status points and use them to provide special promotions to top users for future sales campaigns or even beta tests of new products (several technology companies use this bonus scheme now and offer users who make large numbers of helpful bulletin board posts early access to alpha versions of their software).

The company itself can employ participants who try to let the community run as autonomously as possible, but who also make sure that all questions are answered in a timely manner and that no errant or abusive information is posted to the company’s website. A consumer can download a printable manual or section thereof at any time, thus eliminating the wasteful process of the company printing static and rapidly outdated materials for every product.

A dynamic publishing platform could also let content owners incorporate material provided by their user communities, or content driven by customers’ wishes (notifications of product news, new reviews, new tech bulletins, and updates to user manuals), ultimately making for a better user experience for everyone.

Specializations

There are many specializations of the Participation-Collaboration pattern, including blogs (web logs), blikis (blogs with wiki support), blooks (books written and published through blogs, often collaboratively), moblogs (mobile blogs), and vblogs (video blogs), among others. Each uses the same basic pattern, in which participation and collaboration on a shared work is the core theme.

The same pattern is also used for open source software development, where many programmers contribute code to evolving projects.

Known Uses

Wikis—including many specialized wikis such as Wikipedia and RIApedia,[82] a site devoted to Rich Internet Applications—are probably the best-known use cases of this pattern. SourceForge, Apache, and other open source software processes use the same pattern for code rather than text.

In addition, the specialized content management system Drupal,[83] built in PHP, supports many aspects of Participation-Collaboration right out of the box. (And its manual is written in precisely the way that we outlined earlier.)

Meanwhile, some cool new websites are applying this pattern in a different manner. As noted earlier, at least three companies—Mix2r.com, Gluejam, and MixMatchMusic—have collaborative sites where music is open sourced and new tracks can be mixed with existing ones to create new audio projects. Artists and musicians can contribute loops and original audio tracks to works that others can download.

In the video world, similar companies are pursuing this pattern. MixerCast.com, MovieSet.com, Brightcove.com, and others have used the same approach with video, letting people remix video clips and even provide audio clips to create new works. MovieSet.com also allows users to view the behind-the-scenes aspects of content creation, and in the future it may allow the audience to provide input regarding the plot.

Consequences

As mentioned earlier, this pattern has many benefits. It does, however, require the domain owner to take care not to give the appearance of dominating the community, while still gently guiding the community toward a common goal.

The Asynchronous Particle Update Pattern

Also Known As

The Asynchronous Particle Update pattern is related to two well-known architectural styles:

Asynchronous JavaScript and XML (AJAX)

AJAX, which has made web pages much more interactive, is a well-known implementation of this pattern and has been credited with being a major force behind the Web 2.0 movement.

Representational State Transfer (REST)

REST is a strongly related architectural style, one that often supports the update mechanism. Roy Fielding, one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification, introduced the term REST in his 2000 doctoral dissertation on the Web, and it has since come into widespread use in the networking community. Like the Asynchronous Particle Update pattern, REST is an architectural style: a set of principles outlining how resources on a network are defined and addressed. The term is often used in a looser sense to describe any simple interface that transmits domain-specific data over HTTP without an additional messaging layer such as SOAP or session tracking via HTTP cookies. In true pattern style, it is possible to design any interchange in accordance with REST without using HTTP or the Internet. It is likewise possible to design simple XML and HTTP interfaces that do not conform to REST principles.

Business Problem (Story)

During the first iteration of the Internet, most clients retrieved content using HTTP GET requests, which returned complete HTML web pages. In true REST style, the web pages were built with information that represented snapshots of the state of information at a certain point in time. For some exchanges this did not represent a problem, as the content was updated rather infrequently. But in many instances some of the information changed within moments of the web page being delivered, and some web page users required more dynamic content delivery. These users had to resort to clicking the browser’s Reload button, causing the entire page to be resent from the server to the client. In many such cases, the ratio of bytes that actually needed updating to the bytes in the page that remained static made this a very inefficient operation to force on users. Requests for popular web pages that were normally updated several times per minute could completely overload the providers’ servers. Examples included pages providing the latest information on sporting events such as the Tour de France cycling race, stock market quotes, and breaking news headlines.

Beyond performance concerns, the page-by-page approach also drastically limited interface possibilities, making it more difficult to replace desktop applications with web applications.

Context

The Asynchronous Particle Update pattern is likely to be useful in any situation in which exchanging a small number of bytes rather than an entire document will save both server and client (and owner) resources.

The pattern is also applicable where the Synchronized Web pattern (discussed later in this chapter) is implemented and multiple users must have regular data updates pushed to them or pulled in on their behalf.

Anywhere there is a stream or supply of rapidly changing data that subscribers want to receive, this pattern will help. There are specialized expressions of it, such as real-time protocols (RTPs) for online collaboration and communication.

Derived Requirements

This pattern requires building or working with a mechanism that can update a small portion of the document object model (DOM): making a small, asynchronous request, waiting for the response, and then using it to update part of the page without reloading the entire page. The browser itself must have some form of processing power and access to some way of updating a portion of the page content.

The requests for transfer of data from one point to another should be loosely coupled to the trigger mechanism to allow maximum flexibility in implementation. Being able to trigger updates on events such as user interactions, server events, state changes, or timeout events will provide maximum flexibility for this pattern to be applied over a gamut of contexts.

For this to work, the architecture must have a consistent model for objects (including document objects), events, event listeners, and event dispatchers across multiple tiers of any solution pattern. This means that architects must analyze the event and object models (including the dispatch mechanism) across all technologies, standards, and protocols used within their infrastructures.

Building an application that adheres to the Model-View-Controller pattern is also likely to be an advantage for developers and architects who can manipulate both the server and the client frameworks and code.

Generalized Solution

The generalized solution to this pattern consists of four parts. The first part is a small method built into browsers that allows a small structured message to be sent to a remote address and, optionally, waits for a reply. The second part is a server component that waits for the browser’s request and fulfills it by replying based on the incoming parameters. The third part is a small runtime environment within the browser that can manipulate the returned value and make post-retrieval changes to the data. The fourth is a plug-in (or the browser engine itself) that uses the result of the operation to update the view presented to the user.
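
The original AJAX implementations realized this with the browser's XMLHttpRequest object; for brevity, the sketch below uses the newer fetch API instead. The /quote endpoint, its JSON shape, and the element ID are all assumptions:

    // Part one: a small structured request sent asynchronously to a remote
    // address. (Part two, not shown, is the server answering at /quote.)
    async function updateQuote(symbol: string): Promise<void> {
      const response = await fetch(`/quote?symbol=${encodeURIComponent(symbol)}`);
      // Part three: a small runtime step manipulates the returned value.
      const quote: { price: number } = await response.json();
      // Part four: only the affected portion of the view is updated.
      const el = document.getElementById(`quote-${symbol}`);
      if (el) el.textContent = quote.price.toFixed(2);
    }

Wiring updateQuote() to a button's click handler gives the first variation shown in the figures that follow; the later variations change only the trigger.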

The solution should support message exchange patterns other than straight Request/Response (see the discussion of the SOA pattern for more on message exchange patterns). For example, several clients should be able to subscribe to event or state changes, perhaps with one change federating out to several clients.

Static Structure and Dynamic Behavior

Several views of the static structure of this pattern are interspersed with sequence information; therefore, we have combined the static and dynamic depictions of this pattern in this section.

The scenario in Figure 7.15, “Variation one of the Asynchronous Particle Update pattern” depicts an asynchronous request/response message exchange based on a user event trigger. In this case, the user clicks her mouse over a button labeled “Update stock quote” and a small request() message is dispatched to the server side. The server retrieves the stock quote from the stock quote data provider and returns it to the client side via the response. The service client uses the information to update the browser view.

Figure 7.15. Variation one of the Asynchronous Particle Update pattern

In Figure 7.16, “Variation two of the Asynchronous Particle Update pattern, based on an elapsed time event on the client”, the same sequence of events is kicked off by a timeElapsed() event, which is set to automatically update the stock quote at certain intervals. In this case, the timeout(90) event is triggered from the client to update the node of the DOM every 90 seconds.

Figure 7.16. Variation two of the Asynchronous Particle Update pattern, based on an elapsed time event on the client
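
Assuming the updateQuote() helper sketched earlier, the client-side timer trigger is a one-liner:

    // Variation two: an elapsed-time event on the client re-runs the same
    // particle update every 90 seconds.
    const timer = setInterval(() => updateQuote("ACME"), 90_000);
    // clearInterval(timer) stops the updates when they're no longer needed.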

A third variation exists in which the timeout is placed on the stock quote service. Note that in this variation, which is shown in Figure 7.17, “Variation three of the Asynchronous Particle Update pattern, based on an elapsed time event on the server”, communication between the client and the stock quote server goes from server to client, something not presently supported by AJAX. It is assumed that the client has somehow registered that it wants to receive events from the stock quote service every 90 seconds. The advantage of this pattern variation is that the client does not have to send a request() message to each stock quote service it might be using (remember, this pattern shows only a 1:1 ratio; in reality, the number may be 1:n, or one to many). In this case, the timeElapsed event triggers a server-side request. The server-side request then triggers the messages to be pushed to each client that has registered to receive messages of that type or for that event.

Figure 7.17. Variation three of the Asynchronous Particle Update pattern, based on an elapsed time event on the server
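
Browser AJAX of the day could only approximate a true server-initiated push. One common workaround was long polling (the Comet technique): the client "registers" by parking a request at the server, which answers only when an event fires, after which the client immediately re-registers. A hedged sketch, reusing the assumed endpoint conventions from earlier:

    // Emulated server push via long polling; /quote-events is assumed to
    // hold the request open server-side until data is available.
    async function subscribeToQuote(symbol: string): Promise<never> {
      while (true) {
        const res = await fetch(`/quote-events?symbol=${encodeURIComponent(symbol)}`);
        const quote: { price: number } = await res.json();
        const el = document.getElementById(`quote-${symbol}`);
        if (el) el.textContent = quote.price.toFixed(2);
      }
    }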

Yet another variation of this pattern, shown in Figure 7.18, “Variation four of the Asynchronous Particle Update pattern, based on a state change event on the server side”, is based on an actual state change (in this case, the change of the value in the stock quote). When the state of a stock price changes, it triggers an event message to be fired to the stock quote service, which in turn pushes a message to all clients registered to receive messages based on that event.

Figure 7.18. Variation four of the Asynchronous Particle Update pattern, based on a state change event on the server side

Implementation

When implementing this pattern you must carefully consider many nuances, including the number of requests, the size of the particle, and how to present changes to users. In general, the overall goal will be to provide the best possible quality of service while using the least possible amount of network bandwidth and generating minimal processing overhead.

The sheer number of clients using a service or wanting to be notified of event changes should factor into deciding which of the four variations to employ. Each variation has a slightly different profile in terms of how much bandwidth, system memory, and client-side processing it requires, and finding an optimal balance will require analysis. An architect could further develop a hybrid approach that dynamically changes the variation based on the number of active clients and other details, such as which common stocks they’re interested in.

Business Problem (Story) Resolved

Wholesale page refreshes can be replaced with updates to minute fragments of data. Within the page, small snippets of code coordinate asynchronous particle updates with various services and ensure that the user has access to those updates. Regardless of the actual interaction model, the pattern of updating particles rather than the entire resource can be implemented in a manner that saves bandwidth and relieves the user of responsibility for refreshing the view. Users also benefit because the interfaces are more flexible, letting them change information without the interruption of a full page refresh.

Specializations

The four different interaction models illustrated earlier each specialize (i.e., extend the basic concepts of) the main pattern. Further specializations via interaction patterns are possible as well, but it is beyond the scope of this book to account for all possible variances.

Known Uses

The AJAX manner of implementing this pattern is widely known and used in the software industry today. Although the specifics of implementation are largely left to each developer, the underlying technologies (JavaScript, the browser’s XMLHttpRequest object, and a structured payload format such as XML) are quite particular.

Adobe’s Flex and AIR, for example, implement the pattern via different remote messaging methodologies, standards, and protocols. The Action Message Format (AMF) transmission protocol may be used within a Flex application. In addition, Adobe LiveCycle data services can be used to push data to multiple subscribers (for more information, see the section on the Synchronized Web pattern later in this chapter).

REST is also a style of software architecture that is frequently used to support implementations of this pattern, though REST does not simply mean XML over HTTP. Although REST is not specifically dependent upon HTTP or XML, XML-over-HTTP interfaces are its most frequently cited implementations.

Consequences

It is entirely possible to implement this pattern in a way that uses more bandwidth than a simple page refresh would. Software architects and developers should assess the number and nature of the AJAX widgets on their web pages and weigh the cost of the various particle updates against a page refresh to ensure that there is an actual gain in efficiency. If a single web page depends too heavily on a large number of AJAX components, users will find its performance unacceptable.

Also, as soon as a service becomes available for its intended purpose, some users may discover it and start using it for other purposes. Architects and developers would be wise to consider the policy and security models for their services as well as thinking about what sorts of mechanisms they might use to ensure only authorized use of the services.

The Mashup Pattern

Also Known As

Other terms you may encounter in discussions of the Mashup pattern include:

Whole-Part pattern

The Whole-Part pattern, a pattern of composition and aggregation, is well established in software architecture, though specific implementations add their own nuances. For example, sometimes a whole does not know or cannot see that it is composed of parts, while in other cases it may hide this fact from those that interact with it. An example of the latter might be a software system that uses all open source software internally, yet hides this fact and all the interfaces that the software provides so that other software entities cannot use the functionality.

Aggregate relationship (as described in UML 2.0)

Aggregation is a specialized type of Whole-Part pattern in which individual parts can exist independently without the whole, yet can also be aggregated into the whole. If the whole ceases to exist, each of the parts can still exist.

Composite Relationship pattern (as described in UML 2.0)

In a composite relationship, a whole is composed of individual parts. In the Composite Relationship pattern (often summarized as a “has a” relationship), the whole cannot exist without the parts, nor can the parts exist without the whole. The life cycle of the parts is tied directly to the life cycle of the whole.

Composite applications

The concept of creating applications from multiple particles of computing resources is similar to the concept of aggregating data from many resources. This is a further specialization of the Whole-Part pattern, although possibly an anti-pattern of SOA.[84]

Business Problem (Story)

An insurance company wants to supply maps to its branch offices, overlaid with demographic data to present agents with the information they require to write specialized policies. It has access to detailed data about median income, automobile accident statistics, and snowfall accumulation, as well as statistics regarding the threat of natural disasters such as earthquakes, tornadoes, forest fires, and hurricanes, and many more data sets. Some of these data sets are available via the insurance company’s infrastructure, while external parties provide others. Individual agents will need slightly different content for their work depending on what types of policies they are writing and where they are located. The corporate insurance office wishes to allow agents to download and display data from various sources at their convenience with little or no overhead, so that the agents can write insurance policies that protect the homeowner while serving the insurer’s best interests by reducing exposure to known risks.

The insurance company wishes to use online statistical data as well as delivering the maps and other information via the Internet. The statistical data comes from many sources that can be accessed over the Web. The data sources are in multiple formats, and each source may or may not be available at any given moment. This makes hardcoding the data into an application undesirable, and a “best effort” quality of service is the highest possible outcome of any interaction.

To complicate matters further, each agency has its own technology, and the insurer cannot count on any standard technical infrastructure or specific platform across its agents’ offices. Insurance agents may have different browsers, different operating systems, and even different levels of bandwidth, thus making content delivery challenging.

The data required cannot be hardcoded into the agents’ applications because it is constantly changing. Pushing newer information is not desirable, given that not all agents require the full set (an agent in Mexico City, for example, does not require heavy snowfall warning data, whereas agents in Alaska do not require heat wave or hurricane warning data). Nor can pushes of data be reliably scheduled, because the data sources update infrequently and on completely different schedules.

Context

This problem occurs anywhere a common, user-controlled view of disparate remote data is required and the data is delivered as services over a network connection or via other communication formats and protocols. The Mashup pattern may be applicable well beyond the business model outlined in the preceding section.

Derived Requirements

The Mashup pattern depends on the SOA pattern; services that can be consumed must exist. Additional recommendations also apply:

  • Data encoding should generally be in a format that does not depend on any specific operating system, browser, or other hardware or platform (other than a common class of business software, such as browsers). Languages such as XML are commonly used to encode data for mashups, as these languages are easy to interpret and manipulate.

  • On the client side, there should usually be some way for the user to manipulate the mashup content. However, this is not always necessary.

  • Where major computational resources are required to manipulate data, a third tier may be introduced to help reduce client-side processing.

Generalized Solution

An application is built to load a map and template onto a user’s system upon request. The application then detects which remote data sources are online and presents the user with controls for rendering those data sets over the top of the map. The data is retrieved using web-based protocols and standards to connect to a set of services that supply the data.

The data sets may alternatively be cached locally on the client to allow more efficient launching and operation of the client-side application.
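
A hedged client-side sketch of the detection step (the service names, URLs, and best-effort probing strategy are all assumptions):

    // Probe the configured data services and keep only those currently
    // reachable; unavailable sources are skipped, matching the "best
    // effort" quality of service described earlier.
    interface DataSource { name: string; url: string; }

    const sources: DataSource[] = [
      { name: "Snowfall accumulation", url: "https://stats.example.com/snowfall" },
      { name: "Accident statistics",   url: "https://stats.example.com/accidents" },
    ];

    async function detectOnlineSources(): Promise<DataSource[]> {
      const probes = await Promise.all(
        sources.map(async (s) => {
          try { return (await fetch(s.url, { method: "HEAD" })).ok ? s : null; }
          catch { return null; }
        })
      );
      return probes.filter((s): s is DataSource => s !== null);
    }
    // The application can then render one overlay control per online source.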

Static Structure

A mashup uses one or more services and mashes together aspects of those services. An optional view may be generated for human users. Figure 7.19, “The basic Mashup pattern” depicts the most basic pattern for mashups.

Figure 7.19. The basic Mashup pattern

A specialization of the Mashup pattern involves letting users configure the mashup’s view component. This is done in many implementations and has been built into content management systems such as Drupal and many PHP sites. There are also examples in which users can mash up content from their own sources, such as photos from their Flickr sets or their own blog content. Note that the UML convention, shown in Figure 7.20, “A UML class view diagram of the Mashup pattern (simplified)”, is written in a manner that recognizes this variant.

Figure 7.20. A UML class view diagram of the Mashup pattern (simplified)

For those who do not read the OMG’s UML notation, Figure 7.20, “A UML class view diagram of the Mashup pattern (simplified)” shows that a mashup is an aggregation of services, other mashups, or both. A mashup may provide a view (commonly a graphical user interface, or GUI), although this is not mandatory; mashups may simply aggregate data for another application to ingest.

Several patterns exist for how and where content is mashed up. One variation is that all services are mashed up on the client side. This approach, depicted in Figure 7.21, “A simple two-tier mashup pattern”, is commonly used for Internet-based mashups.

Figure 7.21. A simple two-tier mashup pattern

Sometimes content is mashed up in a proxy middle tier or even on a server before being delivered to the client. In this case, the fact that the application is a mashup might not be readily visible to its users. Figure 7.22, “A hybrid multitier mashup” depicts this variation.

Figure 7.22. A hybrid multitier mashup

Implementation

When implementing the Mashup pattern, developers must carefully consider the many standards and protocols that may be used to build the final application. Developers and architects also have to understand the availability of, and policies regarding, content they wish to mash up. Policies are important considerations if certain user classes might not ordinarily be free to access particular resources, or might have to submit credentials to prove their identities before being granted access to a system’s resources.

Another top concern of mashup developers is the post-acquisition manipulation of data. This is much easier if the data is provided in a manner that is consistent with the Model-View-Controller pattern (i.e., is pure and not encumbered by a markup language pertaining to information presentation or control). One consideration that has received much attention in recent years has to do with the granularity of a service’s content. A coarse-grained data chunk from any service might require a lot of processing before most clients can use it, generating excessive overhead. An example might be a newspaper’s website where you can request content, but you can only get the full newspaper for any given day returned as a large XML file with binary attachments. This system would not be ideal for service consumers who need only a small portion of the content, such as stock market reports. A better option would be to implement a tiered service model in which content may be accessed at a much more granular (fine-grained) level. Alternatively, if a single runtime environment (such as Adobe’s Flash) is used on the client, the data can be made available in a format such as AMF that is already serialized into an efficient binary format to reduce rendering time on the client. Architects will have to weigh both options before making decisions.
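
The granularity contrast might look like this in practice (both URLs are hypothetical):

    // Coarse-grained: the whole day's paper arrives as one large document,
    // which the consumer must download and parse even for a single section.
    async function fetchWholeIssue(): Promise<string> {
      const res = await fetch("https://paper.example.com/issues/2009-06-01/full.xml");
      return res.text();
    }

    // Fine-grained (tiered): the consumer requests exactly the particle it
    // needs, moving far less data and client-side processing.
    async function fetchMarketReports(): Promise<string> {
      const res = await fetch("https://paper.example.com/issues/2009-06-01/sections/markets");
      return res.text();
    }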

Developers of services that want to provide mashable data should strongly consider using common data formats such as XML, and should weigh the client-side overhead of processing the various standards and protocols involved. For example, an implementer offering a single photograph for mashing into various client-side applications should consider using the PNG, JPEG, or GIF format, because most browsers support these standards.

Browsers don’t, however, provide much support for SOAP with extensions such as the OASIS WS-SX and WS-RX standards. The processing overhead needed to comply with the SOAP model might eliminate the possibility of services using this model being consumed on mobile platforms or other devices with constrained resources. In these cases, the Representational State Transfer architectural style, implemented by using XML over HTTP, may be a natural fit for many mashups. RESTafarians (the name given to those who enthusiastically evangelize the benefits of REST) promote the view that dividing application state and functionality into resources and giving them unique identifiers makes content easier to mash up.

On the client side, architects and developers will have to choose their target technologies very carefully. In many implementations, a consistent rendering experience or manner in which data can be shown to end users will be desirable. Achieving this in multiple versions of disparate browsers is somewhat difficult; a lot of testing and design iterations are required to get it right, especially if a large number of browsers is targeted. Some may choose to implement the pattern outside of the browser in a custom, standalone application to avoid any variable-display issues. In particular, many AJAX developers have been struggling with trying to maintain their applications and keep them working well in multiple versions of various popular browsers, such as Internet Explorer, Opera, Google Chrome, Firefox, and Safari. To add to the complexity, all of these browsers may have to be tested on multiple platforms or operating systems.

Business Problem (Story) Resolved

By offering several services via web-based interfaces, the insurance company lets branch offices and other resellers consume data and mash it together to build custom views of the data required to write insurance policies. Each branch office can pick and choose the right mix of data for its purposes. While the company could provide maps as one of its services, it could also use public services (such as Google or Yahoo! Maps) as a foundation.

Making this data accessible for mashups could also benefit the insurance company beyond this particular application—for example, by helping the company’s customers choose where they might wish to live based on criteria from data services providing information on median temperatures, crime rates, financial census data, and so on.

Specializations

Mashups themselves are inherently customizable, and an infinite number of specializations are possible. Some allow users full control over the user interface, so they see only those aspects that interest them. Allan Padgett’s Tour Tracker (discussed next) is a prime example. Google and Yahoo! have also done wonderful things to allow developers to write code that uses their maps for a variety of applications.

Known Uses

When we think about mashups, Allan Padgett’s Tour Tracker application (written to allow users to view information about the Tour of California bicycle race in real time) often comes to mind. Figure 7.23, “An excellent mashup example: Allan Padgett’s Tour Tracker, circa 2007” shows a screenshot of Allan’s application from the 2007 race.

Figure 7.23. An excellent mashup example: Allan Padgett’s Tour Tracker, circa 2007

The mashup in Figure 7.23, “An excellent mashup example: Allan Padgett’s Tour Tracker, circa 2007” uses the Yahoo! Maps Flash API to build an overhead aerial view of a section of the bicycle race. The map is overlaid with sprites depicting where specific riders are in real time, and the entire map moves as the tracked riders progress through the race. The bottom-right box shows a live video feed of the bicycle race (basically the same as TV coverage), and the other layers contain data about the course profile and about the cyclists, such as their average speed, distance remaining, and overall standings. Go to http://tracker.amgentourofcalifornia.com for a demo of this mashup.


For a vast list of mashups, as well as much more information on how to create and apply them, see http://www.programmableweb.com/mashups/.

Consequences

Mashups depend on the services that feed them. Anyone implementing a mashup must take this into consideration, especially if those services are not under his control.

Content used in mashups may carry with it certain copyright privileges or restrictions. Material licensed through Creative Commons[85] under a license allowing free distribution is a perfect fit for mashups. In the Tour Tracker application shown in Figure 7.23, “An excellent mashup example: Allan Padgett’s Tour Tracker, circa 2007”, thumbnail photos are tiled onto the course based on geocoded photographs that spectators have contributed to Flickr.com. The Tour Tracker uses the Flickr API to retrieve the license data; however, the licenses may not be immediately apparent to end users.

The Rich User Experience Pattern

Also Known As

Often discussed in conjunction with the Rich User Experience (RUE) is the Rich Internet Application (RIA). The term itself suggests a type of application that is connected to the Internet and facilitates a RUE. Some people (most notably, industry analyst Bola Rotibi)[86] expand the RIA acronym as Rich Interactive Application, which is a far more descriptive term in our opinion.

Business Problem (Story)

Websites evolved as static pages (documents) served to those who requested them. The document sets were largely modeled on real-world artifacts such as sales brochures, help manuals, and order forms. However, the real-world documents themselves were only part of the full interaction (process) between the company and the customer/user; typically, a human working for the company provided additional input, contributing to the richness of the interaction concerning the document or artifact.

For example, if you walked into a travel agency in Northern Canada in January and started looking at brochures for Mexico and Hawaii, an observant employee might infer that you were sick of the Canadian winter and want to get some sun. She could then enhance your experience by asking questions to clarify this inference and offering you contextually valuable information (such as details on a sale on a one-week all-inclusive trip to the Mayan Riviera).

Many interactions with customers, suppliers, and partners are good candidates for migrating to electronic interactions and patterns of engagement. The problem is that human-to-human interaction is most often an ad hoc interaction, with boundaries wide enough to allow the entities involved to fork or change the conversation at will. Although the entire range of conversation (engagement) can be scoped to include only those topics that are relevant to the actual business task at hand, the two entities can still discuss multiple topics and engage in a variety of activities.

For an electronic interaction to measure up to human-to-human interaction, the application has to let a human actor dynamically engage with it in multiple ways, at her discretion. The application’s task is to present the available interaction choices to the human and to react to the human’s stimuli in a manner that mimics the interactions between two humans, while simultaneously hooking those interactions into various business systems. Amazon.com, where other users’ interactions are studied and used as enriching data to present likely choices to those who are in the process of buying books, is an excellent example (this manifests as “Customers who bought this item also bought...”).

Visual and design aspects of this engagement are very important, as some tasks are more complicated than simple data interchange. Simple exchanges of data can be accomplished via electronic forms in HTML or PDF. For example, an interactive website that shows a customer how to build a custom Harley-Davidson motorcycle, style it with custom accessories and colors, and view the end result cannot be created effectively using forms. Visual components for Web 2.0 are being developed so cleverly now that some of the interactions possible between applications and humans would be hard to mimic in the real world with two human actors. The Amazon.com functionality mentioned earlier is one example of this—a human being would have to conduct very quick research to gather the data about other customers’ purchases, formulate a hypothesis about which other item the current customer would be most likely to buy, and then retrieve and present a visual rendition of that item. An application can do this much more quickly by running a quick database query to arrive at an answer.

Context

The Rich User Experience pattern can be applied anywhere a user might need to interact with an application in a way that optimizes the experience in terms of visual presentation of data and information relevance. This pattern is likely to occur where multiple remote resources are used to interact with and present the state of the applications as seen by the user.

Derived Requirements

Modeling and understanding the real-world process that accompanies the exchange of documents as part of a human-to-human interaction can help make rich user experiences genuinely rich. For example, if a human passes a sales brochure to a customer and at the same time collects information from the customer to guide her to a specific set of options in the brochure, the electronic interaction should support that extra interchange. To perfect this sort of interaction, alpha architects might want to make the information exchange invisible and seamless from the user’s perspective, perhaps by detecting the user’s IP address, using it to map the user’s locale, and then using that information to contextually specialize the user experience. A contextually specialized experience is one in which the user interacts with an application that has functionality, data components, and visual components uniquely configured for that user.

Note

Many companies do this by directing customers from their main web page to a national website. The corresponding national page is then contextually specialized for the user. This is a common “entry-level” application of the RUE pattern.
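
A sketch of that entry-level specialization (the IP-to-country lookup is a stand-in for a real geolocation service, and the URLs are hypothetical):

    // Hypothetical geolocation helper, stubbed for the sketch:
    declare function lookupCountry(ip: string): Promise<string>;

    // Map the requester's IP address to a national page, falling back to
    // the global site for countries without one.
    async function nationalHome(ip: string): Promise<string> {
      const country = await lookupCountry(ip);
      const national: Record<string, string> = {
        CA: "https://example.com/ca/",
        MX: "https://example.com/mx/",
      };
      return national[country] ?? "https://example.com/";
    }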

In architecting a rich user experience, developers and others have to consider that different users may require different presentations to derive the most benefit from the interaction. This is a strong reason to look at developing the application using the Model-View-Controller pattern, which will enable different application views to share the control and model components.

An application supporting rich visual experiences should be capable of rendering content in a variety of media-rich formats, each tied to a common shared view of the state of the core model (see the next section, on the Synchronized Web pattern, for an example of this concept).

An application delivering a rich user experience should also be capable of tying into a variety of backend system services to provide the richest data exchange possible.

Some other considerations for development of RUEs include the following:

  • RUEs should use an efficient, high-performance runtime for executing code and media content and for managing communications. Efficiency is a key attribute of a “good” user experience. If an application is viewed as taking too long to produce results or to change view states, it will not be very enticing to the user. Developers should note that sometimes the illusion of good performance and good user feedback (such as busy cursors) can mitigate perceptions of bad performance.

  • RUEs should be capable of mashing up and integrating content, communications, and application interfaces into a common runtime environment.

  • RUE technologies should enable rapid application development through a component library that is available during the development phase as well as at runtime. Although this trait is not specific to RUEs, it potentially frees developers’ time to make more prototypes, resulting in more options to choose from.

  • RUEs should enable the use of web and data services provided by application servers and other services.

  • RUE technologies should support runtime scenarios for both Internet-connected and disconnected clients, including features for synchronization.

  • RUEs should be easy to deploy on multiple platforms and devices.

Generalized Solution

To deliver rich user experiences, developers should try to model the processes used in the real world (human-to-human interactions) and create workflows to facilitate similar, successful interactions within websites or applications. These processes may have various start and end states as well as decision and escalation points. A process might initiate with the user requesting the website, in which case the developer has to be able to understand as much about that user as possible before the user advances to the next step. Alternatively, a rich user experience may involve a complex backend system that pushes relevant and timely information to the user based on some event in which the user is interested. Some of the first questions the process modeler and developer should ask in this case are:

  • Where is the user located?

  • Has this user been to the site before (cookie check)?

  • Is the user a regular customer (or other partner) of mine?

  • What are the user’s technical capabilities (browser, bandwidth)?

  • Do I already have information stored about this user so I don’t have to pester him to re-enter data on every form he fills in?

  • What is this user likely to do as a next step? Developers can monitor interactions and apply Bayesian inference to predict likely next steps based on what is already known about the user.

An example

Competent architects and developers must also account for possible forks or deviations from the process’s main path. For a very simplistic example, let’s consider registering as a new user or logging in as an existing user.

In our example, there are several paths and states to consider. The design constraint is that our user will require only two tokens to log in: a username and a password. For the username, using a working email address with validation is a good practice because it can accomplish several things at once. Before we get into that, though, here is a list of possible sequences or states (a short sketch encoding them follows the list):

Normal

User remembers the username and the password and logs in with no issue.

Lost password

User remembers the username but not the password and needs to have the password reset.

Lost username

User remembers the password but forgets the username. Using email addresses as usernames can mitigate this problem, but a lost or forgotten username event can still happen if the user has multiple email addresses.

Everything gone

User forgets both the username and the password. While the user can get around this by signing up for a new account, that approach may lead to the presence of orphaned accounts that should be cleaned up.

Lost password, email changed

User lost the password and control of or access to the email address where the password might ordinarily be sent after a reset.

Unregistered user

User has neither a username nor a password and needs to log in.

Phishing

User is directed to a fake site and asked to enter his username and password. You can help guard against phishing-style attacks by allowing users to design custom skins for the login page, so they can instantly tell the real site from fakes. This customizing also lets the user choose a custom color scheme that appeals to him.

Double registering

User is already registered and tries to register again.
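
One way to keep all of these states in view during design is to encode them as a closed set of outcomes that the login flow must handle exhaustively; the names below are our own:

    // The sequences and states above, as a discriminated union. A switch
    // over `kind` lets the compiler verify that no state goes unhandled.
    type LoginOutcome =
      | { kind: "normal" }
      | { kind: "lostPassword"; username: string }
      | { kind: "lostUsername" }
      | { kind: "everythingGone" }
      | { kind: "lostPasswordEmailChanged" }
      | { kind: "unregistered" }
      | { kind: "suspectedPhishing" }
      | { kind: "doubleRegistration"; existingUsername: string };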

For a simple first stage of the process, designing mechanisms to mitigate most of the potential problems should be relatively easy. However, in more complex applications, your options may be less obvious. For example, if you are designing an Internet application where a user’s history of previous interactions is available as a series of tabbed panes, figuring out how to present them in a logical manner that is intuitive to all users might be difficult.

Static Structure and Dynamic Behavior

Figure 7.24, “Technology component view reference architecture for RUE developers and architects” shows the Web 2.0 Reference Architecture from Chapter 5, A Reference Architecture for Developers. Architects and developers can use this reference architecture as a guide when considering the various aspects of creating a rich user experience.

Figure 7.24. Technology component view reference architecture for RUE developers and architects

Architects and developers should first map out the interactions and experiences they wish their users to have. This may involve creating a series of storyboards using various GUI elements. During this activity, some artifacts should be developed to provide a map of the various paths moving forward.

The challenge is that a system must be built to handle all the potential conditions, without forcing the user to strain her memory or have to try too many times to log in. The data model of the user profile is a good place to start to think about how to architect the application. The following items should be associated with each user account:

  • A unique identifier (username)

  • An email address at which the user can receive any information (such as a reset password token) that is sent back

  • A password

  • Other details (such as the state of the interface the user sees when she logs in)

Upon reading the preceding list, a pragmatic architect might quickly realize that the first and second items can be combined by using the email address as the unique identifier, given that email addresses are guaranteed to be unique (except in situations where multiple people share an email account, but for the purposes of this example we will leave that condition out of the scope). This illustrates many of the benefits of a model-driven approach to software architecture: architects can ask questions such as “How many email addresses does a user have?” (zero to infinity) and “How many people might use the same email address?” (one to many), rather than assuming that everyone has exactly one email address.
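
Sketched as a minimal data model (storing a hash rather than the raw password is our own addition; it is standard practice rather than something the pattern mandates):

    // The user profile items above, with the email address doubling as the
    // unique identifier.
    interface UserAccount {
      email: string;         // unique identifier and destination for resets
      passwordHash: string;  // never persist the raw password
      uiState?: string;      // other details, e.g. the last interface state
    }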

A problem with using the email address is that some users have multiple email addresses. To simplify the user experience, it might be helpful to specify the type of email address. For example, depending on the type of application, you might want to prompt users to enter their “primary personal email” or “primary work email” addresses.

You could build the interface for the application to look something like that shown in Figure 7.25, “A simple login user interface”.

Figure 7.25. A simple login user interface

Without such guidance, users with several email addresses may be uncertain about which one to use. Understanding this user point of view does not itself constitute a rich user experience, but it is a small aspect of creating one.


Figure 7.26. A specialized date chooser

Figure 7.27. A graphically ugly way to lay out a form

Figure 7.28. The same form designed in a graphically appealing manner

We’re not done with our example yet; we still haven’t accounted for several use cases. Next, let’s consider a user who has not yet registered. Making the registration page very simple and giving it the same look as the login page has several advantages. First, users will not be scared away by having to answer numerous questions before they can register.

Second, a subconscious association between the account creation and login pages might help some users to remember their usernames and passwords later. You could easily modify the user interface in Figure 7.25, “A simple login user interface” to allow users to also use it to create an account, as shown in Figure 7.29, “Registration added to GUI” (note the addition of the text “Need to Register?”).

Figure 7.29. Registration added to GUI

When the user clicks the “Need to Register?” link, the GUI changes to include a password confirmation box, as shown in Figure 7.30, “A second view of the registration screen”.

Figure 7.30. A second view of the registration screen

This screen now allows a user to create an account. However, it could be optimized. Look at the “Return to Login” link. Why would you want to force your user, who has just registered for the first time, to log in? After all, he has just entered the requisite information. Having to repeat the process will not add any element of trust. If you are not going to verify the account creation through email, you should remove the link and simply allow the user to proceed to the site he wishes to access.

Most websites will add some form of validation. This may be done for a variety of reasons, but the best is to ensure that your user has not made a mistake when entering the email address. There are good and bad ways to do validation. Figure 7.31, “A good way to annoy your user while possibly not validating that his email address is correct” shows a bad way—having to enter an email address twice is annoying to most users, and it’s also a bad architectural choice. Your user could be having a bad day and enter her email address incorrectly both times; even if both email values match, they could still be wrong. Our advice is to never use the interface depicted in Figure 7.31, “A good way to annoy your user while possibly not validating that his email address is correct”.

Figure 7.31. A good way to annoy your user while possibly not validating that his email address is correct

The best way to validate that an email address is correct is to collect it once, and then send an email to that address with a link that the user has to click to complete the registration process. The sequence is simple, as shown in Figure 7.32, “The sequence of registration with email validation”, and if the user makes a mistake, it will be caught right away.

Figure 7.32. The sequence of registration with email validation
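
A hedged server-side sketch of that sequence (the persistence and mail helpers are hypothetical stubs, not any particular framework's API):

    import { randomUUID } from "crypto";

    // Hypothetical helpers, stubbed for the sketch:
    declare function savePending(email: string, password: string, token: string): Promise<void>;
    declare function sendMail(to: string, body: string): Promise<void>;
    declare function activate(token: string): Promise<boolean>;

    // Collect the address once, file the pending account under a one-time
    // token, and activate only when the emailed link is followed.
    async function register(email: string, password: string): Promise<void> {
      const token = randomUUID();
      await savePending(email, password, token);
      await sendMail(email, `Finish registering: https://example.com/confirm?token=${token}`);
    }

    async function confirmRegistration(token: string): Promise<boolean> {
      return activate(token); // true only if the token matches a pending account
    }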

This is also a small part of a well-designed system that delivers a Rich User Experience. The user does not have to enter information any more times than is necessary, and a lot of the work is done in the backend. (This example isn’t unique to Web 2.0 applications, but hopefully the process of getting here has shown some of what’s involved in creating a Rich User Experience.)

Now, suppose that some time has passed and the user has decided to log in. Use case 1 (normal login) is simple and can be completed via the screens shown in Figures 7.29, 7.30, and 7.31. We still have to consider use cases 2, 3, 4, and 5, though (a sketch of the password-reset flow in use case 2 follows these descriptions):

Use case 2: Lost password

In this case, the user remembers her username but not her password and needs to have the password replaced. Because we have the user’s email address (and we know it is valid), all we have to do is add a small link that says “Forgot your password?” to the login screen. The user can simply enter her username (email address) and click the link, and the system will send her a new password. A general security pattern here is to automatically reset the password to a random one and email it to the user rather than emailing her the old password. The reasoning is that she may also use the same password for other accounts, and exposing it via email could allow a clever hacker to use that information to gain access to other accounts (such as PayPal). A good architectural pattern to employ here is to also force the user to change her password upon logging in again for the first time after the lost password incident. Because most email travels in plain text over the Internet, letting the user keep the generated password is a bad idea.

Use case 3: Lost username

In this case, the user remembers his password but forgets his username. Although it’s rare for this to happen if email addresses are used as usernames, such an event can still occur if the user has several email addresses. By prompting the user for a specific type of email address at registration time, you can mitigate the likelihood of such incidents. In the user interface we designed, we used the phrase “Primary personal email address,” which should provide a sufficiently strong hint for most users. In this use case, it is a bad idea to allow users to enter their passwords and look up their email addresses because passwords are not guaranteed to be unique. Using OpenID[87] is a good solution here.

Use case 4: Everything gone

In this case, the user forgets both her username and her password. This is a special case that is most likely to arise with services in which users log in only infrequently (e.g., a domain name registration service, where users may log in only once and then not visit the service again for several years). In this case, often the best you can do is to ask the user to create a new account. If she enters an email address that matches an existing account username, you may be able to catch the mistake.

Use case 5: Lost password, email changed

In this case, the user has lost his password and has lost control of or access to the email address where the password might ordinarily be sent after a reset. An alternative to re-registration in this case may be to ask the user to provide a secondary email account, email the temporary password to that account, and then give the user a fixed amount of time to log in and change the primary email address.
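
Following the advice in use case 2, the reset flow might be sketched like this (helpers again stubbed; the 12-character temporary value is an arbitrary choice):

    import { randomUUID } from "crypto";

    // Hypothetical helpers, stubbed for the sketch:
    declare function setTemporaryPassword(email: string, temp: string): Promise<void>;
    declare function flagMustChangePassword(email: string): Promise<void>;
    declare function sendMail(to: string, body: string): Promise<void>;

    // Reset to a random temporary value and force a change at next login;
    // the user's old password is never emailed or otherwise exposed.
    async function resetPassword(email: string): Promise<void> {
      const temp = randomUUID().slice(0, 12);
      await setTemporaryPassword(email, temp);
      await flagMustChangePassword(email);
      await sendMail(email, `Your temporary password: ${temp}`);
    }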

As you can see, Rich User Experiences don’t follow one specific pattern. In the rest of this section, we discuss designer and architect considerations.

Implementation

One of the primary activities of any architect or developer should be to model the entire system, including subartifacts for things such as use cases, technology component views, sequence diagrams, and data models used within the application. If more than five people are working on a project, consider adopting a consistent modeling syntax and methodology for designing the interface and interactions, not just for the code.

Using a modeling methodology when implementing a RUE application will likely help everyone involved to understand where issues may arise. Adopting UMM, MDA, or RUP, as discussed earlier in the sidebar Model-Driven Architecture, will likely benefit your end users.

Capturing the knowledge of the application independent of technology is another consideration worth pondering. Technologies come and go; however, most business patterns remain fairly stable. Adopting a consistent modeling syntax and library of artifacts, including business patterns, processes, and other artifacts, likely will help you build rich applications.

Business Problem (Story) Resolved

In the section called “Generalized Solution” , we caught a small glimpse of how a technical solution can be used to enhance the overall user experience by encouraging careful consideration of the entire lexicon and scope of the interaction: every possible combination of initial situations was considered, and a solution was developed to account for each one of them. This approach was applied to only a small aspect of the overall experience (login), yet the methodology is applicable to the wider scope of application development.

Specializations

The Rich User Experience pattern has so many potential specializations that it would be hard to provide an exhaustive list. Here are some variations with strong architectural departures from the websites of yesteryear:

Getting away from the browser

Several companies are introducing technologies that let Rich User Experiences be delivered beyond the browser. This is not an attempt to kill the browser, but rather to bridge the chasm between compiled desktop applications and web-based applications. Two of the leading technologies in this arena are the Adobe Integrated Runtime and Sun Microsystems’s JavaFX. Microsoft’s Silverlight is presently constrained to the browser, though that may change.

Virtualization

Virtualization is about presenting the user with an interface that appears to be his native system without actually having him directly connect to that system. An example would be to turn an operating system into a web-based application so that a user can log in to any public terminal in the world and get the appearance of having logged into his own system.

Skinnable applications

This evolution allows users to custom-skin their applications. In the past, only software developers could do this; more recently, some applications have let users add their own skins or “chrome” for a custom view of the application. Besides being visually enticing, this specialization also helps to prevent certain types of fraud, such as phishing: because phishers cannot know what skin a user has applied, the user can more easily detect phishing attempts.

Known Uses

This pattern has thousands of known uses.

Consequences

The biggest consequence of architecting your system with this pattern is the realization of a good user experience. Those who implement cutting-edge applications raise the standard, setting a higher bar for other developers to follow. The pattern may also have a positive impact on your business. For example, online sales traditionally have a very high abandonment rate. However, a recent piece of research into the return on investment (ROI) of RUE, done by Allurent on Borders.com,[88] showed the following results after migration:

  • 62% higher conversion from website visitors to paying customers

  • 41% more products viewed

  • 11% more likely to recommend

Technical considerations range from increased system complexity to excluding users who lack the specific capabilities required to interact with the application. Increased complexity can be addressed by using the Software as a Service, Mashup, and Service-Oriented Architecture patterns (discussed earlier in this chapter) to help keep systems from becoming too interdependent. Excluding users must be weighed much more carefully: when building a Rich User Experience, you do not want to lose a sizeable portion of your users because their platforms lack the required technical capabilities. Each developer and architect must assess this consequence individually, based on the specific requirements.

The Synchronized Web Pattern

Also Known As

Names and terms you may see in conjunction with the Synchronized Web pattern include:

Office 2.0

The term refers collectively to a group of new synchronized applications that let people collaborate on what behaves as a single instance of a document. The documents (or other digital files), as well as the applications that create and show them, are often online, and all users have a consistent view of their state. The term is so popular that an Office 2.0 conference has been formed and continues to grow and prosper as of this writing.[89]

The Online Desktop pattern

Many companies (Google, Yahoo!, Facebook, Mozilla, Salesforce.com, and others) are changing the software we use by shifting applications from the desktop operating system to servers accessed via the Web (the Online Desktop pattern). This pattern also overlaps with the Software as a Service pattern.

Rich Internet Applications and Rich User Experiences

Rich Internet Applications and Rich User Experiences, both of which are defined and discussed elsewhere in this book, are relevant to but are not themselves manifestations of the Synchronized Web pattern. Many applications that embrace the Synchronized Web pattern also embrace the RUE pattern (discussed in the preceding section). Arguably, synchronization is in itself a facet of a good user experience.

The Observer pattern

The Observer pattern is a well-known computer science design pattern whereby one or more objects (called observers or listeners) are registered (or register themselves) to observe an event that may be raised by the observed object (the subject). It is commonly employed in distributed event-handling infrastructures and systems. See http://en.wikipedia.org/wiki/Observer_pattern for more information.
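
As a concrete illustration, here is a minimal sketch of the pattern in TypeScript; the names (Subject, Observer, notify) are our own, not taken from any particular framework:

```typescript
// A minimal Observer pattern sketch: observers register with a subject
// and are notified whenever the subject raises an event.
type Observer<T> = (event: T) => void;

class Subject<T> {
  private observers: Observer<T>[] = [];

  subscribe(observer: Observer<T>): void {
    this.observers.push(observer);
  }

  notify(event: T): void {
    for (const observer of this.observers) {
      observer(event);
    }
  }
}

// Usage: two listeners observe the same state-change events.
const priceFeed = new Subject<number>();
priceFeed.subscribe((price) => console.log(`Chart updated: ${price}`));
priceFeed.subscribe((price) => console.log(`Alert checked: ${price}`));
priceFeed.notify(42.5); // both observers run
```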

REST

The Representational State Transfer model (REST),[90] originally put forward by Roy Fielding for his doctoral thesis,[91] treats application state and functionality as resources to be manipulated with a constrained set of commands, boiled down to GET and PUT. REST describes a method for building large-scale hyperlinked systems, and it’s an ideal architectural style to use for getting and putting information required to synchronize the state of disparate resources.
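
As a sketch of how REST-style synchronization might look in practice, the following uses the browser’s fetch API to GET a resource’s current representation and PUT back a modified one; the URL and document shape are hypothetical:

```typescript
// REST-style synchronization sketch: GET the current representation of a
// resource, apply a local change, and PUT the new representation back.
// The URL and the TripResource shape are hypothetical.
interface TripResource {
  city: string;
  arrival: string; // ISO 8601 date
}

async function syncTrip(url: string, update: Partial<TripResource>): Promise<void> {
  const response = await fetch(url);              // GET current state
  const current: TripResource = await response.json();
  const next = { ...current, ...update };         // apply local change
  await fetch(url, {                              // PUT new state
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(next),
  });
}

syncTrip("https://example.com/trips/42", { city: "London" }).catch(console.error);
```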

Business Problem (Story)

Many web-based services are designed to be consumed in pure online mode and have no value to their consumers when those consumers are decoupled from the Internet. Although some would argue that we’re on the verge of pervasive, reliable broadband access powered by Wi-Fi, 3G networks, WiMAX, and wired connections, 100% online connectivity is not yet a reality. One of the authors of this book, who recently had his broadband service switched off and had to work over a dial-up connection, can attest to that. The business problem is how to synchronize the states of multiple applications connected to the Internet when you sometimes can’t reach the required services online.

Some people use “small pieces, loosely joined,” a mantra coined by David Weinberger, to describe the aspects of Web 2.0 dealing with application and service usage patterns. Those who live and work on the Web use many different online services, each with its own usefulness. Such systems often contain a great deal of redundancy, which is not necessarily a bad thing. An individual user might have redundancy in his Flickr, Facebook, WordPress, and other accounts. As our society increasingly comes to rely on these web services, however, it becomes useful to establish some synchronization between them. This is where an API for service-to-service synchronization might prove useful. Such multiservice synchronization will underpin much of the next generation of web applications, intermingled with desktop experiences in the context of Rich User Experiences (see the preceding discussion of the Rich User Experience pattern).

Context

The Synchronized Web pattern generally occurs wherever an application has both a local (desktop) component and a network-bound server or service component, and both are required to fulfill the functionality of the whole application. In some ways, this is a variant of the Software as a Service pattern; however, this pattern relates specifically to state management in terms of synchronization.

The pattern can also occur anywhere information is persisted in two or more locations in a mutable form. If the information changes state in only one physical form, the other form(s) must be synchronized when possible.

Note

The term synchronization is not used here in the same way it is in traditional database terminology. The term is used a bit more loosely for Web 2.0 purposes, and it touches on the related database concept called replication.

Anyone who has ever signed up for a new online service and asked, “Why can’t I just move data from my existing service to here?” is, in essence, asking the question, “Why don’t these applications support the synchronized Web?”

Derived Requirements

At a minimum, you need replication and synchronization services, a local web application server, and a data persistence tier that reports state changes. Such a data store must be more than a “bucket of bits”; you need some form of data-state manager to use this pattern. The record of state changes also acts as an audit trail, enabling a rollback to the last known synchronized state should something go wrong during the process.

Other possible requirements include modules to ensure the performance of the core components of the system.

You must have a high-level set of instructions or other methodology that declares the rules for how unsynchronized components should become synchronized. For example, a simple rule might compare timestamps and treat the most recent information as authoritative.
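
That timestamp rule, often called last-write-wins, can be sketched in a few lines; the record shape here is our own invention:

```typescript
// Last-write-wins: when two copies of a record disagree, keep the one
// with the most recent modification timestamp. A deliberately simple
// policy; real systems may need field-level merging or manual review.
interface VersionedRecord<T> {
  value: T;
  modifiedAt: number; // milliseconds since epoch
}

function reconcile<T>(
  local: VersionedRecord<T>,
  remote: VersionedRecord<T>,
): VersionedRecord<T> {
  return local.modifiedAt >= remote.modifiedAt ? local : remote;
}

const winner = reconcile(
  { value: "draft v2", modifiedAt: 1_700_000_500_000 },
  { value: "draft v1", modifiedAt: 1_700_000_000_000 },
);
console.log(winner.value); // "draft v2"
```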

Although an embedded relational database such as JavaDB or SQLite is the obvious replication service provider for this pattern, it makes no sense for synchronized web developers to limit themselves to relational databases. Web 2.0 is currently experiencing a data management revolution, with a far greater tolerance of heterogeneous data layers.[92] Database doesn’t necessarily mean relational.

Generalized Solution

A lightweight embeddable data persistence mechanism provides for service-to-service synchronization, as well as letting data be stored and indexed in and retrieved from either remote or local locations. A database is not the only required data-handling mechanism, however. AJAX-like functionality that enables updates of specific data fields, rather than entire web pages, is another foundational aspect of the Synchronized Web pattern. AJAX can be made even more efficient using the lightweight JavaScript Object Notation (JSON) format supported by all modern web browsers, although other approaches are acceptable for most architects and developers. Any technology that makes data transfer between services more efficient is useful in a synchronized web application, and local AJAX is now finding its way into more and more applications. Google Gears, for example, provides WorkerPool, a module designed to let JavaScript processes run in the background without blocking the web application’s UI.
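
To make the field-level update concrete, here is a small sketch of an AJAX-style partial refresh using the browser’s fetch API; the endpoint, element ID, and payload shape are all hypothetical:

```typescript
// AJAX-style partial update: fetch a small JSON payload and refresh a
// single field in the page rather than reloading the whole document.
// The endpoint and element ID are hypothetical.
async function refreshStockPrice(symbol: string): Promise<void> {
  const response = await fetch(`/api/quotes/${symbol}`);
  const quote: { price: number } = await response.json();
  const field = document.getElementById("price-" + symbol);
  if (field) {
    field.textContent = quote.price.toFixed(2); // only this field changes
  }
}

setInterval(() => refreshStockPrice("ACME"), 5000); // poll every 5 seconds
```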

Clients may be offline during some phases of the Synchronized Web pattern. During those periods, they can be prepared for service-to-service synchronization once a connection is re-established, with special data-handling routines applied for any translations that are required.

In some cases there is no offline phase, and the synchronization happens in real time between connected clients. The pattern is still the same: the communication takes the form of asynchronous messages sent as updates become necessary.

With this pattern, it is important to minimize the complexity of the API. Synchronized web solutions should use standard web methods (such as HTTP GET and PUT) wherever possible, generalizing the web development and deployment experience for client-side use. Several basic message exchange patterns exist; however, the more common ones are get, put, subscribe, and push. Supplemental methods exist for event notifications for changes in state in most applications embracing this pattern.
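
One way to express that minimal API surface is as an interface covering just those four exchanges; the type below is illustrative, not a standard:

```typescript
// A minimal synchronization API surface covering the four common
// exchanges: get, put, subscribe, and push. Illustrative, not a standard.
interface SyncService<T> {
  get(resourceId: string): Promise<T>;               // read the current state
  put(resourceId: string, state: T): Promise<void>;  // write a new state
  // Register for change notifications; returns an unsubscribe function.
  subscribe(resourceId: string, onChange: (state: T) => void): () => void;
  // Fire-and-forget notification of a local change, without awaiting a reply.
  push(resourceId: string, state: T): void;
}
```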

Although not always expressed in machine-readable form, a set of rules for how unsynchronized resources can reattain a synchronized state is imperative to the overall success of this pattern. The rules should carefully spell out how to resolve conflicts and misalignments when they arise.

Static Structure

With the Synchronized Web pattern, multiple clients may be synchronized to a single object’s state or multiple states. Although this pattern may appear to be enterprise-oriented, it is equally applicable to individual servers or even peer-to-peer interactions. The underlying concept is the same.

Figure 7.33, “Synchronized web clients registering to an object’s state (courtesy of James Ward)” depicts one manner in which a pattern of synchronization can occur. In this variation, the state misalignment originates in data on the client at the top left. This is pushed to a centralized service and federated down to multiple tiers in the enterprise’s architecture. Other clients that have subscribed to the change in state can be pushed notifications when the state changes fall outside permissible tolerances.

Figure 7.33. Synchronized web clients registering to an object’s state (courtesy of James Ward)

Synchronized web clients registering to an object’s state (courtesy of James Ward)

During the phase depicted in Figure 7.34, “Multiple interaction services being synchronized via one application”, the change in state is pushed up to every subscribed client that needs to know.

Figure 7.34. Multiple interaction services being synchronized via one application

Multiple interaction services being synchronized via one application

This pattern can facilitate many different types of software, ranging from online video games to financial services. The electronic stock markets of this decade use this pattern on a daily basis to ensure that stock prices are accurate.

Implementation

The industry is currently at a very early point in the synchronized web evolution, and standards have not yet shaken out, which makes decisions somewhat more difficult for architects, developers, and businesspeople. A thousand flowers are currently blooming, and standards are emerging; however, some flowers grow faster than others.

The proposed HTML5, for example, includes web sockets that allow for a much wider range of communication possibilities between browsers and servers, simplifying the process of keeping information synchronized.
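
For example, a browser client might keep a document synchronized over a web socket along these lines; the server URL and message format are hypothetical:

```typescript
// Keeping a document synchronized over an HTML5 web socket: the server
// pushes JSON-encoded changes down, and the client sends its own edits up
// the same connection. Server URL and message shape are hypothetical.
const socket = new WebSocket("wss://example.com/sync");

socket.onmessage = (event: MessageEvent) => {
  const change = JSON.parse(event.data as string);
  console.log("Remote change received:", change);
  // ...apply the change to the local copy of the document here...
};

function sendLocalEdit(field: string, value: string): void {
  socket.send(JSON.stringify({ field, value, at: Date.now() }));
}

socket.onopen = () => sendLocalEdit("title", "Chapter 7 draft");
```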

In actual deployment, service-to-service synchronization has become a popular feature of Facebook and even of the new Pulse service from Web 1.0 player Plaxo. (To be fair, Plaxo was always about synchronization, so in some respects it’s well positioned for what comes next; it has synchronization in its DNA.)

Synchronization also certainly matters in the online world. One example of a compelling synchronized web application is Dopplr.[93] With this travel management application, it is simple to import Twitter updates (an example of the Mashup pattern); data from both sites can be combined to provide something more meaningful to those who read it.

Perhaps the most pressing concern in building what comes next is enabling offline use of online services. Both Adobe and Sun Microsystems have recently made new runtime environments available to enable synchronized web-style offline access to online apps (and online synchronization for client-side apps), in the shape of AIR and JavaFX, respectively. Several smaller companies now offer online word processing, spreadsheet, or other data-grid applications that can be used in both online and offline modes and can themselves detect and account for changes in connectivity, effectively synchronizing the desktop with a service.

AJAX has long been used to build synchronization functionality into Rich Internet Applications. Zimbra was one of the first vendors to demonstrate an AJAX web application synchronized for offline access using standard database technology.[94] It used the Derby (otherwise known as JavaDB) database. The announcement of Google Gears in May 2007 was the tipping point, when the synchronized web “arrived” and developers were given some great tools to work with. Google Reader was one of the first applications to benefit. Google even used the offline use cases (the set of requirements for being able to use software when not connected to the Internet, increasingly common in most document-reading software) as the basis for a humorous blog announcement for Google Reader offline titled “Feeds on a plane!”[95] The following quote is from a related article in the Google Developer Blog titled “Going offline with Google Gears”:

One of the most frequently requested features for Google web applications is the ability to use them offline. Unfortunately, today’s web browsers lack some fundamental building blocks necessary to make offline web applications a reality.[96]

Another player is Joyent, with its Ruby on Rails-based Slingshot platform. Slingshot is designed to let developers build web applications using Rails that offer a somewhat desktop-like experience: when a user is offline, she can use the app, accessing a local copy of the data, and the next time she goes online any changes will be synchronized with the server.

The Dojo AJAX framework, an open source DHTML toolkit written in JavaScript, is also working toward synchronized web functionality in the shape of the Dojo Offline toolkit, which includes API calls for synchronization and network detection, and even a widget to indicate successful synchronization (see Figure 7.35, “The Dojo Offline toolkit”).

Figure 7.35. The Dojo Offline toolkit

The Dojo Offline toolkit

One particularly useful Dojo function is slurp(), which works out the resources (JavaScript files, Cascading Style Sheets, image tags, and so on) that the developer needs to consider when offline-enabling the app.[97]

On the client side of a synchronized web application, you will see a more bounded experience than in the web-dependent version. Introducing and managing constraints will be one of the keys to making the synchronized web work.

From an architectural perspective, a common model for objects, events, and event management is required. If clients are to subscribe to an object’s state, they need a consistent model for what represents an object. Likewise, a framework for listening to and handling events and other exceptions is a requirement. Many individual programming languages embrace a common model today, and architects need to be very clear when building applications that may touch components in disparate environments and domains of ownership.

For Ruby on Rails shops, Slingshot is an obvious choice for offline access to synchronized web applications. Developers targeting the Adobe AIR or JavaFX runtimes, meanwhile, have effectively already chosen a programming model that supports the synchronized web.

One important implementation detail to consider is the rise of the small database SQLite, which Adobe, Google, and Joyent Slingshot are adopting for storage and synchronization services. Adobe and Google are now explicitly partnering in this arena, and Mozilla Firefox 3.0 offers SQLite services out of the box.
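
As a sketch of how an embedded SQLite store can support this pattern, the following keeps a local change log pending synchronization. It assumes the Node better-sqlite3 library; the table layout and resource names are our own:

```typescript
// Sketch: a local SQLite change log that queues edits for later
// synchronization. Assumes the Node better-sqlite3 library; the table
// layout and resource names are invented for illustration.
import Database from "better-sqlite3";

const db = new Database("local-cache.db");
db.exec(`CREATE TABLE IF NOT EXISTS pending_changes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  resource TEXT NOT NULL,
  payload TEXT NOT NULL,
  changed_at INTEGER NOT NULL
)`);

// Record a change while offline...
db.prepare(
  "INSERT INTO pending_changes (resource, payload, changed_at) VALUES (?, ?, ?)",
).run("/trips/42", JSON.stringify({ city: "London" }), Date.now());

// ...and drain the log once connectivity returns.
const pending = db.prepare("SELECT * FROM pending_changes ORDER BY changed_at").all();
console.log(`${pending.length} change(s) awaiting synchronization`);
```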

When implementing this pattern, you should first build your synchronization services, and then work to optimize performance. The patterns of usage may not be obvious or even fully known until later, as they will be shaped by how users decide to interact with the applications.

The more standards you support in building an application or service, the more easily “synchronizable” it will be.

“Freedom to leave,” a concept championed by Simon Phipps at Sun Microsystems,[98] is automatically built into any true synchronized web application. In his blog, Phipps lamented the fact that a lot of user-owned data (such as bookmarks in a browser and email address lists) was not easy to port from one application or platform to another. This placed undue pressure on people to stay with the applications they were already using rather than migrate to newer applications or platforms. It’s not enough to merely be able to transfer or import data into a service; the user must be able to transfer it away again. The opposite situation is referred to as “lock-in,” and it is the enemy of the synchronized web, to be avoided at all costs. Open data and open standards underpin the synchronized web.

Business Problem (Story) Resolved

Establishing services that allow the free transfer of information between applications creates a much more functional environment for users to share and store data, and for businesses to emerge around it. The synchronized web will allow online software providers such as Google, as well as other Software as a Service companies (such as Salesforce.com), to compete on more than equal terms with self-contained software applications. Whereas once Microsoft had a seemingly unassailable advantage when it came to rich clients with offline storage for continuous productivity, with local processing for number-crunching and so on, Google and other synchronized web players are demonstrating that rich applications need not be 100% client-side, and that there are benefits to having some types of documents online. Adobe Systems’s September 2007 acquisition of Buzzword, an online word processing utility, shows the validity of such business models in the industry.

Specializations

Offline access to online web applications is one specialization of this pattern. For example, the Adobe Integrated Runtime made it easy for most developers to bring Flash-, HTML-, AJAX-, and PDF-based web applications to the desktop, effectively allowing any online web application to cache data locally so that it can work both online and offline. Early examples included a Salesforce.com application[99] built by Nitobi that allowed sales personnel to enter data while in offline mode; the application on the salesperson’s computer would then automatically synchronize with the main Salesforce.com database when the client application reconnected. Although the client and the service are not synchronized with each other while the client is offline, the fact that they do resynchronize is evolutionary for web-aware software applications.

Video gamers who play online also embrace a specialization of this pattern. Microsoft’s Halo 2 (one of the first applications that gave users the ability to compete online in real time against opponents anywhere in the world) really is one of the major technical marvels of modern times. Halo 2 is no longer unique in this capability; numerous other games, such as World of Warcraft, have embraced this pattern.

Distributed development repositories also replicate this pattern and specialize it for the field of software development. Most commonly, it is used to enable people in disparate locations to check code in and out of the same Subversion repository. The code repository keeps the state of the overall project synchronized and everyone’s views aligned.

Known Uses

This pattern is very common, and this section lists only a few additional examples. Google Reader and Google Gears both use this pattern, as do YouTube and other Flash-based video upload sites.

Teknision’s Finetune, a music discovery service with an AIR-based rich client to enable client-to-profile synchronization, also embraces this pattern.

Walled Gardens might be deemed an anti-pattern of sorts, as data and objects within the Walled Garden are often not visible to applications outside that environment.

Consequences

Synchronization and replication across loosely coupled web services have considerable benefits. For example, users can:

  • Take web applications offline

  • Take client applications online

  • More easily move from one service to another by making changes in one place and having the data automatically be synchronized in another place

Although not providing a single view of the combined state of a process or application per se, the synchronized web brings the user closer to it. We’re emphatically not talking here about what enterprise software companies would call “master data management.” The synchronized web doesn’t involve a massive data cleansing and normalizing exercise to create a single canonical view of “the truth.” One way to think of the distinction is in the context of the differences between formal taxonomy and the Web’s community-defined equivalents: tagsonomy and folksonomy. Formal ontologies such as the Suggested Upper Merged Ontology (SUMO)[100] are built on pure, untainted first-order logic in which each concept is defined as a distinct and undeniable truth. This logic deals with some of the problems that architects face, such as the requirement for consistent models for objects and events mentioned earlier. The beauty lies in the incontrovertible idealism and universally true axioms and tenets. Folksonomies, on the other hand, are rife with inconsistencies and sometimes illogical instances, yet they flourish and have been adopted (arguably) faster than most formalized ontology work.

The synchronized web is supposed to make users’ lives easier, not harder. So, when users act in one place, this pattern allows applications to update in another. When you update a Dopplr trip, for example, it also updates your WordPress blog. Or you can update your profile in Twitter, and the changes get syndicated to Facebook. No one entity needs to own and manage all the data to create huge benefits for users.

Matt Biddulph, the CTO of Dopplr, explains how this kind of synchronization should change the way that developers create web applications:

The fundamental web concept of Small Pieces, Loosely Joined is finally reaching maturity. It’s no longer necessary for every new webapp to re-implement every feature under the sun. Instead, they can concentrate on their core product, and share data with other sites using web APIs as the connectors.

The flipside of the “small pieces, loosely joined” mantra is that it can be a challenge to manage all the intermingling and data redundancy. To emphasize this point, consider how many profiles the average Internet user might have, including ones on MySpace, Facebook, Twitter, Flickr, Gmail, Yahoo!, the Microsoft Developer Network, various conference websites, and more. All contain the same basic semantic building blocks, yet they will inevitably become somewhat unsynchronized if some details change. The Synchronized Web pattern can be employed in software design to manage these states for you.

References

For more on the Synchronized Web pattern, please see the sources cited in the footnotes throughout this section.

The Collaborative Tagging Pattern

Also Known As

Terms and patterns associated with the Collaborative Tagging pattern include:

Folksonomy

Folksonomy is a popular term used to capture part of this pattern (as documented in the O’Reilly Radar report Web 2.0 Principles and Best Practices). There are multiple definitions of folksonomy; however, the one given in the O’Reilly Radar publication is closest to this pattern.

The Semantic Web Grounding pattern

We discuss the related pattern of Semantic Web Grounding later in this chapter. Such a pattern may use a collaborative tagging system as the basis for tagging resources; however, it may be implemented without this pattern.

The Declarative Living and Tag Gardening pattern

The Declarative Living and Tag Gardening pattern, which we discuss in the next section, is also highly relevant to this pattern.

Business Problem (Story)

Often, we need to use a search system to find resources on the Internet. The resources must match our needs, and to find relevant information, we need to enter search terms. The search system compares those terms with a metadata index that shows which resources might be relevant to our search.

The primary problem with such a system is that the metadata index is often built from tags applied by a small group of people who determine the relevancy of resources for specific terms. The smaller that group, the greater the chance that it will apply inaccurate tags to some resources or omit relationships between the search terms and the resources’ semantic relevancy. Furthermore, as written in an O’Reilly Radar brief:

Hierarchies by definition are top-down and typically defined in a centralized fashion, which is an impractical model poorly suited to the collaborative, decentralized, highly networked world of Web 2.0.

Rich media, such as audio and video, benefit from this explicit metadata because other forms of extracting meaning and searching [are] still a challenge.

The Collaborative Tagging pattern represents a ground-up style of semantic tagging that, by its very nature, will represent a larger cross section of society’s consensus regarding how to describe resources.

Note

The following definitions apply only to this pattern:

  • To tag means to apply labels to an object or resource. When we say “tagging an object,” we really mean adding a label to an object to declare what that object represents.

  • Resource is used as a catchall term to denote any digital asset that can have an identifier (this is the W3C definition from the Web Services Architecture Working Group). Examples of resources include online content, audio files, digital photos, bookmarks, news items, websites, products, blog posts, comments, and other items available online.

  • An entity is any human, application, bot, process, or other thing (including agents acting on behalf of one of these entities) that is capable of interacting with a resource.

In ontology circles, a triangle with the terms referent, term, and concept at its three points is often used to capture relationships between them. We use the word tag in a way that is synonymous with term, and referent is likewise similar to resource. The third term in the ontology triangle, the concept, is the abstract domain in which the lexicon or central meaning exists. This is noted as the conceptual domain in Figure 7.36, “Resource-tag relationship”. The entity is the agent or actor that makes the declaration linking all three of these concepts together. In this respect, the context in which that entity makes the declaration is an immense factor in the overall semantics, and in folksonomy implementations it can result in vastly differing tags.

Consider tagging a photograph of broccoli. One person might label it “cruciferous,” “vegetable,” or “nutritious,” while another might tag it “gross,” “bitter,” or worse.

Context

Collaborative tagging adds a community-based natural language semantic layer to other searching and indexing mechanisms, and therefore it is useful anywhere and in any context in which people want to search for and communicate about information. Tagging enables better findability without requiring top-down metadata annotation; with tagging systems, users generate their own metadata. Tagging allows for resources to be described and used in communities in a way that mimics real life. If you observe a child learning to speak, you will see tagging (and later, tag gardening) in conjunction with the community—parents, other kids—in action.

Derived Requirements

Resources must be associated with a metadata infrastructure that lets anyone interacting with the resources add tags to them to make declarations about their relevancy. The system must be bi-navigational so that searchers can find all the tags attached to a specific resource and find other resources tagged with the same terms.

The tags themselves must be part of some normative or natural language to represent meaning.
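
A bi-navigational index can be as simple as two maps kept in step by a single tagging operation, as in this sketch:

```typescript
// A bi-navigational tag index: look up resources by tag and tags by
// resource, kept consistent by a single tagging operation.
class TagIndex {
  private byTag = new Map<string, Set<string>>();
  private byResource = new Map<string, Set<string>>();

  tag(resourceId: string, tag: string): void {
    if (!this.byTag.has(tag)) this.byTag.set(tag, new Set());
    if (!this.byResource.has(resourceId)) this.byResource.set(resourceId, new Set());
    this.byTag.get(tag)!.add(resourceId);
    this.byResource.get(resourceId)!.add(tag);
  }

  resourcesFor(tag: string): string[] {
    return [...(this.byTag.get(tag) ?? [])];
  }

  tagsFor(resourceId: string): string[] {
    return [...(this.byResource.get(resourceId) ?? [])];
  }
}

const index = new TagIndex();
index.tag("photo-17", "broccoli");
index.tag("photo-17", "vegetable");
console.log(index.tagsFor("photo-17"));       // ["broccoli", "vegetable"]
console.log(index.resourcesFor("vegetable")); // ["photo-17"]
```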

Generalized Solution

To solve the business problem discussed earlier, a mechanism that lets users declare tags for specific resources should be in place, and its use should be encouraged. This pattern builds on the fact that some software mechanisms (such as folksonomies) become increasingly useful as more people use them. Additionally, collaborative tagging system users should be able to see the tags others have used to declare attributes or properties of a specific resource, to promote reuse of terminology and to increase the semantic understanding of the tags or of the resource.

Scott Golder and Bernardo A. Huberman explored this pattern in “The Structure of Collaborative Tagging Systems.”[101] In this great resource, they wrote:

Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given URL. We also present a dynamical model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.

Users tag resources with keywords to classify them in alignment with their perspectives, yet in terms of their own choosing. The tags themselves are indexed, and a search on the index can yield a list of resources that may be relevant for a specific tag. Alternatively, the relationship can be traversed in the other direction to determine what tags have been applied to a specific resource.

Static Structure

The Collaborative Tagging pattern’s static structure is simple by design. Resources (e.g., digital photos) are examined by entities (usually people interacting with those resources), which then tag those resources according to their own interpretations of what they represent. What one person might tag as “dog” another might tag as “friend,” and yet another might tag as “pooch” or perhaps “Pomeranian.” All these tags might have value for someone wanting to find that picture. Flickr is the exemplar service of this pattern. Communities have emerged that play games using Flickr tags, such as finding “orphan” tags that have been used just once. Collaborative tagging doesn’t require artificial intelligence or normalization; it keeps things simple by convention.

The tag, although freely chosen by the entity, is grounded in some form of conceptualization of a domain and represents that entity’s view of the resource based on the conceptual domain. In ontology circles, this concept is referred to as “pragmatics.” The resource is linked to the tag. This relationship is illustrated in Figure 7.36, “Resource-tag relationship”.

Figure 7.36. Resource-tag relationship

Resource-tag relationship

Dynamic Behavior

The primary sequence of activity in the Collaborative Tagging pattern is as follows. A user (an instance of an entity) makes a call to get a resource. The user then inspects the resource. After the user has inspected the resource, she can make a semantic declaration about the resource by adding a tag that references it (see Figure 7.37, “Sequence of events for tagging a resource”). Note that this diagram illustrates only the application of the tag, not the subsequent use of the tag for discovering related resources.

Figure 7.37. Sequence of events for tagging a resource

Sequence of events for tagging a resource

Implementation

Some tags have multiple meanings. In such cases, using an ontology or other higher order of logic to disambiguate meaning will help users find the correct resources. Implementers should also do what they can to encourage users to add tags to content, as the entire search and retrieval process benefits from more minds contributing tags. A synonym mechanism might be useful where tags or terms are interchangeable within a certain context. For example, those seeking “rental cars” might also be interested in finding resources tagged with “automobile for hire” or “vehicle leasing.”
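
A synonym mechanism can be sketched as an expansion step applied to the query before the index lookup; the synonym groups here are invented for illustration:

```typescript
// Synonym expansion for tag search: expand the user's query into the
// set of interchangeable tags before hitting the index. The synonym
// groups are invented for illustration.
const synonyms: string[][] = [
  ["rental cars", "automobile for hire", "vehicle leasing"],
];

function expandQuery(term: string): string[] {
  for (const group of synonyms) {
    if (group.includes(term)) return group; // search all interchangeable tags
  }
  return [term];
}

console.log(expandQuery("rental cars"));
// ["rental cars", "automobile for hire", "vehicle leasing"]
```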

You might also need some form of post-tagging analysis to avoid having entities thwart the Collaborative Tagging pattern by making illegal or unjustified declarations for their own benefit. For example, many website developers use or have used the HTML specification’s meta tag’s content attribute[102] to make inaccurate declarations about their sites’ content in an attempt to attract visitors; some were even taken to court and fined.

Business Problem (Story) Resolved

Increasing the user population usually makes a service more effective, and collaborative tagging is no exception. Semantics requires agreement and convention: the wider the agreement on a folksonomy term, the more useful the term becomes. As user populations grow, it becomes increasingly likely that someone else will also use a particular tag.

Many people don’t know how Google’s spell checker works.[103] The service doesn’t actually know anything at all about the words it checks. It does, however, know that a lot of people use them in a particular way: the right way. That’s the basis for programming collective intelligence in many contexts, and the same “learning by observation” approach can be applied to tags. Most, if not all, of the type-defining Web 2.0 web services use this pattern. Search indexes built using this system will likely appeal to a larger cross section of society. This is because, unlike taxonomies that were developed by very few people with a specific view of the world based on their own social upbringing, folksonomies are defined as they are created, by the majority of society. Logically, they stand a very good chance of being accurate to a large segment of users.

Terms also morph over time. For example, saying something or someone is “sick” in some contexts is slang meaning “really good” or “cool,” rather than an indication of suffering from an illness.

Specializations

There are ways to specialize this pattern. One way is to couple it with an upper-level ontology to classify tag terms that have multiple meanings. An upper-level ontology is a common, shared conceptualization of a domain. Two common examples are the Laboratory for Applied Ontology’s Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE)[104] and Adam Pease’s Suggested Upper Merged Ontology (SUMO).[105] Mapping SUMO to terms in WordNet can avoid ambiguities in cases where words have multiple meanings. Imagine searching for the term “Washington.” You would likely get results for George Washington (a president), Denzel Washington (an actor), Washington, DC (a city), the Washington Monument (a large monument), Washington State University (a school), and more. If folksonomies can be mapped in a manner that disambiguates pluralities of meanings, they might be a valuable mechanism for advancing Semantic Web interests.
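
In code, disambiguation amounts to mapping one surface tag to several candidate concepts and using context to narrow the choice; the concept identifiers and the heuristic below are invented for illustration:

```typescript
// Disambiguating a tag with multiple meanings: one surface term maps to
// several candidate concepts, and context (here, a co-occurring tag)
// narrows the choice. Concept identifiers are invented for illustration.
const senses = new Map<string, string[]>([
  ["washington", ["person:GeorgeWashington", "person:DenzelWashington",
                  "city:WashingtonDC", "school:WashingtonStateUniversity"]],
]);

function disambiguate(tag: string, contextTag: string): string[] {
  const candidates = senses.get(tag.toLowerCase()) ?? [];
  // Crude heuristic: prefer senses whose identifier shares the context word.
  const narrowed = candidates.filter((c) =>
    c.toLowerCase().includes(contextTag.toLowerCase()));
  return narrowed.length > 0 ? narrowed : candidates;
}

console.log(disambiguate("Washington", "city")); // ["city:WashingtonDC"]
```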

Another way to specialize this pattern would be to include human actors who monitor tags and build a massive thesaurus. Such a project could be of interest to entrepreneurs who want to work in the field of semantics.

Known Uses

Flickr, Google Earth, Technorati, and Slashdot are all notable uses of this pattern.

In particular, Flickr has implemented the capability to have many humans provide tags for digital photographs. Flickr has often been heralded as a pioneer in Web 2.0 in terms of folksonomy development. Technorati is likewise a pioneer in terms of letting individuals use tags for blog articles, and Slashdot’s beta tagging system lets any member place specific tags on any post. Technorati has even published a microformat for tags in blog entries and indexes blogs using those tags (and category data). Even when an explicit tagging mechanism isn’t available, users often create them, as in the #tag syntax common on Twitter.

Consequences

Collaborative tagging has created controversy. For example, some ontologists have argued that folksonomy approaches are worthless due to their inherent flaws and the lack of a formalized ontology to provide context for the declarations (i.e., the lack of a higher level of formal logic to help classify all things in the folksonomy). Consider tag clouds, which can help people find resources relevant to what they’re seeking.[106] Although this may seem to prove that tag clouds work, the logic behind such an assumption is flawed. Most people who find items that seem “relevant” to their search terms via tag clouds have no way of knowing whether those items were in fact the most relevant, because in most cases they don’t see all the choices. A further problem is that some tag cloud categories cannot be audited from an external perspective (an example might be “most popular”).

In spite of these flaws, it appears that several uses of collaborative tagging actually are semantically useful to those seeking articles or resources, even though not everyone may agree on the validity of the tags used. One example we encountered involved an article on Slashdot, a famous discussion board that bills itself as providing “news for nerds.” The article, which discussed cell phones and some hybrid technology approach, was tagged with the word stupid.[107] Although this tag represents a personal point of view and is in itself rather nondescriptive of the article’s content, it turned out that several people (including some of the authors of this book) who searched Slashdot for stupid felt that the tag accurately described most of the posts to which it was attributed. Despite the lack of a formal ontology and first-order logic, it appears that there are patterns of use in folksonomy approaches that have fairly consistent results among large groups of users.

References

Several other patterns are related to the Collaborative Tagging pattern. The following two sections (describing the Declarative Living and Tag Gardening pattern and the Semantic Web Grounding pattern) are particularly relevant. There are also several sections in Chapter 3, Dissecting Web 2.0 Examples that deal with folksonomies and collaboration that may be of interest.

The Declarative Living and Tag Gardening Pattern

Also Known As

Terms associated with the Declarative Living and Tag Gardening pattern include:

Ambient findability

The pattern of Declarative Living and Tag Gardening is similar to the concepts discussed in Peter Morville’s book Ambient Findability (O’Reilly). In this book, Morville describes several of the problems related to semantic declarations.

Social networks

While many people think of social networks as the rage of Web 2.0, social networks are in fact as old as society itself. What is truly new are the ways we have found to “declare” our social networks, or fractions thereof, on several Web 2.0 platforms. Most notable of these have been MySpace, Facebook, Twitter, Plaxo, LinkedIn, and YouTube.

Business Problem (Story)

In conversations around the world, people are constantly expressing their preferences and opinions. In other words, we live declaratively. We declare who our friends are and who our acquaintances or business colleagues are. We talk about the videos, music, books, art, food, and so on that we’ve encountered, the people that inspire us, and the people we’d rather avoid. It’s no different on the Web, except that on the Web we can make these declarations explicitly through common formats and technologies (such as tags, as discussed in the preceding section on the Collaborative Tagging pattern), or we can leave a trail of digital breadcrumbs and let other people draw their own conclusions. In both cases, others can capture these declarations and make use of them. That is, the breadcrumbs can be aggregated and given to actors (either human or machine) who can make inferences from these declarations.

Before the Internet, a lot of our declarations were never put to any use; many of them dissipated (quite literally) as hot air. But when people began to have conversations online, a new possibility arose. Because people’s preferences could be explicitly imprinted on the Internet’s memory as auditable bits, they could be aggregated and mined. Rather than trying to harvest the entire Internet, which would be like trying to catch a waterfall in a thimble, specific dedicated services have been created to optimize both the declaration and the aggregation of these preferences as “tags,” which can then provide rich semantic information (in a process known as tag gardening).

The next stage in the evolution of declarative living is the continued creation and adoption of standards—for example, microformats—giving the user more control and freedom to make declarations and the tag garden a richer, more formalized vocabulary and structure with which to work.

Context

This pattern occurs in any context where declarations are made or where mechanisms are used to harvest those declarations for some computer functionality. The declarative living aspects of the pattern embrace the concept of metadata (data about data), and the tag gardening aspects embrace inference, a component of artificial intelligence. Pragmatics (the implications of information in the context in which it is encountered) are also inherent in any such design.

Additionally, we see declarative living at work in any system in which the user can make a public statement, either explicitly or by leaving an unintentional audit trail of his actions. Declarative living and its corollary, tag gardening, have varying degrees of structure. A person blogging that she’s currently in London (or making such an announcement via Twitter or Dopplr) would be an example of declarative living and would enable tag gardening. That person might also use Google Maps to pinpoint her location, or even broadcast her exact current geographical coordinates using Twitter.

Derived Requirements

To facilitate this pattern, you need a mechanism that lets users make conscious declarations and, optionally, associate those declarations with resources. User interfaces supporting the Declarative Living and Tag Gardening pattern must be very intuitive and easy to use, and should facilitate data entry in the user’s natural language(s). The Blogger, Twitter, Dopplr, and Delicious interfaces, which allow users to enter declarations in plain text, are good examples.

When implementing the Declarative Living and Tag Gardening pattern, starting as simply as possible with the declarations aspect is preferable (you cannot garden/harvest tags until they exist). The tag gardening aspects can be built later, against either a centralized aggregation of declarations or a decentralized yet indexed collection. In the latter scenario, understanding the intricacies of mapping relational database systems to such an index is paramount to the success of the system. Employing a strategy of partial indexing early on rather than trying to build a full-blown, real-time system from scratch might help you avoid an exponential increase in bandwidth consumption.

Specific to tag gardening itself, you must also consider early on how to split and aggregate data to and from multiple machines. It’s not as difficult to have eight web servers talk to one database as it would be to have one web server talk to eight different databases. This is a problem for architects to handle.

Good perceived performance is also not optional. For Web 2.0 constituents, performance is an important feature. Debugging devices, test harnesses, and other monitoring systems (such as Nagios) are likely to be important tools for developers to use.

Developers and architects should prepare for scale but should not try to do everything up-front. Using proxies between services and the actors consuming them is likely to be beneficial, as proxies can cache and deliver the bytes more efficiently (think of Akamai, discussed in Chapter 3, Dissecting Web 2.0 Examples). Declarations can be harvested by talking to proxies that aggregate them and offer services at some logical level of granularity.

Generalized Solution

Any successful declarative living service must, by definition, be extremely easy to use, as well as self-contained in terms of providing value to the end user; otherwise, it will not be adopted. Delicious.com exemplifies this pattern. Joshua Schachter, the founder of the service, constantly refuses feature requests that might make people less likely to actually use the service for declarative living and web annotation. Formal Semantic Web technologies such as the Web Ontology Language (OWL) and the Resource Description Framework (RDF) force too much work onto the user. People are unlikely to spend time annotating content or relating their terms to nodes in a formal ontology. Even if some did this, the chances of this work being performed consistently are very minimal. Most new Internet users seem to prefer to get their declarations online quickly and tag resources in their own way. For those building a service, a key design principle should be to remove as many barriers to participation as possible. The service should require as few clicks and field fill-ins as possible. (It may, of course, make sense for applications to keep track of these declarations using RDF internally, allowing them to apply RDF tools to user-provided information.)

Unlike Ma.gnolia.com (see Figure 7.38, “The ability to add degrees of interest indicated by stars, in Ma.gnolia.com”), a Delicious competitor with a five-star rating system similar to that of iTunes or Windows Media Player, Delicious provides a more binary choice: is this web resource interesting or not? Delicious doesn’t allow for degrees of interest (at least, not at the single-user level). The key point here is that if more granularity is available for declarative living, more care must be taken when making decisions to keep choices consistent.

Figure 7.38. The ability to add degrees of interest indicated by stars, in Ma.gnolia.com

The ability to add degrees of interest indicated by stars, in Ma.gnolia.com

Last.FM, which CBS acquired earlier this year for $280 million, is a social service that brings music lovers together. The masterstroke for Last.FM, from a business model perspective, was acquiring Audioscrobbler, a plug-in that listens to a user’s music player and uploads a logfile of that user’s choices. Thus, rather than you sitting there entering a bunch of data about what music you like, Audioscrobbler runs in the background and does it for you, creating a constant stream of musical breadcrumbs. Last.FM users can also use the plug-in to add additional metadata tags to the track listings, but just as with Delicious, the fact that you’ve played a track is pretty good evidence that you actually like it. If you allow someone to harvest the fact that you listened to the same song between one and five times per day, she may reasonably infer that you thought it was a good song.

Taking the automated agent-based tagging approach to its natural conclusion, it’s now time to consider the notion of tag gardening as applied to objects, rather than people, that “live” declaratively. Radio Frequency Identification (RFID) tagging and monitoring is an enterprise-oriented implementation of this pattern. When every instance of a class of things starts to make harvestable declarations, an exponential explosion of tag gardening may occur. When we can label many of these devices with our own tags as well, human and machine tags will be intermingled and harvesting will be required.

The ability to aggregate all the tags is not an enterprise-scale requirement or a web-scale requirement; it goes far beyond both scopes. Arguably, developers and architects haven’t even begun to tackle the really thorny scaling problems in our industry (the ability to implement patterns such as this one across the whole Internet, or a large subset of it). Sun Microsystems CTO Greg Papadopoulos has recently begun to talk about what he describes as a Red Shift, arguing that it will take a hardcore computer science foundation to build scalable next-generation services (which is, of course, Sun’s core competence). One of the defining characteristics of Web 2.0 developers and architects has been a dismissal of many traditional enterprise architecture patterns. But to move forward, they’re likely going to need to scale up, and scale out, server-side managed session state in some cases, along with coming up with new approaches to data integrity and referentiality. Web 2.0 patterns don’t change the fundamental challenges of computer science.

Smart objects aren’t the only things that will need to be marshaled in declarative living. All kinds of digital assets, including documents, images, videos, audio files, users, real-world objects, and other entities, will likely benefit from this pattern. All of these resources have lives and functions, can be marked up, and can contain embedded declarations along the lines of “X is worth Y” and “A has a business relationship with B.” The notion of declarative resources can lead us into some mental gymnastics. For instance, Peter Morville talks about the notion of an antelope as a document. Using the work of Paul Otlet and Suzanne Briet as background, he explains that in certain conditions, even an antelope could be considered a document—that is, if a photo is taken of the animal and is used in an academic setting, the animal has become, in effect, a document because it has been “marked up.” Continuing in this vein, Morville asks:

What if we leave the antelope in the wild, but embed an RFID tag or attach a GPS transponder or assign it a URL? What if our antelope is indexed by Google?[108]

These are really interesting questions, and once we internalize the notion that anything we tag becomes a document (or other resource) to be added to a global body of knowledge, we can begin to understand the deep implications of declarative living and tag gardening.

Recently, we have seen an explosion of Apple iPhone applications that use location services to make declarations about where you did something. An example of this is the application iMapMyRide.com, which uses the GPS data from your phone to track or tag your movements and can overlay this data on a map. More novel uses are certain to emerge, and the pattern of declarative living is here to stay.

Static Structure

This pattern has two static views. The first, shown in Figure 7.39, “The Declarative Living pattern”, is a simple view in which a user (actor) interacts with a resource and makes declarations about the resource. Via these declarations, people express themselves and convey aspects of how they perceive their existence. This, in itself, is similar to the Collaborative Tagging pattern. However, it is specific to instances in which a human actor is involved and making the declarations, whereas the Collaborative Tagging pattern is agnostic about which actors may make such declarations.

Figure 7.39. The Declarative Living pattern

The Declarative Living pattern

A second static view of this pattern corresponds to the aspect of tag gardening. This is where declarations may be subsequently harvested and/or aggregated (see Figure 7.40, “The Tag Gardening pattern”).

Figure 7.40. The Tag Gardening pattern

The Tag Gardening pattern

Note that the tag gardening actor can be a human, a machine, or another type of actor. Once harvested, declarations can be represented in many forms, as noted earlier.

The conceptual domain shown in Figure 7.39, “The Declarative Living pattern” refers to a higher-level abstract concept of the instance, coupled with an alignment in first-order logic (FOL). Several upper ontologies embrace this as best they can, and you can use the work of the upper ontologies as a platform on which to build mid-level ontologies and more domain-specific taxonomies. Figure 7.41, “The relationship of SUMO’s first order of logic to mid-level ontologies”[109] illustrates the relationships between FOL and domain-specific ontologies.

Figure 7.41. The relationship of SUMO’s first order of logic to mid-level ontologies

The relationship of SUMO’s first order of logic to mid-level ontologies

The ability to link semantic declarations to higher-level concepts is still in its infancy; however, it has become an area of interest in some ontology circles, including the Ontolog Forum.[110] You can reach SUMO creator Dr. Adam Pease, of Articulate Software, via the Ontolog Forum, alongside ontology gurus and notables such as Professor Bill McCarthy, Dr. Matthew West (Shell), John Sowa, and others.

Dynamic Behavior

The dynamic aspects of the Declarative Living and Tag Gardening pattern are roughly equivalent to those of the Collaborative Tagging pattern (see Figure 7.37 in the preceding section).

Implementation

There is a fair amount of overlap between the Collaborative Tagging and Declarative Living and Tag Gardening patterns.

The two patterns do, however, have some differences in terms of analysis. Tag gardening offers some specific approaches: tag metadata can, for example, be expressed with user interface patterns such as tag clouds, as shown in Figure 7.42, “A tag cloud”.

Figure 7.42. A tag cloud

A tag cloud

The tag cloud in Figure 7.42, “A tag cloud” comes from Flickr, and it represents many people’s first exposure to the term. In a tag cloud, the more popular a term is, the bigger and bolder it becomes. Essentially, this represents a garden of tags used to declare aspects of digital images referenced within Flickr.
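
Rendering such a cloud is largely a matter of scaling font size with tag frequency. Here is one possible sketch, using a logarithmic scale so that very popular tags don’t dwarf everything else:

```typescript
// Tag cloud sizing sketch: scale each tag's font size with the logarithm
// of its frequency, mapped onto a fixed range of pixel sizes.
function fontSizeFor(count: number, minCount: number, maxCount: number): number {
  const minPx = 11, maxPx = 36;
  if (maxCount === minCount) return minPx;
  const weight = (Math.log(count) - Math.log(minCount)) /
                 (Math.log(maxCount) - Math.log(minCount));
  return Math.round(minPx + weight * (maxPx - minPx));
}

const tagCounts = { sunset: 980, broccoli: 12, travel: 450 };
const counts = Object.values(tagCounts);
for (const [tag, count] of Object.entries(tagCounts)) {
  console.log(`${tag}: ${fontSizeFor(count, Math.min(...counts), Math.max(...counts))}px`);
}
// sunset renders largest (36px), broccoli smallest (11px).
```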

Although people don’t generally consider tagsonomies amenable to hierarchy, hierarchies do emerge. Paul Lamere, a researcher at Sun Microsystems, has done some very interesting work using a search engine to analyze tagsonomies and extract hierarchies from them (see Figure 7.43, “Representation of a hierarchy automatically extracted from a number of tags about different styles of heavy metal music”).[111] As in taxonomies, these hierarchies usually convey the knowledge aspects of inheritance as well as polymorphisms and dependencies. They can also represent relationships in which one concept splits into other concepts because the child descendants are disjoint. Disjoint binary relationships are sets in which each sibling has an exclusive characteristic that contrasts with the others. For example, you could note the class of all humans, and then make further distinctions based on humans who are female and others who are male. With the exception of a small transgender community, these subclasses are considered disjoint.

Figure 7.43. Representation of a hierarchy automatically extracted from a number of tags about different styles of heavy metal music

Given that most systems employ some form of automated access, avoiding complexity in any API is a good strategy for those offering services to collect declarations. Most developers find SOAP overly complex, let alone other flavors of XML web services, CORBA, or RPC. Delicious offers only the ability to GET the value of a resource or PUT a new resource, using a RESTful approach (for more on REST, see the section called “Also Known As”).
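To illustrate just how small such an API surface can be, here is a Python sketch against a hypothetical tagging service. The endpoint and payload shape are invented for illustration and are not Delicious’s actual API:

import json
import urllib.request

BASE = "https://tags.example.com/api"  # hypothetical service endpoint

def put_declaration(resource_id, tags):
    """PUT a new set of tags (declarations) for a resource."""
    req = urllib.request.Request(
        f"{BASE}/resources/{resource_id}",
        data=json.dumps({"tags": tags}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return urllib.request.urlopen(req).status

def get_declaration(resource_id):
    """GET the tags currently declared for a resource."""
    with urllib.request.urlopen(f"{BASE}/resources/{resource_id}") as resp:
        return json.load(resp)

# put_declaration("some-resource-id", ["webdev", "patterns"])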

Business Problem (Story) Resolved

When it comes to declarative living, the proof is very much in adoption behaviors. Although Twitter is not a tag-based system per se, it may morph into one: microformat support is one possibility currently being discussed, and the service already supports map coordinates for user locations. But Twitter is certainly a declarative living engine, and its adoption has been explosive.

At the time of this writing, business problems are still evolving at a rapid pace, and nothing less than community and the clarification of digital assets is required in our age of disintermediation.

Specializations

Humans think in a manner that is conducive to implementing this pattern. Our subconscious thoughts often try to make meaning out of the situations we encounter, and by analyzing the tags a specific person applies to a variety of resources, you can determine a lot about that person.

Declarative living (i.e., the act of making explicit declarations about resources we encounter) can be specialized in many ways. Music tagging, for example, represents a rich field for tag-based systems. An entire science is now being established to deliver musical recommendations to users based on gardening their tags. The main fight is between Rhapsody, which makes music recommendations by algorithmically breaking music down into constituent parts and recommending whatever is similar, and Last.FM, which offers recommendations based on tag gardening the collective intelligence of its users. Last.FM and Rhapsody both offer surprisingly accurate recommendations, with some key differences in the “look and feel” of the app. Rhapsody is like a radio station that only plays the music you like, whereas Last.FM is like going back to college and hanging out with lots of cool people with great music collections, some of which you find utterly amazing and some of which you abhor. Both services can be tuned to give you more of what you want, but Last.FM is more likely to throw something unexpected into the mix. CBS has great ambitions for Last.FM; one obvious possibility is to turn it into a TV, rather than audio, recommendation service.
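As a concrete illustration of the tag-gardening approach to recommendation, here is a minimal Python sketch that ranks artists by the cosine similarity of their harvested tag vectors. The artists and tags are hypothetical, and this is not Last.FM’s actual algorithm:

from collections import Counter

def tag_similarity(a_tags, b_tags):
    """Cosine similarity between two tag-count vectors."""
    a, b = Counter(a_tags), Counter(b_tags)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return dot / norm if norm else 0.0

# Hypothetical harvested tags (tag-gardening output), not real Last.FM data:
profiles = {
    "Artist A": ["shoegaze", "dreamy", "indie", "indie"],
    "Artist B": ["indie", "indie", "lo-fi", "dreamy"],
    "Artist C": ["techno", "electronic"],
}
seed = profiles["Artist A"]
ranked = sorted(((tag_similarity(seed, tags), name)
                 for name, tags in profiles.items() if name != "Artist A"),
                reverse=True)
print(ranked)  # Artist B should rank far above Artist C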

A number of additional specializations can be applied to this pattern:

Book tagging

A great deal of Amazon.com’s success is predicated on getting customers to write reviews for others to read, and its recommendation service uses tag gardening very effectively.

URL tagging

A number of URL tagging services exist. How many times have you wondered, “Oh no, what was that link again?” That’s the problem these services are designed to handle.

Search

The most powerful example of tag gardening in action is Google PageRank. Though PageRank isn’t usually thought of in relation to URL tagging, Google’s spiders effectively crawl the Web looking for links to other sites. Google treats these links much like tags of interest, which can then be parsed for authority.

Photo tagging

Flickr really did write the book here. Stewart Butterfield and Caterina Fake beat their competitors by making user-generated content the core element of Flickr’s photo storage and sharing services. The mechanism for letting users do amazing things with Flickr, such as creating recipe books, making games, illustrating poems, and so on, was the humble tag.

RFID mashups

It’s really too early to define winners in this space, but there are some interesting implementations out there. London’s Oyster card, for example, offers cheap passage on public transportation in London. As you move around the city, fares are automatically deducted from your prepaid RFID-enabled card; the optional auto top-up feature ensures that you never get stranded. Users can also query the system to see where they have been traveling. This is a very different example from the others listed earlier, and that difference is intentional. Machine tagging (in this case, geographical) is an important element in the future of the Declarative Living and Tag Gardening pattern.

Distributed development repositories

Distributed development repositories also replicate this pattern and specialize it for the field of software development—most commonly, to enable people on different continents to check in and check out code using the same interface to the versioning system. Developers working on large-scale open source projects typically use such content versioning systems to avoid overwriting each other’s work, and to track the state of the overall project.

Location-based declarations

Several mobile and laptop hardware platforms already include GPS sensors that can be used to augment any declaration and make it geospatially specific. This trend is manifesting as massive growth in commercial applications for things such as finding nearby restaurants or gas stations, remembering where you took photographs, and more.

Offline access to online web applications

This was discussed in the earlier section on the Synchronized Web pattern (under the section called “Specializations”).

Known Uses

Delicious is minimalist and is primarily used by a geeky, developer-centric crowd. Ma.gnolia is a bit prettier and is more widely used by web designers, though data corruption brought it to a sudden halt in early 2009. Functionally, it is similar to Delicious in that members save websites as bookmarks, like they would in their browsers. The users also tag the websites, assigning semantically meaningful labels to make them easy to find again. Other users may search via Ma.gnolia and find sites that others have deemed worthwhile, based on the tags they have applied to those websites.

Digg is a community site where users can flag URLs as interesting. By combining all declarations of usefulness, compiling a list of sites that are really of interest to a large segment of society is easy. Since 2007, many people have used Digg as a verb (“Can you digg this blog post?”).

Twitter is both the ultimate declarative living site, allowing people to make declarations about any aspects of their lives that they wish to make public, and one of the best places to harvest those declarations (other users can subscribe to your updates to see what you’ve been doing). Some users have fanatically embraced this site, whereas others have labeled it one of the most annoying uses of technology of all time (a poll was set up at http://wis.dm/questions/37343-do-you-find-twitter-annoying). Using the poll to declare your answer is in itself a manifestation of this pattern.

Dopplr is a great site that can be used for declaring and harvesting travel information for people in your social circles.

Consequences

If you minimize the complexity of the tagging infrastructure and the API, the declarative living service and its associated tag garden can quickly scale and attract new users. The size of the community alone is no guarantee of income, but attention can always be monetized. If you have enough eyeballs, a business model will emerge; ad sales are one obvious opportunity.

If tags make the user experience better, they encourage community, which in turn encourages more attention, and so increases the possibility to make money, or at least deepen community ties. In the case of Delicious, for example, its owner (Yahoo!) has chosen not to try to make money directly from the service.

References

Josh Schachter is an evangelist and a great storyteller. He has done as much as anyone to define the Declarative Living and Tag Gardening pattern. See, for example, his Future of Web Apps 2006 presentation,[112] from which this section borrows heavily.

Other references include the following:

  • Search Inside the Music (the blog of Paul Lamere, who works in research at Sun Microsystems)—http://research.sun.com/spotlight/2006/2006-06-28_search_inside_music.html

  • Ambient Findability, by Peter Morville (O’Reilly)

  • Clay Shirky, various essays

  • David Weinberger, various essays

  • The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, by John Battelle (Portfolio)

The Semantic Web Grounding Pattern

Also Known As

The Semantic Web Grounding pattern is related to several patterns and concepts discussed in this book:

The Collaborative Tagging and Declarative Living and Tag Gardening patterns, among others

This pattern draws upon several related patterns, including the Collaborative Tagging pattern, the Declarative Living and Tag Gardening pattern, and various other search and retrieval patterns.

Adaptive software

Adaptive software is a term used to describe a pattern that employs the grounding aspect to “understand” semantic declarations.

The Semantic Web

Some consider the overlay of a good semantics mechanism to be the harbinger of the next generation of the Internet, known as either “the Semantic Web” or “Web 3.0.” For the record, we do not like the term Web 3.0 and would encourage everyone to stop using it. Semantic mechanisms are complex to understand, but they are already part of the Web today.

Business Problem (Story)

A government agency archives hundreds of thousands of data sets submitted from various public and private sector organizations. To make the archives useful, a search facility should exist to allow users to search for and retrieve the data sets they require.

Unfortunately, many data sets are likely to match any specific search request, making it hard for the searchers to find the exact data sets they want. A sorting and ranking algorithm can make preliminary search results somewhat more relevant; however, the thousands of records returned for each search make the retrieval of data sets slow and inefficient.

Some sort of system is needed to aid those searching for specific data sets. To implement such a mechanism, you need a set of search and query algorithms that can rapidly locate the specific data set required.

Some form of semantic declaration about web resources can aid entities wishing to find resources that are relevant to their requirements. These declarations can come from many sources (a folksonomy approach) or a single source. The core problem is that regardless of a resource tag’s origin, most semantic work is not grounded in real-world post-interaction analysis. To rectify this, the Semantic Web Grounding pattern includes a secondary step whereby an application can inspect the claims made about a resource’s semantics and compare those claims to real-world patterns of interaction with the resource or its attributes. No matter where the tags come from, subsequent to the tagging activity, monitoring the patterns of interaction with resources and correlating those actions to the tags used can help refine the relevancy of semantic search and retrieval, enabling applications to adaptively infer desirable results and best choices for future searches.

Context

This pattern can be used wherever large numbers of resources exist and some form of declaration about what they represent is required to aid entities wishing to locate specific resources. The pattern implies that the metadata declarations are not made using some pre-agreed form of semantics; hence, an adaptive mechanism is required for semantic reconciliation. A metadata declaration is nothing more than the application of a label, or tag. The meaning of that label, however, can vary greatly depending on the context (a concept known as pragmatics).

Derived Requirements

To facilitate the Semantic Web Grounding pattern, you need a mechanism for structured exchange of metadata that allows claims to be made about resources. The claims must be in a syntax that is universally parsable by all entities of a fabric. In this context, the term fabric is used to denote all accessible points on any given network, regardless of protocols used (this is akin to the use of the word in the phrase “fabric of space” to denote the universe). It could refer to a Bluetooth short-reach network or even to the Internet, and it is used to ensure that the pattern can be applied to the widest possible range of situations. The claims must be linked to specific resources, and each resource must be uniquely identified.

The entities using the claims should employ some mechanism to reconcile them against the observable real-world effects of interactions with those resources, to facilitate cause-and-effect auditing. This mechanism should allow adaptive inferences about future resource claims based on a history of claims versus real-world effects. For example, you could find a way to track search engine users to see how they interact with resources returned for specific search terms. If a user interacts with a resource and then seems to be finished with his search (as evidenced by not searching for the same topic again), an observer could conclude that the resource is more relevant than another resource that the user visited only briefly before returning to the search engine and clicking on another search result. Conversely, if the user clicks on more than one search result, you could infer that the first resource visited was not sufficient, and potentially lower that resource’s “score” for the search term. Such observations by an application monitoring searches could lead to the conclusion that these resources did not fulfill the user’s needs.
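A toy Python sketch of this inference follows. The session format and scoring weights are invented for illustration; no real search engine’s implementation is being reproduced here:

def update_relevance(scores, session):
    """Adjust (term, resource) relevance scores from one observed
    search session, per the inference described above.

    session = {"term": ..., "clicks": [resource ids in click order],
               "searched_again": bool}
    """
    term, clicks = session["term"], session["clicks"]
    for i, resource in enumerate(clicks):
        key = (term, resource)
        if i < len(clicks) - 1:
            # User came back and clicked something else: demote.
            scores[key] = scores.get(key, 0.0) - 0.5
        elif not session["searched_again"]:
            # Final click with no repeat search: likely satisfied, promote.
            scores[key] = scores.get(key, 0.0) + 1.0
    return scores

scores = {}
update_relevance(scores, {"term": "soa patterns", "clicks": ["r1", "r2"],
                          "searched_again": False})
print(scores)  # {('soa patterns', 'r1'): -0.5, ('soa patterns', 'r2'): 1.0}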

Generalized Solution

The generalized solution for this pattern is a mechanism whereby resources are tagged with metadata that relays claims about them to entities seeking a specific resource or set of resources. Entities that interact with those resources are also able to add tags.

Static Structure

Figure 7.44, “The core of the Semantic Web Grounding pattern” illustrates the simple pattern for resource claims and reconciliation with real-world effects. An entity that has an interest in a specific class of resources can inspect the claims made by other entities about a given set of resources. These claims generally describe what the resource is or some aspects about unique instances of that class of resources. The entity’s interaction with any given resource can form part of a real-world effect that may be of interest to other entities seeking the same type or class of resources. These interactions are the basis of this pattern. Without being grounded in a real-world effect, the claims are essentially unquantifiable in terms of accuracy.

Figure 7.44. The core of the Semantic Web Grounding pattern

Implementation

You can implement this pattern using a simple word-tagging mechanism in which any entity that interacts with the resource can declare what tags it believes are relevant to describe the resource. Subsequent entities first parse those tags (claims), and then interact with the resource itself. A post-mortem analysis of the relationship between the declared metadata and the real-world effect of the interaction is then carried out. This is done to determine whether the resources are relevant for the terms with which they are tagged.

In the real world, search engines that index large numbers of HTML files perform in this manner. First, entities (in this case, the authors of HTML pages) use metadata declarations to indicate what their web pages are about. These declarations are often nested inside an HTML meta element. When the search engine first builds an index of resources, it may use these tags to create relevancy rankings for the web pages.

As subsequent entities (for example, other human actors) come to perform searches, they are presented with results based largely on a combination of the actual metadata claims and an index of relevancy to specific terms. The index of relevancy is a mechanism that keeps a score of the relevance of a resource, much like Google PageRank. The search engine returns a list of options to the entity performing the search and monitors the entity’s interactions with the resources. In most cases, search engines return results in a format that allows tracking of which resources the entity visits. The search engine also provides the entity with a cookie to determine whether the entity returns to the search utility after visiting an external resource.

An inference engine is fed data about the real-world effects, and the search engine learns from observing the behavior of multiple entities. Over time, a pattern starts to appear regarding what resources are most relevant for entities searching for specific terms.

Business Problem (Story) Resolved

This pattern creates a dynamically adaptive approach to the semantic web and learned behavior by grounding semantic claims in an audit trail of real-world interactions between entities, claims, and resources. When this pattern is adopted, implementations of it will continue to adjust their inferences as usage patterns emerge.

Specializations

We’ll consider two specializations of this pattern. First is the use of tagging to provide metadata for resources. On the Slashdot website, for example, members can add descriptive tags to each story that’s published. A recent story on using cell phones as servers,[113] for instance, was tagged with the terms stupid, apache, and nokia.

Another specialization is to add information about the entity’s context to create another layer of predictive semantic behavior. For example, if the visiting entities’ IP addresses are tracked as part of each interaction with a resource, patterns may emerge indicating that IP addresses in certain geographical regions prefer one resource or set of resources, whereas addresses in a different geographical region prefer a second set.

Known Uses

Adobe’s Extensible Metadata Platform (XMP) is a prime example of a toolkit that allows resources to be tagged with metadata claims. Several online systems, including Technorati and Google Search, also adopt this pattern to some degree. We describe Google Search’s use of the pattern in the following paragraphs.

Google uses an adaptive learning mechanism to track search terms entered by entities seeking resources, observing the entities’ behavior to determine whether the claims made about the resources are meaningful and useful. When Google returns a search page, even though the actual resource URLs are shown on the page, Google tracks the entity’s choices amongst the search results by following each choice back through a Google-controlled URL and matching it against the searcher’s unique IP address. A URL on the search result page can be inspected to reveal this behavior. For example, a search for “Mackenzie-Nickull” yields this return choice:

http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fwww.nickull.net%2Fwork%2FMacKenzie-Nickull_ArchitecturalPatternsReferenceModel-v0.91.pdf&ei=17aBRKOuCaz6YJ7Nje8L&sig2=pdt4i7x6oSVZA1xdObv6ig

Google uses the entity’s interactions with this URL to determine which results are most relevant, and an algorithm adjusts the results for the next searcher. If enough people select a certain resource’s URL, eventually that resource is inferred to be most relevant for others searching on the same search term and it is filtered to the top of the list of results.
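The redirect-and-log trick can be sketched as a small Flask handler. The route and parameter names below are simplified stand-ins, not Google’s actual implementation:

from flask import Flask, redirect, request

app = Flask(__name__)
click_log = []  # in production this would feed the inference engine

@app.route("/url")
def track_and_redirect():
    """Record which result was chosen for which query, then pass the
    user through to the real destination, in the spirit of the
    google.com/url?...&url=... links shown above."""
    click_log.append({
        "query": request.args.get("q"),
        "target": request.args.get("url"),
        "client": request.remote_addr,
    })
    return redirect(request.args.get("url"))

# A tracked result link would then look like:
#   /url?q=Mackenzie-Nickull&url=http%3A%2F%2Fwww.nickull.net%2F...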

Clearly, this is far more advanced as a pattern than the simple string matching employed by search engines in the late 1990s.

Consequences

In this pattern, applications still rely on individuals who may have widely differing worldviews. Differences in culture, upbringing, and natural language make many people think differently about the labels they attach to objects. Folksonomies are about diversifying tags and catering to a wide range of points of view, while simultaneously enabling the building of applications that can filter out the list of tags that are of most interest to any given segment of society.

You may also note that Google’s adaptive learning algorithm caters only to the majority, and there will always be a minority of the population that it does not serve well. Although that could be an issue, an additional mechanism Google uses ties the search terms into an upper ontology and attempts to provide contextualized result sets based on what the search engine knows about its users. This system seems to ensure that a balanced list of search results is always returned.

References

For more on the Semantic Web Grounding pattern, visit the following websites:

The Persistent Rights Management (PRM) Pattern

Also Known As

Digital Rights Management (DRM) is often thought of in the same manner as Persistent Rights Management (PRM). However, DRM is only a small subset of PRM, aimed at preventing unauthorized use or placing other constraints on the capabilities of an entity, such as locking a file format to a specific hardware/software platform. DRM has been widely criticized, and much of the controversy stems from a perceived intrusion upon the rights of those who have paid for content but have to pay again if they want to port it to another technology platform. For example, if you bought a song on LP, then bought it on tape, then again on CD, and finally on iTunes, you might still not legally be allowed to play the song on a non-Apple MP3 player. Many people consider this a violation of their rights.

Business Problem (Story)

Many digital assets and resources are connected to or can be made widely available on the Web via a variety of methods. Inherent policies and owners’ rights on most of these resources can easily be observed and enforced in the tangible world of paper documents, yet they are extremely difficult to enforce in the digital world. Additionally, owners of digital assets often need technical means to control certain aspects of their use, but exercising that control can be difficult when there are multiple copies of a resource or asset.

One problem that may occur is when people participating in a review cycle examine a copy of a digital asset that is either the wrong asset or an older version of it. Continuing to work on an older version when a newer one exists might happen for a multitude of reasons, and it is a persistent and expensive problem. Another scenario might involve a person accidentally releasing a document and then wishing to recall it, erasing all traces of its existence.

Often, owners of sensitive or copyrighted resources place them in a secure location and allow access only via some form of authentication. This solution does not address the post-retrieval aspects of the digital asset owner’s rights, however. As shown in Figure 7.45, “Shortcomings of the old approach to rights management”, once an asset is out of the secure location, there is almost no way the owner can control it. This problem is amplified by the sheer ease with which a digital asset can be copied and sent over the Internet.

Figure 7.45. Shortcomings of the old approach to rights management

Context

The Persistent Rights Management pattern may be of interest in any environment in which digital assets or files are distributed electronically and where the owners or stewards of those artifacts wish to exercise post-distribution management rights.

Derived Requirements

The technical requirements for this pattern are:

  • Digital assets must be able to be linked to a centrally controlled policy.

  • The policy must be explicitly and inalienably linked to the digital asset in such a way that no application can ignore the policy.

  • The link to the policy should be traversable so that applications dealing with the digital assets can gain access to the policy logic and information regarding how to set up and enforce a policy decision point (PDP) to evaluate claims made in efforts to comply with the policy. For example, if the policy involves authenticating someone against a corporate directory, it is essential that that information be available to the application so that the application can collect the tokens required for authentication and forward them back to the rights management server.

  • The policy itself should be mutable so that asset owners or stewards can modify it based on their requirements and changing circumstances.

Generalized Solution

The generalized solution for the Persistent Rights Management pattern is to wrap the digital asset in an outer envelope that either is linked to the actual policy or can be used to reference the policy. The outer wrapper represents a policy decision point that cannot be traversed without the policy being adhered to, thereby forming a contract. The outer wrapper must persist in all copies of the digital asset so that it protects each and every copy.

The wrapper must be written in a manner that lets software agents attempting to interact with the digital asset understand that the asset has a policy on it and how to satisfy the policy requirements.
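A minimal Python sketch of such a wrapper follows, using the cryptography package’s Fernet recipe as a stand-in for whatever encryption a real PRM product employs. The envelope fields, format name, and policy URL are illustrative assumptions, not a standard:

import json
from cryptography.fernet import Fernet  # pip install cryptography

def wrap_asset(asset_bytes, policy_url):
    """Encrypt an asset and wrap it in a self-describing envelope.

    The envelope stays readable even though the payload does not, so
    any application can discover which policy governs access and
    where to negotiate for the key."""
    key = Fernet.generate_key()  # a real policy server would escrow this key
    envelope = {
        "format": "example-prm-envelope/1.0",  # illustrative, not a standard
        "policy": policy_url,
        "algorithm": "Fernet (AES-128-CBC + HMAC)",
        "payload": Fernet(key).encrypt(asset_bytes).decode("ascii"),
    }
    return json.dumps(envelope), key

envelope, escrow_key = wrap_asset(b"quarterly forecast...",
                                  "https://policy.example.com/p/42")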

Static Structure

Each artifact that will be enhanced with rights management requires an outer wrapper that can relay information about the rights enabled on the digital asset. This includes information about the algorithm used to encrypt the asset, a link to any policies or other conditions that must be met to access the resource, where to obtain the key, and possibly what tokens or other information needs to be collected from actors wanting to present claims to satisfy the policy or policies. A policy repository is a component of a rights management infrastructure where policies are stored. It will likely be linked to other components of the infrastructure that enable the functionality shown in Figure 7.46, “A static view of the PRM pattern”.

Figure 7.46. A static view of the PRM pattern

An architecture that reflects this pattern should include, at a minimum, the following components:

  • Client application software

  • File formats for persistent rights management

  • A policy repository and server

  • Services for authentication of users

  • Interfaces for asset owners to manage their rights and set up and manage policies

  • A communications protocol for secure communications between the client application software and the policy server

  • A format for declaring policies that is independent of the actual file formats being protected

  • A strong encryption mechanism to protect the digital asset if the policy is not satisfied

Dynamic Behavior

In the pattern shown in Figure 7.47, “Deployment infrastructure for the Policy Rights Management pattern”, asset owners access policies via a special server. The asset owner then uses an application to wrap the digital asset in such a way that the wrapper protects it by encrypting the contents and links the opening of the contents to a specific policy in the server. The asset owner can then distribute the digital asset. When another application attempts to open the asset, it encounters the outer wrapper that links to the policy. The policy has a set of conditions that must be satisfied in order for access to the asset to be granted. This pattern does not get into the specifics of the types of conditions the policy may impose; however, there are many possibilities. For example, the policy could be linked to the ability to authenticate a specific individual, or it might specify the exclusive date and time range during which the asset owner will allow the asset to render for interactions. Other facets of rights management might include the ability to constrain certain types of interactions for specific classes of viewers. For example, asset owners might want to disable printing for most viewers, yet enable low-resolution printing for a small class of viewers. Concurrently, the owner might choose to disable the ability of all viewers to copy and paste text from the asset.

Figure 7.47. Deployment infrastructure for the Policy Rights Management pattern

Once the application understands what the policy represents and what must be done to satisfy it, it may take steps to try to satisfy those requirements until it ultimately reaches a decision regarding whether or not to open the asset. The decision control point may be outside the application’s domain, in which case the application will likely be unable to access the asset without the server providing a key. In this manner, you can keep absolute control.
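The consuming side can be sketched in the same spirit as the wrapping sketch earlier. Here the policy-server negotiation is reduced to a callback that returns the key only when the presented claims satisfy the policy; the callback, envelope fields, and Fernet encryption are all illustrative assumptions:

import json
from cryptography.fernet import Fernet

def open_asset(envelope_json, credentials, fetch_key):
    """Attempt to open a wrapped asset.

    fetch_key(policy_url, credentials) stands in for the real
    negotiation with the policy server: it returns the decryption
    key only if the claims satisfy the policy, else None."""
    envelope = json.loads(envelope_json)
    key = fetch_key(envelope["policy"], credentials)
    if key is None:
        raise PermissionError("policy not satisfied; asset stays encrypted")
    return Fernet(key).decrypt(envelope["payload"].encode("ascii"))

# Self-contained demo: wrap, then unwrap with a permissive mock policy.
key = Fernet.generate_key()
envelope = json.dumps({
    "policy": "https://policy.example.com/p/42",
    "payload": Fernet(key).encrypt(b"secret report").decode("ascii"),
})
print(open_asset(envelope, {"user": "alice"}, lambda url, creds: key))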

Implementation

When implementing this pattern, you must keep in mind several important design considerations. The communication channel between the clients and the policy server component might be a weak link, subject to back-channel attacks, replay attacks, or even denial-of-service attacks that might prevent legitimate access to assets. Hence, that communication channel must be protected against a multitude of attacks.

The outer wrapper must be in a format that can be parsed and inspected even if the asset itself remains out of reach. This ensures that applications attempting to open the asset will be able to understand what is happening when they encounter a ciphertext file instead of the asset. Ciphertext is the array of bytes that results when an asset is encrypted. It generally appears as a nonsensical, random array of characters bearing no resemblance to the original asset; hence, it is difficult to reconstruct the original asset from it without the key to unlock it.

The encryption used must be strong enough to guard against brute-force attacks and sufficient for the purposes of the assets it will be protecting. Typically, the more important it is that a resource remain protected, the stronger the encryption should be. Implementers should attempt to understand the consequences of the system failing, both in terms of illegitimate access to assets and in terms of denial of access to legitimate users. The general rule most cryptographers follow is to choose, for basic assets, an algorithm that would take all the computing resources in the world a year or longer to crack via a brute-force attack (iterating through every possible key), and to raise that bar in proportion to the useful life of the asset.
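The arithmetic behind that rule is easy to work through. Assuming, purely for illustration, a global brute-force capacity of 10^18 keys per second:

# Rough arithmetic behind the "one year of the world's computing" rule.
keys_per_second = 1e18            # assumed global brute-force capacity
seconds_per_year = 60 * 60 * 24 * 365

for bits in (56, 128):
    # On average an attacker searches half the key space before succeeding.
    years = (2 ** bits / 2) / keys_per_second / seconds_per_year
    print(f"{bits}-bit key: ~{years:.2e} years on average")

# At this (generous) rate, 56-bit DES-sized keys fall in a fraction of a
# second, while 128-bit keys take on the order of 10**12 years.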

A method of authenticating users must be linked into the policy server.

Asset owners should be able to change policies after distributing assets to reflect new conditions such as a newer version, the fact that an asset has expired or is no longer meant to be in the public domain, and so on.

Business Problem (Story) Resolved

By encrypting the digital asset, wrapping it in the policy envelope, and linking it to a policy that ensures potential users cannot modify it, you have protected the asset. If asset owners can dynamically change the policies or aspects of them, the system has bestowed dynamic rights management capabilities upon these owners.

Specializations

You can specialize this pattern to handle certain file formats for assets or to use only a subset of its capabilities. For example, iTunes uses a DRM-type system to protect against copying, but this does not reflect the PRM pattern’s full capabilities. Other systems, such as Adobe’s LiveCycle ES Rights Management solution, work well with only certain file types.

In the future, we anticipate that other file formats may be rights-management-enabled by this pattern, including video and audio files.

Known Uses

Many companies have implemented this pattern. Here are a few of the more notable ones:

  • Adobe Systems has implemented this pattern in its LiveCycle ES Rights Management server,[114] a J2EE server that protects assets in the PDF, Microsoft Word, Microsoft Excel, and CATIA file formats as of this writing.

  • Microsoft has implemented a pattern similar to this in its Rights Management Server,[115] which works with RMS-enabled applications to help safeguard digital information from unauthorized use.

Consequences

The use of this pattern may result in confusion for those whose applications do not understand the wrappers used to protect assets. Additionally, users may experience issues opening documents when they are not in a network-connected environment and/or if the implementation does not allow for an “offline” lease period for digital assets or other means of nonnetwork use.

It’s hard to support searching if bots that index the assets cannot access the actual assets to extract metadata, such as word counts for relevancy scores. A good implementation might allow for encryption of the document, yet still expose some metadata for spiders and other search bots to use.

Rights management in general continues to be controversial, often seen as getting in users’ way rather than helping them accomplish what they hope to do. Apple has removed DRM from its iTunes store, but there are still battles around DRM on Amazon.com’s Kindle and related eBooks platforms, as well as in video streaming. At about the time this book goes to press, Adobe Systems’s Flash platform will introduce DRM-style functionality into the Flash video (*.flv) format.

References

For more on the Persistent Rights Management pattern, visit the following websites:

The Structured Information Pattern

Also Known As

Terms associated with the Structured Information pattern include:

Fine-Grained Content Accessibility and Granular Data Addressing

The concept of structured information has been described by people such as Tim O’Reilly as fine-grained content accessibility. This pattern is a specialization of a well-known generalized pattern called the Whole-Part pattern, which it extends with some specific nuances.

The Declarative Living and Tag Gardening pattern

The Declarative Living and Tag Gardening pattern, discussed earlier in this chapter, is highly relevant; the Structured Information pattern is one way in which finer aspects of declarative living can be implemented.

Microformats

The term “microformat” generally refers to a small snippet of syntax that can add self-declaring metadata within existing content. Examples include marking up contact information, calendar events, and other fragments of an existing HTML web page. For more on microformats, see http://microformats.org. The larger notion of a “markup language” is a pivotal concept that many implementers of this pattern use.

Business Problem (Story)

A lot of largely unstructured content exists on the Internet. Most of this content is captured and presented as HTML documents, with author-supplied data forming part of a web page and the markup forming the other part. These documents are considered to be largely unstructured in terms of finding specific chunks of data within them. If someone wanted to scrape a collection of web pages to find only certain types of data, he would need a consistent syntax and mechanism to reach specific places in the documents and grab the content. Such scraping might be done for a variety of business reasons, including to repurpose content for some other use. (As discussed earlier in this chapter, content repurposing, or “mashing,” is a popular Web 2.0 pattern.) Sometimes it may be desirable to retrieve the content without any extraneous elements that pertain to it. For instance, you could use just the plain text from a PDF document without needing any data regarding how the text should be displayed (font, color, size, etc.).

Without such structure, though, it is nearly impossible to create a system for deterministically grabbing content from large collections of pages. Scraping and grabbing tend to be fragile processes, breaking frequently when the underlying markup changes.

Context

HTML is ubiquitous, but content stored in HTML is difficult to reuse. What’s more, people creating with HTML are often less than enthusiastic about changing their work processes in order to support reuse. Other factors, such as cross-browser compatibility and ease of creation, are generally much higher priorities.

Derived Requirements

Authors of content pages must have a language to use to mark up (tag) various parts of their content in a manner that will allow applications to consistently retrieve individual pieces of that content. In other words, you have to develop a syntax that lets authors publish information that contains declarations pertaining to its inner structure. Furthermore, any such markup declarations must be in a format that has meaning to others who may attempt to leverage those declarations to address content at the fragment level.

Supplementary mechanisms and methodologies must also exist to guide authors in how to mark up the content in a way that those who later wish to retrieve fragments of that content will understand. This process therefore crosses into the realm of taxonomy creation, or the development of specific structures or hierarchies to guide the authors of the documents.

Generalized Solution

The generalized solution for this pattern is to mark up the data in accordance with the rules of a specific vocabulary, syntax, and/or taxonomy. The markup language used can mark up content in such a way that it is logically divided into smaller pieces, each of which is tagged with a label that provides a way to identify the content fragment. Content fragments must be much smaller than HTML web pages, and may be as small as one or a few bytes or characters.

Ideally, the syntax for the markup language will be similar to existing markup languages yet simplistic enough to enable large-scale adoption by authors and developers.

Taxonomy developers will use the markup language syntax to create vocabularies for specific purposes. The taxonomy will include various tags that have corresponding annotations to help others understand their semantics or intent.

Another potential solution is to divide web pages into smaller logical chunks and assign URIs to each of them. This solution allows users to find and use the smaller fragments with great ease using HTTP GET requests.

Static Structure

The static structure view of the solution incorporates the new syntax to declare fragments smaller than the parent container. It is depicted in Figure 7.48, “Component view of microformats adding metadata at the subparent container level”.

Figure 7.48. Component view of microformats adding metadata at the subparent container level

Each piece of content within the parent container might logically be a candidate for markup by one or more microformats. The marking up of smaller chunks of data allows other agents or system actors to access or address content at a more atomic level than is possible by deterministically accessing the content at only the parent container level.

Dynamic Behavior

Several workflows for authoring and marking up data are possible. In general, once the author has prepared the content, it can be marked up with microformats to allow addressing at a much finer-grained level. When the marked-up content is published and made available over the Web, it can be retrieved by the normal means of content acquisition and parsed with the normal methodology. For most of the Internet, this means applying the rules of HTML. When the content is parsed for general purposes (including common tasks such as building a graphical representation for a user or consumption via an application), special handlers can be registered with the parser to do something special with the marked-up fragments within the parent content.

In Figure 7.49, “One possible sequence of events”, such special handlers are assumed to be registered with the parser. During the parsing process, event notifications are sent to the handlers registered for specific types of events. After processing an event notification, the handler may take the optional step of talking to the parser (or possibly other software components) and giving it special instructions based on the content detected.

Figure 7.49. One possible sequence of events

A simplistic example might be a browser with a plug-in to detect a microformat. When the browser begins streaming a new HTML page, the HTML is fed to the browser’s parser. The HTML parser in the browser has an internal handler for the HTML syntax, as well as other handlers (some internal, some via browser plug-ins) to deal with extensions to HTML or other technologies that commonly work with HTML, such as Cascading Style Sheets, Flash, and JavaScript.

Plug-ins may exist for microformats that can register handlers, so that they are notified when a microformat is detected within the HTML syntax. When the microformat is sent to the handler, the handler might take additional steps, such as providing the parser or the parent browser with instructions based on user configuration and preferences. In the case where vCard syntax is used within an HTML web page, for instance, the browser may be given special instructions to highlight the contact information in yellow and show a tooltip (e.g., “This is a contact card. Right-click to import this card into your address book.”), and some dynamically created JavaScript may be registered to fulfill that functionality when the user interacts with the content in the manner specified.
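The detection step such a plug-in performs can be sketched in a few lines of Python. hCard, the microformat rendering of vCard, marks contact blocks with class="vcard" and the formatted name with class="fn"; the sketch below extracts names, though a real handler would track element nesting properly rather than using simple flags:

from html.parser import HTMLParser

class HCardSniffer(HTMLParser):
    """Spot hCard fragments (class="vcard") and pull out the
    formatted name (class="fn")."""
    def __init__(self):
        super().__init__()
        self.in_vcard = False
        self.in_fn = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "vcard" in classes:
            self.in_vcard = True
        if self.in_vcard and "fn" in classes:
            self.in_fn = True

    def handle_data(self, data):
        if self.in_fn and data.strip():
            self.names.append(data.strip())
            self.in_fn = False

sniffer = HCardSniffer()
sniffer.feed('<div class="vcard"><span class="fn">Duane Nickull</span></div>')
print(sniffer.names)  # ['Duane Nickull']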

Alternatively, the parser may simply pass the microformat or small syntax snippet to the handler, and the handler may never talk to the browser’s parser again. In fact, a browser does not even need to be involved. Spammers often write “email harvesters” that use HTTP GET requests to grab content, parse the HTML looking for syntax that indicates that an email address has been found, and then pass links back to the queue for the HTTP GET implementation to retrieve. The balance of the HTML may be completely dropped from memory.

Both these examples illustrate possible sequences for processing content marked up at a more fine-grained level than the parent container as a whole in what is called a “deterministic” manner. The processing result is consistent, and all examples of the inner content will be found regardless of the structures of the parent container and the inner marked-up child.

Implementation

When implementing such a solution, be aware that if you use syntax that a parser interpreting the parent container understands to mean something different from what you intended, the smaller marked-up chunks of data may disrupt the display. It’s important to avoid possible confusion between declarative syntaxes. One way to do this is to namespace-qualify the markup so that parsers will not confuse microformat markup with the parent container’s markup language. For example, suppose both markup languages used an element that looked like this:

<bold>some content</bold>

but one had the semantics of making the graphical text appear in bold type and the other had the semantics of making it appear with “brave” or “strong” formatting, which might be rendered differently. Namespaces are often used in markup languages to avoid conflicts such as this, and the microformats community has embraced this usage.
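A quick Python demonstration of how namespace qualification keeps the two elements distinct. The microformat namespace URI here is hypothetical, and the snippet uses XHTML-style well-formed markup so it can be parsed as XML:

import xml.etree.ElementTree as ET

doc = """<page xmlns="http://www.w3.org/1999/xhtml"
              xmlns:mf="http://example.org/microformat">
  <bold>plain HTML-style bold</bold>
  <mf:bold>"brave"/"strong" microformat semantics</mf:bold>
</page>"""

root = ET.fromstring(doc)
# ElementTree expands prefixes to {namespace-uri}localname, so the two
# bold elements can never be confused by a conforming parser:
for el in root:
    print(el.tag, "->", el.text)
# {http://www.w3.org/1999/xhtml}bold -> plain HTML-style bold
# {http://example.org/microformat}bold -> "brave"/"strong" microformat semantics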

In the process of marking up fragments of content, content owners will want to carefully consider the needs of the stakeholder communities. Before embarking on creating a syntax, it might be necessary to create a formal standards group and to reach a consensus on the higher-level models of use. This will allow stakeholders to raise issues that could affect the syntax (such as including non-North American address formats).

Implementers who work with microformats commonly embedded in HTML web pages might want to implement their applications as plug-ins to existing browsers so that they do not have to duplicate their efforts.

Business Problem (Story) Resolved

The use of microformats within content available on the Internet enables deterministic accessing or referencing of small fragments of content. This mechanism enables several other patterns—including the Declarative Living and Tag Gardening pattern and the Mashup pattern, among others—to be implemented in a common manner.

Specializations

You can specialize this pattern in several different ways. For example:

Automation

Content can be automatically marked up via an application that intercepts the content as it is being published, or when the original content is aggregated. Several software vendors have experimented with combining composite documents and other aggregations of information into existing content collections.

Specialization by type

Various methods and syntaxes exist to allow markup of very specialized types of data. For example, Adobe’s Extensible Metadata Platform lets users mark up metadata of various types of files using a multitude of schemas, and includes the option to extend the infrastructure to incorporate custom schemas.

Known Uses

Visit http://microformats.org for more information regarding known uses of the Structured Information pattern.

Consequences

Implementers of the Structured Information pattern must carefully assess the risks posed to forward compatibility by embedding markup languages in their documents. They need to take special care to avoid potential future conflicts or collisions of syntax and semantics. If someone designs a new microformat and uses a tag such as foo, it may clash with later versions of HTML if those versions also contain a foo element. You can avoid this danger by using namespaces to qualify elements; however, there is no way to guarantee that your namespace qualifiers will be unique other than using your unique domain name or universally unique identifiers.

Designers and implementers must also consider, when designing their microformats, at what level of granularity things become semantically meaningful. For example, an address may have subelements for street, city, region, country, and postal/zip code. Some of these may be further decomposed. For example, the street address 3347 West 6th Ave. contains four pieces of information that may be semantically meaningful to different people. Microformats should be designed to account for further optional decomposition that may require careful data modeling. The street address element, for instance, could contain a choice of character data or four subelements that each contains character data.

References

For more on the Structured Information pattern, visit http://microformats.org.

Summary

The patterns discussed in this chapter have captured information based on consistencies in things generally deemed to be Web 2.0. This set of patterns is by no means exhaustive or complete, and new patterns are bound to pop up in the near future. Still, this represents a start toward defining a concrete set of Web 2.0 patterns.

To complete this book, the next chapter takes a look at some of the phenomena that we anticipate will be features of the years ahead.



[72] CAP is a standard from the OASIS Emergency TC (http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=emergency).

[74] ESB is a specialized set of patterns within an SOA infrastructure or context. For more on ESB, see Enterprise Service Bus (O’Reilly).

[75] This figure was provided courtesy of Fred Chong. See http://blogs.msdn.com/fred_chong/.

[81] Inspired by Katey on United Airlines Flight 446 to Charlotte, North Carolina.

[89] An impressive list of companies sponsoring the conference is available at http://office20.com.

[94] Although BackWeb, an enterprise software vendor that offers offline access for web-based enterprise apps such as JEE-based portals, definitely deserves a mention here.

[97] This application’s functionality is described in greater detail at http://docs.google.com/View?docid=dhkhksk4_8gdp9gr#widget.

[101] “The Structure of Collaborative Tagging Systems,” by Scott Golder and Bernardo A. Huberman, Journal of Information Science, Vol. 32, 198–208 (2006); http://arxiv.org/abs/cs.DL/0508082.

[106] See Figure 7.42, “A tag cloud” for an example of a tag cloud.

[108] Ambient Findability, by Peter Morville (O’Reilly), page 149.

[109] Courtesy of SUMO, http://www.ontologyportal.org.

[111] Courtesy of Paul Lamere’s website, http://blogs.sun.com/plamere/entry/metal.

If you enjoyed this excerpt, buy a copy of Web 2.0 Architectures.