More than just Mobile Devices: Where touch detection breaks down
When you think of “touch,” mobile phones and tablets may immediately come to mind. Unfortunately, it’s far too easy to overlook the newest crop of touch-driven devices, such as Chromebook laptops that employ both a touchscreen and a trackpad, and Windows 8 machines paired with touchscreen monitors. In this article, you’ll learn how to conquer the interesting challenges presented by these “hybrid” devices that can employ both mouse and touch input.

In the browser, the Document Object Model (DOM) started with one main interface to facilitate user pointer input: MouseEvent. Over the years, the methods of input have grown to include the pen/stylus, touch, and a plethora of others. Modern web browsers must continually stay on top of these new input devices by either converting their input to mouse events or adding an additional event interface. In recent years, however, it has become apparent that dividing these forms of input, as opposed to unifying and normalizing them, becomes problematic when hardware supports more than one method of input. Programmers are then forced to write entire libraries just to unify all the event interfaces (mouse, touch, pen, etc.). So how did mouse and touch events come to be separate interfaces? Going forward, are all new forms of input going to need their own event interface? How do I unify mouse and touch now?
Is this article for me?
The solutions in this article are specific in nature: only applications that require heavy user interaction (games, HTML canvas applications, drag & drop widgets, etc.) fall within their target. Click-driven interactions (i.e. regular websites) do not necessarily need to worry about user-input methods, as click events will be fired regardless of the user’s input method.
The History of the Event Model
The first web browsers available in the early 1990s had no client-side scripting language. Web pages were completely static, so there was no platform to broadcast user-input events to. It wasn’t until the introduction of JavaScript in 1995 that it became possible to capture and consume user input via client-side scripting.
Birth of the Event Model
In December 1995, Netscape Communications Corporation released the second beta of their flagship product, Netscape Navigator 2. With this release came the first client-side scripting language to ship with a browser – JavaScript (at the time, it was called LiveScript). With the introduction of JavaScript, the concept of a Document Object Model (DOM) was born – and with it the inline event-handling model.
Figure 1: Inline Event Model
As seen in Figure 1, code could be written directly in the HTML event attribute. When that event was dispatched, the code in the attribute would be run. Although commonplace today, at the time the inline event model revolutionized the concept of an interactive website. Webpages were no longer strictly static, and could react to the user’s input. Changing an image on mouseover and mouseout was a particular hit.
Figure 2: Image mouseover and mouseout
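Something like the following sketch captures the idea (the image file names here are made up for illustration):

```html
<!-- Inline event model: the handler code lives directly in the HTML attribute. -->
<!-- The image file names are hypothetical. -->
<img src="button-off.gif"
     onmouseover="this.src = 'button-on.gif';"
     onmouseout="this.src = 'button-off.gif';">
```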
Nine months later, Netscape Navigator 3 introduced the traditional event model, which allowed callbacks to be assigned purely with JavaScript (see Figure 3).
Figure 3: Traditional Event Model
At this point, there was no Event interface, so event callbacks were called with no data about the action that caused the event to trigger. It wasn’t until Netscape Navigator 4, a year later, that the Event interface was created. The developer could now leverage an Event object, automatically provided by the browser as an argument to all event callbacks. These two models are part of the informal DOM Level 0 (the de facto conventions that predate DOM Level 1) and are still supported in modern browsers.
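A minimal sketch of the traditional model, including the Event object passed to the callback (the element id is hypothetical):

```javascript
// Traditional event model: the callback is assigned to an event property in JavaScript.
var button = document.getElementById('myButton'); // hypothetical element

button.onclick = function (event) {
  // The Event object carries data about the action that triggered the callback.
  console.log('Clicked at ' + event.pageX + ', ' + event.pageY);
};
```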
Events were originally captured, not bubbled. When DOM events were first implemented in Netscape Navigator, events did not follow the bubble model (child to parent) that all modern browsers use today. If you clicked an element and both it and its parent had an onclick handler, Netscape followed the event capture model (parent to child): the parent’s handler ran first, then the child’s. Figure 4 illustrates the order in which element events are handled in both the capture and bubble models. The numbers represent the call order when the user clicks the inner element.
Figure 4: Capture vs Bubble Model Event Handle Order
It wasn’t until the DOM Level 2 specification that the event handling model moved away from the inline and traditional models to the one we are familiar with today: addEventListener, removeEventListener and dispatchEvent. This spec introduced a more flexible model that allowed developers to attach multiple listeners for the same event type to a single element.
Figure 5: Flexible Event Model
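A minimal sketch of the flexible model, attaching two listeners of the same type to one element (element and handler names are hypothetical):

```javascript
var box = document.getElementById('box'); // hypothetical element

function logClick(event) {
  console.log('clicked', event.target);
}

function highlight(event) {
  event.target.classList.add('active');
}

// Multiple listeners of the same type can coexist on a single element...
box.addEventListener('click', logClick, false);
box.addEventListener('click', highlight, false);

// ...and can be detached individually.
box.removeEventListener('click', logClick, false);
```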
Internet Explorer’s Event Model. Before the DOM Level 2 specification was finalized, Microsoft introduced its own flexible event model in IE5: attachEvent and detachEvent. Microsoft stubbornly stood by this proprietary implementation for 10 years, until IE9 added support for the W3C standard. Microsoft completely dropped its own event model in favor of the official W3C standard in IE11.
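Until then, cross-browser code had to feature-detect the two models; a rough sketch of the kind of helper this forced on developers (the function name is made up):

```javascript
// Attach a click handler using whichever flexible event model the browser offers.
function addClickListener(element, handler) {
  if (element.addEventListener) {
    element.addEventListener('click', handler, false); // W3C standard
  } else if (element.attachEvent) {
    element.attachEvent('onclick', handler); // IE5–IE10, note the 'on' prefix
  }
}
```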
After the introduction of the flexible event model, the MouseEvent interface has remained largely the same for the past 15 years, aside from a few modifications in the DOM Level 3 spec.
Birth of the TouchEvent
In 2007, the first commercially successful touch device was released, the iPhone, and with it came a mobile version of Apple’s Safari browser. The first version of Mobile Safari only exposed a user’s touch input as mouse events. While the results were adequate, the MouseEvent interface could not make full use of the iPhone’s touchscreen capabilities. One year later, Apple created the TouchEvent interface for iOS 2, intended to live side-by-side with the MouseEvent interface. Creating a separate event interface allowed Apple to change the input workflow to better match the requirements of touch input, such as the ability to recognize multi-touch gestures, without the need to modify the existing MouseEvent.
Figure 6: Touch Event model
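A minimal sketch of the TouchEvent model, reading the list of active touch points (the element id is hypothetical):

```javascript
var canvas = document.getElementById('canvas'); // hypothetical element

canvas.addEventListener('touchstart', function (event) {
  // A single touch event carries every active touch point,
  // which is what makes multi-touch gestures possible.
  for (var i = 0; i < event.touches.length; i++) {
    var touch = event.touches[i];
    console.log('touch ' + touch.identifier + ' at ' + touch.pageX + ', ' + touch.pageY);
  }
}, false);
```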
Apple’s TouchEvent implementation is, to this day, considered proprietary and lives in its own private WebKit fork. It wasn’t until 2009 that the TouchEvent interface appeared in the official WebKit codebase, when Google implemented it for Android.

In early 2011, the W3C started drafting a TouchEvent spec based on Apple’s implementation. But before the W3C released the first working draft, Mozilla released Firefox 4 in April of that year, which featured its own version of touch events. Mozilla’s model was much different from Apple’s in that each touch would independently fire off its own event, rather than a single event containing a list of all active touches; if you wanted to track multiple touches, you would do so via a unique event id property. Mozilla deprecated this implementation two years later in Firefox 18 in favor of the emerging W3C spec based on Apple’s model. The W3C released the first working draft of the TouchEvent spec in May 2011. After a brush with Apple’s patents caused work on the spec to temporarily stop, the W3C finally released its TouchEvent Recommendation in October 2013.
The Problem
That brings us to the state we are at now, where we have two separate event models for two separate input methods.
Background
Before the TouchEvent, there was only a single input event model – the MouseEvent. It was straightforward to listen for user input:
Figure 7. Adding event listeners for user input pre touch events
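A sketch of what that looked like (the element and handler names are hypothetical):

```javascript
var element = document.getElementById('surface'); // hypothetical element

function onStart(event) { /* begin the interaction */ }
function onMove(event)  { /* track the pointer position */ }
function onEnd(event)   { /* finish the interaction */ }

// Mouse events were the only pointer input to worry about.
element.addEventListener('mousedown', onStart, false);
element.addEventListener('mousemove', onMove, false);
element.addEventListener('mouseup', onEnd, false);
```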
Then came touch devices, and with it (eventually), the TouchEvent. Now, to handle both mouse and touch, we need to add touch event listeners in addition to mouse event listeners.
Figure 8. Adding event listeners for both mouse and touch events
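A sketch of the doubled-up listeners, reusing the hypothetical element and handlers from the previous sketch:

```javascript
// Mouse events for pointer input...
element.addEventListener('mousedown', onStart, false);
element.addEventListener('mousemove', onMove, false);
element.addEventListener('mouseup', onEnd, false);

// ...and touch events for the same logical interactions.
element.addEventListener('touchstart', onStart, false);
element.addEventListener('touchmove', onMove, false);
element.addEventListener('touchend', onEnd, false);
```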
But there is a slight catch to the TouchEvent model: every time a user initiates a touch, the browser will not only fire a touch event, but will also fire a mouse event! Why is this? In order to support older websites that did not listen for any touch events, Apple decided that “simulated” mouse events should be issued for every touch, which could be caught via the traditional mousedown/mouseup/mousemove event handlers. This has become the de facto standard for almost all touch event model implementations; the W3C even added recommendations to its spec that mouse events still be dispatched. Following a set of rules (Figure 23), mouse events are dispatched after touch events so any legacy code still only listening for mouse events will continue to work.
Figure 9: Click event stack (desktop vs touch device)
So, logically, the solution is to do a quick check – detect if the browser supports touch events and then only listen for touch events. Otherwise, listen to mouse events. A common approach is to use the Modernizr library to detect whether the TouchEvent model exists in the user’s browser, as illustrated in figure 10.
Figure 10. Conditionally adding events listeners for touch and mouse events
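A sketch of the conditional check, again reusing the hypothetical element and handlers above (this assumes Modernizr 2’s Modernizr.touch flag; newer releases call it Modernizr.touchevents):

```javascript
if (Modernizr.touch) {
  // The browser exposes the TouchEvent model, so only listen for touch events.
  element.addEventListener('touchstart', onStart, false);
  element.addEventListener('touchmove', onMove, false);
  element.addEventListener('touchend', onEnd, false);
} else {
  // Otherwise fall back to mouse events.
  element.addEventListener('mousedown', onStart, false);
  element.addEventListener('mousemove', onMove, false);
  element.addEventListener('mouseup', onEnd, false);
}
```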
Oh, but wait. What about the rising trend of desktops and laptops that support both touchscreen and mouse input? Performing a conditional check as above would add touch event listeners, but would completely prevent the user from using their mouse! The real question in unifying touch and mouse becomes: how do I listen for both touch and mouse events while ignoring any simulated mouse events?
Input fragmentation is not so good
For cross-device support, the TouchEvent interface adds new events that developers must listen for. It takes a single starting point, a user’s input, and fragments the event model by creating separate events based on the type of input. And because the TouchEvent interface simulates mouse events, JavaScript developers face an awkward challenge when they intend to listen for actual mouse events (as opposed to simulated ones).
Figure 11. Current User Input and Event Flow

As seen in Figure 11, it becomes the responsibility of the JavaScript application to determine if an event is simulated or not. Ideally, we would be working with the event flow illustrated in Figure 12.
Figure 12. Ideal User Input and Event Flow
The problem with having two input event models is just that: there are two separate event models. These separate implementations mean web developers need to interpret and normalize two different event APIs in order to consistently handle input across a wide variety of devices.
Understanding the Simulated Mouse Event
The biggest thorn in our side when unifying mouse and touch events remains the aforementioned “simulated” mouse events. Unfortunately, the event object provides no information that indicates whether a mouse event was simulated or not. In order to detect a simulated mouse event, one needs to understand the ins and outs of how browsers simulate them.
Simulated Mouse Event Delay
After a touch event is dispatched, simulated mouse events are not triggered for ~300ms in Mobile Safari, and up to ~700ms in Chrome for Android. This gives the browser enough time to ensure the user isn’t trying to perform a gesture (scroll, pinch/zoom, double-tap zoom, etc.). It is worth noting that some browser vendors have been working to remove this delay when it is not necessary (e.g., if the developer has removed the ability for the user to scale the page via the viewport meta tag).
Figure 13. Non-scalable Viewport
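A non-scalable viewport is declared with a meta tag along these lines (a minimal sketch):

```html
<!-- Disables user scaling (pinch/zoom and double-tap zoom), which allows
     some browsers to drop the simulated mouse event delay. -->
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no">
```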
The work for this is still ongoing, and only Chrome for Android and Firefox for Android support this.
Figure 14. Simulated mouse event delays with a non-scalable viewport
| | Chrome for Android | Android Browser | Firefox for Android | Safari iOS |
| --- | --- | --- | --- | --- |
| Non-scalable viewport | No delay | 300ms | No delay | 300ms |
Mouseout
Browsers will not immediately fire a mouseout event when a user releases their finger from the touchscreen. Rather, when the user touches the device again and the target element has changed from the previous touch event, a simulated mouseout event will be dispatched (before touchstart).
Mouseup
When a user touches the device and keeps their finger in place (touch + hold), no simulated mouseup event is dispatched when the user finally lifts their finger to end the touch. The only time a mouseup is simulated from touch events is when a tap occurs and the browser dispatches the click event stack (figure 9).
Figure 15: Missing simulated mouseup event on touch + hold (desktop click + hold vs mobile touch + hold)
Chrome TouchEvent Emulation
Chrome has a feature in its Developer Tools that will emulate touch events in the desktop browser for developers to debug touch events. Unfortunately, touch events and their simulated mouse events are not dispatched in the correct order (see figure 16).
Figure 16: Touch and simulated mouse event dispatch order (mobile touch device vs desktop Chrome TouchEvent emulation)
As of this writing, this is a known bug that has been resolved but has yet to make it into a public release.
The Future Solution: PointerEvent
If we were to continue with the current model illustrated in Figure 11, browser vendors would need to create a new event interface every time a new form of input became the next hot thing. What if we could use our eyes as input? Enter the Optical Event. Or what if, in a few years, our devices operate based on our thoughts? Enter the Thought Event.

In 2009, the W3C started discussions on a unified pointer model for the DOM Level 3 Events spec. The idea was that all current user input (touch, pen/stylus, and mouse) would be unified into a single model instead of separate models for each form of input. Work on this spec was abandoned, however, to keep the DOM Level 3 Events spec as lightweight as possible and expedite its review process.

The W3C was on the right track when it started talks about a unified pointer model. But when the W3C abandoned this spec, in stepped Microsoft. IE10 was designed as the first major browser to support both touch and mouse in a desktop environment. To accommodate this, Microsoft developed a new PointerEvent spec to unify all input methods: mouse, touch, and pen/stylus. This approach promised to normalize all event data into a single event model, regardless of the type of input the user employs.
Figure 17. Add event listeners for Pointer Events
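A sketch of listening for pointer events, including the vendor-prefixed names IE10 requires (the element id and handler are hypothetical):

```javascript
var element = document.getElementById('surface'); // hypothetical element

function onPointerDown(event) {
  // event.pointerType identifies the input: 'mouse', 'touch' or 'pen'
  // (IE10 reports numeric constants instead of strings).
  console.log('pointer down via ' + event.pointerType);
}

if (window.PointerEvent) {
  // Unprefixed pointer events (IE11 and the W3C spec going forward).
  element.addEventListener('pointerdown', onPointerDown, false);
} else if (window.navigator.msPointerEnabled) {
  // IE10's vendor-prefixed implementation.
  element.addEventListener('MSPointerDown', onPointerDown, false);
}
```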
After IE10 was released, Microsoft submitted its Pointer Event spec to the W3C for standardization. In May 2013, the W3C released a candidate recommendation, which, as of this writing, is expected to become a full recommendation soon. Right now, only IE10 (with a vendor prefix) and IE11 support the Pointer Event spec, so until it becomes widely supported, we must rely on our own methods to unify user input.
Differences between PointerEvent and MouseEvent/TouchEvent
While the PointerEvent and MouseEvent interfaces are quite similar, the PointerEvent interface contrasts quite a bit with today’s TouchEvent interface; in order to unify the event models, PointerEvent more closely follows the MouseEvent lifecycle and dispatch patterns. The first (and biggest) change comes in the way the PointerEvent spec handles multi-touch. If the user is performing multi-touch input, a separate PointerEvent will be dispatched for each touch point. Contrast this with the current TouchEvent spec, where all active touch points are grouped together into a single event. The other change concerns the elements the events are dispatched from. With the TouchEvent spec, touchmove and touchend are always dispatched from the element that the touchstart event was triggered from, even if the user’s finger is no longer over that element. The PointerEvent acts the same way as the MouseEvent: pointermove and pointerup will always be dispatched from the element that is currently under the pointer position.
Figure 18: Event target for Event Interfaces when clicking on a div, dragging outside that div to the body, and then releasing.
| | Down/Start | Move (over div) | Move (over body) | Up/End |
| --- | --- | --- | --- | --- |
| Mouse Event | div | div | body | body |
| Touch Event | div | div | div | div |
| Pointer Event | div | div | body | body |
The PointerEvent, like the TouchEvent, will simulate mouse events for legacy support. However, because the PointerEvent unifies both touch and mouse, we don’t run into the problem described above, since mouse input also dispatches a PointerEvent; developers only need to listen for pointer events (Figure 17).
Potential Solutions
So what is the solution to this input fragmentation today? Well, that’s where things can get a bit complicated. As noted previously, simply listening to both touch and mouse events will also capture simulated mouse events dispatched from touch events. Perhaps the solution is to listen for both touch and mouse events but ignore the simulated mouse events? But how does one differentiate a real MouseEvent from a simulated one?
event.preventDefault
There is a way to completely prevent simulated mouse events from firing from within either a touchstart or touchmove callback. Calling event.preventDefault() on the touch event cancels any mapped simulated mouse events. However, this also prevents any default browser behavior (clicks, scrolling, etc.), so it is not always a viable solution.
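A minimal sketch, assuming the element handles the gesture itself (the element id is hypothetical):

```javascript
var element = document.getElementById('surface'); // hypothetical element

element.addEventListener('touchstart', function (event) {
  // Stops the browser from dispatching the simulated mouse events
  // (and the click) mapped to this touch...
  event.preventDefault();

  // ...but it also suppresses default behavior such as scrolling,
  // so only do this when the element handles the gesture itself.
}, false);
```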
Ignore Mouse Events
Another method is to ignore all mouse events for a small period of time after every touch event. Mobile Safari will wait about 300ms before triggering any simulated mouse events in order to determine whether the user intended to perform a gesture; Chrome for Android will wait even longer, up to ~700ms. A window of 750ms covers both Mobile Safari’s and Chrome for Android’s delays.
Figure 19: Ignore Mouse Events Flag Example
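A sketch of the flag, using the 750ms window discussed above (the element id and handler bodies are hypothetical):

```javascript
var IGNORE_DELAY = 750; // covers Mobile Safari (~300ms) and Chrome for Android (~700ms)
var lastTouchTime = 0;
var element = document.getElementById('surface'); // hypothetical element

function onTouch(event) {
  lastTouchTime = Date.now();
  // ...handle the touch input...
}

function onMouse(event) {
  // Mouse events arriving too soon after a touch are almost certainly
  // the browser's simulated events, so ignore them.
  if (Date.now() - lastTouchTime < IGNORE_DELAY) {
    return;
  }
  // ...handle genuine mouse input...
}

element.addEventListener('touchstart', onTouch, false);
element.addEventListener('touchend', onTouch, false);
element.addEventListener('mousedown', onMouse, false);
element.addEventListener('mouseup', onMouse, false);
```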
If any mouse events are dispatched during the 750ms, they will be ignored. Obviously, there are a few drawbacks to completely ignoring mouse events for a period of time.
If the user tries to perform a mouse action within 750ms of any touch event, it will not be registered. This is a fairly obscure edge case, as not many users will be touching the screen and clicking their mouse simultaneously.
If a touch event callback runs too long, there is a chance the simulated events will be dispatched more than 750 milliseconds later. This is a rare case, but a valid one nonetheless.
While these are edge cases, the solution may not be robust enough to reliably handle all situations.
Adaptive Input Recognition
Instead of attaching all your event listeners at the same time, this approach starts by only listening for mousedown and touchstart. Then, when a touchstart or mousedown event is dispatched, you attach the rest of your event listeners for that respective input method (mousemove and mouseup or touchmove and touchend). How do you avoid the simulated events? When a touchstart event is triggered, you ignore all mouse events until the touchend event is triggered. This can be achieved by setting a static flag (figure 20).
Figure 20: Ignore mouse events if there is an active touch event
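A sketch of the adaptive approach (element and handler names are hypothetical):

```javascript
var element = document.getElementById('surface'); // hypothetical element
var touchActive = false; // static flag: a touch interaction is in progress

function onStart(event) { /* begin the interaction */ }
function onMove(event)  { /* track the pointer position */ }
function onEnd(event)   { /* finish the interaction */ }

element.addEventListener('touchstart', function (event) {
  touchActive = true;
  element.addEventListener('touchmove', onMove, false);
  element.addEventListener('touchend', onTouchEnd, false);
  onStart(event);
}, false);

function onTouchEnd(event) {
  element.removeEventListener('touchmove', onMove, false);
  element.removeEventListener('touchend', onTouchEnd, false);
  touchActive = false;
  onEnd(event);
}

element.addEventListener('mousedown', function (event) {
  if (touchActive) { return; } // ignore mouse events while a touch is active
  element.addEventListener('mousemove', onMove, false);
  element.addEventListener('mouseup', onMouseUp, false);
  onStart(event);
}, false);

function onMouseUp(event) {
  element.removeEventListener('mousemove', onMove, false);
  element.removeEventListener('mouseup', onMouseUp, false);
  onEnd(event);
}
```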
Most of the time, this is a valid solution, as simulated mouse events are ignored correctly. However, there are two limitations to be aware of.
When a user performs a tap, all mouse events are fired after the touchend event (figure 9), so they will not be ignored by this method.
While the user is performing a touch event, any real mouse events performed by the user will be ignored.
Depending on the use case, these may be acceptable shortcomings.
The Solution
The best solution to unifying mouse and touch is to retain a copy of each touch event by type. When the corresponding mouse event is dispatched, you then compare the target, pageX, and pageY properties. If the mouse event’s position and target are the same as the touch event it would have been simulated from, then it’s a simulated event.
Figure 21: Detect if a Mouse Event is Simulated
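A sketch of the bookkeeping and the comparison (the mapping follows Figure 23; the structure and names are illustrative):

```javascript
// Keep a copy of the most recent touch point for each relevant touch event type.
var lastTouches = { touchstart: null, touchend: null };

// Which touch event each simulated mouse event is mapped from (see Figure 23).
var MOUSE_TO_TOUCH = {
  mouseover: 'touchstart',
  mousedown: 'touchstart',
  mousemove: 'touchend',
  mouseup: 'touchend',
  mouseout: 'touchstart'
};

function recordTouch(event) {
  lastTouches[event.type] = {
    target: event.target,
    pageX: event.changedTouches[0].pageX,
    pageY: event.changedTouches[0].pageY
  };
}

document.addEventListener('touchstart', recordTouch, true);
document.addEventListener('touchend', recordTouch, true);

function isSimulatedMouseEvent(event) {
  var touch = lastTouches[MOUSE_TO_TOUCH[event.type]];
  if (!touch) {
    return false;
  }
  // A simulated mouse event shares its position and target with the
  // touch event it was mapped from.
  return touch.target === event.target &&
         touch.pageX === event.pageX &&
         touch.pageY === event.pageY;
}
```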
Why compare event.target? One might think that comparing the target of the last touch event with the mouse event’s target should not be part of the simulation detection, since there is a discrepancy between which targets mouse events and touch events dispatch from (see Figure 18). In the case of simulated events, however, they are always dispatched from the same target as the touch event, so it is recommended to compare event.target when detecting whether a mouse event is simulated. The one exception is the mouseout event: when comparing a mouseout event, instead of using the target property, you would use the relatedTarget property (Figure 22). This is necessary because the mouseout event will be dispatched from the previously focused element, not the element touchstart was dispatched from.
Figure 22: Comparing event targets for a mouseout event.
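A sketch of the mouseout special case, building on the hypothetical lastTouches bookkeeping from the previous sketch:

```javascript
function isSimulatedMouseOut(event) {
  var touch = lastTouches.touchstart;
  if (!touch) {
    return false;
  }
  // The simulated mouseout is dispatched from the previously focused element,
  // so compare relatedTarget (the element being moved into) against the
  // touchstart target instead of event.target.
  return touch.target === event.relatedTarget &&
         touch.pageX === event.pageX &&
         touch.pageY === event.pageY;
}
```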
Knowing how mouse events are mapped from touch events is key to making this method work, as not all mappings are obvious. For example, mouseout is not simulated when a user lifts their finger; it is simulated when they touch the screen again. Therefore, you need to use the touchstart event when comparing properties for a mouseout event.
Figure 23: Mouse Events and the Touch Events they are simulated from
| Browser’s Simulated Mouse Event | User-Initiated TouchEvent |
| --- | --- |
| mouseover | touchstart |
| mousedown | touchstart |
| mousemove | touchend |
| mouseup | touchend |
| mouseout | touchstart |
Pointer Event Polyfill
Until the PointerEvent is widely supported, the best solution to unifying touch and mouse is to emulate the PointerEvent API. This is achieved by using the methods discussed in this article to create and dispatch a custom pointer event. I have created a lightweight library that emulates the Pointer Event API with a minimal footprint: https://github.com/aarongloege/pointer. Legacy browsers (notably IE8 and below) do not support dispatching custom events, so there is a legacy-browser-friendly version that uses jQuery to create and dispatch the pointer events. Other libraries:
http://www.polymer-project.org/platform/pointer-events.html
http://handjs.codeplex.com/
Conclusion
We’ve reviewed the history of user interaction in the browser, taken a look at what the future holds, and seen how we can create a solid stopgap by polyfilling the PointerEvent API until all browsers universally adopt the standard. The web is headed in the right direction with Pointer Events. Despite the tumultuous history of events in the browserverse, exciting things are on the horizon. When the next big input type comes along, the web will be ready (and so will your code)!