Security Products: A Needle in a Haystack
Analytics helps sort through video surveillance's information overload
By Stephen Russell
London's city-wide transit surveillance system, the Ring of Steel, includes more than 10,000 cameras. Some 3,000 cameras also have been deployed in Chicago, with 3,000 more soon coming to New York City.
Meanwhile, New Orleans, Baltimore, Philadelphia and other metropolitan areas have all moved forward with new city-wide and transit authority surveillance projects. Internationally, Taipei recently announced a 13,000-camera city-wide initiative, and largest of them all, Beijing deployed a staggering 300,000 cameras prior to the 2008 Olympics.
Collectively, more cities have deployed more cameras for more purposes in the past few years than in previous decades combined. And with fast networks and high-resolution cameras, surveillance video spigots are on and running at full blast.
Too Much of a Good Thing
Cities and transportation authorities are fast discovering that in addressing security and safety problems through video surveillance, they have created another challenge: information overload. It's the same problem folks at the National Security Agency, at the CIA and in signal intelligence realized years ago after proliferating their countless satellites and listening posts. What were they going to do with all that information? How could they sift through countless hours of nothing while looking for that rare something? Today this problem has led many to question the ultimate efficacy of video as a security tool in human and cargo transit systems.
In London, opponents of the Ring of Steel argue that despite the 10,000-plus cameras, more than 80 percent of crimes remain unsolved. After the London bombings in 2005, it took thousands of investigators more than six weeks to comb through the city's vast surveillance archives looking for clues. Clearly this isn't the kind of effort that can be deployed in more routine circumstances. It's no wonder that despite massive camera proliferation in London so many crimes go unsolved.
In some enterprising small cities, such as Lancaster, Pa., citizens are given the ability to access and manipulate cameras in an attempt to address these kinds of problems, giving new meaning to "neighborhood watch." Still, many studies question whether enough has been done to really make video surveillance effective. For instance, recent studies in San Francisco and Los Angeles claim zero impact on crime.
This will remain a problem so long as these deployments are hampered by an unmanaged and unmanageable deluge of video. You simply can't find what you can't see.
The critical question that no one is asking is how to make the captured video relevant to safety and security. While many articles and studies focus on how to network cameras effectively, how to cover hard-to-reach spots with specialized cameras, whether to locate intelligence at the edge or on a server and how video analytics will magically find a terrorist in the act of contemplating a catastrophic attack, no one has put forward a workable model for making video useful in a transportation safety scenario.
How will video technology stop the assault of a citizen? How can we foil the plots of those bent on mass destruction, and how can we do it in such a way that salvages the millions of dollars spent to implement these extensive camera systems?
The Role of Search
The problem with big video really comes down to its inherent lack of structure. Analog or IP, standard resolution or megapixel, surveillance video is all essentially unstructured stuff. It contains no notes, no tags or descriptions and no keywords to help you separate what's important from what is not. That is a problem. It means that for the most part, video is useless without a person to make sense of it, and we just don't have enough people to keep up with all the video we are generating. We've deployed about 30 million cameras in the world; that works out to be more than 250 billion new hours of recorded video every year.
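The 250 billion figure can be checked with a quick calculation. The sketch below assumes every camera records around the clock, which the estimate above implies but does not state; the function name is illustrative only.

```python
# Back-of-the-envelope check of the recorded-hours estimate.
# Assumption: all cameras record 24 hours a day, 365 days a year.
CAMERAS = 30_000_000
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

def recorded_hours_per_year(cameras: int, hours_per_year: int = HOURS_PER_YEAR) -> int:
    """Total hours of video produced in one year by the given number of cameras."""
    return cameras * hours_per_year

total = recorded_hours_per_year(CAMERAS)
print(f"{total:,} hours per year")  # prints "262,800,000,000 hours per year"
```

At roughly 263 billion hours, continuous recording is consistent with "more than 250 billion."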
But if we could teach computers to make sense out of video, even a little, we could make it searchable. And a good search engine, we have learned, can change the world. Imagine the Internet without a search engine. There are more than 10 billion Web pages. Users would sift aimlessly through content, rarely striking on something useful and relevant to their interests. If they did, they would not be able to indicate their interest in topics like this and find similar content. In short, search makes the Internet usable.
Yet somehow this information revolution, which improved productivity in other knowledge management industries, has yet to transform our approaches to surveillance. Law enforcement and security staff are often left to sift through countless hours of video footage to pinpoint those vital few moments of a crime. Once those moments are found, finding more video related to the same person or event is often just as arduous as the first inquiry.
Video search technology exists to make video relevant. Searching for an event by time and place, license plate, serial number, face, color, toll transaction or other relevant data point can rapidly narrow the video data down to a volume that can quickly be sifted through and analyzed by the human eye.
Take, for example, a purse that is stolen on the subway. The victim reports the time and location of the assault, but the actual assault did not happen in a camera's field of view. In most instances, the victim would have little recourse other than to complete a crime report.
With search-based video surveillance, however, the investigating party could rapidly pull up video from motion events in the surrounding area and times, broadening the search to achieve more results presented in an easy-to-scan form. In a few seconds, they spot the perpetrator exiting the subway with the purse. Without search, the operator would only have been able to find this video if he had the time and willingness to manually review video from each possible camera feed.
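A search by time and place of this kind can be sketched in a few lines. The example below is a minimal illustration, not a description of any particular product: the `MotionEvent` record, the coordinate scheme in meters, and the default radius and time window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import hypot

@dataclass(frozen=True)
class MotionEvent:
    """Hypothetical record of one motion event logged by a camera."""
    camera_id: str
    x_m: float          # camera position in meters on a local grid (assumed)
    y_m: float
    timestamp: datetime

def search_events(events, x_m, y_m, when,
                  radius_m=200.0, window=timedelta(minutes=10)):
    """Return events near a reported place and time, closest in time first."""
    hits = [
        e for e in events
        if hypot(e.x_m - x_m, e.y_m - y_m) <= radius_m
        and abs(e.timestamp - when) <= window
    ]
    return sorted(hits, key=lambda e: abs(e.timestamp - when))

# The victim reports a theft at grid position (0, 0) at 8:30 a.m.
reported = datetime(2009, 5, 1, 8, 30)
events = [
    MotionEvent("platform-2", 10, 0, datetime(2009, 5, 1, 8, 28)),
    MotionEvent("exit-north", 150, 40, datetime(2009, 5, 1, 8, 33)),
    MotionEvent("depot", 900, 900, datetime(2009, 5, 1, 8, 31)),
    MotionEvent("exit-north", 150, 40, datetime(2009, 5, 1, 11, 0)),
]
hits = search_events(events, 0, 0, reported)
print([e.camera_id for e in hits])  # ['platform-2', 'exit-north']
```

Broadening the search is a matter of passing a larger `radius_m` or `window`; the investigator then scans a short list of clips rather than every feed.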
Faces, License Plates, Colors and More
Search engines like Google rely on things like keywords and page ranks and the fact that Web page text can be analyzed and cross indexed to make meaningful searches possible. Video is completely different, but it too can be analyzed for things like faces, license plates, colors and object tracks. When this data is processed and cross indexed, an incredible understanding of activity and identity can emerge from video.
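One common way to cross index such data is an inverted index, which maps each extracted attribute to the clips that contain it. The sketch below assumes the analytics have already produced tags for each clip; the clip IDs, tag format and function names are hypothetical.

```python
from collections import defaultdict

def build_index(clips):
    """Map each tag (face, plate, color, ...) to the set of clips containing it."""
    index = defaultdict(set)
    for clip_id, tags in clips.items():
        for tag in tags:
            index[tag].add(clip_id)
    return index

def find_clips(index, *tags):
    """Return the clips that carry every one of the requested tags."""
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()

# Hypothetical analytics output: clip ID -> tags extracted from that clip.
clips = {
    "cam7-0831": {"plate:ABC123", "color:red"},
    "cam2-0835": {"face:subject-1", "color:red"},
    "cam9-0840": {"plate:ABC123", "face:subject-1"},
}
index = build_index(clips)
print(find_clips(index, "plate:ABC123", "face:subject-1"))  # {'cam9-0840'}
```

Combining tags is what links identity to activity: the same query that finds a plate can be narrowed to the clips where a particular face also appears.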
Once relevant video analytics is implemented and tuned, the game changes entirely in favor of law enforcement. License plate recognition, commonly used to stop toll evaders, is often dismissed as a crime-stopping tool because the technology requires highly tuned and expensive cameras to be implemented. Technology now exists that allows common cameras to track license plates anywhere from a car rental agency to a city intersection.
While facial recognition is a dirty word in some video surveillance circles, the technology's promise to deliver more criminals to justice is being realized in transit scenarios. Much of this can be attributed to the dramatic increase in accuracy of face-finding technologies in recent years. Where tests of facial recognition in German subways in 2006 to 2007 yielded accuracy rates of around 60 percent, recent studies conducted in South Korea show that new technologies can achieve accuracy of closer to 85 percent with very low instances of false positives.
From Reactive to Proactive
We will all know that efforts at developing transportation-based video surveillance systems have been successful when we read the headline "Terrorist Plot Foiled, Suspects Captured using Video Surveillance, No Citizen Harmed."
For this to happen, search is simply not sufficient because it is by its nature a forensic tool. But if you turn search on its head, you have alerts. Just as Google can send you information pertaining to topics of interest, advanced video surveillance systems can provide alerts related to events, people, cars and other items of interest.
For transit security professionals, this answers the question "What should I be looking at?" Rather than asking security staff to stare into monitors hoping to see something that may or may not be happening, why not provide that staff with a steady stream of events (e.g., faces, motion and license plates) that indicate an activity of interest?
Visiting the example of our ill-intentioned thief again, let's assume he actually got away with stealing the woman's purse. We now have isolated three instances of him on video that provide evidence of three different thefts. Flagging this video, we ask the system to alert us the next time he enters the transportation system. True to his pattern, the thief returns to the subway and attempts to position himself in the same off-camera location where he committed his other crimes. This time, monitoring staff are alerted to his presence before he has a chance to act, and they are able to apprehend him.
While video analytics can surface suspicious activities, objects and people, the trick to making alerts work is managing false positives. If every alert turns out to be a false positive and actual criminal behavior is missed, then the system is no more useful than a sleeping security guard. In the Korea subway case mentioned above, each time a watchlist suspect entered the subway, the system accurately alerted staff to the presence of the individual. While alerts did come through that turned out to be false, the volume was at a tolerable level that did not reduce the efficiency of the operation.
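The trade-off between catching a flagged individual and flooding staff with false alerts usually comes down to a match threshold. The sketch below is a simplified illustration under stated assumptions: faces are represented by toy three-number feature vectors, similarity is plain cosine similarity, and the names and the 0.9 threshold are hypothetical. Real systems use far richer features and tuning.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def check_watchlist(detection, watchlist, threshold=0.9):
    """Return (name, score) for each watchlist entry scoring at or above threshold."""
    scored = ((name, cosine_similarity(detection, ref))
              for name, ref in watchlist.items())
    return [(name, score) for name, score in scored if score >= threshold]

# Toy feature vectors (assumed); one flagged subject on the watchlist.
watchlist = {"flagged-subject-1": [0.9, 0.1, 0.4]}
same_person = [0.88, 0.12, 0.41]   # a new sighting of the flagged subject
stranger = [0.1, 0.9, 0.2]         # an unrelated passenger

print(len(check_watchlist(same_person, watchlist)))              # 1: alert raised
print(len(check_watchlist(stranger, watchlist)))                 # 0: no alert
print(len(check_watchlist(stranger, watchlist, threshold=0.2)))  # 1: false alert
```

Set the threshold too low and every passerby triggers an alert; set it too high and the flagged subject slips through. Tuning it per use case is what keeps false positives at a tolerable volume.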
"The Minority Report" is just a movie, and what you see on "CSI," "NCIS" and any other crime television drama is fake. Though these images often form the basis of what users expect from a system, they are not realistic use cases for video technology. However, never before have so many valuable use cases been within our grasp, allowing safety and security professionals to gain the upper hand and narrow the chances of a criminal's escape.
To successfully leverage intelligent video in a transportation setting, it is important to simplify and integrate. Find a system that provides a platform for the integration of a wide variety of cameras, data systems and analytics. By starting with a platform that is easy to integrate with, the system will be able to flex to meet several use cases.
Use cases are the crux of an effective implementation. Transportation surveillance focuses on the rapid motion of a large number of people and objects over a wide area. Too often, video systems are installed in such a way that they see too much of the big picture and not nearly enough of the detail. While tracking a wide area is a catch-all use case, it does nothing to solve the problem of too much video. To create an effective use case, institutions should not implement a wide-reaching dragnet, but rather should address individual cases, developing, implementing and tuning each one until it works.
As New York found in the 1990s, the key to overall crime reduction is to crack down on specific small crimes, such as graffiti, littering and public drunkenness. Define the use case and how you would solve it, and then determine the proper equipment, search tools and analytics that will allow your organization to address the behavior effectively from a reactive and proactive stance. Toll jumpers, freight from a suspicious location or unattended bags are good examples of use cases that can be addressed and managed with video analytics. Once you conquer one, move on to another. Soon enough, you will see that you have solved several individual problems, but what you have really done is create fewer options for criminals to destabilize transit systems and endanger the people and cargo that move through them.
Surfing the Tidal Wave
We stand at a turning point in the development of video as a tool to stop crime in our nation's transportation systems. A proliferation of cameras has created a tidal wave of nearly indecipherable visual information. To surf it, we are going to have to combine the knowledge from our intelligence communities, the most innovative system design, and the best and brightest from the IT industry.
It will require us to embrace and hone new tools and technologies, teaching the right video to surface itself. But in the end, making sense out of video and making it searchable is the only way to begin to fulfill the promise of video as a tool to keep travelers, trade and our nation safe.