As 2014 drew to a close, one of the SPD officers involved in our body-worn video camera pilot project was on hand at the Seattle Center to capture the dazzling lights and cacophonous boom, boom, boom of the New Year’s Eve fireworks show.
The Space Needle looks even brighter than the moon, moon, moon, doesn’t it?
SPD is still working through some of the ideas generated at our amazing December Hackathon to make more of these videos—and other police records—accessible to the public. If you weren’t able to attend the Hackathon but want a look at some of the ideas discussed during the event, a few of our attendees got together and compiled their notes into a single document. You can find their raw notes below:
#SPDhackathon
https://twitter.com/search?q=%23SPDHackathon&src=typd
http://spdblotter.seattle.gov/2014/12/05/sign-up-now-for-the-first-ever-seattle-police-hackathon/
We’re looking for a few good hackers who can help automate the process of:
Blurring or redacting faces, license plates and audio in recordings, while leaving officers un-obscured
Transcribing, subtitling and time-stamping audio from videos
We’ve uploaded a sample video (broken into three .zip files) and we’ll post additional clips next week. Before you get coding, please note that any redaction software you dream up must leave recordings in their original format:
Our front-facing car cameras currently record in H.264 (MPEG-4) at 720×480, 3.5 Mbps. Rear-facing cameras record at 352×340.
Videos recorded prior to 2013 are in MPEG2.
The frame types include typical MPEG GOP structure, including I, P and B frames.
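One way to check a clip's frame layout before writing redaction code is to ask ffprobe (part of the ffmpeg suite) for the picture type of every frame and tally the result. The sketch below is only an illustration: the helper names and the sample JSON are made up for this example, and `probe` assumes ffprobe is installed and on the PATH.

```python
import json
import subprocess
from collections import Counter

def ffprobe_command(path):
    """Build an ffprobe call that lists the picture type of each video frame."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "frame=pict_type",
        "-of", "json",
        path,
    ]

def summarize_frames(ffprobe_json):
    """Return the frame-type sequence (e.g. "IPBP") and a count per type."""
    frames = json.loads(ffprobe_json).get("frames", [])
    sequence = "".join(frame.get("pict_type", "?") for frame in frames)
    return sequence, Counter(sequence)

def probe(path):
    """Run ffprobe on a real file and summarize its frames."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize_frames(result.stdout)

# Hand-written sample in the shape ffprobe emits, so the parser can be
# tried without a video file. It is not taken from an SPD recording.
SAMPLE = '{"frames": [{"pict_type": "I"}, {"pict_type": "P"}, {"pict_type": "B"}, {"pict_type": "P"}]}'
```

Calling `summarize_frames(SAMPLE)` exercises the parser on its own; `probe("clip.mpg")` would do the same for a downloaded sample file.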
First video file in a three part zip archive
http://tinyurl.com/pyw8u9t
http://tinyurl.com/nq5uvvk
http://tinyurl.com/py999e3
File: 6092@20141130161704.mpg
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjbGVLaHhQdEdFUFE/view?usp=sharing
File: 6968@20141204095101.MPG
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjVXVLdDl5WEd6U28/view?usp=sharing
File: 5678@20141204144151.MPG
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjbndRb2pPTTJZUzg/view?usp=sharing
File: 6968@20141204095102.MPG
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjMlF2UHN6V1hJXzQ/view?usp=sharing
File: 6092@20141130161705.mpg
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjdktKQXEydlg1YTg/view?usp=sharing
File: 5678@20141204144152.MPG
Direct link: https://drive.google.com/file/d/0Bz0KotYzRFdjQk80SUpCNllDb0U/view?usp=sharing
Microsoft Azure Media Services
Tools from the Demo
Demo AMS Explorer Tool – https://github.com/Azure/Azure-Media-Services-Explorer/releases/tag/v3.0.3.0
Demo Caption Viewer – http://ie.microsoft.com/testdrive/Graphics/CaptionMaker/Default.html
Getting started Resources
Getting Started with Azure Media Services – http://azure.microsoft.com/en-us/develop/media-services/resources/#header-0
Detailed References – http://msdn.microsoft.com/library/azure/hh973629 (indexer specific post can be found on the left hand menu)
Indexer Intro and Sample Code – http://azure.microsoft.com/blog/2014/09/10/introducing-azure-media-indexer/
Download Developer Tools – http://azure.microsoft.com/en-us/develop/media-services/developer-tools/
Free Azure Credits
http://www.microsoft.com/bizspark/Register.aspx?SecurityCode=SAt1qdQcEz
Code SAt1qdQcEz must be redeemed by 1/31/2015
Tutorial on activating http://bretstateham.com/bizspark-enrollment-tutorial/
Contact – Richard Li. rli@microsoft.com. @richyli.
Legal redaction requirements – Mary Perry
License plates are not generally exempt from disclosure and are therefore subject to disclosure
http://apps.leg.wa.gov/RCW/default.aspx?cite=13.50.050 Records relating to commission of juvenile offenses
Media and requestors would not accept completely blurred or over-redacted video.
The automatic over-redacted video could serve as a teaser or preview of available video for follow-up requests that require more manual processing.
Actual cost of copying records covered.
Video redaction needs to include more than just the face: body, hair/skin(?) color, gait.
Redaction needs to include audio to prevent release of sensitive information (e.g., medical information, mental health information, SSN, DL#, birth date, etc.)
Open Source
Face Detection in Python Using a Webcam (OpenCV): https://realpython.com/blog/python/face-detection-in-python-using-a-webcam/
ffmpeg
ffmpeg has a feature that allows scrubbing to first sound
avidemux http://avidemux.sourceforge.net/ (scriptable manual editor)
blender – can track individual objects and blur (manual process, can be automated using Python)
Pitivi – Facebl0R (makes it crash) and face tracking (SegFaults)
kdenlive – Has “Auto Mask” to follow, obscure objects (cars, whole bodies, etc.) (too many options)
gaupol, free, open-source automatic captions (useful for search, then edit with kdenlive or blender)
Idea from henry@thenerdshow.com: embed every redaction action in an SRT (captions) file which would provide metadata about why the video was blurred (was it a minor, etc). This would allow a specific redaction to be challenged, reviewed and possibly reversed.
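The SRT idea above could look something like the following sketch, which writes each redaction as one caption cue carrying what was obscured and why. The `Redaction` fields, the cue wording and the reason labels are assumptions made for illustration; the notes do not specify a format.

```python
from dataclasses import dataclass

@dataclass
class Redaction:
    start: float   # seconds from the start of the video
    end: float     # seconds from the start of the video
    target: str    # what was obscured, e.g. "face" (hypothetical label)
    reason: str    # why it was obscured, e.g. "juvenile" (hypothetical label)

def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def redactions_to_srt(redactions):
    """Render redactions as SRT cues, numbered in order of start time."""
    cues = []
    ordered = sorted(redactions, key=lambda r: r.start)
    for number, r in enumerate(ordered, start=1):
        cues.append(
            f"{number}\n"
            f"{srt_time(r.start)} --> {srt_time(r.end)}\n"
            f"[REDACTED {r.target}] reason: {r.reason}\n"
        )
    return "\n".join(cues)
```

Because the log is a plain caption file, a reviewer could open it in any subtitle editor, jump to a cue, and see which redaction is being challenged.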
Team UW
demonstrated using openCV to detect and blur faces
Team Evidence.com
Said Washington State is most challenging because of records act but will set international standard for disclosure of video
henry@thenerdshow.com
Demonstrated own algorithm, realtime face detect/obscure, fun effects for live feeds.
Search will be important, unless over-redacting.
1. Recommends gaupol, free, automatic captions for search. Windows version http://home.gna.org/gaupol/download.html
2. Plans to extend gaupol to learn new words, e.g. police terminology, names. (Edit SRT, press “Learn” button.)
This will be as easy as taking code from his other project, FreeSpeech, also hosted on http://thenerdshow.com
3. Have gaupol edit multiple captions to make it easier to log changes.
Redaction
todo: REDaction Editor (reddit). Starting with something like the above, further extend it into a stripped-down search-and-destroy video editor with text, logging capabilities, audio search, b-frame motion search, and easier object search, track/blur. Planning stages. Figure naming it after reddit will generate free controversy/publicity…
In the meantime, tentatively recommending http://blender.org as probably the most capable cross-platform, open-source solution to redact, block out various moving objects and bleep out audio, but certainly not the fastest, nor easiest to use… demo video here: https://www.youtube.com/watch?v=IFHtIGjxzUQ but would like to make a better one.
kdenlive is easier, though less capable, if virtualbox image is not too much hassle…
https://kdenlive.org/user-manual/downloading-and-installing-kdenlive/virtualbox-images
Over-redaction
The “quarktv” effect may be considered to automatically blur people from videos, under the assumption that (usually) people are in motion. At worst, this would reduce the editing required to only blur out portions when they are not moving. Example video here: https://www.youtube.com/watch?v=UDjHE71xSNc
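The underlying assumption (people are usually the things that move) can be illustrated with plain frame differencing. This is a simpler stand-in, not the quarktv algorithm itself, and it works on tiny grayscale frames given as lists of rows; the threshold and fill values are arbitrary choices for the sketch.

```python
def motion_mask(prev, curr, threshold=25):
    """Mark pixels whose brightness changed by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]

def obscure_moving(prev, curr, threshold=25, fill=128):
    """Return a copy of `curr` with moving pixels replaced by flat gray."""
    mask = motion_mask(prev, curr, threshold)
    return [
        [fill if moving else value for value, moving in zip(row, mask_row)]
        for row, mask_row in zip(curr, mask)
    ]
```

As the notes point out, anything that stands still stays visible under this approach, so it reduces manual editing rather than replacing it.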
Team Anon
Over-redacting video and providing a transcript provides a quick solution for enhanced transparency and privacy
http://policevideorequests.org/overredacted_proposal
http://policevideorequests.org/long_interview_demo/
above 5000 content hours, cost of audio indexing is $3 per content hour
Team Open Source
Seattle spends one third of its municipal budget on policing. Money spent by the city on software encoding of procedures for legally mandated public records requests should be Open Source and provided under Free Software licenses to the greatest extent possible. Seattle Police Foundation and Code for America are examples of nonprofits that are most likely to fund open source software development.
General Comments / Discussion
Mary Perry [City Law Department] asked those who were there as community members how they felt about redaction policies. Two community members spoke and both said that over-redaction of videos released to the public was preferred to redacting to the legal standard of privacy / minimal redacting.
Rather than attempting to find the perfect redaction solution that fits all scenarios, SPD should tailor their redaction approach to the specific problem. If the problem is releasing all its existing videos to the public, then over-redaction with audio stripped and redacted transcript may be the best solution even if it won’t solve their future problems regarding redaction and release of body camera video. SPD should implement a monitoring program (e.g., tracking feedback, follow-up requests, complaints, etc.) that will allow them to determine whether their redaction policy has actually solved the problem.
The identification of people in redacted video can be effected with other sources of information (mosaic theory):
other overlapping video / photographs
gait recognition
electronic signals gathered: (RFID, cellular, WiFi, Bluetooth)
see http://digitalcommons.law.umaryland.edu/fac_pubs/1375/ for details (tl;dr anonymization is hard, de-anonymization is easy and irrevocable)
Notes and Comments from Bill Schrier – bill@schrier.org
See also my article and photos posted on Geekwire here: http://www.geekwire.com/2014/seattle-police-hackathon-substantial-first-step/
These notes are “raw” and may not be correct in every respect.
1. Microsoft Azure Services – indexing – platform but not a solution
a. Uploads the video, converts the audio to text pieces by the pauses. When you click on the audio in the right side window it is supposed to take you right to the video part.
b. Can you delete the audio and associated
c. English audio only – future is different languages – treats non-English as background noise
d. Result can also be displayed as a transcript
e. Science Cinema is built off this platform – www.osti.gov/sciencecinema
f. Can search on any word in the collected sets of transcripts and find the appropriate video
g. Build on the standard set of Microsoft Azure services
h. Gives a recognizability score based on how good the video is
i. H264 and a number of other video standards …
j. Facial recognition is not an Azure service and is apparently not planned …
k. Ton of APIs which could be connected – could run indexing on the huge backlog of files –
l. Could be connected to the Microsoft government cloud which is CJIS compliant since it is an Azure service
m. MAVIS is the audio recognition service from Microsoft Research
n. Have a more advanced service based on SQL which displays word-by-word, and a percentage of potential accuracy, and possible interpretations.
o. Open source software?
p. Azure media services explorer
q. SDK for Java is open source and some tools are on GitHub
r. Linking to Skype real-time translator? Basis of the tech is fundamentally the same. Not integrated in the same service, out of the box.
s. Can index files which are on a server other than in the Azure cloud.
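The search behaviour described in items a and f can be sketched independently of any indexing service. The code below assumes the transcript has already been exported as `(start_seconds, text)` pairs per video; that data shape and the function names are assumptions for illustration, not the Azure Media Indexer API.

```python
import re

def search_transcript(cues, term):
    """Return every (start_seconds, text) cue that contains `term`."""
    needle = term.lower()
    return [(start, text) for start, text in cues if needle in text.lower()]

def build_index(transcripts):
    """Map each lowercase word to the (video_id, start_seconds) spots where it is spoken."""
    index = {}
    for video_id, cues in transcripts.items():
        for start, text in cues:
            for word in re.findall(r"[a-z0-9']+", text.lower()):
                index.setdefault(word, []).append((video_id, start))
    return index
```

A lookup such as `build_index(transcripts).get("unit", [])` returns the videos and offsets to jump to, which is the "click on the text, land on the video" behaviour from the demo.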
2. Mary Perry – what has to be deleted?
a. Ability to create video has outstripped ability to redact
b. RCW 42.56
c. Exemptions are specific – law enforcement records exemption – information that would violate any person’s right to privacy – tort standard of privacy – highly offensive to a reasonable individual AND of no legitimate interest to the public. Courts want redaction rather than withholding an entire record.
d. Not an exhaustive list – complainant victim or witness who asks for non-disclosure or whose person or property is at risk. Courts have not definitively ruled on video – is it blurring or pixelation or complete blacking out? More than face – audio of names or addresses.
e. Other types: identified juvenile – whole chapter of RCW 13.50 which deals with juveniles – withhold anything which identifies them.
f. Mental health information or medical information. Someone in a crisis and lots of sensitive information.
g. Can give 3rd party notice – in other words if someone requests public disclosure SPD can say they are going to release it and the individual can seek a redaction.
3. Simon Winder – 20 years at Microsoft Research but now independent
a. Interested in detecting things in video.
b. Face detection and showed video of Egyptian protests with redaction and also redaction on a still frame from Spokane police
c. Relatively easy to track the same face across a series of frames – it is “possible” – needs a few months of actual engineering work
d. Same techniques for facial recognition usable for license plates
e. YouTube has a redaction technology which is relatively slow
f. Possible to do this with a whole body as well – need to protect whole identification of an individual, e.g. gait, hair color, sex etc.
g. Distinction between facial detection and recognition.
h. Viola-Jones face detection in the first set – developed by Paul Viola of Microsoft.
i. Interesting idea – build a facial recognition of all officers so they are not automatically redacted from video as required by law
j. Can the officer flag something in the video which would require further redaction? This requires a lot of officer time. Evidence.com: might have chipsets in the camera which actually recognize what’s a face or what needs to be redacted. Also an issue of what needs to be flagged in audio. Maybe modules can identify SSNs or similar audio. But it also means someone will have to do a final review of an audio before released.
k. Mary Perry – trying to implement procedures to tag or flag video – or add metadata that a juvenile witness is in the video or someone has asked for non-disclosure. Note: license plates almost never are allowed for redaction.
l. Could everything be blurred? Probably would not meet the requirements of the public records act as it redacts information of legitimate interest to the public and therefore subject to disclosure. Maybe over-redact and push it all out and then have people ask for specific video to be less redacted.
m. What are all the use cases and patterns? Have people view the video rather than give them a copy? Mary: two issues – might be violating right to privacy having someone see it but also requires public records act saying “I want a copy”.
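The audio flagging idea in item j could start from the transcript rather than the audio itself: scan timed transcript cues for number patterns and hand the matching time ranges to a human reviewer. The sketch below assumes the recognizer writes spoken digits as digits and that cues arrive as `(start, end, text)` tuples; both are assumptions, and the two patterns are illustrative rather than a complete list of sensitive data.

```python
import re

# Illustrative patterns only: nine digits grouped like an SSN, and a
# slash-separated date that might be a birth date.
PATTERNS = {
    "ssn-like": re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b"),
    "date-like": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def flag_sensitive(cues):
    """Return (start, end, label) for each cue whose text matches a pattern."""
    flags = []
    for start, end, text in cues:
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                flags.append((start, end, label))
    return flags
```

The output is a list of time ranges to review or bleep, not a final decision, which matches the point above that someone still has to do a final review before release.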
4. OPA Review Board – show both the redacted and the unredacted video.
5. Body-worn video – training is today and launch is tomorrow for SPD.
6. San Francisco – could the redaction be a separate stream from the video itself – embed SRT files which are caption files including timestamps to index all the changes to video files. Public could look at that separate stream and ask for unredaction if necessary.
7. Seattle PD video – I frame followed by 14 P frames. MPEG-4 video and some older 720×480 and 352×240 rear facing in MPEG-2. Use the ffmpeg tool to probe the file. Court-ordered redaction. So:
a. Full body of video, some which cannot be publicly disclosed because it is an active investigation
b. For that video which is ordered redacted by court, has to be blurred, not blacked to disguise things like gait but not entirely hide.
8. UW Team …
a. Mostly a proof of concept
b. Using open CV – open source library for video – doing research at the UW on this
c. Time it takes to process the video is the time it takes to watch it – really fast. But is not very thorough. Can distinguish individual features and shapes on faces – there are hard filters for bodies or eyes or profile face etc.
d. Lots of recent work on putting boxes or polygons around things and allowing redaction of multiple frames
e. Also if two cameras are present it may be possible to get a 3D image of the thing to be redacted in both video
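Once a detector has produced a bounding box, the obscuring step is straightforward. The sketch below pixelates one `(x, y, w, h)` box in a grayscale frame stored as a list of rows; it is a pure-Python illustration, not the UW team's code. In an OpenCV pipeline the boxes would typically come from something like `cv2.CascadeClassifier.detectMultiScale`, which returns rectangles in that same `(x, y, w, h)` form.

```python
def pixelate(frame, box, block=8):
    """Return a copy of `frame` with the region `box` = (x, y, w, h) pixelated.

    Each block-by-block tile inside the box is replaced by its mean brightness.
    """
    x0, y0, w, h = box
    out = [row[:] for row in frame]
    y_end = min(y0 + h, len(frame))
    x_end = min(x0 + w, len(frame[0]))
    for tile_y in range(y0, y_end, block):
        for tile_x in range(x0, x_end, block):
            ys = range(tile_y, min(tile_y + block, y_end))
            xs = range(tile_x, min(tile_x + block, x_end))
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

A larger `block` hides more detail. Applying the function to every box in every frame gives the per-frame face blurring described above, at the cost of missing whatever the detector misses.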
9. thenerdshow.com – Henry Kroll
a. Open source – the Blender – demonstrated a cool blurring and tracking of objects … can be fully automated with python …
b. Pitivi – gives timeline and thumbnails, face blurrers etc. but has lots of bugs which the developers are fixing …
c. Ffmpeg appears to be the basic set of libraries for handling multi-media files
d. Henry Kroll has own voice recognition program to compete with Dragon
e. Court TV plug-in open source
10. Evidence.com – Marcus
a. It is not about the camera, but about
b. 80-100 gigabytes per year per officer standard definition 720p. Then HD 1080 eventually and 4k for surveillance.
c. Encrypted in transit and at rest.
d. 30 second pre-event buffer – it is always recording so it grabs previous 30 seconds when turned on. At end of shift officer plugs into docking station and it is uploaded to evidence.com. Integrity and chain of custody maintained, including access control.
e. Today: manual redaction is what is offered.
f. Washington is further along its open records – expect to see the same challenges nationwide eventually
g. Want to provide an open platform for all evidence, e.g. surveillance, interview room, body-worn etc. Want to have open-apis so outsiders have access to the video. So if they can get a great redaction tool it could be applied and the public can view on the evidence.com site.
h. Things to keep in mind:
i. The end to end experience of the officer, e.g. offloading each video at the end of the day. Think about the process holistically. Also think about servicing the public with both the redaction and serving them up.
j. Think about doing this at scale. Process to handle 80-100 gigabytes per year per officer. Make the manual part more efficient, e.g. cloud.
k. Take baby steps. Even if you can eliminate 10-20% of the work you have a win.
l. Open partner platform – Tony in the back of the room.
m. Multi-agency solution – kinda like dropbox for sharing with other agencies and district attorneys etc. Have a district attorney workflow.
n. It is the agency’s data
o. Data is hashed on the camera and follows through the chain
p. Washington state the most vocal – most other states much of this can be withheld.
q. Mary: agencies cannot deny requests which are overly broad; also can only recover cost of copying the record; 3 – very few exemptions
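Item o (data hashed on the camera and carried through the chain) is the kind of check a redaction tool could repeat on the original file before and after it works. The notes do not say which hash Evidence.com uses, so the SHA-256 choice and the function names below are assumptions.

```python
import hashlib

def sha256_of_chunks(chunks):
    """Hash an iterable of byte chunks without holding the whole file in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file on disk, reading it in 1 MiB pieces by default."""
    with open(path, "rb") as handle:
        return sha256_of_chunks(iter(lambda: handle.read(chunk_size), b""))

def unchanged(path, recorded_hash):
    """True if the file still matches the hash recorded earlier in the chain."""
    return sha256_of_file(path) == recorded_hash
```

A redaction tool that writes its output to a new file and leaves the source untouched should see `unchanged(original_path, recorded_hash)` stay true after every run.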
11. Anon – Tim Clemens
a. The reason we are here today.
b. Get out as much as possible as quickly as possible
c. Overredaction
d. Demonstrated a 5 hour video of interview with SPU murder suspect – audio file created with Microsoft indexer – allows quick search of all instances where parents were discussed.
e. Tim’s whole idea is that the video which does not need redaction can be immediately uploaded and exposed.
f. SPD thinks 90% of the video is in this category but doesn’t know which videos fall into that category – policies and procedures still need implementation.
g. Idea is to overredact and then if more people want the detailed data, e.g. news media, then allow a tailored solution
h. Transcripts, overredacted police reports and video, CAD reports allow people/news media to find the stuff which is interesting
i. He likes to catch cops doing stuff right – and publicizing it –
j. Idea: using all these police microphones and audio to create a Seattle police gunshot detection system
12. Team open source
a. SeaGL – Adam Monson, Phil Mocek, Lee Colleton of Seattle Privacy Coalition –
b. SeaGL – Seattle community colleges –
c. Idea is to make software used by SPD open and available – Phil: Center for Open Policing
d. They are just pushing the idea of use of open source software …
e. Open source software – many different foundations or non-profits who can provide support
13. Community input:
a. Over-redact rather than under-redact
14. Summary:
a. Actually an ideal solution would allow specification of objects, e.g. faces or “juvenile bodies” or “house numbers” and allow fast processing of huge numbers of video files to find and redact the objects all at once …
b. Flagging video which does not need redaction
c. Concern about the amount of time for cops on the street –