Detecting Fake Videos

This story nicely illustrates the arms race between technologies to create fake videos and technologies to detect fake videos:

These fakes, while convincing if you watch a few seconds on a phone screen, aren’t perfect (yet). They contain tells, like creepily ever-open eyes, from flaws in their creation process. In looking into DeepFake’s guts, Lyu realized that the images that the program learned from didn’t include many with closed eyes (after all, you wouldn’t keep a selfie where you were blinking, would you?). “This becomes a bias,” he says. The neural network doesn’t get blinking. Programs also might miss other “physiological signals intrinsic to human beings,” says Lyu’s paper on the phenomenon, such as breathing at a normal rate, or having a pulse. (Autonomic signs of constant existential distress are not listed.) While this research focused specifically on videos created with this particular software, it is a truth universally acknowledged that even a large set of snapshots might not adequately capture the physical human experience, and so any software trained on those images may be found lacking.

Lyu’s blinking revelation revealed a lot of fakes. But a few weeks after his team put a draft of their paper online, they got anonymous emails with links to deeply faked YouTube videos whose stars opened and closed their eyes more normally. The fake content creators had evolved.
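
A minimal illustration of the blink cue: standard blink detectors track facial landmarks and compute an eye-aspect ratio (EAR) that collapses when the eye closes. The sketch below is not Lyu's actual detector; it assumes dlib's publicly available 68-point landmark model file and a hypothetical video path:

```python
import cv2
import dlib
import numpy as np

# Assumes the standard (publicly downloadable) dlib 68-point landmark model.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eye_aspect_ratio(pts):
    """EAR: near zero when the eye is closed, roughly 0.25-0.35 when open."""
    vertical = np.linalg.norm(pts[1] - pts[5]) + np.linalg.norm(pts[2] - pts[4])
    horizontal = np.linalg.norm(pts[0] - pts[3])
    return vertical / (2.0 * horizontal)

def count_blinks(video_path, threshold=0.2):
    """Count open-to-closed transitions of the left eye across a video."""
    cap = cv2.VideoCapture(video_path)
    blinks, eye_closed = 0, False
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for face in detector(gray):
            shape = predictor(gray, face)
            # Landmarks 36-41 outline the left eye.
            left_eye = np.array([(shape.part(i).x, shape.part(i).y)
                                 for i in range(36, 42)], dtype=float)
            ear = eye_aspect_ratio(left_eye)
            if ear < threshold and not eye_closed:
                blinks, eye_closed = blinks + 1, True
            elif ear >= threshold:
                eye_closed = False
    cap.release()
    return blinks  # humans blink ~15-20 times a minute; near zero is suspicious
```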

I don’t know who will win this arms race, if there ever will be a winner. But the problem with fake videos goes deeper: they affect people even if they are later told that they are fake, and there always will be people who will believe they are real, despite any evidence to the contrary.

Posted on October 26, 2018 at 9:01 AM

Comments

Peter S. Shenkin October 26, 2018 9:29 AM

The kicker is that some of our own very explicit memories turn out to be false; or fake, from the outside. We tend to believe even our own fakes.

Sancho_P October 26, 2018 9:36 AM

@Bruce

The problem with any belief is that it changes with time and knowledge.
So with time and knowledge there are more options on the opposite side:
Some people (30%?) will not believe even if they are told the video is true, because they distrust the teller.
Others (10%?) will not believe anyway, even if the video is true, because they know it could be faked.
Last but not least, there are the people who don’t want it to be true.
By the end of 2018, I’d guess there will be more doubters than believers.

However, the disturbing part is your “… they affect people even if they are later told that they are fake”.

Impossibly Stupid October 26, 2018 10:08 AM

I don’t know who will win this arms race, if there ever will be a winner.

The fakers have already won. Despite whatever imperfections there are in what they create, history has shown that the first-mover advantage will always favor them. FUD works, regardless of the levels of technology employed. The only way to “win” is to fight a different battle, which is to not be overwhelmed by your lizard brain when it comes to processing information.

Winter October 26, 2018 10:21 AM

A digital image or recording is nothing better than a painting or drawing. In principle, a recording can be built bit for bit. Currently that is not a practical way to create an image or recording, but if the reward is great enough, it will become profitable to do it anyway.

Like photographic evidence, any recording will be worth only as much as the word of its creator. If the creator is anonymous, the recording will be worthless as evidence of anything.

Bob Paddock October 26, 2018 10:38 AM

Bringing the 1981 movie LOOKER (Light Ocular Kinetic Emotive Response) to ‘Reality’ it seems: actresses replaced with their indistinguishable digital avatars. We’ve already seen a few TV commercials making crude versions of this concept.

The LOOKER device caused people to lose segments of time by flashing patterns of lights in their eyes. Security implications in ‘Non-Lethal Weapons’.

Phaete October 26, 2018 10:51 AM

It’s a double-edged sword.
You don’t want fake videos to ruin people’s lives.
You do want fake video to immerse you in the latest Hollywood blockbuster.

Kinda like automatic lockpicking tools, you only want them in certain hands.

Charles Todd October 26, 2018 11:07 AM

What if every camera had a digital watermark? Preferably, one baked into the CCD as a bias. If the signature was signed by the manufacturer, we would gain nonrepudiation and integrity. News organizations and video/photo sites could prevent images from going viral if they can’t be validated and shown to be free of tampering. This would save us from having to disprove something as real, and would instead allow us to ask whether it is authentic. Put the doubt in people’s minds from the start.
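
As a sketch of what this proposal might look like in software, signing at capture time and verifying before publication could work as below. Everything here is illustrative, not any real camera's API; a real design would keep the private key in tamper-resistant hardware and chain the public key to a manufacturer certificate:

```python
# Illustrative only: no real camera exposes keys this way.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In the proposal, this key pair would be baked into the camera hardware.
camera_key = Ed25519PrivateKey.generate()
camera_public_key = camera_key.public_key()

raw_frame = b"stand-in for raw CCD sensor bytes"  # hypothetical payload
signature = camera_key.sign(raw_frame)            # produced at capture time

# A news organization or photo site verifying before publication:
try:
    camera_public_key.verify(signature, raw_frame)
    print("image matches the camera's signature")
except InvalidSignature:
    print("image was altered after capture")
```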

Genie October 26, 2018 12:54 PM

It’s a cheap murder conviction in a small-town county court.

A gun was involved, or possibly a knife if there was blood everywhere and the D.A. can get a DNA lab to cooperate with the coroner.

Murder by poisoning, drug overdose, arranged “accident” etc. is far beyond the means of the small-town county court to prosecute.

Alex October 26, 2018 1:06 PM

It will take time, but I suspect we will start treating these videos as your usual random Internet image. We are no longer surprised by photoshopped images, even if we’re not experts at detecting them. We treat them with a healthy level of skepticism, and at some point the same will go for these videos.

David Rudling October 26, 2018 1:40 PM

The technology involved is really beside the point.

You can fool some of the people all of the time.
You can fool all of the people some of the time.

Nothing has changed irrespective of the technology. Gullibility will always exist.

tfb October 26, 2018 2:29 PM

@Charles Todd: If every camera has a digital watermark, people will simply film the videos they have faked with a real camera, and the result will carry an appropriate watermark.

Fred P October 26, 2018 2:48 PM

@Charles Todd – There already are manufacturing defects in CCDs that are pretty close to unique. Sample source: http://www.darkerview.com/CCDProblems/pointdefect.php Typically, however, the user of a camera is not looking at raw data; a camera’s output is some sort of processed, often compressed, image. One of the things that is typically processed out is any small defect in the camera.

Disclosure: I worked on low-level software for high-quality cameras about 2 years ago.
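
The comment doesn't name a technique, but the best-known forensic use of these near-unique sensor defects is PRNU (photo-response non-uniformity) fingerprinting: average the noise residuals of many images from one camera, then correlate a questioned image's residual against that fingerprint. A rough sketch, with a Gaussian blur standing in for the wavelet denoisers used in the forensics literature, and all inputs assumed to be same-size images from disk:

```python
import cv2
import numpy as np

def noise_residual(gray):
    """Residual = image minus a denoised version of itself."""
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)
    return gray.astype(np.float64) - denoised.astype(np.float64)

def estimate_fingerprint(paths):
    """Average residuals over many same-size images from one camera."""
    residuals = [noise_residual(cv2.imread(p, cv2.IMREAD_GRAYSCALE))
                 for p in paths]
    return np.mean(residuals, axis=0)

def fingerprint_correlation(fingerprint, gray):
    """Normalized correlation: high values suggest the same sensor."""
    r = noise_residual(gray)
    f = fingerprint - fingerprint.mean()
    r = r - r.mean()
    return float((f * r).sum() / (np.linalg.norm(f) * np.linalg.norm(r)))
```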

Clive Robinson October 26, 2018 3:03 PM

@ Bruce,

I don’t know who will win this arms race, if there ever will be a winner.

There is a way a potential victim can avoid being a victim, but it carries a price, quite a high one.

If you decide for some reason that you might become a victim, you can these days fairly simply arrange for your entire life to be recorded 24×365.25 and archived for future use as counter-evidence.

The price is little or no social life, no private moments and never ever doing wrong even by accident.

The question is how much of your life do you need to record and release?

We already know that exercise recorders have been used by prosecution teams to obtain convictions; the question is whether recordings of exercise or cardiac rate alone could be enough.

With the secondary question being: if an attacker obtained a copy of your exercise recording, could they make a fake video to match?

With the equivalent of crypto chaining, the ability of an attacker to simply delete portions of the exercise recording, as might once have happened, becomes moot.

On analysis the advantage lies not with the attacker but with the well-prepared target, because the attacker does not know, when they play their card, whether the target has one or more trumps to play.

This has actually happened, in a way. In the UK a woman accused Neil Hamilton, a former Conservative politician, and his wife Christine of kidnapping and assaulting her. The accuser went to Max Clifford, a well-known publicist to the stars, who pushed the story out in great detail to the UK mainstream media, mainly for its salacious aspect, but including the alleged time and place. After Mr and Mrs Hamilton obtained their cell phone records showing they were at a meal somewhere else entirely, the story, and the police investigation, fairly quickly died. However, Max Clifford kept flapping his gums, which ended up costing him £100,000 in damages in court.

https://www.theguardian.com/media/2004/aug/16/pressandpublishing.conservativeparty

Impossibly Stupid October 26, 2018 3:07 PM

@Charles Todd

What if every camera had a digital watermark?

They already effectively do. They have their raw data format, which then usually gets exported as some common compressed format like JPEG. You can hash the data at any point in the editing process, establishing a blockchain-like history of provenance.

If the signature was signed by the manufacturer, we would gain nonrepudiation and integrity.

I’m not sure how much value that has. In addition to potential hacking of the signature system, you could simply use a “trusted” device to record the edited content, thereby setting up an easy counter-claim as to which version is the original. Given the ability to make perfect digital copies of things, the better way to establish ownership is to demonstrate that you have additional data that wasn’t made available in the faked version.
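
For concreteness, here is a minimal sketch of the hash-chained provenance idea described above, with the version contents as placeholders: each link commits to both the previous hash and the current bytes, so any later change to an earlier version breaks every subsequent hash.

```python
import hashlib
import json
import time

def chain_step(prev_hash: str, data: bytes) -> str:
    """Hash the previous link together with the current content."""
    h = hashlib.sha256()
    h.update(prev_hash.encode())
    h.update(data)
    return h.hexdigest()

# Hypothetical editing history: raw capture, then each edited version.
versions = [b"raw-sensor-bytes", b"cropped-bytes", b"color-graded-bytes"]

prev = "0" * 64  # genesis value
provenance = []
for step, content in enumerate(versions):
    prev = chain_step(prev, content)
    provenance.append({"step": step, "hash": prev, "time": time.time()})

print(json.dumps(provenance, indent=2))
```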

Guest October 26, 2018 3:41 PM

@Clive,
Timestamp servers. (Or the blockchain, if you can publish the hashes anonymously.) To all appearances, you have no audit trail for your life, except when someone tries to accuse you of something, whereupon you produce far more compelling evidence. (Expect evidentiary requests to seal the records from the trial; if people can see the archive exists, and experts can examine it before testifying that it is legitimate, but no one else can obtain original copies to play with, the barriers to forging go up.)

Anon Y. Mouse October 26, 2018 3:48 PM

If memory serves, this was predicted thirty years ago in Stewart Brand’s book, “The Media Lab,” and presaged by the February 1982 National Geographic magazine cover, which was digitally altered to move the Egyptian pyramids closer together so that they would fit its vertical format.

I believe Brand predicted that photographic and even video evidence would eventually be far less believable because of the possibility of its being digitally modified.

Clive Robinson October 26, 2018 5:39 PM

@ Guest,

To all appearances, you have no audit trail for your life.

Yes and no, it depends on what abilities the attacker has, and what precautions you take.

If you use any service outside the perimeter you 100% control, then you will leak information in various side channels. This can now be fairly easily analyzed with automated AI; look on it as “traffic analysis on steroids”.

Thus any traffic that goes outside your 100% control can be used to paint a picture of your activities; sending “life data”, or even hashes of it, to an external service such as a public ledger or other form of electronic notary will get seen and noted.

Not much is said about the use of AI on your externally available signals, but it is something I’ve given some thought to over the years.

The old issues that prevented this in the past were lack of computer power, lack of algorithmic knowledge, and the cost of resources, especially when it comes to gathering source data.

The first two are fairly obviously “solved problems” these days, which leaves the cost of resources.

Technology has steadily brought the price down, or if you prefer, effectively doubled the value of your buck year on year. Worse, technology has become more or less “totally invasive”, not just via IoT but via the various “Smart” grids and systems, and the desire of some to extract the maximum information possible to gain extra value. That is, they view not collecting everything they can as “leaving money on the table”, which is not good business practice.

Few realise that the chips and signalling used in utility “Smart Meters” are sufficient that AI-driven signal processing can match your power consumption to activities. Especially as the devices in use become more efficient and the bulk of their power signature is function related.

As an example, if I can see the power signature of your fridge and oven when in use, it’s quite possible to determine the ambient temperature in your kitchen, which in turn will give an indication of other factors, such as the level of sunlight through the kitchen window…

The power signature of any kind of entertainment unit that has speakers will, due to high-efficiency audio amplifiers, put the low-frequency envelope of the audio being played onto the mains power, which the smart meter can then measure. Like songs and broadcast audio, dynamic images in films and similar will have a fairly unique power signature that can be matched to a database entry. These can then be checked against external sources to see where and when you got them. Such systems are a “wish list” item for licencing organisations, and will no doubt appear in the future when money can be made from them.

The problem with connection to an external data network is very similar. Traffic signatures are meta-meta-data, in that you do not need to see the content (data) or routing information (meta-data) to make valid assumptions about what an individual network packet is about.

For instance, a request to a DNS service has a time/envelope signature, and the process of resolving the parts of the domain name gives a fairly clear indication of what level of resolving is required. If a new data-stream signature indicative of audio or video starts, the fact that a DNS request does or does not precede it tells you various pieces of information.

If enough points on a network are monitored, which is what “collect it all” is all about, then a database search over the recorded meta-meta-data signals will fill in where your traffic has gone to or originated from. It’s the sort of basic search with self-modifying rules that AI systems are a very good match for.

The question then becomes who would build such systems and who would get access to them.

Put simply, level-three adversaries such as SigInt agencies and major corporates are already doing so in whatever way they can. Level-two adversaries, who are technically sophisticated or well funded, can get access to level-three data, or can these days build sufficient systems themselves. Thus it will only be a couple of years before even level-one attackers, such as the dreaded “300lb teenager in his bedroom in his parents’ house”, get the capability, which amplifies their ability to dox etc.

Simply because the corporates who make these data-gathering devices and build the databases don’t spend the money to secure them against any level of attacker above the basic “script kiddy”. But even if they did, the script kiddy will no doubt be able to rent access to tools made by those who are slightly more adept, because based on what is currently known about how “dark net” activities are monetized, it’s a fairly safe assumption that that is the model that will appear.

Which means I can fairly safely predict that within half a decade to a decade, the cost of privacy will be too much for the majority to afford, if they are to remain part of what society will become and thus demands of them.

It’s actually a war out there, and what exists behind your privacy is the resource they are all seeking to obtain, to commoditize you at the best price so that no money is left on the table. The problem is the freeloaders who get access for their own purposes.

It’s one of the reasons why I talk about what is involved with “energy-gapping” and “compartmentalizing” not just your computers but also other parts of your life. Because if people made simple changes now to make the cost of collecting their private data too high, then it would not be cost effective for corporates to do, and their shareholders would insist they stop…

Coyne Tibbets October 26, 2018 9:22 PM

Like all arms races, it will never be won…such races are always a never-ending spiral of capability both offensive and defensive.

A comment on the pulse idea: it has already been shown that a real person’s pulse can be read from video by detecting physiological changes. Two methods I know of: the first measures hemoglobin movement by detecting changes in skin color (especially in the green channel), and the second detects small movements of the head triggered by the pulse. I suspect these faked videos would be easily revealed by either method… and what if the methods were combined?

Already climbing the next loop of the spiral, defensively.
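
The first method described above is known in the literature as remote photoplethysmography: blood volume changes faintly modulate skin color, most visibly in the green channel. A toy sketch of the idea, assuming a fixed face region and a hypothetical video file (real systems track the face and filter much more carefully):

```python
import cv2
import numpy as np

def estimate_pulse_bpm(video_path, roi=None):
    """Estimate heart rate from mean green-channel brightness over time."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    greens = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        x, y, w, h = roi or (0, 0, frame.shape[1], frame.shape[0])
        greens.append(frame[y:y + h, x:x + w, 1].mean())  # BGR: index 1 = green
    cap.release()

    signal = np.asarray(greens) - np.mean(greens)    # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)

    # Plausible human pulse: 0.7-4 Hz, i.e. 42-240 beats per minute.
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# A synthesized face with no clear spectral peak in that band would be
# one hint that there is no pulse behind the pixels.
```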

JF October 27, 2018 8:13 AM

I recommend the book “Photo Fakery” by Dino A. Brugioni, a founder of the CIA’s National Photographic Interpretation Center.

The first chapter is “Photo Fakery is Everywhere.” The book’s take-home message is that photo manipulation, for whatever reason, good or bad, art or deception, has been with us since photography was invented, approximately 180 years ago. Little wonder the techniques of deception extend to video and digital media.

As I was taught in Civics class in junior high school, don’t believe everything you read (or see)!

Jemil Oyoka October 28, 2018 5:41 AM

Seems like it will only become easier to manufacture fake evidence that The Other Guy is making fake videos. You can even have some expert confirm that the videos are faked.

Tatütata October 28, 2018 1:41 PM

Aggressive video recoding can naturally produce false positives, with facial movements around the mouth and eyes getting dropped out of the output stream. So a faker would be well advised to generally use as low a bit rate as acceptable, a bit like Soviet photography which abused airbrushing even on pictures that didn’t need doctoring.

Herman October 30, 2018 12:40 AM

“CGI killed the video star.”

I predict the end of overpaid Hollywood starlets. That is not necessarily a bad thing.

Gilbert Fernandes October 30, 2018 4:27 AM

The French GEIPAN organization (which does scientific study of UFOs and anything unknown seen in the sky by both military and civilian observers) has special software used to detect tampered images. They are able to detect, with a very high rate of success, when a JPEG or other image has been modified to “insert” a fake UFO. The software is closed source, and GEIPAN and a few government organizations are its only users. Since videos are basically a succession of JPEG-like images (differential frames, with full frames from time to time), they probably can extend their software to video, if it isn’t already able to handle it.

The way the color data is encoded, and other information, helps to detect tampering. Inserting something into a picture without detection is pretty hard and requires custom code, so that the changes stay “within” the gamut and the JPEG encoding artifacts; otherwise it will be detected at the binary level itself.
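
GEIPAN's software is closed source, so the following is not their method; but one widely documented technique in the same spirit is error level analysis (ELA), which exploits exactly the JPEG encoding artifacts mentioned above: regions pasted in from elsewhere tend to recompress differently from the rest of the image. A minimal sketch using Pillow, with the file paths as placeholders:

```python
from PIL import Image, ImageChops

def error_level_analysis(path, quality=90):
    """Recompress a JPEG and amplify its difference from the original."""
    original = Image.open(path).convert("RGB")
    original.save("/tmp/recompressed.jpg", "JPEG", quality=quality)
    recompressed = Image.open("/tmp/recompressed.jpg")

    diff = ImageChops.difference(original, recompressed)

    # The differences are usually faint; rescale them to full range so
    # inconsistently-compressed (possibly pasted) regions stand out.
    max_diff = max(hi for _, hi in diff.getextrema()) or 1
    return diff.point(lambda px: min(255, px * 255 // max_diff))

# error_level_analysis("suspect_photo.jpg").show()
```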

echo October 30, 2018 6:29 AM

@Herman

I predict the end of overpaid Hollywood starlets. That is not necessarily a bad thing.

This is a deep area. I am not arrogant enough to claim a definitive view, only suggestions. People view originals and “quality” in a different way to commodity copies. This is not many steps removed from “learning by seeing”. I’m not sure offhand what the mechanisms are, but it’s related to the perception of what is real, and our emotional responses and attachments. This is part of the reason why practical effects produce a depth of experience and immersion that CGI, at least in its present form, cannot match. I believe Iain M. Banks explored some aspects of this in The Player of Games.

As far as I’m concerned, until Freddie Mercury performs on stage at Live Aid III, the technology isn’t done.

vas pup November 2, 2018 9:53 AM

Q: Could we train an AI in charge of a robot to be empathetic by having it watch many good classic movies?

Empathetic machines:

https://www.sciencedaily.com/releases/2018/11/181101085240.htm

“The researchers recruited 88 volunteers from a university and Amazon Mechanical Turk, an online task platform. The volunteers were asked to interact with one of four different online health service chatbots programmed to deliver responses specific to one of four conditions set up by the researchers: sympathy, two different types of empathy — cognitive empathy and affective empathy — or, an advice-only control condition.

In the sympathetic version, the chatbot responded with a statement, such as, “I am sorry to hear that.” The chatbot programmed for cognitive empathy, which acknowledged the user’s feelings, might say, “That issue can be quite disturbing.” A chatbot that expressed affective empathy might respond with a sentence that showed the machine understood how and why a user felt the way they did, such as, “I understand your anxiety about the situation.”

The researchers said that affective empathy and sympathy worked the best.

“We found that the cognitive empathy — where the response is somewhat detached and it’s approaching the problem from a thoughtful, but almost antiseptic way — did not quite work,” said Sundar. “Of course, chatbots and robots do that quite well, but that is also the stereotype of machines. And it doesn’t seem to be as effective. What seems to work best is affective empathy, or an expression of sympathy.”

Kevin November 15, 2018 8:56 AM

The Running Man, the movie of the Stephen King novel, showed tampered-but-convincing video of a man committing a crime. It was used to convict him; he was actually the only innocent person. It’s another example of sci-fi correctly predicting the future. Video tampering will eventually be cheap, easy, and convincing, and, hopefully, people will learn that videos are worthless as evidence of truth.
