Using Metadata to Spot Misinformation
Misinformation in media has been a heated topic online. Incidents of doctored images, deepfake videos and questionable reporting have made the news itself part of the news cycle.
However, the Content Authenticity Initiative (CAI) aims to address at least a part of that.
Describing itself as “a community of media and tech companies, NGOs, academics,” the CAI hopes to use technology not only to prove the authenticity of images, but also to let users know what, if anything, was modified.
That is done through a series of open-source tools that attach metadata to images at every step of the process, from the moment they are taken all the way through editing and publication.
To that end, they’ve amassed a tremendous group of companies and organizations. It includes tech firms such as Adobe and Microsoft, news organizations such as the Associated Press and Reuters, as well as camera companies Canon and Nikon.
The goal is to make it easy to both verify images as authentic and spot images that were manipulated in unethical ways.
However, even in my early testing of their product, there were issues and limitations that need to be addressed before this approach can really take off as a solution to unethical image manipulation.
How it Works
The CAI outlines the process plainly on their “How it Works” page.
However, to summarize: they are working to create an end-to-end system that hashes and tracks images through the entire pre-publication process, verifying not only that an image is authentic but also recording any changes made to it along the way.
This starts at the creation of the image with the embedding of metadata. Then, as the image is taken into an image editor, such as Adobe Photoshop, the history of edits made to the image is also embedded into it.
That embedded data is then preserved when it is published, and the CAI provides a separate site to verify the content. This lets readers see the entire workflow that went into creating the image, including images that were added to it, any edits that were made and how the image was originally captured.
It’s worth noting that, while this metadata can and does carry attribution, it is intended as a tool for checking authenticity, not ownership. Users have the ability to remain anonymous.
The entire process is achieved by a series of open-source tools that the CAI provides. These tools are capable of writing, inspecting and displaying the embedded metadata, making it easy to integrate the technology into devices, applications and websites.
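The workflow above can be sketched in a few lines of Python. To be clear, this is not the CAI's actual API or file format; it is a minimal conceptual illustration of the underlying idea, in which a manifest binds a cryptographic hash of the image bytes to a claimed creator and edit history, and a verifier recomputes the hash to detect tampering. (The real system also cryptographically signs these claims, which is omitted here.)

```python
import hashlib


def image_hash(data: bytes) -> str:
    """Return a SHA-256 digest of the raw image bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(image_bytes: bytes, creator: str, edits: list) -> dict:
    """Assemble a provenance manifest binding the image hash to its
    claimed capture and edit history. Field names are illustrative."""
    return {
        "asset_hash": image_hash(image_bytes),
        "creator": creator,      # attribution is optional; creators may stay anonymous
        "edit_history": edits,   # e.g. [{"tool": "Photoshop", "action": "crop"}]
    }


def verify_manifest(image_bytes: bytes, manifest: dict) -> bool:
    """A verifier recomputes the hash and compares it with the claim.
    Any byte-level change to the image breaks the match."""
    return manifest["asset_hash"] == image_hash(image_bytes)
```

Because the manifest travels with the image, any downstream viewer can run the same check, which is essentially what the CAI's Verify site does for readers.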
The technology is impressive, and the idea is very interesting. However, it’s important to note that this technology is still very much in beta and that my attempt to use it fell at the first hurdle.
Some Serious Limitations
To try out the tools on something other than the provided sample images, I noted that the New York Times is a member of the CAI and opted to see if it was already using the metadata on its site.
So, I found a current image taken by a New York Times photographer and tried to upload it to the CAI’s Verify site. However, the site only works on JPG and PNG images, and the image from the New York Times site was a WebP.
WebP is an image format that was developed specifically for use on the web and is designed to greatly reduce file sizes while maintaining detail. It is rapidly becoming the de facto standard format for the internet, thanks in large part to a preference by Google, which also developed the standard.
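Telling these formats apart is straightforward, which is part of why the restriction is surprising: each format announces itself in its first few bytes. A small sketch (standard, publicly documented signatures; the WebP check reflects its RIFF container layout):

```python
def sniff_image_format(data: bytes) -> str:
    """Identify common web image formats by their leading 'magic' bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP files are RIFF containers: "RIFF", a 4-byte size, then "WEBP"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "unknown"
```

A site like Verify could use a check along these lines to at least recognize a WebP upload and explain the limitation, rather than simply rejecting the file.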
While the process is still under development and WebP support could be added at a later date, the WebP standard was first released in 2010 and is capable of holding metadata. Including it should have been a priority.
Update 1/24/2023: I heard back from the CAI. According to a spokesperson, the issue with WebP is that it has not yet been adopted as an international standard; it is a standard proprietary to Google. Google has submitted it to the IETF for international standardization and, once it is adopted, they plan on including it.
However, even if that issue is resolved, the CAI approach, as of right now, only deals with images. Of particular importance is that it doesn’t deal with video, which is of increasing importance online.
Though it is easy to see how these concepts could be applied to video, they haven’t been yet, and doing so would likely require partners that aren’t currently in the coalition.
Finally, any such system can only work as well as it is adopted. Users who don’t use the tools provided will simply not have that metadata. While that may make those images less trustworthy, that only helps if readers and viewers regularly check images for the metadata.
And that may be the biggest challenge. This tool only works if people know that it’s available and feel that it’s worth using. The CAI not only has to create a norm for applying the metadata, but a norm for checking it.
The first they can do through their consortium members. They have all the right companies and organizations in place to make this a standard practice for journalists and creatives. However, creating that norm among readers and viewers will be a very different process.
To that end, their tools for displaying the information in-site may be their best weapon. While their Verify site is a good showcase for the technology, it’s unlikely users will routinely leave a site to check the image’s history.
Still, it’s going to be an uphill battle, but it is one that they seem to be well positioned to fight.
What I find most impressive about the CAI is the group they’ve amassed. Scrolling through the list you will find schools, such as the University of Helsinki, non-repudiation services like Safe Creative, and trade groups such as the European Journalism Centre.
All of these entities are mixed with a large list of tech companies and news organizations all over the world. While there are names that I would like to see added to that list, it’s clear that they’ve done a lot of work on outreach and building this coalition.
And that is ultimately their greatest asset. The technology itself is challenging, but not nearly as hard as creating the standard of practice. That takes implementation and policy changes, something that can be very difficult to bring about, especially with large organizations.
The CAI, quite wisely, seems to be addressing that issue either before or at the same time they are addressing the technical ones.
That gives this approach at least some chance of seeing broad use, if they can solve the other challenges that still remain.