Are automatic captions accurate enough? What is the standard?

The potential of automatic captions is exciting, and in just the past month major companies have released new captioning features that promise a lot of great functionality in the future. But how will we know when they are good enough to rely on?

The first part of the answer is to understand what the law requires. The federal accessibility standards now all refer to the WCAG standards as the measure of what accessibility looks like, and WCAG requires that all time-based media (video and audio) provide captions or, failing that, a transcript.

WCAG doesn’t specify how accurate captions must be, and it offers little guidance on what good captions look like. The FCC, however, does have such standards, and they now apply to online video. The FCC’s standards call for accurate captions but don’t establish a percentage of accuracy that is good enough, so any inaccuracy is too much. The standards also call for punctuation and the identification of speakers, neither of which is really possible for automatic captioning solutions at this time.

We live in the real world, and mistakes are going to happen in captions, especially in live captions where there’s no time to edit and correct them. That reality does not make inaccurate captions acceptable.

Since we need to be able to demonstrate that we’re doing everything we can to provide accurate captions, choosing an option that costs substantially less and is less accurate (like a computer-only captioning solution) over a more expensive and more accurate one (like a human captioning system) puts your institution at legal risk.

So, until the computer-only captioning solutions are as accurate as human-generated captions, the answer to this question has to be that we must use human-generated or human-corrected solutions.

That may mean one of two things: either paying for a service that produces human-corrected captions, or training your content developers to start from computer-generated captions and edit them until they meet accuracy standards. This does mean extra work for those content producers, but in many cases that’s the only fiscally reasonable way forward.
