Mitigating deep fakes and phishing in meetings

Here’s a quick project idea. It’s meant to provide deep-fake + phishing resistance without introducing “big bang changes” to how you operate, so you can retrofit it into an existing legacy setup.

The setting is this:

You run security-sensitive meetings with a bunch of participants, say from a laptop over Google Meet / Zoom / whatever.
You are remote/distributed.
You are concerned about a participant getting pwned and replaced by a deep fake.
You can’t change everything about how you operate. You are stuck with Google Meet / Zoom / whatever.

The problem:

You are concerned that some participant gets MITM’d or phished and is replaced by a deep fake.

Solution in a nutshell:

Each participant has an iPhone, separate from their corporate laptop.
The iPhones live-transcribe what you say and what others say through the Google Meet call into text.
They encrypt + authenticate this text and send it to a “group chat”.
At the same time, each iPhone monitors this group chat and checks whether what the other iPhones are saying there actually matches its own transcription.
If they are consistent, the iPhone’s screen is green. If they are not, a deep fake is detected, and the screen should turn red and complain loudly.
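
To make the “authenticate and send” step a bit more concrete, here is a minimal sketch in Python (picked only for brevity; the real thing would run on the iPhone). It rests on several assumptions: it uses HMAC-SHA256 with a shared group key from the standard library as a stand-in, whereas a real build would more likely use per-device signing keys, and it leaves out encryption entirely. The function names and the sender/seq/text fields are made up for illustration.

```python
import hashlib
import hmac
import json


def sign_segment(key: bytes, sender: str, seq: int, text: str) -> dict:
    """Wrap one transcript segment with an authentication tag.

    Assumption: `key` is a shared group key. This only authenticates;
    it does not encrypt the text.
    """
    payload = json.dumps(
        {"sender": sender, "seq": seq, "text": text},
        sort_keys=True,
        separators=(",", ":"),
    )
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_segment(key: bytes, message: dict):
    """Return the decoded segment if the tag checks out, else None."""
    payload = message["payload"].encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how many characters matched.
    if not hmac.compare_digest(expected, message["tag"]):
        return None
    return json.loads(payload)


group_key = b"k" * 32  # placeholder key, for the example only
msg = sign_segment(group_key, "alice-phone", 7, "move the funds on friday")
segment = verify_segment(group_key, msg)
```

A phone would post the signed message to the group chat and run the verify step on everything it reads from there, treating a failed check as a red-screen event. The seq field is there so that a receiver could also notice replayed or missing segments, though that bookkeeping is not shown.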

Product notes:

This is invisible when everything goes right.
This is loud when security issues are detected.

Implementation details:

Preferably, this is a personal iPhone, separate from corporate infrastructure.
All transcription + diarization can happen locally, say with OpenAI’s Whisper models for the transcription (Whisper does not diarize by itself, so speaker labels need a separate step).
iPhones can use the Secure Enclave for storing keys.
iPhones can communicate out of band, i.e. without going through the laptop or the meeting platform.
The transcription match won’t be perfect, so we need some sort of “approximate match”.
Of course, this assumes a remote-only attacker who can’t get hold of these iPhones.
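
As a rough illustration of what an “approximate match” could look like, here is a small Python sketch. It is one possible approach, not a tuned design: it normalizes both transcripts to lowercase word lists and compares them with the standard library’s difflib.SequenceMatcher. The function names and the 0.8 threshold are placeholders that would need calibrating against real transcription noise.

```python
import difflib
import re


def normalize(text: str) -> list:
    """Lowercase and split into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1] between two transcripts."""
    matcher = difflib.SequenceMatcher(
        None, normalize(a), normalize(b), autojunk=False
    )
    return matcher.ratio()


def screen_color(own: str, peers: list, threshold: float = 0.8) -> str:
    """Green if every peer transcript roughly matches ours, else red.

    Assumption: 0.8 is an arbitrary placeholder threshold.
    """
    for peer_text in peers:
        if similarity(own, peer_text) < threshold:
            return "red"
    return "green"


own = "Please approve the transfer to the new vendor account"
close = "please approve the transfer to a new vendor account."
different = "send the money to a different bank today"

status_ok = screen_color(own, [close])
status_bad = screen_color(own, [close, different])
```

Here a one-word transcription slip stays above the threshold, while a transcript with different content falls below it. A word-level ratio like this says nothing about whether the differing words matter (for example an amount or an account name), so a real matcher would probably need to weigh those more heavily.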