Saw that - a really cool project! Thanks for sharing! Now the question is: how do you motivate people to train models?
I am thinking of replacing cryptocurrency hashing with model training and inference. A training topic and dataset would get fed into the network - it would naturally need some filtering to keep out junk - and queries would be run against the trained model. Imagine all the crypto mining farms and GPUs switching over to training massive models.
I'd ask myself another set of questions:
- can they extract information from conversations and exfiltrate it in a stealthy or obfuscated enough way so that they won't be noticed or have plausible deniability
- do they have incentives to do so (assuming the absence of liability described above)
- do they have a track record on related topics that makes you confident they wouldn't act that way
My answers being yes - yes - no, the question of 'do they listen in to target the ads they show me' is pretty irrelevant to me. I can't trust them not to, nor reliably check whether they do.
If you address a different question instead, such as 'do they really encrypt reliably to protect your conversations from being snooped on without their authorization', the threat analysis may differ. In that case their incentives are aligned with yours, and they are probably making a faithful effort to protect your/their data.
In the end, I'd estimate the probability of the scenario and how much I value the consequent loss of privacy. Then accept/mitigate/refuse the risk accordingly.
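That probability-times-loss step can be sketched as a back-of-envelope expected-loss calculation. This is just an illustration of the idea; the function name, the loss units, and the accept/refuse thresholds are all made up for the example, not a real methodology:

```python
# Hypothetical risk triage: expected loss = probability * loss,
# then bucket into accept / mitigate / refuse.
# Thresholds and units are arbitrary placeholders.

def risk_decision(probability, loss, accept_below=1.0, refuse_above=10.0):
    """Map an expected loss onto an accept/mitigate/refuse decision."""
    expected_loss = probability * loss
    if expected_loss < accept_below:
        return "accept"
    if expected_loss > refuse_above:
        return "refuse"
    return "mitigate"

# e.g. a 20% chance of an event I value at a loss of 30 (arbitrary units):
# expected loss 6.0 -> worth mitigating, not worth refusing outright.
print(risk_decision(0.2, 30))
```

The interesting part is rarely the arithmetic; it's being honest about the probability estimate and about how much you actually value the privacy you'd lose.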
Currently in a large private bank: KeePass, hardware tokens in physical safes accessed under the supervision of another team, HashiCorp Vault, a few HSMs, and managed key vaults for the cloud workloads.
Paranoid levels of security are relevant in some cases but unnecessary in others. Physical security and organisational processes are also an important complement to technological solutions.