Generative AI for User Safety

Generative AI for user safety is the field of research that focuses on using generative AI (Gen-AI) to keep users safe both online and in the real world. Although machine learning techniques have long been used for this purpose, generative AI presents new opportunities.

Safety Across Domains

User safety matters in both the digital and the physical world. This section describes how generative AI can support user safety in each of these domains.

Physical Domain

Gen-AI powers chatbots that provide rapid, targeted information, emotional support, and preparedness guidance during emergencies [citation]. It also improves accessibility for visually impaired people by providing concise audio descriptions of real-time video feeds, enabling safer navigation.

Gen-AI can also process recorded videos and live feeds in real time and provide actionable intelligence to law enforcement agencies.

Although research on using Gen-AI for counterfeit detection is limited, Generative Adversarial Networks (GANs) are being used to identify counterfeit goods. The generative component mimics the counterfeiter, while the discriminator acts as a detective, identifying fraudulent outputs.
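The counterfeiter-and-detective intuition corresponds to the standard GAN minimax objective. It is shown here as a reference formulation, not as the setup of any particular counterfeit-detection system. G is the generator, D is the discriminator, x is a real sample drawn from the data distribution, and z is a noise input.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator is trained to score real items high and generated items low, while the generator is trained to make that task harder. In a counterfeit-detection setting, the trained discriminator is the part that would serve as the detector.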

Gen-AI detects and flags cyberbullying instances, analyzes patterns to identify predatory behaviors, and augments training data to improve existing classifiers for hate speech detection.

Digital Realm

Large language models excel in various NLP tasks, including text classification, entity recognition, and sentiment analysis, making them effective in detecting hate speech, spam, fake news, and fake reviews.
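As a minimal sketch of how a language model can be used as a text classifier, the example below builds a prompt and validates the reply. The label set, the prompt wording, and the `complete` callable are all assumptions made for illustration; `complete` stands in for a call to any LLM, and `fake_llm` is a stub so the sketch runs on its own.

```python
from typing import Callable

LABELS = ("hate_speech", "spam", "benign")  # illustrative label set (assumption)


def build_prompt(text: str) -> str:
    """Build a classification prompt that asks for exactly one label."""
    options = ", ".join(LABELS)
    return (
        "Classify the following user message into exactly one of these "
        f"categories: {options}.\n"
        f"Message: {text!r}\n"
        "Answer with the category name only."
    )


def classify(text: str, complete: Callable[[str], str]) -> str:
    """Return a label from LABELS, or 'unknown' if the reply is unusable.

    `complete` is a hypothetical stand-in for a call to any LLM.
    """
    reply = complete(build_prompt(text)).strip().lower()
    return reply if reply in LABELS else "unknown"


def fake_llm(prompt: str) -> str:
    """Stub model so the sketch runs without any external service."""
    return "Spam" if "free prize" in prompt.lower() else "benign"


label = classify("Claim your FREE PRIZE now", fake_llm)
```

Validating the reply against a fixed label set keeps free-form model output from leaking into downstream moderation logic.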

Gen-AI techniques are used to detect harmful images, deepfakes, and sensitive media. They leverage vision-language models to capture complex relationships between visual elements and connect visual and textual information for accurate identification.

Gen-AI helps analyze video content to detect harmful behavior, deepfakes, and copyright violations. It processes individual frames as images, converts audio to text, and employs machine learning models such as Bi-LSTMs and CNNs for analysis.
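The frame-plus-transcript approach can be sketched as follows, assuming frames have already been extracted and the audio has already been transcribed. The names `moderate_video`, `score_frame`, and `score_text` are hypothetical. The two scorers stand in for whatever image and text models a real system would use, and the stubs below exist only to make the sketch runnable.

```python
from typing import Callable, Iterable


def moderate_video(
    frames: Iterable[bytes],
    transcript: str,
    score_frame: Callable[[bytes], float],
    score_text: Callable[[str], float],
    threshold: float = 0.5,
) -> dict:
    """Flag a video if any frame or its transcript scores at or above threshold."""
    frame_scores = [score_frame(frame) for frame in frames]
    max_frame = max(frame_scores, default=0.0)  # 0.0 when there are no frames
    text_score = score_text(transcript)
    return {
        "max_frame_score": max_frame,
        "text_score": text_score,
        "flagged": max(max_frame, text_score) >= threshold,
    }


# Stub scorers (assumptions): real systems would call trained models here.
def stub_frame_scorer(frame: bytes) -> float:
    return 0.9 if frame == b"bad" else 0.1


def stub_text_scorer(text: str) -> float:
    return 0.8 if "threat" in text else 0.0


result = moderate_video([b"ok", b"bad"], "hello", stub_frame_scorer, stub_text_scorer)
```

Taking the maximum over frames means a single harmful frame is enough to flag the video. That is one possible design choice, not the only reasonable aggregation.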

Advanced models like Conformer improve speech recognition, enabling the detection of harmful audio content.

Gen-AI effectively detects malicious code by de-obfuscating scripts, identifying threats, and understanding minified code. It also assists in determining the maturity rating of apps to protect children from age-inappropriate content.
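To illustrate what de-obfuscation means in practice, the sketch below uses a simple rule-based stand-in rather than a generative model: it decodes base64 literals wrapped in `atob(...)` before scanning the script. The patterns and the function names `peel_base64` and `looks_suspicious` are assumptions chosen for illustration, not part of any real detector.

```python
import base64
import binascii
import re

# Illustrative patterns only (assumption); real detectors use far richer signals.
SUSPICIOUS = re.compile(r"eval\(|document\.cookie|powershell", re.IGNORECASE)
B64_LITERAL = re.compile(r"atob\(['\"]([A-Za-z0-9+/=]+)['\"]\)")


def peel_base64(script: str, max_layers: int = 5) -> str:
    """Replace atob('...') literals with their decoded text, one per pass."""
    for _ in range(max_layers):
        match = B64_LITERAL.search(script)
        if match is None:
            break
        try:
            decoded = base64.b64decode(match.group(1), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            break  # not valid base64 text; leave the script as it is
        script = script[: match.start()] + decoded + script[match.end():]
    return script


def looks_suspicious(script: str) -> bool:
    """Scan the de-obfuscated script for the illustrative patterns."""
    return SUSPICIOUS.search(peel_base64(script)) is not None


payload = base64.b64encode(b"document.cookie").decode("ascii")
obfuscated = f"send(atob('{payload}'))"
```

Scanning `obfuscated` directly finds nothing, while scanning the peeled version exposes the hidden `document.cookie` access. A generative model would be expected to handle obfuscation schemes that fixed rules like these do not anticipate.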

Adversarial Gen-AI

Bad actors are also likely to use Gen-AI to run their scams. This section analyzes how that might happen.

Gen-AI enables attacks at scale

Many phishing attacks rely on written emails or text messages. Today, this typically means sending the same message to many people. Gen-AI, however, enables attackers to write a wide variety of messages in different languages and with different wording. Such variation can also help the messages evade spam filters.
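From a defender's point of view, the sketch below shows why message variation weakens filters that rely on spotting repeated text. `ExactMatchFilter` is a hypothetical toy, not a model of any real spam filter, and the sample messages are invented for illustration.

```python
import hashlib
from collections import Counter


def fingerprint(message: str) -> str:
    """Hash of the normalized message; identical texts share a fingerprint."""
    normalized = " ".join(message.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ExactMatchFilter:
    """Toy filter: blocks a message once the same text was seen `limit` times."""

    def __init__(self, limit: int = 2) -> None:
        self.limit = limit
        self.counts = Counter()

    def allow(self, message: str) -> bool:
        key = fingerprint(message)
        self.counts[key] += 1
        return self.counts[key] <= self.limit


same = ["Your account is locked. Click here."] * 4
varied = [
    "Your account is locked. Click here.",
    "We have suspended your account, follow this link.",
    "Account access paused: verify via the link below.",
    "Su cuenta ha sido bloqueada. Haga clic aquí.",
]

# Use a fresh filter for each campaign so the counts do not mix.
filter_same = ExactMatchFilter(limit=2)
blocked_same = sum(not filter_same.allow(m) for m in same)

filter_varied = ExactMatchFilter(limit=2)
blocked_varied = sum(not filter_varied.allow(m) for m in varied)
```

With these inputs the repeated campaign has its third and fourth copies blocked, while none of the varied messages are. Defenses against generated variation therefore need signals that do not depend on exact repetition.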

Many of these attacks involve exchanging emails with the victims. Gen-AI excels at such interactive communication and reduces the cost of running a scamming operation.

Gen-AI enables greater personalization

Bad actors choose their targets based on information they hold about those victims. The same information can be used to personalize phishing messages, which may make the attacks more effective.

Furthermore, bad actors can use reinforcement learning to train Gen-AI-based systems to write messages to victims. The data needed for such training is available only to the bad actors.

Conclusion

Gen-AI technologies have the potential to enhance user safety through various mechanisms, including chatbots, video processing, and multimodal analysis. On the other hand, Gen-AI might also open up new possibilities for attacks and make such attacks easier and cheaper to carry out. The research community needs to stay one step ahead.