At the recent .debug 2024 event, I gave a presentation on privacy and data security in the era of AI tools. This followed my earlier talk on creating your own GPT at the Bug Future Show a few months prior. The excitement surrounding AI has only grown since then.

While Artificial Intelligence is becoming more popular and offers significant advantages, especially for daily operational tasks, it also raises concerns about the security and privacy of our data.

What should we be aware of when dealing with AI models?

  1. Privacy Breaches – Who are you sharing your personal information with? Training AI requires vast amounts of data, often including private information. The more data used, the more accurate and effective the model. However, large models frequently conceal the data sources used for training. For example, shortly after ChatGPT launched, Samsung ran into significant trouble because employees pasted confidential material into it at work, meaning that data could be incorporated into the model and potentially exposed to other users.

    Many of us use Slack, but did you know that by default, Slack uses your messages for model training? The only way to opt out is to email Slack and request that your data not be used. This highlights crucial concerns about data collection practices and the need for responsible handling of personal and sensitive information.

  2. Discrimination and Bias in Models – Did you check your facts? AI models can reflect biases present in their training data, resulting in discriminatory outcomes. For instance, Google Gemini faced backlash when users who requested a picture of the Pope received images of a Black or female Pope. Similarly, asking for a picture of a German soldier from 1945 yielded images of Black and Asian soldiers. Although these examples might seem amusing, the issue is serious. Biased outputs can mislead users who rely on chatbots for information and take it as fact without verification.

    Developers possess the technical skills to create these models, but addressing bias is an ethical issue. Who bears the responsibility to tackle this?
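Part of that responsibility can at least be operationalized. One crude first-pass audit is to compare outcome rates across sensitive groups in the training data before a model ever sees it. A minimal sketch, using hypothetical records with a made-up `gender` attribute purely for illustration:

```python
from collections import Counter

# Hypothetical training records with a sensitive attribute (illustrative only)
records = [
    {"label": "approved", "gender": "female"},
    {"label": "approved", "gender": "male"},
    {"label": "rejected", "gender": "female"},
    {"label": "approved", "gender": "male"},
    {"label": "rejected", "gender": "female"},
    {"label": "approved", "gender": "male"},
]

def approval_rate_by_group(rows, attribute):
    """Approval rate per group -- a crude first-pass bias audit."""
    totals, approved = Counter(), Counter()
    for row in rows:
        group = row[attribute]
        totals[group] += 1
        approved[group] += row["label"] == "approved"
    return {g: approved[g] / totals[g] for g in totals}

for group, rate in approval_rate_by_group(records, "gender").items():
    print(f"{group}: {rate:.2f}")  # female: 0.33, male: 1.00
```

A gap that large in the data would almost certainly surface in the model's behavior, so catching it before training is far cheaper than explaining it afterwards. Real audits, of course, go much deeper than a single rate comparison.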

  3. Data Manipulation – Have you heard about the Nazi chatbot? The quality of an AI model is directly tied to the quality of its training data, which can be manipulated. A notable case was Microsoft's "Tay" chatbot, which users manipulated into making offensive statements within 24 hours of its launch. More recently, Reddit users deliberately posted offensive and inaccurate content after it was revealed that Reddit data would be used for AI training.

    The principle of "garbage in, garbage out" (GIGO) is crucial here. In healthcare, poorly labeled or unlabeled diagnostic data can lead AI models to incorrect conclusions or diagnoses, underscoring the critical need for accurate data practices.
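The GIGO effect is easy to demonstrate: train even a trivial classifier on corrupted labels and its "diagnoses" degrade accordingly. A minimal sketch, using a toy 1-nearest-neighbour model on entirely made-up data:

```python
import random

def predict(train, point):
    """1-nearest-neighbour: return the label of the closest training example."""
    return min(train, key=lambda ex: (ex[0] - point[0]) ** 2 + (ex[1] - point[1]) ** 2)[2]

def accuracy(train, test):
    return sum(predict(train, (x, y)) == label for x, y, label in test) / len(test)

# Toy "diagnostic" records: two measurements and a ground-truth label
clean = [(x, y, int(x + y > 10)) for x in range(10) for y in range(10)]

# Garbage in: flip 40% of the training labels at random
random.seed(0)
noisy = [(x, y, (1 - label) if random.random() < 0.4 else label)
         for x, y, label in clean]

print(f"trained on clean labels: {accuracy(clean, clean):.2f}")  # 1.00
print(f"trained on noisy labels: {accuracy(noisy, clean):.2f}")  # noticeably lower
```

The model itself never changed; only the labels did. A production medical model would be more robust than this toy, but no amount of architecture compensates for systematically mislabeled training data.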

  4. Data Leaks and Data Breaches – Is your company prepared for AI technologies? Despite increased security budgets for AI systems, with 94% of IT leaders allocating funds for 2024, 77% of companies have still experienced AI-related security breaches. Even with significant investments, only 61% of IT leaders believe their budgets are sufficient to prevent potential cyberattacks.

  5. Deepfakes – Is the Nigerian prince trying to contact you? While not directly impacting AI development, deepfakes pose a growing problem for technology and internet users. The proliferation of fake news and false content will make it increasingly difficult to find accurate information. Moreover, scammers, such as the infamous "Nigerian prince," will likely become more convincing. The potential for voice and video manipulation is particularly concerning, as it could lead to significant misinformation and fraud.

    One striking example is the podcast between Joe Rogan and Steve Jobs that never occurred. AI models learned from available material about both individuals and, using advanced text-to-speech models, generated a realistic conversation that sounds like a genuine dialogue.

Although AI poses numerous dangers, there are also plenty of ways to defend ourselves, such as carefully curating the data used for model training, applying strong encryption, and enforcing access control.
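One simple, concrete defense is to redact sensitive data before a prompt ever leaves your systems. A minimal sketch, assuming illustrative regex patterns and a hypothetical `redact` helper (real deployments would need far more robust PII detection):

```python
import re

# Illustrative patterns only -- extend for phone numbers, national IDs, etc.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def redact(text: str) -> str:
    """Replace recognised sensitive values with placeholder tokens."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name}]", text)
    return text

prompt = "Summarise the complaint from ana.horvat@example.com about account HR1210010051863000160."
print(redact(prompt))
# Summarise the complaint from [EMAIL] about account [IBAN].
```

Combined with access control on who may call external models at all, this keeps the useful part of a prompt while the identifying details never reach the provider.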

It's difficult to say what awaits us in the future. Every so often, a newer and better model with incredible capabilities emerges, and we joke internally that ChatGPT 5 will probably have "out of the box" solutions for our clients.

However, that doesn't mean we will be out of work. On the contrary, if we keep up with the trends and understand how these systems work, we will create additional value for our clients.
