Retrieving Concealed Information by Assaulting LLMs.

Mirage Insights
3 min readJul 29, 2024

--

Knowing the importance of keeping privacy, hidden data, and important sensitive information safe and secure in today’s cyber world is just like keeping your money safe at home! Isn’t it so? Always sharing your personal data anywhere, everywhere is really harmful and dangerous. It takes up a lot of courage to do so, if you know how the cyber world is acting up in the current era. Without knowing the facts that how it may effect your personal life, you should not be sharing any personal information so quickly to anyone just anywhere.

LLMs are chaotic!

The way LLMs are helpful, the same way they are chaotic as well. Well known that LLMs are skilled in a variety of natural language processing (NLP) activities and have been trained on enormous volumes of text data. Mentioning about LLMs as chaotic is because they are easily prone to threats, attacks, and prompt injections. The huge disaster can be caused by online LLM assaults as these attacks can:

  1. obtain the information that is accessible to the LLMs.
  2. utilize API to cause detrimental activities.
  3. incite assaults against other individuals.

Types of Attacks On LLMs:

  1. Adversarial Prompts.
  2. Prompt Injections.
  3. Data Poisoning Attacks.
  4. Injection Attack Methods.

There are several ways in which the LLMs are attacked in order to steal sensitive information and can cause harm to other important data as well leading to a lot of loss to any organizations and individuals as well.

What way Attacks on LLMs are affecting?

As mentioned above, there are several kind of attacks which can be intended on LLMs and they can have a huge impact as well. Impact can be based on type of attack performed. So, discussing each attack and its impact as well.

  1. Adversarial Prompts: In large language models (LLMs), adversarial prompts are inputs designed to take advantage of biases or vulnerabilities in the model, which frequently results in the model producing hurtful, illogical, or inaccurate responses. Adversarial Prompt attacks impact the LLMs in various ways affecting the overall security, reliability, and trustworthiness. Impact of this attack on the LLMs are in such a way:
  • Model performance deterioration
  • Encroaching on Safety and Ethics
  • Risks to Security
  • Operational Difficulties

and these ways Adversarial Prompt attack impacts and affects the LLMs. There are some examples of the Adversarial Prompts which are mentioned below:

  • Non-Sensical Queries.
  • Bias Inducing Prompts.
  • Factually Incorrect Statement.
  • Malicious Instructions.

Some of the ways in which adversarial prompts attacks LLMs:

  • Evasion Attack.
  • Poisoning Attack.
  • Backdoor Attack.
  • Prompt Injection Attack.
  • Bias Exploitation Attack.
  • Context Manipulation Attack.

2. Prompt Injection Attacks: This manipulation is meant to take advantage of the model’s ability to grasp language and force it to produce outputs or operate in ways that deviate from its original purpose. Prompt injection attacks are basically known to be the “Calculated Manipulation”. This attack is of two types: Direct and Indirect Injection.

  • Direct Injection: This injection method basically works in order to directing the LLMs like MITM (Man-in-the-Middle Attack). This attack usually amends the data which is being transferred from one point to another as similar to the MITM attack does.
  • Indirect Injection: This injection method actually converts the data which the model has an access to. This means the data which is saved in models and the model can easily access because of the connection of the model with it is the Indirect Injection attack.

3. Data Poisoning Attack: This is kind of adversarial attack purposefully modifies the training set in order to undermine the performance of the final model. This attack can be harmful to the foundation of the machine learning as these are already the trained models. This type of attack can also create the backdoors for the attackers easily in order to harm the model.

As it is known with the vast usage of LLMs, there are higher chance of models being triggered with attacks and are very prone to them. This can be lot more trouble to the Security Experts so as to prevent and mitigate those attacks.

These attacks should be mitigated as they can led to data loss, Personal Data breach, Financial Breach, and other different breaches as well. This is one of the major concern nowadays which should be taken care of.

--

--

Mirage Insights

A passionate cybersecurity enthusiast dedicated to safeguarding digital landscapes. With a keen interest in ethical hacking, network security, and data safety.