Posts

Balancing functionality and privacy concerns in AI-based Endpoint Security solutions

The integration of Artificial Intelligence (AI) in endpoint security has revolutionized the way organizations protect their devices and data.

Ok, let’s take a break here: have you read the article about Artificial Intelligence vs. Machine Learning ?

 

By leveraging AI and machine learning models that analyze user behavior on devices, organizations can detect anomalies and potential security threats more effectively.

However, this advanced approach to endpoint security raises significant privacy concerns, as it necessitates the collection of user activity data, sometimes in real time.

One thing needs to be clear: if you want to do anomaly detection, you need to train your ML model with what “normal” is first – this is called “baseline”. And this means that data needs to be collected from the user.

Now the question remains, how can we reduce the privacy concerns?

This short article explores the privacy challenges I think are associated with using AI models that require user data(behavior), discusses potential solutions, and suggests ways to deploy AI on devices while minimizing privacy concerns.

What are the privacy concerns when data is collected for training an ML model?

Data Collection and Usage


Collecting user data for AI-driven endpoint security involves monitoring and logging user activities on devices.

This process includes:

  • capturing information about the applications used (URLs accessed, CPU usage, memory usage),
  • websites visited and items clicked
  • files accessed
  • applications installed
  • applications started
  • time of login, logout, inactivity
  • webcam usage
  • microphone usage
  • biometrics

This data is essential for creating baselines of normal behavior and identifying deviations that might indicate security threats.

This extensive data collection raises concerns about user privacy, as it creates a comprehensive profile of a user’s digital activities.

AI-based endpoint security solutions can infer or predict sensitive information from non-sensitive forms of data, such as user preferences, interests, or behaviors.

This can enable the systems to provide personalized or customized services or recommendations, but it can also violate the privacy or autonomy of the users or the owners of the devices or networks.

For example, someone’s keyboard typing patterns can be analyzed to deduce their emotional state, which includes emotions such as nervousness, confidence, sadness or anxiety

 

Data Security

Safeguarding the collected user data is critical, as it contains sensitive information about an individual’s online behavior.

The risk of data breaches or unauthorized access to this information poses a significant privacy threat.

Where is this data stored, how long, how is it stored, who has access to it, how is it going to be used/processed and by who, are just a few questions that need to be asked.

GDPR has made clear which are the responsibilities of the controller and processor(s) of the data.

 

Transparency and Consent

A good user experience of a security product means that users will be as unaware as possible that their activity data is being collected for security purposes.

Ensuring transparency and obtaining explicit user consent for data collection is critical. Without clear communication, users may feel their privacy is being violated.

 

Data Retention

Storing user data indefinitely can compound privacy concerns. Organizations should establish clear data retention policies, specifying how long the data will be retained and under what circumstances it will be deleted.

 

User Profiling and Discrimination

The detailed user activity data collected for AI analysis can lead to user profiling, which may be used for purposes beyond cybersecurity, such as targeted advertising.

AI-based endpoint security solutions can make automated decisions or recommendations based on the data they analyze, such as blocking access, flagging anomalies, or prioritizing alerts.

Discriminatory decisions and practices can arise from the insights drawn from user behavior data. However, these decisions or recommendations can be discriminatory, unfair, inaccurate, or biased, if the data or the algorithms are flawed, incomplete, or skewed.

For example, people can be misclassified, misidentified, or judged negatively, and such errors or biases may disproportionately affect certain demographics.

 

Solutions to address privacy concerns

The solutions to address these concerns are actually not new, they are covered pretty good by the GDPR and other privacy laws world-wide.

They are :

Data Minimization

Organizations should adopt a data minimization approach, collecting only the data necessary for security purposes.  This is definitely not as easy as it sounds.

In Security, you usually collect as much as possible, because the more you know about your target, the better it is for the ML model (better detection, less false positives).

However, the Compliance dept. should be involved from the early stages of developing the product in order to control what is being collected.

 

Anonymization

Anonymizing user data can be a privacy-enhancing technique. By removing personally identifiable information from collected data, the risk of individual users being identified is reduced.

This works good when data is collected from many computers, but when the solution works on a single computer, it usually needs time to “learn” the user’s behavior.

There is nothing anonymous there and this is usually OK, as long as this data is not sent to the backend for further processing and analysis.

 

Encryption

Encrypting the data collected for AI analysis ensures that even if a breach occurs, the information remains unreadable and inaccessible to unauthorized parties.

When “cleaned up” data needs to be sent, it is mandatory to send it encrypted and keep it at rest encrypted all the time.

 

Informed consent

Transparently informing users about data collection and obtaining their explicit consent is a fundamental step in addressing privacy concerns.

Users should have the option to opt in or out of data collection at any time. It is mandatory for the ML models to be able to cope without any datasets, because they could disappear at any time.

 

Data deletion

After the data is no longer needed for security analysis, organizations can ideally erase the data, and if this is not possible, then it should remove any direct or indirect associations with individual users.

Balancing Security and Privacy

Balancing AI-based endpoint security and privacy is essential. Organizations can adopt the following strategies to minimize privacy concerns:

  • Implement Strong Privacy Policies

Establish comprehensive privacy policies that clearly define data collection, usage, retention, and disposal procedures. These policies should adhere to legal and regulatory requirements for the region where the users reside (GDPR, CPA, etc.).

This can by itself be a challenging task, because no company is willing to block access to potential customers.

 

  • Regular risk assessment and impact analysis

Conduct periodic risk assessment and impact analysis to ensure that data collection and analysis practices align with privacy policies and legal requirements and correct any deviations promptly.

The audits should be first performed internally, in order to have time to fix any deviations. If an external audit body finds any irregularity, the company can be fined with large sums of money.

 

  • Third-Party Vetting

When using third-party AI solutions, organizations should thoroughly vet the security and privacy practices of these providers.

 

  • Ongoing Monitoring

Continuously monitor the effectiveness of privacy protection measures and adjust them as needed to address emerging privacy concerns.

 

Conclusion

AI-based endpoint security is a powerful tool for protecting devices and data from cyber threats. However, it should not come at the cost of user privacy or well-being.

Organizations must strike a delicate balance by implementing privacy-enhancing measures, obtaining informed consent, and adhering to transparent data collection and usage practices.

 

 

PS: The image of the post was generated using DALL-E.

 

The post Balancing functionality and privacy concerns in AI-based Endpoint Security solutions first appeared on Sorin Mustaca on Cybersecurity.

Thoughts on AI and Cybersecurity

Being an CSSLP gives me access to various emails from (ISC)2. One of these announced me that there is a recording of a webinar about AI and Cybersecurity held by Steve Piper from CyberEdge.

Very nice presentation of 1h, and I found out that there is a sequel to that on November 1st.

So, following Steve’s article, I did some research, read a lot and used ChatGPT to summarize some of my findings.

This article explores the multifaceted ways AI is transforming cybersecurity, from threat detection to incident response and beyond. It also looks into What it means actually to use AI in some of these fields. What is the impact on privacy and confidentiality?

Important to keep in mind that any AI must first learn (trained) in order to be able to understand the system and then potentially predict what is happening.

 

  1. Threat Detection

One of the primary applications of AI in cybersecurity is threat detection. Traditional rule-based systems are no longer sufficient to identify and combat sophisticated attacks.

AI-driven technologies, such as machine learning and deep learning, can analyze massive datasets to detect anomalies and potential threats.

Here’s how:

a. Anomaly Detection: AI algorithms can establish a baseline of normal behavior in a network or system. Any deviation from this baseline can trigger an alert, indicating a potential security breach.

b. Behavioral Analysis: AI can analyze user and entity behavior to detect patterns that may indicate malicious activity. This is particularly useful for identifying insider threats.

c. Malware Detection: AI can scan files and code for patterns consistent with known malware or recognize behavioral patterns of malicious software.

We’ll talk more in the future on this topic.

 

  1. Predictive Analysis

AI-driven predictive analysis enhances cybersecurity by identifying potential threats before they become full-blown attacks.

By crunching vast amounts of historical data, AI systems can predict emerging threats, trends, and vulnerabilities. This early warning system allows organizations to preemptively shore up their defenses.

It would have to gather huge amounts of data, crunch them (preprocess, normalize, structure), creating an ML model and then based on the chosen technology train the system.

Here we can think of supervised (pre-categorized data, requiring feature to be defined) and unsupervised learning (non categorized data, basically being restricted to Anomaly detection).

There is a huge warning here, because :

a) such huge amounts of data has to come from somewhere and

b) predictions can be influenced by specially crafted training data, for unsupervised training models.

 

  1. Automation and Orchestration

AI can automate routine cybersecurity tasks and workflows, reducing the workload on human analysts and minimizing response times. AI-driven systems can:

a. Automatically quarantine infected devices or isolate compromised areas of a network to prevent lateral movement by attackers.

b. Investigate and analyze security incidents, rapidly categorizing and prioritizing alerts.

c. Initiate predefined incident response procedures, such as patching vulnerable systems or resetting compromised user accounts.

 

Automation:

Automation involves the use of technology, such as scripts, workflows, or AI-driven systems, to perform routine and repetitive tasks without human intervention. In the context of cybersecurity, automation can significantly improve efficiency and response times by handling various operational and security-related processes automatically. Here’s how it works:

a. Incident Response: When a security incident is detected, automation can trigger predefined actions to contain, investigate, and mitigate the threat. For example, if a system detects a malware infection, an automated response might involve isolating the affected device from the network, blocking the malicious IP address, and initiating a forensic investigation.

b. Vulnerability Patching: Automation can be used to deploy security patches and updates to systems and software as soon as they are released. This reduces the window of vulnerability and helps prevent attacks that target known vulnerabilities.

c. Log Analysis and Alerts: Automation can continuously monitor logs and events from various systems. It can detect and respond to predefined security events, generating alerts or triggering specific actions when unusual or malicious activity is detected.

 

Orchestration:

Orchestration is a broader concept that focuses on integrating and coordinating various security tools, processes, and workflows into a unified and streamlined system. It enables organizations to create end-to-end security workflows by connecting different security solutions and ensuring they work together cohesively. Here’s how it works:

a. Workflow Integration: Orchestration systems allow the creation of predefined security workflows that link multiple tools, such as firewalls, intrusion detection systems, antivirus software, and incident response platforms. For example, when a malware alert is triggered, orchestration can coordinate the response by isolating the affected system, collecting forensic data, and alerting the incident response team.

b. Information Sharing: Orchestration enables the sharing of critical information among security tools. This ensures that all relevant security solutions have access to the latest threat intelligence, allowing for more effective threat detection and mitigation.

 

  1. Phishing Detection

Phishing attacks remain a prevalent threat. AI can help identify phishing attempts by:

a. Analyzing email content and sender behavior to identify suspicious emails.

b. Scanning URLs for malicious domains or suspicious patterns.

c. Inspecting attachments for known malware signatures.

d. Recognizing social engineering techniques and language used in phishing emails.

 

  1. Network Security

AI-driven intrusion detection systems (IDS) and intrusion prevention systems (IPS) monitor network traffic for anomalies and threats.

They can identify and block malicious traffic in real-time, protecting the network from various attacks, including DDoS attacks and data exfiltration.

 

  1. Threat Intelligence

AI can be used to aggregate and analyze threat intelligence from various sources, including open-source feeds, dark web monitoring, and industry-specific data.

This aggregated intelligence can help security teams stay informed about emerging threats and vulnerabilities.

 

  1. Endpoint Security

AI-driven endpoint security solutions provide real-time protection for individual devices.

They can identify and mitigate threats at the device level, even when the device is not connected to the corporate network. This is especially crucial for remote workers and mobile devices.

This raises another red flag for me: complete monitoring of user’s actions on the device. What happens to the data gathered, is the model trained locally on in the cloud? And many other such concerns.

I will write a dedicated post about AI and Privacy very soon.

The post Thoughts on AI and Cybersecurity first appeared on Sorin Mustaca on Cybersecurity.

ChatGPT and automotive cybersecurity #2/2: TISAX certification

This is the 2nd post about Automotive Cybersecurity.

Since I am working these days on CSMS (based on ISO ECE 21434 and TISAX), part of my companies consulting offer for automotive I thought maybe I check what ChatGPT things about them.

 

First post was about CSMS and ISO 21434 and this one is about TISAX.

 

Query:

Tell me about challenges in TISAX certification 

 

ChatGPT:

“TISAX (Trusted Information Security Assessment Exchange) is a security assessment and certification process for automotive suppliers. The certification is used to evaluate and ensure the information security capabilities of companies in the automotive sector.

There are several challenges associated with TISAX certification:

  • Complex requirements: TISAX has extensive and complex security requirements that can be difficult to understand and implement. Companies may struggle to meet these requirements and may need help from experts to achieve certification.
  • Resource constraints: Implementing the security measures required for TISAX certification can be resource-intensive and time-consuming. This can be especially challenging for small or medium-sized businesses.
  • Technical difficulties: Companies may face technical challenges in implementing the necessary security controls, particularly if their existing systems are not well-suited to the TISAX requirements.
  • Cost: The cost of TISAX certification can be significant, and this may be a challenge for companies with limited budgets.
  • Maintenance: TISAX certification must be maintained, and companies must continually evaluate and improve their security controls to stay certified. This can be a challenge, as the threat landscape is constantly evolving and new risks emerge.

Overall, TISAX certification requires a substantial investment of time, resources, and expertise, but it is considered an important aspect of information security for companies operating in the automotive sector.”

 

So, I have to say that this impressed me a bit …

I have always difficulties explaining to my customers why is it so hard to pass the TISAX audit.

It is all true, and actually it is quite a challenge to get companies to commit to a long term cooperation with a consulting company, which can help them understand the requirements and teach them how to apply them.

 

Conclusion: 

I will get back to ChatGPT from time to time and try to get such short descriptions for various things I need.

The post ChatGPT and automotive cybersecurity #2/2: TISAX certification first appeared on Sorin Mustaca on Cybersecurity.