The integration of Artificial Intelligence (AI) in endpoint security has revolutionized the way organizations protect their devices and data.

Ok, let’s take a break here: have you read the article about Artificial Intelligence vs. Machine Learning ?

By leveraging AI and machine learning models that analyze user behavior on devices, organizations can detect anomalies and potential security threats more effectively.

However, this advanced approach to endpoint security raises significant privacy concerns, as it necessitates the collection of user activity data, sometimes in real time.

One thing needs to be clear: if you want to do anomaly detection, you need to train your ML model with what “normal” is first – this is called “baseline”. And this means that data needs to be collected from the user.

Now the question remains, how can we reduce the privacy concerns?

This short article explores the privacy challenges I think are associated with using AI models that require user data(behavior), discusses potential solutions, and suggests ways to deploy AI on devices while minimizing privacy concerns.

What are the privacy concerns when data is collected for training an ML model?

Data Collection and Usage

Collecting user data for AI-driven endpoint security involves monitoring and logging user activities on devices.

This process includes:

capturing information about the applications used (URLs accessed, CPU usage, memory usage),
websites visited and items clicked
files accessed
applications installed
applications started
time of login, logout, inactivity
webcam usage
microphone usage
biometrics

This data is essential for creating baselines of normal behavior and identifying deviations that might indicate security threats.

This extensive data collection raises concerns about user privacy, as it creates a comprehensive profile of a user’s digital activities.

AI-based endpoint security solutions can infer or predict sensitive information from non-sensitive forms of data, such as user preferences, interests, or behaviors.

This can enable the systems to provide personalized or customized services or recommendations, but it can also violate the privacy or autonomy of the users or the owners of the devices or networks.

For example, someone’s keyboard typing patterns can be analyzed to deduce their emotional state, which includes emotions such as nervousness, confidence, sadness or anxiety

Data Security

Safeguarding the collected user data is critical, as it contains sensitive information about an individual’s online behavior.

The risk of data breaches or unauthorized access to this information poses a significant privacy threat.

Where is this data stored, how long, how is it stored, who has access to it, how is it going to be used/processed and by who, are just a few questions that need to be asked.

GDPR has made clear which are the responsibilities of the controller and processor(s) of the data.

Transparency and Consent

A good user experience of a security product means that users will be as unaware as possible that their activity data is being collected for security purposes.

Ensuring transparency and obtaining explicit user consent for data collection is critical. Without clear communication, users may feel their privacy is being violated.

Data Retention

Storing user data indefinitely can compound privacy concerns. Organizations should establish clear data retention policies, specifying how long the data will be retained and under what circumstances it will be deleted.

User Profiling and Discrimination

The detailed user activity data collected for AI analysis can lead to user profiling, which may be used for purposes beyond cybersecurity, such as targeted advertising.

AI-based endpoint security solutions can make automated decisions or recommendations based on the data they analyze, such as blocking access, flagging anomalies, or prioritizing alerts.

Discriminatory decisions and practices can arise from the insights drawn from user behavior data. However, these decisions or recommendations can be discriminatory, unfair, inaccurate, or biased, if the data or the algorithms are flawed, incomplete, or skewed.

For example, people can be misclassified, misidentified, or judged negatively, and such errors or biases may disproportionately affect certain demographics.

Solutions to address privacy concerns

The solutions to address these concerns are actually not new, they are covered pretty good by the GDPR and other privacy laws world-wide.

They are :

Data Minimization

Organizations should adopt a data minimization approach, collecting only the data necessary for security purposes. This is definitely not as easy as it sounds.

In Security, you usually collect as much as possible, because the more you know about your target, the better it is for the ML model (better detection, less false positives).

However, the Compliance dept. should be involved from the early stages of developing the product in order to control what is being collected.

Anonymization

Anonymizing user data can be a privacy-enhancing technique. By removing personally identifiable information from collected data, the risk of individual users being identified is reduced.

This works good when data is collected from many computers, but when the solution works on a single computer, it usually needs time to “learn” the user’s behavior.

There is nothing anonymous there and this is usually OK, as long as this data is not sent to the backend for further processing and analysis.

Encryption

Encrypting the data collected for AI analysis ensures that even if a breach occurs, the information remains unreadable and inaccessible to unauthorized parties.

When “cleaned up” data needs to be sent, it is mandatory to send it encrypted and keep it at rest encrypted all the time.

Informed consent

Transparently informing users about data collection and obtaining their explicit consent is a fundamental step in addressing privacy concerns.

Users should have the option to opt in or out of data collection at any time. It is mandatory for the ML models to be able to cope without any datasets, because they could disappear at any time.

Data deletion

After the data is no longer needed for security analysis, organizations can ideally erase the data, and if this is not possible, then it should remove any direct or indirect associations with individual users.

Balancing Security and Privacy

Balancing AI-based endpoint security and privacy is essential. Organizations can adopt the following strategies to minimize privacy concerns:

Implement Strong Privacy Policies

Establish comprehensive privacy policies that clearly define data collection, usage, retention, and disposal procedures. These policies should adhere to legal and regulatory requirements for the region where the users reside (GDPR, CPA, etc.).

This can by itself be a challenging task, because no company is willing to block access to potential customers.

Regular risk assessment and impact analysis

Conduct periodic risk assessment and impact analysis to ensure that data collection and analysis practices align with privacy policies and legal requirements and correct any deviations promptly.

The audits should be first performed internally, in order to have time to fix any deviations. If an external audit body finds any irregularity, the company can be fined with large sums of money.

Third-Party Vetting

When using third-party AI solutions, organizations should thoroughly vet the security and privacy practices of these providers.

Ongoing Monitoring

Continuously monitor the effectiveness of privacy protection measures and adjust them as needed to address emerging privacy concerns.

Conclusion

AI-based endpoint security is a powerful tool for protecting devices and data from cyber threats. However, it should not come at the cost of user privacy or well-being.

Organizations must strike a delicate balance by implementing privacy-enhancing measures, obtaining informed consent, and adhering to transparent data collection and usage practices.

PS: The image of the post was generated using DALL-E.

The post Balancing functionality and privacy concerns in AI-based Endpoint Security solutions first appeared on Sorin Mustaca on Cybersecurity.

Downloads

Download slides (PDF)

A year ago, at VB2019 we presented for the first time an overview of how the anti-malware world looks from the perspective of a young company trying to enter the market: how they try to build products, how they try to enter the market, how they try to convert users, and what challenges they face in these activities.

In this new paper we will present an overview of the situation for such a company after one year of experience. We will look at the situation from several angles:

- that of the consulting company helping them to build the product and enter the market

- that of working with certification companies regularly, checking the products for detection and performance

that of working with Microsoft to make the company compliant and keep them compliant

One year later, many still have a hard time understanding that the security market is no longer the Wild Wild West, but we also see that a lot of visible efforts are being made to improve. This means that compliance with ‘clean software’ regulations is becoming an issue. We will present some interesting statistics and compare data from the past with current data. The young companies still have a lot of challenges in understanding that implementing AV software is not the same as implementing any other type of software. Despite the fact that they still get flagged by the established products for various reasons, there are still more and more companies trying to enter the market.

A lot of people in the audience will ask themselves ‘why would anyone want to enter the market, since the market is overcrowded, there are plenty of free products out there, and on Windows there is also Microsoft Defender?’. We will try to provide an answer to this question, but the answer is not what many think it is. Or, maybe it is …

Posts

Balancing functionality and privacy concerns in AI-based Endpoint Security solutions

What are the privacy concerns when data is collected for training an ML model?

Solutions to address privacy concerns

Balancing Security and Privacy

Conclusion

One year later: challenges for young anti-malware products today (presentation Virus Bulletin 2020)

Downloads

Video

Endpoint Cybersecurity GmbH

Tag Archive for: Endpoint

Posts

What are the privacy concerns when data is collected for training an ML model?

Solutions to address privacy concerns

Balancing Security and Privacy

Conclusion

Downloads

Video

Endpoint Cybersecurity GmbH