Publications by authors named "Anderson C A Nascimento"

Domain Generation Algorithms (DGAs) are used by malware to generate pseudorandom domain names for establishing communication between infected bots and command-and-control servers. While DGAs can be detected by machine learning (ML) models with great accuracy, offering DGA detection as a service raises privacy concerns, because it requires network administrators to disclose their DNS traffic to the service provider. The main scientific contribution of this paper is the first end-to-end framework for privacy-preserving classification as a service of domain names into DGA (malicious) or non-DGA (benign) domains.
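As a rough illustration of the kind of (non-private) DGA classification such a service performs, the sketch below trains a character-bigram logistic-regression model on a few made-up domain names. The features, model, and toy data are assumptions for illustration only, not the framework proposed in the paper.

```python
# Minimal sketch of DGA vs. benign domain classification, assuming a simple
# character-bigram + logistic regression model (illustrative only; not the
# privacy-preserving protocol from the paper).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, hypothetical training data: benign domains (label 0) and
# pseudorandom DGA-style domains (label 1).
domains = ["google.com", "wikipedia.org", "github.com", "nytimes.com",
           "xjw3qkd0vplz.net", "q8zr1mtvxe2b.com", "lk9fj2hqzv7w.org",
           "p0o3ufzqy6sd.info"]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Character bigrams capture the "random-looking" structure of DGA names.
model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(domains, labels)

print(model.predict(["mail.example.com", "z8qk3vjxw1pl.biz"]))
```

In a privacy-preserving setting, the inputs to such a classifier (the domain names) would never be revealed in the clear to the service provider.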


Background: In biomedical applications, valuable data is often split among owners who cannot openly share it because of privacy regulations and concerns. Training machine learning models on the joint data without violating privacy is a major technological challenge that can be addressed by combining techniques from machine learning and cryptography. When machine learning models are trained collaboratively with the cryptographic technique known as secure multi-party computation (MPC), the price paid for keeping the owners' data private is an increase in computational cost and runtime.
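For context on where that overhead comes from, the sketch below shows additive secret sharing, a basic building block of many secure multi-party computation protocols: each owner's value is split into random shares, and linear operations are performed share-wise without reconstructing the inputs. The prime modulus and three-party setting are assumptions for illustration, not the specific protocol used in the paper.

```python
# Minimal sketch of additive secret sharing over a prime field, a common
# MPC building block (illustrative assumption; not the paper's protocol).
import secrets

P = 2**61 - 1  # prime modulus, chosen here only for illustration

def share(x, n=3):
    """Split secret x into n additive shares that sum to x mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    """Recombine shares to recover the secret."""
    return sum(shares) % P

# Two data owners secret-share their values; parties add shares locally,
# so the sum is computed without any owner revealing its input.
a_shares = share(42)
b_shares = share(100)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 142
```

Non-linear operations (comparisons, activations, multiplications) require interaction between the parties, which is where most of the extra runtime in MPC-based training comes from.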


We show that random oblivious transfer protocols that are statistically secure according to a definition based on a list of information-theoretic properties are also statistically universally composable. That is, they are simulation-based secure against an unbounded adversary, with an unbounded simulator and an unbounded environment machine. Our result implies that several previous oblivious transfer protocols in the literature, proven secure only under weaker, non-composable definitions of security, can in fact be used in arbitrary statistically secure applications without compromising security.
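As a reminder of how a random oblivious transfer (OT) correlation is consumed in applications, the sketch below derandomizes a random OT into a chosen-input 1-out-of-2 OT using the standard masking trick. The trusted-dealer setup and 128-bit messages are assumptions for illustration; this is the generic construction, not any of the specific protocols analyzed in the paper.

```python
# Minimal sketch: turning a random OT correlation into a chosen-input
# 1-out-of-2 OT (standard derandomization; illustrative assumption only).
import secrets

def random_ot():
    """Trusted dealer for illustration: sender gets (r0, r1), receiver gets (c, r_c)."""
    r0, r1 = secrets.randbits(128), secrets.randbits(128)
    c = secrets.randbits(1)
    return (r0, r1), (c, r1 if c else r0)

def ot(m0, m1, b):
    """Receiver with choice bit b learns m_b; neither party learns more."""
    (r0, r1), (c, rc) = random_ot()
    d = b ^ c                      # receiver masks its choice bit with c
    e0 = m0 ^ (r1 if d else r0)    # sender masks m0 with r_d
    e1 = m1 ^ (r0 if d else r1)    # sender masks m1 with r_{1-d}
    return (e1 if b else e0) ^ rc  # receiver unmasks the chosen message

assert ot(111, 222, 0) == 111 and ot(111, 222, 1) == 222
```

Because applications consume random OTs this way, a composable security guarantee for the random OT protocol is exactly what lets it be plugged into larger protocols safely.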


Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on large amounts of personal data for training and inference. Among the most intimate of these data sources is electroencephalogram (EEG) data, which is so rich in information that application developers can easily extract knowledge well beyond an application's professed scope from unprotected EEG signals, including passwords, ATM PINs, and other sensitive data.
