A Framework for Privacy Preserving Cloud-based ML
Image Classification
Cloud-based machine learning services (CMLS) enable organizations to take advantage of advanced models that are pre-trained on large quantities of data. The main shortcoming of using these services, however, is the difficulty of keeping the transmitted data private and secure. Asymmetric encryption requires the data to be decrypted in the cloud, while homomorphic encryption is often too slow and difficult to implement. We propose One Way Encryption by Deconvolution (OWED), a deconvolution-based encryption framework that offers the advantages of homomorphic encryption at a fraction of the computational overhead. Extensive evaluation on multiple image datasets demonstrates OWED’s ability to achieve near-perfect classification performance when the output vector of the CMLS is sufficiently large. Additionally, we provide a comprehensive analysis of the robustness of our approach.

Hypothesis Insight
ML algorithms respond to latent patterns in data
Consistent transformations can create representations that are:
- very difficult to translate back
- consistently classified
Motivation
Use cloud-based ML services while preserving confidentiality:
- Utilize large and complex algorithms
- Improve performance
- Significantly reduce costs
Existing Approaches
While many solutions exist, they all have their own limitations



Our Method
Steps:
- The organization’s confidential data is encoded.
- A generative model generates transformed images using the encoding and a secret key.
- The cloud-based ML model generates predictions on the transformed images.
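The three steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the poster's implementation: the encoder, generator, and cloud model are stand-in random linear maps rather than trained networks, the key is combined with the encoding by simple concatenation, and all names (`make_linear`, `encode`, `generate_transformed`, `cloud_predict`) and sizes are hypothetical.

```python
import numpy as np

def make_linear(seed, n_in, n_out):
    """Fixed random linear map; a stand-in (assumption) for a trained network."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)

def encode(images, enc_w):
    # Step 1: encode the confidential images into embeddings.
    flat = images.reshape(len(images), -1)
    return np.tanh(flat @ enc_w)

def generate_transformed(embeddings, secret_key, gen_w, side):
    # Step 2: build transformed images from the encoding and the secret key.
    # Concatenating the key to the embedding is an assumption of this sketch.
    key = np.broadcast_to(secret_key, (len(embeddings), len(secret_key)))
    z = np.concatenate([embeddings, key], axis=1)
    return np.tanh(z @ gen_w).reshape(len(embeddings), side, side)

def cloud_predict(transformed, cloud_w):
    # Step 3: the cloud model sees only transformed images and
    # returns one probability vector per image.
    logits = transformed.reshape(len(transformed), -1) @ cloud_w
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Hypothetical sizes: four 8x8 confidential images, 16x16 transformed images,
# and a cloud model with a 10-class output vector.
images = np.random.default_rng(0).random((4, 8, 8))
secret_key = np.random.default_rng(42).standard_normal(16)
enc_w = make_linear(1, 8 * 8, 32)
gen_w = make_linear(2, 32 + 16, 16 * 16)
cloud_w = make_linear(3, 16 * 16, 10)

emb = encode(images, enc_w)
transformed = generate_transformed(emb, secret_key, gen_w, side=16)
cloud_out = cloud_predict(transformed, cloud_w)
```

In this sketch the same key always yields the same transformed image for a given input, while a different key yields a different one; only `transformed` would ever leave the organization.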
Confidential Data Encoding
Transformed images generation
- Generative model
Cloud-based ML model predictions

An organizational neural network (NN) model is then trained to predict the classifications of the original images.
Inputs:
- Original data embeddings
- Transformed data classification
Output:
- Prediction of the original data's true labels
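A rough sketch of this training step is shown below. It assumes the two inputs are simply concatenated into one feature vector, and it swaps in a single-layer softmax classifier trained by gradient descent in place of the NN; the data is synthetic and the names `train_org_model` and `predict_org_model` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_org_model(embeddings, cloud_outputs, labels, n_classes,
                    lr=0.5, epochs=300):
    # Inputs: original data embeddings + cloud classification vectors.
    # Concatenation is an assumption of this sketch.
    x = np.concatenate([embeddings, cloud_outputs], axis=1)
    y = np.eye(n_classes)[labels]          # one-hot true labels
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    for _ in range(epochs):
        p = softmax(x @ w + b)
        grad = (p - y) / len(x)            # cross-entropy gradient w.r.t. logits
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b

def predict_org_model(w, b, embeddings, cloud_outputs):
    # Output: predicted true label of each original image.
    x = np.concatenate([embeddings, cloud_outputs], axis=1)
    return softmax(x @ w + b).argmax(axis=1)

# Synthetic stand-ins: 200 samples, 3 true classes, 8-dim embeddings,
# and noisy 3-dim cloud output vectors that correlate with the labels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
embeddings = rng.standard_normal((200, 8))
cloud_outputs = softmax(np.eye(3)[labels] * 3.0 + rng.standard_normal((200, 3)))

w, b = train_org_model(embeddings, cloud_outputs, labels, n_classes=3)
preds = predict_org_model(w, b, embeddings, cloud_outputs)
accuracy = (preds == labels).mean()
```

The organization keeps this model and the true labels in house; the cloud service contributes only its output vectors for the transformed images.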

Inferring Labels for New Images

Confidentiality Analysis
Comparing Original and Scrambled Images

Analyzing the Cloud-based Model Output

Cryptographic Proof & Privacy Verification
Empirical Proof of Robustness Strength
Analysis of Scrambled Image Reconstruction


Experiments
֍Use-Case 1: Same Labels in Confidential Data and Cloud
֍Use-Case 2: Subset of Confidential Data Labels
֍Use-Case 3: Different Labels for Confidential Data and Cloud
֍Use-Case 4: Ensemble of Encoders




֍Use-Case 5: IIN’s Training Size
֍Ablation Study
Conclusions
- Significantly different from source
- Difficult to reconstruct
- Loose lower bound for images per key
- Quick and inexpensive key change
- Increasing security and performance using multiple keys
Future Work
- Collaborative training of ML models
- Adapting our approach to:
֍Textual data
֍Tabular data