June 20, 2021


Causally Constrained Data Synthesis for Private Data Release. (arXiv:2105.13144v1 [cs.LG])

Making evidence-based decisions requires data. However, for real-world
applications, the privacy of data is critical. Using synthetic data that
reflects certain statistical properties of the original data preserves the
privacy of the original data. To this end, prior works utilize differentially
private data release mechanisms to provide formal privacy guarantees. However,
such mechanisms have unacceptable privacy vs. utility trade-offs. We propose
incorporating causal information into the training process to favorably modify
the aforementioned trade-off. We theoretically prove that generative models
trained with additional causal knowledge provide stronger differential privacy
guarantees. Empirically, we evaluate our solution by comparing different models
based on variational auto-encoders (VAEs), and show that causal information
improves resilience to membership inference, with improvements in downstream