Generate Synthetic Time-series Data with Open-source Tools
This presentation introduces the DoppelGANger generative adversarial network model, and describes how to create synthetic time series data using a PyTorch implementation of it.
Time series data, a sequence of measurements taken over a period of time, is ubiquitous in the modern data world. Timestamped log messages, financial market prices, and medical records are all examples of time series data. When real data is scarce or contains sensitive information that has to be protected, we can generate synthetic time series data instead. Because of the additional dimension of time, synthetic time series data has to preserve trends and correlations across time just as faithfully as it preserves correlations between variables.
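To make the point about correlations across time concrete, here is a small illustrative sketch in plain Python. It is not part of the DoppelGANger work: the AR(1) toy series, the helper names pearson and autocorr, and the coefficient 0.9 are all assumptions chosen for the example. It shows that shuffling a series keeps every individual value (so the marginal distribution is unchanged) while destroying the temporal structure a synthetic time series model is expected to reproduce.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def autocorr(xs, lag):
    """Correlation of a series with itself shifted by `lag` steps."""
    return pearson(xs[:-lag], xs[lag:])


rng = random.Random(0)

# Hypothetical toy data: an AR(1) process x_t = 0.9 * x_{t-1} + noise,
# so each value depends strongly on the previous one.
series = [0.0]
for _ in range(999):
    series.append(0.9 * series[-1] + rng.gauss(0, 1))

# Same values in random order: identical histogram, no temporal structure.
shuffled = series[:]
rng.shuffle(shuffled)

print("lag-1 autocorrelation, original:", round(autocorr(series, 1), 3))
print("lag-1 autocorrelation, shuffled:", round(autocorr(shuffled, 1), 3))
```

A generator that only matched the distribution of individual values could produce something like the shuffled series, which is why a time series model has to be judged on temporal statistics such as autocorrelation as well.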
We have previously posted blogs on synthesizing financial time series data (financial data, time series basics) at Gretel, but we are continually looking for new models that can help improve our synthetic data generation. Our APIs and console are being built around the DoppelGANger model, which is described in the paper "Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions" by Lin et al. For this project, we created a PyTorch implementation of the DoppelGANger model and are excited to publish it as part of the Gretel-Synthetics open-source library.
DoppelGANger Model
DoppelGANger is based on generative adversarial networks (GANs), with some modifications to make it suitable for time series generation. Under the adversarial training scheme, two networks are optimized together: the discriminator (or critic) learns to tell real inputs from synthetic ones, while the generator learns to produce synthetic data that the discriminator cannot distinguish from real data. Once trained, the generator can create arbitrary amounts of synthetic time series data by being fed input noise.
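The adversarial scheme described above can be sketched in a few lines of PyTorch. This is a generic toy GAN under stated assumptions, not the DoppelGANger architecture or the Gretel-Synthetics API: the class names, layer sizes, sine-wave training data, and the standard binary cross-entropy loss are all illustrative choices made for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes chosen only for this sketch.
SEQ_LEN, N_FEATURES, NOISE_DIM = 24, 1, 8


class Generator(nn.Module):
    """Maps a sequence of noise vectors to a synthetic time series."""

    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(NOISE_DIM, 32, batch_first=True)
        self.out = nn.Linear(32, N_FEATURES)

    def forward(self, noise):
        hidden, _ = self.rnn(noise)           # (batch, SEQ_LEN, 32)
        return torch.tanh(self.out(hidden))   # (batch, SEQ_LEN, N_FEATURES)


class Discriminator(nn.Module):
    """Scores a whole series; higher logits mean 'looks real'."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(SEQ_LEN * N_FEATURES, 64),
            nn.LeakyReLU(0.2),
            nn.Linear(64, 1),
        )

    def forward(self, series):
        return self.net(series)


def sample_noise(batch_size):
    return torch.randn(batch_size, SEQ_LEN, NOISE_DIM)


def train_step(gen, disc, gen_opt, disc_opt, real):
    """One discriminator update followed by one generator update."""
    loss_fn = nn.BCEWithLogitsLoss()
    batch = real.shape[0]
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator: label real series 1 and generated series 0.
    fake = gen(sample_noise(batch)).detach()
    disc_loss = loss_fn(disc(real), ones) + loss_fn(disc(fake), zeros)
    disc_opt.zero_grad()
    disc_loss.backward()
    disc_opt.step()

    # Generator: try to make the discriminator output 1 for generated series.
    gen_loss = loss_fn(disc(gen(sample_noise(batch))), ones)
    gen_opt.zero_grad()
    gen_loss.backward()
    gen_opt.step()
    return disc_loss.item(), gen_loss.item()


gen, disc = Generator(), Discriminator()
gen_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
disc_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)

# Toy "real" data: 64 sine waves with random phase, shape (64, SEQ_LEN, 1).
t = torch.linspace(0, 6.28, SEQ_LEN)
real = torch.sin(t + torch.rand(64, 1) * 6.28).unsqueeze(-1)

for _ in range(5):  # a handful of steps, just to exercise the loop
    d_loss, g_loss = train_step(gen, disc, gen_opt, disc_opt, real)

# After training, any number of series can be sampled from fresh noise.
with torch.no_grad():
    synthetic = gen(sample_noise(1000))
print(synthetic.shape)  # torch.Size([1000, 24, 1])
```

The final lines illustrate the last point of the paragraph: the number of synthetic series is set only by how much noise is fed to the generator, not by the size of the training set. Five training steps are far too few to produce realistic output; the loop is there to show the structure of the updates.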
Lin et al. review existing synthetic time series approaches, identify their limitations, and draw on their own observations to propose several specific improvements, which together make up DoppelGANger. These improvements range from generic GAN techniques to time-series-specific tricks. Some of these modifications are as follows:
To download the source code, follow the link below, which points to the AWS S3 bucket containing the entire code.