Add TabGAN - synthetic tabular data generation with GANs, Diffusion, and LLMs #3003

Open
Diyago wants to merge 1 commit into vinta:master from Diyago:add-tabgan

Diyago commented Mar 28, 2026

Add TabGAN — Synthetic Tabular Data Generation

TabGAN is a Python library for generating high-quality synthetic tabular data using multiple generative approaches through a unified API:

  • CTGAN (Conditional Tabular GAN) for mixed data types
  • ForestDiffusion (tree-based diffusion) for structured data
  • GReaT (Large Language Models) for semantic dependencies

Key Features

  • Unified API across GANs, Diffusion Models, and LLMs
  • Adversarial filtering to keep generated samples consistent with the target distribution
  • Privacy metrics (DCR, NNDR, membership inference)
  • Constraint enforcement (range, uniqueness, formula, regex)
  • HTML quality reports with distribution comparisons
  • sklearn TabGANTransformer for Pipeline integration
  • 100K+ PyPI downloads, 115 tests, Apache 2.0 license
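
For readers unfamiliar with the privacy metrics listed above, here is a minimal sketch of how DCR (distance to closest record) and NNDR (nearest-neighbor distance ratio) are commonly defined. It is not TabGAN's implementation and does not use its API: the function name `dcr_and_nndr` is hypothetical, and the sketch assumes all features are numeric and already scaled to comparable ranges.

```python
import math


def dcr_and_nndr(synthetic, real):
    """Return a (DCR, NNDR) pair for each synthetic row.

    Hypothetical helper for illustration only, not part of TabGAN.
    DCR  = Euclidean distance to the closest real row.
    NNDR = closest distance divided by second-closest distance.
    """
    if len(real) < 2:
        raise ValueError("need at least two real rows to compute NNDR")
    results = []
    for row in synthetic:
        # Distances from this synthetic row to every real row, ascending.
        dists = sorted(math.dist(row, r) for r in real)
        nearest, second = dists[0], dists[1]
        # Assumption: when both distances are zero the ratio is undefined;
        # report 0.0 so an exact copy of a real row is still flagged.
        nndr = nearest / second if second > 0 else 0.0
        results.append((nearest, nndr))
    return results


real = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
synthetic = [(0.0, 0.0), (3.0, 0.0)]
scores = dcr_and_nndr(synthetic, real)
```

A DCR of zero means the synthetic row duplicates a real record. Low NNDR values mean the row sits much closer to one real record than to any other, which is usually read as a higher re-identification risk.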

Paper: Tabular GANs for uneven distribution
