# Python environment
Scenarios run on DataMaker-hosted Python workers. You don’t manage the environment — just the script.
## Runtime
- Python 3.12 (current stable).
- One process per scenario run. Cold start ~600 ms; warm starts effectively instant if you re-run within ~5 minutes.
- Each run gets its own working directory, mounted at `/workspace`. Files survive the duration of the run only; for persistence across runs, see Workspace files.
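For example, anything your script writes should go under that mount. A minimal sketch (the `rows.csv` filename and `run_output_path` helper are illustrative, not SDK features):

```python
from pathlib import Path

WORKSPACE = Path("/workspace")  # per-run working directory on the worker


def run_output_path(name: str) -> Path:
    """Build a path inside the run's ephemeral working directory."""
    return WORKSPACE / name


# e.g. write intermediate output here; it vanishes when the run ends
out = run_output_path("rows.csv")
```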
## Pre-installed packages
Every worker comes with these:
| Package | Why |
|---|---|
| `datamaker` | The official DataMaker SDK (the entry point of every script). |
| `requests` | Generic HTTP. Use this for non-DataMaker REST calls. |
| `httpx` | Async HTTP. Available if you'd rather use `async`/`await`. |
| `psycopg[binary]` | Postgres driver (if you want raw SQL alongside `dm.connection`). |
| `pymysql` | MySQL driver. |
| `pymongo` | MongoDB driver. |
| `pandas` | DataFrame manipulation. Useful for transforming generated rows. |
| `pyyaml` | YAML parsing. |
| `python-dateutil` | Date parsing/arithmetic. |
| `faker` | Python's Faker. Most cases use `dm.generate()` instead, but Faker is there if you need a quick one-off. |
| `pyodata` | OData client. Mostly redundant; `dm.connection.fetch()` is the supported path. |
## Adding packages
Each scenario can declare its own requirements:
```python
# top of your scenario:
# requirements: arrow~=1.3, polars~=0.20
import arrow
import polars as pl

# normal scenario code follows
```

DataMaker reads the `# requirements:` comment, resolves the dependency tree against PyPI, and installs into the worker's `.venv` before running. Subsequent runs reuse the cached install.
Pinning rules: we accept any PEP 440 version specifier (`==1.2.3`, `~=1.3`, `>=2,<3`). For reproducibility, prefer `~=` (compatible release).
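Concretely, `~=1.3` accepts anything `>=1.3` and `<2.0`, while `~=1.3.2` accepts `>=1.3.2` and `<1.4`. A simplified plain-Python sketch of that rule (the `satisfies_compatible` helper is made up for illustration; real resolution follows the full PEP 440 grammar):

```python
def satisfies_compatible(version: str, spec: str) -> bool:
    """Check a version against a compatible-release spec like '~=1.3'.

    Simplified: '~=X.Y' means >= X.Y and < X+1.0;
    '~=X.Y.Z' means >= X.Y.Z and < X.Y+1.
    Handles only plain dotted-integer versions.
    """
    base = tuple(int(p) for p in spec.removeprefix("~=").split("."))
    v = tuple(int(p) for p in version.split("."))
    upper = base[:-2] + (base[-2] + 1,)  # bump the second-to-last segment
    return base <= v < upper
```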
## Environment variables
Scenarios have access to:
- DataMaker context: `DM_PROJECT_ID`, `DM_TEAM_ID`, `DM_SCENARIO_ID`, `DM_RUN_ID` are set automatically; you usually don't read them directly.
- Workspace secrets: anything you set under Settings → Workspace secrets is available as `os.environ["YOUR_KEY"]`. Use this for third-party API tokens.
- Run parameters: passed via the scenario's API call (`{"params": {...}}`) and available as `dm.params` (a dict).
```python
import os

slack_token = os.environ["SLACK_BOT_TOKEN"]  # workspace secret
env = dm.params.get("environment", "dev")    # run-time param
```

## What's not there
To keep workers fast and isolated:
- No shell, and no `subprocess.run()` of arbitrary binaries (we block it at runtime).
- No persistent filesystem outside `/workspace`.
- No outbound network to private IPs unless you've configured a VPN connector (Enterprise plans).
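If you're unsure whether a target address counts as private, the stdlib `ipaddress` module applies the standard classification (RFC 1918 ranges, loopback, link-local). A plain-Python illustration, not part of the DataMaker SDK:

```python
import ipaddress


def is_private_target(host_ip: str) -> bool:
    """True if the address is in a private range (RFC 1918, loopback, etc.)."""
    return ipaddress.ip_address(host_ip).is_private


# 10.x.x.x and 192.168.x.x are private; public addresses are reachable as usual.
```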
## Running locally
You can develop scenarios against your local Python:
```bash
pip install datamaker
export DM_API_KEY=your_api_key
python my_scenario.py
```

The same SDK works locally and in the worker. The only differences:

- `dm.params` is empty unless you read CLI args yourself.
- Workspace files are not mounted; use `dm.workspace_file().download()` if you need them locally.
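One way to bridge the `dm.params` gap locally is to read them from a CLI flag. A minimal sketch (the `--params` flag and `load_params` helper are made up for illustration, not SDK features):

```python
import argparse
import json


def load_params(argv: list[str]) -> dict:
    """Parse a JSON --params flag so local runs can mimic dm.params."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--params", default="{}")
    args = parser.parse_args(argv)
    return json.loads(args.params)


# Example invocation:
#   python my_scenario.py --params '{"environment": "staging"}'
```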
For more, see API & SDKs → Python SDK.