SQLMesh is a next-generation data transformation framework designed to ship data quickly, efficiently, and without error. Data teams can run and deploy data transformations written in SQL or Python with visibility and control at any size.
It is more than just a dbt alternative.
Core Features

Get instant SQL impact analysis and context for your changes, both in the CLI and in the SQLMesh VSCode extension
Virtual Data Environments
- See a full diagram of how Virtual Data Environments work
- Create isolated development environments without data warehouse costs
- Plan / Apply workflow like Terraform to understand potential impact of changes
- Easy to use CI/CD bot for true blue-green deployments
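The "virtual" part can be pictured as a layer of views over physical snapshot tables: each environment maps model names to already-built snapshots, and promoting to prod just repoints the views, so no data is copied or recomputed. A toy stdlib sketch of that idea (illustrative names only, not SQLMesh internals):

```python
# Toy model of the virtual layer: an environment is a mapping from
# view name -> physical snapshot table. "Applying a plan" to prod
# repoints views at snapshots that were already built in dev, so
# promotion is metadata-only and costs no warehouse compute.
physical = {
    "stg_payments__v1": "rows built from the old model code",
    "stg_payments__v2": "rows built from the new model code",
}

environments = {
    "prod": {"stg_payments": "stg_payments__v1"},
    # dev cloned prod's pointers, then only the changed model
    # received a newly built physical snapshot:
    "dev": {"stg_payments": "stg_payments__v2"},
}

def promote(env: str, target: str = "prod") -> None:
    """Blue-green deploy: swap the target's view pointers to env's snapshots."""
    environments[target] = dict(environments[env])

promote("dev")
assert environments["prod"]["stg_payments"] == "stg_payments__v2"
```

Because only pointers move, rolling back is the same operation in reverse.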
Efficiency and Testing
Running the command below generates a unit test file in the tests/ folder (test_stg_payments.yaml). It runs a live query against the warehouse to capture the expected output of the model:
# create the unit test
sqlmesh create_test tcloud_demo.stg_payments --query tcloud_demo.seed_raw_payments "select * from tcloud_demo.seed_raw_payments limit 5"
# run the unit test
sqlmesh test
MODEL (
  name tcloud_demo.stg_payments,
  cron '@daily',
  grain payment_id,
  audits (
    UNIQUE_VALUES(columns = (payment_id)),
    NOT_NULL(columns = (payment_id))
  )
);
SELECT
  id AS payment_id,
  order_id,
  payment_method,
  amount / 100 AS amount, /* `amount` is currently stored in cents, so we convert it to dollars */
  'new_column' AS new_column /* non-breaking change example */
FROM tcloud_demo.seed_raw_payments
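The UNIQUE_VALUES and NOT_NULL audits above boil down to two checks on the model's output column. SQLMesh runs them as SQL queries against the warehouse after each evaluation; a plain-Python sketch of what they assert (illustrative rows, not SQLMesh's implementation):

```python
# What the two audits assert about stg_payments, sketched over a
# list of rows represented as dicts.
rows = [
    {"payment_id": 66, "amount": 18.0},
    {"payment_id": 27, "amount": 26.0},
    {"payment_id": 30, "amount": 16.0},
]

ids = [r["payment_id"] for r in rows]

# NOT_NULL(columns = (payment_id)): no missing keys
assert all(i is not None for i in ids)

# UNIQUE_VALUES(columns = (payment_id)): no duplicate keys
assert len(ids) == len(set(ids))
```

If either audit fails, the run is flagged instead of silently publishing bad data.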
test_stg_payments:
  model: tcloud_demo.stg_payments
  inputs:
    tcloud_demo.seed_raw_payments:
      - id: 66
        order_id: 58
        payment_method: coupon
        amount: 1800
      - id: 27
        order_id: 24
        payment_method: coupon
        amount: 2600
      - id: 30
        order_id: 25
        payment_method: coupon
        amount: 1600
      - id: 109
        order_id: 95
        payment_method: coupon
        amount: 2400
      - id: 3
        order_id: 3
        payment_method: coupon
        amount: 100
  outputs:
    query:
      - payment_id: 66
        order_id: 58
        payment_method: coupon
        amount: 18.0
        new_column: new_column
      - payment_id: 27
        order_id: 24
        payment_method: coupon
        amount: 26.0
        new_column: new_column
      - payment_id: 30
        order_id: 25
        payment_method: coupon
        amount: 16.0
        new_column: new_column
      - payment_id: 109
        order_id: 95
        payment_method: coupon
        amount: 24.0
        new_column: new_column
      - payment_id: 3
        order_id: 3
        payment_method: coupon
        amount: 1.0
        new_column: new_column
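The fixtures above are simply the model's logic replayed over the five captured input rows. The cents-to-dollars conversion can be sanity-checked in a few lines (an illustrative check, not part of the SQLMesh workflow):

```python
# The stg_payments model renames id -> payment_id, divides amount by 100
# (cents -> dollars), and adds a constant column. Replaying that over the
# YAML inputs reproduces the YAML outputs.
inputs = [(66, 1800), (27, 2600), (30, 1600), (109, 2400), (3, 100)]

outputs = [
    {"payment_id": pid, "amount": cents / 100, "new_column": "new_column"}
    for pid, cents in inputs
]

assert [o["amount"] for o in outputs] == [18.0, 26.0, 16.0, 24.0, 1.0]
```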
- Never build a table more than once
- Track what data’s been modified and run only the necessary transformations for incremental models
- Run unit tests for free and configure automated audits
- Run table diffs between prod and dev based on tables/views impacted by a change
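"Run only the necessary transformations" comes down to interval bookkeeping: SQLMesh records which time ranges each incremental model has already processed and schedules backfills only for the gaps. A simplified stdlib sketch of that bookkeeping (the function and data here are illustrative, not the actual scheduler):

```python
from datetime import date, timedelta

def missing_days(processed: set, start: date, end: date) -> list:
    """Return only the days in [start, end) that were never processed."""
    n = (end - start).days
    return [
        start + timedelta(days=i)
        for i in range(n)
        if start + timedelta(days=i) not in processed
    ]

# Days 1-2 were already built; requesting days 1-4 only schedules 3-4.
processed = {date(2024, 1, 1), date(2024, 1, 2)}
todo = missing_days(processed, date(2024, 1, 1), date(2024, 1, 5))
assert todo == [date(2024, 1, 3), date(2024, 1, 4)]
```

Already-built intervals are skipped entirely, which is what makes reruns cheap.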
Level Up Your SQL
Write SQL in any dialect and SQLMesh will transpile it to your target SQL dialect on the fly before sending it to the warehouse.
- Debug transformation errors before you run them in your warehouse in 10+ different SQL dialects
- Definitions using simple SQL (no need for redundant and confusing Jinja + YAML)
- See the impact of changes before you run them in your warehouse with column-level lineage
For more information, check out the website and documentation.
Getting Started
Install SQLMesh from PyPI by running:
mkdir sqlmesh-example
cd sqlmesh-example
python -m venv .venv
source .venv/bin/activate
pip install 'sqlmesh[lsp]' # install the sqlmesh package with extensions to work with VSCode
source .venv/bin/activate # reactivate the venv to ensure you're using the right installation
sqlmesh init # follow the prompts to get started (choose DuckDB)
Note: You may need to run python3 or pip3 instead of python or pip, depending on your Python installation.
Windows Installation
mkdir sqlmesh-example
cd sqlmesh-example
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install 'sqlmesh[lsp]' # install the sqlmesh package with extensions to work with VSCode
.\.venv\Scripts\Activate.ps1 # reactivate the venv to ensure you're using the right installation
sqlmesh init # follow the prompts to get started (choose DuckDB)
Follow the quickstart guide to learn how to use SQLMesh. You already have a head start!
Follow the crash course to learn the core movesets and use the easy-to-reference cheat sheet.
Follow this example to learn how to use SQLMesh in a full walkthrough.
Join Our Community
Together, we want to build data transformation without the waste. Connect with us in the following ways:
- Join the Tobiko Slack Community to ask questions, or just to say hi!
- File an issue on our GitHub
- Send us an email at hello@tobikodata.com with your questions or feedback
- Read our blog
Contribution
Contributions in the form of issues or pull requests (from forks) are greatly appreciated.
Read more on how to contribute to SQLMesh open source.
Watch this video walkthrough to see how our team contributes a feature to SQLMesh.
# ruff: noqa: E402
"""
.. include:: ../README.md
"""

from __future__ import annotations

import glob
import logging
import os
import sys
import typing as t
from datetime import datetime
from enum import Enum
from pathlib import Path

from sqlmesh.core.dialect import extend_sqlglot

extend_sqlglot()

from sqlmesh.core import constants as c
from sqlmesh.core.config import Config as Config
from sqlmesh.core.context import Context as Context, ExecutionContext as ExecutionContext
from sqlmesh.core.engine_adapter import EngineAdapter as EngineAdapter
from sqlmesh.core.macros import SQL as SQL, macro as macro
from sqlmesh.core.model import Model as Model, model as model
from sqlmesh.core.signal import signal as signal
from sqlmesh.core.snapshot import Snapshot as Snapshot
from sqlmesh.core.snapshot.evaluator import (
    CustomMaterialization as CustomMaterialization,
)
from sqlmesh.core.model.kind import CustomKind as CustomKind
from sqlmesh.utils import (
    debug_mode_enabled as debug_mode_enabled,
    enable_debug_mode as enable_debug_mode,
    str_to_bool,
)
from sqlmesh.utils.date import DatetimeRanges as DatetimeRanges

try:
    from sqlmesh._version import __version__ as __version__, __version_tuple__ as __version_tuple__
except ImportError:
    pass


if t.TYPE_CHECKING:
    from sqlmesh.core.engine_adapter._typing import QueryOrDF  # noqa: F401


class RuntimeEnv(str, Enum):
    """Enum defining what environment SQLMesh is running in."""

    TERMINAL = "terminal"
    DATABRICKS = "databricks"
    GOOGLE_COLAB = "google_colab"  # Not currently officially supported
    JUPYTER = "jupyter"
    DEBUGGER = "debugger"
    CI = "ci"  # CI or other envs that shouldn't use emojis

    @classmethod
    def get(cls) -> RuntimeEnv:
        """Get the console class to use based on the environment that the code is running in
        Reference implementation: https://github.com/noklam/rich/blob/d3a1ae61a77d934844563514370084971bc3e143/rich/console.py#L511-L528

        Unlike the rich implementation we try to split out by notebook type instead of treating it all as Jupyter.
        """
        runtime_env_var = os.getenv("SQLMESH_RUNTIME_ENVIRONMENT")
        if runtime_env_var:
            try:
                return RuntimeEnv(runtime_env_var)
            except ValueError:
                valid_values = [f'"{member.value}"' for member in RuntimeEnv]
                raise ValueError(
                    f"Invalid SQLMESH_RUNTIME_ENVIRONMENT value: {runtime_env_var}. Must be one of {', '.join(valid_values)}."
                )

        try:
            shell = get_ipython()  # type: ignore
            if os.getenv("DATABRICKS_RUNTIME_VERSION"):
                return RuntimeEnv.DATABRICKS
            if "google.colab" in str(shell.__class__):  # type: ignore
                return RuntimeEnv.GOOGLE_COLAB
            if shell.__class__.__name__ == "ZMQInteractiveShell":  # type: ignore
                return RuntimeEnv.JUPYTER
        except NameError:
            pass

        if debug_mode_enabled():
            return RuntimeEnv.DEBUGGER

        if is_cicd_environment() or not is_interactive_environment():
            return RuntimeEnv.CI

        return RuntimeEnv.TERMINAL

    @property
    def is_terminal(self) -> bool:
        return self == RuntimeEnv.TERMINAL

    @property
    def is_databricks(self) -> bool:
        return self == RuntimeEnv.DATABRICKS

    @property
    def is_jupyter(self) -> bool:
        return self == RuntimeEnv.JUPYTER

    @property
    def is_google_colab(self) -> bool:
        return self == RuntimeEnv.GOOGLE_COLAB

    @property
    def is_ci(self) -> bool:
        return self == RuntimeEnv.CI

    @property
    def is_notebook(self) -> bool:
        return not self.is_terminal and not self.is_ci


def is_cicd_environment() -> bool:
    for key in ("CI", "GITHUB_ACTIONS", "TRAVIS", "CIRCLECI", "GITLAB_CI", "BUILDKITE"):
        if str_to_bool(os.environ.get(key, "false")):
            return True
    return False


def is_interactive_environment() -> bool:
    if sys.stdin is None or sys.stdout is None:
        return False
    return sys.stdin.isatty() and sys.stdout.isatty()


if RuntimeEnv.get().is_notebook:
    try:
        from sqlmesh.magics import register_magics

        register_magics()
    except ImportError:
        pass


LOG_FORMAT = "%(asctime)s - %(threadName)s - %(name)s - %(levelname)s - %(message)s (%(filename)s:%(lineno)d)"
LOG_FILENAME_PREFIX = "sqlmesh_"


# SO: https://stackoverflow.com/questions/384076/how-can-i-color-python-logging-output
class CustomFormatter(logging.Formatter):
    """Custom logging formatter."""

    grey = "\x1b[38;20m"
    yellow = "\x1b[33;20m"
    red = "\x1b[31;20m"
    bold_red = "\x1b[31;1m"
    reset = "\x1b[0m"

    FORMATS = {
        logging.DEBUG: grey + LOG_FORMAT + reset,
        logging.INFO: grey + LOG_FORMAT + reset,
        logging.WARNING: yellow + LOG_FORMAT + reset,
        logging.ERROR: red + LOG_FORMAT + reset,
        logging.CRITICAL: bold_red + LOG_FORMAT + reset,
    }

    def format(self, record: logging.LogRecord) -> str:
        log_fmt = self.FORMATS.get(record.levelno)
        formatter = logging.Formatter(log_fmt)
        return formatter.format(record)


def remove_excess_logs(
    log_file_dir: t.Optional[t.Union[str, Path]] = None,
    log_limit: int = c.DEFAULT_LOG_LIMIT,
) -> None:
    if log_limit <= 0:
        return

    log_file_dir = log_file_dir or c.DEFAULT_LOG_FILE_DIR
    log_path_prefix = Path(log_file_dir) / LOG_FILENAME_PREFIX

    for path in list(sorted(glob.glob(f"{log_path_prefix}*.log"), reverse=True))[log_limit:]:
        os.remove(path)


def configure_logging(
    force_debug: bool = False,
    write_to_stdout: bool = False,
    write_to_file: bool = True,
    log_file_dir: t.Optional[t.Union[str, Path]] = None,
    ignore_warnings: bool = False,
    log_level: t.Optional[t.Union[str, int]] = None,
) -> None:
    # Remove noisy grpc logs that are not useful for users
    os.environ["GRPC_VERBOSITY"] = os.environ.get("GRPC_VERBOSITY", "NONE")

    logger = logging.getLogger()
    debug = force_debug or debug_mode_enabled()

    if log_level is not None:
        if isinstance(log_level, str):
            level = logging._nameToLevel.get(log_level.upper()) or logging.INFO
        else:
            level = log_level
    else:
        # base logger needs to be the lowest level that we plan to log
        level = logging.DEBUG if debug else logging.INFO

    logger.setLevel(level)

    if debug:
        # Remove noisy snowflake connector logs that are not useful for users
        logging.getLogger("snowflake.connector").setLevel(logging.INFO)

    if write_to_stdout:
        stdout_handler = logging.StreamHandler(sys.stdout)
        stdout_handler.setFormatter(CustomFormatter())
        stdout_handler.setLevel(logging.ERROR if ignore_warnings else level)
        logger.addHandler(stdout_handler)

    log_file_dir = log_file_dir or c.DEFAULT_LOG_FILE_DIR
    log_path_prefix = Path(log_file_dir) / LOG_FILENAME_PREFIX

    if write_to_file:
        os.makedirs(str(log_file_dir), exist_ok=True)
        filename = f"{log_path_prefix}{datetime.now().strftime('%Y_%m_%d_%H_%M_%S')}.log"
        file_handler = logging.FileHandler(filename, mode="w", encoding="utf-8")

        # the log files should always log at least info so that users will always have
        # minimal info for debugging even if they specify "ignore_warnings"
        file_handler.setLevel(level)
        file_handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(file_handler)

    if debug:
        import faulthandler

        enable_debug_mode()

        # Enable thread dumps.
        faulthandler.enable()

        # Windows doesn't support register so we check for it here
        if hasattr(faulthandler, "register"):
            from signal import SIGUSR1

            faulthandler.register(SIGUSR1.value)
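remove_excess_logs keeps only the newest log_limit files matching the sqlmesh_ prefix; because the filenames embed a sortable timestamp, a reverse lexicographic sort is a chronological sort. The glob-sort-slice core can be exercised standalone (a sketch of the same logic in a temp directory, not an import of sqlmesh):

```python
import glob
import os
import tempfile
from pathlib import Path

# Same core logic as remove_excess_logs: sort log files by name
# (timestamped names sort chronologically), keep the newest N,
# delete the rest.
def trim_logs(log_dir: Path, log_limit: int) -> None:
    prefix = log_dir / "sqlmesh_"
    for path in sorted(glob.glob(f"{prefix}*.log"), reverse=True)[log_limit:]:
        os.remove(path)

tmp = Path(tempfile.mkdtemp())
for stamp in ("2024_01_01", "2024_01_02", "2024_01_03"):
    (tmp / f"sqlmesh_{stamp}.log").write_text("")

trim_logs(tmp, log_limit=2)  # only the two newest files survive
assert sorted(p.name for p in tmp.glob("*.log")) == [
    "sqlmesh_2024_01_02.log",
    "sqlmesh_2024_01_03.log",
]
```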