I am implementing data quality checks using the Great Expectations library. Is this library compatible with PySpark, and does it run on multiple cores?
1 Answer
Yes, it is compatible with PySpark. When you use the SparkDFExecutionEngine, the validation computations run as ordinary Spark jobs, so they are distributed across your executors like any other Spark workload. Here is an example.
Datasource creation (in great_expectations.yml):
datasources:
  spark_ds:
    class_name: Datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SparkDFExecutionEngine
      force_reuse_spark_context: true
    module_name: great_expectations.datasource
    data_connectors:
      spark_ds_connector:
        class_name: RuntimeDataConnector
        module_name: great_expectations.datasource.data_connector
        batch_identifiers:
          - batch_id
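If you prefer to keep configuration in code rather than in the YAML file, the same datasource can be registered programmatically. A minimal sketch, assuming the v3 (Batch Request) API and an existing data context on disk:

import great_expectations as ge

ge_context = ge.get_context()

# The same configuration as the YAML above, expressed as a dict.
datasource_config = {
    "name": "spark_ds",
    "class_name": "Datasource",
    "module_name": "great_expectations.datasource",
    "execution_engine": {
        "module_name": "great_expectations.execution_engine",
        "class_name": "SparkDFExecutionEngine",
        "force_reuse_spark_context": True,
    },
    "data_connectors": {
        "spark_ds_connector": {
            "class_name": "RuntimeDataConnector",
            "module_name": "great_expectations.datasource.data_connector",
            "batch_identifiers": ["batch_id"],
        }
    },
}
ge_context.add_datasource(**datasource_config)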
Create a runtime batch request:

from great_expectations.core.batch import RuntimeBatchRequest

df = ...  # create your Spark DataFrame here

request = RuntimeBatchRequest(
    datasource_name="spark_ds",
    data_connector_name="spark_ds_connector",
    data_asset_name="any_asset_name",
    runtime_parameters={"batch_data": df},
    batch_identifiers={"batch_id": "batch_id"},
)
Run the checkpoint:

ge_context.run_checkpoint(
    checkpoint_name="checkpoint",
    validations=[{"batch_request": request, "expectation_suite_name": "suite_name"}],
)
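Note that run_checkpoint assumes the expectation suite and the checkpoint already exist. A minimal sketch of those missing pieces, assuming the v3 API; "suite_name", "checkpoint", and the "ID" column are just the placeholder names used in this thread:

import great_expectations as ge

ge_context = ge.get_context()

# Create an empty expectation suite (overwriting any previous one),
# then attach an example expectation to it through a validator.
ge_context.create_expectation_suite("suite_name", overwrite_existing=True)
validator = ge_context.get_validator(
    batch_request=request, expectation_suite_name="suite_name"
)
validator.expect_column_values_to_be_unique("ID")  # example expectation
validator.save_expectation_suite(discard_failed_expectations=False)

# Register the checkpoint that run_checkpoint refers to.
ge_context.add_checkpoint(name="checkpoint", class_name="SimpleCheckpoint")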
2 Comments
code_bug
I have simply created it in this format: gedf = ge.dataset.SparkDFDataset(df1); DQI = gedf.expect_column_values_to_be_unique("ID", result_format="COMPLETE"), to avoid unnecessary overhead. But it is failing with "Python kernel unresponsive", and I am not sure whether this error is because of OOM. – halfwind22
It should be OOM, because per the documentation (and also the code) the results are collected into driver memory, so only one node does that part of the work.
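If the driver is the bottleneck, two mitigations are worth trying: request a smaller result payload (result_format="SUMMARY" returns only a sample of unexpected values, whereas "COMPLETE" ships all of them to the driver), or pre-check uniqueness with a plain Spark aggregation so only a single count ever reaches the driver. A sketch of the latter (ordinary PySpark, not a Great Expectations API; df1 here is a hypothetical stand-in for your real DataFrame):

from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for df1; replace with your real DataFrame.
df1 = spark.createDataFrame([(1,), (2,), (2,)], ["ID"])

# Count duplicated IDs with a distributed aggregation; only a single
# integer is ever brought back to the driver.
dup_count = (
    df1.groupBy("ID")
       .count()
       .filter(F.col("count") > 1)
       .count()
)
print("ID is unique" if dup_count == 0 else f"{dup_count} duplicated ID values")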