26,304 questions
-3
votes
0
answers
30
views
Marketing GA4 and GSC data in BigQuery to Power BI confusion [closed]
I am working on a pipeline and facing issues.
I used data from GSC and GA4 in BigQuery. I am calculating the average of avg position, the sum of new users, etc.
I am taking the data to Power BI now and ...
Best practices
0
votes
4
replies
97
views
How to use concat
I'm analyzing wind speed and visibility data from 2023 in BigQuery, but the dataset wasn’t cleaned and missing values were entered as zeroes.
I'm trying to update those zeroes to NULL before running ...
Advice
0
votes
5
replies
128
views
Best SQL Function for Text Filtering
I am examining a large data set on sports activities amongst a population of people and need to find all entries of the word "soccer."
What is the best SQL function to filter all entries ...
-4
votes
0
answers
20
views
Cannot add a field [closed]
When I run the following in BigQuery, this error message shows up:
Please check my code and advise how to solve the problem.
Invalid schema update. Cannot add fields (field: ...
Advice
0
votes
2
replies
106
views
Explain this SQL Query
I am learning SQL for my data analytics course and I have come across a query that I have questions about. Can someone explain to me how the query works?
SELECT
usertype,
CONCAT (...
Best practices
0
votes
1
replies
47
views
Minimum IAM role on target GCP project for GA4 Analytics Admin API CreateBigQueryLink?
We are building a B2B SaaS platform that programmatically creates BigQuery export links from customer GA4 properties to our GCP project using the Analytics Admin API.
API call:
POST https://...
1
vote
1
answer
175
views
Bigquery storage API `to_arrow_iterable` returns only 8 rows at a time
I have this code to retrieve millions of rows from my BigQuery query results:
query_job = client.query(
query,
)
storage_client = bigquery_storage....
Advice
0
votes
3
replies
83
views
How to remember all SQL codes effectively?
I'm currently taking the Google Data Analytics certificate, and the SQL code is a bit overwhelming. How can I remember it so that I can do the job whenever the need arises? Any advice or tips ...
0
votes
0
answers
102
views
BigQuery Storage Write API: "context deadline exceeded" only on low-frequency table
Problem
I'm using the BigQuery Storage Write API (Go managedwriter package) to upload data to three tables with very different ingestion rates:
Table | Frequency | Record Size
A | ~10 records/sec | Several KB
...
0
votes
0
answers
85
views
GA4 BigQuery: events_intraday_YYYYMMDD returns ~2x more distinct user IDs than finalized events_YYYYMMDD partition
Some context:
I'm connecting mobile game data to BQ using Firebase/GA4 with BigQuery export. Every session has a USER_ID set in user_properties at client load. I have pipelines that run daily at 6 AM ...
-7
votes
1
answer
149
views
Sort data in BigQuery using order [closed]
I am learning BigQuery on Coursera. I am trying to sort a table but I am confused about how to use ORDER BY.
My table has columns like name and age. I want to sort the age from highest to lowest. What ...
Best practices
1
vote
3
replies
100
views
Difference between CONCAT with || and CONCAT with +
What is the difference between CONCAT with || and CONCAT with + in SQL?
Which should be used and when?
If possible, can anyone please explain with an example?
Advice
1
vote
1
replies
107
views
Efficiently processing 1 year of daily historical files using dataflow
I have an Apache Beam pipeline (running on Dataflow) that normally performs a daily batch load from Cloud Storage to BigQuery. The source team has provided 1 year of historical data that needs to be ...
Best practices
0
votes
4
replies
116
views
SQL Query as a table
How to save query as a table in SQL on BigQuery?
I am doing data analysis and want to save a query as a table to reuse and save my time.
I was running a query in BigQuery platform for the practice and ...
Tooling
0
votes
3
replies
76
views
I would like to change values that were input as zero since they were missing into null values to avoid getting errors
However, you’re working with the newest data and it hasn’t been cleaned yet. Missing values were incorrectly entered as zeroes, and you need to change them to null values before you look for trends. ...
1
vote
0
answers
70
views
Why is clustering on _id not reducing bytes scanned in BigQuery for a table synced via Datastream?
I have a BigQuery table (approx. 140 GB in size) that is synchronized from MongoDB via a Google Cloud Datastream. I have set the _id column as the clustering column. However, when I run a query ...
0
votes
1
answer
61
views
How to bulk update descriptions for datasets shared via Analytics Hub without modifying each source dataset manually?
I have a GCP project (let's call it "Tier") that acts as a central data-sharing hub across multiple teams. Using Analytics Hub, I've created an exchange with several listings that reference ...
Advice
0
votes
4
replies
41
views
Passing table content as parameter for BigQuery UDF
I would like the UDF to take table content as a parameter, like
CREATE OR REPLACE PROCEDURE dataset.function1(tab TABLE)
so that it is easier for me to pass the whole dataset for unit testing. Is that ...
Advice
0
votes
4
replies
94
views
BigQuery Standard SQL: UPDATE to convert incorrect 0 values to NULL returns an error
I’m using Google BigQuery (Standard SQL) to clean 2023 weather data. Missing values were mistakenly stored as 0, but 0 is a valid value for some rows, so I only want to convert the incorrect zeroes to ...
0
votes
1
answer
76
views
AWS Glue Connection for BigQuery gives "SparkProperties is missing but it is required" and "secretId is not defined in the schema" when using Go SDK
I'm trying to programmatically create a native Google BigQuery connection in AWS Glue using the AWS SDK for Go v2 (github.com/aws/aws-sdk-go-v2/service/glue).
According to the AWS docs (Glue 4.0+ ...
3
votes
1
answer
96
views
Filtering dataset by geolocation data using SQL in BigQuery
I am trying to determine if NCAA basketball games have an effect on liquor sales in the areas surrounding the stadiums where the teams play their games. I am writing a query that filters and sorts ...
0
votes
1
answer
86
views
Airflow BigQueryInsertJobOperator and job re-attachment
The inciting problem
One of our DAGs suffers from an issue wherein a BigQueryInsertJobOperator task fails due to it becoming a "zombie task" (from Airflow's perspective); in the "Event ...
Best practices
0
votes
8
replies
113
views
How can I best optimize a BigQuery SQL by appending completely redundant filters to the query?
I have two different SQL calls to BigQuery. One completes in 22 seconds; the other is identical except that I add a completely redundant filter at the end. This second one completes in 8 seconds. The ...
1
vote
0
answers
104
views
GA4 BigQuery Daily Export Missing Data After Specific Date
I have a GA4 property linked to BigQuery for daily export (both iOS and Web data streams selected, “export all events” enabled, no filters).
Everything was working fine until 2025-11-23, when the ...
-1
votes
1
answer
158
views
Facing issue in BigQuery while extracting JSON Value
We are leveraging BigQuery to create reports, with some column values represented in JSON. Below is a sample payload. I can successfully retrieve the Template value, but the objectType value remains ...
Tooling
1
vote
2
replies
68
views
Windows Shortcut
What shortcuts can I use to split strings in my database (BigQuery)? I have been trying a lot of functions on the Google BigQuery platform. I'm getting really confused now. I tried ...
0
votes
1
answer
128
views
Optional nanoseconds when using `timestamp_format` in BigQuery Load Jobs?
I'm loading data into BigQuery and using LoadJobConfig. There's a timestamp_format field where you can specify the format of timestamps.
Checking the data I have, timestamps come in different formats:
...
0
votes
1
answer
94
views
"Was expecting: <EOF>" when using BigQuery
I am trying to use BigQuery and want to search from the Google Cloud SDK Shell.
I logged in and set up with gcloud config init successfully.
I used the BigQuery console to put data in my dataset table....
0
votes
0
answers
101
views
How to achieve dynamic partition pruning in BigQuery TVF using a subquery lookup?
I have a Table Valued Function (TVF) that takes an array of IDs as input.
source_table: A very large table (TBs) partitioned by day on capturedTimestamp.
selection_table: A small lookup table that ...
Tooling
0
votes
3
replies
99
views
How to enforce dynamic partition pruning in BigQuery TVF using a subquery lookup?
I have a table-valued function (TVF) that takes an array of IDs as input.
source_table: a very large table (TBs) partitioned by day on capturedTimestamp
selection_table: a small lookup table that ...
Tooling
0
votes
0
replies
45
views
BigQuery Data Transfer: Load CSV Files Using Column Headers
I’m setting up a BigQuery Data Transfer for daily CSV files.
Because the column order in these files can vary from day to day,
I’d like BigQuery to use the CSV headers to map the data to the correct ...
0
votes
0
answers
79
views
BigQuery data is not inserted in tables
I'm having issues with data randomly missing or not showing up in BigQuery. I would seemingly write data without issues, but SELECT * would return nothing.
The control panel shows no errors. None of the quotas ...
0
votes
1
answer
111
views
Calculate percent of total per firm and filter firms above 50%
I'm trying to calculate the percentage contribution of each firm to the total amount across all firms.
Database:
google-bigquery
What I need (desired output):
Sum usde_haircut_amt per firm
Compute ...
0
votes
0
answers
93
views
Can I send a JSON object to BigQuery without stringifying it?
I am trying to figure out if there is a way to send a JSON object to a BigQuery table that has a column of type JSON. I know the current practice is to stringify the JSON and send it over which gets ...
0
votes
1
answer
156
views
How to optimize a BigQuery query that uses multiple JOINs
I’m trying to optimize a BigQuery SQL query that joins several large tables. The query works, but it’s slow and more expensive than expected when running on production-scale datasets.
Below is a ...
1
vote
0
answers
80
views
Databricks always loads built-in BigQuery connector (0.22.2), can’t override with 0.43.x
I am using Databricks Runtime 15.4 (Spark 3.5 / Scala 2.12) on AWS.
My goal is to use the latest Google BigQuery connector because I need the direct write method (BigQuery Storage Write API):
option(&...
0
votes
0
answers
45
views
Unable to fetch Accurate Performance Max (PMAX) YouTube Video Metrics via Google Ads Script / BigQuery Transfer
I’m currently working on a task to fetch and display daily Google ADS Manager (GAM) records—such as Cost, ROAS, and other metrics—within a data analysis application. I’ve successfully retrieved data ...
0
votes
0
answers
46
views
BigQuery TVFs Prevent Optimization, Lead to Higher Billing?
I work on a project in BigQuery using Looker Studio for dashboard visualizations. We recently saw our billing costs and usage skyrocket and are trying to determine the cause. Nothing fundamentally ...
Advice
0
votes
1
replies
75
views
Difference between DIV vs / in BigQuery
When I tried to do division in BigQuery, I saw this.
Initial implementation -> div (x / y). For this, I got an error:
No matching signature for function DIV Argument types: FLOAT64, FLOAT64 ...
0
votes
1
answer
88
views
How to reference a CSV column with parentheses and a decimal point in Spark SQL or COALESCE expression?
I’m working on a data ingestion pipeline using Apache Spark (triggered via a Cloud Function on Dataproc).
The input CSV contains column names that include special characters such as parentheses and a ...
0
votes
1
answer
105
views
Google Pub/Sub push subscription to BigQuery changes value in float data type
I have a BigQuery push subscription in Pub/Sub that for some reason changes the values of the fields with float datatype when pushed to BigQuery.
I tried creating a pull subscription and attached it ...
2
votes
1
answer
78
views
Why does a DELETE with a JOIN on partitioned columns in BigQuery cost more than dropping specific partitions?
I have a large BigQuery table, big_table, around 5 TB in size.
It is partitioned by the column partition_date, which has about 2000 distinct values.
I also have a smaller table, small_table, which ...
1
vote
0
answers
40
views
403 Permission error when creating BigQuery link with GA4 via Admin API (v1alpha)
I’m trying to use the Google Analytics Admin API (v1alpha) to link a GA4 property to BigQuery via the properties.bigQueryLinks.create method.
Here is my code using: https://developers.google.com/...
0
votes
0
answers
112
views
How to export BigQuery query results directly to GCS without creating a temporary table (using Java client)
I'm currently aware that I can export a BigQuery query result to Google Cloud Storage (GCS) by first creating a temporary table and then performing an extract table operation on that temp table.
...
0
votes
0
answers
83
views
Wrong REST response for retrieving Column Level Lineage in GCP BigQuery
I have got two BigQuery tables:
bigquery:ssh-test-project-01.SSh_Dataset_03.SSh_BgQ_Src_01 and
bigquery:ssh-test-project-01.SSh_Dataset_03.SSh_BgQ_Dst_BgQ_01.
Using Data Transfer with SQL:
"...
0
votes
1
answer
152
views
Vertex AI Agent Builder/Dialogflow CX: Cannot Set ID Key Property for BigQuery Data Store (UI Issue?) - Search Fails for IDs
I'm trying to connect a BigQuery table containing property listings (originally from an Excel/CSV file) to a Vertex AI Agent Builder / Dialogflow CX agent using a Structured Data Store. My goal is to ...
-2
votes
3
answers
133
views
“SELECT list expression references column X which is neither grouped nor aggregated” when using COUNT(*)? [closed]
I'm trying to run a simple SQL query in BigQuery like this:
SELECT usertype, COUNT(*)
FROM `project.dataset.table`;
But I get an error:
SELECT list expression references column usertype which is ...
1
vote
1
answer
114
views
BigQuery stopped auto-detecting columns for external tables on top of parquet files with custom partitions
Some of our BQ projects create external tables on top of parquet files like this:
CREATE OR REPLACE EXTERNAL TABLE my_dataset my_table
WITH PARTITION COLUMNS (ingestion_date DATE)
OPTIONS (
format ...
0
votes
1
answer
68
views
How to group by geography in BigQuery
I have the following code:
SELECT
h3s.h3id, h3s.geog,
MIN(ST_DISTANCE(`carto-os`.carto.H3_CENTER(htsp.h3id), `carto-os`.carto.H3_CENTER(h3s.h3id)))
OVER (PARTITION BY h3s.h3id)
FROM
...
0
votes
1
answer
85
views
Defining an external table in Dataform
I need to define an external table from a Google Sheet in Dataform - only a few columns from a given sheet should be used.
Some AI tools are saying that 'range' can be used in OPTIONS, but I guess it's ...