0 votes
0 answers
132 views

I'm working with Amazon Redshift Spectrum and trying to follow the principle of least privilege by granting only the necessary access to users or roles. I attempted the following command: GRANT SELECT ...
Jeff A (649)
0 votes
2 answers
151 views

I have data stored in S3 in Parquet format, with multiple columns that are arrays of structs, and also arrays nested inside arrays of structs. Here I am querying one category column which has an array of structs. Sample ...
Nishant Dixit
0 votes
1 answer
85 views

I am trying to extract JSON from a column value in Redshift. The column value is like: [{'IDIndex': '0001', 'History': 4, 'Name': '08-SA-21-C1', 'ActiveFlag': 1, 'Category': 3, 'TotalCount': 0, '...
nodev_101 (109)
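The sample value above uses single-quoted, Python-style dictionaries rather than valid JSON, which Redshift's JSON functions will not parse. A minimal Python sketch (an assumption about one possible fix, not the asker's code) of normalizing such a value before extraction:

```python
import ast
import json

# Python-style literal as shown in the question (shortened here).
raw = "[{'IDIndex': '0001', 'History': 4, 'Name': '08-SA-21-C1', 'ActiveFlag': 1}]"

# Parse the literal, then re-serialize as valid JSON (double-quoted keys
# and strings) that functions like JSON_EXTRACT_PATH_TEXT can handle.
records = ast.literal_eval(raw)
valid_json = json.dumps(records)

print(valid_json)
# [{"IDIndex": "0001", "History": 4, "Name": "08-SA-21-C1", "ActiveFlag": 1}]
```

`ast.literal_eval` only accepts Python literals, so it is a safe way to read this shape without `eval`.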
0 votes
1 answer
215 views

I'm trying to query data through Redshift Spectrum using an external schema from the Glue catalog but encountering an issue with a column that has a timestamp data type. When I run the query SELECT * ...
Jeff A (649)
0 votes
1 answer
344 views

I have very large (1 billion + records) files in S3, that I am querying via Amazon Redshift using Spectrum. I have a datatype in Redshift as follows: map<string,struct<string_value:string,...
Nick Edwards
0 votes
1 answer
49 views

I inserted the JSON: {key1:xxx, key2:xxx, ..., key3:1.0000000123456789, ..., keyn:xxx} into a Super type column in Redshift, and it resulted in the following on Redshift: {key1:xxx, key2:xxx, ..., ...
ezail (23)
-1 votes
1 answer
39 views

I have a table Sales with the following columns:

Emp ID | Activity Date | Sales
1234 | 2024-01-01 | 254.22
1234 | 2024-05-08 | 227.10
5678 | 2023-02-01 | 254.22
5678 | 2024-05-01 | 227.10

I need to find the total ...
Vertika Sharma
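The grouped total being asked for is a plain `GROUP BY` sum; here is a minimal Python analogue of `SELECT emp_id, SUM(sales) ... GROUP BY emp_id` over the sample rows (logic only, `Decimal` keeps the cents exact):

```python
from collections import defaultdict
from decimal import Decimal

# Sample rows from the question: (emp_id, activity_date, sales)
rows = [
    (1234, "2024-01-01", Decimal("254.22")),
    (1234, "2024-05-08", Decimal("227.10")),
    (5678, "2023-02-01", Decimal("254.22")),
    (5678, "2024-05-01", Decimal("227.10")),
]

# Equivalent of: SELECT emp_id, SUM(sales) FROM sales GROUP BY emp_id
totals = defaultdict(Decimal)
for emp_id, _date, sales in rows:
    totals[emp_id] += sales

print(dict(totals))  # {1234: Decimal('481.32'), 5678: Decimal('481.32')}
```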
0 votes
1 answer
175 views

I have an S3 bucket which contains thousands of folders that are basically table names, and those contain parquet files. I'm trying to create tables with that schema in Redshift. I'm using the redshift-data ...
Poreddy Siva Sukumar Reddy US
0 votes
1 answer
187 views

I have a spectrum table with the following schema: TABLE spectrum.table ( realmcode struct < @code: string >, typeid struct < @extension: string, @root: string >, ...
Edoardo De Gaspari
2 votes
1 answer
86 views

I have an external table in Redshift Spectrum. It works fine when I use it in a view. But when I try to query it directly, it gets stuck on Discover attribute {column name}. The query takes ages but ...
abd (81)
0 votes
0 answers
280 views

I had a previous post asking about parsing an array (JSON data) in AWS Athena into rows and columns, which was answered (AWS Athena Parse array of JSON objects to rows), but a new twist was added. We ...
Jeff A (649)
0 votes
0 answers
40 views

Input stream. { "awsRegion": "us-west-2", "eventID": "101", "eventName": "TEST", "userIdentity": null, "...
Anand (1)
0 votes
1 answer
1k views

I have a table in AWS Redshift Spectrum that contains a column called "data". Each cell in "Data" contains an array of JSON objects. A single data cell may look like this (this is ...
Benjamin Bingham
0 votes
1 answer
370 views

I am using the query below to create a materialized view in Redshift: `CREATE MATERIALIZED VIEW test_sch."new_vw" AUTO REFRESH YES AS SELECT approximate_arrival_timestamp, JSON_PARSE(kinesis_data) ...
Anand (1)
0 votes
2 answers
1k views

I have a super field that holds JSON formatted data: { "awsRegion": "us-west-2", "dynamodb": { "ApproximateCreationDateTime": 1712584702997808, "Keys&...
Biswa Patra
0 votes
1 answer
269 views

I am getting the path of the S3 files from an external table using: create or replace view raw_view as select *,"$path" as sourcefilename from raw_external_table WITH NO SCHEMA BINDING; Now I ...
Rikesh Kayastha
0 votes
1 answer
308 views

I am loading around 50 GB of Parquet data into a DataFrame using a Glue ETL job and then trying to load it into a Redshift table, which is taking more than 6-7 hrs and not even completing. datasink=glueContext....
RickyS (23)
0 votes
1 answer
276 views

When I scan the data from S3 using a Glue crawler I get this schema: {id: integer, value: String} This is because Spark writes data back in String type and not varchar type. Although there is a ...
Neelanjoy B
0 votes
0 answers
52 views

My query is like this: unload('select * from table') to 's3://path' credentials '*******' header parallel off delimiter as '\307' The delimiter is the cedilla Ç, and I have to unload it as '\307'. ...
Rikky Bhai (1,018)
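The '\307' in the UNLOAD statement above is an octal escape; a quick Python check confirms that octal 307 is code point 199, i.e. the cedilla Ç used as the delimiter:

```python
# Octal 307 = decimal 199 = U+00C7, the cedilla character.
delimiter = chr(0o307)
print(delimiter)        # Ç
print(oct(ord("Ç")))    # 0o307
```

Note this mapping only holds byte-for-byte in single-byte encodings such as Latin-1; in UTF-8 output the character is two bytes.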
0 votes
0 answers
141 views

Given a table with this schema:

id | name | values
1 | a | [1,2,3]
1 | b | [4,5,6]
1 | c | [x,x,y]

Can I query it to receive this:

id | a | b | c
1 | 1 | 4 | x
1 | 2 | 5 | x
1 | 3 | 6 | y

And then be able to filter, e.g. WHERE c = 'x' or ...
jacksbox (929)
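The reshaping asked about here, one column per name with the arrays zipped position-by-position, can be sketched in Python (logic only, not Redshift syntax):

```python
# Rows mirroring the question's table: (id, name, values)
rows = [
    (1, "a", [1, 2, 3]),
    (1, "b", [4, 5, 6]),
    (1, "c", ["x", "x", "y"]),
]

# Group the arrays by id, then emit one output row per array position,
# with one column per `name`.
by_id = {}
for id_, name, values in rows:
    by_id.setdefault(id_, {})[name] = values

pivoted = []
for id_, cols in by_id.items():
    names = sorted(cols)  # a, b, c
    for items in zip(*(cols[n] for n in names)):
        pivoted.append((id_, *items))

print(pivoted)  # [(1, 1, 4, 'x'), (1, 2, 5, 'x'), (1, 3, 6, 'y')]

# Filtering like WHERE c = 'x' is then a plain comprehension:
print([row for row in pivoted if row[3] == "x"])
```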
0 votes
0 answers
151 views

I'm facing a performance problem with a quite complex query in Redshift, and as a Redshift noob I don't understand it. The query is made of several joins (INNER and LEFT) and doesn't return data in hours. ...
Teuh1975
0 votes
1 answer
284 views

I am trying to create a materialized view via Spectrum from an external table in the Glue Data Catalog: CREATE MATERIALIZED VIEW "dev"."public"."table_name" AS SELECT DISTINCT * ...
jacksbox (929)
0 votes
1 answer
482 views

I have a parquet file generated by ClickHouse. If I use pyarrow to show its schema: import pyarrow.parquet as pq data = pq.read_table('test.pqt') print(data.schema) It shows the schema like this: ...
Rinze (834)
0 votes
3 answers
2k views

Alright. I have a table that has SUPER type fields. These fields hold values like below:

id | mycol
1 | [{"Title":"first"},{"Title":"...
Rick (1,619)
-1 votes
1 answer
3k views

I have a super field that holds JSON formatted data - [{"Title":"First Last"}] I want to extract the JSON value string First Last and to do so, I tried converting this field to ...
Rick (1,619)
0 votes
1 answer
133 views

I want to convert a column with mixed formats like "1 day 07:00:00" and "2 days" into hours. Here's a query that should work in Amazon Redshift: SELECT CASE WHEN ...
Keshav Sahu
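The mixed-format conversion described here can be prototyped outside the warehouse; a small Python sketch of the same CASE-style logic (the parsing rules are an assumption based on the two sample formats):

```python
import re

def to_hours(text: str) -> float:
    """Convert strings like '1 day 07:00:00' or '2 days' to hours."""
    days = 0
    day_part = re.match(r"\s*(\d+)\s+days?\b", text)
    if day_part:
        days = int(day_part.group(1))
        text = text[day_part.end():]
    hours = minutes = seconds = 0
    time_part = re.search(r"(\d+):(\d+):(\d+)", text)
    if time_part:
        hours, minutes, seconds = (int(g) for g in time_part.groups())
    return days * 24 + hours + minutes / 60 + seconds / 3600

print(to_hours("1 day 07:00:00"))  # 31.0
print(to_hours("2 days"))          # 48.0
```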
1 vote
0 answers
144 views

I'm querying redshift spectrum and certain fields are showing up null without any explanation. I've checked SVL_S3LOG SVL_SPECTRUM_SCAN_ERROR SYS_EXTERNAL_QUERY_ERROR And they are all empty. In the ...
RSHAP (2,446)
0 votes
1 answer
944 views

I want to copy some parquet files into AWS Redshift, but the Redshift table schema has fewer columns compared to the parquet files, because those columns contain sensitive information. Therefore, I ...
Rinze (834)
3 votes
1 answer
782 views

According to the list of available monitoring views at the bottom of Monitoring queries and workloads with Amazon Redshift Serverless, sys_external_query_error is not available in Redshift Serverless. ...
Kris Bixler
1 vote
0 answers
150 views

I have 2 tables, let's say MAIN (Redshift) and TEMP (Spectrum), and a simple query that inserts all the data from TEMP into MAIN. But sometimes it may fail and raise an error like this: error: Invalid ...
Vahagn (11)
0 votes
1 answer
1k views

I am working on some github data and would like to pull the first commit email ([email protected]) in the data below. All of the below data is stored as a struct column. "struct<action:string,...
Bob (1)
0 votes
2 answers
257 views

I am currently migrating a PostgreSQL procedure to work in Redshift Spectrum Serverless. I was able to get a working procedure that works as intended in Redshift. However, it is originally used inside ...
viralshah009
0 votes
1 answer
63 views

I have files in s3 I need to read into redshift, but I need to maintain the line number from the file somehow. I tried inserting from a spectrum table into a table with an identity column but it ...
user433342 (1,118)
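An identity column will not preserve file order, because the parallel load assigns values in no particular order. One workaround (an assumption, not a confirmed fix) is to stamp a 1-based line number onto each record before it reaches Redshift:

```python
import io

# Stand-in for the S3 object's content; in practice this would be
# streamed down with boto3 before re-uploading the numbered copy.
source = io.StringIO("alpha\nbeta\ngamma\n")

# Prepend a 1-based line number to every record so the original file
# ordering survives the (unordered) parallel load into Redshift.
numbered = [f"{n}\t{line.rstrip()}" for n, line in enumerate(source, start=1)]
print(numbered)  # ['1\talpha', '2\tbeta', '3\tgamma']
```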
0 votes
1 answer
574 views

I have an S3 bucket with 5 prefixes / "sub-folders", each containing a set of CSV files that were exported from a legacy database. The CSV files have been crawled, creating a Glue database ...
RobD (1,704)
1 vote
2 answers
185 views

I am trying to create a regex that I can use to extract out the values assigned to variable x in the following string: (req.idf=6ca9a AND (req.ster=201 OR req.ster=st_home) AND (req.ste=hi OR req....
Moh (21)
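For key=value pairs in a string like the one above, a word-bounded capture after the key name is usually enough. A Python sketch (the tail of the sample string is completed hypothetically, since the excerpt is truncated):

```python
import re

# Sample from the question; the final clause is a hypothetical completion.
s = "(req.idf=6ca9a AND (req.ster=201 OR req.ster=st_home) AND (req.ste=hi OR req.ste=lo))"

def values_of(key: str, text: str) -> list:
    # \b anchors and an escaped key keep req.ste from also matching req.ster.
    return re.findall(rf"\b{re.escape(key)}=([\w-]+)", text)

print(values_of("req.ster", s))  # ['201', 'st_home']
print(values_of("req.ste", s))   # ['hi', 'lo']
```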
0 votes
1 answer
1k views

I was just trying to run a MERGE statement in Redshift, but I continuously get "ERROR: syntax error at or near "limit" Position: 229", although nowhere in the code have I used this &...
nodev_101 (109)
1 vote
2 answers
1k views

I have a S3 data lake that I can query with Athena. The same data lake is hooked up to Amazon Redshift as well. However when I run queries in Redshift I get insanely longer query times compared to ...
Killerpixler (4,080)
0 votes
1 answer
608 views

I have different Iceberg tables built and updated using Python scripts on Glue. I now need to access them via Redshift Spectrum. From the documentation (and some personal tests) it seems not possible ...
Randomize (9,163)
0 votes
1 answer
927 views

I have this table called results, with two nested arrays in columns students and grades:

class | students | grades
C1 | [S1, S2, S3] | [C, A, B]
C2 | [S3, S4] | [A, B]

I'd like to unnest it to the following: ...
Youssef
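Position-wise unnesting of two parallel arrays is essentially a zip; a minimal Python sketch of the target result (logic only, not Redshift/PartiQL syntax):

```python
# Rows mirroring the `results` table: (class, students, grades)
rows = [
    ("C1", ["S1", "S2", "S3"], ["C", "A", "B"]),
    ("C2", ["S3", "S4"], ["A", "B"]),
]

# Pair the two arrays element-by-element, one output row per position,
# like joining two ordinality-indexed unnests in SQL.
unnested = [
    (cls, student, grade)
    for cls, students, grades in rows
    for student, grade in zip(students, grades)
]

print(unnested)
# [('C1', 'S1', 'C'), ('C1', 'S2', 'A'), ('C1', 'S3', 'B'),
#  ('C2', 'S3', 'A'), ('C2', 'S4', 'B')]
```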
3 votes
1 answer
676 views

We have set up AWS Redshift external table accessing S3 using Spectrum. Due to the huge data amount, we decided to change S3 storage class for files older than 30 days to storage class S3 Glacier Deep ...
Edgars T. (1,149)
0 votes
1 answer
2k views

try: cur.execute(""" copy users from 's3://mybucketxxx/SampleCSVFile_119kb_Copy.csv' credentials 'aws_iam_role=arn:aws:iam::78942xxx:role/redshift-s3-access' delimiter ',' region 'ap-...
Arun Kumar
0 votes
2 answers
619 views

I'm using AWS Redshift Spectrum to query a Hudi table. As we know, filtering data by partition column when querying data in Spectrum could reduce the size of the data scanned by Spectrum and speed up ...
Rinze (834)
0 votes
1 answer
2k views

Based on: https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html I have my schema declared in the following way: create external schema spectrum_schema from data catalog database ...
Vzzarr (5,882)
0 votes
1 answer
339 views

I have a partitioned location on S3 with data I want to read via Redshift External Table, which I create with the SQL statement CREATE EXTERNAL TABLE.... Only thing is that I have some metadata files ...
Vzzarr (5,882)
0 votes
1 answer
257 views

I have an external table in Redshift. When I use UNLOAD to fill this table, sometimes the S3 folder that contains the data gets deleted randomly (or I couldn’t figure out the reason). Here's the ...
ένας (115)
0 votes
1 answer
109 views

We have a query which is performing an aggregation, like: SELECT t.date, COUNT(*) AS rec_count FROM our_schema.log_data t WHERE t.date BETWEEN '2011-01-01' AND '2012-01-01' GROUP BY t.date; I know we ...
bpeikes (3,739)
0 votes
2 answers
947 views

I'm trying to use boto3 redshift-data client to execute transactional SQL for external table (Redshift spectrum) with following statement, ALTER TABLE schema.table ADD IF NOT EXISTS PARTITION(key=...
PolarStorm
0 votes
0 answers
459 views

I'm trying to concatenate an entire row of data to get their checksum output in AWS Redshift external tables to insert them in another external table. Here's a sample of my code (I have much more ...
ένας (115)
1 vote
0 answers
874 views

I am working with Delta Tables and Redshift Spectrum and I noticed strange behaviour. I followed this article to set up a Redshift Spectrum to Delta Lake integration using manifest files and query Delta ...
Antonio La Macchia
0 votes
1 answer
295 views

The AWS doc on the pricing of AWS Redshift Spectrum says that we pay only for the TB scanned. However, I still need to create a Redshift cluster and specify the instance type as well as how many nodes in the ...
user159566
