I have an Excel spreadsheet connected with an ACE.OLEDB.12.0 connection manager.

I'm having an issue with a column that should contain a date, but there is invalid data in the column.

Most of the entries consist of a date or blank.

The other entries are either a SPACE, "na", or "Dispute".

I found that I can get around the error in the Excel Source's Advanced Editor: on the Input and Output Properties tab, under Output Columns, I select my column and change its DataType to DT_WSTR.

After that change, the SPACE, "na", and "Dispute" values do not get imported.

It's as if the Excel source is acting as a data converter and just dropping the non-date values. I was fully expecting to have to filter out the non-date values in a derived column, but I don't have to.

Does someone know why it does that?

  • It wouldn't silently ignore conversion errors unless you inadvertently set them to be ignored. You haven't provided sample data, so it's hard to try this here. Can I suggest that you prepare a new Excel file where the date column has some invalid date data (space, na, dispute) towards the beginning of the file (the first few rows, which are what SSIS checks to decide whether the column's data type is date), and use that to recreate your data flow? (Describing how to change all downstream components to align with your change from Date to WSTR can be quite long.) Where do these values go? Into a date column? Commented Apr 16 at 23:26
  • It's not silently ignoring the errors. The errors just stop when the column under "Output Columns" is changed to DT_WSTR. The column from the Excel spreadsheet is still set to DT_DATE, even though the column is mixed. I am not using the IMEX or MAXROWSTOSCAN extended properties. I'm not concerned with changing the DataType of the Excel column; I'm interested in the behavior. It seems as though the Excel data type is ignored and the only real issue is the conversion from the Excel DataType to the Output DataType. The behavior seems strange, and I am wondering about the lower-level details of why this happens. Commented Apr 17 at 14:25

1 Answer

Found the answer. It took a while to find, but here is the reason the Excel Source behaves the way it does.

From the Microsoft documentation: https://learn.microsoft.com/en-us/sql/integration-services/load-data-to-from-excel-with-ssis?view=sql-server-ver16

"The Excel driver reads a certain number of rows (by default, eight rows) in the specified source to guess at the data type of each column. When a column appears to contain mixed data types, especially numeric data mixed with text data, the driver decides in favor of the majority data type, and returns null values for cells that contain data of the other type. (In a tie, the numeric type wins.) Most cell formatting options in the Excel worksheet do not seem to affect this data type determination."

This is a very useful feature if you don't want to change the perceived data type of the Excel column.
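
To make the quoted rule concrete, here is a rough Python model of it. This is only a sketch of the documented behavior, not the driver's actual implementation: the function names (`kind`, `guess_type`, `read_column`) and the sample column are made up for illustration, and applying the "numeric type wins" tie rule to dates is an assumption on my part.

```python
from collections import Counter
from datetime import date

SCAN_ROWS = 8  # default number of rows sampled, per the quoted documentation


def kind(cell):
    """Classify a cell as 'date' or 'text'; empty cells return None."""
    if cell is None:
        return None
    return "date" if isinstance(cell, date) else "text"


def guess_type(cells, scan_rows=SCAN_ROWS):
    """Pick the majority type among the first scan_rows non-empty cells."""
    counts = Counter(k for k in map(kind, cells[:scan_rows]) if k)
    if not counts:
        return "text"
    # Assumption: a tie goes to the non-text type, mirroring "the numeric type wins".
    return max(counts, key=lambda k: (counts[k], k != "text"))


def read_column(cells):
    """Return the column as modeled: cells of the minority type become None."""
    col_type = guess_type(cells)
    return [c if kind(c) == col_type else None for c in cells]


# Hypothetical sample resembling the column in the question.
column = [
    date(2024, 1, 5), None, " ", date(2024, 2, 1),
    "na", date(2024, 3, 9), "Dispute", date(2024, 4, 2),
]

print(guess_type(column))
print(read_column(column))
```

In this model the dates are the majority within the sampled rows, so the SPACE, "na", and "Dispute" cells come back as None before any SSIS output column conversion happens, which would match what I saw.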
