SQL Server Merge Duplicate Rows and Rows Conversion (with Example)

Optimized requirement: Please query the daily clock in records of each employee for that month by month (as shown in the figure below), and separate the morning and evening using the separator "| |" (Actually, the original requirement required a carriage return, but I couldn't find a good way to do it in the database. After consultation, it was changed to | |. If anyone who has read this article knows how to split it, please let me know.

Table design: (This requirement was proposed in February due to the pandemic, and the initial clock in table design was a flexible and simple table, which led to the generation of complex row and column conversions).

Analysis: According to the data in the table, if an employee clock in and out twice, two pieces of data will be generated. It is necessary to first merge data from the same day, and then convert the date into a header field through row and column conversion.

1. Merge data from the same day (if you don't want to see it, you can directly look at the conversion of 2 rows and columns):

a) Using the for XML path. More detailed usage of for XML path: click to enter.

b) Use stuff to merge the< in the merged column; Xxx> Remove and separate dates with "| |". More detailed usage: click to enter.

My code is as follows:

select pro_dcp_person,(
 ---perform character segmentation 
select stuff((
 --merge duplicate rows 
select CONVERT(varchar(20), pro_dcp_date,24)+'||' from pro_dailypunchcard 
where 1=1 
and CONVERT(varchar(10), pro_dcp_date, 120) = CONVERT(varchar(10),dcp.pro_dcp_date, 120)
and pro_dcp_person = dcp.pro_dcp_person for xml path('')
 --merge ended 
),1,0,'')
 ---character segmentation ended 
)as alldate,CONVERT(varchar(10),dcp.pro_dcp_date, 120)as pro_dcp_date from pro_dailypunchcard dcp 
where 1=1
and CONVERT(varchar(10), pro_dcp_date, 120) in ('2020-04-01','2020-04-02','2020-04-03')
group by pro_dcp_person,CONVERT(varchar(10),dcp.pro_dcp_date, 120)

The query results are as follows:

2. I did not directly succeed in converting the query results as shown in the above figure.

Case when failed code (lazy to add useless max() during review):


select a.*,
case when CONVERT(varchar(10), a.pro_dcp_date, 120) = '2020-04-01' then a.alldate else '/' end as '2020-04-01',
case when CONVERT(varchar(10), a.pro_dcp_date, 120) = '2020-04-02' then a.alldate else '/' end as '2020-04-02',
case when CONVERT(varchar(10), a.pro_dcp_date, 120) = '2020-04-03' then a.alldate else '/' end as '2020-04-03'
from (
select pro_dcp_person,(
select stuff((
select CONVERT(varchar(20), pro_dcp_date,24)+'||' from pro_dailypunchcard 
where 1=1 
and CONVERT(varchar(10), pro_dcp_date, 120) = CONVERT(varchar(10),dcp.pro_dcp_date, 120)
and pro_dcp_person = dcp.pro_dcp_person for xml path('')
),1,0,'')
)as alldate,CONVERT(varchar(10),dcp.pro_dcp_date, 120)as pro_dcp_date from pro_dailypunchcard dcp 



where 1=1
and CONVERT(varchar(10), pro_dcp_date, 120) in ('2020-04-01','2020-04-02','2020-04-03')
group by pro_dcp_person,CONVERT(varchar(10),dcp.pro_dcp_date, 120)


) as a

Case when failed result:

Between April 1st and April 3rd, there should only be one check-in data for 1079 users, but three entries appeared in the case when and the merge was not successful.

To achieve this, data can actually be sent back to the impl layer for data decomposition and merging. I almost gave up on working in the database, and then a senior engineer from the company recommended pivot to me.

Super detailed pivot tutorial: Clicking to enter pivot has a slight restriction, but it is not a problem for more internet based enterprises.

But for me, the problem is not directly using the stored procedures in this tutorial. This article mainly records the situation where data in the table needs to be processed and then converted to columns.

A) Convert processed data into views. .

B) Convert rows and columns in the view. .

Therefore, I need to convert the data I processed in step 1 into a view.

The conversion code is as follows:

create VIEW viw_dailypunchcard 
AS 
select pro_dcp_person,(
select stuff((
select CONVERT(varchar(20), pro_dcp_date,24)+CHAR(13) from pro_dailypunchcard 
where 1=1 
and CONVERT(varchar(10), pro_dcp_date, 120) = CONVERT(varchar(10),dcp.pro_dcp_date, 120)
and pro_dcp_person = dcp.pro_dcp_person for xml path('')
),1,0,'')
)as alldate,CONVERT(varchar(10),dcp.pro_dcp_date, 120)as pro_dcp_date from pro_dailypunchcard dcp 



where 1=1
--and CONVERT(varchar(10), pro_dcp_date, 120) in ('2020-03-01','2020-03-02','2020-03-03')
group by pro_dcp_person,CONVERT(varchar(10),dcp.pro_dcp_date, 120);

Then perform row to column conversion in the stored procedure, and the conversion code is as follows:

DECLARE @dec_val NVARCHAR(MAX)
DECLARE @sql_str NVARCHAR(MAX)
DECLARE @sql_col NVARCHAR(MAX)
DECLARE @tableName SYSNAME --row to row list 
DECLARE @groupColumn SYSNAME --group fields 
DECLARE @row2column SYSNAME --fields of row to column transformation 
DECLARE @row2columnValue SYSNAME --fields with row variable column values 
SET @tableName = 'viw_dailypunchcard'--view name 
SET @groupColumn = 'pro_dcp_person'
SET @row2column = 'pro_dcp_date'
SET @row2columnValue = 'alldate'
 --declare the date to be passed in 
SET @dec_val='''2020-04-01'',''2020-04-02'',''2020-04-03''';
 --retrieve possible columns from row data 
SET @sql_str = N'
SELECT @sql_col_out = ISNULL(@sql_col_out + '','','''') + QUOTENAME(['+@row2column+']) 
FROM ['+@tableName+'] where CONVERT(varchar(10), ['+@row2column+'], 120) in ('+@dec_val+') GROUP BY ['+@row2column+'] ORDER BY ['+@row2column+'] ASC  '
PRINT @sql_str
EXEC sp_executesql @sql_str,N'@sql_col_out NVARCHAR(MAX) OUTPUT',@sql_col_out=@sql_col OUTPUT
PRINT @sql_col
SET @sql_str = N'
select a.*,sysp_use_name from (
SELECT * FROM (
SELECT [pro_dcp_person],[pro_dcp_date],[alldate] FROM [viw_dailypunchcard]) p 
PIVOT 
(MAX([alldate]) FOR [pro_dcp_date] IN ( '+ @sql_col+') ) AS pvt
) as a
 --the self added association table section 
left join (select * from sysp_user where sysp_use_valid = ''1'') u
on u.sysp_use_id = a.[pro_dcp_person]
ORDER BY a.[pro_dcp_person]
'
PRINT (@sql_str)
EXEC (@sql_str)

Output result:

End of row and column conversion.

... it took me almost six hours. In a fit of anger, everyone was ready to do data decomposition processing in the impl layer.

In short, methods are always more difficult than difficulties.