I have a hierarchy table:

CREATE TABLE tmp.myTable  
   (  
    [Id] int IDENTITY(1,1) PRIMARY KEY ,      
    [Desc] nvarchar(50) NOT NULL,
    [Lvl] TINYINT NOT NULL,
    [ParentId] int REFERENCES tmp.myTable(Id)
   )

I have to insert the rows one by one, which can be time-consuming on a slow connection, over a VPN, etc.

Let's say we want to insert this row of hierarchy:

   A > AB > ABC > ABCD

These are the statements:

   INSERT INTO tmp.myTable ([Desc],[Lvl],[ParentId]) VALUES('A',1,NULL);

   INSERT INTO tmp.myTable ([Desc],[Lvl],[ParentId])
   SELECT 'AB',2,Id FROM tmp.myTable WHERE [Desc]='A'

   INSERT INTO tmp.myTable ([Desc],[Lvl],[ParentId])
   SELECT 'ABC',3,Id FROM tmp.myTable WHERE [Desc]='AB'

   INSERT INTO tmp.myTable ([Desc],[Lvl],[ParentId])
   SELECT 'ABCD',4,Id FROM tmp.myTable WHERE [Desc]='ABC'

I was wondering if there is a better way to do this.

One idea is to make the Id column a plain INT rather than IDENTITY and control it on the application side, but I don't like that idea: if more than one user tries to insert at the same time, the transaction will fail, or we may even insert the wrong ParentIds.

  • Not clear why you are selecting PT based on Descriptions. You are assuming that Descriptions are 100% UNIQUE (risky). Furthermore, if you are inserting an ITEM, you should know the parent id. Commented Oct 12, 2022 at 22:42
  • I simplified my example, Descriptions are unique at each level, I have constraints for them Commented Oct 12, 2022 at 22:44
  • @JohnCappelletti what do you mean by PT? Commented Oct 12, 2022 at 22:45
  • Then you might as well grab the Parent Level+1 for level. Honestly, having managed hundreds of large hierarchies, assumptions and "rules" are risky. Sooner or later they will need an exception. Commented Oct 12, 2022 at 22:47
  • Simply input the parent id when you insert. Don't rely on the description. Commented Oct 12, 2022 at 23:11
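
One way to follow the "input the parent id" suggestion with the original IDENTITY table, while still sending everything in a single round trip, is to capture each generated Id with SCOPE_IDENTITY() and pass it to the next insert. This is only a sketch under the assumption that the whole script is sent as one batch; the variable @ParentId is a hypothetical name, not something from the original post.

```sql
-- Sketch only: assumes the original tmp.myTable with its IDENTITY Id column.
-- @ParentId is a hypothetical local variable introduced for illustration.
DECLARE @ParentId int;

BEGIN TRANSACTION;

INSERT INTO tmp.myTable ([Desc], [Lvl], [ParentId]) VALUES ('A', 1, NULL);
SET @ParentId = SCOPE_IDENTITY();   -- Id just generated for 'A'

INSERT INTO tmp.myTable ([Desc], [Lvl], [ParentId]) VALUES ('AB', 2, @ParentId);
SET @ParentId = SCOPE_IDENTITY();   -- Id just generated for 'AB'

INSERT INTO tmp.myTable ([Desc], [Lvl], [ParentId]) VALUES ('ABC', 3, @ParentId);
SET @ParentId = SCOPE_IDENTITY();   -- Id just generated for 'ABC'

INSERT INTO tmp.myTable ([Desc], [Lvl], [ParentId]) VALUES ('ABCD', 4, @ParentId);

COMMIT TRANSACTION;
```

Because the parent Id comes from the insert that just ran in the same scope, this does not depend on descriptions being unique, and concurrent sessions do not see each other's SCOPE_IDENTITY() values.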

1 Answer

This is not the only solution, but it is a reasonable method.

In your comments, you say that 'Descriptions are unique at each level, I have constraints for them'. If that is the case, you can set up your table with the Primary Key being (Desc, Lvl) rather than the arbitrary int you have now. This also means that the ParentId column is replaced by ParentDesc.

Then, whenever you insert data, you insert the parent value directly (as @JohnCappelletti wisely, imo, suggests).

For the Foreign Key reference, it needs to reference both fields (Desc and Lvl). It will use ParentDesc, and for current ease of use, a calculated field called ParentLvl (these referring to the parent's Desc and Lvl respectively).

Note that I'm assuming for this task that parents are always one level up (e.g., if a row is level 3, then its parent is level 2). If parents may be multiple levels higher, or the calculated field is too annoying, you could make ParentLvl a normal field that you enter data into.

For example:

CREATE TABLE [myTable](
    [Desc] [nvarchar](50) NOT NULL,
    [Lvl] [tinyint] NOT NULL,
    [ParentDesc] [nvarchar](50) NULL,
    [ParentLvl]  AS (CONVERT([tinyint],[Lvl]-(1))) PERSISTED,
    PRIMARY KEY ([Desc], [Lvl])
    )
GO

ALTER TABLE [myTable]  WITH CHECK ADD  CONSTRAINT [FK_myTable_myTable] FOREIGN KEY([ParentDesc], [ParentLvl])
REFERENCES [myTable] ([Desc], [Lvl])
GO

ALTER TABLE [myTable] CHECK CONSTRAINT [FK_myTable_myTable]
GO
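
If parents can sit more than one level up, as noted above, ParentLvl could be a plain column instead of a computed one. The following is a minimal sketch of that variant, not a tested design: the table name myTableAlt and the constraint names are hypothetical, and the CHECK constraint is my own assumption, added so the two parent columns are either both NULL or both filled in with a level above the row's own.

```sql
-- Sketch only: myTableAlt and all constraint names are hypothetical.
CREATE TABLE [myTableAlt](
    [Desc]       nvarchar(50) NOT NULL,
    [Lvl]        tinyint      NOT NULL,
    [ParentDesc] nvarchar(50) NULL,
    [ParentLvl]  tinyint      NULL,   -- entered by hand, no longer computed
    CONSTRAINT [PK_myTableAlt] PRIMARY KEY ([Desc], [Lvl]),
    CONSTRAINT [FK_myTableAlt_myTableAlt]
        FOREIGN KEY ([ParentDesc], [ParentLvl])
        REFERENCES [myTableAlt] ([Desc], [Lvl]),
    -- Assumption: a root has no parent columns at all; any other row
    -- must name a parent on a strictly lower level number.
    CONSTRAINT [CK_myTableAlt_Parent] CHECK (
        ([ParentDesc] IS NULL AND [ParentLvl] IS NULL)
        OR ([ParentDesc] IS NOT NULL AND [ParentLvl] IS NOT NULL
            AND [ParentLvl] < [Lvl])
    )
);
GO

-- 'ABCD' at level 4 hangs directly off 'A' at level 1.
INSERT INTO myTableAlt ([Desc], [Lvl], [ParentDesc], [ParentLvl])
VALUES ('A', 1, NULL, NULL),
       ('ABCD', 4, 'A', 1);
```

The trade-off is one more value to supply on every insert, in exchange for not hard-coding the "parent is exactly one level up" rule.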

Then to insert data, you can insert with the parent desc directly specified e.g.,

INSERT INTO myTable ([Desc],[Lvl],[ParentDesc]) 
VALUES ('A',1,NULL),
       ('AB',2,'A'),
       ('ABC',3,'AB'),
       ('ABCD',4,'ABC');

This is what the data in the table then looks like.

|Desc   |Lvl    |ParentDesc |ParentLvl  |
|-------|-------|-----------|-----------|
|A      |1      |null       |0          |
|AB     |2      |A          |1          |
|ABC    |3      |AB         |2          |
|ABCD   |4      |ABC        |3          |

See dbfiddle here for a running example.
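
Because the key is now two columns, queries that walk the hierarchy need to join on both of them. As an illustration only, here is a sketch of a recursive CTE over the myTable definition above that builds a path string for each row; the CTE name Tree, the Path column and the ' > ' separator are my own choices, not part of the original schema.

```sql
-- Sketch only: assumes the myTable definition and sample rows shown above.
WITH Tree AS (
    -- Anchor: rows without a parent are the roots.
    SELECT [Desc], [Lvl],
           CAST([Desc] AS nvarchar(4000)) AS [Path]
    FROM myTable
    WHERE [ParentDesc] IS NULL

    UNION ALL

    -- Recursive step: attach each child to its parent on BOTH key columns.
    SELECT c.[Desc], c.[Lvl],
           CAST(t.[Path] + N' > ' + c.[Desc] AS nvarchar(4000))
    FROM myTable AS c
    JOIN Tree AS t
      ON c.[ParentDesc] = t.[Desc]
     AND c.[ParentLvl]  = t.[Lvl]
)
SELECT [Desc], [Lvl], [Path]
FROM Tree
ORDER BY [Lvl];
```

The explicit CAST on both branches keeps the Path column the same type in the anchor and recursive members, which SQL Server requires for recursive CTEs.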

5 Comments

Sorry - my initial version of this had issues with the foreign key constraint. I deleted it temporarily then updated it with SQL script that actually works.
Interestingly, this is the format I first designed the table in; I also had a path string column, e.g. A/AB/ABC/ABCD. I simplified it back to the original adjacency table to keep it small and manageable. The more columns there are, the harder further development gets.
If many rows are inserted at the same time, will the foreign key allow it? What if we insert a value whose parent is in the same batch? Since the parent is not in the table yet, would that be OK?
@Ibo - regarding rows being inserted at the same time, it appears to work OK in testing (for example, in the script above, it goes from no data to 4 rows without issue). Regarding complexity - the insert statement here is basically the same as the table in the original question (3 fields - Desc, Lvl and ParentDesc). Whereas previously you had an automatic ID created, this automatically creates the parentLvl instead. However, it will make your SQL queries a bit more complex as you will need to do joins on 2 fields (desc and level) rather than just the ID.
@Ibo - If there is a main table that these refer to (e.g., has the desc, lvl, plus all the other info about that) - that is what I'd use. Typically, for a table with parent/child relationships, I would make the relationship table use the primary key (e.g., unique identifiers) to reference that source table.
