Read arbitrary trees of arrays of composite types, with specific type handlers for inner elements

For https://github.com/npgsql/efcore.pg/issues/1691, we need a way to efficiently read a column of type `record[]`, whose structure can potentially be different for every query. This is more or less the same problem as unmapped composite types, except that here the plain `record` OID is used.

Currently, a PG `record[]` is returned as a .NET `object[][]`, which is not good enough when some element needs a specific type mapping, for example when using EF.

Note that an item in a record can contain another record, or array of records, of an arbitrary depth.

There are several ways to deserialize this. Here are some options:

1. Some reflection-based approach that takes a `Type` as input and constructs the given object.

```C#
public class MyType {
    public int Thing { get; set; }
    public InnerType[] MoreThings { get; set; }
}

public class InnerType {
    public NpgsqlTimestamp Timestamp { get; set; }
    public int[] IntArray { get; set; }
}
```

```C#
MyType[] records = reader.GetFieldValue<MyType[]>(index);
```

I believe something like this was partly supported in Npgsql 4 (though not for inner types), but it was later dropped.

2. A communication API for moving the deserializer forward, step by step. Something like this:

```C#
// in NpgsqlDataReader:
public void StartDeserializeComplexType(int columnIndex);
public int StartDeserializeArray(); // returns the number of items in the array that follows
public int StartDeserializeRecord(); // returns the number of items in the record that follows
public T DeserializeElement<T>(); // uses the standard type handler for reading the item
```

The last three methods throw if the deserialization state doesn't match what the user expects.

We can then use it like this:

```C#
reader.StartDeserializeComplexType(index);
MyType[] records = new MyType[reader.StartDeserializeArray()];
for (var i = 0; i < records.Length; i++) {
    reader.StartDeserializeRecord();
    MyType record = new MyType();
    record.Thing = reader.DeserializeElement<int>();
    record.MoreThings = new InnerType[reader.StartDeserializeArray()];
    for (var j = 0; j < record.MoreThings.Length; j++) {
        reader.StartDeserializeRecord();
        InnerType inner = new InnerType();
        inner.Timestamp = reader.DeserializeElement<NpgsqlTimestamp>();
        inner.IntArray = reader.DeserializeElement<int[]>(); // can do this the easy way
        record.MoreThings[j] = inner;
    }
    records[i] = record;
}
```

Alternatively, `StartDeserializeComplexType` could return a new object exposing the methods above, to avoid cluttering `NpgsqlDataReader`, at the expense of an extra allocation (although that object could be cached).
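To make the separate-object variant a bit more concrete, here is a rough sketch of what such an object could look like, including the "throw on mismatch" behavior described above. Everything in it is an assumption for illustration: `ComplexTypeCursor`, `ArrayStart` and `RecordStart` do not exist in Npgsql, and the cursor reads from an in-memory token queue rather than from the real column buffer.

```C#
using System;
using System.Collections.Generic;

// Hypothetical header tokens standing in for the array/record headers on the wire.
public sealed class ArrayStart
{
    public int Length { get; }
    public ArrayStart(int length) => Length = length;
}

public sealed class RecordStart
{
    public int FieldCount { get; }
    public RecordStart(int fieldCount) => FieldCount = fieldCount;
}

// Hypothetical object that StartDeserializeComplexType could return.
// Backed by a token queue here; a real one would wrap the reader's buffer.
public sealed class ComplexTypeCursor
{
    private readonly Queue<object> _tokens;

    public ComplexTypeCursor(IEnumerable<object> tokens) =>
        _tokens = new Queue<object>(tokens);

    public int StartDeserializeArray() => Next<ArrayStart>().Length;
    public int StartDeserializeRecord() => Next<RecordStart>().FieldCount;
    public T DeserializeElement<T>() => Next<T>();

    // Consumes the next token only if it has the requested type; otherwise throws
    // and leaves the cursor where it was.
    private T Next<T>()
    {
        if (_tokens.Count == 0)
            throw new InvalidOperationException("No more data in this column.");

        if (_tokens.Peek() is T value)
        {
            _tokens.Dequeue();
            return value;
        }

        var found = _tokens.Peek()?.GetType().Name ?? "null";
        throw new InvalidOperationException(
            $"Expected {typeof(T).Name} but found {found}.");
    }
}
```

With this shape, the loop from the usage example would call the same three methods on the returned cursor instead of on the reader, and a call such as `StartDeserializeArray()` while positioned on a plain element would fail with an `InvalidOperationException` rather than silently misreading data.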

3. A visitor-style callback API, where the user gets a callback whenever we enter or leave a record or an array, and so on.
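As a very rough illustration of what option 3 might look like, the sketch below defines a callback interface and a trivial visitor that only logs the shape of the tree it is shown. The names `IRecordTreeVisitor` and `LoggingVisitor` are made up for this example and are not an existing Npgsql API; the generic `VisitElement<T>` is just one possible way to hand over elements without boxing.

```C#
using System;
using System.Collections.Generic;

// Hypothetical callback interface for option 3; not an existing Npgsql API.
public interface IRecordTreeVisitor
{
    void EnterArray(int length);
    void LeaveArray();
    void EnterRecord(int fieldCount);
    void LeaveRecord();
    void VisitElement<T>(T value); // generic so value types are not boxed
}

// Example visitor that records each callback as a line of text.
public sealed class LoggingVisitor : IRecordTreeVisitor
{
    public List<string> Events { get; } = new List<string>();

    public void EnterArray(int length) => Events.Add($"array[{length}]");
    public void LeaveArray() => Events.Add("end array");
    public void EnterRecord(int fieldCount) => Events.Add($"record({fieldCount})");
    public void LeaveRecord() => Events.Add("end record");
    public void VisitElement<T>(T value) => Events.Add($"{typeof(T).Name}: {value}");
}
```

In this design the driver, not the user, decides the order of calls, so a visitor that builds objects has to keep its own stack of partially constructed records.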

While option 1 is probably the easiest to use for typical users, it has performance drawbacks (due to reflection), and it may be unable to map record items, which are unnamed, to the correct property (since properties are unordered in C# unless they are annotated with an explicit order). Also, if the returned object tree is only needed as an intermediate step, we waste memory.

Option 2 is, in my opinion, a better fit for how EF materialization works, since EF dynamically generates .NET expression trees that are compiled into a real function. The materialization code can simply generate the (verbose) code needed to iterate through the record tree and build whatever entity objects it wants directly, without any boxing and without creating a bunch of new .NET types.

Also note that option 1 could easily be built as a helper function on top of option 2 if we wanted that.
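A minimal sketch of such a helper, under several assumptions: `IComplexTypeReader` is a hypothetical interface exposing the three option 2 methods, `RecordFieldAttribute` is a hypothetical attribute that gives each property its position in the record (to work around the ordering problem mentioned above), and `Point` is just an example target type. NULL handling and caching of the reflection lookups are left out.

```C#
using System;
using System.Linq;
using System.Reflection;

// Hypothetical abstraction over the three option 2 methods.
public interface IComplexTypeReader
{
    int StartDeserializeArray();
    int StartDeserializeRecord();
    T DeserializeElement<T>();
}

// Hypothetical attribute giving a property its position within the record.
[AttributeUsage(AttributeTargets.Property)]
public sealed class RecordFieldAttribute : Attribute
{
    public int Order { get; }
    public RecordFieldAttribute(int order) => Order = order;
}

// Example target type for a record shaped like (x int, y int).
public class Point
{
    [RecordField(1)] public int Y { get; set; }
    [RecordField(0)] public int X { get; set; }
}

public static class RecordMaterializer
{
    private static readonly MethodInfo ElementMethod =
        typeof(IComplexTypeReader).GetMethod(nameof(IComplexTypeReader.DeserializeElement))!;

    // Properties carrying [RecordField], sorted by their declared position.
    public static PropertyInfo[] OrderedProperties(Type type) =>
        type.GetProperties()
            .Where(p => p.IsDefined(typeof(RecordFieldAttribute), inherit: false))
            .OrderBy(p => p.GetCustomAttribute<RecordFieldAttribute>()!.Order)
            .ToArray();

    public static object Read(IComplexTypeReader reader, Type type)
    {
        // Array of annotated classes: walk it item by item.
        if (type.IsArray && OrderedProperties(type.GetElementType()!).Length > 0)
        {
            var elementType = type.GetElementType()!;
            var array = Array.CreateInstance(elementType, reader.StartDeserializeArray());
            for (var i = 0; i < array.Length; i++)
                array.SetValue(Read(reader, elementType), i);
            return array;
        }

        var properties = OrderedProperties(type);
        if (properties.Length == 0)
        {
            // Anything without annotated properties goes through the regular type handler.
            return ElementMethod.MakeGenericMethod(type).Invoke(reader, null)!;
        }

        var fieldCount = reader.StartDeserializeRecord();
        if (fieldCount != properties.Length)
            throw new InvalidOperationException(
                $"Record has {fieldCount} fields but {type.Name} maps {properties.Length}.");

        var instance = Activator.CreateInstance(type)!;
        foreach (var property in properties)
            property.SetValue(instance, Read(reader, property.PropertyType));
        return instance;
    }
}
```

Note that this helper boxes every element and pays for reflection on each call, which is exactly the cost that callers using option 2 directly would avoid.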

Comments?
