For npgsql/efcore.pg#1691, we need a way to efficiently read a column of type record[], whose structure can potentially be different for every query. This is more or less the same problem as unmapped composite types, except that here the plain record OID is used.
Currently, a .NET object[][] is returned for a PG record[], which is not good enough when some element has a specific type mapping, for example when using EF.
Note that an item in a record can contain another record, or array of records, of an arbitrary depth.
There are many approaches to deserialize this. Here are some options:
- Some reflection-based approach that takes a Type as input and constructs the given object.
    public class MyType {
        public int Thing { get; set; }
        public InnerType[] MoreThings { get; set; }
    }
    public class InnerType {
        public NpgsqlTimestamp Timestamp { get; set; }
        public int[] IntArray { get; set; }
    }
    MyType[] records = reader.GetFieldValue<MyType[]>(index);
I believe this was more or less supported in Npgsql 4 (though not for inner types), but was later dropped.
- A communication API for moving the deserializer forward, step by step. Something like this:
    // in NpgsqlDataReader:
    public void StartDeserializeComplexType(int columnIndex);
    public int StartDeserializeArray(); // returns the number of items in the array that follows
    public int StartDeserializeRecord(); // returns the number of items in the record that follows
    public T DeserializeElement<T>(); // uses the standard type handler for reading the item
The last three methods throw if the deserialization state doesn't match what the user expects.
We can then use it like this:
    reader.StartDeserializeComplexType(index);
    MyType[] records = new MyType[reader.StartDeserializeArray()];
    for (var i = 0; i < records.Length; i++) {
        reader.StartDeserializeRecord();
        MyType record = new MyType();
        record.Thing = reader.DeserializeElement<int>();
        record.MoreThings = new InnerType[reader.StartDeserializeArray()];
        for (var j = 0; j < record.MoreThings.Length; j++) {
            reader.StartDeserializeRecord();
            InnerType inner = new InnerType();
            inner.Timestamp = reader.DeserializeElement<NpgsqlTimestamp>();
            inner.IntArray = reader.DeserializeElement<int[]>(); // can do this the easy way
            record.MoreThings[j] = inner;
        }
        records[i] = record;
    }
Alternatively, StartDeserializeComplexType could return a new object exposing the methods above, to avoid cluttering NpgsqlDataReader, at the expense of an extra allocation (which could, however, be cached).
- A visitor-style callback API, where the user gets a callback whenever we enter or leave a record or an array, and so on.
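To make the visitor option a bit more concrete, here is a rough sketch of what such a callback API could look like. Every name in it (IRecordVisitor, RecordWalker, the Node types, TraceVisitor) is hypothetical and not part of Npgsql. The in-memory node tree is only a stand-in for the wire data so that the sketch runs on its own; a real version would drive the callbacks while decoding the buffer, and would need typed value callbacks to avoid the boxing used here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical callback interface; none of these names exist in Npgsql.
public interface IRecordVisitor
{
    void EnterArray(int length);
    void LeaveArray();
    void EnterRecord(int fieldCount);
    void LeaveRecord();
    void Value(object value); // boxed for brevity only
}

// In-memory stand-in for the decoded wire data (assumption made for this sketch).
public abstract record Node;
public sealed record ArrayNode(IReadOnlyList<Node> Items) : Node;
public sealed record RecordNode(IReadOnlyList<Node> Fields) : Node;
public sealed record ValueNode(object Value) : Node;

public static class RecordWalker
{
    // Walks the tree depth-first and reports every enter/leave/value event.
    public static void Walk(Node node, IRecordVisitor visitor)
    {
        switch (node)
        {
            case ArrayNode array:
                visitor.EnterArray(array.Items.Count);
                foreach (var item in array.Items) Walk(item, visitor);
                visitor.LeaveArray();
                break;
            case RecordNode rec:
                visitor.EnterRecord(rec.Fields.Count);
                foreach (var field in rec.Fields) Walk(field, visitor);
                visitor.LeaveRecord();
                break;
            case ValueNode leaf:
                visitor.Value(leaf.Value);
                break;
            default:
                throw new ArgumentException("Unknown node type", nameof(node));
        }
    }
}

// Example visitor that simply logs the events it receives.
public sealed class TraceVisitor : IRecordVisitor
{
    public List<string> Events { get; } = new();
    public void EnterArray(int length) => Events.Add($"array({length})");
    public void LeaveArray() => Events.Add("/array");
    public void EnterRecord(int fieldCount) => Events.Add($"record({fieldCount})");
    public void LeaveRecord() => Events.Add("/record");
    public void Value(object value) => Events.Add($"value:{value}");
}
```

With this shape the consumer has to keep its own stack to know where in the tree it is, which is the main ergonomic difference from the step-by-step API in the previous option.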
While option 1 is probably the easiest to use for normal users, it has performance drawbacks (due to reflection), and it may be unable to map record items, which are unnamed, to the correct property (since properties in C# have no guaranteed order unless they are annotated with one). Also, if the returned object tree is only for intermediate use, we waste memory.
Option 2 is, in my opinion, a better fit for how EF materialization works, since EF dynamically generates .NET Expressions which are then compiled into a real function. The materialization code can simply generate the (verbose) code needed to iterate through the record tree and build whatever entity objects it wants directly, without any boxing and without creating a bunch of new .NET Types.
Also note that option 1 can easily be built as a helper function on top of option 2, if we want that.
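As a rough illustration of that last point, here is a minimal sketch of a reflection-based helper layered on the step-by-step API. It assumes the three methods live on a separate object (called IComplexTypeReader here), and it assumes a RecordFieldAttribute to pin down property order. Both are hypothetical, as are RecordMaterializer and the QueueReader test double. The sample types use a string where the earlier example had NpgsqlTimestamp, only to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical shape of the option 2 API; not part of Npgsql.
public interface IComplexTypeReader
{
    int StartDeserializeArray();
    int StartDeserializeRecord();
    T DeserializeElement<T>();
}

// Hypothetical attribute that fixes the field order of a mapped class.
[AttributeUsage(AttributeTargets.Property)]
public sealed class RecordFieldAttribute : Attribute
{
    public RecordFieldAttribute(int index) => Index = index;
    public int Index { get; }
}

public static class RecordMaterializer
{
    static readonly MethodInfo ElementMethod =
        typeof(IComplexTypeReader).GetMethod(nameof(IComplexTypeReader.DeserializeElement))!;

    // Reads an array of records into T[], one annotated property per record field.
    public static T[] ReadArray<T>(IComplexTypeReader reader) where T : new()
        => (T[])ReadArray(reader, typeof(T));

    static Array ReadArray(IComplexTypeReader reader, Type elementType)
    {
        var result = Array.CreateInstance(elementType, reader.StartDeserializeArray());
        for (var i = 0; i < result.Length; i++)
            result.SetValue(ReadRecord(reader, elementType), i);
        return result;
    }

    static object ReadRecord(IComplexTypeReader reader, Type type)
    {
        var props = OrderedProperties(type);
        var count = reader.StartDeserializeRecord();
        if (count != props.Length)
            throw new InvalidOperationException(
                $"Record has {count} fields but {type.Name} maps {props.Length}.");
        var instance = Activator.CreateInstance(type)!;
        foreach (var prop in props)
            prop.SetValue(instance, ReadValue(reader, prop.PropertyType));
        return instance;
    }

    static object ReadValue(IComplexTypeReader reader, Type type)
    {
        // An array whose element type has [RecordField] properties is treated as record[];
        // everything else goes through the regular element reader.
        if (type.IsArray && OrderedProperties(type.GetElementType()!).Length > 0)
            return ReadArray(reader, type.GetElementType()!);
        return ElementMethod.MakeGenericMethod(type).Invoke(reader, null)!;
    }

    static PropertyInfo[] OrderedProperties(Type type) => type.GetProperties()
        .Where(p => p.GetCustomAttribute<RecordFieldAttribute>() != null)
        .OrderBy(p => p.GetCustomAttribute<RecordFieldAttribute>()!.Index)
        .ToArray();
}

// Test double: replays counts and values in the order the reader would deliver them.
public sealed class QueueReader : IComplexTypeReader
{
    readonly Queue<object> _items;
    public QueueReader(params object[] items) => _items = new Queue<object>(items);
    public int StartDeserializeArray() => (int)_items.Dequeue();
    public int StartDeserializeRecord() => (int)_items.Dequeue();
    public T DeserializeElement<T>() => (T)_items.Dequeue();
}

public class MyType
{
    [RecordField(0)] public int Thing { get; set; }
    [RecordField(1)] public InnerType[] MoreThings { get; set; }
}

public class InnerType
{
    [RecordField(0)] public string Label { get; set; }
    [RecordField(1)] public int[] IntArray { get; set; }
}
```

Calling RecordMaterializer.ReadArray<MyType>(reader) then plays the role of GetFieldValue<MyType[]> from option 1. The helper still pays the reflection cost on every call, so it would be a convenience layer only; EF would presumably keep generating code against the low-level methods directly.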
Comments?