Please refer to the example below. It contains two ways of processing the stream: one with a user-defined function, the other with a class's append method. Both produce exactly the same results no matter how I tune the replay or subscribeTable settings.
This is a bit confusing. I thought that batchSize=0 in subscribeTable means the engine processes a batch of messages as soon as they arrive, so a batch could contain either one message or many. In the function I take volume_[0], the first element of the vector. I would therefore expect to see repeated values, something like 0 1 2 3 3 5 5 5 8 9, but the output is strictly increasing.
So which is it?
Is the input of the function always a vector of length 1?
Can I confirm that the class append method processes per row, the function processes per batch, and the batch size is always 1?
Or are my subscribeTable settings wrong, so that only one row is delivered per batch? If so, how should I correct them?
try{unsubscribeTable(,tableName="input_stream",actionName="tes")}catch(ex){}
try{dropStreamTable("input_stream")}catch(ex){}
try{dropStreamTable("volume_stream")}catch(ex){}
try{dropStreamEngine(`tes)}catch(ex){}
go;
max_number = 999999
t = table(
2020.01.01 00:00:01 + 0..max_number as datetime,
0..max_number as `volume,
take(`apple,max_number+1) as `id
)
// insert into t values(2020.01.01 00:00:00,, )
// t.sortBy!(`datetime)
def tes(volume_,id_){
    return volume_[0], id_[0]
}
class MyCumSum {
    def MyCumSum() {}
    def append(volume_,id_) {
        return volume_, id_
    }
}
share(table=streamTable(1000:0, `datetime`volume`id, [DATETIME,INT,STRING]), sharedName=`input_stream)
share(table=streamTable(1000:0, `datetime`volume`id, [DATETIME,INT,STRING]),sharedName=`volume_stream)
go;
createReactiveStateEngine(
    name="tes",
    metrics=<[datetime, tes(volume,id) as `volume`id]>,
    // metrics=<[datetime, MyCumSum().append(volume,id)]>,
    dummyTable=input_stream,
    outputTable=volume_stream
)
subscribeTable(
    tableName="input_stream",
    actionName="tes",
    handler=getStreamEngine("tes"),
    // batchSize=200000,
    // throttle=0.001,
    reconnect=true,
    msgAsTable=true
)
timing = now()
replay(inputTables=t, outputTables = input_stream, timeColumn=`datetime)
// busy-wait until every replayed row has reached volume_stream
do{}while ((select count(*) from volume_stream)["count"][0] < max_number+1)
timing = now() - timing
print(timing)
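
One way to check how many rows a subscription batch actually carries is a second, purely diagnostic subscription on the same table. The following is only a sketch: the handler name logBatchRows and the action name "probe" are made up for illustration, and a separate subscription is not guaranteed to receive the same batches as the "tes" subscription, so the numbers are indicative at best. It would need to be registered before the replay call.

```
// Hypothetical probe handler: writes the row count of each incoming batch to the server log
def logBatchRows(msg){
    writeLog("input_stream batch rows: " + string(rows(msg)))
}
// Second subscription with the same msgAsTable setting as "tes"; offset=0 starts from the first row
subscribeTable(
    tableName="input_stream",
    actionName="probe",
    offset=0,
    handler=logBatchRows,
    msgAsTable=true
)
```

The counts end up in the server log file rather than in the console. If this probe is used, the cleanup at the top of the script also needs an unsubscribeTable call for actionName="probe" before input_stream can be dropped.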
