Comments on "Just a little Python: MongoDB Pub/Sub with Capped Collections" by Rick Copeland

Victor (2015-10-09 10:49):
No, the sources are for something different. I've restarted the mongodb server and the ids have changed in the middle.

Victor (2015-10-09 10:38):
Looking at the sources (I really should have just read them: https://github.com/mongodb/libbson/blob/master/src/bson/bson-oid.c#L104):

> The bson_oid_t generated by this function is not guaranteed to be globally unique. Only unique within this context. It is however, guaranteed to be sequential.

Victor (2015-10-09 10:25):
Ah yes, you are right about the rollover.

Rick Copeland (2015-10-09 10:20):
With the oplog, the filtering happens on the server (you query the oplog for only the events you're interested in), so it doesn't actually consume more resources than the approach in this blog post.
So you'd do something like db.oplog.rs.find({'ns': 'my-db.messages', 'ts': {'$gt': last_ts_field_I_handled}}).sort('$natural').

For rollover, keep in mind that, while the counter may not be *thread* local, it is certainly *machine* local, and it is not a counter shared with the clients, so we can ignore the clients entirely for this analysis. Now consider two ObjectIds generated in the same second *on the server*: the first one is [current_time, server_machine_id, server_pid, 0xffffff], and the second one is [current_time, server_machine_id, server_pid, 0x000000]. So you get rollover, and order inversion. Now, you may have a slow event stream, which would reduce the likelihood of this, but if 50% of your events occur in the same second as the previous event, this inversion will happen, on average, every 48 mln events, with higher likelihood if you have a faster flow of events. (And the approach in this blog post can generate 1000s of events per second, so you *can* get inversions almost every 28 mln events.)

Victor (2015-10-09 10:08):
> This means that if you tail the oplog you'll have to filter out operations that you don't care about.

That's what I meant. Filtering out also takes resources. And IIRC the oplog is a capped collection with a limited size, so with enough write volume it could fill up with other records.

> It only requires 2 documents in the same second that happen to have the sequence numbers 0xffffff and 0x000000

Rollover doesn't affect the order, because it's not within a second.
An ObjectId is current second + machine id + process id + counter. For object_id_1 > object_id_2 while object_id_1 was generated earlier than object_id_2, the overflow must happen within one second, which is possible only if 16 million ids are generated during the same second (and most inserted documents have ids generated on the clients anyway).

Rick Copeland (2015-10-09 09:49):
For the oplog, I think you have things backwards. Inserts into messages (as into all other collections in your database) cause new operation documents to be inserted into the local.oplog.rs collection. (This means that if you tail the oplog you'll have to filter out operations that you don't care about.)

And order inversion doesn't require 16 mln documents per second. It only requires 2 documents in the same second that happen to have the sequence numbers 0xffffff and 0x000000, which can happen once every 16 mln documents (not unrealistic over the course of weeks or months). And of course, the first rollover will occur at a random time after the counter is initialized, since it's initialized with a random value.

Thanks for the comments!

Victor (2015-10-09 09:42):
The reply button under your comment doesn't work, so I reply here.
I meant that the local oplog.rs records insertions not only into the messages collection, but also into other collections?
Or not?

> which would lead to an order inversion if two events in the same second spanned the rollover

I thought of this, but that would mean over 16 mln documents per second, which is unrealistic. Anyway, I think using server-generated ids suits my case, where ordering within a second is not so important (although on my server the machine identifier and process id are the same for inserted records).

Rick Copeland (2015-10-09 09:30):
Oh, and also regarding the local database and replication: all operations on all your non-local databases are automatically appended to the oplog, so no dedicated database is needed for that approach. (Operations in the oplog specify both the database and collection to which they pertain.)

Rick Copeland (2015-10-09 09:29):
As for server-side ObjectIds being monotonic, there are two possible issues. One is counter rollover: at some point your sequence numbers will contain 0xfffffe, 0xffffff, 0x000000, 0x000001..., which would lead to an order inversion if two events in the same second spanned the rollover. Secondly, I haven't read the MongoDB server code here, but it's certainly possible that a different "machine ID" or "process ID" is used for different threads within the server to eliminate locking, which could lead to consistent order inversion within a second.
The same page you referenced does contain a warning not to assume monotonicity (though it doesn't explicitly state whether server-side monotonicity is guaranteed).

As for the local database, that's a database present on every MongoDB node that is never replicated to the rest of the set. That's why the oplog lives in the local database (replicating the oplog would cause a circular replication dependency). And while using the oplog for pub/sub does require running a replica set (in most cases you should probably be doing this in production anyway), it does not require any secondaries. For instance, in development and testing I run a single-node replica set (no secondaries) for just this purpose.

Victor (2015-10-09 09:12):
With server-side generated object ids no counters are needed, so there is no race condition (as I see it). Also, not every installation is run in master mode and has an oplog.

Victor (2015-10-09 09:09):
I am using this to generate the _id on the server side:

    messages.database.eval('db.messages.insert({})'.format(json.dumps(
        {'data': message, 'channel': channel}, ensure_ascii=False)))

I've seen your code using the oplog, but I found it too complex. Does it have to use a dedicated database if you are reading `local`?

> I would *not*, however, assume that server-generated _id fields are guaranteed to be monotonically increasing.
I would (http://docs.mongodb.org/manual/reference/object-id/): an ObjectId contains "a 3-byte counter, starting with a random value", which, I guess, is incremented for each new object id.

Thanks for the article and your comments!

Rick Copeland (2015-10-09 09:03):
eval() could be used to generate new dates on the server, I suppose, but it has to eval() JavaScript, which carries a speed penalty. It's also not guaranteed that the generated dates are unique (multi-client again), so you'd need a unique index to take that into account.

Tailing the oplog fixes those issues and more, though. A big issue I discovered when using the approach in this blog post is that there is a gap between when the sequence number is generated and when the event is inserted, so you can have a race condition: client A generates a sequence number, client B generates a sequence number, client B inserts its event, a consumer handles client B's event and re-queries the server, updating its internal counter, and then client A inserts its event. In this case, client A's event will never be handled, because the internal 'tail' of the capped collection has already proceeded beyond client A's sequence number.

Additionally, it's much faster to just insert into a collection and count on the oplog being updated by the replication machinery than to perform the 2 separate operations of a) generating a sequence number and b) inserting the event.

Thanks for the comments!

Rick Copeland (2015-10-09 08:53):
Good point on the _id field.
I would *not*, however, assume that server-generated _id fields are guaranteed to be monotonically increasing.

As an aside, I've actually abandoned this approach in favor of using the oplog directly (just create insertion or update events in some collection and get a monotonically increasing "ts" field in the oplog automatically).

Victor (2015-10-09 08:53):
And this allowed me to evaluate an expression on the server side:

    db.eval('db.messages.insert({ts: new Date()})')

Victor (2015-10-09 08:50):
According to the docs (http://docs.mongodb.org/manual/reference/method/db.collection.insert/#id-field):

> Most drivers create an ObjectId and insert the _id field, but the mongod will create and populate the _id if the driver or application does not.

As for server side generated dates, you can do:

Rick Copeland (2015-10-09 08:25):
Neither dates nor ObjectIds are server-side generated in MongoDB, so there's no guarantee that events are consumed in the same order they're produced.

Victor (2015-10-09 01:02):
Why not
use a server-side generated Date or ObjectId?

Rick Copeland (2013-06-14 17:37):
Hi Mike,
Thanks for the comment. In the case of a slow consumer, it is possible to lose data. Blocking the producer is certainly an approach, but in my case I was able to size the capped collection to handle the occasional slow consumer, and in general I have a large number of consumers for each producer.

Anonymous (2013-06-04 10:01):
Rick,
This is a great post, thanks, but it leaves me wondering one thing: what happens in the case of the classic slow consumer? E.g. what if I can insert records into the capped collection faster than I'm able to get them to the consumers, and the capped collection fills up? Won't I simply lose data in that case? Or is there a way I'm not seeing to "block" the producer temporarily?

Thanks again for the post.

Mike

Nick (2013-04-16 21:52):
Yes, you are right: in a multi-client scenario mongo ids can possibly overlap.

Thanks for replying, and for your post.

Regards, Nick Milon

Rick Copeland (2013-04-16 15:45):
Thanks for the comment, Nick!
The reason I'm using a manually generated sequence number is that ObjectIds aren't actually sequence numbers (ObjectIds generated on 2 different clients can come in "out of order", for instance). What I need for this is a way to ensure that I never handle a message twice; an always-increasing sequence number is a fairly inexpensive way to do that.

Thanks again for the comment!

Nick (2013-04-16 15:36):
Nice writing. I have a similar implementation and find it extremely useful, since it saves me the trouble of maintaining one more application (Redis) in the stack; performance is not bad either.
The only thing I missed is the hacking of the "ts" field, although I use 'oplog_replay' when I read from the RS oplog.

And one question: why not use a mongo Id object for the 'ts' field instead of creating your own sequence?