Comments on Just a little Python: MongoDB Schema Design at Scale
Blog author: Rick Copeland
23 comments, newest first (feed updated 2026-02-13)

Anonymous (2016-10-06 05:51):
nice

Rick Copeland (2013-09-11 10:22):
If you only want a single metric, the query is straightforward. If you need multiple metrics, you can do a regex query on _id for '^20101010/', which will be reasonably fast. If you're *always* getting the same set of metrics, however, I'd recommend storing them alongside one another in the document instead.

The question of whether to use multiple *collections* or a single collection is pretty much a wash performance-wise. Separating your documents into different collections is only really *necessary* when you have different query patterns (and therefore a need for different indexes, sharding approaches, etc.).

Hope this helps!

Rick Copeland (2013-09-11 10:19):
Well, you can store the three counters in three collections. I don't think there's a good way to make it too much more compact, though; you don't want to "pack" the arrays since that means MongoDB can't do an in-place update, and BSON's not particularly efficient as a storage protocol (compared with relational DBs, that is).

Anonymous (2013-09-10 17:34):
Great post. I have some questions for you. You are using "20101010/metric-1" as an _id, so you are using a single collection for multiple metrics. How do you query that?
Also, bearing in mind that I have multiple devices with multiple metrics, which is better: creating one collection per device and storing multiple metrics in it, or creating one collection for every single metric?

Anonymous (2013-09-09 05:45):
I want to structure my data this way, but I need a counter for each value (daily, hourly, minute). Any idea how to store that efficiently? I first tried this solution, but I want my data to be more compact.

Rick Copeland (2013-01-15 20:51):
Hi Anon,

Monary looks pretty cool; I hadn't heard of it before. I don't think it's really applicable to this case, however, as it's focused on reading a "stripe" of values from many documents into a single numpy array. What I'm trying to do here is to repeatedly update a single document.

Thanks for the comment, though. Monary looks very interesting; I'm going to have to find a place to use it. :-)

Anonymous (2013-01-15 17:48):
What about using an alternative connector like Monary, which relies on NumPy arrays instead of encapsulating results in dictionaries?

https://bitbucket.org/djcbeach/monary/wiki/Home

Unknown (2012-11-07 15:21):
This comment has been removed by the author.

Rick Copeland (2012-10-28 19:56):
Thanks for the comment, Vivek! Glad you found it useful :)

Vivek, http://vyadav.in (2012-10-27 12:58):
Simple and insightful article, thank you Rick.
I'm making schema changes right away :)

Rick Copeland (2012-10-13 10:21):
Hi Sergey,

Thanks for the comment! It turns out that arrays in BSON are actually stored as dicts where the keys are "0", "1", "2", etc. (surprising but true!), so you wouldn't actually get any faster using them. I've brought up this interesting design decision with 10gen folks multiple times, so someday we may have better-performing arrays, but for now they're just documents with a different type code.

Anonymous (2012-10-05 03:45):
I am just curious: if accessing 'dict' keys in a mongo document is slow, is accessing 'list' indexes slow too?

Could we replace
'minute': { '0000': N0, '0001': N1, ... '1439': N1439 }
with
'minute': [N0...N1439]

Rick Copeland (2012-10-02 11:59):
In this case, I simply kept the minute of the day (numbered 0-1439) as the key of the embedded document. You could also use

"23": { "00":..., "59": ... }

... which is probably what I would do in the future.

Anonymous (2012-10-02 10:31):
"23": { ..., "1439": 2819 }
Is that really correct? Shouldn't it be something like:
"23": { ..., "2339": 2819 }
Or have I missed something?

Rick Copeland (2012-09-28 10:57):
Thanks for the comment! I have to admit, the insight on O(N) object traversal is not mine; it was discovered by the 10gen MMS team when optimizing the monitoring service. It was a fascinating and weird enough 2nd order effect that I thought others would be interested as well.

Thanks again!

Jon (2012-09-27 18:46):
Great insights, especially with regard to the O(N) object traversal on updates.

Jon (2012-09-27 18:44):
This comment has been removed by the author.

Rick Copeland (2012-09-27 11:53):
Thanks for the comment! I'm glad you liked it.

Anonymous (2012-09-27 09:16):
Wonderful! Thank you, we need more posts of this kind :)

Rick Copeland (2012-09-26 10:46):
Thanks for the comment!

Caleb (2012-09-26 00:11):
Great post! Implement, measure, adjust.

Rick Copeland (2012-09-25 13:31):
Leo,

Thanks so much for the comment! I'm glad you found the post useful. It's always nice to see real numbers, I think (especially in scatter plots ;-) )

Anonymous (2012-09-25 13:16):
Rick,

Thank you very much for putting this together. I have to say that this is one of the best blog posts I have ever read on MongoDB design and scalability. And it has just convinced me to get rid of growing documents in my models.

Leo
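
Two techniques come up in the thread above: the anchored regex query on _id for fetching every metric of a day, and the nested "hour, then minute" key layout that Rick says he would probably use in the future. A rough Python sketch of both follows. It only builds the filter and update documents and does not talk to a server. The top-level field name `minute`, the helper names, and the choice to zero-pad the hour key are assumptions made for illustration, not something the thread specifies.

```python
import re

# Assumption: the per-minute counters live under a top-level field named
# "minute", as in the 'minute': {...} example quoted in the thread.
MINUTE_FIELD = "minute"


def day_prefix_query(day: str) -> dict:
    """Filter matching every _id of the form '<day>/<metric>'.

    The pattern is anchored with '^', so it is a prefix match on _id.
    """
    return {"_id": {"$regex": "^" + re.escape(day) + "/"}}


def minute_path(minute_of_day: int) -> str:
    """Dotted path for the nested layout: 'minute.<HH>.<MM>'.

    minute_of_day is numbered 0-1439; it is split into a two-digit hour
    key and a two-digit minute-within-hour key.
    """
    if not 0 <= minute_of_day < 1440:
        raise ValueError("minute_of_day must be in 0..1439")
    hour, minute = divmod(minute_of_day, 60)
    return f"{MINUTE_FIELD}.{hour:02d}.{minute:02d}"


def increment_update(minute_of_day: int, amount: int = 1) -> dict:
    """$inc update document that bumps one per-minute counter."""
    return {"$inc": {minute_path(minute_of_day): amount}}


print(day_prefix_query("20101010"))  # {'_id': {'$regex': '^20101010/'}}
print(minute_path(1439))             # minute.23.59
```

With a driver such as PyMongo, these dictionaries would be passed as the filter to `find` and as the update to `update_one`. The last line also shows why minute 1439 belongs under hour "23": under the nested layout it becomes hour 23, minute 59.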