On Wed, Oct 16, 2013 at 09:30:59AM -0600, CS DBA wrote:
- All;
-
- One of our clients is talking about moving to Mongo for their
- reporting/data mart. I suspect the real issue is the architecture of
- their data mart schema, however I don't want to start pushing back if I
- can't back it up.
-
- Anyone have any thoughts on why we would / would not use Mongo for a
- reporting environment.
-
- what are the use cases where mongo is a good fit?
- what are the drawbacks long term?
- is mongo a persistent db or simply a big memory cache?
- does mongo have advantages over Postgres hstore?
- etc...
-
- Thanks in advance...
-
- /Kevin
I work with both.
Mongo doesn't really seem appropriate for a datamart. Mongo supports
Map Reduce and has an Aggregation framework (which will give you a lot
of the functionality of SQL but is much more esoteric).
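To give a feel for what I mean by esoteric, here is a rough sketch of the same report written both ways. The "sales" table/collection and the fields region, amount and year are made up for illustration. The pipeline is just data here; with pymongo you would pass it to collection.aggregate(pipeline).

```python
# Hypothetical report: total sales per region for 2013, biggest first.
# The "sales" table/collection and its fields are assumptions for illustration.

sql = """
SELECT region, SUM(amount) AS total
FROM sales
WHERE year = 2013
GROUP BY region
ORDER BY total DESC
"""

# Roughly the same report as an aggregation pipeline:
# filter, then group and sum, then sort descending.
pipeline = [
    {"$match": {"year": 2013}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]
```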
You need an index for every query you run and every possible sort order.
Mongo will cancel your query if the result set hits a certain size
w/o an index.
Doing ad-hoc queries is HARD, and there are no joins. If it's not in
your document you basically have to pull both documents into your app
and join them by hand.
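A minimal sketch of what "join them by hand" ends up looking like. The two lists stand in for the results of two separate queries, and the field names (customer_id, name, amount) are assumptions for illustration, not anything from a real schema.

```python
# Joining two collections by hand in the application.
# "orders" and "customers" stand in for the results of two separate queries;
# the field names are assumptions for illustration.

def join_orders_to_customers(orders, customers):
    # Build a lookup table on the join key, like the build side of a hash join.
    customers_by_id = {c["_id"]: c for c in customers}
    joined = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is None:
            continue  # inner-join semantics: drop orders with no matching customer
        joined.append({
            "order_id": order["_id"],
            "amount": order["amount"],
            "customer_name": customer["name"],
        })
    return joined

customers = [{"_id": 1, "name": "Acme"}, {"_id": 2, "name": "Globex"}]
orders = [
    {"_id": 100, "customer_id": 1, "amount": 25.0},
    {"_id": 101, "customer_id": 2, "amount": 40.0},
    {"_id": 102, "customer_id": 3, "amount": 15.0},  # no such customer
]
rows = join_orders_to_customers(orders, customers)
```

Every report that needs a join needs some variation of this, and both result sets have to fit in your app's memory.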
Writes block reads. Massive updates (like loads into a datamart) will need to "yield"
to allow reads to happen, and that only happens at a page fault.
You need to have enough memory to store your "working set", or performance tanks.
In a datamart your working set is frequently the whole thing.
People throw around the "Schemaless" thing, but really there is some schema. You
have to know what you want your document to look like. So this means schema changes
as you grow your product, etc.
In a datamart you're not going to use 10gen's ideal schema change methodology
of "Only Apply Data Model Changes when you access a record". That works if you're
looking up a single document at a time, but not if you're mostly doing range scans
and aggregations.
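Here is a rough sketch of why migrate-on-access falls down for scans. The schema_version field, the two document shapes and the in-memory "store" dict are all assumptions for illustration; the dict is just standing in for a collection.

```python
# Sketch of "migrate on access": upgrade a document only when it is read.
# The schema_version field and both document shapes are assumptions.

def upgrade(doc):
    """Return doc in the current (v2) shape, upgrading v1 documents."""
    if doc.get("schema_version", 1) == 1:
        # v1 stored a single "name"; v2 splits it into first/last.
        first, _, last = doc["name"].partition(" ")
        doc = {k: v for k, v in doc.items() if k != "name"}
        doc.update(first_name=first, last_name=last, schema_version=2)
    return doc

def load_one(store, doc_id):
    # Fine for single-document lookups: fix it up on the way out and save it back.
    doc = upgrade(store[doc_id])
    store[doc_id] = doc
    return doc

store = {
    1: {"_id": 1, "name": "Ada Lovelace"},
    2: {"_id": 2, "first_name": "Alan", "last_name": "Turing", "schema_version": 2},
    3: {"_id": 3, "name": "Grace Hopper"},
}

load_one(store, 1)  # document 1 gets upgraded because the app touched it

# A server-side range scan or aggregation on last_name never goes through
# upgrade(), so it only sees documents that already have the new field.
visible = sorted(d["last_name"] for d in store.values() if "last_name" in d)
```

Document 3 was never read by the app, so it still has the old shape and silently drops out of anything that groups or filters on last_name.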
Mongo is very limited in how it can sort. We have a number of "sort fields" added
to our documents that give us a different indexable sort order. For example, you can't
do ORDER BY CASE expressions.
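A minimal sketch of the "sort field" trick, assuming a made-up status field and a made-up status_sort field name. The rank is computed when the document is written, so an ordinary index on the stored field can serve the sort; the sort itself is done in memory here just to show the resulting order.

```python
# Emulating ORDER BY CASE with a precomputed "sort field".
# The status values and the field name "status_sort" are assumptions.

# SQL version, for comparison:
#   ORDER BY CASE status WHEN 'open' THEN 0 WHEN 'pending' THEN 1 ELSE 2 END

STATUS_RANK = {"open": 0, "pending": 1}

def with_sort_field(doc):
    # Compute the rank at write time and store it in the document,
    # so a plain index on status_sort can provide the order.
    out = dict(doc)
    out["status_sort"] = STATUS_RANK.get(doc["status"], 2)
    return out

docs = [with_sort_field(d) for d in [
    {"_id": 1, "status": "closed"},
    {"_id": 2, "status": "open"},
    {"_id": 3, "status": "pending"},
]]

# In Mongo this would be an ordinary sort on the stored field;
# here we sort in memory to show the order it produces.
ordered = sorted(docs, key=lambda d: d["status_sort"])
```

The catch is that every new sort order means another stored field, another index, and a rewrite of the existing documents.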
IMO Mongo, like most NoSQL solutions, addresses write scaling and availability
by making them easier to do. You can generally shard w/o bothering the application
too much and you get free seamless failover with the replica sets.
Hope this is helpful.
Dave