Sunday, April 13, 2014

Book Review: Getting Started with MariaDB


Recently I was given a reviewer's copy of Getting Started with MariaDB from the publisher Packt Pub. The fact that I was given a copy of the book in exchange for a written review has in no way influenced my rating of the book.

Getting Started with MariaDB by Daniel Bartholomew is a quick introduction to MariaDB. At 100 pages in total (including front and back matter), it obviously cannot go deep in depth about things and must refer you out to the official documentation.

I liked the fact that the author took to the time to walk through examples of setting MariaDB up on Windows, Mac and Linux (sections for Debian based, Red Hat based and general). Chapter 5 is the most useful chapter as it is where you get to see working examples of using MariaDB.

While the book is a good introduction I felt that some things were mentioned that someone new to MariaDB wouldn't need to know. Things like the configuration file (my.ini/my.cnf) are mentioned but the author didn't talk about even the basic entries you might want to put in there. He also has a couple pages talking about the need for security at all levels (internet, intranet, building and server) which while is true, I didn't think that it warranted the mention in an introductory book. And sadly enough this is all mentioned before you learn how to perform basic CRUD operations. :(

One other thing that I found interesting is there is only the briefest mentions of indexing which I feel is a more important topic to a beginner than security and maintenance. Granted that in such a short book you can't put everything in, but indexing is a very important topic and is something someone new to databases needs to know about.

Overall this book should come in handy for someone new to MariaDB, and databases in general, that want to learn the very basics of this database server. However if you have any knowledge of databases already (especially MySQL which MariaDB is based off of), you will want to pass on this book.

For those who like to see actual ratings, I give the book a 3 out of 5 stars.

Saturday, April 05, 2014

MongoDB Aggregation for Justin John Mathews - Part 2

Recently on LinkedIn (and as a follow up to a previous post, Justin John Mathews asked a question about being able to return data in a certain format. The input and output records are below. Input
db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 01), endDt: new Date(2014, 01, 20), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 25), endDt: new Date(2014, 02, 10), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 03, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 28), endDt: new Date(2014, 06, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 02, 30), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 05, 25), endDt: new Date(2014, 10, 30), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2013, 10, 01), endDt: new Date(2014, 08, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 09, 01), endDt: new Date(2014, 11, 30), active: false });
Output (this is my interpretation as the format he gave was not a valid document)
{
    "result" : {
        "past" : [
            { "endDt" : ISODate("2014-02-20T07:00:00Z"), "active" : false },
            { "endDt" : ISODate("2014-03-10T06:00:00Z"), "active" : false },
            { "endDt" : ISODate("2014-03-30T06:00:00Z"), "active" : false }
        ],
        "current" : [
            { "endDt" : ISODate("2014-04-10T06:00:00Z"), "active" : true },
            { "endDt" : ISODate("2014-07-10T06:00:00Z"), "active" : true },
            { "endDt" : ISODate("2014-09-10T06:00:00Z"), "active" : true }
        ],
        "upcoming": [
            { "endDt" : ISODate("2014-11-30T07:00:00Z"), "active" : false },
            { "endDt" : ISODate("2014-12-30T07:00:00Z"), "active" : false }
        ]
    }
}

As you can see the dates are broken up into three different categories of past, current and upcoming. After talking with Justin, he would like to see the stDt value as well and he doesn't need the active field any longer. Each category is sorted by the endDt. A course is past if the endDt is less than the current date. It is active if the stDt is less than the current date and the endDt is greater than the current date. It is upcoming if the stDt is greater than the current date.

This is the code that I came up with. It does most of what he's looking for, but no matter what I did, I couldn't get the final grouping to display in the sorted order he was looking for in all categories. I'm sure I'm missing something simple, but when printing out the sorted list before the final grouping everything looks fine, but for some reason it's not getting added to the array in the order that I would imagine it should have been.

db.users.aggregate([
    {
        "$project": {
            "range": {
                "stDt": "$stDt",
                "endDt": "$endDt"
            },
            "_id": 0
        }
    },
    {
        "$group": {
            "_id": null,
            "past": {
                "$addToSet": {
                    "$cond": [
                        {"$lt": ["$range.endDt", new Date()]},
                        "$range",
                        null
                    ]
                }
            },
            "current": {
                "$addToSet": {
                    "$cond": [
                        {
                            "$and": [
                                {"$gt": ["$range.endDt", new Date()]},
                                {"$lt": ["$range.stDt", new Date()]},
                            ]
                        },
                        "$range",
                        null
                    ]
                }
            },
            "upcoming": {
                "$addToSet": {
                    "$cond": [
                        {"$gt": ["$range.stDt", new Date()]},
                        "$range",
                        null
                    ]
                }
            }
        }
    },
    {"$unwind": "$past"},
    {"$unwind": "$current"},
    {"$unwind": "$upcoming"},
    {
        "$match": {
            "past": {"$ne": null},
            "current": {"$ne": null},
            "upcoming": {"$ne": null}
        }
    },
    {
        "$sort": {
            "past.endDt": -1,
            "current.endDt": -1,
            "upcoming.endDt": -1
        }
    },
    {
        "$group": {
            "_id": null,
            "past": {"$addToSet": "$past"},
            "current": {"$addToSet": "$current"},
            "upcoming": {"$addToSet": "$upcoming"}
        }
    },
    {
        "$project": {
            "past": 1,
            "current": 1,
            "upcoming": 1,
            "_id": 0
        }
    },
])
And here is the output you get in MongoDB 2.2.x or 2.4.x.
{
    "result" : [
        {
            "past" : [
                {
                    "stDt" : ISODate("2014-02-25T07:00:00Z"),
                    "endDt" : ISODate("2014-03-10T06:00:00Z")
                },
                {
                    "stDt" : ISODate("2014-02-01T07:00:00Z"),
                    "endDt" : ISODate("2014-02-20T07:00:00Z")
                },
                {
                    "stDt" : ISODate("2014-03-25T06:00:00Z"),
                    "endDt" : ISODate("2014-03-30T06:00:00Z")
                }
            ],
            "current" : [
                {
                    "stDt" : ISODate("2014-03-25T06:00:00Z"),
                    "endDt" : ISODate("2014-04-10T06:00:00Z")
                },
                {
                    "stDt" : ISODate("2014-03-28T06:00:00Z"),
                    "endDt" : ISODate("2014-07-10T06:00:00Z")
                },
                {
                    "stDt" : ISODate("2013-11-01T06:00:00Z"),
                    "endDt" : ISODate("2014-09-10T06:00:00Z")
                }
            ],
            "upcoming" : [
                {
                    "stDt" : ISODate("2014-06-25T06:00:00Z"),
                    "endDt" : ISODate("2014-11-30T07:00:00Z")
                },
                {
                    "stDt" : ISODate("2014-10-01T06:00:00Z"),
                    "endDt" : ISODate("2014-12-30T07:00:00Z")
                }
            ]
        }
    ],
    "ok" : 1
}

This works well for the small test set you gave, but it might not be that great for larger datasets. You'll need to test that before putting this code into production. If nothing else this gives you an idea of how you can do things with the aggregation framework.

Tuesday, April 01, 2014

MongoDB Aggregation Query for Justin John Mathews

Recently on LinkedIn, Justin John Mathews asked a question about being able to return data in a certain format. The input and output records are below. Input
db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 01), endDt: new Date(2014, 01, 20), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 25), endDt: new Date(2014, 02, 10), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 03, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 28), endDt: new Date(2014, 06, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 02, 30), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2014, 05, 25), endDt: new Date(2014, 10, 30), active: false });
db.users.insert({ courseId: 1, stDt: new Date(2013, 10, 01), endDt: new Date(2014, 08, 10), active: true });
db.users.insert({ courseId: 1, stDt: new Date(2014, 09, 01), endDt: new Date(2014, 11, 30), active: false });
Output
{
    "result" : {
        true : [
            { endDt: new Date(2014, 03, 10), active: true },
            { endDt: new Date(2014, 06, 10), active: true },
            { endDt: new Date(2014, 08, 10), active: true },
        ],
        false: [
            { endDt: new Date(2014, 01, 20), active: false },
            { endDt: new Date(2014, 02, 10), active: false },
            { endDt: new Date(2014, 02, 30), active: false },
            { endDt: new Date(2014, 10, 30), active: false },
            { endDt: new Date(2014, 11, 30), active: false }
        ]
    }
}
The code below is one possible solution. I'm not necessarily happy with everything yet, but it does produce the same results as requested, with two small exceptions. First, the code below displays the ISODate instead of the JavaScript date method which I'm assuming is what should have been in the request as you can't return what's shown above. The other small difference is that since I'm using the aggregation framework with MongoDB 2.4.9, the result is a key of result which contains an array of subdocuments as it's value. There were quite a bit of hoops I jumped through to get this solution, and it could probably be cleaned up quite a bit, but I thought I would throw this post together to let him see that it can be done with a single command if you're patient enough to keep tweaking your pipeline. ;)
db.users.aggregate([

    // Project only what we need.
    {
        "$project": {
            "endDt": 1,
            "active": 1,
            "_id": 0
        }
    },

    // Group to true and false buckets. This will give us arrays with null.
    // We'll remove them in a bit.
    {
        "$group": {
            "_id": "$active",
            "true": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$active", true]},
                        "$endDt",
                        null
                    ]
                }
            },
            "false": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$active", false]},
                        "$endDt",
                        null
                    ]
                }
            }
        }
    },

    // We need to unwind our arrays so we can build them back up without the
    // "null" array.
    {"$unwind": "$true"},
    {"$unwind": "$false"},

    // Project out the values. This will give both a true and false key for
    // each item. This builds our arrays up with the proper endDt and
    // active values.
    {
        "$project": {
            "true": {"endDt": "$true", "active": "$_id"}, 
            "false": {"endDt": "$false", "active": "$_id"}
        }
    },

    // Project out a single value to clean up the "issue" a couple steps above.
    {
        "$project": {
            "value": {
                "$cond": [
                    {"$eq": ["$_id", true]},
                    "$true",
                    "$false"
                ]
            }
        }
    },

    // Group things up again to rebuild our arrays.
    // This adds a single "null" entry that will need to be cleaned up.
    {
        "$group": {
            "_id": null,
            "true": {
                "$addToSet": {
                     "$cond": [
                         {"$eq": ["$_id", true]},
                         "$value",
                         null
                     ]
                }
            },
            "false": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$_id", false]},
                        "$value",
                        null
                    ]
                }
            }
        }
    },

    // Unwind our arrays again so we can clean up one more time.
    {"$unwind": "$true"},
    {"$unwind": "$false"},

    // Match only documents where true and false are not null.
    {
        "$match": {
            "true": {"$ne": null},
            "false": {"$ne": null}
        }
    },

    // Sort our items so we can add the to the array in the correct order.
    // I'm not sure why it has to be descending order, but it works.
    {
        "$sort": {
            "true.endDt": -1,
            "false.endDt": -1
        }
    },

    // Group again to build our array.
    {
        "$group": {
            "_id": null,
            "true": {"$addToSet": "$true"},
            "false": {"$addToSet": "$false"}
        }
    },

    // Once again project out just the fields we need
    {
        "$project": {
            "true": 1,
            "false": 1,
            "_id": 0
        }
    }
])
Here are the returned results in MongoDB 2.2.x - 2.4.x:
{
    "result" : [
        {
            "true" : [
                {
                    "endDt" : ISODate("2014-04-10T06:00:00Z"),
                    "active" : true
                },
                {
                    "endDt" : ISODate("2014-07-10T06:00:00Z"),
                    "active" : true
                },
                {
                    "endDt" : ISODate("2014-09-10T06:00:00Z"),
                    "active" : true
                }
            ],
            "false" : [
                {
                    "endDt" : ISODate("2014-02-20T07:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-03-10T06:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-03-30T06:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-11-30T07:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-12-30T07:00:00Z"),
                    "active" : false
                }
            ]
        }
    ],
    "ok" : 1
}
And in the soon to be released version 2.6.0, the results look like the following:
{
    "true" : [
        {
            "endDt" : ISODate("2014-04-10T06:00:00Z"),
            "active" : true
        },
        {
            "endDt" : ISODate("2014-07-10T06:00:00Z"),
            "active" : true
        },
        {
            "endDt" : ISODate("2014-09-10T06:00:00Z"),
            "active" : true
        }
    ],
    "false" : [
        {
            "endDt" : ISODate("2014-02-20T07:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-03-10T06:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-03-30T06:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-11-30T07:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-12-30T07:00:00Z"),
            "active" : false
        }
    ]
}