Rounding datetimes to nearest minute or hour in Power Query

One of the most common culprits behind a large data model in Power BI is the presence of columns with a high cardinality. When a column has a lot of unique values (hundreds of thousands, or more), the Vertipaq engine has a hard time compressing the data and allowing for a compact model.

A common scenario for experiencing high cardinality columns is when you’re dealing with event data, where you have precise datetimes down to the second for a series of events. In a single year alone, you may have as many as 31,622,400 unique datetime values (this is the number of seconds in a leap-year), so your cardinalities can easily come an issue for large datasets.

A useful technique for dealing with this is to store the dates and times in separate columns in your model, which reduces you down to 86,766 unique values for a leap year (number of seconds in a day plus number of days in a leap-year). But this won’t be helpful for every scenario. For example, many of the publicly-available custom visuals oriented around process mining require that you supply a datetime field. In these scenarios, I’ve found it best to aggregate your timestamps by rounding to the nearest minute. For many applications, knowing the precise second that an event occurs isn’t as important as knowing what minute or hour it occurred in. Rounding your datetimes where possible can help Vertipaq to compress your model much better.

Rounding to the nearest minute

If you have a report where you could sacrifice a small amount of precision for big improvements in report file size, then you may find the following Power Query function helpful. This function, which I call RoundDateTime in my reports, takes a datetime column and rounds it to the nearest minute…

(DateTimeToRound as datetime, optional NearestMinutes as number) =>
  let
    // Make function round to nearest minute by default
    Rounding = if NearestMinutes is null then 1 else NearestMinutes,

    Source = DateTimeToRound,
    // Convert to number of days since 30/12/1899. Decimal component stores the time component of the datetime
    #"Convert datetime to decimal" = Number.From(Source),

    // Convert number from representing days to representing minutes
    #"Upscale number" = #"Convert datetime to decimal"*(24*60/Rounding),

    #"Round to nearest minute" = Number.Round(#"Upscale number"),
    // Convert back to days since 30/12/1899
    #"Downscale number" = #"Round to nearest minute" / (24*60/Rounding),

    #"Convert back to datetime" = DateTime.From(#"Downscale number")
in

    #"Convert back to datetime"


To use this function, you just need to add a single step to your query…

#"Round to nearest minute"
  =
 Table.TransformColumnTypes(
        Table.TransformColumns(
            #"Changed Type",
            {{"Timestamp", each if _ is null then null else RoundDateTime(_)}}
        )
,
        {{"Timestamp", type datetime}}
    )

The step has two nested transformations within it. The inner transformation applies the custom function, with some extra logic to avoid rows with nulls returning errors. The result is wrapped in a second function that restores the datetime type, which is stripped away when applying the custom function. The two transformations are applied as a single step to make it easier to add to an M query than two separate steps.

Here’s what the function looks like when applied to a column of datetimes…

Rounded to nearest minute highlighted.PNG

Rounding to other timeframes

If rounding to the nearest minute doesn’t reduce your model size enough, then you can use this function to round datetimes even further. By adjusting the NearestMinute parameter, you can round datetimes to the nearest 5 minutes, nearest 15 minutes or nearest hour. For example, to implement rounding to the nearest hour, you’d simply adjust the step where you apply the rounding function to use the expression RoundDateTime(_,60) instead of RoundDateTime(_).

This function can also be adapted to work on time columns rather than just datetimes. To do so, you’d just need to change any references to the datetime type in the function definition to refer to the time type instead.


Sam Fischer

Sam Fischer is a self-described Power BI Guru, and works as a Lead Consultant in Melbourne, Australia. In his role, he manages a Business Intelligence Centre of Excellence with over a dozen developers. Sam is known for his ability to build out novel and scalable Power BI solutions, which drive business value and transform the way organizations think about their data. In addition to his technical expertise, Sam provides project management and strategy consulting services, helping his clients develop and implement data-driven initiatives that achieve their business goals.

Previous
Previous

Top 5 tips to reduce your Power BI data model size

Next
Next

Calculating business hours between activities using DAX