Using DAX to Split Delimited Text into Columns

30th October 2014

Imagine that you’ve loaded a table into Power Pivot and it has a column called ‘Full Name’ which contains comma delimited surname and first name values such as ‘Smith,John’. You’d like to split ‘Smith’ and ‘John’ into separate columns.

103014_1142_UsingDAXtoS1

Ideally, we would split these values into separate columns before loading the data into Power Pivot. This can normally be done using Power Query, or with some SQL if the data is being loaded from a SQL database. That said, I’ve seen a handful of scenarios where this wasn’t an option. I’ve seen solutions for splitting text into columns using the MID and FIND DAX functions like this but there is another solution that makes it easier to pick which delimited value to return. This solution uses a combination of theSUBSTITUE and PATHITEM DAX functions and is especially useful when there are more than 2 delimited values.

 

We can use the SUBSTITUE function to swap the comma delimiter with a pipe “|” and then the PATHITEM function to get one of the delimited text values. To get the surname, we can use the following DAX formula:

=PATHITEM(SUBSTITUTE(Person[Full Name], “,”, “|”), 1)

 

Broken down into two steps, let’s see how this solution handles the value ‘Smith,John’:

  1. ​The SUBSTITUE function is changing the value ‘Smith,John’ to ‘Smith|John’.
  2. The PATHITEM function is checking ‘Smith|John’ for the first part of the pipe delimited text, which happens to be ‘Smith’.

103014_1142_UsingDAXtoS2

To get the first name, we simply use the same formula but change the parameter of the PATHITEM function to a ‘2’ so that it gets the second part of the pipe delimited text:

=PATHITEM(SUBSTITUTE(Person[Full Name], “,”, “|”), 2)

103014_1142_UsingDAXtoS3

That’s all there is to it. Now you know a quick and simple way to split out values from a delimited text column.

 

When a Date Value Comparison Will Give You Wrong Results in Power Pivot/SSAS Tabular

7th August 2014

I’ve seen a number of new Power Pivot users get confused by unexpected results when comparing date values that appear to be the same. I’ll illustrate this problem with a simple​ example before providing some possible solutions. ​

080714_1502_Whenadateva1

The above Power Pivot table holds the date that each employee started working for a company and the date that they were trained. The ‘Start Date’ and ‘Training Date’ columns have their data type property set to ‘Date’ in Power Pivot.

 

The following DAX formula was used to created the ‘Trained Same Day’ calculated column:

​=

IF(

Employees[Start Date] = Employees[Training Date],

“Yes”,

“No”

)

 

This formula will return a ‘Yes’ for all the rows where the ‘Start Date’ and ‘Training Date’ are the same. When they are not the same, a ‘No’ is returned. We would expect Employee ID’s 1 to 3 to have a ‘Yes’ in the ‘Trained Same Day’ calculated column but this isn’t the case. Take a brief moment to think about why this calculated column is not behaving as expected.

 

The ‘Date’ data type in Power Pivot can be a bit misleading as it actually stores date values with their time parts. When we change the display format of the two date columns to one that shows the date and time, we can see where things are going wrong: The ‘Training Date’ column contains some dates which match the ‘Start Date’ but have a different time value.

 

Note: It’s important to highlight at this point that when dates are loaded into Power Pivot without a time part they will be given a default time value of ’00:00:00′. This is the case with the ‘Start Date’ column.

080714_1502_Whenadateva2

 

The ‘Trained Same Day’ calculated column is actually returning the correct value when we consider the time values that are also held in the columns. We need to re-write the original DAX formula so that it only considers the date part of the ‘Training Date’ column when comparing the values with the ‘Start Date’. There are many ways that this can be done, but one approach is to use a combination of the DATE, YEAR, MONTH, and DAY functions:

​=

IF(

Employees[Start Date]

=

DATE(

YEAR(Employees[Training Date]),

MONTH(Employees[Training Date]),

DAY(Employees[Training Date])

),

“Yes”,

“No”

)

 

Notice how the new calculated column called ‘Trained Same Day New’ gives us the desired behaviour because the date/time values are first stripped of their time part before the comparison is made.

080714_1502_Whenadateva3

 

Alternatively, a calculated column that strips the time part could be added to the table. For example, a calculated column called ‘Training Date New’ could be defined as:

​=

DATE(

YEAR(Employees[Training Date]),

MONTH(Employees[Training Date]),

DAY(Employees[Training Date])

)

 

080714_1502_Whenadateva4

 

After adding this, references to the ‘Training Date’ column in the original DAX formula can be replaced with references to ‘Training Date New’. However, doing this would add to the size of the model.

 

The ideal solution would be to strip the ‘Training Date’ column of a time value before loading it into Power Pivot, but there could be scenarios where this isn’t feasible. The next time you run into a date value mismatch when comparing date values that appear to be the same, remember to check whether the column holds any values for the time part other than the ’00:00:00′ default.

Using Friendly Week Names in Your Date Table

26th April 2014

Something Isn’t Quite Right…

Take a quick moment to think about why these two Pivot Tables might be different and what the problem is with the week names (Note: both of the Pivot Tables are unfiltered)

042614_1122_Usingfriend1

Pivot Table A

042614_1122_Usingfriend2

Pivot Table B

The weeks in Pivot Table A are Sunday-based (i.e. Sunday through Saturday) whereas the weeks in Pivot Table B are Monday-based (i.e. Monday Through Sunday), but you wouldn’t be able to guess this by just looking at the images, would you?

With regards to what’s wrong with the week names, the truth is, nothing is actually wrong from a logical or performance perspective. The weeks appear in the correct order and each week belongs to a single year. In other words, this is a natural hierarchy and follows the best practices mentioned in this article by Marco Russo. So what’s the issue then? You’ll quickly learn the answer to this by trying to answer the following question: Without drilling down, would you be able to quickly tell the starting date of ‘Week 11 2006’ in either of the Pivot Tables? Your answer would most likely be “no” and that’s the problem. Unsurprisingly, most end-users wouldn’t know the answer to this question either and that’s an even bigger problem!

042614_1122_Usingfriend3
Drilling down is great and everything, but this is one extra step that the end-user has to take to gain a better understanding of what they are looking at. Imagine that the end-user isn’t sure which week contains a particular date. They may end up trying to guess this by drilling down on the most likely weeks until they find it, or they may end up expanding all of the weeks at once, but neither of these actions is ideal.

 

Right, so now that we know what the problem is, let’s move on to a possible solution…

 

Add a Friendly Week Name Column to Your Date Table

Depending on your end-users you will need to create either Sunday-based weeks or Monday-based weeks. In some cases, your target audience may require the option of both. Let’s assume that the latter is true for our examples.​

042614_1122_Usingfriend4

 

A calculated column for ‘Sunday Based Week’ can be defined as:

​=”Week ” & WEEKNUM(‘Date'[Date], 1) & ” ” & ‘Date'[Year]

 

A calculated column for ‘Monday Based Week’ can be defined as:

​=”Week ” & WEEKNUM(‘Date'[Date], 2) & ” ” & ‘Date'[Year]

 

Notice that the difference between these two DAX formulas is the argument used for the second parameter of the WEEKNUM function. A ‘1’ means that the starting date is a Sunday, and a ‘2’ means that the starting date is a Monday. The first parameter must reference a column with the Date data type. You can learn about the WEEKNUM function in an official Microsoft Office help topic here.

 

We also need columns that can be used to sort the ‘Sunday Based Week’ and‘Monday Based Week’ columns into the correct order. These same sorting columns will also be used to sort the friendly versions of the week names.

042614_1122_Usingfriend5

 

A calculated column for ‘Sunday Based Week Number’ can be defined as:

​=VALUE(‘Date'[Year] & RIGHT(“00” & WEEKNUM(‘Date'[Date], 1), 2))

 

A calculated column for ‘Monday Based Week Number’ can be defined as:

​=VALUE(‘Date'[Year] & RIGHT(“00” & WEEKNUM(‘Date'[Date], 2), 2))

 

042614_1122_Usingfriend6

 

042614_1122_Usingfriend7

 

Now let’s move on to our friendly week name columns…

042614_1122_Usingfriend8

 

A calculated column for ‘Friendly Sunday Based Week’ can be defined as:

​=

“Week ”

& WEEKNUM(‘Date'[Date], 1)

& ” (”

& FORMAT(

CALCULATE(

MIN(‘Date'[Date]),

ALLEXCEPT(

‘Date’,

‘Date'[Sunday Based Week]

)

),

“ddd d, MMM yy”

)

& “)”

 

A calculated column for ‘Friendly Monday Based Week’ can be defined as:

​=

“Week ”

& WEEKNUM(‘Date'[Date], 2)

& ” (”

& FORMAT(

CALCULATE(

MIN(‘Date'[Date]),

ALLEXCEPT(

‘Date’,

‘Date'[Monday Based Week]

)

),

“ddd d, MMM yy”

)

& “)”

 

I will outline what’s happening for the ‘Friendly Sunday Based Week’ calculated column but near-identical logic applies to the Monday-based week version. The DAX formula does the following things for each row:

 

  1. Clear all the filters from the filter context, that was derived from the row context, except for on the ‘Sunday Based Week’ column.
  2. Find the first/earliest date within this context.
  3. Format the date as ddd d, MMM yy e.g. ‘Sun 1, Jan 06’.
  4. Determine the week number of the current row’s ‘Date’ column using the WEEKNUM function.
  5. Build the text string ‘Week <Week Number> (<Formatted Date>)’ e.g. ‘Week 1 (Sun 1, Jan 06)’.

 

Note: The exact formatting used for the friendly week names in the examples presented are to illustrate the approach, but it’s likely that you may require much shorter friendly names so that they fit comfortably on your charts. You can use the FORMAT function and custom format strings to get the desired outcome such as ‘Wk1 1-Jan-06’ instead of ‘Week 1 (Sun 1, Jan 06)’. Learn about the FORMAT function and custom format strings here and here. You may have noticed that the week starting date associated with the first week will not always be the Sunday or Monday of that week. This is ok in this scenario because the friendly week name should only reflect the dates that would be implicitly filtered when selecting or slicing the data.

 

These two columns need to be sorted using the same respective sort columns as their ‘non-friendly’ week name counterparts and then we can use them to create friendly week name hierarchies.

Diagram View - Using friendly week names - rs

042614_1122_Usingfriend9 ​​042614_1122_Usingfriend10
Pivot Table A Pivot Table​​ B

Now that we have these friendly week names, you can easily tell why the numbers are different. Without drilling down, would you be able to quickly tell the starting date of ‘Week 11 2006’ in either of the Pivot Tables? The answer is most likely a “yes”. If you needed to look at sales for the whole week containing the date 7th April 2006, would you be able to pick the correct week to drill-down on? The answer is probably “yes” to this question too!

 

Conclusion

Whenever creating calculated columns in our Power Pivot or SSAS Tabular models, if we expect that the column will ever be used as a label in Pivot Tables or charts, we should ensure that the values within it are unambiguous. Doing this will save end-users some time whilst also helping to avoid the confusion that often surrounds ambiguous labels in reports and charts.

 

You Can Download the Workbook

You will need the version of Excel 2013 that comes with Office Professional Plus 2013, Office 365 ProPlus, or the stand-alone version in order to successfully open this workbook. You can download an​ evaluation version of Office 365 ProPlus here.

 

Using_friendly_week_names.zip (Please read our Excel workbook disclaimer notice here)