Text analytics in Power BI

Microsoft is continously adding new AI functionality to the Power Platform. Here is a brief look on how to use the text analytics function to detect the language and overall sentiment in a body of text.

This can be very useful when analysing interactions with customers or suppliers, the mood among team members and company perception among financial analysts or in the media.

This function is part of Azure Cognitive Services so you will need to create an Azure account to try this yourself.

In the example below I am using a table with data about three newspaper articles. Two are in Swedish, one is in Danish. One of the Swedish articles is regarding a win in a horse racing event, the other is about the Coronavirus. The Danish article is about recent acts of terrorism in France.

I have opened the Power Query Editor and marked out the ‘AI insights’ group under the Home tab. Today I am going to use the ‘Text Analytics” and the functions ‘Detect language’ and ‘Score sentiment’.

I am then prompted to sign in to my AI functions Azure account.

As the Score sentiment function is helped by pre-defining the language of a text, we will start with detect language. Simply select the column of the text you want do detect the language for.

After a brief moment (larger tables will take longer time) two new columns are returned. The first shows the detected language and the second the corresponding ISO code for that language code for that language (this column will be used next in the ‘Score sentiment’ function.

Now let’s try the ‘Score sentiment’ function. Choose the text column to analyse. As you can see there is an option to assist the model by providing a pre-defined language classification either as single two letter ISO code or as a column from the table. The second option is of course more useful if there are multiple languages in the table.

After pressing OK the function will return a new column with a score of the sentinment in respective texts. The score range is between 0 and 1, with 1 being the highest (most positive) sentiment. Usually most texts score between 0.4 and 0.6.

The results gave the article about horse racing the highest score (0.5366) and the article about the coronavirus the lowest score (0.4877).

Please note that the Danish article on terrorism is almost as high as the Swedish article on horse racing, whcih could seem counter-intuitive given the differing subject of the articles. I have also seen this pattern when using this function on other texts of different languages, where it seems different languages end up on their own language specific range, which differs from other languages. This it leads me to conclude that the function is not yet suited for comparison across different languages. However, it is very useful for comparison within languages and we can expect it will improve quickly over time.

For a detailed tutorial on the third text analytics function ‘Detect Key Phrases, check out this tutorial from Microsoft.

Leave a Reply Cancel reply