Using AI to understand how media content influences us

Newspaper content can change the way people think or feel, depending on its polarity. That is why it is important to understand the sentiment it conveys. Can machine learning help by predicting that sentiment for us? Our author investigated this question in his bachelor thesis.

Sentiment analysis, or opinion mining, is a way of determining the polarity or strength of the opinion (positive or negative) expressed in written text [1]. Businesses often use such technologies to understand the social sentiment around a brand or a particular product or service, specifically in the following three use cases:

  • Customer classification: sentiment analysis allows customers to be classified according to their emotional mood. This offers the opportunity to find customers who are more willing to buy.
  • Product classification: evaluating how a product is perceived on the market, based on online reviews.
  • Chatbot training: with the results of a sentiment analysis tool, it is possible to train chatbots to recognize and respond to specific customer sentiments.

Because most current applications are highly scalable and automated, such sentiment analysis tools can easily be extended or integrated into an automated system. Behind the scenes, sentiment analysis is a specific application of machine learning, often using text or audio traces as training data.

Sentiment analysis can also be applied to texts such as news articles [2]. In my bachelor thesis [3], I built a machine learning model to perform sentiment analysis on Swiss newspapers. The idea was to create, train, adjust and improve a model so that it can analyze various Swiss newspapers and determine which papers are written with the most negative and the most positive attitude.

Training a Machine Learning Model

To create this model, several state-of-the-art tools were applied (such as TensorFlow, Keras, and ktrain); in particular, a pre-trained language model based on Google’s BERT model [4] was used. Such pre-trained language models have a basic understanding of the language and are then trained on additional datasets for a specific purpose, such as sentiment analysis. Different datasets of existing positive and negative texts were used to train the machine learning system. Each dataset went through a pre-processing phase to prepare the data for training. Figure 1 shows the detailed data processing pipeline used in this project.

Figure 1 System architecture used in this project, involving the machine learning training and data collection.

 

As training data, an existing dataset containing German movie reviews was used (Filmstarts dataset from [5]). Using the pre-trained BERT model and this dataset, an accuracy of 93% was achieved.
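As a reminder of what such an accuracy figure means, the following minimal sketch computes accuracy as the share of test texts whose predicted label matches the true label. The labels below are made-up toy values, not data from the thesis, and the helper name `accuracy` is only illustrative.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy example (hypothetical labels): four of five predictions are correct.
true_labels = ["positive", "negative", "negative", "positive", "negative"]
predicted_labels = ["positive", "negative", "positive", "positive", "negative"]

score = accuracy(true_labels, predicted_labels)
print(f"{score:.0%}")  # prints "80%"
```

An accuracy of 93% therefore means that roughly 93 out of 100 held-out reviews received the correct label.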

Analyzing Swiss Newspapers

Using this machine learning model, newspaper articles from Switzerland were analyzed. The model can be thought of as a function that takes arbitrary text as input, evaluates it, and classifies it as negative or positive. Using two existing web services (APIs) that provide access to news articles, I regularly collected new articles and classified them according to their sentiment. This made it possible to visualize the aggregated sentiment in the Swiss news landscape, sorted by topic or news provider.
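The idea of treating the model as a function and then aggregating its output can be sketched as follows. This is only an illustration under stated assumptions: `classify` is a hypothetical keyword-based stand-in for the trained BERT classifier, and the articles, field names and provider names are invented rather than taken from the news APIs used in the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the trained model: a simple keyword lookup.
NEGATIVE_HINTS = {"crisis", "loss", "death", "decline"}


def classify(text):
    """Placeholder classifier: return 'negative' or 'positive' for a text."""
    words = set(text.lower().split())
    return "negative" if words & NEGATIVE_HINTS else "positive"


def sentiment_share(articles, key):
    """Group articles by `key` and return the percentage per sentiment label."""
    counts = defaultdict(Counter)
    for article in articles:
        counts[article[key]][classify(article["text"])] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {
            label: 100 * counter[label] / total
            for label in ("negative", "positive")
        }
    return shares


# Invented sample articles; the real ones came from two news APIs.
articles = [
    {"provider": "Paper A", "topic": "health", "text": "Hospitals warn of a new crisis"},
    {"provider": "Paper A", "topic": "sport", "text": "Home team wins the cup"},
    {"provider": "Paper B", "topic": "health", "text": "Vaccination campaign makes progress"},
    {"provider": "Paper B", "topic": "business", "text": "Retailer reports heavy loss"},
]

by_topic = sentiment_share(articles, "topic")
by_provider = sentiment_share(articles, "provider")
```

Passing a different `key` (here `"topic"` or `"provider"`) changes the grouping without touching the classifier, which mirrors the two ways the results were sorted.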

The plot in Figure 2 shows the share of positive and negative articles among all the articles I collected. The x-axis shows the two sentiment classes, and the y-axis shows the percentage of articles in each class. Overall, almost 60% of the articles were classified as negative.

Figure 2 Overall sentiment classification of the collected articles.

 

The plot in Figure 3 gives an overview of positivity and negativity of different topics.

Figure 3 Positivity and negativity per news topic.

We observed that topics such as health or science have a majority of negative articles. This is not surprising, since the articles were collected in May 2021, in the middle of the pandemic. The topics business and sport are slightly more positive; however, we can conclude that we are exposed to rather negative news articles when reading the common Swiss newspapers.

Conclusion

Today’s newspapers have the power to shape one’s entire perspective on the world. Their positive or negative sentiment might influence our mood when we read them on a daily basis. Our study found that a majority of the collected Swiss news articles were classified as negative. This insight might encourage future research in social psychology to further investigate the impact this has on society.


References

  1. Cambria, E., Das, D., Bandyopadhyay, S., & Feraco, A. (Eds.). (2017). A practical guide to sentiment analysis (pp. 1-196). Cham, Switzerland: Springer International Publishing.
  2. Balahur, A., Steinberger, R., Kabadjov, M., Zavarella, V., van der Goot, E., Halkia, M., … & Belyaeva, J. (2010, May). Sentiment Analysis in the News. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10).
  3. Giorgio Bakhiet Derias (2021). Sentiment Analysis on Swiss Newspapers. Bachelor Thesis. Bern University of Applied Sciences, Switzerland.
  4. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019, January). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT (1).
  5. Guhr, O., Schumann, A. K., Bahrmann, F., & Böhme, H. J. (2020, May). Training a Broad-Coverage German Sentiment Classification Model for Dialog Systems. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 1627-1632).
Creative Commons Licence

AUTHOR: Giorgio Bakhiet Derias

Giorgio Bakhiet Derias graduated in 2021 from the Bachelor Program in Computer Science at the Bern University of Applied Sciences, with a specialization in Data Engineering. He is now a project management intern at SBB.
