How the industry benefits from machine learning
It is hardly news that machine learning is shaking up industry. But how mature is the technology for widespread use, and where do we stand in its development? Researchers examined these questions at the Data Engineering Day at BFH Technik & Informatik on 7 May. A summary.
The field of machine learning and artificial intelligence (AI) is growing rapidly and offers industry immense opportunities to solve problems. At the same time, the flood of information keeps growing and is becoming harder to navigate. The company Anacode addresses precisely this challenge, aiming to provide a clear overview of progress and developments in the field. CEO Janna Lipenkova shows how her tool is used to identify current trends and topics. Natural language processing (NLP) plays a major role here in analysing and evaluating various text sources.

A particularly important company in NLP is Cortical.io. Its technology allows companies to automate particularly text-intensive tasks and thus save a lot of time. One example is the customer service of an international transport company, which receives hundreds of thousands of emails in different languages from different countries every day. NLP can be used to filter out the ones that need an answer and then to categorise them automatically, giving structure to the information. Pablo Gonzalvez Garcia, Director Product Engineering & Customer Success, emphasises how important it is to consider the security of applications and data when using AI.
In the health sector
As the Covid-19 pandemic has clearly shown, handling data correctly is essential in the health sector. To this end, the Insel Group founded the Insel Data Science Center (IDSC) to enable the collection and processing of data for medical purposes. Machine learning comes into play here, for example to produce diagnoses and analyses that support the work of doctors. Benjamin Ellenberger, data scientist at the IDSC, goes on to show how Inselspital has used data to combat AER outbreaks. The focus was on data visualisation because, as Benjamin Ellenberger himself says: “A good visualization is worth a thousand insights.”
Another industry that relies heavily on data is finance. Investment and private banks in particular can benefit from machine learning and artificial intelligence, but various obstacles must first be overcome. According to Oana Diaconu, Executive Director of Data Science and Analytics at Wall Street, one important point is that the potential of these technologies is not yet fully understood, which may be due to their lack of explainability. Another challenge is the regulation of the banking sector, which explains the caution towards modern technologies. On the consumer protection side of the financial sector, however, the use of these technologies is somewhat more advanced, as Prof. Dr. Stavros Zervoudakis of NYU shows. Algorithms are used there, for example, to detect potentially fraudulent transactions. The biggest challenge is imbalanced data sets of financial transactions, in which fraudulent cases are far rarer than legitimate ones; the analysis and preparation of the data therefore deserve special attention.
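To make the imbalance problem concrete, here is a minimal Python sketch. It is not taken from any of the talks: the labels are synthetic, and the helper names `balanced_class_weights` and `precision_recall` are hypothetical. It shows why plain accuracy is misleading on such data and how inverse-frequency class weights, one common remedy among several, can be derived.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (here: fraudulent) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic labels (assumption): 990 legitimate (0) and 10 fraudulent (1).
y_true = [0] * 990 + [1] * 10

# A useless "model" that never flags anything as fraud.
always_legit = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, always_legit)) / len(y_true)
precision, recall = precision_recall(y_true, always_legit)
weights = balanced_class_weights(y_true)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
print(f"class weights: {weights}")
```

The never-flag baseline scores high on accuracy while catching no fraud at all, which is why recall and precision on the rare class are usually more informative. The weights make each fraudulent example count far more than a legitimate one during training.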
A challenging process
A huge amount of data flows through the network and has to be stored, cleaned and processed before any advantage can be gained from it. An often-cited estimate puts the share of working time that goes into data preparation in machine learning at 80%. Florian Wilhelm, Head of Data Science at inovex, shows how this process can be designed most effectively. Teams must first be clear about what is to be done with the data: Where will it come from? Where will it be stored? What is its scope? Who has access to it? All these questions must be clarified at the outset. Cooperation between the different teams within a project is equally important: it takes a sense of shared responsibility and a common goal, he says. According to Florian Wilhelm, Spotify serves as a good example in this area.
What research will bring
Human search behaviour
The fact that the verb “googeln” (to google) is now listed in the Duden, the standard German dictionary, says enough about how important search engines have become. Machine learning is now used in search in various ways; the best known is probably so-called ‘learning to rank’, in which the machine learns which results best match the query. Voice recognition is also becoming increasingly important, as the share of searches carried out by voice is rising steadily. In 2016, for example, voice already accounted for one fifth of all searches on Google. Hideo Joho, professor at the University of Tsukuba in Japan, conducts research in precisely this field, human search behaviour. He sees the possibility of automating searches in the future, for example with augmented reality glasses. Reading is one example: the glasses read along, recognise words that might be unfamiliar to the user, and look up an explanation in advance so that it can be shown on the display as soon as the word is read.
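The idea behind learning to rank mentioned above can be sketched in a few lines of Python. This is a toy pairwise approach with a linear scoring function and a logistic loss, not the method of any particular search engine. The two features (query-term overlap and freshness), the training pairs and the document names are invented for illustration.

```python
import math


def score(weights, features):
    """Linear relevance score: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, lr=0.1, epochs=200):
    """Learn weights so that the preferred result in each pair scores higher.

    Minimises the pairwise logistic loss log(1 + exp(-w . (better - worse)))
    with plain stochastic gradient descent.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = score(weights, diff)
            # Gradient scale: sigmoid(-margin); it shrinks once the pair is well ordered.
            g = 1.0 / (1.0 + math.exp(margin))
            weights = [w + lr * g * d for w, d in zip(weights, diff)]
    return weights


# Hypothetical features per result: [query-term overlap, freshness].
# Each pair is (result users preferred, result users skipped).
training_pairs = [
    ([0.9, 0.2], [0.3, 0.8]),
    ([0.7, 0.5], [0.2, 0.5]),
    ([0.8, 0.1], [0.4, 0.9]),
]

w = train_pairwise(training_pairs, n_features=2)

candidates = {
    "doc_a": [0.2, 0.9],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.5, 0.5],
}
ranking = sorted(candidates, key=lambda name: score(w, candidates[name]), reverse=True)
print(ranking)
```

In this toy data the preferred result always has the higher term overlap, so the learned weights favour that feature and `doc_b` is ranked first. Real systems use far richer features, much more data and non-linear models.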
Interface between humans and artificial intelligence
The topic of automation is also relevant to artificial intelligence, because there is often talk of “humans versus computers”, as if we were heading for a future in which machines have taken over all our tasks. Werner Greyer, Senior Research Manager AI Interaction at IBM, presents the current state of research on the human-machine interface, quoting Tom Malone: “We have thought far too much about humans versus computers and not nearly enough about humans and computers”. Werner Greyer sees a future in which cooperation between humans and artificial intelligence is the norm. For this to happen, however, he believes two important aspects must improve: trust and transparency. It must become easier to grasp the concepts of machine learning and artificial intelligence and to understand how these models generate predictions. It must also become easier to create such models. One approach is so-called AutoAI systems, which automatically suggest and train models based on a given data set. And, as Tom Malone’s quote suggests, human competences should not be automated away but extended and supplemented by technology.

With this hopeful conclusion, the Data Engineering Day comes to an end. We are pleased with the insights gained and would like to thank all speakers and participants once again.
- A. Ruiz, “The 80/20 data science dilemma,” [Online]. Available: https://www.infoworld.com/article/3228245/the-80-20-data-science-dilemma.html. [Accessed 10 05 2021].
- Google, [Online]. Available: https://www.thinkwithgoogle.com/marketing-strategies/app-and-mobile/voice-search-statistics/. [Accessed 10 05 2021].
- J. Guszcza and J. Schwartz, “Superminds: How humans and machines can work together,” Deloitte Review, no. 24, January 2019.
- Z. et al., “How much Automation do data scientists need?,” 2021.
- D. et al., “Trust in AutoML: exploring information needs for establishing trust in automated machine learning systems,” IUI, 2020.
- W. et al., “AutoDS: Towards Human-Centered Automation of Data Science,” 2021.