Applying AI to the Production of Music
In 2016, when the successes of computer vision were already visible in many applications, researchers still struggled to produce decent-sounding music using deep learning algorithms.
One reason for that lies in the complexity of music because of its many elements – rhythm, dynamics, melody, harmony, texture and musical form. Whereas in computer vision one can find an abundance of similar images, for example, of animals, faces and buildings, the same is much more difficult to achieve in music.
Despite this, recent advances by OpenAI and Google are encouraging. Google Brain released Music Transformer, an attention-based neural network that uses an event-based representation to generate musical performances.
This year OpenAI not only surprised us with its GPT-2 model but also released MuseNet, a deep neural network that can “generate 4-minute musical compositions with 10 different instruments”. The OpenAI team trained the model on hundreds of thousands of MIDI files.
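MuseNet uses the same general-purpose unsupervised technology as OpenAI’s GPT-2: a large Transformer model trained to predict what comes next in a sequence, whether text or music. To train it, OpenAI drew on the optimized kernels of its Sparse Transformer, an architecture designed for long sequences of text, images or sound.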
Many other tasks, such as language translation, can also be modelled as sequence learning. In fact, translation was the primary focus of the paper that introduced the Transformer model, “Attention Is All You Need”. One of the main advantages of Transformer models over RNN-based models is significantly shorter training times. RNNs process a sequence one step at a time, whereas Transformer models use attention, which lets the encoder and decoder see the entire input sequence at once and so allows training to be parallelized.
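To make the idea of attention more concrete, here is a minimal sketch of scaled dot-product attention, the core operation described in “Attention Is All You Need”. It is written in plain Python for readability rather than speed, and the toy token vectors are made-up numbers. A real Transformer adds learned projection matrices, multiple attention heads and masking, none of which are shown here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each argument holds one vector per sequence position.
    """
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Score this query against every key, i.e. every position at once.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Toy sequence of three 2-dimensional token vectors (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = attention(tokens, tokens, tokens)
```

Note that no output position depends on the result for an earlier position, which is what allows all positions to be computed in parallel, unlike the step-by-step updates of an RNN.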
Text Summarization For Extracting Insights From Textual Big Data
Returning to the discussion of text that can be generated by AI – not all text written by AI needs to be creative. With the modern onslaught of big data, AI can play a helpful role in providing shortened narratives of long documents.
According to the study “The digital universe in 2020” by EMC, the size of “big data” in 2010 was 1.2 zettabytes and is projected to reach 40 zettabytes by 2020. In 2017, IBM estimated that 90% of all data had been created in the preceding two years.
To better cope with understanding and analysing these huge amounts of data, we often resort to using NLP algorithms called text summarizers. Text summarization is an algorithmic procedure for generating accurate and meaningful summaries of documents such as books, emails and newspaper articles.
The first attempts at text summarization date back to the 1950s work of Hans Peter Luhn and the 1960s work of Harold Edmundson. Today, there are two main approaches to text summarization: extractive summarization and abstractive summarization.
Extractive summarization identifies the important sentences and phrases in the text and reproduces them unchanged as part of the summary. Sentences are not modified; the primary focus is on identifying the most important ones.
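As a rough illustration of the extractive approach, the sketch below scores each sentence by the average frequency of its words in the whole document and returns the top-scoring sentences unchanged, in their original order. It is a simplified, frequency-based heuristic loosely in the spirit of early work in the field, not a reproduction of any particular published algorithm. The stop-word list, the naive sentence splitter and the sample document are all assumptions made for this example.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "are",
    "it", "that", "on", "for", "with", "as", "by",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lower-cased words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, unchanged and in original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average document frequency of the sentence's content words.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)


document = (
    "Text summarization shortens long documents. "
    "Extractive summarization selects important sentences from the text. "
    "The weather was pleasant yesterday. "
    "Important sentences contain frequent words from the text."
)
summary = summarize(document, num_sentences=2)
```

On this sample document the off-topic sentence about the weather scores lowest, because its words appear nowhere else, and so it is left out of the summary.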
Abstractive summarization is a more advanced method that tries to interpret the text and generate a new summary; parts of the summary, or even all of it, may not be present in the original text.
However, abstractive summarization algorithms still struggle to achieve satisfactory quality. As a result, most text summarizers in use today rely on extractive summarization.
The March of AI Into the Creative Realm
Although the first AI applications in this area focused on producing analytical content such as automated earnings reports, a growing number of AI tools and models cater to creatives and are increasingly used by artists, often with support from AI programmers.
A recent example of AI entering the art world is the sale by British auction house Christie’s of a portrait generated by Obvious, a team of artists and AI researchers who built on code from AI programmer Robbie Barrat.
An increasing number of companies are developing AI tools specifically for artists. One of them is Nvidia, which recently introduced an AI tool called GauGAN that allows artists, architects, urban planners and landscape designers to turn doodles or rough sketches into highly realistic images. According to Nvidia, the tool acts like a “smart paintbrush” that can fill in rough segmentation maps with details, depending on whether the segment is classified as sand, sky, sea, snow or some other feature.
AI Content Generation in Digital Marketing
Another industry that may gain considerable benefits from AI tools that can produce content of high quality is digital marketing.
As copywriting is one of the highest paid and most creative jobs in digital marketing, one would expect it to be among the last to be affected by AI. However, the first AI tools for copywriting have already arrived, even though their main competitive advantage lies mostly in scale, i.e. the ability to produce a very large number of copy texts in a short amount of time through automation.
The ability of today’s AI to produce outstanding and complete copy for the most demanding customers, with no supervision or intervention from humans, is still limited in most cases.
Good progress in this field has recently been achieved with ‘AI Copywriter’, a tool for generating product copy. It was launched by Alibaba’s digital marketing arm Alimama. According to Alimama, the tool can produce 20,000 lines of copy in a second, showing great scalability. Brands using the tool can modify both the length and tone of their copy, selecting whether they want the tone to be “promotional, functional, fun, poetic or heartwarming.” Alimama’s tool is already used by merchants and marketers in the Alibaba ecosystem around a million times per day.
It is expected that the evolution of AI tools for copywriting will follow a path similar to that in several other industries, with AI first relieving humans of repetitive, mundane, analytical and simpler tasks. By freeing copywriters to focus on the creative side of the copy, such tools may contribute to higher overall quality.
Another important part of the digital marketing industry that is dependent on high-quality content is Search Engine Marketing (SEM).
Rankings on search engines are determined by search engine algorithms, and high-quality content is consistently cited as one of the major ranking factors. Content was noted as important in both the ‘Penguin’ and ‘Hummingbird’ Google updates.
Google and other search engines have “battled” artificially generated content almost since their launch. The main reason was the low quality of this content, which was often produced by so-called content spinners. These tools typically took an original piece of content and generated new content from it through simple transformations, such as replacing words with their synonyms.
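To see why spun content tends to be low in quality, consider a toy version of the synonym-replacement transformation described above. This is only a sketch: the tiny synonym table is hypothetical and hand-made for the example, whereas actual spinners relied on much larger thesauri.

```python
import re

# Hypothetical, hand-made synonym table used only for this illustration.
SYNONYMS = {
    "quick": "fast",
    "buy": "purchase",
    "cheap": "inexpensive",
    "great": "excellent",
}


def spin(text):
    """Replace every known word with its synonym, keeping initial capitals."""

    def swap(match):
        word = match.group(0)
        replacement = SYNONYMS.get(word.lower())
        if replacement is None:
            return word  # unknown words are left untouched
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


original = "Buy cheap shoes with quick delivery."
spun = spin(original)
```

The spun sentence has the same structure and adds no new information, which is one reason such content can be recognized as derivative.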
The improving quality of systems like OpenAI’s GPT-2 may exacerbate these problems for search engines, although text generated by such systems can often still be detected. The MIT-IBM Watson AI Lab and HarvardNLP have already introduced a tool, GLTR, that detects text generated by models such as GPT-2.
A more productive and less problematic path may thus lie in using AI content generation for initial drafts, with humans adding the final touches.
AI Text Writing – Parallels With Neural Machine Translation
We have seen that AI has already enjoyed success as a content generation tool in a wide range of different fields. When applied as a tool for writing text, it excels when the content revolves around interpretation or analysis of structured data, such as the aforementioned quarterly earnings reports.
However, AI often still struggles when used to generate creative text. To explore what may happen when improved AI models are more widely adopted in the creative writing industry, we may turn to another field that has already undergone part of a similar transition: translation services.
Translation was long thought to be impervious to automation due to the complex nature of human language. This, however, did not stop the translation industry from becoming one of the first fields to turn to machines for automation, with the first major applications dating back to the 1960s.
After gradual improvements in machine translation over the subsequent decades, an important breakthrough came with the recent success of so-called neural machine translation (NMT) systems, which use neural networks and vector representations of words in place of earlier statistical methods. NMT systems are already used by Facebook, Microsoft and Google, among others, for their translation products.
Even though NMT systems like those in use at Google and DeepL often achieve human-like performance, even better results on metrics such as BLEU can be achieved by specialized NMT models trained on domain-specific data, for example medical documentation.
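For readers unfamiliar with BLEU, its central ingredient is a clipped (or “modified”) n-gram precision: the share of n-grams in a machine translation that also occur in a human reference, where each n-gram is credited at most as often as it appears in the reference. The sketch below computes only that ingredient for a single reference. The full metric also combines several n-gram orders with a geometric mean and applies a brevity penalty, which are omitted here. The example sentences are invented for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    if not cand_counts:
        return 0.0
    # Credit each candidate n-gram at most as often as the reference contains it.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()

p1 = modified_precision(candidate, reference, 1)  # 5 of 6 unigrams match
p2 = modified_precision(candidate, reference, 2)  # 3 of 5 bigrams match
```

The clipping matters: a degenerate candidate that repeats “the” seven times would score a perfect unigram precision without it, but is credited for only two of its seven words here because the reference contains “the” twice.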
The Emergence of AI Content Quality Editors?
These developments have led to increasing use of NMT systems in translation services, whereby texts are first translated by NMT programs and then checked by a human editor.
Although the quality of AI content production systems is not yet high enough for general use in production environments, it is possible that they will eventually follow a path similar to that seen in translation services, with content first produced by AI models and subsequently checked by human “content quality editors”.
The Current State of Content Production With Artificial Intelligence
Our review of AI content generation shows that the field is vibrant, is developing fast and is already enjoying significant successes in production in many industries.
In the context of generating text, AI content production is gaining ground fast where the content is based on structured data and/or where there is a need for generating large amounts of structured content, at low cost and with short timelines.
AI has more difficulty when confronted with the task of writing complex, creative stories. OpenAI’s GPT-2, while an important step forward, still suffers from problems like repetitive text, world modelling failures and unnatural topic switching.
However, given the rapid advances in artificial intelligence that we have witnessed over the last few years, we should be optimistic about the future abilities of machines to produce content for us in coming years.