Content creation with AI - Part 1

“AI can write just like me… I’ve seen how OpenAI’s GPT2 system can produce a column in my style.” - Guardian journalist Hannah Jane Parkinson

AI content generation overview

In February 2019, OpenAI, a non-profit research company co-founded by Elon Musk and backed by tech investors Reid Hoffman and Peter Thiel, revealed a language modelling program called GPT-2. They trained the system on around 40GB of text scraped from web pages linked in Reddit posts that had received at least three karma points.

Their AI system works by predicting the next word after a given sequence of words. It is so good at this task that it not only obtained high scores on benchmark tests but could also “adapt to the style and content of the conditioning text”, allowing “the user to generate realistic and coherent continuations about a topic of their choosing”.
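To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not GPT-2, which is a large neural network. It is a toy model that only counts which word most often follows another in a tiny made-up corpus. The function names and the corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def continue_text(followers, prompt, max_words=5):
    """Greedily extend the prompt one predicted word at a time."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        nxt = predict_next(followers, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))
print(continue_text(model, "the", max_words=3))
```

A system like GPT-2 conditions on the whole preceding passage rather than on a single word, which is what lets it keep to the style and topic of the input, but the basic loop of predicting a word, appending it and predicting again is the same.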

The OpenAI team was so impressed by the ability of the AI system to generate coherent and flexible text that they took a step that is rare in the open source community: they decided against releasing the complete program, opting instead to release a limited version.

Impressive samples of AI content

In their blog post introducing the system, OpenAI provided several samples of the program at work. On feeding their AI program the following input:

In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English,

the AI system continued the story by producing the following text after 10 attempts:

The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science. Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved. Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. Pérez and the others then ventured further into the valley. “By the time we reached the top of one peak, the water looked blue, with some crystals on top,” said Pérez.

The AI text generator that is too good to be released

After publication, the OpenAI work received wide coverage, with some alarming headlines: “OpenAI built a text generator so good, it’s considered too dangerous to release” (TechCrunch), “AI can write just like me. Brace for the robot apocalypse” (Guardian).

Although the OpenAI system still suffers from problems like excessive repetition and topic changing, the latest breakthroughs show that machine understanding of natural language is rapidly improving.

After Google’s BERT (Bidirectional Encoder Representations from Transformers) made significant improvements on many language tasks in 2018, its performance was bettered this year by several models, such as Facebook’s RoBERTa, Google’s XLNet and Microsoft’s MT-DNN. As these models “hit the ceiling on many existing benchmarks”, a new set of benchmark tasks named SuperGLUE (the successor to GLUE) was recently released. This confirms the rapid advance of AI performance in understanding natural language.

One area where today’s content writing algorithms already excel, and are used in production, is generating content from structured data.

Automated content writing from structured data

The Associated Press raised many eyebrows when they announced in 2014 that they would start publishing automated earnings reports. Quarterly financial statements, with their fixed format, were an ideal candidate for trying out algorithms for content production.
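The reason fixed-format data suits automation can be shown with a short sketch. The code below is not the Associated Press system. It is a minimal template-filling example in Python, and the record fields, the company name and the figures are all hypothetical.

```python
def describe_change(current, previous):
    """Turn two numbers into a phrase such as 'up 12.5%'."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    pct = (current - previous) / abs(previous) * 100
    if pct > 0:
        return f"up {pct:.1f}%"
    if pct < 0:
        return f"down {abs(pct):.1f}%"
    return "unchanged"


def earnings_story(record):
    """Fill a sentence template from one structured earnings record."""
    revenue_phrase = describe_change(record["revenue"], record["revenue_prior"])
    eps = record["eps"]
    expected = record["eps_expected"]
    if eps > expected:
        verdict = "beating"
    elif eps < expected:
        verdict = "missing"
    else:
        verdict = "matching"
    return (
        f"{record['company']} reported {record['quarter']} revenue of "
        f"${record['revenue'] / 1e6:,.1f} million, {revenue_phrase} "
        f"from a year earlier. Earnings per share were ${eps:.2f}, "
        f"{verdict} analyst expectations of ${expected:.2f}."
    )


# Hypothetical record; every value here is made up for illustration.
record = {
    "company": "Example Corp",
    "quarter": "third-quarter",
    "revenue": 112_500_000,
    "revenue_prior": 100_000_000,
    "eps": 1.32,
    "eps_expected": 1.25,
}

story = earnings_story(record)
print(story)
```

Because every record has the same fields, the same function can produce a story for each company that reports, which is how coverage can grow without adding writers. Production systems add many more templates and wording variations than this sketch does.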

The project has been a success, enabling the Associated Press to increase its coverage of earnings releases from around 300 per quarter in 2014 to 4,700 in the first quarter of 2018. This extended coverage was not the only benefit of implementing AI. According to Lisa Gibbs, Director of News Partnerships at the Associated Press, the automated stories are more accurate than the ones previously written by humans. Other financial reporting organizations have followed the example of the Associated Press in recent years, including Reuters and Bloomberg.

The latter uses a system called Cyborg to generate thousands of articles about company earnings reports. Roughly a third of the content that Bloomberg News publishes uses some form of automated technology.

Natural language processing (NLP) algorithms for intelligent narratives have also expanded to other parts of the financial sector. Institutions such as T. Rowe Price, Commerzbank and Credit Suisse are also using them to generate financial reports.

Another important example of automated journalism affecting both the production and consumption of news is sports reporting, as structured data in the form of sports scores lies at the heart of every game. Where sports involve large amounts of data and statistics, such as baseball, data science algorithms can enhance the reporting value of stories beyond what humans could produce in a comparable amount of time.

It is noteworthy that two of the foremost companies in the field of automated content generation, Narrative Science and Automated Insights, started with the goal of providing summaries of games. In 2016, the Associated Press implemented automated game stories to cover Minor League Baseball, which would otherwise have required a staff of hundreds of journalists.

So far, algorithmic writing has mostly been confined to areas of analytical and data-based news and stories. One significant benefit of this approach is relieving reporters of mundane and repetitive tasks and allowing them to spend more time on investigative and in-depth stories with a higher editorial impact.

Using AI to turn text into video

AI isn’t only useful for automatic text creation but is also increasingly and successfully used in turning text into videos.

AI-based Text-To-Video tools typically follow these steps in generating a video from text:

  • first, you need to provide content by entering a link to an article or uploading your content
  • a natural language processing (NLP) algorithm then processes the content to determine the main highlights and keywords, which leads to the generation of storyboards
  • using computer vision and other algorithms, tools then select relevant video and audio content from a media library to match the topics in your story
  • finally, you can customise the video in terms of brand appearance, voiceover, colours, fonts and other features.
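
The middle steps of this pipeline can be sketched in a few lines of Python. This is a simplified illustration under loud assumptions, not how any particular tool works: plain word counts stand in for the NLP step, tag overlap stands in for the computer vision step, and the clip names, tags and article text are hypothetical.

```python
import re
from collections import Counter

# A tiny stopword list, chosen only for this example.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "and", "to", "after", "is", "was"}


def extract_keywords(text, top_n=5):
    """Pick the most frequent non-stopword words as keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def build_storyboard(article, media_library, top_n=5):
    """Create one scene per sentence: its keywords plus the best-matching clip."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]
    storyboard = []
    for sentence in sentences:
        keywords = extract_keywords(sentence, top_n)
        best_clip, best_score = None, 0
        for clip, tags in media_library.items():
            # Score a clip by how many of the scene's keywords appear in its tags.
            score = len(set(keywords) & tags)
            if score > best_score:
                best_clip, best_score = clip, score
        storyboard.append({"text": sentence, "keywords": keywords, "clip": best_clip})
    return storyboard


# Hypothetical media library: clip file names mapped to descriptive tags.
media_library = {
    "baseball_pitch.mp4": {"baseball", "pitch", "game"},
    "crowd_cheering.mp4": {"fans", "stadium", "crowd"},
}

article = (
    "The team won the baseball game in extra innings. "
    "Fans celebrated in the stadium after the game."
)

storyboard = build_storyboard(article, media_library)
for scene in storyboard:
    print(scene["clip"], scene["keywords"])
```

A scene with no matching tags gets no clip at all, which is the point where a real tool would fall back to a default clip or ask the user to choose one.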

Another interesting application of AI for video generation is the Semantic Scene Generation model, developed by a team from the Allen Institute for AI, the University of Illinois Urbana-Champaign and the University of Washington. This model can generate complex scene videos from rich natural language descriptions, as the team demonstrated by generating believable cartoons from unseen captions.

Generating images using artificial intelligence

A lot of the content that we consume every day uses images to convey emotions, tell stories or show us examples of products. Static images are thus another interesting class of content that is being generated by AI algorithms.

In the last year alone, considerable progress has been made in this field. Nvidia researchers combined Generative Adversarial Networks (GANs) and style transfer to develop a new algorithm, named StyleGAN, that generates new, realistic-looking fake faces. Other researchers have successfully applied the same model to generating other types of images, including anime characters and graffiti.

Examples of realistic faces generated by Nvidia's Generative Adversarial Networks. Source: A Style-Based Generator Architecture for Generative Adversarial Networks, Nvidia

Although there are some obvious professions that could benefit from the application of AI in generating artificial images, such as designers, marketers and illustrators, there are also other uses of these technologies, with negative repercussions.

With the ability to generate fake images at scale and to embed them in other images (deep fakes), modern societies will increasingly grapple with the concept of trust, and with how to prevent its erosion in a reality where every picture or even video can be believably generated by a machine.

AI has enjoyed considerable success not only in generating fake faces but also in other computer vision tasks, such as image classification, object detection, semantic segmentation, image captioning and instance segmentation.

It is thus interesting that it has struggled to achieve comparable levels of success when applied to music generation.


This was part 1 of a two-part series. Read about AI-generated music, design and more in Part 2.

In the meantime, why not spin up a machine learning environment on the Dotscience platform and try out this great article on Generative Adversarial Networks (GANs)? You’ll be up and running in minutes!

Written by:

Mark Coleman