Quantitative Characteristics of Human-Written Short Stories as a Metric for Automated Storytelling



Carlos León¹ · Pablo Gervás¹ · Pablo Delatorre² · Alan Tapscott¹

Received: 21 March 2020 / Accepted: 29 September 2020
© The Author(s) 2020

Abstract

Evaluating the extent to which computer-produced stories are structured like human-invented narratives can be an important component of assessing the quality of a story plot. In this paper, we report on an empirical experiment in which human subjects invented short plots in a constrained scenario. The stories were annotated according to features commonly found in existing automatic story generators. The annotation was designed to measure the proportion and relations of story components that automatic computational systems should use to match human behaviour. Results suggest that there are relatively common patterns that can be used as input data for identifying similarity to human-invented stories in automatic storytelling systems. The patterns found are in line with narratological models, and the results provide numerical quantification and layout of story components. The proposed method of story analysis is tested on two additional sources, the ROCStories corpus and stories generated by automated storytellers, to illustrate the valuable insights that may be derived from them.

Keywords: Empirical quantification · Narrative features · Story components · Story metrics · Story evaluation

New Generation Computing

This work has been supported by the CANTOR project (PID2019-108927RB-I00) funded by the Spanish Ministry of Science and Innovation; by the project FEI INVITAR-IA (FEI-EU-17-23) of the University Complutense of Madrid; and by the ComunicArte project (PR2005-174/01) by BBVA Foundation Grants-Scientific Research Groups 2017.

* Carlos León [email protected]
Extended author information available on the last page of the article

Introduction

Creating story generation systems is a complex task. The number of features that can play a role in the generation or the evaluation of automatically generated stories is large, as evidenced by the heterogeneity of systems described in the literature. These features include aspects related to the story world, such as emotions, characters, locations or intentions, and structural aspects, such as length or narrative arc. Some of these features require explicit or implicit values during generation, such as the appropriate length, the number of characters, or the amount of description that the story needs. Additionally, the parameters for the story features can change depending on the kind of story, the author, and the context. Choosing the optimal values for these parameters is not a trivial task, since the range of acceptable values for many of the features is large and the features are rarely independent. However, one potential source of information is stories written by humans. A quantitative analysis of the features present in stories produced by humans can provide a set of values that can be considered a c