Advanced Models for Stylometric Applications

Some well-known models were explained in the previous chapter, but various more advanced approaches have also been suggested. In the humanities, the Zeta test focuses on terms used recurrently by one author and mostly ignored by the others. Selecti

Machine Learning Methods for Stylometry: Authorship Attribution and Author Profiling

Jacques Savoy

Jacques Savoy, Department of Computer Science, University of Neuchâtel, Neuchâtel, Switzerland

ISBN 978-3-030-53359-5
ISBN 978-3-030-53360-1 (eBook)
https://doi.org/10.1007/978-3-030-53360-1

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To Jacinthe, Adelaïde, and Benjamin

Preface

With the recent progress made in network and computing technology, the ubiquity of data, and the free availability of textual repositories, scientific practice is evolving towards a more data-based methodology. Numerous domains therefore consider machine learning models to be pertinent tools for verifying hypotheses or for improving their knowledge by discovering significant patterns hidden in datasets. Stylometry, and more generally the digital humanities, follows this new research trend.

Focusing on written style, this book presents methods and approaches able to identify the true author of a doubtful document or text excerpt. Assuming that each author has his specific style, statistical or computer-based models can be applied to verify whether or not Shakespeare was the real author of a given play or poem. Besides literary works and authorship attribution, stylometric approaches can be useful for determining some demographics about the author. For example, one can wonder whether a novel (e.g., My Brilliant Friend (2012) by Elena Ferrante) was really written by a female writer. As other factors having a significant impact on the written style, one can study the eff
