Variable Selection

This chapter is dedicated to variable selection using random forests: an automatic three-step procedure involving first a fairly coarse elimination of a large number of useless variables, followed by a finer and ascending sequential introduction of variab

Robin Genuer, Jean-Michel Poggi

Random Forests with R

Use R!

Series Editors
Robert Gentleman, 23andMe Inc., South San Francisco, USA
Kurt Hornik, Department of Finance, Accounting and Statistics, WU Wirtschaftsuniversität Wien, Vienna, Austria
Giovanni Parmigiani, Dana-Farber Cancer Institute, Boston, USA

This series of inexpensive and focused books on R will publish shorter books aimed at practitioners. Books can discuss the use of R in a particular subject area (e.g., epidemiology, econometrics, psychometrics) or as it relates to statistical topics (e.g., missing data, longitudinal data). In most cases, books will combine LaTeX and R so that the code for figures and tables can be put on a website. Authors should assume a background as supplied by Dalgaard’s Introductory Statistics with R or other introductory books so that each book does not repeat basic material.

More information about this series at http://www.springer.com/series/6991

Robin Genuer, ISPED, University of Bordeaux, Bordeaux, France

Jean-Michel Poggi, Lab. Maths Orsay (LMO), Paris-Saclay University, Orsay, France

ISSN 2197-5736
ISSN 2197-5744 (electronic)
Use R!
ISBN 978-3-030-56484-1
ISBN 978-3-030-56485-8 (eBook)
https://doi.org/10.1007/978-3-030-56485-8

Translation from the French language edition: Les forêts aléatoires avec R by Robin Genuer and Jean-Michel Poggi © Presses Universitaires de Rennes 2019 All Rights Reserved

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Random forests are a statistical learning method introduced by Leo Breiman in 2001. They are extensively used in many fields of application,