Course 1 - Programming in R

26 January 2021  •  Course  •  21,189 words (85 pages)

Table of Contents

1 - Overview and History of R

What is S?

Historical notes

S philosophy

Back to R

Features of R

Free software

Drawbacks of R

Design of the R System

Some R resources

Some useful books on S/R

2 - Getting Help

Asking questions - 1

Finding answers

Asking questions - 2

Example: Error message

Asking questions - 3

Subject headers

Do

Don’t

Case Study: A recent post to the R-devel mailing list

Response

Analysis: What went wrong?

Places to turn

3 - R Console Input and Evaluation

Input

Evaluation

Printing

4 - Data Types - R Objects and Attributes

Objects

Numbers

Attributes

5 - Data Types - Vectors and Lists

Creating Vectors

Mixing objects

Explicit coercion

Lists

6 - Data Types - Matrices

Matrices

Cbind-ing and Rbind-ing

7 - Data Types - Factors

Factors

8 - Data Types - Missing Values

Missing values

9 - Data Types - Data Frames

Data frames

10 - Data Types - Names Attribute

Names

11 - Reading Tabular Data

Reading data

Writing data

Reading data files with read.table

12 - Reading Large Tables

Reading in larger datasets using read.table

Know Thy System

Calculating memory requirements

13 - Textual Data Formats

Textual formats

Dput-ing R objects

Dumping R objects

14 - Connections: Interfaces to the Outside World

Interface to the outside world

File connections

Connections

Reading lines of a text file

15 - Subsetting - Basics

Subsetting

16 - Subsetting - Lists

Subsetting lists

Subsetting nested elements of a list

17 - Subsetting - Matrices

Subsetting a matrix

18 - Subsetting - Partial Matching

Partial matching

19 - Subsetting - Removing Missing Values

Removing NA values

20 - Vectorized Operations

Vectorized Operations

Vectorized matrix operations


1 - Overview and History of R

What is S?

In this lecture, I'm going to give a short overview and a very brief history of the R statistical programming environment. The first and most obvious question is: what is R? The answer is quite simple: R is a dialect of S. That leads to the next logical question, which is: what is S?

S is a language that was developed by John Chambers and others at Bell Labs. It was initiated in 1976 as an internal statistical analysis environment, that is, an environment that people at Bell Labs could use to analyze data. Initially it was implemented as a series of FORTRAN libraries, which provided statistical routines that were tedious to write over and over again. Early versions of the language did not contain functions for statistical modelling; those did not arrive until roughly version three.

In 1988, the system was rewritten in C to make it more portable across systems, and it began to resemble the system that we have today. This was version three. The seminal book Statistical Models in S, written by John Chambers and Trevor Hastie and sometimes referred to as the white book, documents the statistical analysis functionality that came into that version of the language. Version four of the S language was released in 1998, and it is more or less the version we use today. The book Programming with Data, written by John Chambers and sometimes called the green book, is a reference for this course and documents version four of the S language.

...
