Scraping NBA data

The NBA does a great job releasing statistics on every aspect of the game. Most teams have analytics experts crunching those numbers for insights to get a competitive advantage.

In this post I go through the process of scraping data from basketball-reference.com using the R package rvest. Along the way we do some data munging with dplyr, iterate with purrr, and build urls by string interpolation with glue.

NBA Reference

This website has statistics on every aspect of the game, going back many seasons. In this post, I’ll focus on individual player stats (Field Goal Percentage, Average Minutes, etc.) from the season pages numbered 2000 through 2017 in the site’s urls.

If you go to https://www.basketball-reference.com/leagues/NBA_2017_per_game.html, you can find the data for the 2017 season in an html table.

HTML is a markup language: tags describe the structure of a page so that your browser can render it properly. Typically, an html table has this structure:

<table>
  <tr><th>Player</th></tr>
  <tr><td>Data</td></tr>
</table>

To get this into an R data frame, you can use rvest. The first step is to get all the html from the page using read_html.

library(rvest)
nba_url <- "https://www.basketball-reference.com/leagues/NBA_2017_per_game.html"
web_page <- read_html(nba_url)

Now we need to get the actual data from the table. html_table does just that:

data <- html_table(web_page)
head(data[[1]])
##   Rk        Player Pos Age  Tm  G GS   MP  FG FGA  FG%  3P 3PA  3P%  2P
## 1  1  Alex Abrines  SG  23 OKC 68  6 15.5 2.0 5.0 .393 1.4 3.6 .381 0.6
## 2  2    Quincy Acy  PF  26 TOT 38  1 14.7 1.8 4.5 .412 1.0 2.4 .411 0.9
## 3  2    Quincy Acy  PF  26 DAL  6  0  8.0 0.8 2.8 .294 0.2 1.2 .143 0.7
## 4  2    Quincy Acy  PF  26 BRK 32  1 15.9 2.0 4.8 .425 1.1 2.6 .434 0.9
## 5  3  Steven Adams   C  23 OKC 80 80 29.9 4.7 8.2 .571 0.0 0.0 .000 4.7
## 6  4 Arron Afflalo  SG  31 SAC 61 45 25.9 3.0 6.9 .440 1.0 2.5 .411 2.0
##   2PA  2P% eFG%  FT FTA  FT% ORB DRB TRB AST STL BLK TOV  PF PS/G
## 1 1.4 .426 .531 0.6 0.7 .898 0.3 1.0 1.3 0.6 0.5 0.1 0.5 1.7  6.0
## 2 2.1 .413 .521 1.2 1.6 .750 0.5 2.5 3.0 0.5 0.4 0.4 0.6 1.8  5.8
## 3 1.7 .400 .324 0.3 0.5 .667 0.3 1.0 1.3 0.0 0.0 0.0 0.3 1.5  2.2
## 4 2.2 .414 .542 1.3 1.8 .754 0.6 2.8 3.3 0.6 0.4 0.5 0.6 1.8  6.5
## 5 8.2 .572 .571 2.0 3.2 .611 3.5 4.2 7.7 1.1 1.1 1.0 1.8 2.4 11.3
## 6 4.4 .457 .514 1.4 1.5 .892 0.1 1.9 2.0 1.3 0.3 0.1 0.7 1.7  8.4

That was easy! html_table returns a list with one data frame for each table found in web_page. What we want is the first element of that list.

data <- data[[1]]
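To see the "list of tables" behaviour without hitting the network, here is a minimal sketch that parses a literal HTML string. The string, the table ids and the player names are made up for illustration; only read_html, html_table and html_node come from rvest.

```r
library(rvest)

# A literal HTML string standing in for a downloaded page (made-up data).
toy_html <- '<html><body>
  <table id="first">
    <tr><th>Player</th><th>G</th></tr>
    <tr><td>Player A</td><td>10</td></tr>
    <tr><td>Player B</td><td>12</td></tr>
  </table>
  <table id="second">
    <tr><th>Team</th></tr>
    <tr><td>Team X</td></tr>
  </table>
</body></html>'

toy_page <- read_html(toy_html)

# html_table() on a whole page returns a list with one element per <table>.
toy_tables <- html_table(toy_page)
length(toy_tables)

# Picking a table by its CSS id is less fragile than relying on its position.
first_tbl <- html_table(html_node(toy_page, "#first"))
first_tbl
```

On a real page, choosing the table by id rather than by position is probably the safer habit, since a site can add another table above the one you want.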

Let’s inspect the data:

str(data)
## 'data.frame':    619 obs. of  30 variables:
##  $ Rk    : chr  "1" "2" "2" "2" ...
##  $ Player: chr  "Alex Abrines" "Quincy Acy" "Quincy Acy" "Quincy Acy" ...
##  $ Pos   : chr  "SG" "PF" "PF" "PF" ...
##  $ Age   : chr  "23" "26" "26" "26" ...
##  $ Tm    : chr  "OKC" "TOT" "DAL" "BRK" ...
##  $ G     : chr  "68" "38" "6" "32" ...
##  $ GS    : chr  "6" "1" "0" "1" ...
##  $ MP    : chr  "15.5" "14.7" "8.0" "15.9" ...
##  $ FG    : chr  "2.0" "1.8" "0.8" "2.0" ...
##  $ FGA   : chr  "5.0" "4.5" "2.8" "4.8" ...
##  $ FG%   : chr  ".393" ".412" ".294" ".425" ...
##  $ 3P    : chr  "1.4" "1.0" "0.2" "1.1" ...
##  $ 3PA   : chr  "3.6" "2.4" "1.2" "2.6" ...
##  $ 3P%   : chr  ".381" ".411" ".143" ".434" ...
##  $ 2P    : chr  "0.6" "0.9" "0.7" "0.9" ...
##  $ 2PA   : chr  "1.4" "2.1" "1.7" "2.2" ...
##  $ 2P%   : chr  ".426" ".413" ".400" ".414" ...
##  $ eFG%  : chr  ".531" ".521" ".324" ".542" ...
##  $ FT    : chr  "0.6" "1.2" "0.3" "1.3" ...
##  $ FTA   : chr  "0.7" "1.6" "0.5" "1.8" ...
##  $ FT%   : chr  ".898" ".750" ".667" ".754" ...
##  $ ORB   : chr  "0.3" "0.5" "0.3" "0.6" ...
##  $ DRB   : chr  "1.0" "2.5" "1.0" "2.8" ...
##  $ TRB   : chr  "1.3" "3.0" "1.3" "3.3" ...
##  $ AST   : chr  "0.6" "0.5" "0.0" "0.6" ...
##  $ STL   : chr  "0.5" "0.4" "0.0" "0.4" ...
##  $ BLK   : chr  "0.1" "0.4" "0.0" "0.5" ...
##  $ TOV   : chr  "0.5" "0.6" "0.3" "0.6" ...
##  $ PF    : chr  "1.7" "1.8" "1.5" "1.8" ...
##  $ PS/G  : chr  "6.0" "5.8" "2.2" "6.5" ...

Ok, so all the columns were read in as text. HTML itself carries no type information, so we need to convert the columns to the correct data types ourselves.

Most columns seem to be numeric, except Player, which should stay a character vector, and Pos (position) and Tm (team), which are better stored as factors.
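Before converting anything, it may be worth checking whether the table repeats its header row inside the body, as long HTML tables sometimes do. That is an assumption about this page, not something the output above proves. The sketch below uses a made-up toy data frame to show how such rows could be spotted and dropped with base R.

```r
# Toy data frame imitating a scraped table whose header row reappears
# in the body (made-up values; column names borrowed from the output above).
toy <- data.frame(
  Rk     = c("1", "2", "Rk", "3"),
  Player = c("Player A", "Player B", "Player", "Player C"),
  G      = c("68", "38", "G", "80"),
  stringsAsFactors = FALSE
)

# A stray header row shows up as a row whose Rk value is literally "Rk".
is_header <- toy$Rk == "Rk"
sum(is_header)

# Keep only the real observations, then convert types.
toy_clean <- toy[!is_header, ]
toy_clean$G <- as.numeric(toy_clean$G)
str(toy_clean)
```

If sum(is_header) comes back as zero on the real data, there is nothing to remove and the conversion can go ahead as is.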

One approach to correcting the data types is to use dplyr’s mutate:

library(dplyr)
data_1 <- data %>% 
  mutate(Rk = as.numeric(Rk),
         Pos = as.factor(Pos),
         Age = as.numeric(Age))
#       ...

str(select(data_1, Rk, Pos, Age))
## 'data.frame':    619 obs. of  3 variables:
##  $ Rk : num  1 2 2 2 3 4 5 6 7 8 ...
##  $ Pos: Factor w/ 7 levels "C","PF","PF-C",..: 7 2 2 2 1 7 1 1 2 2 ...
##  $ Age: num  23 26 26 26 23 31 28 28 31 27 ...

But there’s a more compact way. mutate_at applies one function to several columns at once.

data_2 <- data %>% 
  mutate_at(vars(Tm, Pos), factor)

str(select(data_2, Tm, Pos))
## 'data.frame':    619 obs. of  2 variables:
##  $ Tm : Factor w/ 32 levels "ATL","BOS","BRK",..: 21 30 7 3 21 26 19 18 27 12 ...
##  $ Pos: Factor w/ 7 levels "C","PF","PF-C",..: 7 2 2 2 1 7 1 1 2 2 ...

The columns need to be wrapped in vars(): vars(Tm, Pos). In this case, the function applied is factor.

Repeat for all the numeric variables, using the : notation to select all the columns from G to PS/G, plus Age.

data_3 <- data %>%
  mutate_at(vars(G:`PS/G`, Age), as.numeric)
str(select(data_3, G:`PS/G`, Age))
## 'data.frame':    619 obs. of  26 variables:
##  $ G   : num  68 38 6 32 80 61 39 62 72 61 ...
##  $ GS  : num  6 1 0 1 80 45 15 0 72 5 ...
##  $ MP  : num  15.5 14.7 8 15.9 29.9 25.9 15 8.6 32.4 14.3 ...
##  $ FG  : num  2 1.8 0.8 2 4.7 3 2.3 0.7 6.9 1.3 ...
##  $ FGA : num  5 4.5 2.8 4.8 8.2 6.9 4.6 1.4 14.6 2.8 ...
##  $ FG% : num  0.393 0.412 0.294 0.425 0.571 0.44 0.5 0.523 0.477 0.458 ...
##  $ 3P  : num  1.4 1 0.2 1.1 0 1 0 0 0.3 0 ...
##  $ 3PA : num  3.6 2.4 1.2 2.6 0 2.5 0.1 0 0.8 0 ...
##  $ 3P% : num  0.381 0.411 0.143 0.434 0 0.411 0 NA 0.411 0 ...
##  $ 2P  : num  0.6 0.9 0.7 0.9 4.7 2 2.3 0.7 6.6 1.3 ...
##  $ 2PA : num  1.4 2.1 1.7 2.2 8.2 4.4 4.5 1.4 13.8 2.7 ...
##  $ 2P% : num  0.426 0.413 0.4 0.414 0.572 0.457 0.511 0.523 0.48 0.461 ...
##  $ eFG%: num  0.531 0.521 0.324 0.542 0.571 0.514 0.5 0.523 0.488 0.458 ...
##  $ FT  : num  0.6 1.2 0.3 1.3 2 1.4 0.7 0.2 3.1 0.4 ...
##  $ FTA : num  0.7 1.6 0.5 1.8 3.2 1.5 1 0.4 3.8 0.5 ...
##  $ FT% : num  0.898 0.75 0.667 0.754 0.611 0.892 0.725 0.682 0.812 0.697 ...
##  $ ORB : num  0.3 0.5 0.3 0.6 3.5 0.1 1.2 0.8 2.4 1.7 ...
##  $ DRB : num  1 2.5 1 2.8 4.2 1.9 3.4 1.7 4.9 1.9 ...
##  $ TRB : num  1.3 3 1.3 3.3 7.7 2 4.5 2.5 7.3 3.6 ...
##  $ AST : num  0.6 0.5 0 0.6 1.1 1.3 0.3 0.4 1.9 0.9 ...
##  $ STL : num  0.5 0.4 0 0.4 1.1 0.3 0.5 0.4 0.6 0.3 ...
##  $ BLK : num  0.1 0.4 0 0.5 1 0.1 0.6 0.4 1.2 0.4 ...
##  $ TOV : num  0.5 0.6 0.3 0.6 1.8 0.7 0.8 0.3 1.4 0.5 ...
##  $ PF  : num  1.7 1.8 1.5 1.8 2.4 1.7 2 1.4 2.2 1.3 ...
##  $ PS/G: num  6 5.8 2.2 6.5 11.3 8.4 5.3 1.7 17.3 2.9 ...
##  $ Age : num  23 26 26 26 23 31 28 28 31 27 ...
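For readers on a recent dplyr (1.0.0 or later, which is an assumption about your setup), the same conversions can also be written with across() inside a single mutate. The toy data frame below is made up so the sketch runs on its own.

```r
library(dplyr)

# Made-up rows with the same kind of columns as the scraped table.
toy <- data.frame(
  Player = c("Player A", "Player B"),
  Pos    = c("SG", "PF"),
  Tm     = c("OKC", "DAL"),
  G      = c("68", "6"),
  `FG%`  = c(".393", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

toy_typed <- toy %>%
  mutate(
    across(c(Pos, Tm), factor),      # categorical columns
    across(G:`FG%`, as.numeric)      # a range of numeric columns
  )

# The empty string in FG% becomes NA, like the NA in 3P% above.
str(toy_typed)
```

This does the same job as the two mutate_at calls, just in one step; which style to use is mostly a matter of which dplyr version you have.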

But wait, what about the rest of the years? You have probably heard of the DRY principle: Don’t Repeat Yourself. It’d be silly to write a whole new script just to download data from another season, because most of the code would be the same. So we need to reuse our code to get data from the rest of the seasons.

To make our code more reusable, the first thing we need to do is build the url to get the html. We’ll use a super cool package called glue for that.

library(glue)
name <- "Rafa"
glue("Hello, {name}")
## Hello, Rafa

I’m sure you get the idea. It’s what programmers call string interpolation.

For loops are used to iterate over collections, so we can repeat some process without copying the code. With a for loop, we can generate a vector of urls for every season we’re interested in:

base_url <- "https://www.basketball-reference.com/leagues/"
page <- "NBA_{year}_per_game.html"
urls <- vector("character", length(2000:2017)) ## the vector will hold strings
for (year in 2000:2017) {
  urls[[year-1999]] <- glue(base_url, page) ## year -1999 gives 1, 2, 3, ...
}
head(urls)
## [1] "https://www.basketball-reference.com/leagues/NBA_2000_per_game.html"
## [2] "https://www.basketball-reference.com/leagues/NBA_2001_per_game.html"
## [3] "https://www.basketball-reference.com/leagues/NBA_2002_per_game.html"
## [4] "https://www.basketball-reference.com/leagues/NBA_2003_per_game.html"
## [5] "https://www.basketball-reference.com/leagues/NBA_2004_per_game.html"
## [6] "https://www.basketball-reference.com/leagues/NBA_2005_per_game.html"
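Two things are going on in glue(base_url, page): glue pastes its unnamed arguments together, and then fills in {year} from the variable of that name. As a side note, glue is also vectorised, so if year is passed explicitly as a vector you get one url per element without writing the loop. The name urls_vec is only used for this sketch.

```r
library(glue)

base_url <- "https://www.basketball-reference.com/leagues/"
page <- "NBA_{year}_per_game.html"

# Unnamed arguments are concatenated, then {year} is filled in.
# Passing year as a named vector yields one url per element.
urls_vec <- glue(base_url, page, year = 2000:2017)

length(urls_vec)
as.character(urls_vec)[1]
```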

Awesome! Now instead of creating a vector of urls, we’ll create a list of data frames, one for every season.

base_url <- "https://www.basketball-reference.com/leagues/"
page <- "NBA_{year}_per_game.html"
data <- list()

for (year in 2015:2017) {
  nba_url <- glue(base_url, page) 
  web_page <- read_html(nba_url)
  data[[year-2014]] <- html_table(web_page)[[1]]

}
str(data)
## List of 3
##  $ :'data.frame':    675 obs. of  30 variables:
##   ..$ Rk    : chr [1:675] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:675] "Quincy Acy" "Jordan Adams" "Steven Adams" "Jeff Adrien" ...
##   ..$ Pos   : chr [1:675] "PF" "SG" "C" "PF" ...
##   ..$ Age   : chr [1:675] "24" "20" "21" "28" ...
##   ..$ Tm    : chr [1:675] "NYK" "MEM" "OKC" "MIN" ...
##   ..$ G     : chr [1:675] "68" "30" "70" "17" ...
##   ..$ GS    : chr [1:675] "22" "0" "67" "0" ...
##   ..$ MP    : chr [1:675] "18.9" "8.3" "25.3" "12.6" ...
##   ..$ FG    : chr [1:675] "2.2" "1.2" "3.1" "1.1" ...
##   ..$ FGA   : chr [1:675] "4.9" "2.9" "5.7" "2.6" ...
##   ..$ FG%   : chr [1:675] ".459" ".407" ".544" ".432" ...
##   ..$ 3P    : chr [1:675] "0.3" "0.3" "0.0" "0.0" ...
##   ..$ 3PA   : chr [1:675] "0.9" "0.8" "0.0" "0.0" ...
##   ..$ 3P%   : chr [1:675] ".300" ".400" ".000" "" ...
##   ..$ 2P    : chr [1:675] "2.0" "0.8" "3.1" "1.1" ...
##   ..$ 2PA   : chr [1:675] "4.0" "2.0" "5.7" "2.6" ...
##   ..$ 2P%   : chr [1:675] ".494" ".410" ".547" ".432" ...
##   ..$ eFG%  : chr [1:675] ".486" ".465" ".544" ".432" ...
##   ..$ FT    : chr [1:675] "1.1" "0.5" "1.5" "1.3" ...
##   ..$ FTA   : chr [1:675] "1.4" "0.8" "2.9" "2.2" ...
##   ..$ FT%   : chr [1:675] ".784" ".609" ".502" ".579" ...
##   ..$ ORB   : chr [1:675] "1.2" "0.3" "2.8" "1.4" ...
##   ..$ DRB   : chr [1:675] "3.3" "0.6" "4.6" "3.2" ...
##   ..$ TRB   : chr [1:675] "4.4" "0.9" "7.5" "4.5" ...
##   ..$ AST   : chr [1:675] "1.0" "0.5" "0.9" "0.9" ...
##   ..$ STL   : chr [1:675] "0.4" "0.5" "0.5" "0.2" ...
##   ..$ BLK   : chr [1:675] "0.3" "0.2" "1.2" "0.5" ...
##   ..$ TOV   : chr [1:675] "0.9" "0.5" "1.4" "0.5" ...
##   ..$ PF    : chr [1:675] "2.2" "0.8" "3.2" "1.8" ...
##   ..$ PS/G  : chr [1:675] "5.9" "3.1" "7.7" "3.5" ...
##  $ :'data.frame':    601 obs. of  30 variables:
##   ..$ Rk    : chr [1:601] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:601] "Quincy Acy" "Jordan Adams" "Steven Adams" "Arron Afflalo" ...
##   ..$ Pos   : chr [1:601] "PF" "SG" "C" "SG" ...
##   ..$ Age   : chr [1:601] "25" "21" "22" "30" ...
##   ..$ Tm    : chr [1:601] "SAC" "MEM" "OKC" "NYK" ...
##   ..$ G     : chr [1:601] "59" "2" "80" "71" ...
##   ..$ GS    : chr [1:601] "29" "0" "80" "57" ...
##   ..$ MP    : chr [1:601] "14.8" "7.5" "25.2" "33.4" ...
##   ..$ FG    : chr [1:601] "2.0" "1.0" "3.3" "5.0" ...
##   ..$ FGA   : chr [1:601] "3.6" "3.0" "5.3" "11.3" ...
##   ..$ FG%   : chr [1:601] ".556" ".333" ".613" ".443" ...
##   ..$ 3P    : chr [1:601] "0.3" "0.0" "0.0" "1.3" ...
##   ..$ 3PA   : chr [1:601] "0.8" "0.5" "0.0" "3.4" ...
##   ..$ 3P%   : chr [1:601] ".388" ".000" "" ".382" ...
##   ..$ 2P    : chr [1:601] "1.7" "1.0" "3.3" "3.7" ...
##   ..$ 2PA   : chr [1:601] "2.8" "2.5" "5.3" "7.9" ...
##   ..$ 2P%   : chr [1:601] ".606" ".400" ".613" ".469" ...
##   ..$ eFG%  : chr [1:601] ".600" ".333" ".613" ".500" ...
##   ..$ FT    : chr [1:601] "0.8" "1.5" "1.4" "1.5" ...
##   ..$ FTA   : chr [1:601] "1.2" "2.5" "2.5" "1.8" ...
##   ..$ FT%   : chr [1:601] ".735" ".600" ".582" ".840" ...
##   ..$ ORB   : chr [1:601] "1.1" "0.0" "2.7" "0.3" ...
##   ..$ DRB   : chr [1:601] "2.1" "1.0" "3.9" "3.4" ...
##   ..$ TRB   : chr [1:601] "3.2" "1.0" "6.7" "3.7" ...
##   ..$ AST   : chr [1:601] "0.5" "1.5" "0.8" "2.0" ...
##   ..$ STL   : chr [1:601] "0.5" "1.5" "0.5" "0.4" ...
##   ..$ BLK   : chr [1:601] "0.4" "0.0" "1.1" "0.1" ...
##   ..$ TOV   : chr [1:601] "0.5" "1.0" "1.1" "1.2" ...
##   ..$ PF    : chr [1:601] "1.7" "1.0" "2.8" "2.0" ...
##   ..$ PS/G  : chr [1:601] "5.2" "3.5" "8.0" "12.8" ...
##  $ :'data.frame':    619 obs. of  30 variables:
##   ..$ Rk    : chr [1:619] "1" "2" "2" "2" ...
##   ..$ Player: chr [1:619] "Alex Abrines" "Quincy Acy" "Quincy Acy" "Quincy Acy" ...
##   ..$ Pos   : chr [1:619] "SG" "PF" "PF" "PF" ...
##   ..$ Age   : chr [1:619] "23" "26" "26" "26" ...
##   ..$ Tm    : chr [1:619] "OKC" "TOT" "DAL" "BRK" ...
##   ..$ G     : chr [1:619] "68" "38" "6" "32" ...
##   ..$ GS    : chr [1:619] "6" "1" "0" "1" ...
##   ..$ MP    : chr [1:619] "15.5" "14.7" "8.0" "15.9" ...
##   ..$ FG    : chr [1:619] "2.0" "1.8" "0.8" "2.0" ...
##   ..$ FGA   : chr [1:619] "5.0" "4.5" "2.8" "4.8" ...
##   ..$ FG%   : chr [1:619] ".393" ".412" ".294" ".425" ...
##   ..$ 3P    : chr [1:619] "1.4" "1.0" "0.2" "1.1" ...
##   ..$ 3PA   : chr [1:619] "3.6" "2.4" "1.2" "2.6" ...
##   ..$ 3P%   : chr [1:619] ".381" ".411" ".143" ".434" ...
##   ..$ 2P    : chr [1:619] "0.6" "0.9" "0.7" "0.9" ...
##   ..$ 2PA   : chr [1:619] "1.4" "2.1" "1.7" "2.2" ...
##   ..$ 2P%   : chr [1:619] ".426" ".413" ".400" ".414" ...
##   ..$ eFG%  : chr [1:619] ".531" ".521" ".324" ".542" ...
##   ..$ FT    : chr [1:619] "0.6" "1.2" "0.3" "1.3" ...
##   ..$ FTA   : chr [1:619] "0.7" "1.6" "0.5" "1.8" ...
##   ..$ FT%   : chr [1:619] ".898" ".750" ".667" ".754" ...
##   ..$ ORB   : chr [1:619] "0.3" "0.5" "0.3" "0.6" ...
##   ..$ DRB   : chr [1:619] "1.0" "2.5" "1.0" "2.8" ...
##   ..$ TRB   : chr [1:619] "1.3" "3.0" "1.3" "3.3" ...
##   ..$ AST   : chr [1:619] "0.6" "0.5" "0.0" "0.6" ...
##   ..$ STL   : chr [1:619] "0.5" "0.4" "0.0" "0.4" ...
##   ..$ BLK   : chr [1:619] "0.1" "0.4" "0.0" "0.5" ...
##   ..$ TOV   : chr [1:619] "0.5" "0.6" "0.3" "0.6" ...
##   ..$ PF    : chr [1:619] "1.7" "1.8" "1.5" "1.8" ...
##   ..$ PS/G  : chr [1:619] "6.0" "5.8" "2.2" "6.5" ...

Better iteration with map

Now our script does everything we want it to do. But there’s a wrinkle we have to iron out.

A lot of programming is about iteration, and every programming language supports for loops. R has a great way to make iteration cleaner with the purrr package. There’s a whole chapter on iteration in R for Data Science, which is a must read for anyone interested in R.

Our script has to repeat the same steps for every year from 2000 to 2017. That’s a job for map! map takes a list and applies a function to every element of the list.

For example, here we apply the head function to a list of data frames to get the first 6 rows of each one:

my_list <- list(mtcars, iris)
purrr::map(my_list, head)
## [[1]]
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
## 
## [[2]]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Instead of head, which is a predefined function, you can use your own functions:

my_list <- list(mtcars, iris)
my_func <- function(data_frame) { print("Hello from my fun!")}
purrr::map(my_list, my_func)
## [1] "Hello from my fun!"
## [1] "Hello from my fun!"
## [[1]]
## [1] "Hello from my fun!"
## 
## [[2]]
## [1] "Hello from my fun!"
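map always gives back a list. When the function returns a single value per element, purrr also has typed variants such as map_int that return a plain vector instead. Here is a small sketch on the same two built-in data frames:

```r
library(purrr)

my_list <- list(mtcars, iris)

# map() always returns a list...
map(my_list, nrow)

# ...while map_int() returns an integer vector, and raises an error if the
# function does not give exactly one integer per element.
n_rows <- map_int(my_list, nrow)
n_rows

# A one-off function can be written inline instead of being named first.
n_cols <- map_int(my_list, function(df) ncol(df))
n_cols
```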

When applying map, you need to figure out the initial list to iterate over and the function to apply to its elements in every iteration. The first part is easy: to get a list of seasons, we convert a numeric vector into a list with as.list(2000:2017).

Now we need a function that takes one year and does the job for that year. Something like:

get_player_data <- function(year) {
  base_url <- "https://www.basketball-reference.com/leagues/"
  page <- "NBA_{year}_per_game.html"
  # Build the url for the requested season
  nba_url <- glue(base_url, page) 
  # Download and parse the page
  web_page <- read_html(nba_url)
  # Extract the first table as a data frame
  data <- html_table(web_page)[[1]]
  # Return the data frame
  data
}

data.14 <- get_player_data(2014)
head(data.14)
##   Rk       Player Pos Age  Tm  G GS   MP  FG FGA  FG%  3P 3PA  3P%  2P 2PA
## 1  1   Quincy Acy  SF  23 TOT 63  0 13.4 1.0 2.2 .468 0.1 0.2 .267 1.0 2.0
## 2  1   Quincy Acy  SF  23 TOR  7  0  8.7 0.9 2.0 .429 0.3 0.7 .400 0.6 1.3
## 3  1   Quincy Acy  SF  23 SAC 56  0 14.0 1.1 2.3 .472 0.0 0.2 .200 1.0 2.1
## 4  2 Steven Adams   C  20 OKC 81 20 14.8 1.1 2.3 .503 0.0 0.0      1.1 2.3
## 5  3  Jeff Adrien  PF  27 TOT 53 12 18.1 2.7 5.2 .520 0.0 0.0      2.7 5.2
## 6  3  Jeff Adrien  PF  27 CHA 25  0 10.2 0.9 1.6 .550 0.0 0.0      0.9 1.6
##    2P% eFG%  FT FTA  FT% ORB DRB TRB AST STL BLK TOV  PF PS/G
## 1 .492 .482 0.6 0.8 .660 1.1 2.3 3.4 0.4 0.4 0.4 0.5 1.9  2.7
## 2 .444 .500 0.7 1.1 .625 0.7 1.4 2.1 0.6 0.6 0.4 0.3 1.1  2.7
## 3 .496 .480 0.5 0.8 .667 1.2 2.4 3.6 0.4 0.3 0.4 0.5 2.0  2.7
## 4 .503 .503 1.0 1.7 .581 1.8 2.3 4.1 0.5 0.5 0.7 0.9 2.5  3.3
## 5 .520 .520 1.4 2.2 .639 1.9 3.8 5.8 0.7 0.5 0.7 0.7 2.0  6.8
## 6 .550 .550 0.5 1.0 .520 1.3 2.2 3.5 0.3 0.3 0.6 0.3 1.4  2.3
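Because get_player_data depends on a download, one failed request would stop a run over many years. One possible guard, sketched here with a stand-in function so it runs offline, is purrr's possibly. The names fake_get_player_data and safe_get are hypothetical and not part of the script above.

```r
library(purrr)

# Hypothetical stand-in for get_player_data(): it fails for one year so the
# example runs without touching the network.
fake_get_player_data <- function(year) {
  if (year == 2001) stop("download failed")
  data.frame(year = year, Player = "Player A", stringsAsFactors = FALSE)
}

# possibly() wraps a function so that an error yields `otherwise` instead.
safe_get <- possibly(fake_get_player_data, otherwise = NULL)

results <- map(2000:2002, safe_get)

# compact() drops the NULL left behind by the failed year.
ok <- compact(results)
length(ok)
```

Whether silently skipping a failed year is acceptable depends on what you do with the data afterwards; it is a design choice, not a requirement.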

Pipes

Pipes are a well-known feature of the tidyverse, so I won’t go into detail about how they work. We could turn our function into a set of pipes easily except for this line: data <- html_table(web_page)[[1]].

We’re actually calling two functions in that line: html_table with the web_page argument and [[ with 1 as an argument.

web_page <- read_html(nba_url)
tbl_lst <- html_table(web_page)
df <- `[[`(tbl_lst, 1)

So we can pipe it like this:

df_2 <- web_page %>% 
 html_table %>% 
 `[[`(1)
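If the backticks look odd, there are named equivalents that read a little more naturally in a pipeline. The sketch below uses a made-up list in place of the html_table result and compares the backtick form with magrittr's extract2 and purrr's pluck.

```r
library(magrittr)

# Stand-in for the list returned by html_table(): two tiny data frames.
tbl_lst <- list(
  data.frame(x = 1:2),
  data.frame(y = 3:4)
)

# Three equivalent ways to take the first element in a pipeline.
first_a <- tbl_lst %>% `[[`(1)
first_b <- tbl_lst %>% extract2(1)
first_c <- tbl_lst %>% purrr::pluck(1)

identical(first_a, first_b) && identical(first_b, first_c)
```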

Finally, we re-write our get_player_data function as:

get_player_data <- function(year) {
  base_url <- "https://www.basketball-reference.com/leagues/"
  page <- "NBA_{year}_per_game.html"
  
  glue(base_url, page) %>% 
    read_html %>% 
    html_table %>% 
    `[[`(1)
}

df.14 <- get_player_data(2014)
head(df.14)
##   Rk       Player Pos Age  Tm  G GS   MP  FG FGA  FG%  3P 3PA  3P%  2P 2PA
## 1  1   Quincy Acy  SF  23 TOT 63  0 13.4 1.0 2.2 .468 0.1 0.2 .267 1.0 2.0
## 2  1   Quincy Acy  SF  23 TOR  7  0  8.7 0.9 2.0 .429 0.3 0.7 .400 0.6 1.3
## 3  1   Quincy Acy  SF  23 SAC 56  0 14.0 1.1 2.3 .472 0.0 0.2 .200 1.0 2.1
## 4  2 Steven Adams   C  20 OKC 81 20 14.8 1.1 2.3 .503 0.0 0.0      1.1 2.3
## 5  3  Jeff Adrien  PF  27 TOT 53 12 18.1 2.7 5.2 .520 0.0 0.0      2.7 5.2
## 6  3  Jeff Adrien  PF  27 CHA 25  0 10.2 0.9 1.6 .550 0.0 0.0      0.9 1.6
##    2P% eFG%  FT FTA  FT% ORB DRB TRB AST STL BLK TOV  PF PS/G
## 1 .492 .482 0.6 0.8 .660 1.1 2.3 3.4 0.4 0.4 0.4 0.5 1.9  2.7
## 2 .444 .500 0.7 1.1 .625 0.7 1.4 2.1 0.6 0.6 0.4 0.3 1.1  2.7
## 3 .496 .480 0.5 0.8 .667 1.2 2.4 3.6 0.4 0.3 0.4 0.5 2.0  2.7
## 4 .503 .503 1.0 1.7 .581 1.8 2.3 4.1 0.5 0.5 0.7 0.9 2.5  3.3
## 5 .520 .520 1.4 2.2 .639 1.9 3.8 5.8 0.7 0.5 0.7 0.7 2.0  6.8
## 6 .550 .550 0.5 1.0 .520 1.3 2.2 3.5 0.3 0.3 0.6 0.3 1.4  2.3

Good, now we’re all set for using map:

list_of_data_frames <- as.list(2000:2017) %>% 
  purrr::map(get_player_data)
  
str(list_of_data_frames)
## List of 18
##  $ :'data.frame':    517 obs. of  30 variables:
##   ..$ Rk    : chr [1:517] "1" "1" "1" "2" ...
##   ..$ Player: chr [1:517] "Tariq Abdul-Wahad" "Tariq Abdul-Wahad" "Tariq Abdul-Wahad" "Shareef Abdur-Rahim" ...
##   ..$ Pos   : chr [1:517] "SG" "SG" "SG" "SF" ...
##   ..$ Age   : chr [1:517] "25" "25" "25" "23" ...
##   ..$ Tm    : chr [1:517] "TOT" "ORL" "DEN" "VAN" ...
##   ..$ G     : chr [1:517] "61" "46" "15" "82" ...
##   ..$ GS    : chr [1:517] "56" "46" "10" "82" ...
##   ..$ MP    : chr [1:517] "25.9" "26.2" "24.9" "39.3" ...
##   ..$ FG    : chr [1:517] "4.5" "4.8" "3.4" "7.2" ...
##   ..$ FGA   : chr [1:517] "10.6" "11.2" "8.7" "15.6" ...
##   ..$ FG%   : chr [1:517] ".424" ".433" ".389" ".465" ...
##   ..$ 3P    : chr [1:517] "0.0" "0.0" "0.1" "0.4" ...
##   ..$ 3PA   : chr [1:517] "0.4" "0.5" "0.1" "1.2" ...
##   ..$ 3P%   : chr [1:517] ".130" ".095" ".500" ".302" ...
##   ..$ 2P    : chr [1:517] "4.4" "4.8" "3.3" "6.9" ...
##   ..$ 2PA   : chr [1:517] "10.2" "10.7" "8.6" "14.4" ...
##   ..$ 2P%   : chr [1:517] ".435" ".447" ".388" ".478" ...
##   ..$ eFG%  : chr [1:517] ".426" ".435" ".393" ".477" ...
##   ..$ FT    : chr [1:517] "2.4" "2.5" "2.1" "5.4" ...
##   ..$ FTA   : chr [1:517] "3.2" "3.3" "2.8" "6.7" ...
##   ..$ FT%   : chr [1:517] ".756" ".762" ".738" ".809" ...
##   ..$ ORB   : chr [1:517] "1.7" "1.7" "1.6" "2.7" ...
##   ..$ DRB   : chr [1:517] "3.1" "3.5" "1.9" "7.4" ...
##   ..$ TRB   : chr [1:517] "4.8" "5.2" "3.5" "10.1" ...
##   ..$ AST   : chr [1:517] "1.6" "1.6" "1.7" "3.3" ...
##   ..$ STL   : chr [1:517] "1.0" "1.2" "0.4" "1.1" ...
##   ..$ BLK   : chr [1:517] "0.5" "0.3" "0.8" "1.1" ...
##   ..$ TOV   : chr [1:517] "1.7" "1.9" "1.3" "3.0" ...
##   ..$ PF    : chr [1:517] "2.4" "2.5" "2.1" "3.0" ...
##   ..$ PS/G  : chr [1:517] "11.4" "12.2" "8.9" "20.3" ...
##  $ :'data.frame':    559 obs. of  30 variables:
##   ..$ Rk    : chr [1:559] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:559] "Mahmoud Abdul-Rauf" "Tariq Abdul-Wahad" "Shareef Abdur-Rahim" "Cory Alexander" ...
##   ..$ Pos   : chr [1:559] "PG" "SG" "SF" "PG" ...
##   ..$ Age   : chr [1:559] "31" "26" "24" "27" ...
##   ..$ Tm    : chr [1:559] "VAN" "DEN" "VAN" "ORL" ...
##   ..$ G     : chr [1:559] "41" "29" "81" "26" ...
##   ..$ GS    : chr [1:559] "0" "12" "81" "0" ...
##   ..$ MP    : chr [1:559] "11.9" "14.5" "40.0" "8.7" ...
##   ..$ FG    : chr [1:559] "2.9" "1.5" "7.5" "0.7" ...
##   ..$ FGA   : chr [1:559] "6.0" "3.8" "15.8" "2.2" ...
##   ..$ FG%   : chr [1:559] ".488" ".387" ".472" ".321" ...
##   ..$ 3P    : chr [1:559] "0.1" "0.1" "0.1" "0.2" ...
##   ..$ 3PA   : chr [1:559] "0.3" "0.3" "0.8" "0.6" ...
##   ..$ 3P%   : chr [1:559] ".286" ".400" ".188" ".250" ...
##   ..$ 2P    : chr [1:559] "2.8" "1.3" "7.3" "0.5" ...
##   ..$ 2PA   : chr [1:559] "5.7" "3.5" "15.0" "1.5" ...
##   ..$ 2P%   : chr [1:559] ".500" ".386" ".487" ".350" ...
##   ..$ eFG%  : chr [1:559] ".496" ".405" ".477" ".357" ...
##   ..$ FT    : chr [1:559] "0.5" "0.7" "5.5" "0.5" ...
##   ..$ FTA   : chr [1:559] "0.7" "1.2" "6.6" "0.7" ...
##   ..$ FT%   : chr [1:559] ".759" ".583" ".834" ".667" ...
##   ..$ ORB   : chr [1:559] "0.1" "0.5" "2.2" "0.0" ...
##   ..$ DRB   : chr [1:559] "0.5" "1.6" "6.9" "1.0" ...
##   ..$ TRB   : chr [1:559] "0.6" "2.0" "9.1" "1.0" ...
##   ..$ AST   : chr [1:559] "1.9" "0.8" "3.1" "1.4" ...
##   ..$ STL   : chr [1:559] "0.2" "0.5" "1.1" "0.6" ...
##   ..$ BLK   : chr [1:559] "0.0" "0.4" "1.0" "0.0" ...
##   ..$ TOV   : chr [1:559] "0.6" "1.2" "2.9" "1.0" ...
##   ..$ PF    : chr [1:559] "1.2" "1.9" "2.9" "1.1" ...
##   ..$ PS/G  : chr [1:559] "6.5" "3.8" "20.5" "2.0" ...
##  $ :'data.frame':    521 obs. of  30 variables:
##   ..$ Rk    : chr [1:521] "1" "1" "1" "2" ...
##   ..$ Player: chr [1:521] "Tariq Abdul-Wahad" "Tariq Abdul-Wahad" "Tariq Abdul-Wahad" "Shareef Abdur-Rahim" ...
##   ..$ Pos   : chr [1:521] "SG" "SG" "SG" "PF" ...
##   ..$ Age   : chr [1:521] "27" "27" "27" "25" ...
##   ..$ Tm    : chr [1:521] "TOT" "DEN" "DAL" "ATL" ...
##   ..$ G     : chr [1:521] "24" "20" "4" "77" ...
##   ..$ GS    : chr [1:521] "12" "12" "0" "77" ...
##   ..$ MP    : chr [1:521] "18.4" "20.9" "6.0" "38.7" ...
##   ..$ FG    : chr [1:521] "2.3" "2.8" "0.0" "7.8" ...
##   ..$ FGA   : chr [1:521] "6.1" "7.3" "0.5" "16.8" ...
##   ..$ FG%   : chr [1:521] ".374" ".379" ".000" ".461" ...
##   ..$ 3P    : chr [1:521] "0.0" "0.1" "0.0" "0.3" ...
##   ..$ 3PA   : chr [1:521] "0.1" "0.1" "0.0" "0.9" ...
##   ..$ 3P%   : chr [1:521] ".500" ".500" "" ".300" ...
##   ..$ 2P    : chr [1:521] "2.3" "2.7" "0.0" "7.5" ...
##   ..$ 2PA   : chr [1:521] "6.0" "7.2" "0.5" "15.9" ...
##   ..$ 2P%   : chr [1:521] ".372" ".378" ".000" ".470" ...
##   ..$ eFG%  : chr [1:521] ".378" ".383" ".000" ".469" ...
##   ..$ FT    : chr [1:521] "1.0" "1.2" "0.0" "5.4" ...
##   ..$ FTA   : chr [1:521] "1.4" "1.6" "0.3" "6.8" ...
##   ..$ FT%   : chr [1:521] ".727" ".750" ".000" ".801" ...
##   ..$ ORB   : chr [1:521] "1.7" "2.0" "0.5" "2.6" ...
##   ..$ DRB   : chr [1:521] "1.8" "2.0" "1.0" "6.5" ...
##   ..$ TRB   : chr [1:521] "3.5" "3.9" "1.5" "9.0" ...
##   ..$ AST   : chr [1:521] "1.0" "1.1" "0.5" "3.1" ...
##   ..$ STL   : chr [1:521] "0.8" "0.9" "0.5" "1.3" ...
##   ..$ BLK   : chr [1:521] "0.4" "0.5" "0.3" "1.1" ...
##   ..$ TOV   : chr [1:521] "1.1" "1.2" "0.8" "3.2" ...
##   ..$ PF    : chr [1:521] "2.3" "2.6" "1.3" "2.8" ...
##   ..$ PS/G  : chr [1:521] "5.6" "6.8" "0.0" "21.2" ...
##  $ :'data.frame':    504 obs. of  30 variables:
##   ..$ Rk    : chr [1:504] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:504] "Tariq Abdul-Wahad" "Shareef Abdur-Rahim" "Courtney Alexander" "Malik Allen" ...
##   ..$ Pos   : chr [1:504] "SG" "PF" "PG" "PF" ...
##   ..$ Age   : chr [1:504] "28" "26" "25" "24" ...
##   ..$ Tm    : chr [1:504] "DAL" "ATL" "NOH" "MIA" ...
##   ..$ G     : chr [1:504] "14" "81" "66" "80" ...
##   ..$ GS    : chr [1:504] "0" "81" "7" "73" ...
##   ..$ MP    : chr [1:504] "14.6" "38.1" "20.6" "29.0" ...
##   ..$ FG    : chr [1:504] "1.9" "7.0" "2.9" "4.2" ...
##   ..$ FGA   : chr [1:504] "4.1" "14.6" "7.7" "9.9" ...
##   ..$ FG%   : chr [1:504] ".466" ".478" ".382" ".424" ...
##   ..$ 3P    : chr [1:504] "0.0" "0.3" "0.3" "0.0" ...
##   ..$ 3PA   : chr [1:504] "0.1" "0.7" "0.9" "0.1" ...
##   ..$ 3P%   : chr [1:504] ".000" ".350" ".333" ".000" ...
##   ..$ 2P    : chr [1:504] "1.9" "6.7" "2.6" "4.2" ...
##   ..$ 2PA   : chr [1:504] "4.1" "13.9" "6.8" "9.8" ...
##   ..$ 2P%   : chr [1:504] ".474" ".485" ".388" ".426" ...
##   ..$ eFG%  : chr [1:504] ".466" ".487" ".401" ".424" ...
##   ..$ FT    : chr [1:504] "0.2" "5.6" "1.8" "1.2" ...
##   ..$ FTA   : chr [1:504] "0.4" "6.7" "2.2" "1.5" ...
##   ..$ FT%   : chr [1:504] ".500" ".841" ".808" ".802" ...
##   ..$ ORB   : chr [1:504] "1.0" "2.2" "0.6" "1.7" ...
##   ..$ DRB   : chr [1:504] "1.9" "6.2" "1.2" "3.6" ...
##   ..$ TRB   : chr [1:504] "2.9" "8.4" "1.8" "5.3" ...
##   ..$ AST   : chr [1:504] "1.5" "3.0" "1.2" "0.7" ...
##   ..$ STL   : chr [1:504] "0.4" "1.1" "0.5" "0.5" ...
##   ..$ BLK   : chr [1:504] "0.2" "0.5" "0.1" "1.0" ...
##   ..$ TOV   : chr [1:504] "0.5" "2.6" "1.0" "1.6" ...
##   ..$ PF    : chr [1:504] "1.9" "3.0" "1.9" "2.9" ...
##   ..$ PS/G  : chr [1:504] "4.1" "19.9" "7.9" "9.6" ...
##  $ :'data.frame':    607 obs. of  30 variables:
##   ..$ Rk    : chr [1:607] "1" "1" "1" "2" ...
##   ..$ Player: chr [1:607] "Shareef Abdur-Rahim" "Shareef Abdur-Rahim" "Shareef Abdur-Rahim" "Malik Allen" ...
##   ..$ Pos   : chr [1:607] "PF" "PF" "PF" "PF" ...
##   ..$ Age   : chr [1:607] "27" "27" "27" "25" ...
##   ..$ Tm    : chr [1:607] "TOT" "ATL" "POR" "MIA" ...
##   ..$ G     : chr [1:607] "85" "53" "32" "45" ...
##   ..$ GS    : chr [1:607] "56" "53" "3" "6" ...
##   ..$ MP    : chr [1:607] "31.6" "36.9" "22.8" "13.7" ...
##   ..$ FG    : chr [1:607] "5.9" "7.2" "3.7" "1.8" ...
##   ..$ FGA   : chr [1:607] "12.4" "14.9" "8.3" "4.4" ...
##   ..$ FG%   : chr [1:607] ".475" ".485" ".447" ".419" ...
##   ..$ 3P    : chr [1:607] "0.1" "0.1" "0.1" "0.0" ...
##   ..$ 3PA   : chr [1:607] "0.4" "0.4" "0.3" "0.0" ...
##   ..$ 3P%   : chr [1:607] ".265" ".217" ".364" "" ...
##   ..$ 2P    : chr [1:607] "5.8" "7.1" "3.6" "1.8" ...
##   ..$ 2PA   : chr [1:607] "12.0" "14.5" "7.9" "4.4" ...
##   ..$ 2P%   : chr [1:607] ".482" ".493" ".451" ".419" ...
##   ..$ eFG%  : chr [1:607] ".480" ".488" ".455" ".419" ...
##   ..$ FT    : chr [1:607] "4.4" "5.5" "2.5" "0.6" ...
##   ..$ FTA   : chr [1:607] "5.0" "6.3" "3.0" "0.7" ...
##   ..$ FT%   : chr [1:607] ".869" ".880" ".832" ".758" ...
##   ..$ ORB   : chr [1:607] "2.2" "2.7" "1.5" "0.9" ...
##   ..$ DRB   : chr [1:607] "5.3" "6.7" "3.0" "1.7" ...
##   ..$ TRB   : chr [1:607] "7.5" "9.3" "4.5" "2.6" ...
##   ..$ AST   : chr [1:607] "2.0" "2.4" "1.5" "0.4" ...
##   ..$ STL   : chr [1:607] "0.8" "0.8" "0.8" "0.3" ...
##   ..$ BLK   : chr [1:607] "0.4" "0.4" "0.6" "0.6" ...
##   ..$ TOV   : chr [1:607] "2.2" "2.5" "1.7" "0.6" ...
##   ..$ PF    : chr [1:607] "2.6" "2.8" "2.3" "1.8" ...
##   ..$ PS/G  : chr [1:607] "16.3" "20.1" "10.0" "4.2" ...
##  $ :'data.frame':    608 obs. of  30 variables:
##   ..$ Rk    : chr [1:608] "1" "2" "3" "3" ...
##   ..$ Player: chr [1:608] "Shareef Abdur-Rahim" "Cory Alexander" "Malik Allen" "Malik Allen" ...
##   ..$ Pos   : chr [1:608] "PF" "SG" "PF" "PF" ...
##   ..$ Age   : chr [1:608] "28" "31" "26" "26" ...
##   ..$ Tm    : chr [1:608] "POR" "CHA" "TOT" "MIA" ...
##   ..$ G     : chr [1:608] "54" "16" "36" "14" ...
##   ..$ GS    : chr [1:608] "49" "1" "1" "0" ...
##   ..$ MP    : chr [1:608] "34.6" "12.6" "14.4" "17.7" ...
##   ..$ FG    : chr [1:608] "6.2" "1.0" "2.3" "2.5" ...
##   ..$ FGA   : chr [1:608] "12.4" "3.1" "4.9" "5.4" ...
##   ..$ FG%   : chr [1:608] ".503" ".327" ".475" ".461" ...
##   ..$ 3P    : chr [1:608] "0.3" "0.5" "0.0" "0.0" ...
##   ..$ 3PA   : chr [1:608] "0.7" "1.2" "0.0" "0.0" ...
##   ..$ 3P%   : chr [1:608] ".385" ".421" "" "" ...
##   ..$ 2P    : chr [1:608] "6.0" "0.5" "2.3" "2.5" ...
##   ..$ 2PA   : chr [1:608] "11.7" "1.9" "4.9" "5.4" ...
##   ..$ 2P%   : chr [1:608] ".510" ".267" ".475" ".461" ...
##   ..$ eFG%  : chr [1:608] ".514" ".408" ".475" ".461" ...
##   ..$ FT    : chr [1:608] "4.1" "0.6" "0.7" "0.9" ...
##   ..$ FTA   : chr [1:608] "4.7" "0.8" "0.8" "1.0" ...
##   ..$ FT%   : chr [1:608] ".866" ".750" ".929" ".929" ...
##   ..$ ORB   : chr [1:608] "2.3" "0.5" "1.1" "1.7" ...
##   ..$ DRB   : chr [1:608] "5.0" "1.3" "1.6" "2.0" ...
##   ..$ TRB   : chr [1:608] "7.3" "1.8" "2.8" "3.7" ...
##   ..$ AST   : chr [1:608] "2.1" "2.3" "0.5" "0.8" ...
##   ..$ STL   : chr [1:608] "0.9" "0.6" "0.3" "0.3" ...
##   ..$ BLK   : chr [1:608] "0.5" "0.1" "0.6" "0.8" ...
##   ..$ TOV   : chr [1:608] "2.2" "1.2" "0.5" "0.8" ...
##   ..$ PF    : chr [1:608] "2.8" "1.8" "1.5" "2.1" ...
##   ..$ PS/G  : chr [1:608] "16.8" "3.1" "5.4" "5.9" ...
##  $ :'data.frame':    585 obs. of  30 variables:
##   ..$ Rk    : chr [1:585] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:585] "Shareef Abdur-Rahim" "Alex Acker" "Malik Allen" "Ray Allen*" ...
##   ..$ Pos   : chr [1:585] "PF" "SG" "PF" "SG" ...
##   ..$ Age   : chr [1:585] "29" "23" "27" "30" ...
##   ..$ Tm    : chr [1:585] "SAC" "DET" "CHI" "SEA" ...
##   ..$ G     : chr [1:585] "72" "5" "54" "78" ...
##   ..$ GS    : chr [1:585] "30" "0" "20" "78" ...
##   ..$ MP    : chr [1:585] "27.2" "7.0" "13.0" "38.7" ...
##   ..$ FG    : chr [1:585] "4.6" "0.8" "2.2" "8.7" ...
##   ..$ FGA   : chr [1:585] "8.8" "3.2" "4.6" "19.2" ...
##   ..$ FG%   : chr [1:585] ".525" ".250" ".490" ".454" ...
##   ..$ 3P    : chr [1:585] "0.1" "0.2" "0.0" "3.4" ...
##   ..$ 3PA   : chr [1:585] "0.3" "1.0" "0.0" "8.4" ...
##   ..$ 3P%   : chr [1:585] ".227" ".200" "1.000" ".412" ...
##   ..$ 2P    : chr [1:585] "4.5" "0.6" "2.2" "5.3" ...
##   ..$ 2PA   : chr [1:585] "8.5" "2.2" "4.6" "10.9" ...
##   ..$ 2P%   : chr [1:585] ".536" ".273" ".488" ".486" ...
##   ..$ eFG%  : chr [1:585] ".529" ".281" ".492" ".544" ...
##   ..$ FT    : chr [1:585] "3.0" "0.0" "0.4" "4.2" ...
##   ..$ FTA   : chr [1:585] "3.9" "0.0" "0.7" "4.6" ...
##   ..$ FT%   : chr [1:585] ".784" "" ".605" ".903" ...
##   ..$ ORB   : chr [1:585] "1.5" "0.2" "0.8" "0.9" ...
##   ..$ DRB   : chr [1:585] "3.5" "0.8" "1.8" "3.3" ...
##   ..$ TRB   : chr [1:585] "5.0" "1.0" "2.6" "4.3" ...
##   ..$ AST   : chr [1:585] "2.1" "0.8" "0.4" "3.7" ...
##   ..$ STL   : chr [1:585] "0.7" "0.2" "0.3" "1.3" ...
##   ..$ BLK   : chr [1:585] "0.6" "0.0" "0.3" "0.2" ...
##   ..$ TOV   : chr [1:585] "1.5" "0.8" "0.6" "2.4" ...
##   ..$ PF    : chr [1:585] "3.2" "0.8" "1.7" "1.9" ...
##   ..$ PS/G  : chr [1:585] "12.3" "1.8" "4.9" "25.1" ...
##  $ :'data.frame':    538 obs. of  30 variables:
##   ..$ Rk    : chr [1:538] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:538] "Shareef Abdur-Rahim" "Hassan Adams" "Maurice Ager" "LaMarcus Aldridge" ...
##   ..$ Pos   : chr [1:538] "C" "SG" "SG" "C" ...
##   ..$ Age   : chr [1:538] "30" "22" "22" "21" ...
##   ..$ Tm    : chr [1:538] "SAC" "NJN" "DAL" "POR" ...
##   ..$ G     : chr [1:538] "80" "61" "32" "63" ...
##   ..$ GS    : chr [1:538] "45" "8" "1" "22" ...
##   ..$ MP    : chr [1:538] "25.2" "8.1" "6.7" "22.1" ...
##   ..$ FG    : chr [1:538] "3.9" "1.2" "0.7" "3.8" ...
##   ..$ FGA   : chr [1:538] "8.2" "2.2" "2.2" "7.6" ...
##   ..$ FG%   : chr [1:538] ".474" ".556" ".314" ".503" ...
##   ..$ 3P    : chr [1:538] "0.0" "0.0" "0.2" "0.0" ...
##   ..$ 3PA   : chr [1:538] "0.3" "0.0" "0.5" "0.0" ...
##   ..$ 3P%   : chr [1:538] ".150" ".000" ".333" ".000" ...
##   ..$ 2P    : chr [1:538] "3.8" "1.2" "0.5" "3.8" ...
##   ..$ 2PA   : chr [1:538] "7.9" "2.2" "1.7" "7.6" ...
##   ..$ 2P%   : chr [1:538] ".484" ".560" ".309" ".505" ...
##   ..$ eFG%  : chr [1:538] ".476" ".556" ".350" ".503" ...
##   ..$ FT    : chr [1:538] "2.1" "0.4" "0.6" "1.3" ...
##   ..$ FTA   : chr [1:538] "2.9" "0.6" "1.0" "1.8" ...
##   ..$ FT%   : chr [1:538] ".726" ".667" ".606" ".722" ...
##   ..$ ORB   : chr [1:538] "1.5" "0.6" "0.0" "2.3" ...
##   ..$ DRB   : chr [1:538] "3.5" "0.7" "0.6" "2.7" ...
##   ..$ TRB   : chr [1:538] "5.0" "1.3" "0.7" "5.0" ...
##   ..$ AST   : chr [1:538] "1.4" "0.2" "0.2" "0.4" ...
##   ..$ STL   : chr [1:538] "0.7" "0.3" "0.1" "0.3" ...
##   ..$ BLK   : chr [1:538] "0.5" "0.1" "0.1" "1.2" ...
##   ..$ TOV   : chr [1:538] "1.5" "0.4" "0.5" "0.7" ...
##   ..$ PF    : chr [1:538] "3.0" "0.8" "0.8" "3.0" ...
##   ..$ PS/G  : chr [1:538] "9.9" "2.9" "2.2" "9.0" ...
##  $ :'data.frame':    617 obs. of  30 variables:
##   ..$ Rk    : chr [1:617] "1" "2" "3" "3" ...
##   ..$ Player: chr [1:617] "Shareef Abdur-Rahim" "Arron Afflalo" "Maurice Ager" "Maurice Ager" ...
##   ..$ Pos   : chr [1:617] "PF" "SG" "SG" "SG" ...
##   ..$ Age   : chr [1:617] "31" "22" "23" "23" ...
##   ..$ Tm    : chr [1:617] "SAC" "DET" "TOT" "DAL" ...
##   ..$ G     : chr [1:617] "6" "75" "26" "12" ...
##   ..$ GS    : chr [1:617] "0" "9" "3" "3" ...
##   ..$ MP    : chr [1:617] "8.5" "12.9" "6.3" "6.4" ...
##   ..$ FG    : chr [1:617] "0.5" "1.3" "0.8" "0.4" ...
##   ..$ FGA   : chr [1:617] "2.3" "3.2" "2.5" "2.3" ...
##   ..$ FG%   : chr [1:617] ".214" ".411" ".323" ".185" ...
##   ..$ 3P    : chr [1:617] "0.0" "0.1" "0.1" "0.0" ...
##   ..$ 3PA   : chr [1:617] "0.0" "0.6" "0.7" "0.7" ...
##   ..$ 3P%   : chr [1:617] "" ".208" ".158" ".000" ...
##   ..$ 2P    : chr [1:617] "0.5" "1.2" "0.7" "0.4" ...
##   ..$ 2PA   : chr [1:617] "2.3" "2.6" "1.8" "1.6" ...
##   ..$ 2P%   : chr [1:617] ".214" ".461" ".391" ".263" ...
##   ..$ eFG%  : chr [1:617] ".214" ".432" ".346" ".185" ...
##   ..$ FT    : chr [1:617] "0.7" "0.9" "0.2" "0.4" ...
##   ..$ FTA   : chr [1:617] "0.7" "1.2" "0.5" "0.5" ...
##   ..$ FT%   : chr [1:617] "1.000" ".782" ".500" ".833" ...
##   ..$ ORB   : chr [1:617] "1.0" "0.5" "0.2" "0.1" ...
##   ..$ DRB   : chr [1:617] "0.7" "1.3" "0.3" "0.3" ...
##   ..$ TRB   : chr [1:617] "1.7" "1.8" "0.5" "0.3" ...
##   ..$ AST   : chr [1:617] "0.7" "0.7" "0.3" "0.3" ...
##   ..$ STL   : chr [1:617] "0.2" "0.4" "0.0" "0.0" ...
##   ..$ BLK   : chr [1:617] "0.0" "0.1" "0.0" "0.1" ...
##   ..$ TOV   : chr [1:617] "0.2" "0.5" "0.2" "0.3" ...
##   ..$ PF    : chr [1:617] "1.5" "1.1" "0.7" "0.9" ...
##   ..$ PS/G  : chr [1:617] "1.7" "3.7" "2.0" "1.3" ...
##  $ :'data.frame':    604 obs. of  30 variables:
##   ..$ Rk    : chr [1:604] "1" "1" "1" "2" ...
##   ..$ Player: chr [1:604] "Alex Acker" "Alex Acker" "Alex Acker" "Hassan Adams" ...
##   ..$ Pos   : chr [1:604] "SG" "SG" "SG" "SG" ...
##   ..$ Age   : chr [1:604] "26" "26" "26" "24" ...
##   ..$ Tm    : chr [1:604] "TOT" "DET" "LAC" "TOR" ...
##   ..$ G     : chr [1:604] "25" "7" "18" "12" ...
##   ..$ GS    : chr [1:604] "0" "0" "0" "0" ...
##   ..$ MP    : chr [1:604] "8.0" "2.9" "9.9" "4.3" ...
##   ..$ FG    : chr [1:604] "1.2" "0.6" "1.4" "0.3" ...
##   ..$ FGA   : chr [1:604] "3.0" "1.6" "3.6" "1.1" ...
##   ..$ FG%   : chr [1:604] ".395" ".364" ".400" ".308" ...
##   ..$ 3P    : chr [1:604] "0.3" "0.0" "0.4" "0.0" ...
##   ..$ 3PA   : chr [1:604] "0.8" "0.6" "0.9" "0.0" ...
##   ..$ 3P%   : chr [1:604] ".350" ".000" ".438" "" ...
##   ..$ 2P    : chr [1:604] "0.9" "0.6" "1.1" "0.3" ...
##   ..$ 2PA   : chr [1:604] "2.2" "1.0" "2.7" "1.1" ...
##   ..$ 2P%   : chr [1:604] ".411" ".571" ".388" ".308" ...
##   ..$ eFG%  : chr [1:604] ".441" ".364" ".454" ".308" ...
##   ..$ FT    : chr [1:604] "0.2" "0.1" "0.2" "0.3" ...
##   ..$ FTA   : chr [1:604] "0.4" "0.3" "0.4" "0.5" ...
##   ..$ FT%   : chr [1:604] ".500" ".500" ".500" ".500" ...
##   ..$ ORB   : chr [1:604] "0.3" "0.0" "0.4" "0.1" ...
##   ..$ DRB   : chr [1:604] "0.6" "0.3" "0.8" "0.5" ...
##   ..$ TRB   : chr [1:604] "1.0" "0.3" "1.2" "0.6" ...
##   ..$ AST   : chr [1:604] "0.5" "0.1" "0.6" "0.1" ...
##   ..$ STL   : chr [1:604] "0.2" "0.3" "0.2" "0.1" ...
##   ..$ BLK   : chr [1:604] "0.2" "0.1" "0.2" "0.1" ...
##   ..$ TOV   : chr [1:604] "0.3" "0.0" "0.4" "0.3" ...
##   ..$ PF    : chr [1:604] "0.4" "0.0" "0.5" "0.3" ...
##   ..$ PS/G  : chr [1:604] "2.9" "1.3" "3.5" "0.9" ...
##  $ :'data.frame':    600 obs. of  30 variables:
##   ..$ Rk    : chr [1:600] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:600] "Arron Afflalo" "Alexis Ajinca" "LaMarcus Aldridge" "Joe Alexander" ...
##   ..$ Pos   : chr [1:600] "SG" "C" "PF" "SF" ...
##   ..$ Age   : chr [1:600] "24" "21" "24" "23" ...
##   ..$ Tm    : chr [1:600] "DEN" "CHA" "POR" "CHI" ...
##   ..$ G     : chr [1:600] "82" "6" "78" "8" ...
##   ..$ GS    : chr [1:600] "75" "0" "78" "0" ...
##   ..$ MP    : chr [1:600] "27.1" "5.0" "37.5" "3.6" ...
##   ..$ FG    : chr [1:600] "3.3" "0.8" "7.4" "0.1" ...
##   ..$ FGA   : chr [1:600] "7.1" "1.7" "15.0" "0.8" ...
##   ..$ FG%   : chr [1:600] ".465" ".500" ".495" ".167" ...
##   ..$ 3P    : chr [1:600] "1.3" "0.0" "0.1" "0.0" ...
##   ..$ 3PA   : chr [1:600] "3.0" "0.0" "0.2" "0.1" ...
##   ..$ 3P%   : chr [1:600] ".434" "" ".313" ".000" ...
##   ..$ 2P    : chr [1:600] "2.0" "0.8" "7.4" "0.1" ...
##   ..$ 2PA   : chr [1:600] "4.1" "1.7" "14.8" "0.6" ...
##   ..$ 2P%   : chr [1:600] ".488" ".500" ".498" ".200" ...
##   ..$ eFG%  : chr [1:600] ".557" ".500" ".497" ".167" ...
##   ..$ FT    : chr [1:600] "0.9" "0.0" "2.9" "0.3" ...
##   ..$ FTA   : chr [1:600] "1.2" "0.2" "3.9" "0.4" ...
##   ..$ FT%   : chr [1:600] ".735" ".000" ".757" ".667" ...
##   ..$ ORB   : chr [1:600] "0.7" "0.2" "2.5" "0.3" ...
##   ..$ DRB   : chr [1:600] "2.4" "0.5" "5.6" "0.4" ...
##   ..$ TRB   : chr [1:600] "3.1" "0.7" "8.0" "0.6" ...
##   ..$ AST   : chr [1:600] "1.7" "0.0" "2.1" "0.3" ...
##   ..$ STL   : chr [1:600] "0.6" "0.2" "0.9" "0.1" ...
##   ..$ BLK   : chr [1:600] "0.4" "0.2" "0.6" "0.1" ...
##   ..$ TOV   : chr [1:600] "0.9" "0.3" "1.3" "0.0" ...
##   ..$ PF    : chr [1:600] "2.7" "0.8" "3.0" "1.1" ...
##   ..$ PS/G  : chr [1:600] "8.8" "1.7" "17.9" "0.5" ...
##  $ :'data.frame':    647 obs. of  30 variables:
##   ..$ Rk    : chr [1:647] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:647] "Jeff Adrien" "Arron Afflalo" "Maurice Ager" "Alexis Ajinca" ...
##   ..$ Pos   : chr [1:647] "PF" "SG" "SG" "C" ...
##   ..$ Age   : chr [1:647] "24" "25" "26" "22" ...
##   ..$ Tm    : chr [1:647] "GSW" "DEN" "MIN" "TOT" ...
##   ..$ G     : chr [1:647] "23" "69" "4" "34" ...
##   ..$ GS    : chr [1:647] "0" "69" "0" "2" ...
##   ..$ MP    : chr [1:647] "8.5" "33.7" "7.3" "10.0" ...
##   ..$ FG    : chr [1:647] "1.0" "4.5" "1.5" "1.7" ...
##   ..$ FGA   : chr [1:647] "2.3" "9.1" "2.8" "3.9" ...
##   ..$ FG%   : chr [1:647] ".426" ".498" ".545" ".444" ...
##   ..$ 3P    : chr [1:647] "0.0" "1.5" "0.8" "0.4" ...
##   ..$ 3PA   : chr [1:647] "0.0" "3.6" "1.0" "1.0" ...
##   ..$ 3P%   : chr [1:647] "" ".423" ".750" ".353" ...
##   ..$ 2P    : chr [1:647] "1.0" "3.0" "0.8" "1.4" ...
##   ..$ 2PA   : chr [1:647] "2.3" "5.5" "1.8" "2.9" ...
##   ..$ 2P%   : chr [1:647] ".426" ".546" ".429" ".475" ...
##   ..$ eFG%  : chr [1:647] ".426" ".581" ".682" ".489" ...
##   ..$ FT    : chr [1:647] "0.5" "2.0" "0.0" "0.4" ...
##   ..$ FTA   : chr [1:647] "0.8" "2.4" "0.0" "0.5" ...
##   ..$ FT%   : chr [1:647] ".579" ".847" "" ".722" ...
##   ..$ ORB   : chr [1:647] "1.0" "0.7" "0.0" "0.5" ...
##   ..$ DRB   : chr [1:647] "1.5" "3.0" "0.5" "1.8" ...
##   ..$ TRB   : chr [1:647] "2.5" "3.6" "0.5" "2.3" ...
##   ..$ AST   : chr [1:647] "0.4" "2.4" "0.3" "0.3" ...
##   ..$ STL   : chr [1:647] "0.2" "0.5" "0.3" "0.3" ...
##   ..$ BLK   : chr [1:647] "0.2" "0.4" "0.0" "0.6" ...
##   ..$ TOV   : chr [1:647] "0.4" "1.0" "1.0" "0.5" ...
##   ..$ PF    : chr [1:647] "1.2" "2.2" "1.0" "2.1" ...
##   ..$ PS/G  : chr [1:647] "2.5" "12.6" "3.8" "4.2" ...
##  $ :'data.frame':    574 obs. of  30 variables:
##   ..$ Rk    : chr [1:574] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:574] "Jeff Adrien" "Arron Afflalo" "Blake Ahearn" "Solomon Alabi" ...
##   ..$ Pos   : chr [1:574] "PF" "SG" "PG" "C" ...
##   ..$ Age   : chr [1:574] "25" "26" "27" "23" ...
##   ..$ Tm    : chr [1:574] "HOU" "DEN" "UTA" "TOR" ...
##   ..$ G     : chr [1:574] "8" "62" "4" "14" ...
##   ..$ GS    : chr [1:574] "0" "62" "0" "0" ...
##   ..$ MP    : chr [1:574] "7.9" "33.6" "7.5" "8.7" ...
##   ..$ FG    : chr [1:574] "0.9" "5.3" "1.0" "0.9" ...
##   ..$ FGA   : chr [1:574] "2.0" "11.3" "3.5" "2.6" ...
##   ..$ FG%   : chr [1:574] ".438" ".471" ".286" ".361" ...
##   ..$ 3P    : chr [1:574] "0.0" "1.4" "0.5" "0.0" ...
##   ..$ 3PA   : chr [1:574] "0.0" "3.6" "2.3" "0.0" ...
##   ..$ 3P%   : chr [1:574] "" ".398" ".222" "" ...
##   ..$ 2P    : chr [1:574] "0.9" "3.9" "0.5" "0.9" ...
##   ..$ 2PA   : chr [1:574] "2.0" "7.7" "1.3" "2.6" ...
##   ..$ 2P%   : chr [1:574] ".438" ".504" ".400" ".361" ...
##   ..$ eFG%  : chr [1:574] ".438" ".534" ".357" ".361" ...
##   ..$ FT    : chr [1:574] "0.9" "3.2" "0.0" "0.5" ...
##   ..$ FTA   : chr [1:574] "1.5" "4.0" "0.0" "0.6" ...
##   ..$ FT%   : chr [1:574] ".583" ".798" "" ".875" ...
##   ..$ ORB   : chr [1:574] "0.6" "0.6" "0.0" "1.1" ...
##   ..$ DRB   : chr [1:574] "2.1" "2.5" "0.5" "2.3" ...
##   ..$ TRB   : chr [1:574] "2.8" "3.2" "0.5" "3.4" ...
##   ..$ AST   : chr [1:574] "0.1" "2.4" "0.3" "0.2" ...
##   ..$ STL   : chr [1:574] "0.0" "0.6" "0.0" "0.1" ...
##   ..$ BLK   : chr [1:574] "0.3" "0.2" "0.0" "0.6" ...
##   ..$ TOV   : chr [1:574] "0.3" "1.4" "1.3" "0.4" ...
##   ..$ PF    : chr [1:574] "1.6" "2.2" "1.0" "0.8" ...
##   ..$ PS/G  : chr [1:574] "2.6" "15.2" "2.5" "2.4" ...
##  $ :'data.frame':    596 obs. of  30 variables:
##   ..$ Rk    : chr [1:596] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:596] "Quincy Acy" "Jeff Adrien" "Arron Afflalo" "Josh Akognon" ...
##   ..$ Pos   : chr [1:596] "PF" "PF" "SF" "PG" ...
##   ..$ Age   : chr [1:596] "22" "26" "27" "26" ...
##   ..$ Tm    : chr [1:596] "TOR" "CHA" "ORL" "DAL" ...
##   ..$ G     : chr [1:596] "29" "52" "64" "3" ...
##   ..$ GS    : chr [1:596] "0" "5" "64" "0" ...
##   ..$ MP    : chr [1:596] "11.8" "13.7" "36.0" "3.0" ...
##   ..$ FG    : chr [1:596] "1.4" "1.4" "6.2" "0.7" ...
##   ..$ FGA   : chr [1:596] "2.6" "3.2" "14.1" "1.3" ...
##   ..$ FG%   : chr [1:596] ".560" ".429" ".439" ".500" ...
##   ..$ 3P    : chr [1:596] "0.0" "0.0" "1.1" "0.3" ...
##   ..$ 3PA   : chr [1:596] "0.1" "0.0" "3.8" "0.7" ...
##   ..$ 3P%   : chr [1:596] ".500" ".000" ".300" ".500" ...
##   ..$ 2P    : chr [1:596] "1.4" "1.4" "5.1" "0.3" ...
##   ..$ 2PA   : chr [1:596] "2.5" "3.2" "10.4" "0.7" ...
##   ..$ 2P%   : chr [1:596] ".562" ".434" ".489" ".500" ...
##   ..$ eFG%  : chr [1:596] ".567" ".429" ".478" ".625" ...
##   ..$ FT    : chr [1:596] "1.1" "1.3" "3.0" "0.0" ...
##   ..$ FTA   : chr [1:596] "1.3" "1.9" "3.5" "0.0" ...
##   ..$ FT%   : chr [1:596] ".816" ".650" ".857" "" ...
##   ..$ ORB   : chr [1:596] "1.0" "1.3" "0.5" "0.0" ...
##   ..$ DRB   : chr [1:596] "1.6" "2.5" "3.3" "0.3" ...
##   ..$ TRB   : chr [1:596] "2.7" "3.8" "3.7" "0.3" ...
##   ..$ AST   : chr [1:596] "0.4" "0.7" "3.2" "0.3" ...
##   ..$ STL   : chr [1:596] "0.4" "0.3" "0.6" "0.0" ...
##   ..$ BLK   : chr [1:596] "0.5" "0.5" "0.2" "0.0" ...
##   ..$ TOV   : chr [1:596] "0.6" "0.6" "2.2" "0.0" ...
##   ..$ PF    : chr [1:596] "1.8" "1.5" "2.1" "1.0" ...
##   ..$ PS/G  : chr [1:596] "4.0" "4.0" "16.5" "1.7" ...
##  $ :'data.frame':    635 obs. of  30 variables:
##   ..$ Rk    : chr [1:635] "1" "1" "1" "2" ...
##   ..$ Player: chr [1:635] "Quincy Acy" "Quincy Acy" "Quincy Acy" "Steven Adams" ...
##   ..$ Pos   : chr [1:635] "SF" "SF" "SF" "C" ...
##   ..$ Age   : chr [1:635] "23" "23" "23" "20" ...
##   ..$ Tm    : chr [1:635] "TOT" "TOR" "SAC" "OKC" ...
##   ..$ G     : chr [1:635] "63" "7" "56" "81" ...
##   ..$ GS    : chr [1:635] "0" "0" "0" "20" ...
##   ..$ MP    : chr [1:635] "13.4" "8.7" "14.0" "14.8" ...
##   ..$ FG    : chr [1:635] "1.0" "0.9" "1.1" "1.1" ...
##   ..$ FGA   : chr [1:635] "2.2" "2.0" "2.3" "2.3" ...
##   ..$ FG%   : chr [1:635] ".468" ".429" ".472" ".503" ...
##   ..$ 3P    : chr [1:635] "0.1" "0.3" "0.0" "0.0" ...
##   ..$ 3PA   : chr [1:635] "0.2" "0.7" "0.2" "0.0" ...
##   ..$ 3P%   : chr [1:635] ".267" ".400" ".200" "" ...
##   ..$ 2P    : chr [1:635] "1.0" "0.6" "1.0" "1.1" ...
##   ..$ 2PA   : chr [1:635] "2.0" "1.3" "2.1" "2.3" ...
##   ..$ 2P%   : chr [1:635] ".492" ".444" ".496" ".503" ...
##   ..$ eFG%  : chr [1:635] ".482" ".500" ".480" ".503" ...
##   ..$ FT    : chr [1:635] "0.6" "0.7" "0.5" "1.0" ...
##   ..$ FTA   : chr [1:635] "0.8" "1.1" "0.8" "1.7" ...
##   ..$ FT%   : chr [1:635] ".660" ".625" ".667" ".581" ...
##   ..$ ORB   : chr [1:635] "1.1" "0.7" "1.2" "1.8" ...
##   ..$ DRB   : chr [1:635] "2.3" "1.4" "2.4" "2.3" ...
##   ..$ TRB   : chr [1:635] "3.4" "2.1" "3.6" "4.1" ...
##   ..$ AST   : chr [1:635] "0.4" "0.6" "0.4" "0.5" ...
##   ..$ STL   : chr [1:635] "0.4" "0.6" "0.3" "0.5" ...
##   ..$ BLK   : chr [1:635] "0.4" "0.4" "0.4" "0.7" ...
##   ..$ TOV   : chr [1:635] "0.5" "0.3" "0.5" "0.9" ...
##   ..$ PF    : chr [1:635] "1.9" "1.1" "2.0" "2.5" ...
##   ..$ PS/G  : chr [1:635] "2.7" "2.7" "2.7" "3.3" ...
##  $ :'data.frame':    675 obs. of  30 variables:
##   ..$ Rk    : chr [1:675] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:675] "Quincy Acy" "Jordan Adams" "Steven Adams" "Jeff Adrien" ...
##   ..$ Pos   : chr [1:675] "PF" "SG" "C" "PF" ...
##   ..$ Age   : chr [1:675] "24" "20" "21" "28" ...
##   ..$ Tm    : chr [1:675] "NYK" "MEM" "OKC" "MIN" ...
##   ..$ G     : chr [1:675] "68" "30" "70" "17" ...
##   ..$ GS    : chr [1:675] "22" "0" "67" "0" ...
##   ..$ MP    : chr [1:675] "18.9" "8.3" "25.3" "12.6" ...
##   ..$ FG    : chr [1:675] "2.2" "1.2" "3.1" "1.1" ...
##   ..$ FGA   : chr [1:675] "4.9" "2.9" "5.7" "2.6" ...
##   ..$ FG%   : chr [1:675] ".459" ".407" ".544" ".432" ...
##   ..$ 3P    : chr [1:675] "0.3" "0.3" "0.0" "0.0" ...
##   ..$ 3PA   : chr [1:675] "0.9" "0.8" "0.0" "0.0" ...
##   ..$ 3P%   : chr [1:675] ".300" ".400" ".000" "" ...
##   ..$ 2P    : chr [1:675] "2.0" "0.8" "3.1" "1.1" ...
##   ..$ 2PA   : chr [1:675] "4.0" "2.0" "5.7" "2.6" ...
##   ..$ 2P%   : chr [1:675] ".494" ".410" ".547" ".432" ...
##   ..$ eFG%  : chr [1:675] ".486" ".465" ".544" ".432" ...
##   ..$ FT    : chr [1:675] "1.1" "0.5" "1.5" "1.3" ...
##   ..$ FTA   : chr [1:675] "1.4" "0.8" "2.9" "2.2" ...
##   ..$ FT%   : chr [1:675] ".784" ".609" ".502" ".579" ...
##   ..$ ORB   : chr [1:675] "1.2" "0.3" "2.8" "1.4" ...
##   ..$ DRB   : chr [1:675] "3.3" "0.6" "4.6" "3.2" ...
##   ..$ TRB   : chr [1:675] "4.4" "0.9" "7.5" "4.5" ...
##   ..$ AST   : chr [1:675] "1.0" "0.5" "0.9" "0.9" ...
##   ..$ STL   : chr [1:675] "0.4" "0.5" "0.5" "0.2" ...
##   ..$ BLK   : chr [1:675] "0.3" "0.2" "1.2" "0.5" ...
##   ..$ TOV   : chr [1:675] "0.9" "0.5" "1.4" "0.5" ...
##   ..$ PF    : chr [1:675] "2.2" "0.8" "3.2" "1.8" ...
##   ..$ PS/G  : chr [1:675] "5.9" "3.1" "7.7" "3.5" ...
##  $ :'data.frame':    601 obs. of  30 variables:
##   ..$ Rk    : chr [1:601] "1" "2" "3" "4" ...
##   ..$ Player: chr [1:601] "Quincy Acy" "Jordan Adams" "Steven Adams" "Arron Afflalo" ...
##   ..$ Pos   : chr [1:601] "PF" "SG" "C" "SG" ...
##   ..$ Age   : chr [1:601] "25" "21" "22" "30" ...
##   ..$ Tm    : chr [1:601] "SAC" "MEM" "OKC" "NYK" ...
##   ..$ G     : chr [1:601] "59" "2" "80" "71" ...
##   ..$ GS    : chr [1:601] "29" "0" "80" "57" ...
##   ..$ MP    : chr [1:601] "14.8" "7.5" "25.2" "33.4" ...
##   ..$ FG    : chr [1:601] "2.0" "1.0" "3.3" "5.0" ...
##   ..$ FGA   : chr [1:601] "3.6" "3.0" "5.3" "11.3" ...
##   ..$ FG%   : chr [1:601] ".556" ".333" ".613" ".443" ...
##   ..$ 3P    : chr [1:601] "0.3" "0.0" "0.0" "1.3" ...
##   ..$ 3PA   : chr [1:601] "0.8" "0.5" "0.0" "3.4" ...
##   ..$ 3P%   : chr [1:601] ".388" ".000" "" ".382" ...
##   ..$ 2P    : chr [1:601] "1.7" "1.0" "3.3" "3.7" ...
##   ..$ 2PA   : chr [1:601] "2.8" "2.5" "5.3" "7.9" ...
##   ..$ 2P%   : chr [1:601] ".606" ".400" ".613" ".469" ...
##   ..$ eFG%  : chr [1:601] ".600" ".333" ".613" ".500" ...
##   ..$ FT    : chr [1:601] "0.8" "1.5" "1.4" "1.5" ...
##   ..$ FTA   : chr [1:601] "1.2" "2.5" "2.5" "1.8" ...
##   ..$ FT%   : chr [1:601] ".735" ".600" ".582" ".840" ...
##   ..$ ORB   : chr [1:601] "1.1" "0.0" "2.7" "0.3" ...
##   ..$ DRB   : chr [1:601] "2.1" "1.0" "3.9" "3.4" ...
##   ..$ TRB   : chr [1:601] "3.2" "1.0" "6.7" "3.7" ...
##   ..$ AST   : chr [1:601] "0.5" "1.5" "0.8" "2.0" ...
##   ..$ STL   : chr [1:601] "0.5" "1.5" "0.5" "0.4" ...
##   ..$ BLK   : chr [1:601] "0.4" "0.0" "1.1" "0.1" ...
##   ..$ TOV   : chr [1:601] "0.5" "1.0" "1.1" "1.2" ...
##   ..$ PF    : chr [1:601] "1.7" "1.0" "2.8" "2.0" ...
##   ..$ PS/G  : chr [1:601] "5.2" "3.5" "8.0" "12.8" ...
##  $ :'data.frame':    619 obs. of  30 variables:
##   ..$ Rk    : chr [1:619] "1" "2" "2" "2" ...
##   ..$ Player: chr [1:619] "Alex Abrines" "Quincy Acy" "Quincy Acy" "Quincy Acy" ...
##   ..$ Pos   : chr [1:619] "SG" "PF" "PF" "PF" ...
##   ..$ Age   : chr [1:619] "23" "26" "26" "26" ...
##   ..$ Tm    : chr [1:619] "OKC" "TOT" "DAL" "BRK" ...
##   ..$ G     : chr [1:619] "68" "38" "6" "32" ...
##   ..$ GS    : chr [1:619] "6" "1" "0" "1" ...
##   ..$ MP    : chr [1:619] "15.5" "14.7" "8.0" "15.9" ...
##   ..$ FG    : chr [1:619] "2.0" "1.8" "0.8" "2.0" ...
##   ..$ FGA   : chr [1:619] "5.0" "4.5" "2.8" "4.8" ...
##   ..$ FG%   : chr [1:619] ".393" ".412" ".294" ".425" ...
##   ..$ 3P    : chr [1:619] "1.4" "1.0" "0.2" "1.1" ...
##   ..$ 3PA   : chr [1:619] "3.6" "2.4" "1.2" "2.6" ...
##   ..$ 3P%   : chr [1:619] ".381" ".411" ".143" ".434" ...
##   ..$ 2P    : chr [1:619] "0.6" "0.9" "0.7" "0.9" ...
##   ..$ 2PA   : chr [1:619] "1.4" "2.1" "1.7" "2.2" ...
##   ..$ 2P%   : chr [1:619] ".426" ".413" ".400" ".414" ...
##   ..$ eFG%  : chr [1:619] ".531" ".521" ".324" ".542" ...
##   ..$ FT    : chr [1:619] "0.6" "1.2" "0.3" "1.3" ...
##   ..$ FTA   : chr [1:619] "0.7" "1.6" "0.5" "1.8" ...
##   ..$ FT%   : chr [1:619] ".898" ".750" ".667" ".754" ...
##   ..$ ORB   : chr [1:619] "0.3" "0.5" "0.3" "0.6" ...
##   ..$ DRB   : chr [1:619] "1.0" "2.5" "1.0" "2.8" ...
##   ..$ TRB   : chr [1:619] "1.3" "3.0" "1.3" "3.3" ...
##   ..$ AST   : chr [1:619] "0.6" "0.5" "0.0" "0.6" ...
##   ..$ STL   : chr [1:619] "0.5" "0.4" "0.0" "0.4" ...
##   ..$ BLK   : chr [1:619] "0.1" "0.4" "0.0" "0.5" ...
##   ..$ TOV   : chr [1:619] "0.5" "0.6" "0.3" "0.6" ...
##   ..$ PF    : chr [1:619] "1.7" "1.8" "1.5" "1.8" ...
##   ..$ PS/G  : chr [1:619] "6.0" "5.8" "2.2" "6.5" ...

Another functional idiom: Reduce

Map takes a list and returns a list. In our script, it takes a list of years as input and returns a list of data frames. We now need to roll all those data frames into a single data frame.

That’s what reduce does. It takes a list and applies a two-argument function to consecutive pairs, accumulating the result: first to elements one and two, then to that result and element three, and so on until a single value is left.

lst <- list(1, 2, 3, 4) 
purrr::reduce(lst, sum) # don't do this in real life!
## [1] 10

Of course, since R’s sum already works on a whole vector, you never need to do that.
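
To make the pairwise folding concrete, here is a minimal sketch. It uses base R's Reduce as a stand-in for purrr::reduce (the post's script does not use it) because its accumulate argument keeps every intermediate result. The last call shows the direct alternative that makes reduce unnecessary for plain sums.

```r
lst <- list(1, 2, 3, 4)

# Base R's Reduce folds left to right: ((1 + 2) + 3) + 4.
# accumulate = TRUE keeps each intermediate total, not just the final one.
running <- Reduce(`+`, lst, accumulate = TRUE)
running
## [1]  1  3  6 10

# The direct alternative: flatten the list and let sum() take the whole vector
sum(unlist(lst))
## [1] 10
```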

Reduce works great with map:

lst <- list(1, 2, 3, 4) 
# Find the sum of the square root of 1, 2, 3, 4
purrr::map(lst, sqrt) %>% 
  purrr::reduce(sum)
## [1] 6.146264

Another example: get the total number of rows across a list of data frames:

list(mtcars, iris) %>% 
  purrr::map(nrow) %>% 
  purrr::reduce(sum)
## [1] 182

In our case, what we need is to use reduce with bind_rows, so that the rows of every data frame in our list accumulate into a single data frame.
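
Before running this on the scraped tables, the idea can be tried on toy data. This is only a sketch: season_a and season_b are made-up data frames standing in for two seasons of results, not real statistics.

```r
library(purrr)
library(dplyr)

# Two tiny, made-up data frames standing in for two scraped seasons
season_a <- data.frame(Player = c("A", "B"), G = c("10", "20"),
                       stringsAsFactors = FALSE)
season_b <- data.frame(Player = "C", G = "30",
                       stringsAsFactors = FALSE)

# reduce() calls bind_rows(season_a, season_b); with a longer list it would
# keep binding the running result to the next data frame
combined <- list(season_a, season_b) %>%
  reduce(bind_rows)

nrow(combined)
## [1] 3
```

bind_rows also accepts a list of data frames directly, so bind_rows(list(season_a, season_b)) should give the same result. The reduce version is kept here because it illustrates the idiom.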

Putting all the pieces together in the final script:

library(glue)
library(purrr)
library(dplyr)
library(rvest)

# Download data table from basketball reference
get_player_data <- function(year) {
  
  base_nba_url <- "https://www.basketball-reference.com/leagues/NBA_{year}_per_game.html"
  
  glue(base_nba_url) %>% 
    read_html() %>% 
    html_table() %>% 
    `[[`(1) %>% 
    mutate(Year = year)
}

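# The year in the URL is the year a season ends, so 2000:2018
# requests the 1999/2000 through 2017/2018 seasons
data <- as.list(2000:2018) %>% 
  purrr::map(get_player_data) %>%   # one data frame per season
  purrr::reduce(bind_rows) %>%      # stack them into a single data frame
  mutate_at(vars(G:`PS/G`, Age), as.numeric) %>%  # stats arrive as character
  mutate_at(vars(Tm, Pos), factor)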

#saveRDS(data, file="~/Desktop/data_science/nba_data_00_18.rds")
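
The type-conversion step at the end of the script can be checked in isolation. The sketch below uses a made-up two-row data frame (toy is hypothetical, not scraped data) and swaps in across(), which dplyr 1.0 and later recommend in place of mutate_at. The script's mutate_at calls still work.

```r
library(dplyr)

# Made-up rows mimicking the all-character columns returned by html_table()
toy <- data.frame(
  Player = c("A", "B"),
  Tm     = c("OKC", "SAC"),
  G      = c("68", "61"),
  `PS/G` = c("6.0", "8.4"),
  check.names = FALSE,       # keep the non-syntactic name PS/G as is
  stringsAsFactors = FALSE
)

clean <- toy %>%
  mutate(across(G:`PS/G`, as.numeric)) %>%  # character stats -> numeric
  mutate(across(Tm, factor))                # team code -> factor

str(clean)
```

Any cell that does not parse as a number becomes NA with a warning, so it is worth counting the NAs after the conversion.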

What to do with the data