This document contains the descriptive statistics (number of non-missing observations, mean, standard deviation, minimum value, maximum value and, where relevant, internal reliability) of the variables used in the current coordinated analyses. We provide the code used to generate this table, which calls upon our analysis summary objects (also provided).

Variable N Valid Mean SD Min Max \(\alpha\)
BASEII
age 1613 61.56 16.58 22.00 87.00
gender 1613 0.50 0.50 0.00 1.00
edu 1613 14.42 2.89 7.00 18.00
neur 1613 3.71 1.28 1.00 7.00 0.68
con 1613 5.59 0.99 1.67 7.00 0.61
extra 1613 4.77 1.17 1.00 7.00 0.69
agree 1613 5.21 0.98 1.33 7.00 0.46
open 1613 5.32 0.98 1.00 7.00 0.67
smoker 1608 0.12 0.32 0.00 1.00
drinker 620 0.03 0.16 0.00 1.00
active 1564 0.68 0.47 0.00 1.00
EAS
edu 799 14.40 3.36 3.00 24.00
gender 799 0.61 0.49 0.00 1.00
age 799 79.01 5.36 69.00 99.00
neur 799 3.85 0.65 1.80 5.00 0.75
con 799 3.81 0.65 1.00 5.00 0.79
extra 799 3.34 0.64 1.20 4.90 0.76
agree 799 4.04 0.53 2.00 5.00 0.70
open 799 3.67 0.63 1.80 5.00 0.70
smoker 797 0.04 0.20 0.00 1.00
drinker 798 0.76 0.43 0.00 1.00
active 727 0.67 0.47 0.00 1.00
ELSA
age 8832 66.31 9.52 29.00 99.00
gender 8832 0.56 0.50 0.00 1.00
edu 8832 4.12 2.24 1.00 7.00
neur 8832 2.10 0.60 1.00 4.00 0.68
con 8832 3.29 0.50 1.00 4.00 0.67
extra 8831 3.15 0.56 1.00 4.00 0.75
agree 8829 3.51 0.48 1.00 4.00 0.80
open 8819 2.88 0.56 1.00 4.00 0.79
smoker 8832 0.12 0.33 0.00 1.00
drinker 8647 0.01 0.11 0.00 1.00
active 8832 0.94 0.24 0.00 1.00
HRS
gender 19242 0.59 0.49 0.00 1.00
edu 19242 12.65 3.13 0.00 17.00
age 19242 66.27 11.16 25.00 105.00
neur 19242 2.07 0.63 1.00 4.00 0.71
con 19242 3.35 0.49 1.00 4.00 0.66
extra 19242 3.19 0.56 1.00 4.00 0.74
agree 19241 3.52 0.49 1.00 4.00 0.78
open 19212 2.94 0.57 1.00 4.00 0.79
smoker 19125 0.15 0.36 0.00 1.00
drinker 19206 0.06 0.25 0.00 1.00
active 19238 0.94 0.24 0.00 1.00
ILSE
age 482 62.51 0.96 60.00 64.00
gender 482 0.52 0.50 0.00 1.00
edu 482 2.43 1.03 1.00 4.00
neur 482 2.56 0.58 1.08 4.33 0.76
con 482 3.94 0.43 2.45 4.92 0.73
extra 482 3.35 0.47 1.67 4.67 0.70
agree 482 3.70 0.39 2.58 4.75 0.62
open 482 3.36 0.40 2.00 4.58 0.48
smoker 480 0.21 0.41 0.00 1.00
active 477 0.50 0.50 0.00 1.00
drinker 351 0.25 0.44 0.00 1.00
LBC
gender 962 1.51 0.50 1.00 2.00
edu 962 10.77 1.12 8.00 14.00
age 962 69.50 0.84 67.61 71.30
neur 962 1.43 0.64 0.00 3.92 0.87
con 962 2.89 0.50 0.92 4.00 0.86
extra 962 2.25 0.49 0.50 3.58 0.79
agree 962 2.76 0.44 1.33 3.92 0.73
open 962 2.17 0.48 0.75 3.58 0.72
drinker 962 0.87 0.34 0.00 1.00
smoker 962 0.11 0.10 0.00 1.00
active 951 0.69 0.46 0.00 1.00
LBLS
age 1361 68.97 13.35 30.00 97.00
gender 1361 0.54 0.50 0.00 1.00
edu 1361 14.39 2.80 4.00 23.00
neur 1361 1.80 0.42 0.46 3.40 0.89
con 1361 2.16 0.33 0.88 3.50 0.81
extra 1361 2.24 0.32 1.31 3.38 0.79
agree 1361 1.63 0.37 0.00 2.88 0.86
open 1361 2.45 0.34 1.21 3.65 0.84
smoker 1341 0.09 0.29 0.00 1.00
drinker 1143 0.64 0.48 0.00 1.00
active 895 0.45 0.50 0.00 1.00
MAP
age 982 79.85 7.14 56.14 99.81
gender 982 0.76 0.43 0.00 1.00
edu 982 15.23 2.98 4.00 28.00
neur 982 2.22 0.57 1.00 4.75 0.85
con 982 4.80 0.49 2.50 6.00 0.83
extra 653 3.66 0.52 1.83 5.00 0.69
smoker 974 0.01 0.10 0.00 1.00
drinker 982 0.01 0.10 0.00 1.00
active 981 0.87 0.33 0.00 1.00
MAS
age 879 78.71 4.78 70.29 90.80
gender 879 0.46 0.50 0.00 1.00
edu 879 11.74 3.55 3.00 24.00
neur 879 3.32 0.79 0.92 5.00 0.89
con 879 2.96 0.51 1.50 4.75 0.78
open 879 2.82 0.62 1.00 4.42 0.75
smoker 874 0.04 0.19 0.00 1.00
drinker 879 0.88 0.32 0.00 1.00
active 879 0.84 0.37 0.00 1.00
MIDUS
age 4009 56.19 12.37 30.00 84.00
gender 4009 0.55 0.50 0.00 1.00
edu 4009 7.26 2.54 1.00 12.00
neur 4009 2.07 0.63 1.00 4.00 0.74
con 4009 3.46 0.45 1.00 4.00 0.58
extra 4008 3.11 0.57 1.00 4.00 0.76
agree 4008 3.45 0.50 1.00 4.00 0.80
open 4007 2.90 0.54 1.00 4.00 0.77
smoker 4009 0.14 0.35 0.00 1.00
drinker 4009 0.04 0.19 0.00 1.00
active 3989 0.97 0.16 0.00 1.00
NAS
edu 899 2.02 1.12 1.00 5.00
age 899 64.45 7.38 47.00 85.00
gender 899 0.00 0.00 0.00 0.00
neur 899 4.34 0.91 1.40 6.90 0.85
con 899 6.83 0.93 2.65 9.00 0.91
extra 899 5.76 0.92 2.70 8.40 0.87
agree 899 6.81 0.90 3.10 9.00 0.90
open 899 6.11 0.91 2.75 8.79 0.88
drinker 683 0.21 0.41 0.00 1.00
active 756 0.58 0.49 0.00 1.00
smoker 899 0.59 0.49 0.00 1.00
OATS
age 536 71.40 5.54 65.08 90.06
gender 536 0.65 0.48 0.00 1.00
edu 536 11.30 3.43 2.00 26.00
neur 536 2.34 0.59 1.00 4.25 0.85
con 536 3.85 0.44 2.00 5.00 0.81
extra 536 3.32 0.48 1.75 4.73 0.77
agree 536 3.87 0.42 2.42 5.00 0.75
open 536 3.33 0.49 1.67 4.75 0.75
smoker 533 0.06 0.23 0.00 1.00
drinker 535 0.04 0.20 0.00 1.00
active 422 0.89 0.31 0.00 1.00
ROS
age 1394 75.95 7.47 55.78 102.15
gender 1394 0.71 0.45 0.00 1.00
edu 1394 18.35 3.30 3.00 30.00
neur 1394 2.39 0.49 1.00 4.00 0.80
con 1394 3.84 0.42 1.92 5.00 0.80
extra 1394 4.16 0.52 2.50 5.67 0.66
agree 1394 4.86 0.34 3.58 6.00 0.67
open 1394 4.39 0.45 2.50 5.92 0.69
smoker 1394 0.02 0.12 0.00 1.00
drinker 1394 0.00 0.05 0.00 1.00
active 1394 0.79 0.41 0.00 1.00
SLS
age 1194 66.45 14.33 29.00 101.00
gender 1194 0.55 0.50 0.00 1.00
edu 1194 15.72 2.58 8.00 20.00
neur 1194 1.60 0.45 0.42 3.12 0.93
con 1194 2.50 0.37 1.21 3.73 0.90
extra 1194 2.20 0.40 0.90 3.44 0.90
agree 1194 2.71 0.31 1.56 3.81 0.87
open 1194 2.41 0.41 0.98 3.67 0.91
smoker 1189 0.04 0.19 0.00 1.00
drinker 1194 0.09 0.29 0.00 1.00
active 1176 0.87 0.34 0.00 1.00
WLS
age 10723 53.77 4.52 33.00 75.00
gender 10723 0.54 0.50 0.00 1.00
edu 10723 13.74 2.40 0.00 21.00
neur 10723 3.21 0.98 1.00 6.00 0.77
con 10723 4.83 0.70 1.50 6.00 0.65
extra 10716 3.81 0.90 1.00 6.00 0.76
agree 10719 4.73 0.75 1.00 6.00 0.69
open 10706 3.62 0.80 1.00 6.00 0.60
smoker 10723 0.17 0.38 0.00 1.00
drinker 10206 0.55 0.50 0.00 1.00
active 10587 0.94 0.24 0.00 1.00

Code

The following packages were used to generate this table:

library(papaja)
library(tidyverse)
library(knitr)
library(kableExtra)
library(here)

The files needed for this table are available at osf.io/mzfu9 in the Individual Study Output folder.

First we load the individual study analysis objects.

study.names = c("BASEII", "EAS","ELSA", "HRS", "ILSE", "LBC",
                "LBLS", "MAP", "MAS","MIDUS","NAS", "OATS", 
                "ROS","SLS","WLS")

lapply(here(paste0("behavior/", study.names, "_behavior.Rdata")), load, .GlobalEnv)
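Because load() returns the names of the objects it restores, the loading step can be followed by a quick check that every expected object is present. The following is a minimal sketch of that idea, not the code used for this table: the study names, the temporary folder and the contents of the toy objects are all hypothetical stand-ins, and the objects are loaded into a separate environment rather than the global one.

```r
# Hypothetical stand-ins for the real study names and output folder
toy.names <- c("STUDYA", "STUDYB")
toy.dir <- tempfile("behavior_")
dir.create(toy.dir)

# Write one toy "<study>_behavior.Rdata" file per study
src <- new.env()
for (s in toy.names) {
  obj.name <- paste0(s, "_behavior")
  assign(obj.name, list(descriptives = data.frame(n = 10)), envir = src)
  save(list = obj.name,
       file = file.path(toy.dir, paste0(obj.name, ".Rdata")),
       envir = src)
}

# Load every file; load() returns the names of the restored objects
target <- new.env()
files <- file.path(toy.dir, paste0(toy.names, "_behavior.Rdata"))
loaded <- unlist(lapply(files, load, envir = target))

# Stop early if any expected object did not arrive
missing.objects <- setdiff(paste0(toy.names, "_behavior"), loaded)
if (length(missing.objects) > 0) {
  stop("Objects not loaded: ", paste(missing.objects, collapse = ", "))
}
```

With the real files, the same check would compare the loaded names against paste0(study.names, "_behavior").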

We extract the relevant statistics in a loop. (The first author is just learning how to use the purrr package, and so often resorts to loops when under a time constraint.)

First we extract the Cronbach’s alpha values from the data objects. These are stored in a data frame, with each reliability coefficient from each personality measure in each study comprising a single row.

alpha.list <- data.frame()
for(i in study.names){
  x = get(paste0(i,"_behavior")) # get output object
  # skip studies whose output object has no stored reliability estimates
  if(!is.null(x$alpha)){
    y = as.data.frame(unlist(x$alpha)) # one row per coefficient
    y$study = i
    alpha.list = rbind(alpha.list, y) # stack across studies
  }
}

The reliability coefficients were extracted in long form. This code adds a “variable” variable and then spreads the data frame, so each trait within each study has a single row.

alpha.list <- alpha.list %>%
  mutate(var = rownames(.),
         var = gsub("[0-9]", "", var)) %>%
  separate(var, into = c("statistic", "var")) %>%
  spread(key = statistic, value = `unlist(x$alpha)`)
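The gsub("[0-9]", "", var) step is there because rbind() keeps data frame row names unique by appending digits when the same name recurs across studies. The sketch below illustrates this in base R with made-up values. The row-name pattern "alpha.neur" is an assumption inferred from the separate() call, and strsplit() is used here only as a stand-in for separate().

```r
# Two toy studies with identical (assumed) row names and made-up values
a <- data.frame(value = c(0.70, 0.60), row.names = c("alpha.neur", "alpha.con"))
b <- data.frame(value = c(0.75, 0.65), row.names = c("alpha.neur", "alpha.con"))
a$study <- "STUDYA"
b$study <- "STUDYB"

# Stacking forces the repeated row names to be made unique with digits
stacked <- rbind(a, b)
rownames(stacked)

# Strip the digits, then split "statistic.variable" into its two parts
cleaned <- gsub("[0-9]", "", rownames(stacked))
pieces <- strsplit(cleaned, ".", fixed = TRUE)
stacked$statistic <- vapply(pieces, `[`, character(1), 1)
stacked$var <- vapply(pieces, `[`, character(1), 2)
stacked
```

One consequence of this approach is that any digits belonging to a variable name itself would also be removed, which does not matter for the trait names used here.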

Next we extract and wrangle the descriptive statistics (a data frame created using the describe() function in the psych package).

describe.df = lapply(X = study.names, FUN = function(x) get(paste0(x,"_behavior"))$descriptives) %>%
  map2_df(., study.names, ~ mutate(.x, study = .y, var = rownames(.x))) %>%
  # join the descriptives to the data set containing the reliability statistics
  full_join(alpha.list) %>%
  # select the columns we want to include in the table
  dplyr::select(study, var, n, mean, sd, min, max, alpha) %>%
  # select rows representing variables used in analyses
  filter(var %in% c("age", "gender", "edu", 
                    "neur", "con", "extra", "agree", "open",
                    "smoker", "drinker", "active"))

We identify the rows corresponding to each data set. We then select the minimum and maximum rows as starting and end points for grouping.

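# simplify = FALSE keeps the result a named list even if every study has the same number of rows
rows = sapply(study.names, function(x) which(describe.df$study == x), simplify = FALSE)
rows = lapply(rows, function(x) c(min(x), max(x)))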
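To make the shape of the resulting object concrete, here is a small base R sketch on a toy data frame. The study names and variables are hypothetical. It also passes simplify = FALSE to sapply(), which is an addition in this sketch: without it, sapply() would return a matrix whenever all studies happen to have the same number of rows.

```r
# Toy stand-in for describe.df: rows are already ordered by study
toy.df <- data.frame(
  study = c("STUDYA", "STUDYA", "STUDYA", "STUDYB", "STUDYB"),
  var   = c("age", "neur", "smoker", "age", "neur"),
  stringsAsFactors = FALSE
)
toy.names <- c("STUDYA", "STUDYB")

# Row indices belonging to each study, kept as a named list
toy.rows <- sapply(toy.names, function(x) which(toy.df$study == x),
                   simplify = FALSE)

# Keep only the first and last row index of each study
toy.rows <- lapply(toy.rows, function(x) c(min(x), max(x)))

toy.rows$STUDYA  # 1 3
toy.rows$STUDYB  # 4 5
```

Each element holds the start and end row that a grouping header for that study should span.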

The LBC dataset incorrectly treated the smoking variable as a factor, and so did not compute the descriptive statistics appropriately. Given the numbers we have at baseline (966 non- and former smokers and 125 current smokers), we can calculate appropriate statistics by hand. These values are estimated before removing participants based on the inclusion/exclusion criteria, but they are our best guess until we can update the data file from this dataset.
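The hand calculation can be sketched as follows, assuming current smokers are coded 1 and everyone else 0, and using the same n - 1 denominator as sd(). The variable names are illustrative, and this is a reconstruction of the arithmetic from the counts above rather than the original code.

```r
# Counts reported in the text (LBC baseline, before exclusions)
n.nonsmokers <- 966   # non- and former smokers, assumed coded 0
n.smokers    <- 125   # current smokers, assumed coded 1
n.total      <- n.nonsmokers + n.smokers

# For a 0/1 variable the mean is the proportion coded 1
smoker.mean <- n.smokers / n.total

# Sample SD of a 0/1 variable from its proportion (n - 1 denominator)
smoker.sd <- sqrt(smoker.mean * (1 - smoker.mean) * n.total / (n.total - 1))

round(c(mean = smoker.mean, sd = smoker.sd), 2)
```

Under these counts the mean is about 0.11 and the standard deviation about 0.32. The quantity p(1 - p) on its own is about 0.10, so a value of 0.10 in the SD column for this row would correspond to the variance rather than the standard deviation.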

Finally, we pipe the data frame into the kable() function and apply additional formatting through the kableExtra package. We remove the study column, as this becomes redundant with the grouping headers.

tab = describe.df %>%
  dplyr::select(-study) %>%
  kable(.,  booktabs = TRUE, escape = FALSE, digits = 2, format = "html", 
        col.names = c("Variable", "N Valid", "Mean", "SD", "Min", "Max", "$\\alpha$")) %>%
  kable_styling(full_width = TRUE, latex_options = c("repeat_header"))

# add one grouping header per study, spanning that study's first to last row
for(i in seq_along(rows)){
  tab = group_rows(tab, names(rows)[i], rows[[i]][1], rows[[i]][2])
}

tab