Load the diamonds dataset from the ggplot2 package. Randomly select 2000 rows
from this dataset and call the result data. Use one half of data as the training
data and the other half as the testing data (using createDataPartition from the
caret package with p = 0.5).
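
One possible way to carry out these steps is sketched below. The seed value, the choice of price as the variable to stratify on, and the names train_idx, training and testing are assumptions made for illustration; the exercise itself does not specify them. Note that createDataPartition draws a stratified random sample, so the two halves are random subsets and the training set may contain slightly more than 1000 rows.

```r
library(ggplot2)  # provides the diamonds dataset
library(caret)    # provides createDataPartition

set.seed(123)  # arbitrary seed (assumption), used only for reproducibility

# Randomly select 2000 rows from diamonds
data <- diamonds[sample(nrow(diamonds), 2000), ]

# Stratified 50/50 split on price (assumed outcome variable).
# as.vector() turns the one-column index matrix into a plain integer vector,
# which is safe for subsetting a tibble.
train_idx <- as.vector(createDataPartition(data$price, p = 0.5, list = FALSE))

training <- data[train_idx, ]   # rows chosen by createDataPartition
testing  <- data[-train_idx, ]  # all remaining rows
```

Calling nrow(training) and nrow(testing) afterwards is a quick check that the two sets are roughly equal in size and together cover all 2000 rows.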