Problem raised by @dyb:
At the moment we pass the full data frame to filter, which creates another copy in memory. The full data frame is returned, and only then do we drop columns from it. Dropping the unneeded columns first should be more efficient.
Lines 572-574 in new_merge_ids_year.R
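
A minimal sketch of the reordering, assuming the code uses dplyr's `filter()` and `select()`. The data frame `df` and its columns (`id`, `year`, `value`, `notes`) are hypothetical stand-ins, since the actual code at those lines is not shown here:

```r
library(dplyr)

# Hypothetical data; the real columns in new_merge_ids_year.R are not shown in this issue.
df <- data.frame(
  id    = 1:6,
  year  = c(2019, 2020, 2020, 2021, 2021, 2021),
  value = c(10, 20, 30, 40, 50, 60),
  notes = letters[1:6]
)

# Current pattern: filter the full data frame, then drop columns.
filter_then_select <- df %>%
  filter(year == 2021) %>%
  select(id, year)

# Suggested pattern: keep only the needed columns first, then filter.
select_then_filter <- df %>%
  select(id, year) %>%
  filter(year == 2021)
```

Both orders should return the same rows as long as every column used in the filter condition is among the columns kept by `select()`. Whether the reordering gives a measurable saving on the real data would need to be checked by profiling.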