STAT 19000: Project 14 — Fall 2020
Motivation: Functions are the building blocks of more complex programming. It’s vital that you understand how to read and write functions. In this project we will incrementally build and improve upon a function designed to recommend a beer. Note that you will not be winning any awards for this recommendation system; it is just for fun!
Context: One of the main focuses throughout the semester has been on functions, and for good reason. In this project we will continue to exercise our R skills and build up our recommender function.
Scope: R, functions
Questions
Question 1
Read /class/datamine/data/beer/beers.csv into a data.frame named beers. Read /class/datamine/data/beer/breweries.csv into a data.frame named breweries. Read /class/datamine/data/beer/reviews.csv into a data.frame named reviews. As in the previous project, make sure you use the fread function from the data.table package, and convert the data.table to a data.frame. We want to create a very basic beer recommender. We will start simple. Create a function called recommend_a_beer that takes as input my_beer_id (a single value) and returns a vector of beer_ids from the same style. Test your function on 2093.
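As a minimal sketch of the read-and-convert step, the example below writes a tiny stand-in CSV to a temporary file and reads it with fread. The file contents, the column names (id, style, abv), and the name beers_demo are hypothetical; only fread and as.data.frame come from the instructions above.

```r
library(data.table)

# Hypothetical stand-in for one of the course CSV files, written to a
# temporary location so the example runs without the course data.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("id,style,abv",
             "1,Stout,8.0",
             "2,Stout,9.0",
             "3,Lager,4.5"), demo_path)

# fread() returns a data.table; as.data.frame() converts it to a plain data.frame.
beers_demo <- as.data.frame(fread(demo_path))

is.data.frame(beers_demo)
is.data.table(beers_demo)
```

For the project itself, the same pattern would be applied to each of the three file paths given above.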
Make sure you do not include the given |
You may find the function |
You will not win any awards for this recommendation system!
# setdiff(a, b) returns the unique elements of a that are not in b.
x <- c('a','b','b','c')
y <- c('c','b','d','e','f')
setdiff(x,y)  # "a"
setdiff(y,x)  # "d" "e" "f"
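One possible shape for the style-based recommender, sketched on toy data rather than the course files. The column names id and style are assumptions about beers.csv, and the extra beers argument (defaulting to the hypothetical beers_toy) is added only so the example is self-contained.

```r
# Toy data; the column names (id, style) are assumptions about beers.csv.
beers_toy <- data.frame(
  id    = c(1, 2, 3, 4),
  style = c("Stout", "Stout", "Lager", "Stout"),
  stringsAsFactors = FALSE
)

recommend_a_beer <- function(my_beer_id, beers = beers_toy) {
  # Style of the beer we were given.
  my_style <- beers$style[beers$id == my_beer_id]
  # All beer ids sharing that style (this still includes my_beer_id).
  same_style <- beers$id[beers$style %in% my_style]
  # Drop the given beer from its own recommendations.
  setdiff(same_style, my_beer_id)
}

recommend_a_beer(1)
1 %in% recommend_a_beer(1)
```

On this toy data, beer 1 is a Stout, so the sketch returns the other Stout ids (2 and 4), and the membership check is FALSE.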
- R code used to solve the problem.
- Length of result from recommend_a_beer(2093).
- The result of 2093 %in% recommend_a_beer(2093).
Question 2
That is a lot of beer recommendations! Let’s try to narrow it down. Include an argument in your function called min_score with a default value of 4.5. Our recommender will only recommend beer_ids with a review score of at least min_score. Test your improved beer recommender with the same beer_id from question (1).
Note that now we need to look at both |
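A hedged sketch of the min_score filter on toy data. It assumes reviews has beer_id and score columns, and it reads "a review score of at least min_score" as "at least one review with that score or higher"; filtering on a beer's average score would be another reasonable reading. The beers and reviews arguments and the toy data frames are hypothetical.

```r
# Toy data; column names are assumptions about beers.csv and reviews.csv.
beers_toy <- data.frame(
  id    = c(1, 2, 3, 4),
  style = c("Stout", "Stout", "Lager", "Stout"),
  stringsAsFactors = FALSE
)
reviews_toy <- data.frame(
  beer_id = c(1, 2, 2, 3, 4),
  score   = c(4.8, 4.6, 3.9, 4.9, 4.0)
)

recommend_a_beer <- function(my_beer_id, min_score = 4.5,
                             beers = beers_toy, reviews = reviews_toy) {
  my_style   <- beers$style[beers$id == my_beer_id]
  same_style <- beers$id[beers$style %in% my_style]
  # Assumed reading: a beer qualifies if any single review reaches min_score.
  well_reviewed <- unique(reviews$beer_id[reviews$score >= min_score])
  setdiff(intersect(same_style, well_reviewed), my_beer_id)
}

recommend_a_beer(1)                   # default min_score of 4.5
recommend_a_beer(1, min_score = 4.0)  # a looser threshold admits more beers
```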
- R code used to solve the problem.
- Length of result from recommend_a_beer(2093).
Question 3
There is still room for improvement (obviously) for our beer recommender. Include a new argument in your function called same_brewery_only with default value FALSE. This argument will determine whether or not our beer recommender will return only beers from the same brewery. Test our newly improved beer recommender with the same beer_id from question (1) with the argument same_brewery_only set to TRUE.
You may find the function
|
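A sketch of how a logical flag such as same_brewery_only might gate an extra filter. It assumes beers carries a brewery_id column; that column name, the toy data, and the beers and reviews arguments are hypothetical.

```r
# Toy data; brewery_id is an assumed column name.
beers_toy <- data.frame(
  id         = c(1, 2, 3, 4),
  style      = c("Stout", "Stout", "Lager", "Stout"),
  brewery_id = c(10, 10, 10, 20),
  stringsAsFactors = FALSE
)
reviews_toy <- data.frame(
  beer_id = c(1, 2, 3, 4),
  score   = c(4.8, 4.6, 4.9, 4.7)
)

recommend_a_beer <- function(my_beer_id, min_score = 4.5,
                             same_brewery_only = FALSE,
                             beers = beers_toy, reviews = reviews_toy) {
  me <- beers[beers$id == my_beer_id, ]
  candidates <- beers[beers$style %in% me$style, ]
  # Only narrow to the same brewery when the caller asks for it.
  if (same_brewery_only) {
    candidates <- candidates[candidates$brewery_id %in% me$brewery_id, ]
  }
  well_reviewed <- unique(reviews$beer_id[reviews$score >= min_score])
  setdiff(intersect(candidates$id, well_reviewed), my_beer_id)
}

recommend_a_beer(1)
recommend_a_beer(1, same_brewery_only = TRUE)
```

On the toy data the flag shrinks the result from beers 2 and 4 to beer 2 alone, since beer 4 comes from a different brewery.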
- R code used to solve the problem.
- Length of result from recommend_a_beer(2093, same_brewery_only=TRUE).
Question 4
Oops! Bad idea! Maybe including only beers from the same brewery is not the best idea. Add an argument to our beer recommender named type. If type="style", our recommender will recommend beers based on the style, as we did in question (3). If type="reviewers", our recommender will recommend beers based on reviewers with "similar taste". Select reviewers who gave a score equal to or greater than min_score for the given beer id (my_beer_id). For those reviewers, find the beer_ids for other beers that these reviewers have given a score of at least min_score. These beer_ids are the ones our recommender will return. Be sure to test our improved recommender on the same beer_id as in (1)-(3).
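The two modes described above can be sketched as follows. This is one possible reading on toy data, not a reference solution: the username column is an assumed name for the reviewer identifier, the beers and reviews arguments are hypothetical, same_brewery_only is left out to keep the example short, and match.arg is just one way to validate type.

```r
# Toy data; username is an assumed column name for the reviewer.
beers_toy <- data.frame(
  id    = c(1, 2, 3, 4),
  style = c("Stout", "Stout", "Lager", "Stout"),
  stringsAsFactors = FALSE
)
reviews_toy <- data.frame(
  beer_id  = c(1, 1, 2, 3, 3, 4),
  username = c("ann", "bob", "bob", "ann", "cat", "bob"),
  score    = c(4.8, 4.9, 3.0, 4.7, 5.0, 4.6),
  stringsAsFactors = FALSE
)

recommend_a_beer <- function(my_beer_id, min_score = 4.5,
                             type = c("style", "reviewers"),
                             beers = beers_toy, reviews = reviews_toy) {
  type  <- match.arg(type)  # errors on anything other than the two options
  liked <- reviews[reviews$score >= min_score, ]
  if (type == "style") {
    my_style   <- beers$style[beers$id == my_beer_id]
    candidates <- intersect(beers$id[beers$style %in% my_style], liked$beer_id)
  } else {
    # Reviewers who scored the given beer at least min_score ...
    fans <- unique(liked$username[liked$beer_id == my_beer_id])
    # ... and every beer those reviewers also scored at least min_score.
    candidates <- unique(liked$beer_id[liked$username %in% fans])
  }
  setdiff(candidates, my_beer_id)
}

recommend_a_beer(1, type = "style")
recommend_a_beer(1, type = "reviewers")
```

On the toy data, ann and bob both rated beer 1 highly, so the reviewers mode returns the other beers they rated highly (3 and 4), while the style mode returns only beer 4.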
- R code used to solve the problem.
- Length of result from recommend_a_beer(2093, type="reviewers").
Question 5
Let’s try to narrow down the recommendations. Include an argument called abv_range that indicates the ABV range we would like the recommended beers to fall within. Set the default value of abv_range to NULL so that if a user does not specify the abv_range, our recommender does not consider it. Test our recommender for beer_id 2093, with abv_range = c(8.9,9.1) and min_score=4.9.
You may find the function |
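A small sketch of the NULL-default pattern for an optional filter. The helper filter_by_abv is hypothetical (in the project this logic would live inside recommend_a_beer), the abv column name is an assumption, and dropping beers with a missing ABV when a range is given is a design choice made here, not something the project specifies.

```r
# Toy data; abv is an assumed column name.
beers_toy <- data.frame(
  id  = c(1, 2, 3, 4),
  abv = c(9.0, 8.9, 12.0, NA)
)

# Hypothetical helper: keep only the candidate ids whose ABV is in range.
filter_by_abv <- function(beer_ids, abv_range = NULL, beers = beers_toy) {
  # NULL means the caller did not ask for an ABV filter, so do nothing.
  if (is.null(abv_range)) {
    return(beer_ids)
  }
  in_range <- !is.na(beers$abv) &
    beers$abv >= min(abv_range) &
    beers$abv <= max(abv_range)
  intersect(beer_ids, beers$id[in_range])
}

filter_by_abv(c(2, 3, 4))                          # no range: ids unchanged
filter_by_abv(c(2, 3, 4), abv_range = c(8.9, 9.1)) # only beer 2 is in range
```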
- R code used to solve the problem.
- Length of result from recommend_a_beer(2093, abv_range=c(8.9, 9.1), type="reviewers", min_score=4.9).
Question 6
Play with our recommend_a_beer function. Add another feature to it. Some ideas are: putting a limit on the number of beer_ids we will return, error catching (what if we don’t have reviews for a given beer_id?), including a plot in the output, returning beer names instead of ids, or new arguments to decide what beer_ids to recommend. Be creative and have fun!
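Two of the ideas above, a cap on the number of results and an error when a beer has no reviews, might look roughly like this. The helper name limit_recommendations, the max_n argument, and the toy reviews are all hypothetical; this is only one of many directions the question allows.

```r
# Toy data; beer_id is an assumed column name.
reviews_toy <- data.frame(
  beer_id = c(1, 2, 2, 3),
  score   = c(4.8, 4.6, 4.9, 4.7)
)

# Hypothetical helper: validate the input beer, then cap the result size.
limit_recommendations <- function(my_beer_id, candidates, max_n = 5,
                                  reviews = reviews_toy) {
  if (!(my_beer_id %in% reviews$beer_id)) {
    stop("No reviews found for beer_id ", my_beer_id)
  }
  head(candidates, max_n)
}

limit_recommendations(1, candidates = c(2, 3, 4), max_n = 2)

# tryCatch() lets the caller recover from the error instead of halting.
result <- tryCatch(
  limit_recommendations(99, candidates = c(2, 3)),
  error = function(e) conditionMessage(e)
)
result
```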
- R code used to solve the problem.
- The result from running the improved recommend_a_beer function showcasing your improvements to it.
- 1-2 sentences commenting on what you decided to include and why.