Search for a beer using the search bar below. Start typing to return matches in real time, then click on the desired entry and 10 beers will be recommended to you based on the results of a K-Means clustering algorithm.
			This web app is the result of the final project for the data analytics course. Initially, a dataset from Kaggle 
            (Beer Information - Tasting Profiles)
            was used for the model; however, after initial EDA (including a preliminary K-Means clustering algorithm), 
            we determined that additional information could help with model 
            accuracy. The Kaggle dataset initially contained approximately 5,500 beers scraped from 
            BeerAdvocate. The scraped entries contained counts of key words from 25 
            reviews that were grouped into 11 taste profiles (e.g., the "fruity" taste profile contained key words such as 
            "berries", "fruit", "juice", and "tropical") to provide scores for each beer. The flavor profiles are: fruity, 
            hoppy, spices, malty, bitter, sweet, sour, salty, astringent, body, and alcoholic. 
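
            The key word counting described above can be sketched roughly as follows. This is only an illustration, not the original scraper: the `profile_scores` function, the `PROFILE_KEYWORDS` dictionary, the "hoppy" word list, and the sample reviews are all assumptions made for the example. Only the "fruity" key words come from the text.

```python
import re
from collections import Counter

# Hypothetical key word lists. Only the "fruity" words are taken from the
# write-up; the "hoppy" list is assumed purely for illustration.
PROFILE_KEYWORDS = {
    "fruity": ["berries", "fruit", "juice", "tropical"],
    "hoppy": ["hops", "hoppy"],
}


def profile_scores(reviews, keywords=PROFILE_KEYWORDS):
    """Count how often each profile's key words appear across the reviews."""
    words = Counter()
    for review in reviews:
        # Lowercase and split on non-letters so "juice-like" yields "juice".
        words.update(re.findall(r"[a-z]+", review.lower()))
    return {
        profile: sum(words[w] for w in kws)
        for profile, kws in keywords.items()
    }


# Made-up sample reviews (not from the dataset).
reviews = [
    "Tropical fruit aroma with a juice-like finish.",
    "Very hoppy, some fruit on the nose.",
]
scores = profile_scores(reviews)
print(scores)
```

            Summing raw counts means that beers with more reviews can accumulate larger scores, which is one reason scaling the features before clustering matters.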
			
            
            After we contacted the Kaggle dataset uploader, they provided us with the key words used to calculate each profile 
            (and uploaded them to the Kaggle page linked above). We then wrote a web scraping script to collect data for the 
            top-ranking beers from each substyle
            (e.g., "Stouts" contain the substyles "American" and "Irish Dry") that had at least
            75 reviews. We determined that more reviews would help differentiate the flavor profiles better for each beer and reduce 
            clustering overlap.
            
            
            After obtaining the newly scraped data, we re-ran the K-Means clustering algorithm and received better silhouette scores 
            for each cluster, along with recommended beers that felt more similar to the input beer. The dataset was tested using both 
            the min/max scaler and the standard scaler. Both returned similar silhouette scores; however, the min/max scaler returned a 
            slightly higher score, so it was chosen for the final model. The data were then clustered into 3 main clusters (classes), and 
            each main class was further clustered into subclusters (subclasses), with the main clusters containing 7, 8, and 2 
            subclasses, respectively. Figure 1 (interactive) displays the number of beers in each class and subclass. Figure 2 (also interactive) 
            presents the distribution of beer ratings. 
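
            The scaler comparison and two-level clustering described above might look something like the sketch below, assuming scikit-learn was used (the text does not name the library). The data here are synthetic stand-ins for the 11 flavor-profile columns, and the `score_scaler` helper, the random seeds, and the assignment of 7, 8, and 2 subclusters to particular class labels are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for the 11 flavor-profile columns (not the project's data):
# three well-separated groups of 60 "beers" each.
rng = np.random.default_rng(0)
centers = rng.uniform(0, 100, size=(3, 11))
X = np.vstack([c + rng.normal(0, 5, size=(60, 11)) for c in centers])


def score_scaler(scaler, X, n_clusters=3):
    """Scale X, fit K-Means, and return (labels, silhouette score)."""
    X_scaled = scaler.fit_transform(X)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(X_scaled)
    return labels, silhouette_score(X_scaled, labels)


# Compare the two scalers on the same data.
results = {
    "min/max": score_scaler(MinMaxScaler(), X),
    "standard": score_scaler(StandardScaler(), X),
}
for name, (_, score) in results.items():
    print(f"{name}: silhouette = {score:.3f}")

# Second level: cluster each main class into subclasses on min/max-scaled data.
# Which class label receives 7, 8, or 2 subclusters is arbitrary in this sketch.
X_scaled = MinMaxScaler().fit_transform(X)
main_labels = results["min/max"][0]
sub_labels = np.zeros_like(main_labels)
for cls, k in zip(range(3), (7, 8, 2)):
    mask = main_labels == cls
    sub_model = KMeans(n_clusters=k, n_init=10, random_state=0)
    sub_labels[mask] = sub_model.fit_predict(X_scaled[mask])
```

            Each beer then carries a (class, subclass) pair, which is the grouping the sunburst chart in Figure 1 summarizes.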
        
Figure 1: Sunburst chart displaying the number of beers in each class and subclass.
Figure 2: Histogram displaying distribution of beer ratings.
            After modeling, we created a script that allowed a user to input a beer (contained within the dataset) and 
            returned 10 recommended beers. The recommended beers will be within the same class and subclass as the input beer. 
            The flavor profile scores for the input beer are compared to those of the other beers within the subclass, and the beers with the 
            smallest difference in scores are returned. Following that proof-of-concept, the code was refactored to operate 
            within the Django framework. This allowed for easier use that did not require downloading code, and allowed for a 
            "live-search" function so the user could, in real time, search for beers within the dataset. 
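
            A minimal sketch of the recommendation step is shown below. It assumes that the "difference in scores" is measured as the Euclidean distance between flavor-profile vectors, which the text does not specify. The `recommend` function, the `BEERS` rows, and the three-value score vectors are hypothetical and exist only to make the example runnable.

```python
import math

# Hypothetical rows: (name, class, subclass, scaled flavor-profile scores).
# Real rows would carry all 11 profile scores; three are used here for brevity.
BEERS = [
    ("Alpha IPA", 0, 1, [0.9, 0.2, 0.7]),
    ("Bravo IPA", 0, 1, [0.8, 0.3, 0.6]),
    ("Charlie Pale", 0, 1, [0.5, 0.5, 0.5]),
    ("Delta Stout", 1, 0, [0.1, 0.9, 0.2]),
]


def recommend(name, beers=BEERS, n=10):
    """Return up to n beers from the input beer's class and subclass,
    ordered by Euclidean distance between flavor-profile scores
    (the distance measure is an assumption of this sketch)."""
    by_name = {beer[0]: beer for beer in beers}
    _, cls, sub, target = by_name[name]
    candidates = [
        beer for beer in beers
        if beer[0] != name and beer[1] == cls and beer[2] == sub
    ]
    candidates.sort(key=lambda beer: math.dist(target, beer[3]))
    return [beer[0] for beer in candidates[:n]]


print(recommend("Alpha IPA"))
```

            Restricting candidates to the same subclass keeps the comparison small, but a subclass with fewer than 11 beers would return fewer than 10 recommendations in this sketch.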
            
            
            A Tableau storyboard displaying various metrics from the data exploration and model creation can be found at the 
            following link: 
            Tableau Storyboard
            
            All of the code for the project is located on the following GitHub repository: 
             GitHub Repository
        
This project was a joint venture - links for the project members are listed below: