Food Industry Trend Tracker

Welcome

This app combines regular expressions, Latent Dirichlet Allocation, and time series visualization into an automated workflow that lets media, marketing, and business analysts track trends in unstructured text from multiple sources.

Word Trend Options

Instructions

To search for word trends, first select a data set from the drop-down menu. In the 'Keywords' box, type the words of interest, separated by commas, and click 'Confirm Words'. The time series graph will then display the frequency of each word.

Topic Trend Options

Instructions

To display topic trends, select one or more items from the 'Topics' drop-down list. To see how the relative proportions of the topics have changed across time, check the 'Analyze Topic Proportions' box and the graph will update accordingly. The table below lists the top 10 words associated with each topic.

Video Tutorial


Data

We've collected 3,165 product recalls issued by the Food and Drug Administration (FDA) from January 2012 until August 2016, reflecting a subset of the regulatory actions taken by the FDA during that time. The app also includes a database of 7,260 articles published by the food and beverage industry digital magazine foodbusinessnews.net.

Methodology

To create the app, all data was first read into R in plain text format, then preprocessed to remove numbers, convert all letters to lower case, and omit stop words. At this point both the articles and the recall descriptions were ready for searching. To categorize the articles into easily understandable topics, Latent Dirichlet Allocation was performed using the 'lda' library in R. Lastly, plot.ly was used to produce interactive time series plots of word frequency across time.
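
As a rough illustration of the preprocessing and word-frequency steps described above, here is a minimal sketch in Python. It is not the app's code, which was written in R. The stop-word list, the function names (preprocess, keyword_counts_by_month), the monthly grouping, and the sample records are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the app's actual list is not given).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "was", "due"}


def preprocess(text):
    """Lowercase the text, remove numbers, split into words, drop stop words."""
    text = re.sub(r"\d+", " ", text.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def keyword_counts_by_month(docs, keywords):
    """Count keyword occurrences per month.

    docs: iterable of (date string "YYYY-MM-DD", text) pairs.
    Returns a dict mapping "YYYY-MM" to a Counter of keyword counts.
    """
    wanted = {k.strip().lower() for k in keywords}
    series = {}
    for date, text in docs:
        month = date[:7]  # group by year and month (hypothetical granularity)
        counts = series.setdefault(month, Counter())
        for token in preprocess(text):
            if token in wanted:
                counts[token] += 1
    return series


# Made-up sample records, for illustration only.
sample_docs = [
    ("2015-03-02", "Recall of 12 peanut butter jars due to Salmonella"),
    ("2015-03-19", "Undeclared peanut allergen in cookies"),
    ("2015-04-07", "Salmonella found in 3 lots of spinach"),
]

trend = keyword_counts_by_month(sample_docs, ["peanut", "salmonella"])
for month in sorted(trend):
    print(month, dict(trend[month]))
```

Each month's counts could then be passed to a plotting library to draw one line per keyword, which is the role the text assigns to plot.ly.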

Use Case

With a workflow centered on R's NLP capabilities, RShiny's web framework, and plot.ly's interactive graphics, an analyst who wants to better understand their industry but is short on time can create a powerful social listening tool that synthesizes large amounts of data into useful graphics, helping them form a high-level understanding of the business landscape.