Finding Keywords
Keywords make the Promising Practices database easier to navigate. They narrow searches down to the specific Promising Practices that are deemed necessary for the Bloomington-Normal community.
Once all of our data was moved to Voyant, the keyword selection process began. To demonstrate this process, we will walk you through our work with the McLean County CHIP Report. Here is the word cloud based on that document:
Many of the terms that immediately jump out (McLean, health, community, etc.) are not useful for our purposes, because we already know the area of interest and the population we are looking into. To weed out the words that were not useful to us, we used the stop words function. To do so, we first selected the icon in the top right corner of the word cloud, which opened an options list. Next we chose the “stop words” option and then selected “edit words.”
Voyant has a preset list of stop words to exclude from the word cloud display, such as years, articles, and pronouns. By adding certain words to the list, you can exclude the words that are not helpful for your keyword list.
Below are some examples of the stop words we selected and why:
- "baseline", "outcome", "increase", "respondents" and similar words were removed because they related directly to the methods behind the surveys on which the McLean County CHIP Report was based.
- "OSF", "Oz", "Joseph", and related words were removed because they related to specific community partners. While these partners are helpful in addressing a health problem once it's been identified, they obscure our ability to identify the problems that need solving.
- "disorders", "social", "triage", "treatment", and similar words were harder to weed out, because they do relate to health. Ultimately, we had to consider whether each word could actually act as a helpful keyword, not just whether it already had a connection to health.
After using this method over an extended period of time, we had applied 135 stop words. This allowed us to pull 16 potential keywords out of the 1,177 unique words in the original document, which contained the data pulled from the behavioral health section of the CHIP report.
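The stop-word filtering described above can also be sketched in code. This is only an illustration of the idea, not how Voyant works internally: the function name `term_frequencies`, the sample sentence, and the short stop-word list are all made up for the example, and the sample text is not taken from the CHIP report.

```python
import re
from collections import Counter


def term_frequencies(text, stop_words):
    """Count lowercase word tokens, skipping anything in stop_words."""
    # Simple tokenizer (an assumption): runs of letters, with an optional
    # apostrophe part such as "it's".
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    blocked = {word.lower() for word in stop_words}
    return Counter(token for token in tokens if token not in blocked)


# Invented sample text, used only to show the effect of the filter.
sample = (
    "Respondents reported anxiety and depression. "
    "Baseline anxiety increased; OSF respondents noted depression and anxiety."
)

# Hypothetical stop words: survey-method terms, a partner name, and filler.
custom_stop_words = {
    "respondents", "baseline", "osf", "and", "reported", "noted", "increased",
}

counts = term_frequencies(sample, custom_stop_words)
print(counts.most_common(2))  # [('anxiety', 3), ('depression', 2)]
```

With the method and partner terms filtered out, only the words that describe an actual health concern remain to be considered as keywords.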
This process was repeated with the document of data from the Bloomington-Normal Always Unstoppable Instagram page, which resulted in 9 potential keywords. Finally, we repeated the process one last time with the document of data from the Chestnut Family Health Facebook group, which resulted in 9 more potential keywords.
Together, the three sources (16 + 9 + 9) gave us an initial draft keyword list of 34 words, 5 of which related to specific demographics (e.g. youth, teens). To make the best use of our time and keep our results specific, we immediately determined that our list needed to be narrowed.
We began by cutting words based on their relevance to our specific subgoal. For example, words like "access", "barriers", and "counseling", while all important to behavioral health overall, did not have a direct connection to the subgoal we selected. We also removed words that we felt were too broad to deliver specific results. Examples include "stress", "crisis", and "drinking".
After these removals we were down to the following 25 words:
That was still too many. Our next step for chipping away at the list was to see how helpful each word was in producing results in the Promising Practices database. We typed in each keyword to see how many results it returned and removed those that did not produce any. This helped us eliminate 8 additional words, bringing our total down to 17.
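If the result counts are written down by hand while searching, the zero-result check can be expressed as a small filter. The sketch below assumes nothing about the database itself; the function name `drop_empty_keywords` is hypothetical, and the two zero-count entries are placeholders rather than words from our list.

```python
def drop_empty_keywords(result_counts):
    """Keep only the keywords whose search returned at least one result."""
    return [keyword for keyword, count in result_counts.items() if count > 0]


# Counts recorded manually from the search page. The two placeholder
# entries are invented to show what gets removed.
result_counts = {
    "substance abuse": 16,
    "substance use": 9,
    "placeholder term a": 0,
    "placeholder term b": 0,
}

kept = drop_empty_keywords(result_counts)
print(kept)  # ['substance abuse', 'substance use']
```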
We recognized that a lot of our words were similar to each other, so we next compared the search output for each keyword to see if there was any overlap. For example, when we read through the 9 search results for "substance use", all of them appeared within the list of 16 search results for "substance abuse". As a result, we were able to cut "substance use" from our list. Other words that were eliminated during this process included "stigma", "marijuana", and "opioid(s)".
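The overlap comparison amounts to a subset check: a keyword is redundant when every one of its search results also appears under another keyword. Here is a minimal sketch of that check, assuming each result can be given some identifier. The function name `redundant_keywords`, the numeric IDs, and the third entry are illustrative only; just the 9-within-16 relationship comes from our actual searches.

```python
def redundant_keywords(results):
    """Map each keyword to another keyword whose results fully contain its own."""
    covered = {}
    for keyword, hits in results.items():
        for other, other_hits in results.items():
            # "<" on sets is a proper-subset test, so two keywords with
            # identical results are not flagged against each other.
            if keyword != other and hits and hits < other_hits:
                covered[keyword] = other
                break
    return covered


# Placeholder result IDs standing in for database entries.
results = {
    "substance abuse": set(range(1, 17)),  # 16 results
    "substance use": set(range(1, 10)),    # 9 results, all among the 16
    "hypothetical term": {3, 40},          # overlaps only partially
}

print(redundant_keywords(results))  # {'substance use': 'substance abuse'}
```

A keyword that only partially overlaps with another, like the hypothetical third entry, is kept, because removing it would lose results that no other keyword finds.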