Following segmentation into different customer clusters, targeted marketing can save significant costs by communicating product information to the most relevant audience in the most effective way. Outside of corporate marketing, data from social media has also been mined and analyzed to understand public opinions around government development initiatives, monitor public response to disease outbreaks, and advance food safety risk communication. Ding and Zhang compared the volume of posts on social media platforms used by government agencies and citizens for risk communication during the swine flu epidemics to identify which platforms were most widely used, and performed a text analysis to understand the types and functions of their communications, allowing them to chronicle how publicly valuable information is effectively disseminated online among different groups and entities. Thus far, this approach has rarely been applied to understanding the adoption of improved agricultural practices. Ofori and El-Gayar present the sole example we were able to find of this particular application. Their research employed machine-learning algorithms for sentiment analysis and topic analysis of social media posts. They quantified emotions and categorized text into groupings that reflect prevalent drivers and challenges expressed in social media discourse pertaining to “Smart Agriculture” and “Precision Agriculture.”
They found that, according to social media discourse, the public perceived the most prevalent drivers of adoption for these farming approaches to be their potential to generate jobs in agricultural technology industries and the creation of supportive public policies, and the biggest barriers to adoption to be the cost and complexity of the technology. Their findings can help precision agriculture technology providers better address concerns regarding the innovations, as well as leverage the key drivers of adoption in their technology promotion efforts. The present research was conducted to address the gaps in understanding of how producers and consumers perceive the benefits and drawbacks of pastured poultry and integrated cropping, as well as to exploit emerging technologies in consumer analytics of social media data, which have thus far not been extensively used to support the adoption of agricultural practices and technologies. The farming system in question is not strictly defined by its own unique term; rather, it is a set of practices that different producers may combine in distinct ways, which makes it more challenging to capture the relevant data. The search string was developed through an iterative process of first identifying core terms and associated synonyms that describe the concept of rotational grazing, combined with the terms “chicken” or “poultry,” as well as the term “chicken tractor,” which specifically describes a type of mobile housing for poultry. Then, a group of terms describing crop production was added. Finally, based on a review of the results returned, additional terms were identified to exclude noise from the posts. We did this by reviewing posts from the highest-ranking websites as well as key topics detected by Brandwatch’s machine learning classifier.
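As a rough illustration of how a layered search string of this kind might be assembled, the sketch below joins term groups with generic Boolean operators. It is not the study's actual query: apart from “chicken,” “poultry,” and “chicken tractor,” the example terms are placeholders, the way the groups are combined is an assumption, and the `NOT` and `site:` syntax is generic rather than a confirmed rendering of the platform's query language.

```python
def or_group(terms):
    """Join terms into a parenthesized OR group, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(grazing, poultry, standalone, crops,
                exclude_terms=(), exclude_sites=()):
    """Assemble a Boolean search string from term groups.

    Assumed structure (hypothetical, not taken from the study):
    ((grazing AND poultry) OR standalone) AND crops, minus exclusions.
    """
    core = f"({or_group(grazing)} AND {or_group(poultry)})"
    if standalone:
        # Phrases such as "chicken tractor" already imply poultry on pasture.
        core = f"({core} OR {or_group(standalone)})"
    query = f"{core} AND {or_group(crops)}"
    if exclude_terms:
        query += f" NOT {or_group(exclude_terms)}"
    for site in exclude_sites:
        query += f" NOT site:{site}"
    return query


# Placeholder term lists for illustration only.
example_query = build_query(
    grazing=["rotational grazing", "pastured"],
    poultry=["chicken", "poultry"],
    standalone=["chicken tractor"],
    crops=["crop", "vegetable"],
    exclude_terms=["recipe", "flu"],
    exclude_sites=["food.com"],
)
```

Keeping each group in its own list makes the iterative refinement easier to track: a new synonym or exclusion term changes one list, and the string is rebuilt from scratch.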
We determined that certain websites or topics were generating high volumes of irrelevant posts; these included posts focused on sharing recipes and posts primarily focused on the avian influenza outbreaks, which were a trending topic at the time. We eliminated these posts by excluding terms like “recipe” or “flu,” or entire web domains such as “food.com,” that turned up in our search. By excluding the term “Ancient Nutrition,” a popular brand of dietary supplements, we were able to eliminate hundreds of irrelevant posts that advertised the brand's products. While it is possible that some exclusion terms, particularly those referring to the disease outbreak, could result in the loss of some relevant conversations, it was necessary to reduce the large volume of unrelated posts that would otherwise have cluttered the dataset. Through an iterative process, the final Boolean string provided a set of results that did not include obviously irrelevant web pages or key topics. We used a hybrid approach, first applying machine learning tools to analyze the full dataset and then applying traditional content analysis methodology to conduct a detailed manual review of a subset of the data originating only from Twitter, Reddit, and online forums. Pertaining to the first research question, we computed descriptive statistics on the relative volume of posts generated from various content sources, as well as changes in post volume over time. To gain a fuller understanding of which platforms are most strategic for potential public engagement and social listening pertaining to the research topic, we further considered the Impact and Reach scores of these posts. Reach Estimate is a score calculated by Brandwatch to estimate how many individuals may have seen a piece of content, using regression models on various post metadata.
It is calculated differently for each post type to account for the different metrics attached to each, so that scores are comparable across content types.
Impact is a Brandwatch metric that measures the impact of a post based on numerous industry metrics that indicate potential views and shares. It is normalized and scored on a logarithmic scale from 0 to 100. Pertaining to the second research question, we conducted an exploratory topic analysis of the online conversations using Brandwatch’s natural language processing technology, which uses a supervised machine learning classifier to identify prevalent topics in the text body of the posts. For the full 26-month span of the study, the algorithm provided the prevalence of key phrases and locations emerging from the text corpus of all conversations. We were also able to identify the prevalence of single keywords for a 24-month span within the time frame of the study. We then examined the dataset more closely by narrowing our focus to a manual review of posts from three selected content types: Twitter, Forums, and Reddit. Although news and blog sources generated a high volume of relevant posts, they were not prioritized for manual review since they take longer to review and additionally seemed to be declining in popularity based on a preliminary look at the post volume data. Due to the high volume of non-functional URLs and links, missing post body text, and/or irrelevant posts generated by the web scraper, we were unable to include posts from YouTube.com in the manual text analysis. Additionally, since blogs and news sites seemed to be decreasing in popularity based on post volume comparisons over time, they were deemed less strategic when considering future agriculture and research extension efforts. We also further limited our query to posts originating from the three major English-speaking poultry producer countries: USA, Great Britain, and Canada. We reviewed all 1,042 posts from the resulting subset of data and manually coded the text of these posts by defining variables of interest relevant to our research questions.
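The subsetting step described above amounts to a filter on content type and country. The following is a minimal sketch, assuming each exported post is available as a dictionary; the field names `content_source` and `country` and the exact label strings are hypothetical and would need to match the actual export columns.

```python
# Content types and countries retained for manual review, as named in the text.
MANUAL_REVIEW_SOURCES = {"Twitter", "Forum", "Reddit"}
MANUAL_REVIEW_COUNTRIES = {"USA", "Great Britain", "Canada"}


def select_for_manual_review(posts):
    """Return the posts whose source type and country are both in scope.

    `posts` is an iterable of dicts with (hypothetical) keys
    'content_source' and 'country'.
    """
    return [
        post
        for post in posts
        if post.get("content_source") in MANUAL_REVIEW_SOURCES
        and post.get("country") in MANUAL_REVIEW_COUNTRIES
    ]
```

Posts with a missing source or country are dropped rather than guessed at, which is the conservative choice for a subset that will be coded by hand.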
Just as it is useful to analyze consumer sentiment around the features of a product, the categories selected included specific “features” of pastured poultry and integrated crop production perceived by the users. Our selection of the categories was guided by the trends uncovered by the machine learning classifier and the researchers’ preliminary review of the posts. We also included some categories that were not necessarily direct impacts of any production system per se, but which reflected broader values expressed by the users. Lastly, some categories were added based on the interest of research and extension practitioners working on the relevant production systems. The directionality of user sentiment attributed to these features was defined by a positive or negative 1. Posts indicating both negative and positive attributes of integrated poultry-crop production were counted twice; otherwise, volumes do not reflect multiple mentions of a particular attribute in one post. The full list of categories and the criteria for each are defined in the code book in Appendix 1, which was used during coding and coder training to reduce subjectivity and ensure consistency in coding. Coders were trained in hands-on sessions totaling over 10 hours over the course of a month, during which they met regularly to assess inter-coder reliability. Early on, coders frequently discussed and refined the code book and eventually decided on 22 categories to code. Before coding the dataset independently, coders collectively coded a randomly extracted training dataset, which contained no less than 20% of each content source type, until they reached over 90% agreement on the coding.
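The text does not specify how agreement was computed. One common and simple reading is raw percent agreement: the share of post-by-category decisions on which two coders assigned the same code. The sketch below assumes that reading, with codes of +1, -1, or 0 (not mentioned); the category names in the example are hypothetical.

```python
def percent_agreement(codes_a, codes_b, categories):
    """Share of (post, category) decisions on which two coders agree.

    codes_a, codes_b: lists of dicts, one per post and in the same post
    order, mapping category name -> +1 or -1. A category absent from a
    post's dict is treated as 0 (not mentioned).
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same set of posts.")
    total = len(codes_a) * len(categories)
    if total == 0:
        raise ValueError("Need at least one post and one category.")
    matches = sum(
        1
        for post_a, post_b in zip(codes_a, codes_b)
        for category in categories
        if post_a.get(category, 0) == post_b.get(category, 0)
    )
    return matches / total


# Illustrative data: three posts, two hypothetical categories.
categories = ["animal_welfare", "soil_health"]
coder_a = [{"animal_welfare": 1}, {"soil_health": -1}, {}]
coder_b = [{"animal_welfare": 1}, {"soil_health": 1}, {}]
agreement = percent_agreement(coder_a, coder_b, categories)  # 5 of 6 decisions match
```

Raw percent agreement does not correct for agreement expected by chance, so with many rarely used categories it can look high even when coders disagree on most of the codes they actually assign.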
Three coders were used in total, with each source type reviewed by at least two coders. We used Microsoft Excel in the data analysis to perform descriptive statistics and generate figures and tables. We also used Brandwatch to generate tables and conduct natural language processing. A total of 2,399 posts from 1,089 unique authors were retrieved and downloaded with 56 columns of data and metadata related to the posts. Relevant metadata included author, date, country, web URL, partial post text, and others. The posts retrieved were automatically categorized by Brandwatch into content source types based on the web domains: News, Twitter, Blog, YouTube, Review, Forum, and Reddit. We note that while Reddit is technically a type of online forum, it is disaggregated into its own category due to the singular prominence of this platform and its community of users. Online news sites returned the highest volume of posts pertaining to pastured poultry and integrated cropping. News posts generated a moderate impact score but had the lowest reach relative to other content types such as forums, blogs, and Twitter. The news posts are for the most part not concentrated on any single news platform but dispersed across multiple news sites, with the exception of farmprogress.com, which generated 21 posts relevant to our research topic. When considering the potential for this content type to be effective for information dissemination, a strategy that ensures distribution of articles across many news outlets may be important for impact. Additionally, we should bear in mind that although analyzing the content of news articles can indicate what information people are consuming, this more formal platform may not fully reflect the sentiments of the broader public as compared to platforms such as forums that allow online users to interact in unstructured conversation.
Outreach and engagement strategies should also consider that although over half of Americans in 2022 across all demographics get their news from news websites or apps, a greater percentage of Americans under 30 years old report a preference for and usage of social media as a news source. Overall, preference for news websites or apps showed a slight decrease from 2020 to 2022, while preference for social media as a news source trended slightly up. Post volume data from our study for 2020-2021 similarly showed a declining trend for news websites and increased volume for social media platforms compared to the prior year. Although almost a quarter of American adults use Twitter, Twitter produced a relatively low volume of relevant posts in our study. Research shows that Twitter users tend to use it as a platform for political content; since poultry and vegetable production are not specifically politicized issues, it is possible that people are less inclined to tweet about them. Despite the low post volume, Twitter posts generated an overwhelmingly high average reach score. This finding is not entirely surprising given that 97% of all Tweets are produced by just 25% of users with a large following, while a majority of Twitter users “lurk” on the platform to see what others are saying instead of publishing content themselves. A closer examination of our data showed that almost 40% of the Twitter posts generated an estimated reach of zero, while the top three posts averaged an estimated reach score of over 17 thousand per post.
Finally, considering the low impact score of the Twitter posts collected by the commercial web scraper, along with research showing that highly active Tweeters receive little engagement from the broader Twitter audience, our data seem to suggest that, although ideas or content shared on Twitter may be seen by a large number of people, most posts may have little power to influence how overall opinions or conversations online will trend.
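The skew in reach described above can be made explicit with a few per-source summary figures: post count, mean reach, the share of posts with zero reach, and the mean reach of the top few posts. The study performed its descriptive statistics in Microsoft Excel; the sketch below is an equivalent illustration in Python, with hypothetical field names (`content_source`, `reach`).

```python
from collections import defaultdict
from statistics import mean


def summarize_reach(posts, top_n=3):
    """Summarize estimated reach per content source.

    `posts` is an iterable of dicts with (hypothetical) keys
    'content_source' and 'reach'. Returns a dict keyed by source.
    """
    reach_by_source = defaultdict(list)
    for post in posts:
        reach_by_source[post["content_source"]].append(post["reach"])

    summary = {}
    for source, reaches in reach_by_source.items():
        ranked = sorted(reaches, reverse=True)
        summary[source] = {
            "posts": len(reaches),
            "mean_reach": mean(reaches),
            # Fraction of posts that no one is estimated to have seen.
            "zero_reach_share": sum(1 for r in reaches if r == 0) / len(reaches),
            # Mean over the top_n highest-reach posts (fewer if not available).
            "top_n_mean_reach": mean(ranked[:top_n]),
        }
    return summary
```

Reporting the zero-reach share alongside the mean shows when a high average is driven by a handful of posts, as the Twitter results here suggest.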