Twitter Voices: Twitter Users’ Sentiments and Emotions About COVID-19 Vaccination within the United States

Introduction: The Coronavirus Disease 2019 (COVID-19) has negatively impacted society as a whole. Vaccination became the only reliable solution to overcome the severity of this pandemic. A critical factor in achieving adequate vaccination coverage is improving public confidence in immunization. Social media plays an important role in reflecting public perception towards certain topics, such as COVID-19 vaccination. This study aims to evaluate U.S. Twitter users’ sentiments and emotions towards COVID-19 vaccination, and the changes experienced before and after vaccine rollout. Methods: COVID-19 vaccine related tweets were collected from Twitter’s Application Programming Interface. We analyzed tweets from March 11, 2020, to May 17, 2021, and divided them into two groups: before and after the first vaccine was implemented in the U.S. Sentiment analysis, negative binomial regression, and linear regression models were used for inferential analysis. Results: A total of 19,654 tweets were extracted. Of those, 10,374 and 9,280 tweets were posted before and after the COVID-19 vaccine was launched in the U.S., respectively. A statistically significant difference was found between the two groups when comparing each individual emotion, and positive and negative sentiments, except for joy. Lastly, a statistically significant increase in the sentiment score was found in the post COVID-19 vaccine group compared to the pre COVID-19 vaccine group. Conclusion: Our findings show that public perception of the COVID-19 vaccine has positively changed over time and suggest that the terms “trials” and “vaccination”, which were associated with trust, could potentially be used to create targeted educational and promotional schemes to achieve a better vaccination coverage rate.


INTRODUCTION
The novel COVID-19 was formally declared a pandemic on March 11, 2020 (Cucinotta and Vanelli, 2020; Zhu et al., 2020). The first case was reported in Wuhan, China, in December 2019, and because the disease is highly contagious, it spread rapidly throughout the world (Huang et al., 2020; Khan et al., 2020). The situation within the U.S. was no exception, reaching 34,527,963 confirmed cases and 619,992 deaths by late June 2021 (Worldometer, n.d.). This public health emergency of international concern made healthcare providers, researchers, scientists, government officials, and others alike seek rapid and efficient solutions to overcome this unprecedented and alarming crisis.
Widespread lockdown and quarantine protocols, health care system changes such as the cancelation of nonurgent surgical and medical cases, and the redistribution of budgets, logistics, and medical equipment were some of the significant changes adopted as prevention and contingency measures for this pandemic. Unsurprisingly, COVID-19 negatively affected governments, the global economy, and physical health, and increased the mental health burden. All the while, vaccine development was immediately pursued as the foreseeable permanent solution.
It is well established that vaccination gives the immune system memory, allowing it to respond quickly and strongly to the targeted pathogen when infection occurs. Therefore, a high vaccination coverage rate is paramount to overcome this pandemic and prevent its continuance. On December 11, 2020, the Food and Drug Administration granted emergency use authorization for the Pfizer-BioNTech COVID-19 vaccine, with the first dose received by an intensive care nurse in New York City on December 14, 2020 (The Washington Post, n.d.). Shortly after, the Moderna and then the Johnson & Johnson/Janssen COVID-19 vaccines were also granted emergency use authorization (The American Journal of Managed Care, 2021). Currently, these three COVID-19 vaccines are being provided in the U.S. (Centers for Disease Control and Prevention, 2021).
A critical factor in achieving an increased and sustained vaccination coverage rate is public confidence in immunization (Centers for Disease Control and Prevention, 2015). Social media plays an important role in worldwide communication. It is through social media that people express their feelings and emotions about various topics without reservation, including the COVID-19 pandemic and its vaccination. Kwok et al. (2021) evaluated Australian Twitter users' sentiments concerning COVID-19 vaccination before its implementation. They found that the users' level of positive sentiment may not have been sufficient to achieve adequate vaccination coverage. Moreover, they found that users supported infection controls and disproved misinformation. Therefore, they concluded that governments should consider public opinion and sentiments towards this pandemic and its vaccination in order to develop an adequate vaccination promotion plan (Kwok et al., 2021). To the best of our knowledge, there has not been a study exploring public perceptions and sentiments via a social media platform in the U.S. related to the COVID-19 vaccine before and after its implementation. Hence, this study aims to compare U.S. located Twitter users' perceptions and sentiments towards COVID-19 vaccination before and after its implementation. We hypothesized that the overall public sentiment score would increase after the vaccine was launched.

Data Source
The social media platform Twitter was used as the source from which to extract text in the form of "tweets." As of 2021, this real time microblogging platform was among the most popular social media platforms, with over 335 million monthly active users (Lua, A., n.d.).

Data Collection
Twitter API v2 is a set of programmatic endpoints that serve as a source to better understand ongoing conversations on Twitter. After obtaining approval for access to the Academic Research product track of Twitter's API v2, we were able to extract COVID-19 vaccine related tweets in a precise, complete, and unbiased manner (Twitter, Inc., 2021b). We retrieved COVID-19 vaccine related tweets posted from March 11, 2020, to May 17, 2021. The date when COVID-19 was declared a pandemic was chosen as our starting timepoint and the date when we started extracting tweets was chosen as our ending timepoint. The following terms were used in our query: "COVID19" OR "COVID_19" OR "SARS_CoV_2" OR "SARSCoV2" OR "Corona" OR "Covid" AND "CovidVaccine" OR "COVIDVaccines" OR "COVIDVaccination" OR "vacc", "vaccine" OR "vax" OR "vaccination" OR "CovidVaccine" OR "COVIDVaccines" OR "vaccine". We excluded all tweets in languages other than English, retweets, and tweets with geolocations outside of the U.S.
Assuming a difference in the overall sentiment score of at least 0.05 between groups, a power of 90%, and an alpha of 0.05, a sample size of 16,814 tweets was estimated. To account for the possibility of duplicate tweets, we inflated this estimate by 15%, for a final sample size of 19,336 tweets. A simple random sampling method was used to obtain our sample. In order to extract each tweet at a specific time point, we translated each random date into a timestamp. We then used the timestamp to identify and forward engineer Twitter's snowflake ID for each tweet. With this information, we used Twitter API v2 to extract each tweet's text, time of creation, and location of creation. After obtaining the sample, we stratified it into two groups, pre COVID-19 vaccine and post COVID-19 vaccine, according to the day the first COVID-19 vaccine was given in the U.S. (December 14, 2020).
Microsoft Excel (version 16.49) (Microsoft Corporation, 2019) was used to generate the simple random sample of time points (date, month, year, and time). After obtaining the sample of timepoints, a script written in Python (Python Software Foundation, Delaware, U.S.), a high-level programming language, was used to generate the Twitter snowflake IDs (Galbreath, 2012). Once we acquired this information, we used Postman (Postman Inc., California, U.S.), a collaboration platform for API development that supports Twitter API v2, to create the HTTP requests that extracted each tweet's text, time, and place based on our key terms. These requests were built using the graphical user interface available in Postman (Twitter, Inc., 2021a).
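The sample size inflation and the timestamp to snowflake ID conversion described above can be sketched in Python as follows. This is a minimal illustration rather than the script used in the study: the function names are hypothetical, the random timepoints are drawn in Python instead of Microsoft Excel, and the conversion relies on the publicly documented snowflake layout, in which the upper bits hold the milliseconds elapsed since Twitter's epoch (1288834974657 ms) and the lower 22 bits hold worker and sequence numbers.

```python
import random
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
TWITTER_EPOCH_MS = 1288834974657  # Twitter snowflake epoch, in Unix milliseconds
TIMESTAMP_SHIFT = 22              # lower 22 bits are worker and sequence bits


def inflate_sample_size(base_n, factor):
    """Inflate a base sample size by a proportion (0.15 means 15%)."""
    return round(base_n * (1 + factor))


def random_timepoints(start, end, n, seed=None):
    """Draw n timepoints uniformly at random from [start, end), one-second steps."""
    rng = random.Random(seed)
    span_seconds = int((end - start).total_seconds())
    return [start + timedelta(seconds=rng.randrange(span_seconds)) for _ in range(n)]


def timestamp_to_snowflake(dt):
    """Smallest snowflake ID that could be issued at dt (dt must be timezone-aware)."""
    unix_ms = (dt - UNIX_EPOCH) // timedelta(milliseconds=1)
    return (unix_ms - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT


def snowflake_to_datetime(snowflake):
    """Recover the creation time (millisecond precision) encoded in a snowflake ID."""
    unix_ms = (snowflake >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


if __name__ == "__main__":
    start = datetime(2020, 3, 11, tzinfo=timezone.utc)
    end = datetime(2021, 5, 18, tzinfo=timezone.utc)  # exclusive bound, covers May 17, 2021
    print(inflate_sample_size(16814, 0.15))
    for t in random_timepoints(start, end, 3, seed=42):
        print(t.isoformat(), timestamp_to_snowflake(t))
```

An ID built this way is a lower bound for the IDs of tweets created at or after the sampled instant, which is what makes it usable for locating tweets around a random timepoint.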

Statistical Analysis
RStudio (RStudio, PBC, MA, U.S.) was used to conduct descriptive and inferential statistics including, but not limited to, sentiment analysis. Before formally analyzing the data, the cohort of tweets underwent several preprocessing operations, in which special characters (e.g., @), emojis, numbers, tabs, blank spaces at the beginning and end of each text, punctuation marks, and links were removed. Furthermore, all characters were transformed into lower case. The R package "syuzhet" was then used to conduct the sentiment analysis (Galbreath, 2012; Jockers, 2020; Priest, 2017). Each sentiment score is based on the overall vectors generated by two sentiments (positive and negative) and eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust). These were obtained by applying the aforementioned package, which uses Stanford's CoreNLP (Core Natural Language Processing) toolkit to classify text as defined in the Canadian National Research Council's Word-Emotion Association Lexicon (Manning et al., 2014; Turney, 2010, 2013).
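The preprocessing steps and the lexicon-based scoring described above can be illustrated with a short Python sketch. It does not reproduce the "syuzhet" package or the actual lexicon: `TOY_LEXICON` is a hypothetical five-word stand-in, the function names are illustrative, emoji removal is approximated by dropping all non-ASCII characters, and the overall score is assumed to be the positive count minus the negative count.

```python
import re
import string
from collections import Counter

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")  # crude emoji removal (assumption)
DIGIT_RE = re.compile(r"\d+")
# Map every ASCII punctuation mark, including @ and #, to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def preprocess(text):
    """Remove links, emojis, numbers and punctuation; lowercase; trim whitespace."""
    text = URL_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())  # also collapses tabs and repeated spaces


# Hypothetical toy lexicon: word -> categories. Not the real lexicon entries.
TOY_LEXICON = {
    "trials": {"trust"},
    "doctors": {"trust", "positive"},
    "hope": {"positive", "anticipation"},
    "pandemic": {"negative", "sadness", "fear"},
    "death": {"negative", "disgust"},
}


def emotion_counts(clean_text, lexicon=TOY_LEXICON):
    """Count how many words of the tweet fall into each emotion or sentiment."""
    counts = Counter()
    for word in clean_text.split():
        for category in lexicon.get(word, ()):
            counts[category] += 1
    return counts


def sentiment_score(counts):
    """Positive and negative counts point in opposite directions and are summed."""
    return counts["positive"] - counts["negative"]


if __name__ == "__main__":
    tweet = "  @CDCgov The #vaccine trials give me HOPE!! 100% https://t.co/abc \t"
    clean = preprocess(tweet)
    counts = emotion_counts(clean)
    print(clean)
    print(dict(counts), sentiment_score(counts))
```

Because a single tweet can contain both positive and negative words, its score can be zero even when several emotions are present, which is why the emotion counts and the overall score are analyzed separately.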
Figures were used to present emotional comparison word clouds, number of tweets related to emotions and feelings, and trends over time. To model the relationship between the absolute number of words related to the eight emotions and the positive and negative feelings, we used a negative binomial regression due to the overdispersion of the variables under consideration. Additionally, a linear regression model was used to evaluate the relationship between sentiment scores before and after COVID-19 vaccine implementation.
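For intuition about the two models, consider the special case of a single binary predictor (before versus after vaccine implementation). In that case the point estimate of the incidence rate ratio from a log-link count model reduces to the ratio of the group mean counts, and the slope of a linear regression reduces to the difference in group mean scores. The Python sketch below shows only these point estimates under that assumption; the data are made up, the function names are hypothetical, and confidence intervals and p-values would still require fitting the models in statistical software.

```python
from statistics import mean, variance


def is_overdispersed(counts):
    """Sample variance above the mean suggests a Poisson model is too restrictive."""
    return variance(counts) > mean(counts)


def incidence_rate_ratio(pre_counts, post_counts):
    """Point estimate of the IRR for a post vs. pre indicator: ratio of mean counts."""
    return mean(post_counts) / mean(pre_counts)


def simple_ols(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


if __name__ == "__main__":
    # Made-up per-tweet counts of words tagged with one emotion.
    pre_counts, post_counts = [4, 0, 2, 10], [3, 0, 1, 8]
    print(is_overdispersed(pre_counts), incidence_rate_ratio(pre_counts, post_counts))

    # Made-up sentiment scores; group is coded 0 = pre, 1 = post.
    pre_scores, post_scores = [0.0, 0.2, 0.1], [0.3, 0.5, 0.1]
    group = [0] * len(pre_scores) + [1] * len(post_scores)
    intercept, slope = simple_ols(group, pre_scores + post_scores)
    print(intercept, slope)  # intercept is the pre mean, slope the post minus pre difference
```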
A subgroup analysis was conducted to assess Twitter users' feelings regarding the COVID-19 vaccine in surgery. For this, we filtered our database to include tweets with the following keywords: "surg" AND "surgery" AND "surgical" AND "operation". Afterwards, each tweet was manually read to ascertain whether it contained any keyword related to surgery. A descriptive analysis similar to the one used for our primary analysis was undertaken. Moreover, each tweet was manually analyzed in a qualitative manner.
The number of tweets related to positive and negative feelings regarding the COVID-19 vaccine is shown in Figure 3. A statistically significant difference was found between the two groups when comparing these two feelings (positive p<0.001 and negative p<0.001). COVID-19 vaccine related tweets posted after vaccine implementation had 0.91 times (95% CI 0.90-0.93) the rate of expressing positive feelings and 0.75 times (95% CI 0.72-0.77) the rate of expressing negative feelings compared to those posted before vaccine implementation. Sentiment score trends over time are depicted in Figure 4. This graph represents the daily sentiment score for both groups and shows a trend towards a more positive sentiment score over time. Specifically, there was a relative increase of 109.5% in the sentiment scores when comparing before and after COVID-19 vaccine introduction. Each tweet could have positive and negative sentiments with vectors in opposite directions; the overall sentiment score is the sum of all these vectors. Specific peaks and troughs were observed in this cohort of tweets: two peaks, during June 2020 and March 2021, and four troughs, during May 2020, June 2020, October 2020, and between February and March 2021 (Figure 4). Univariate analysis showed a statistically significant increase of 0.16 in the sentiment score in the post COVID-19 vaccine group compared to the pre COVID-19 vaccine group (p<0.001).
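As a reading aid, the rate ratios reported above can be converted into percent changes: a ratio below 1 corresponds to a lower rate after vaccine implementation. The short Python sketch below applies (ratio - 1) x 100 to the reported point estimates and confidence limits; the function name is hypothetical and the rounding to one decimal place is a presentation choice.

```python
def irr_to_percent_change(irr):
    """Convert a rate ratio into a percent change; negative values mean a decrease."""
    return round((irr - 1.0) * 100.0, 1)


if __name__ == "__main__":
    # Point estimate, lower and upper 95% confidence limits as reported in the text.
    reported = {
        "positive": (0.91, 0.90, 0.93),
        "negative": (0.75, 0.72, 0.77),
    }
    for feeling, (irr, lower, upper) in reported.items():
        print(
            feeling,
            irr_to_percent_change(irr),
            irr_to_percent_change(lower),
            irr_to_percent_change(upper),
        )
```

Read this way, positive feelings were expressed at a rate about 9% lower and negative feelings at a rate about 25% lower after vaccine implementation.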

DISCUSSION
As of June 29, 2021, 53.8% of the U.S. population had received at least one dose of a COVID-19 vaccine and 46.1% were fully vaccinated (Our World in Data, 2021). Notably, based on a recent poll from Monmouth University, one in five Americans were unwilling to get a vaccine (Soucheray, S., 2021). The reasons behind willingness and unwillingness to get a vaccine might be a sum of different factors. A given position comes with a related feeling and emotion that may be reflected on social media, such as Twitter. However, feelings and emotions have a dynamic nature and can easily change over time, depending on innumerable circumstances surrounding a person. COVID-19 vaccination is a clear example of how volatile they can be, as confirmed by our study. We found that among U.S. located Twitter users there was a statistically significant change in emotions, sentiments, and scores of COVID-19 vaccine related tweets before and after the COVID-19 vaccine was launched.
Interestingly, this study's results demonstrated not only a decrease in negative sentiments but also a decrease in all the emotions except for "joy" after COVID-19 vaccine implementation. Particularly, we identified that the emotion "fear" decreased by 25% after vaccination was launched [IRR 0.75, 95% CI 0.72-0.79], which we believe might reflect the positive impact of the vaccine and patients' confidence in it. As previously mentioned, public confidence in immunization is a paramount factor to increase vaccination coverage (Centers for Disease Control and Prevention, 2015). These findings could be related to the several high quality clinical trials and to public education about the COVID-19 vaccine by the Centers for Disease Control and Prevention (2015).
As depicted in Figure 4, changes in sentiment score trends were observed over time. Despite fluctuations, a relative increase of 109.5% in the sentiment scores was seen after the COVID-19 vaccine was implemented in the U.S. Interestingly, one negative trough was identified between February and March 2021. A possible explanation is that several immunization drives were canceled due to hazardous winter storms, which greatly hampered the national vaccine rollout and impeded the distribution of six million doses (The American Journal of Managed Care, 2021). On the other hand, an important peak in sentiment scores was observed in March 2021. A possible explanation could be that in late February, clinical trials related to the Pfizer vaccine showed 98.8% effectiveness in preventing patient death and hospitalization, and the Food and Drug Administration (FDA) recommended expedited trials for a COVID-19 booster shot (The American Journal of Managed Care, 2021). Moreover, in early March 2021, the U.S. president declared that vaccines would be available for every adult located in the U.S. by May 2021.
As our trends show, and as the timeline of COVID-19 events supports, critical changes and improvements at the national level surrounding the pandemic are reflected in Twitter user sentiment scores.
Previous studies that conducted sentiment analyses before COVID-19 vaccination was introduced reported negative emotions towards this pandemic and its vaccination (Garcia and Berton, 2021; Hussain et al., 2021; Kwok et al., 2021). However, the overall magnitude of positive sentiments was greater than that of the negative ones, which aligns with our study's findings (Hussain et al., 2021). Interestingly, Garcia and Berton (2021) reported that negative sentiments were related to proliferation care, case reports, and statistics topics. The authors discussed that these feelings could be balanced by developing public health strategies that address them (e.g., communication). Moreover, an analysis of Australian Twitter users' sentiments related to COVID-19 vaccination indicated that this population was in favor of infection controls against the virus and disproved misinformation, and that conspiracy theories might have influenced user perceptions and views of the COVID-19 vaccine (Kwok et al., 2021).
Moreover, vaccine development, effectiveness, and trials were found to be linked with public optimism before the COVID-19 vaccine introduction (Hussain et al., 2021). Our study's results showed that one of the most frequently used words before COVID-19 vaccine introduction was "trials" in the trust category, which aligns with the aforementioned finding and suggests that people have trusting feelings regarding research related to COVID-19 vaccines. Likewise, after the COVID-19 vaccine was implemented, feelings of trust persisted in relation to the words "vaccination", "vaccinated", and "doctors", reflecting Twitter users' trust in the COVID-19 vaccine. Similar findings regarding trust in the COVID-19 vaccine were reported by Lyu et al. (2021), who analyzed public COVID-19 vaccine related tweets from March 11, 2020 to January 31, 2021. On the other hand, our study showed that the word "pandemic" was related to sadness and the word "death" to disgust in both groups, before and after COVID-19 vaccine introduction.
Undoubtedly, public opinion and sentiment towards COVID-19 and the vaccine are the cornerstone of better understanding the public's position on COVID-19 vaccination. By examining public opinion, healthcare providers and governmental agencies can identify and target potential misperceptions and improve health care policy and messaging, as well as public attitudes regarding them. Public perceptions and opinions as portrayed in social media should be taken into account by governments and institutions together with surveys and conventional methods that assess the public's attitudes (Hussain et al., 2021).
In our subgroup analysis, we explored the public's perceptions and feelings related to surgery and COVID-19 vaccination. Patients undergoing surgery often have postoperative immunosuppression that might lead to an increased risk of acquiring COVID-19. Furthermore, the risks of developing a severe case, requiring hospitalization or intensive medical care, and even death are increased (Aminian et al., 2020; Bustos et al., 2020; Lei et al., 2020). From a public health standpoint, prioritizing vaccinations for patients who require elective procedures should be of utmost importance in order to avoid delays in care that could lead to the progression of other diseases. However, until now public perceptions and feelings regarding COVID-19 vaccination and surgery have remained obscure. In our preliminary results, the majority of tweets were found in the post COVID-19 vaccine group. The word "surgery" was associated with feelings of anger and the word "elective" with feelings of anticipation. Also, the word "pandemic" among this subgroup was associated with feelings of disgust. These preliminary findings suggest public awareness of how COVID-19 has impacted surgical protocols, generating feelings of anger and disgust.
This study's limitations could be related to missing keywords employed to extract COVID-19 vaccine related tweets. Moreover, this study included tweets geographically posted within the U.S., which limits generalizability to other populations. No information could be extracted in regard to Twitter users' ages, which is an important factor that could predict varying emotions and feelings. Lastly, we did not conduct topic modeling which could have provided additional information relevant to the topic at hand.

CONCLUSIONS
Our findings indicate that public perception of the COVID-19 vaccine has positively changed over time. All emotions and sentiments except for joy showed a decrease after COVID-19 vaccine implementation. Governmental agencies and healthcare institutions should take into consideration the public's sentiment towards this pandemic and vaccination to create targeted educational interventions and promotion schemes to achieve a better vaccination coverage rate.