Research on Social Media Methods and Analysis: Facebook Datasets and their Assessment

1.0 Introduction

Advances in Internet use and technology have made people feel more connected than before, and Facebook has become the most popular social networking website. Larsson (2016) conveys the extent of Facebook use through the observation that people are almost always online. Social networking websites have also gained popularity as a means of generating brand awareness and spreading vital information. These developments have extended the use of social media such as Facebook to following the course of a particular event or the activities of a specific public figure. Changes in the way content such as text and images is used and analysed have also attracted attention, and such analysis is now carried out regularly.

This essay analyses the content of one popular Facebook page. The analysis is based on datasets covering posts published between Nov 7, 2016 and March 9, 2017. The aim is to gain a better understanding of the use of the API and of how it differs from the web interface. The research questions that the datasets could answer are identified, and the relevant ethical considerations are discussed.



2.0 Literature Review

A number of studies have examined social media and its application to the study of current news or of the personality traits of the users of social networking websites.


Park et al. (2015) show that social media and the related analytics can be useful for assessing the personality of individual users. Their study covered 66,732 Facebook users and examined aspects of the users’ personality. The assessment was related to political attitudes, friends and impulsiveness. The findings suggest that what participants do on Facebook is shaped by their personal traits, which also direct the practices that the participants adopt.


The studies are not limited to individual personalities; they also take related factors into account. Seidman (2013) found that strategies of self-presentation and belongingness on Facebook are determined by the extent to which external factors influence individuals. By looking at factors such as the posts with the most comments from other users and the posts shared most often, it was possible to identify the effect of external influence on how others perceive a specific piece of content (Seidman, 2013). This suggests that users drive social networking sites and that they depend on each other.


Further studies have found that a Facebook status, and the way others perceive it, reflects the well-being of the person who posts it. Liu et al. (2015) show that understanding how people express their emotions and feelings on Facebook can help in understanding their dependence on the social networking site for relief. The authors analysed the content posted and shared by a set of users over the 9-10 months preceding the study, with special reference to the negative emotions expressed (Liu et al., 2015). Such analysis suggests that Facebook data is used mostly to assess the validity and effectiveness of the content that users share.


The studies reviewed indicate that social networks can be used to identify and predict the behaviour of individuals at a larger scale. Ruths and Pfeffer (2014) note that social networks are used to spread important news or to create awareness about a specific cause. The study compares such findings with an exit poll, as in the case of a presidential election, in which people’s opinions can be gauged from their discussions and thoughts. Ruths and Pfeffer (2014) further suggest that detailed analysis of the content shared on a social network shows that the type of post shared influences individuals. This also works for users who want to put their opinion across to others, as Facebook and similar sites offer a large platform at minimal investment.


Thus, the studies and literature reviewed indicate that social networking websites offer a large audience to those who need one. Updates and news can be disseminated in different ways, and the influence of internal as well as external factors can be used to deliver news or information to as many people as possible.





3.0 Assessment of Platform Data and Research Questions

To develop a further understanding of the ways in which information is shared and social networking sites are used, the datasets for Hillary Clinton’s Facebook page have been studied. The analysis covers different aspects of the posts published on the page between Nov 7, 2016 and March 9, 2017 (Hillary Clinton Facebook Page, 2017). The main aim has been to understand the popularity and reach of the figure. To do so, the posts have been studied alongside an in-depth analysis of the application programming interfaces (APIs) associated with the page.


To proceed with the analysis, several parameters have been taken into consideration. The first criterion has been to establish how the validity and reliability of the Facebook page can be verified. As stated by Halfpenny and Procter (2015), the ‘blue tick’ next to the page name shows whether the page has been identified as genuine and secure. Using posts spread across a specific timeline helps in reaching the best results and makes it easier to filter the messages and interfaces by relevance. Thus, the posts have been verified and used to compile the overall statistics, the per-day statistics and a list of the top comments on the posts included in the study.
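The overall and per-day statistics mentioned above could be compiled along the following lines. This is only a minimal sketch: the rows are made-up illustrative values rather than data from the page, and the function names `per_day_statistics` and `overall_statistics` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Illustrative rows only: (post date, likes, comments, shares).
# These numbers are invented for the sketch and are not taken from the page.
posts = [
    (date(2016, 11, 7), 100, 20, 10),
    (date(2016, 11, 7), 50, 5, 5),
    (date(2016, 11, 8), 200, 40, 30),
]


def per_day_statistics(rows):
    """Sum posts, likes, comments and shares for each calendar day."""
    totals = defaultdict(lambda: {"posts": 0, "likes": 0, "comments": 0, "shares": 0})
    for day, likes, comments, shares in rows:
        bucket = totals[day]
        bucket["posts"] += 1
        bucket["likes"] += likes
        bucket["comments"] += comments
        bucket["shares"] += shares
    return dict(totals)


def overall_statistics(rows):
    """Sum the same counters across the whole observation window."""
    return {
        "posts": len(rows),
        "likes": sum(row[1] for row in rows),
        "comments": sum(row[2] for row in rows),
        "shares": sum(row[3] for row in rows),
    }


daily = per_day_statistics(posts)
overall = overall_statistics(posts)
```

Keeping the per-day and overall totals in separate functions mirrors the distinction drawn in the text between statistics for the whole timeline and statistics based on per-day observation.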


All of the data has been processed using the API, which means that third parties can select the particular interfaces that they want to use in carrying out a study. According to Riffe, Lacy, and Fico (2014), an API is useful in limiting the load on the programmer, whereas with the web interface everything happening on the platform can be seen and observed at the same time. However, even when using the web interface, one can filter the type of information that is needed and that is to be given prime importance. Had the web interface been used in this study, it is possible that other factors of higher importance and usability would have been highlighted.


Datasets of this type can be used to address different kinds of research questions. Riffe, Lacy, and Fico (2014) highlight that social media content analysis plays an important role in defining the requirements of individuals. Under descriptive research questions, the datasets can identify public opinion and preferences. They can also be used to determine the rationale of a specific study and are just as useful in conducting a causal analysis.


Using the dataset listed, it is possible to answer the following questions:

  • In which country is Hillary Clinton most popular?
  • What kind of posts and opinions do people give more importance to?
  • What is the most common response or opinion of the people regarding a specific aspect?
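As a rough illustration of how the coded data might be queried to address these questions, the sketch below tallies commenter countries, averages engagement per post type and finds the most frequent response code. All values, the response codes and the helper names are assumptions made for the example; they are not taken from the dataset.

```python
from collections import Counter, defaultdict

# Hypothetical coded data, invented for illustration only.
post_rows = [("photo", 300), ("video", 900), ("photo", 500), ("link", 100)]
comment_countries = ["US", "GB", "US", "IN", "US"]
comment_codes = ["support", "criticism", "support", "support", "neutral"]


def mean_engagement_by_type(rows):
    """Average total engagement for each post type."""
    grouped = defaultdict(list)
    for post_type, engagement in rows:
        grouped[post_type].append(engagement)
    return {post_type: sum(values) / len(values) for post_type, values in grouped.items()}


def most_common(values):
    """Return the most frequent value together with its count."""
    return Counter(values).most_common(1)[0]


by_type = mean_engagement_by_type(post_rows)
top_type = max(by_type, key=by_type.get)      # post type given most importance
top_country = most_common(comment_countries)  # country with most commenters
top_response = most_common(comment_codes)     # most common response code
```

A count of commenters per country is only a proxy for popularity, so any answer to the first question drawn from such a tally would need to be read with that limitation in mind.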


4.0 Methods and Ethical Concerns

This essay describes the data collected from the posts on Hillary Clinton’s Facebook page during a specific period, Nov 7, 2016 to March 9, 2017. The period saw ups and downs owing to the voting in the presidential election and the aftermath of the results, and was found to be moderately active. Because the timeline was active, all the conversations were recorded with the utmost care. There were many updates and posts, so the Facebook page was kept under observation, with data harvested at regular intervals of 12 hours. The details of each post were entered in an Excel sheet, in which the date and day of the post were recorded along with the type of post, its id, link, post message, picture (if any), link and domain, each in a separate column. To further systematise the process, the number of likes, shares and comments, the different types of reactions and the total count of the different types of engagement were recorded in the Excel sheet in chronological order. The coding was done in the following manner:

Figure 1: Coding for Analysis


Figure 2: Coding for Response
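The spreadsheet layout described in this section can be sketched as a simple record type written out in chronological order. This is a sketch under stated assumptions rather than the sheet actually used: the field names are hypothetical, the two "link" columns are assumed to be the post’s own link and the link it shares, and total engagement is assumed to be the sum of likes, shares, comments and other reactions.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class PostRecord:
    """One row of the sheet; all field names are hypothetical."""
    date: str            # ISO date such as "2016-11-07", so text order is date order
    day: str
    post_type: str
    post_id: str
    post_link: str       # assumed: link to the post itself
    message: str
    picture: str         # empty when the post has no picture
    link: str            # assumed: link shared in the post
    domain: str
    likes: int
    shares: int
    comments: int
    other_reactions: int

    def total_engagement(self) -> int:
        # Assumed definition: sum of all recorded engagement counts.
        return self.likes + self.shares + self.comments + self.other_reactions


def to_csv(records):
    """Write records in chronological order with a total_engagement column."""
    ordered = sorted(records, key=lambda record: record.date)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([f.name for f in fields(PostRecord)] + ["total_engagement"])
    for record in ordered:
        writer.writerow(list(astuple(record)) + [record.total_engagement()])
    return buffer.getvalue()


# Placeholder rows, invented for illustration only.
records = [
    PostRecord("2016-11-08", "Tuesday", "photo", "id-2", "", "Example message B",
               "", "", "", 200, 30, 40, 15),
    PostRecord("2016-11-07", "Monday", "status", "id-1", "", "Example message A",
               "", "", "", 100, 10, 20, 5),
]
sheet = to_csv(records)
```

Storing the date in ISO form lets a plain text sort reproduce the chronological ordering described above without any date parsing.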

To verify the relevance of the posts, the content analysis method was used, and the data was recorded manually because Facebook provides no free-to-use API. Stemler (2015) highlights that the content analysis approach to identifying and studying social media data can yield well-defined results and an in-depth understanding. The coding considered the comments and shares from people in different countries and also identified the relevance of the elections and their results.


However, it was found that the results were affected by people’s changing opinions and by the head start that posts had before they became a subject of debate. Thus, the comments and the likes can be influenced by external factors.


As far as ethical issues are concerned, the methods of data collection and the methodological principles applied do not raise ethical objections. The participants’ personal details have not been disclosed, apart from the proportion of participants from different countries. Comments have been tracked through the use of special codes, without compromising authenticity or personal details.


5.0 Conclusion

This essay has shown that social networking sites are becoming a necessary tool for expressing personal thoughts as well as sharing information of a substantial nature. The use of various parameters and the record obtained through the API have been helpful in the content analysis of Hillary Clinton’s Facebook page for a specific and defined timeline. The methodology applied specific codes to generate the datasets, and this points towards the importance of content analysis using Facebook data. It can therefore be concluded that content analysis is one of the measures through which the use of social media can be analysed.



References

De Choudhury, M., Counts, S. and Horvitz, E. (2013). Predicting postpartum changes in emotion and behavior via social media. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems – CHI ’13.

(2017). Hillary Clinton. [online] Available at: [Accessed 20 May 2017].

Halfpenny, P. and Procter, R. (2015). Innovations in Digital Research Methods. 1st ed. London: SAGE Publications Ltd.

He, W., Zha, S. and Li, L. (2013). Social media competitive analysis and text mining: A case study in the pizza industry. International Journal of Information Management, 33(3), pp.464-472.

Larsson, A. (2016). Online, all the time? A quantitative assessment of the permanent campaign on Facebook. New Media & Society, 18(2), pp.274-292.

Liu, P., Tov, W., Kosinski, M., Stillwell, D. and Qiu, L. (2015). Do Facebook Status Updates Reflect Subjective Well-Being? Cyberpsychology, Behavior, and Social Networking, 18(7), pp.373-379.

Oeldorf-Hirsch, A., Birnholtz, J. and Hancock, J. (2017). Your post is embarrassing me: Face threats, identity, and the audience on Facebook. Computers in Human Behavior, 73, pp.92-99.

Park, G., Schwartz, H., Eichstaedt, J., Kern, M., Kosinski, M., Stillwell, D., Ungar, L. and Seligman, M. (2015). Automatic personality assessment through social media language. Journal of Personality and Social Psychology, 108(6), pp.934-952.

Riffe, D., Lacy, S. and Fico, F. (2014). Analyzing media messages. New York, N.Y.: Routledge.

Ruths, D. and Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), pp.1063-1064.

Seidman, G. (2013). Self-presentation and belonging on Facebook: How personality influences social media use and motivations. Personality and Individual Differences, 54(3), pp.402-407.

Stemler, S. (2015). Content Analysis. Emerging Trends in the Social and Behavioral Sciences, pp.1-14.

Van Iddekinge, C., Lanivich, S., Roth, P. and Junco, E. (2013). Social Media for Selection? Validity and Adverse Impact Potential of a Facebook-Based Assessment. Journal of Management, 42(7), pp.1811-1835.