pubmed.ncbi.nlm.nih.gov

  • Sat Jan 01 2011

Can tweets predict citations? Metrics of social impact based on Twitter and correlation with traditional metrics of scientific impact

Gunther Eysenbach. J Med Internet Res. 2011.

Erratum in

  • doi:10.2196/jmir.2041

Abstract

Background: Citations in peer-reviewed articles and the impact factor are generally accepted measures of scientific impact. Web 2.0 tools such as Twitter, blogs, or social bookmarking tools make it possible to construct innovative article-level or journal-level metrics to gauge impact and influence. However, the relationship of these new metrics to traditional metrics such as citations is not known.

Objective: (1) To explore the feasibility of measuring social impact of and public attention to scholarly articles by analyzing buzz in social media, (2) to explore the dynamics, content, and timing of tweets relative to the publication of a scholarly article, and (3) to explore whether these metrics are sensitive and specific enough to predict highly cited articles.

Methods: Between July 2008 and November 2011, all tweets containing links to articles in the Journal of Medical Internet Research (JMIR) were mined. For a subset of 1573 tweets about 55 articles published between issues 3/2009 and 2/2010, different metrics of social media impact were calculated and compared against subsequent citation data from Scopus and Google Scholar 17 to 29 months later. A heuristic to predict the top-cited articles in each issue through tweet metrics was validated.

Results: A total of 4208 tweets cited 286 distinct JMIR articles. The distribution of tweets over the first 30 days after article publication followed a power law (Zipf, Bradford, or Pareto distribution), with most tweets sent on the day when an article was published (1458/3318, 43.94% of all tweets in a 60-day period) or on the following day (528/3318, 15.9%), followed by a rapid decay. The Pearson correlations between tweetations and citations were moderate and statistically significant, with correlation coefficients ranging from .42 to .72 for the log-transformed Google Scholar citations, but were less clear for Scopus citations and rank correlations. A linear multivariate model with time and tweets as significant predictors (P < .001) could explain 27% of the variation of citations. Highly tweeted articles were 11 times more likely to be highly cited than less-tweeted articles (9/12 or 75% of highly tweeted articles were highly cited, while only 3/43 or 7% of less-tweeted articles were highly cited; rate ratio 0.75/0.07 = 10.75, 95% confidence interval, 3.4-33.6). Top-cited articles can be predicted from top-tweeted articles with 93% specificity and 75% sensitivity.
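The rate ratio, sensitivity, and specificity reported in the Results can be reproduced from the four counts given there (9 of 12 highly tweeted and 3 of 43 less-tweeted articles were highly cited). The following is a minimal Python sketch, not the author's analysis code. The function and variable names are illustrative, and the confidence interval uses the standard log-scale (Katz) approximation, which is an assumption: the abstract does not state how the interval was computed. Note that 10.75 results from the unrounded proportion 3/43 rather than the rounded 0.07.

```python
import math

# Counts taken from the Results paragraph of the abstract.
HIGHLY_TWEETED_TOTAL = 12       # highly tweeted articles
HIGHLY_TWEETED_TOP_CITED = 9    # ...of which highly cited
LESS_TWEETED_TOTAL = 43         # less-tweeted articles
LESS_TWEETED_TOP_CITED = 3      # ...of which highly cited


def rate_ratio_with_ci(a, n1, c, n0, z=1.96):
    """Rate ratio (a/n1)/(c/n0) with a log-scale (Katz) confidence interval.

    The interval method is an assumption; the abstract does not name one.
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def sensitivity_specificity(a, n1, c, n0):
    """Treat 'highly tweeted' as the test and 'highly cited' as the outcome."""
    true_pos = a            # highly tweeted and highly cited
    false_neg = c           # less tweeted but highly cited
    false_pos = n1 - a      # highly tweeted but not highly cited
    true_neg = n0 - c       # less tweeted and not highly cited
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


rr, ci_low, ci_high = rate_ratio_with_ci(
    HIGHLY_TWEETED_TOP_CITED, HIGHLY_TWEETED_TOTAL,
    LESS_TWEETED_TOP_CITED, LESS_TWEETED_TOTAL,
)
sens, spec = sensitivity_specificity(
    HIGHLY_TWEETED_TOP_CITED, HIGHLY_TWEETED_TOTAL,
    LESS_TWEETED_TOP_CITED, LESS_TWEETED_TOTAL,
)

print(f"rate ratio = {rr:.2f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Under these assumptions the sketch yields values consistent with the reported rate ratio of 10.75, the interval of roughly 3.4 to 33.6, 75% sensitivity, and 93% specificity.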

Conclusions: Tweets can predict highly cited articles within the first 3 days of article publication. Social media activity either increases citations or reflects the underlying qualities of the article that also predict citations, but the true use of these metrics is to measure the distinct concept of social impact. Social impact measures based on tweets are proposed to complement traditional citation metrics. The proposed twimpact factor may be a useful and timely metric to measure uptake of research findings and to filter research findings resonating with the public in real time.

Conflict of interest statement

The author is editor and publisher of JMIR and is a shareholder of JMIR Publications Inc., which owns and publishes JMIR. He does not currently take any salary from JMIR Publications Inc., but his wife does (and complains it isn't enough). JMIR Publications Inc. also owns the domains twimpact.org, twimpactfactor.org and twimpactfactor.com with the possible goal to create services to calculate and track twimpact and twindex metrics for publications and publishers, and may or may not directly or indirectly profit from these services.

Figures

Figure 1

Top Articles ranking on the Journal of Medical Internet Research (JMIR) (sorted by most-tweeted articles in November 2011).

Figure 2

Number of tweetations within 7 days of article publication, per article ID. Asterisks next to article IDs denote that the article is top-cited (see also Figure 8): ** top 25th citation percentile within issue by both Scopus and Google Scholar citation counts; * top 25th citation percentile according to Google Scholar only; (*) top 25th citation percentile according to Scopus only.

Figure 3

Tweetation dynamics. The blue, shaded area (left y-axis) shows the tweet rate (new tweetations per day, as a proportion of all tweetations during the first 60 days [tw60]). The red line (right y-axis) represents cumulative tweetations.

Figure 4

Tweetation dynamics over time on a log-log scale. All tweetations were categorized according to when, in relationship to the cited article publication date, they were tweeted (x-axis), with 1 being the day of article publication.

Figure 5

Tweetation dynamics in the first 7 days after article publication for one specific issue. The 4-digit number is the article identifier (last digits of the DOI), the number in parentheses is the citation count (as per Google Scholar, November 2011), and the last number is the cumulative number of tweets on day 7 (tw7).

Figure 6

Tweetation density by account. Each Twitter account is ranked by the number of tweetations sent and plotted by rank on the x-axis. The y-axis shows how many tweetations were sent by each ranked account. For example, the top Twitter account ranked number 1 (@JMedInternetRes) sent 370 tweetations. Note the linear pattern on a log-log scale, implying a power law.

Figure 7

Left: Zipf plot for JMIR articles 3/2000-12/2009 (n=405), with number of citations (y-axis) plotted against the ranked articles. Right: Zipf plot showing the number of tweetations in the first week (tw7) to all JMIR articles (n=206) published between April 3, 2009 and November 15, 2011 (y-axis) plotted against the ranked articles. For example, the top tweeted article got 97 tweetations, the 10th article got 43 tweetations, and the 102nd ranked article got 9 tweetations.

Figure 8

Google Scholar citation counts for all articles published between issue 3/2009 and issue 2/2010. Top-cited articles (75th percentile) within each issue are marked ** (top cited according to Google Scholar and Scopus), * (Google Scholar only), or (*) (Scopus only).

Figure 9

Citation and tweetation dynamics of a highly cited (and highly tweeted) article [article ID 1376]; citations according to Scopus.

Figure 10

Correlations between citations in November 2011 (Google Scholar) and the cumulative number of early tweets by day 7 (tw7). Note the logarithmic scale. Articles with 0 tweets or 0 citations are not displayed here, because the log of 0 is not defined. However, conceptually they all fall into the lower left quadrant.

Figure 11

Tweetation curves: cumulative tweetations (twn), as a proportion of all tweetations sent within 30 days.

Figure 12

Model of the relationship between social impact and research impact metrics.
