biblioshiny is a shiny app providing a web-interface for bibliometrix.
The bibliometrix package provides a set of tools for quantitative research in bibliometrics and scientometrics.
bibliometrix is an open-source tool for executing a comprehensive science mapping analysis of scientific literature.
It was programmed in the R language to be flexible and to facilitate integration with other statistical and graphical packages. Indeed, bibliometrics is a constantly changing science, and bibliometrix has the flexibility to be quickly upgraded and integrated. Its development can draw on a large and active community of developers formed by prominent researchers.
bibliometrix provides various routines for importing bibliographic data from SCOPUS, Clarivate Analytics’ Web of Science, PubMed and Cochrane databases, performing bibliometric analysis and building data matrices for co-citation, coupling, scientific collaboration analysis and co-word analysis.
For an introduction and live examples, see the documentation pages.
biblioshiny supports scholars in easy use of the main features of bibliometrix:
o Data importing and conversion to data frame collection;
o Data filtering;
o Descriptive analysis of the bibliographic collection;
o Analyzing the different architectures of a bibliographic collection through Conceptual, Intellectual and Social structures.
1) Download and install the most recent version of R (https://cran.r-project.org/)
2) Download and install the most recent version of RStudio (http://www.rstudio.com)
3) Open RStudio and, in the console window, type:
4) After the package installation, to start the shiny web-interface, type:
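For steps 3 and 4, a typical console session looks like the sketch below. It assumes that the package is installed from CRAN under the name bibliometrix and that the web-interface is started by the package function of the same name as the app.

```r
# Step 3: install bibliometrix from CRAN (assumes a working internet connection).
install.packages("bibliometrix")

# Step 4: load the package and launch the shiny web-interface in the browser.
library(bibliometrix)
biblioshiny()
```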
To try the biblioshiny app, we provide a collection downloaded from Web of Knowledge. It includes all articles published by the Journal of Informetrics from 2007 to 2017.
The file can be downloaded at the following link: http://www.bibliometrix.org/datasets/joy.zip
biblioshiny imports and converts data extracted from the two main bibliographic databases: SCOPUS and Clarivate Analytics Web of Science.
SCOPUS (http://www.scopus.com), founded in 2004, offers a great deal of flexibility for the bibliometric user. It lets users query different fields, such as titles, abstracts, keywords, references and so on. SCOPUS makes it relatively easy to download query results, although there are limits on very large result sets of more than 2,000 items.
The SCOPUS platform allows exporting only 2,000 records at a time.
Choose the file type “BibTeX export” and “all available information”. The SCOPUS export tool creates one or more export files with the default name “scopus (number).bib”. Export files can be separately stored.
Clarivate Analytics Web of Science (WoS) (http://www.webofknowledge.com), owned by Clarivate Analytics, was founded by Eugene Garfield, one of the pioneers of bibliometrics.
This platform includes many different collections.
The WoS platform allows exporting only 500 records at a time.
The Clarivate Analytics Web of Science export tool creates one or more export files with the default name “savedrecs (number)” and the extension “.txt” or “.bib” for plain text or BibTeX format respectively. Export files can be separately stored.
The argument database indicates from which database the collection has been downloaded.
It can be:
o “Web of Knowledge” (for Clarivate Analytics Web of Science database),
o “Scopus” (for SCOPUS database).
The argument File format indicates the file format of the imported collection. It can be “plaintext” or “BibTeX” for a WoS collection and must be “BibTeX” for a SCOPUS collection.
If you have more than one export file (.txt or .bib), compress all export files into a single zip archive (.zip). You can then load it directly with the browse button. biblioshiny will automatically merge the export files and convert them into a single data frame.
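The same import can be sketched from the R console. This is a hedged example, not a description of what biblioshiny runs internally: it assumes a recent bibliometrix version in which convert2df() takes the file path directly, and the file name below is a hypothetical placeholder.

```r
library(bibliometrix)

# Hypothetical path: replace it with your own export file.
file <- "savedrecs.txt"

if (file.exists(file)) {
  # dbsource and format play the role of the "database" and "File format" options.
  M <- convert2df(file = file, dbsource = "wos", format = "plaintext")

  # For a SCOPUS BibTeX export the call would look like:
  # M <- convert2df(file = "scopus.bib", dbsource = "scopus", format = "bibtex")
}
```

Older releases of the package used different argument values for the database name, so check the help page of your installed version.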
Each manuscript contains several elements, such as authors’ names, title, keywords and other information. All these elements constitute the bibliographic attributes of a document, also called metadata.
Data frame columns are named using the standard Clarivate Analytics WoS field tag codes.
The main field tags are:
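As an illustration, a few commonly used WoS field tags can be kept in a small lookup vector. The subset below is not exhaustive, and the descriptions are paraphrased rather than quoted from the package.

```r
# A non-exhaustive subset of standard WoS field tags (descriptions paraphrased).
field_tags <- c(
  AU = "Authors",
  TI = "Document title",
  SO = "Publication name (source)",
  DT = "Document type",
  DE = "Authors' keywords",
  ID = "Keywords Plus (keywords assigned by the database)",
  AB = "Abstract",
  C1 = "Author address",
  CR = "Cited references",
  TC = "Times cited",
  PY = "Publication year"
)

# Look up the meaning of a data frame column name.
field_tags[["PY"]]
```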
Filters are useful to define the boundaries of a collection. biblioshiny includes the main filters such as:
o Document type
Bibliographic documents have different types, such as articles, reviews, editorials, and so on. Most science mapping analyses use articles as the unit of analysis. However, scholars may be interested in doing a meta-review using the document type “reviews”;
o Publication year
This filter is useful to restrict the timespan. Comparing structures across different timeslices traces their historical evolution, so scholars can see the dynamics of knowledge structures.
o Total Citations
Scholars may be interested in selecting the most important articles in a field, known as “citation classics”. These articles are considered knowledge and research drivers;
o Source
This filter lets you focus on a specific subset of journals, selected on the basis of relevance, or on a single journal.
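The filters above amount to row selection on the bibliographic data frame. The sketch below uses base R on a made-up data frame. The column names follow the WoS field tags (DT, PY, TC, SO), and the thresholds are arbitrary assumptions chosen for illustration.

```r
# Toy bibliographic data frame using WoS field tags (all values are made up).
M <- data.frame(
  DT = c("ARTICLE", "REVIEW", "ARTICLE", "EDITORIAL"),
  PY = c(2008, 2012, 2016, 2017),
  TC = c(120, 45, 3, 0),
  SO = c("JOURNAL OF INFORMETRICS", "SCIENTOMETRICS",
         "JOURNAL OF INFORMETRICS", "SCIENTOMETRICS"),
  stringsAsFactors = FALSE
)

# Combine the four filters: document type, publication year,
# total citations and source.
keep <- M$DT == "ARTICLE" &
  M$PY >= 2007 & M$PY <= 2017 &
  M$TC >= 1 &
  M$SO %in% "JOURNAL OF INFORMETRICS"

M_filtered <- M[keep, ]
nrow(M_filtered)
```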
The first step is to perform a descriptive analysis of the bibliographic data frame.
The tab “Tables” displays main information about the bibliographic data frame and several tables, such as annual scientific production, top manuscripts per number of citations, most productive authors, most productive countries, total citation per country, most relevant sources (journals) and most relevant keywords.
The main information table describes the collection size in terms of number of documents, number of authors, number of sources, number of keywords, timespan, and average number of citations.
Furthermore, many different co-authorship indices are shown.
In particular, the Authors per Article index is calculated as the ratio between the total number of authors and the total number of articles.
The Co-Authors per Article index is calculated as the average number of co-authors per article. This index counts author appearances, whereas the Authors per Article index counts each author only once, even if he or she has published more than one article. For that reason, Authors per Article index ≤ Co-Authors per Article index.
The Collaboration Index (CI) is calculated as Total Authors of Multi-Authored Articles/Total Multi-Authored Articles (Elango and Rajendran, 2012; Koseoglu, 2016). In other words, the Collaboration Index is a Co-Authors per Article index calculated using only the multi-authored article set.
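The three indices can be worked through on a toy collection. This is a minimal sketch with made-up author labels. It assumes that the Collaboration Index counts author appearances in multi-authored articles; an implementation that counts distinct authors instead would give a different value.

```r
# Toy collection: one character vector of author names per article.
authors <- list(
  c("AUTHOR A", "AUTHOR B"),
  c("AUTHOR A"),
  c("AUTHOR B", "AUTHOR C", "AUTHOR D")
)

n_articles    <- length(authors)
n_authors     <- length(unique(unlist(authors)))  # each author counted once
n_appearances <- sum(lengths(authors))            # every appearance counted

authors_per_article    <- n_authors / n_articles
co_authors_per_article <- n_appearances / n_articles

# Collaboration Index: restrict to multi-authored articles
# (assumption: author appearances are counted, not distinct authors).
multi <- authors[lengths(authors) > 1]
collaboration_index <- sum(lengths(multi)) / length(multi)
```

Because the number of appearances can never be smaller than the number of distinct authors, the first index cannot exceed the second.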
The Tab “Plots” draws several descriptive plots.
In the descriptive analysis, biblioshiny also includes the tab “Wordcloud”, which draws a word cloud using keywords or terms from titles and abstracts.
Co-word networks show the conceptual structure, which uncovers links between concepts through term co-occurrences.
The conceptual structure is often used to understand the topics covered by scholars (the so-called research front) and to identify the most important and most recent issues.
Dividing the whole timespan into different timeslices, through the Filter menu, and comparing the conceptual structures is useful to analyze the evolution of topics over time.
biblioshiny is able to analyze keywords, as well as the terms in the articles’ titles and abstracts. It does so using network analysis, correspondence analysis (CA) or multiple correspondence analysis (MCA). CA and MCA visualise the conceptual structure in a two-dimensional plot.
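The matrix behind a co-word network can be sketched in base R. The example assumes that keywords are stored as one string per document separated by “;”. The data are made up, and this is an illustration of term co-occurrence counting rather than the package's own routine.

```r
# Toy keyword field: one string per document, keywords separated by ";".
keywords <- c(
  "BIBLIOMETRICS;CITATION ANALYSIS",
  "BIBLIOMETRICS;SCIENCE MAPPING",
  "SCIENCE MAPPING;CITATION ANALYSIS;BIBLIOMETRICS"
)

# Split each string into a vector of trimmed terms.
terms <- lapply(strsplit(keywords, ";", fixed = TRUE), trimws)
vocab <- sort(unique(unlist(terms)))

# Document-by-term incidence matrix (1 if the term appears in the document).
A <- t(vapply(terms, function(x) as.numeric(vocab %in% x),
              numeric(length(vocab))))
colnames(A) <- vocab

# Term co-occurrence matrix: entry (i, j) counts the documents containing
# both terms; the diagonal holds the term frequencies.
cooc <- crossprod(A)
```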
(Multiple) Correspondence Analysis
Co-word analysis draws clusters of keywords. These clusters are considered themes, whose density and centrality can be used to classify them and map them in a two-dimensional diagram.
The thematic map is a very intuitive plot in which themes can be analyzed according to the quadrant in which they are placed:
(1) upper-right quadrant: motor-themes;
(2) lower-right quadrant: basic themes;
(3) lower-left quadrant: emerging or disappearing themes;
(4) upper-left quadrant: very specialized/niche themes.
Please see Cobo, M. J., Lopez-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.
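The quadrant logic can be written as a small classifier. In this sketch centrality is taken as the horizontal axis and density as the vertical axis. The helper name classify_theme and the choice of the median as the cut point are assumptions for illustration, and the values are made up.

```r
# Hypothetical helper: assign each theme to a quadrant of the thematic map.
# Cut points default to the medians (an assumption; other choices are possible).
classify_theme <- function(centrality, density,
                           centrality_cut = median(centrality),
                           density_cut = median(density)) {
  high_c <- centrality > centrality_cut
  high_d <- density > density_cut
  ifelse(high_c & high_d, "motor theme",
    ifelse(high_c & !high_d, "basic theme",
      ifelse(!high_c & high_d, "niche theme",
             "emerging or disappearing theme")))
}

# Toy themes with made-up centrality and density values.
themes <- data.frame(
  theme      = c("T1", "T2", "T3", "T4"),
  centrality = c(0.9, 0.8, 0.1, 0.2),
  density    = c(0.7, 0.2, 0.1, 0.9),
  stringsAsFactors = FALSE
)
themes$quadrant <- classify_theme(themes$centrality, themes$density)
```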
Citation analysis is one of the main classic techniques in bibliometrics. It shows the intellectual structure of a specific field through the linkages between nodes (e.g. authors, papers, journals), while the edges can be interpreted differently depending on the network type, namely co-citation, direct citation, or bibliographic coupling. Please see Aria, Cuccurullo (2017).
biblioshiny can build:
o co-citation networks that show relations between cited-reference works (Field: “Papers”);
o co-citation networks that use cited-authors as unit of analysis (Field: “Authors”);
o co-citation networks that use cited-journals as unit of analysis (Field: “Sources”).
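Co-citation counts can be derived from a document-by-reference matrix with a single matrix product. The sketch below uses made-up data and base R only. It also shows bibliographic coupling for contrast, since coupling uses the same matrix multiplied the other way round.

```r
# Toy citation matrix: rows are citing documents, columns are cited references.
# A[i, j] = 1 if document i cites reference j.
A <- matrix(
  c(1, 1, 0,
    1, 1, 1,
    0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("refA", "refB", "refC"))
)

# Co-citation: how many documents cite both references (t(A) %*% A).
cocitation <- crossprod(A)

# Bibliographic coupling: how many references two documents share (A %*% t(A)).
coupling <- tcrossprod(A)
```

Using cited authors or cited journals as columns instead of cited papers gives the other two co-citation networks listed above.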
Useful dimensions for interpreting co-citation networks are:
(i) centrality and peripherality of nodes;
(ii) their proximity and distance;
(iii) strength of ties;
(iv) clusters;
(v) bridging contributions.
A historiograph is built on direct citations. It draws the intellectual linkages in a historical order.
The cited works of thousands of authors contained in a collection of published scientific articles are sufficient for reconstructing the historiographic structure of the field, calling out its basic works.
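The idea of ordering direct citations historically can be sketched as follows. The paper identifiers, years and citation links are made up, and the ordering rule (by year of the cited paper, then by year of the citing paper) is an assumption chosen for illustration.

```r
# Publication year of each paper in the collection (made-up data).
year <- c(P1 = 2007, P2 = 2011, P3 = 2015)

# Direct citations between papers of the collection.
cites <- data.frame(
  citing = c("P3", "P2", "P3"),
  cited  = c("P1", "P1", "P2"),
  stringsAsFactors = FALSE
)

# Attach years and sort the links chronologically.
cites$cited_year  <- unname(year[cites$cited])
cites$citing_year <- unname(year[cites$citing])
histo <- cites[order(cites$cited_year, cites$citing_year), ]
```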
Collaboration networks show how authors, institutions (e.g. universities or departments) and countries relate to others in a specific field of research.
The next figure is a co-author network. It reveals regular study groups, hidden groups of scholars, and pivotal authors.
biblioshiny can also build social networks of institutions or countries, which uncover the relevant institutions or main countries in a specific research field and their relations.
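A collaboration network can be sketched from author lists in the same spirit. The example uses made-up author labels and takes the number of distinct collaborators (degree) as a simple proxy for pivotal authors. That proxy is an assumption, not the measure biblioshiny necessarily reports.

```r
# Toy collection: one character vector of author labels per article.
authors <- list(c("A", "B"), c("A", "C"), c("A", "B", "D"), c("E"))
all_authors <- sort(unique(unlist(authors)))

# Article-by-author incidence matrix.
A <- t(vapply(authors, function(x) as.numeric(all_authors %in% x),
              numeric(length(all_authors))))
colnames(A) <- all_authors

# Co-authorship matrix: entry (i, j) counts the articles written together.
collab <- crossprod(A)
diag(collab) <- 0

# Degree: number of distinct collaborators of each author.
degree <- rowSums(collab > 0)
names(which.max(degree))
```

Replacing author labels with institutions or countries gives the corresponding institution or country network.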
(about bibliometrix and science mapping)
Aria, M. & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis, Journal of Informetrics, 11(4), pp 959-975, Elsevier, DOI: 10.1016/j.joi.2017.08.007
Cuccurullo, C., Aria, M., & Sarto, F. (2016). Foundations and trends in performance management. A twenty-five years bibliometric analysis in business and public administration domains, Scientometrics, DOI: 10.1007/s11192-016-1948-8
Cuccurullo, C., Aria, M., & Sarto, F. (2015). Twenty years of research on performance management in business and public administration domains. Presentation at the Correspondence Analysis and Related Methods conference (CARME 2015) in September 2015.
Sarto, F., Cuccurullo, C., & Aria, M. (2014). Exploring healthcare governance literature: systematic review and paths for future research. Mecosan.
Cuccurullo, C., Aria, M., & Sarto, F. (2013). Twenty years of research on performance management in business and public administration domains. In Academy of Management Proceedings (Vol. 2013, No. 1, p.14270). Academy of Management DOI: 10.5465/AMBPP.2013.14270abstract