Keywords: biomedical research, co-authorship network, hub, research center, shared core facility, bibliometric
We analyzed co-authorship patterns within the National Institutes of Health Center of Biomedical Research Excellence in Matrix Biology program from 2014 to 2022. In this study, we analyzed junior investigators, senior researchers, and research scientists within a shared core facility. Social network analysis techniques were applied to evaluate the co-authorship network based on journal publications from members of the center. The results indicated that co-authorship network visualization and analysis is a useful tool for understanding the relationship between a shared core facility and young investigators within a research center. Young investigators collaborated with and relied upon the individual research scientists of the shared core facility to serve as contributing members of their extended research team. This reliance on the shared core facility effectively increases the size and productivity of the research team led by the young investigator. Our results indicate that shared core facility staff may serve as hubs within the network of biomedical researchers, particularly at institutions with a growing research emphasis.
ADDRESS CORRESPONDENCE TO: Julia Thom Oxford, 1910 University Dr., Biomolecular Research Center, Boise, Idaho 83725, USA (Phone: 208-426-2238; Fax: 208-426-2237; E-mail: [email protected])
Conflict of Interest Statement: The authors have no conflicts of interest to declare.
Boise State University, a historically undergraduate teaching university, has increased its focus on research over the past 2 decades. Funding for biomedical research from the National Institutes of Health (NIH) increased from less than $440,000 annually in 2002 to more than $6.6M annually in 2022, representing approximately a 15-fold increase in extramurally funded biomedical research activity.
Boise State University was awarded a Center of Biomedical Research Excellence (COBRE) in Matrix Biology grant from NIH in 2014. The COBRE in Matrix Biology completed Phase I (2014-2019) and is currently in Phase II (2019-2024). The COBRE program supports research growth within a multidisciplinary thematic focus, with an established investigator as the program director. As a requirement, the program director must have expertise within the thematic research area and, in this respect, may serve as the initial organizing hub of the scientific network at the outset of the center activity. Additionally, the COBRE program emphasizes the central role that shared core facilities play in the productivity of researchers. We hypothesized that research scientists within a shared core facility may also play the role of hubs within the new biomedical research network that is formed by the establishment of a COBRE-funded research center. Furthermore, we hypothesized that shared core facilities may be particularly beneficial to the career development of young investigators within the COBRE program.
The objective of this study was to assess the extent to which shared core facility staff members serve as hubs for biomedical research. To do this, we carried out social network analysis to understand the structure of the network and to identify which nodes play significant roles within the network. In this study, we analyzed publication patterns of junior investigators, senior researchers, and research scientists within the Biomolecular Research Center, a shared core facility. A hub in network science is a high-degree node, which is a node that has a greater number of links to other nodes than the average node in the network.
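The definition of a hub given above can be illustrated with a minimal sketch. The toy edge list and node names below are hypothetical and are not taken from the COBRE dataset; the threshold (degree strictly above the network mean) is one simple reading of "a greater number of links than the average node."

```python
from collections import defaultdict

def degrees(edges):
    """Count the distinct neighbors of each node in an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}

def high_degree_nodes(edges):
    """Return nodes whose degree is strictly above the mean degree."""
    deg = degrees(edges)
    mean_degree = sum(deg.values()) / len(deg)
    return sorted(n for n, d in deg.items() if d > mean_degree)

# Hypothetical toy co-authorship network (illustration only).
toy_edges = [
    ("core_scientist", "pi_1"),
    ("core_scientist", "pi_2"),
    ("core_scientist", "student_1"),
    ("pi_1", "student_1"),
    ("pi_2", "student_2"),
]
hubs = high_degree_nodes(toy_edges)
print(hubs)
```

In this toy network only the node labeled core_scientist exceeds the mean degree of 2.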
Social network analysis techniques were applied to evaluate the COBRE in Matrix Biology. Bibliometric data were analyzed using cited publications obtained from PubMed.gov. The workflow used to conduct the analysis is shown in Figure 1. A MEDLINE text file from the US National Library of Medicine containing bibliometric data was generated from PubMed.gov and included 200 peer-reviewed, published papers that cited the NIH grant #P20GM109095 from 2014 to 2022. The generated MEDLINE text file was uploaded into VOSviewer with a thesaurus that was created to resolve the duplication of authors who have more than 1 publishing name. VOSviewer was used to create a Graph Modeling Language (GML) file. The GML file was uploaded to Gephi to create a network map using the ForceAtlas2 algorithm and to generate statistical and metric information used for an analysis of our network. Statistical metrics generated by Gephi were used to further analyze the network. The data we used for analysis included betweenness centrality, closeness centrality, clustering coefficient, weight, degree, hub, PageRank, triangles, eigencentrality, and modularity for each node within the network. Violin plots were created with RAWGraphics to analyze clustering coefficients and the modularity of the network.
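The first steps of this workflow (reading author fields from a MEDLINE text file, merging name variants with a thesaurus, and counting co-authored papers per author pair) can be sketched in plain Python. This is a simplified illustration, not the procedure VOSviewer applies internally: the sample records, author names, and thesaurus entry are hypothetical, continuation lines are not handled, and each shared paper is assumed to add 1 to the edge weight.

```python
from collections import Counter
from itertools import combinations

def parse_medline_authors(text):
    """Return one list of AU (author) values per record in MEDLINE-format text."""
    records, current = [], None
    for line in text.splitlines():
        if line.startswith("PMID- "):      # a PMID line starts a new record
            current = []
            records.append(current)
        elif line.startswith("AU  - ") and current is not None:
            current.append(line[6:].strip())
    return records

def coauthor_edges(records, thesaurus=None):
    """Count shared papers for every author pair after merging name variants."""
    thesaurus = thesaurus or {}
    weights = Counter()
    for authors in records:
        merged = sorted({thesaurus.get(a, a) for a in authors})
        for pair in combinations(merged, 2):
            weights[pair] += 1
    return weights

# Hypothetical records and thesaurus (illustration only).
SAMPLE = """\
PMID- 1
TI  - Hypothetical paper one
AU  - Doe J
AU  - Roe R

PMID- 2
TI  - Hypothetical paper two
AU  - Doe JA
AU  - Roe R
AU  - Poe E
"""
THESAURUS = {"Doe JA": "Doe J"}  # maps a variant name to a single label

edges = coauthor_edges(parse_medline_authors(SAMPLE), THESAURUS)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Without the thesaurus entry, "Doe J" and "Doe JA" would appear as two separate nodes, which is the duplication the thesaurus is meant to resolve.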
The results indicated that co-authorship network visualization and analysis are useful tools for understanding the relationship between a shared core facility and investigators within a research center. The results of our analysis provide information about the centrality of the nodes representing research scientists within the shared core facility, affiliated investigators, and all other co-authors.
The COBRE in Matrix Biology co-authorship network for 2014 to 2022 was constructed using Gephi (Figure 2). The co-authorship network includes 664 nodes and 3917 edges from 200 COBRE in Matrix Biology–cited papers. ForceAtlas2 was used as the layout algorithm for the force-directed network graph. In this layout, the edge weight defines the strength of the connection; nodes repel each other like charged particles, while edges attract nodes like springs. Figure 2 demonstrates that the core research scientists (orange nodes) within the network map make up approximately 2% of the nodes and represent some of the highest degree nodes, reflected by the diameter of the nodes within the graph, and several of these are located within the central region of the graph. Investigators that rely on the services of the shared core facility (blue nodes) make up approximately 7% of the nodes within the network. The remaining nodes, approximately 90%, represent students, collaborators, and research personnel within individual laboratories.
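As a rough worked example based on the reported counts, the mean degree and density of the network can be estimated directly. This assumes a simple undirected graph in which each of the 3917 edges joins a distinct pair of authors; these derived values are not reported in the study itself.

```python
nodes = 664    # reported number of authors (nodes)
edges = 3917   # reported number of co-authorship links (edges)

# Each undirected edge contributes to the degree of two nodes.
mean_degree = 2 * edges / nodes

# Density: observed edges divided by the number of possible node pairs.
possible_pairs = nodes * (nodes - 1) / 2
density = edges / possible_pairs

print(f"mean degree = {mean_degree:.2f}")
print(f"density     = {density:.4f}")
```

Under these assumptions, the average author has roughly 12 co-authors, and fewer than 2% of all possible author pairs are linked, so a node with several dozen links sits far above the average.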
Correlations among the calculated parameters for each node were identified by a Pearson’s correlation and visualized as a heat map (Figure 3). A strong correlation between attributes was observed for “degree,” “eigencentrality,” “betweenness,” and “hub.”
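A pairwise Pearson correlation of the kind summarized in the heat map can be sketched as follows. The metric values are hypothetical and chosen only to show the calculation; the plotting step is omitted.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_matrix(columns):
    """Correlate every pair of named columns (one value per node)."""
    names = list(columns)
    return {(p, q): pearson(columns[p], columns[q]) for p in names for q in names}

# Hypothetical per-node metrics for five nodes (illustration only).
metrics = {
    "degree": [1, 2, 3, 4, 10],
    "eigencentrality": [0.1, 0.2, 0.3, 0.4, 1.0],
    "clustering": [1.0, 0.8, 0.9, 0.5, 0.2],
}
matrix = correlation_matrix(metrics)
for (p, q), r in matrix.items():
    print(f"{p:16s} {q:16s} {r:+.2f}")
```

Each cell of the resulting matrix corresponds to one cell of a correlation heat map.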
Plots of eigencentrality, betweenness, and hub as a function of degree for each node indicated that as degree increased, so did eigencentrality, betweenness, and hub (Figure 4). Orange data points within the plots indicate shared core facility research staff overlaid on other data points within the network. Shared core facility research staff had relatively high values for degree, eigencentrality, and hub compared to other individuals within the network. In contrast, shared core facility research staff did not always have high values for betweenness.
A modularity analysis of the network indicated that the network consists of 18 communities. These were analyzed for frequency distribution of degree, closeness centrality, eigencentrality, and hub as shown in Figure 5. The highest degree individuals in this network included shared core facility staff members (blue box, Figure 5, A). Shared core facility personnel were located in neighborhoods 2, 5, 14, 15, and 16. The majority of shared core facility personnel are represented by data points within the upper region of 14 (circled in red, Figure 5, B, C, and D) and the upper region of 15 (circled in black, Figure 5, B, C, and D). In addition to shared core facility personnel, the neighborhoods indicated by modularity studies (2, 5, 14, 15, and 16) were those that included young investigators that had participated in mentored career development programs as part of the COBRE in Matrix Biology and were also successful in their efforts to establish independent research programs subsequent to COBRE center support. Specifically, neighborhood 2 includes data points that represent 2 individuals who were supported as research project leads and then went on to compete successfully for R01 grants from NIH. Neighborhood 5 includes a data point representing an investigator who developed a new COBRE application. Neighborhood 14 includes data points representing core research scientists and other research leaders on campus. Neighborhood 15 includes data points representing a COBRE investigator who has subsequently established a research center. Neighborhood 16 includes an investigator who established a biotechnology company.
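For readers unfamiliar with modularity, the sketch below scores a given division of a network into communities using the standard modularity formula for an unweighted, undirected graph. It does not perform community detection itself, and it is not the routine used by Gephi; the toy graph and community labels are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q of a partition: sum over communities of
    (fraction of edges inside) - (fraction of edge ends attached) ** 2."""
    m = len(edges)
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy graph: two triangles joined by a single bridging edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
print(round(q_split, 4), round(q_single, 4))
```

Splitting the toy graph into its two triangles gives a positive score, while placing every node in one community gives zero, which is why higher modularity is read as stronger community structure.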
Our analysis of co-authorship patterns among scientists within a research center that includes a shared core facility, the COBRE in Matrix Biology, showed that co-authorship network visualization and analysis can provide a useful set of tools to better understand the relationship between a center-based thematic research focus with access to shared core facilities and research productivity and career development of investigators. We found a strong correlation between “degree,” “eigencentrality,” “betweenness,” and “hub” and that shared core facility research staff had relatively high values for degree, eigencentrality, and hub compared to the majority of nodes within the network. This study also analyzed the modularity of the network and found that neighborhoods or communities that included shared core facility research staff also included successful young investigators, suggesting that an association with shared core facilities can help to foster the career success of young investigators. We cannot exclude the possibility that our dataset is incomplete, because some investigators may utilize the shared core facility but fail to cite the grant that supported it. The challenge that shared core facilities face in having their contribution represented in the resulting publications has been documented in the literature. While we believe that these limitations have not impacted the major conclusions and the overall outcomes of the study, future work may seek to determine to what extent the dataset is incomplete and to what extent that may influence our conclusions.
Social networks play crucial roles in problem-solving approaches, and the success in achieving goals and the relationships between nodes and the underlying social structure can be understood through an analysis of the network. Key players in the network may be identified as hubs. In our study, shared core facility research personnel had high hub values and may therefore be considered key players in the network. A hub in a network may facilitate the collaborative community. At Boise State, scientific collaborations are facilitated by technological advances and local access to shared core facilities. Collaborations are limited, however, by the geographical distance between locations and a lack of resources. In states with low populations and geographically isolated universities, such as Idaho, collaboration is a challenge. Collaborations and shared core facilities can increase research productivity and outcomes. Collaborations can also increase research quality and lead to co-authored publications. Collaboration and co-authorship can produce a higher research impact, as the work can be multidisciplinary. Successful establishment and growth of an academic biomedical research program such as the COBRE in Matrix Biology can overcome existing challenges.
Node importance in a network is based on connections. Degree centrality is determined by the number of connections to a given node. Here, we analyze betweenness and eigencentrality as well as the hub statistical information of the nodes to better understand the network. Centrality measurements are fundamental to social networks, as they identify important or central nodes of the network. Betweenness centrality values are determined by the number of shortest paths over all pairs of nodes passing through a given node.
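The shortest-path definition of betweenness centrality can be made concrete with a short sketch of Brandes' algorithm for an unweighted, undirected graph. The toy graph is hypothetical, the scores are left unnormalized, and Gephi's own implementation and normalization may differ.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm) for an
    unweighted, undirected graph given as {node: [neighbors]}."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy graph: two triangles joined by the edge c-d.
toy_adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
bc = betweenness(toy_adj)
print(bc)
```

In this toy graph the two bridging nodes carry every shortest path between the triangles, while the remaining nodes lie on no shortest path between other nodes and score zero.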
Centrality indicators used in this study included eigencentrality and hub, which are related to degree centrality but depend not only on the number of connections a node has but also on whether those connections are to high-degree nodes. Eigencentrality was determined to assess the centrality of a node as proportional to its neighbors’ importance. Nodes representing core facility staff scientists with a high eigenvector value and degree value have a high number of connections and were more influential within the network. For undirected networks such as this one, eigencentrality and hub provide the same information, as was also seen in our study. The significance of nodes that play the role of a hub in a network relates to their ability to effectively decrease the distance between nodes or increase the effectiveness of the network. Hubs can be identified by their large number of links, and they may serve to bridge and connect smaller degree nodes. The presence of hubs contributes to the strength of a network.
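The observation that eigencentrality and hub coincide for an undirected network can be checked with a small power-iteration sketch. In an undirected graph the adjacency matrix equals its transpose, so the hub update of the HITS algorithm amounts to applying the adjacency matrix twice. The toy graph is hypothetical, and the sketch assumes a connected graph that is not bipartite so that plain power iteration converges; scores are scaled to sum to 1, which may differ from the scaling used by Gephi.

```python
def normalize(vec):
    """Scale scores so they sum to 1."""
    total = sum(vec.values())
    return {k: v / total for k, v in vec.items()}

def multiply(adj, vec):
    """Apply the adjacency matrix: each node sums its neighbors' scores."""
    return {v: sum(vec[w] for w in adj[v]) for v in adj}

def eigencentrality(adj, iterations=200):
    """Eigenvector centrality by power iteration."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        x = normalize(multiply(adj, x))
    return x

def hits_hub(adj, iterations=200):
    """HITS hub scores; with an undirected graph A^T equals A."""
    hub = {v: 1.0 for v in adj}
    for _ in range(iterations):
        authority = normalize(multiply(adj, hub))  # a = A^T h
        hub = normalize(multiply(adj, authority))  # h = A a
    return hub

# Hypothetical toy graph: a triangle a-b-c with one extra node d attached to c.
toy_adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
eig = eigencentrality(toy_adj)
hub = hits_hub(toy_adj)
for node in toy_adj:
    print(node, round(eig[node], 4), round(hub[node], 4))
```

The two columns agree for every node, and the node with the most links to well-connected neighbors ranks highest on both measures.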
Our analysis of neighborhoods revealed 18 neighborhoods within our network. In the context of laboratories at an academic institution, each neighborhood could represent predominant laboratories, and the presence of shared core facilities research staff within individual neighborhoods may indicate that the shared core facility staff member is serving as an ex officio member of the laboratory based on their position in the shared core and their expertise that is of value to the laboratory.
Our finding that node betweenness centrality was not completely associated with nodes representing shared core facility staff did not support our hypothesis. Authors with the most publications are not necessarily most important to the network connectivity, and interestingly, shared core facility research staff did not always demonstrate the highest values for betweenness centrality. The concept of betweenness centrality assigns the importance of the node centrality as the potential of a node for the control of information flow in the network. Nodes are structurally central to the degree to which they stand between others and can therefore facilitate, impede, or bias the transmission of information. Freeman defined the betweenness centrality of a graph as the average difference between the measures of centrality of the most central nodes and that of all other nodes.
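The graph-level measure described in the last sentence can be written out explicitly. The expression below is a sketch of one common way to state it, assuming unnormalized betweenness counted over unordered pairs of nodes; here $n$ is the number of nodes, $C_B(p_i)$ is the betweenness of node $p_i$, and $C_B(p^*)$ is the largest betweenness observed in the graph. The denominator is the largest value the numerator can take, which is attained by a star graph, where the center has betweenness $(n-1)(n-2)/2$ and each of the other $n-1$ nodes has betweenness 0.

```latex
C_B \;=\; \frac{\sum_{i=1}^{n} \left[ C_B(p^*) - C_B(p_i) \right]}
               {(n-1)\,\dfrac{(n-1)(n-2)}{2}}
    \;=\; \frac{2 \sum_{i=1}^{n} \left[ C_B(p^*) - C_B(p_i) \right]}
               {(n-1)^2 (n-2)}
```

With this scaling the index is 1 for a star graph and 0 when all nodes have equal betweenness.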
While our study analyzed data spanning 9 years, we did not investigate the dynamics or growth over time. Additionally, our study does not answer the questions of how and why authors work together to co-author manuscripts, and future studies focused on these questions could provide useful information. We also did not study the impact that students and postdoctoral fellows have in the network, and this also could provide valuable information about the larger biomedical research network. Potential follow-up research studies may include modeling that would allow predictions of the dynamics of the network. Such models may be critical during the planning of shared core facility sustainability.
In conclusion, research investigators collaborate with and rely upon the core research scientists of the shared core facility to strengthen their extended research team. Shared core facilities effectively increase the size and productivity of research teams, particularly those led by a young investigator. Social network analysis indicates that shared core facilities may serve as hubs within a network of biomedical researchers and that this approach is effective at a primarily undergraduate or R2 institution with a growing research emphasis. Research scientists within shared core facilities can serve key roles in networks among investigators at emerging research institutions such as Boise State University.
The project described was supported by Institutional Development Awards from the National Institute of General Medical Sciences of the NIH under Grants P20GM103408 and P20GM109095. We acknowledge support from the Biomolecular Research Center, RRID:SCR_019174, Duane and Lori Stueckle, and the Idaho State Board of Education.