Drug Design, Development and Therapy, Volume 17

Mapping Research Trends of Medications for Multidrug-Resistant Pulmonary Tuberculosis Based on the Co-Occurrence of Specific Semantic Types in the MeSH Tree: A Bibliometric and Visualization-Based Analysis of PubMed Literature (1966–2020)

Authors Xu S, Fu Y, Xu D, Han S, Wu M, Ju X, Liu M, Huang DS, Guan P

Received 22 February 2023

Accepted for publication 28 June 2023

Published 10 July 2023 Volume 2023:17 Pages 2035–2049

DOI https://doi.org/10.2147/DDDT.S409604

Checked for plagiarism Yes

Review by Single anonymous peer review


Editor who approved publication: Professor Manfred Ogris



Shuang Xu,1 Yi Fu,2 Dan Xu,1 Shuang Han,1 Mingzhi Wu,3 Xinrong Ju,1 Meng Liu,1 De-Sheng Huang,4,5 Peng Guan4,6

1Library of China Medical University, Shenyang, Liaoning, People’s Republic of China; 2School of Health Management, China Medical University, Shenyang, Liaoning, People’s Republic of China; 3Library of Shenyang Pharmaceutical University, Shenyang, Liaoning, People’s Republic of China; 4Key Laboratory of Environmental Stress and Chronic Disease Control & Prevention (China Medical University), Ministry of Education, Shenyang, Liaoning, People’s Republic of China; 5Department of Intelligent Computing, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, People’s Republic of China; 6Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, People’s Republic of China

Correspondence: De-Sheng Huang; Peng Guan, China Medical University, Shenyang, Liaoning, 110122, People’s Republic of China

Background: Before the COVID-19 pandemic, tuberculosis was the leading cause of death from a single infectious agent worldwide for 30 years. Progress in the control of tuberculosis has been undermined by the emergence of multidrug-resistant tuberculosis. The aim of the study is to reveal the trends of research on medications for multidrug-resistant pulmonary tuberculosis (MDR-PTB) through a novel bibliometric method based on the co-occurrence of Medical Subject Headings (MeSH) terms of specific semantic types.
Methods: PubMed was used to identify the original publications related to medications for MDR-PTB. An R package for text mining of PubMed, pubMR, was adopted to extract data and construct the co-occurrence matrix of specific semantic types. Biclustering analysis of the high-frequency MeSH term co-occurrence matrix was performed by gCLUTO. Scientific knowledge maps were constructed by VOSviewer to create overlay visualization and density visualization. Burst detection was performed by CiteSpace to identify future research hotspots.
Results: Two hundred and eight substances (chemical, drug, protein) and 147 diseases related to MDR-PTB were extracted to form a specific semantic co-occurrence matrix. MeSH terms with a frequency greater than or equal to six were selected to construct a high-frequency co-occurrence matrix (42 × 20) of specific semantic types, containing 42 substances and 20 diseases. Biclustering analysis divided the medications for MDR-PTB into five clusters and reflected the characteristics of drug composition. The overlay map indicated the average age gradients of the 42 high-frequency drugs. Fifteen top keywords and 37 top terms with the strongest citation bursts were detected.
Conclusion: This study evaluated the literature related to MDR-PTB drug therapy, providing a co-occurrence matrix model based on specific semantic types and a new attempt at text knowledge mining. Compared with macro knowledge structure or hot spot analysis, this method may have a wider scope of application and allow a more in-depth analysis.

Keywords: multidrug-resistant tuberculosis, pulmonary tuberculosis, medication trends, specific semantic types, MeSH tree, pubMR

Introduction

Tuberculosis (TB) has existed since written records began; DNA evidence for the presence of TB has even been found in mummies from ancient Egypt.1 It has been reported as the leading cause of death from a single infectious agent worldwide, ranking above HIV/AIDS.2,3 The emergence of multidrug-resistant tuberculosis (MDR-TB) poses an increasing threat to the vision of the End TB Strategy of the World Health Organization (WHO) for the year 2035.4 MDR-TB is defined as resistance to both rifampicin and isoniazid; the treatment of diagnosed cases of MDR-TB is more difficult and requires drugs that cause more side effects.5,6

Multidrug-resistant pulmonary tuberculosis (MDR-PTB) is a special form of TB that belongs to refractory pulmonary tuberculosis (PTB). It is caused by Mycobacterium tuberculosis (MTB),7 and is resistant to both isoniazid and rifampicin. MTB uses macrophages as host cells and anchors on their surface by binding to cholesterol. Moreover, there are many antigen receptors (immunoglobulin receptor, Toll-like receptor, complement receptor, etc.) on the surface of macrophages. MTB can bind to these receptors and enter macrophages through receptor-mediated endocytosis, forming phagosomes.8 Phagosomes generally kill pathogens, so the fate of MTB in phagosomes depends on whether it can escape destruction by lysosomal reactive oxygen intermediates and reactive nitrogen intermediates. In general, TB bacteria spread through the air from one person to another. The bacteria usually attack the lungs directly, leading to chest pain, weakness, weight loss, fever, night sweats and blood-stained sputum; sometimes they also affect the brain, kidneys and spine.9

The first-line anti-tuberculosis drugs are isoniazid, rifampicin, pyrazinamide, streptomycin and ethambutol. The genes related to isoniazid resistance mainly include katG and inhA;10 mutations in these genes can cause resistance of MTB to isoniazid. Mutations of the rpoB, pncA and embB genes are the main reasons for the resistance of MTB to rifampicin, pyrazinamide and ethambutol, respectively. The main mutated gene in streptomycin-resistant tuberculosis is rpsL, followed by rrs; mutations in either gene lead to drug resistance.10 In 1944, antibiotics were introduced to treat PTB,11 which was a breakthrough. However, due to failures of prevention and treatment, global poverty, and the emergence of drug-resistant pulmonary tuberculosis, the disease made a comeback in the 1990s.12

In dealing with the challenges to end tuberculosis, intensive global efforts have been made in various areas, including the improvement of surveillance systems, increased funding for essential TB services and research, innovations in testing and diagnostic techniques, expansion of investment, and the call for political commitment to the fight against TB.13–16 Among them, research on medications for drug-resistant tuberculosis is a key element for directing global therapy, especially in high TB burden countries. Countries, institutions, authors, journals and the number of publications on multidrug-resistant (MDR), extensively drug-resistant (XDR), and totally drug-resistant (TDR) TB have been examined in bibliometric studies.17 However, to date, no bibliometric analysis has studied the development trend of medications for MDR-PTB.

The treatment of MDR-PTB is a challenge for the whole healthcare system. Quickly understanding the current state of the target research field, or the progressive association between research outputs, is impractical when there is an enormous number of publications on the subject. A bibliometric study on the medications for MDR-PTB is therefore relevant, especially to policy makers and researchers, in giving guidance on the development trends in the field. It gives them a clearer idea of where to collate information about the medications for MDR-PTB, so that they can quickly understand the progress and assess the possibility of incorporating new knowledge into TB prevention and control programs or into their own research projects. Hence, the present study aims to present directions and trends in medications currently available to treat MDR-PTB through a novel bibliometric method based on the co-occurrence of Medical Subject Headings (MeSH) terms of specific semantic types.

Bibliometrics is a set of statistical methods that can quantitatively measure the distribution, correlation and clustering of relevant literature. Semantics are the meanings of the concepts represented by the real-world objects to which data correspond. Semantic relationships are the relationships between these meanings, which are the interpretation and logical representation of data in the field.18 Semantic types are classes divided by semantic relationships based on different classification criteria. pubMR can extract disease and substance names (chemical substances, drugs, proteins, etc.) from the PubMed collection of drug treatment literature to form a co-occurrence matrix. The biclustering heatmap and social network graph generated from the disease-substance co-occurrence matrix may reveal the current research status of drug treatment for a disease. Visual analysis of the field of drug therapy for a certain disease can intuitively display the network relationships between the diseases and medications related to it. In this study, the global research trends of medications and their relationship with related diseases were analyzed by bibliometrics based on the co-occurrence of specific semantic types in the MeSH tree. A MeSH term co-occurrence matrix of distinct semantic types was constructed to map the micro-level knowledge structure (or status) of a scientific topic, such as recognizing specific diseases and drugs and extracting the relationships between them.19 The study was carried out to inform global medication for MDR-PTB and to help researchers adjust the research directions of vaccine and drug development. The model can also be applied to the co-occurrence of other semantic types to explore other aspects of clinical diseases, such as drugs and side effects, or tumors and genes.

Materials and Methods

MeSH Tree Structure

MeSH is an authoritative thesaurus compiled by the US National Library of Medicine (NLM); it is normalized, expandable, and dynamic. MeSH maps all related free terms to a single MeSH term, so that one MeSH term represents all free words with the same meaning. This provides uniformity and consistency for the indexing and cataloging of biomedical literature.20 Because of the branching structure of the hierarchies, the lists of MeSH terms are referred to as the MeSH tree structure. The MeSH tree structure classifies the MeSH terms into 16 main branches according to their semantic types and subject attributes, and most branches are further subdivided hierarchically, down to as many as 13 levels. The 16 categories are represented by the letters A to N, V and Z (Figure 1A). Each descriptor is followed by one or more numbers truncated at the third level that indicate its tree location. The levels within a tree number are separated by dots. In each category, the MeSH terms are arranged from hypernym to hyponym. Hierarchical subordination is expressed by step-by-step indentation, and MeSH terms of the same level are arranged alphabetically. Generally, a MeSH term belonging to a category is given a descriptor followed by its associated numbers. Some MeSH terms, however, have two or more attributes. These MeSH terms may belong to two or more categories at the same time, and the corresponding tree numbers are also given in the other categories. The tree number of “Tuberculosis, Multidrug-Resistant” is “C01.150.252.410.040.552.846.775”, while that of its hyponym “Extensively Drug-Resistant Tuberculosis” is “C01.150.252.410.040.552.846.775.500” (Figure 1B). This means that all references indexed with “Extensively Drug-Resistant Tuberculosis” will be retrieved when “Tuberculosis, Multidrug-Resistant” is used as the search term.
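The prefix relationship between the two tree numbers quoted above can be illustrated with a minimal base R sketch. The helper names `is_descendant` and `tree_category` are hypothetical and are introduced here only for illustration; they are not part of pubMR or of any NLM tool.

```r
# Hypothetical helpers, for illustration only.

# TRUE when `child` lies strictly below `parent` in the MeSH tree,
# i.e. the child's tree number extends the parent's by further dotted levels.
is_descendant <- function(child, parent) {
  startsWith(child, paste0(parent, "."))
}

# The leading letter of a tree number identifies its top-level category.
tree_category <- function(tree_number) {
  substr(tree_number, 1, 1)
}

mdr_tb <- "C01.150.252.410.040.552.846.775"      # Tuberculosis, Multidrug-Resistant
xdr_tb <- "C01.150.252.410.040.552.846.775.500"  # Extensively Drug-Resistant Tuberculosis

is_descendant(xdr_tb, mdr_tb)  # XDR-TB sits below MDR-TB
is_descendant(mdr_tb, xdr_tb)  # the reverse does not hold
tree_category(mdr_tb)          # category letter of the term
```

Appending a dot before the prefix test prevents a sibling such as a tree number ending in “.7751” from being mistaken for a descendant of one ending in “.775”.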

Figure 1 MeSH tree structure.

Notes: (A) MeSH tree structure 16 categories; (B) MeSH tree structure hierarchical subordination. Reproduced from National Library of Medicine.21

According to the MeSH tree structure, specific tree structure numbers can qualify MeSH terms with distinct semantic types. For example, the MeSH categories “Diseases” and “Chemicals and Drugs” are identified by “C” and “D” in the MeSH tree (Figure 1A). Therefore, the co-occurrence matrix of specific semantic types can be constructed from the tree structure numbers of the MeSH terms.19 Forming a specific semantic matrix in this way makes it possible to mine the bibliographic information on a customized subject.

Specific Semantic Types Co-Occurrence Matrix Construction by pubMR

Data extraction and co-occurrence matrix construction of specific semantic types were performed using pubMR, a PubMed text mining tool on the R platform developed by Professor Lei Cui and Professor Xiaobei Zhou from China Medical University, Shenyang, China. It is available for free online (https://github.com/xizhou/pubMR). pubMR integrates retrieval and download, parsing and extraction, basic statistics, multidimensional matrices, paper similarity, hotspot analysis, concept recognition and network analysis. The operational flowchart of pubMR is shown in Figure 2. Users can customize individual methods according to their objectives, such as “SimArticle”, which provides similar articles related to a target article; “Statisticor”, which provides basic statistics of information generated from PubMed; “Buzzindex”, which provides the score for burst words; and “Simmatrix”, which provides the MeSH–MeSH co-occurrence matrix.22

Figure 2 Operation process of pubMR, a PubMed text mining tool on R platform.

Before extracting PubMed data, three steps are necessary: installing the pubMR package, developing a search strategy related to medications for MDR-PTB, and running the following program on the R platform.

library(pubMR)

# PubMed search strategy: treatment-related subheadings of MDR-TB, restricted to pulmonary TB
m <- '("Tuberculosis, Multidrug-Resistant/diet therapy"[Mesh] OR "Tuberculosis, Multidrug-Resistant/drug therapy"[Mesh] OR "Tuberculosis, Multidrug-Resistant/nursing"[Mesh] OR "Tuberculosis, Multidrug-Resistant/prevention and control"[Mesh] OR "Tuberculosis, Multidrug-Resistant/radiotherapy"[Mesh] OR "Tuberculosis, Multidrug-Resistant/rehabilitation"[Mesh] OR "Tuberculosis, Multidrug-Resistant/surgery"[Mesh] OR "Tuberculosis, Multidrug-Resistant/therapy"[Mesh]) AND ("Tuberculosis, Pulmonary"[Mesh])'

# Retrieve and parse the matching PubMed records
obj <- txtList(input=m)
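Because the search string follows a regular pattern (one clause per subheading, joined by OR, then restricted to pulmonary tuberculosis), it can also be assembled programmatically, which reduces the risk of unbalanced quotes or parentheses. The following is a minimal base R sketch; the variable names are illustrative, and the subheading list is copied from the strategy above.

```r
# Subheadings taken from the search strategy in the text.
subheadings <- c("diet therapy", "drug therapy", "nursing",
                 "prevention and control", "radiotherapy",
                 "rehabilitation", "surgery", "therapy")

# One '"Heading/subheading"[Mesh]' clause per subheading.
clauses <- sprintf('"Tuberculosis, Multidrug-Resistant/%s"[Mesh]', subheadings)

# OR the clauses together, then restrict to pulmonary tuberculosis.
query <- paste0("(", paste(clauses, collapse = " OR "), ")",
                ' AND ("Tuberculosis, Pulmonary"[Mesh])')
```

The resulting `query` string could then be passed to `txtList(input=query)` in place of a hand-typed string.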

A total of 1439 publications were retrieved from PubMed using the above retrieval strategy up to February 4, 2021. After reviews, systematic reviews, meta-analyses, books, and documents were excluded, 1210 original articles remained. The retrieval results were independently screened and extracted by two investigators of the study group (SX and YF), and all discrepancies were resolved by the principal investigator (PG). Then, a co-occurrence matrix of disease and substance (chemical, drug, protein) was constructed by pubMR.

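library(data.table)  # data.table()
library(tidyr)       # unnest() and the %>% pipe

# One row per article: PMID plus its list of MeSH headings
obj1 <- data.table(PMID=obj@PMID,MH=obj@MH)

# Drop articles that have no MeSH headings
MH <- obj1[,MH]
idx <- sapply(MH,is.null)
obj1 <- obj1[!idx,]

# Expand to one row per (PMID, MeSH heading) pair
obj1 <- obj1 %>% unnest(MH)

# Term-by-article table, then term-by-term co-occurrence counts
V <- table(obj1[,c("MH", "PMID")])
V1 <- crossprod(t(V))

# Load the MeSH tree lookup hosted in the pubMR repository
meshtree <- "https://github.com/xizhou/pubMR/raw/master/meshtree2019.Rdata"
load(url(meshtree))

# Strip subheadings (everything from the first "/") from the term names
nms <- rownames(V1)
nms <- gsub("\\/.*", "", nms)

# Rows: category D (chemicals and drugs); columns: category C (diseases)
idr <- which(nms %in% meshtree[class=="D",mesh])
idc <- which(nms %in% meshtree[class=="C",mesh])
V2 <- V1[idr,idc]

write.csv(V2,file="result.csv")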

An initial 208×147 (category D × category C) co-occurrence matrix of specific semantic types was formed (Table 1), in which rows represent chemicals, drugs, and proteins and columns represent diseases associated with MDR-PTB. The Price equation was used to determine the threshold of the high-frequency MeSH terms.
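How a category D × category C matrix arises from article-level indexing can be seen on a toy example. The sketch below uses base R only; the articles, the term assignments, and the `category` lookup (a stand-in for the MeSH tree table) are all invented for illustration.

```r
# Toy input: one row per (article, MeSH term) pair. All values are invented.
pairs <- data.frame(
  PMID = c("1", "1", "1", "2", "2", "3", "3", "3"),
  MH   = c("Isoniazid", "Rifampin", "HIV Infections",
           "Isoniazid", "HIV Infections",
           "Rifampin", "Linezolid", "Pneumonia"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the MeSH tree lookup: term -> category letter.
category <- c(Isoniazid = "D", Rifampin = "D", Linezolid = "D",
              "HIV Infections" = "C", Pneumonia = "C")

# Term-by-article incidence matrix (1 when the term indexes the article).
V <- unclass(table(pairs$MH, pairs$PMID))

# V %*% t(V): entry [i, j] counts the articles indexed with both term i and term j.
co <- V %*% t(V)

# Keep substances (D) as rows and diseases (C) as columns.
rows <- rownames(co)[category[rownames(co)] == "D"]
cols <- rownames(co)[category[rownames(co)] == "C"]
co_dc <- co[rows, cols]
```

In this toy data, “Isoniazid” and “HIV Infections” share two articles, so the corresponding cell of `co_dc` is 2, while “Isoniazid” and “Pneumonia” never appear together and the cell is 0.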

Table 1 Co-Occurrence Matrix of Specific Semantic Types (208×147)

The Price equation is $M = 0.749\sqrt{N_{\max}}$, where Nmax represents the highest frequency of a high-frequency word and M represents the threshold of the high-frequency MeSH terms.23 In addition to rifampicin and isoniazid, ethambutol is the most frequent specific drug in this study. Nmax=67, thus $M = 0.749\sqrt{67} \approx 6.13$.

Therefore, a high-frequency co-occurrence matrix (42 × 20) of specific semantic types with the frequency equal to or more than six was constructed (Table 2).
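The threshold calculation can be reproduced in a few lines of R. This sketch assumes the commonly used form of Price's formula, M = 0.749 × √Nmax, and rounds M down to obtain the cut-off of six; the term names and counts in `freq` are invented, and only Nmax = 67 comes from the text.

```r
# Assumed form of Price's formula: M = 0.749 * sqrt(Nmax).
price_threshold <- function(n_max) 0.749 * sqrt(n_max)

m <- price_threshold(67)  # Nmax = 67 as reported in the text

# Toy frequency table; the term names and counts are invented.
freq <- c(term_a = 67, term_b = 12, term_c = 6, term_d = 5, term_e = 1)

# Terms whose frequency reaches the integer part of M count as high-frequency.
high_freq <- names(freq)[freq >= floor(m)]
```

With Nmax = 67, `m` is approximately 6.13, so terms occurring six or more times are retained.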

Table 2 High-Frequency Co-Occurrence Matrix of Specific Semantic Types (42×20)

Biclustering Analysis by gCLUTO

Biclustering analysis clusters the rows and the columns of a matrix separately and then merges the clustering results. Biclustering analysis was performed using the Graphical Clustering Toolkit (gCLUTO 1.0), developed by Professor Matt Rasmussen, Professor Mark Newman and Professor George Karypis from the Karypis Lab of the University of Minnesota (http://glaros.dtc.umn.edu/gkhome/views/cluto). The biclustering options were set as follows: “Repeated Bisection” was selected as “Cluster method”; “I2” was selected as “Criterion Function”; “Cosine” was selected as “Similarity Function”; and “10” was set as the number of iterations.

The optimal result is the one with the highest average internal similarity (ISim) and the lowest average external similarity (ESim).24 By adjusting the number of clusters, five clusters were finally determined to give the best clustering result (Table 3). ISdev represents the standard deviation of ISim, while ESdev represents the standard deviation of ESim. The clusters exhibit a local pattern with a high similarity of co-occurrence between drugs and diseases associated with MDR-PTB. Matrix visualization (Figure 3A) and mountain visualization (Figure 3B) were generated by gCLUTO 1.0 biclustering analysis.
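The idea behind the internal and external similarity measures can be sketched in base R with cosine similarity. This is a simplified illustration, not gCLUTO's implementation: the function names are hypothetical, the internal average here excludes each row's similarity with itself (gCLUTO's exact convention may differ), and the toy matrix is invented.

```r
# Cosine similarity between the rows of a numeric matrix.
cosine_sim <- function(x) {
  unit <- x / sqrt(rowSums(x^2))  # scale each row to unit length
  unit %*% t(unit)
}

# Average internal and external similarity per cluster (simplified:
# internal averages exclude each row's similarity with itself).
cluster_quality <- function(x, cluster) {
  s <- cosine_sim(x)
  t(sapply(sort(unique(cluster)), function(k) {
    inside <- cluster == k
    within <- s[inside, inside, drop = FALSE]
    n <- sum(inside)
    c(cluster = k,
      ISim = if (n > 1) (sum(within) - n) / (n * (n - 1)) else NA,
      ESim = mean(s[inside, !inside]))
  }))
}

# Toy matrix: rows 1-2 share one column profile, rows 3-4 another.
x <- rbind(c(3, 1, 0, 0), c(2, 1, 0, 0), c(0, 0, 1, 4), c(0, 1, 1, 3))
q <- cluster_quality(x, cluster = c(0, 0, 1, 1))
```

A good partition shows `ISim` well above `ESim` for every cluster, which is the criterion used above when choosing the number of clusters.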

Table 3 Clustering Parameters of Co-Occurrence Matrix

Figure 3 Matrix visualization and mountain visualization of biclustering of high-frequency MeSH terms on substance (chemical, drug, protein) and related diseases of multidrug-resistant-pulmonary tuberculosis. (A) Matrix visualization; (B) mountain visualization.

Visualization Analysis by VOSviewer

The basic principle of scientific knowledge map analysis is to extract the relationships between knowledge units from the scientific literature, represent them as a matrix, and visualize them. In a relational map of scientific knowledge, two related elements are connected by an edge. The width of an edge usually indicates the strength of the relationship, and the direction of the edge represents a particular combination of relationships. In essence, a knowledge map is a network representation of a matrix, so the methods of network analysis can be applied in the present study.

Bibliometric network visualization in this study was performed using VOSviewer (Version 1.6.13), developed by Professor Nees Jan van Eck and Professor Ludo Waltman from the Centre for Science and Technology Studies (CWTS) of Leiden University. It is a knowledge mapping tool for network analysis and visualization, which can create maps based on network data. By combining terms and time, VOSviewer provides three ways to visualize a map: Network Visualization, Overlay Visualization, and Density Visualization.25

Burst Detection by CiteSpace

In 2002, Kleinberg proposed an algorithm for detecting sudden increases in frequency, which is called the “burst detection algorithm”.26 If the frequency of some words suddenly grows rapidly, the most reasonable explanation is that these words touch on a key part of the complex system of the academic field. Words with a sudden rise in relative growth rate are called burst words. Burst words may not reach the threshold of high-frequency words, but they have the potential to develop into high-frequency words. They are emerging or sudden theoretical trends or new themes, representing active ideas in a research field. Such nodes in the knowledge network can reveal new development directions or shifts of hot spots in the field. In this study, CiteSpace (version 5.7.R2), developed by Professor Chaomei Chen from the College of Computing and Informatics, Drexel University,27 was used to explore the burst keywords and the burst terms of the research on medications for MDR-PTB.
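The core of burst detection can be sketched as a two-state model over yearly counts: a baseline state in which a term appears at its overall rate, and a burst state in which it appears at an elevated rate, with a cost for entering the burst state. The base R sketch below is a simplified reading of the batched two-state form of Kleinberg's model, not CiteSpace's implementation; the function name, the default parameters `s` and `gamma`, and the yearly counts are all assumptions made for illustration.

```r
# r: yearly number of articles containing the term; d: yearly total articles.
# s: ratio of burst rate to baseline rate; gamma: weight of the cost of
# entering the burst state. Returns 0 (baseline) or 1 (burst) for each year.
kleinberg_burst <- function(r, d, s = 2, gamma = 1) {
  n <- length(r)
  p0 <- sum(r) / sum(d)               # baseline rate
  p <- c(p0, min(s * p0, 0.9999))     # rates of state 0 and state 1
  # Fit cost of each year under each state: negative binomial log-likelihood.
  cost <- sapply(p, function(pi) -dbinom(r, d, pi, log = TRUE))
  up <- gamma * log(n)                # cost of moving from state 0 to state 1
  best <- matrix(Inf, n, 2)           # cheapest total cost ending in each state
  from <- matrix(1L, n, 2)            # back-pointers for the optimal path
  best[1, ] <- c(0, up) + cost[1, ]   # the sequence starts from the baseline
  for (t in 2:n) {
    for (j in 1:2) {
      trans <- c(if (j == 2) up else 0, 0)  # moving up costs `up`; down is free
      cand <- best[t - 1, ] + trans
      from[t, j] <- which.min(cand)
      best[t, j] <- min(cand) + cost[t, j]
    }
  }
  # Backtrack the cheapest state sequence.
  state <- integer(n)
  state[n] <- which.min(best[n, ])
  for (t in n:2) state[t - 1] <- from[t, state[t]]
  state - 1L
}

# Toy series: the term spikes in years 4 and 5. All counts are invented.
d <- rep(100, 8)
r <- c(2, 2, 2, 20, 22, 2, 2, 2)
burst <- kleinberg_burst(r, d)
```

Raising `gamma` makes the model more reluctant to enter the burst state, so only stronger or longer spikes are reported as bursts.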

CiteSpace is a multi-dimensional, time-sliced and dynamic visual analysis software that focuses on analyzing the potential knowledge contained in the scientific literature. It was originally designed to analyze the structure of knowledge based on a network of publications and their references. With continuous updates of the software, it can also be used to analyze research hotspots through the co-occurrence of keywords/terms and to study group performance through the cooperation of authors, institutions, and countries/regions. Professor Chaomei Chen updated the burst detection function of CiteSpace in 2012. In CiteSpace, research fronts and research bases are defined at the same time. The research front consists of the citing literature, and the research base consists of the cited literature. Research hotspots can be explored through burst terms/phrases in the research fronts, and through citation bursts in the research bases. Burst detection is used for two kinds of variables: one is the frequency of words or phrases used in the citing literature; the other is the frequency of citations received by the cited literature. In the present study, the burst keywords and the burst terms extracted by CiteSpace were used to identify the hotspots of future research in the target field.

Results

High-Frequency MeSH Terms of Specific Semantic Types and Related Co-Occurrence Matrix

A total of 1210 relevant original articles were retrieved on the topic of medications for MDR-PTB. After the semantic types of the MeSH terms were distinguished with pubMR, 208 substances (chemical, drug, protein) and 147 diseases related to MDR-PTB were extracted to form a co-occurrence matrix (Table 1). According to the Price equation, MeSH terms with a frequency equal to or more than six were high-frequency terms. The high-frequency co-occurrence matrix (42 × 20) (Table 2) of specific semantic types contains 42 substances (chemical, drug, protein) and 20 diseases related to MDR-PTB.

Medication Status Concluded by MeSH Term Clusters

gCLUTO 1.0 was used to visualize the biclustering results of the high-frequency MeSH term co-occurrence matrix of specific semantic types (Table 3). A cluster dendrogram (Figure 3A) and a cluster mountain map (Figure 3B) were generated by gCLUTO 1.0. The results of biclustering reflect the characteristics of drug composition in treating MDR-PTB.

Cluster 0 contains two substances: interferon-gamma (IFN-γ) and BCG (Bacillus Calmette-Guérin) Vaccine. BCG Vaccine was first developed by the French scientists Calmette and Guérin. It is a live vaccine made from a suspension of attenuated bovine MTB, which can effectively prevent TB. BCG Vaccine has been widely used around the world since 1928; it is still the only available vaccine for the prevention of TB and is also the first vaccine given after birth in China. The cellular immunity it induces produces protective effects against respiratory infections other than TB, including COVID-19.28 Conferring consistent protection against TB, recombinant BCG expressing IFN-γ results in more efficient bacterial clearance.29

Cluster 1 contains twelve substances: Anti-Infective Agents, Drug Combinations, Anti-Bacterial Agents, Antitubercular Antibiotics, Streptomycin, Pyrazinamide, Ethambutol, Isoniazid, Rifampin, Antitubercular Agents, Cycloserine, and Anti-HIV Agents. This cluster is mainly about first-line anti-TB drugs. MTB gradually develops resistance to these antibiotics and anti-infective agents after a long period of use, especially to rifampicin and isoniazid, the two most powerful anti-tuberculosis drugs. Somoskovi et al studied the molecular basis of resistance to isoniazid, rifampin, and pyrazinamide in MTB.30 Coban et al carried out a meta-analysis of the nitrate reductase assay for detecting resistance to isoniazid, rifampin, ethambutol, and streptomycin in MTB.31

Cluster 2 contains five substances: Aerosols, Nitroimidazoles, Quinolines, Ethionamide, and Kanamycin. These drugs are not only used in MDR-PTB, but also in extensively drug-resistant pulmonary tuberculosis (XDR-TB). Garcia-Contreras et al reported the potential use of nitroimidazo-oxazine PA-824 dry powder aerosols as a novel candidate for the treatment of MDR-TB, XDR-TB, and latent TB.32 The nitroimidazole pretomanid was recently approved for XDR-TB in combination with bedaquiline and linezolid.33,34 Quinoline, as a hypernym, contains a series of drugs used to treat TB, such as diarylquinoline and fluoroquinolone. Ethionamide and kanamycin are two second-line anti-TB drugs.

Cluster 3 contains twenty substances: Diarylquinolines, Oxazoles, Aza Compounds, Culture Media, Linezolid, Moxifloxacin, Fluoroquinolones, Capreomycin, Amikacin, “DNA, Bacterial”, Oxazolidinones, Acetamides, Bacterial Proteins, DNA-Directed RNA Polymerases, Prothionamide, Ofloxacin, Levofloxacin, Codon, Catalase and Oxidoreductases. Most of the literature about the drugs in cluster 3 appeared in the past 10 years, especially in the last five years. These drugs belong to the second wave of TB drug development, and some of them even to the third wave.35 Some substances are needed for laboratory testing, disease diagnosis or new drug development, such as culture media, “DNA, Bacterial”, bacterial proteins, and DNA-directed RNA polymerases. There are also drugs in cluster 3 that are being used to treat HIV-positive TB patients. Together, MTB and HIV can quickly cause the immune system of infected people to collapse. TB is the number one killer of people living with HIV, and people who are infected with both MTB and HIV will lose their lives without proper treatment. Therefore, the development of drugs to treat AIDS patients with MDR-PTB is all the more urgent.

Cluster 4 contains three substances: Clofazimine, Aminosalicylic Acid, and Ciprofloxacin. According to the descriptive and discriminating features of cluster 4, this cluster is about medications for drug-resistant tuberculosis accompanied by TB of other organs, including lymphatic TB and pleural TB. Lymphatic TB has been reported to be treated with clofazimine, aminosalicylic acid and ciprofloxacin, respectively.36–38

Figure 3 shows the matrix visualization and the mountain visualization of each cluster. In the matrix visualization (Figure 3A), white area represents a value closer to zero, that is, the terms do not appear. The deepening red area represents a larger value, that is, the terms corresponding to the row in the literatures of the column appear more frequently.39 Black horizontal lines separate the clusters. In the mountain visualization (Figure 3B), each cluster is described as a 3D mountain, labeled by the cluster number starting with 0. The location, volume, height, and color of the mountains portray information about a cluster. The distance between the mountains reflects the relative similarity of the two clusters. The height is positively correlated with the ISim of the cluster. The volume of a mountain reflects the number of terms included in the corresponding cluster. The color of the summit is proportional to ISdev, with red representing a low standard deviation and blue representing a high standard deviation. As ISdev increases successively, the mountain tops are red, yellow, green, light blue, and dark blue.

Specific Semantic Types Co-Occurrence Network Analysis and Visualization

VOSviewer (Version 1.6.13) was used to create a specific semantic types co-occurrence map for the 42 high-frequency MeSH terms. In the overlay map of high-frequency MeSH terms over time (Figure 4A), each element is formed by a circle and a label, and the size of the element depends on the degree of the node, the strength of its links, the frequency of co-occurrence, etc. Because all 42 drugs belong to antitubercular agents and co-occur with “antitubercular agents”, this element has the largest size. The average age of each of the 42 drugs is shown by a color gradient. The darker the color, the longer the term has been in use. Yellow represents the drugs that have emerged in recent years. As shown in Figure 4A, there are 11 yellow terms: anti-bacterial agents, bacterial proteins, catalase, clofazimine, diarylquinolines, DNA-directed RNA polymerases, linezolid, moxifloxacin, nitroimidazoles, and oxidoreductases. These drugs have been used to treat MDR-PTB in the past five years.
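The color gradient of an overlay map can be thought of as a per-term score. The base R sketch below assumes that the score is the average publication year of the articles in which a term occurs; that assumption, the variable names, and the toy occurrences are illustrative and are not taken from the study data.

```r
# Toy data: one row per (term, publication year) occurrence; values invented.
occ <- data.frame(
  term = c("Linezolid", "Linezolid",
           "Streptomycin", "Streptomycin", "Streptomycin"),
  year = c(2016, 2018, 1995, 2001, 2004)
)

# Mean publication year of the articles in which each term occurs.
avg_year <- tapply(occ$year, occ$term, mean)

# Terms ordered from the oldest to the most recent average year.
sort(avg_year)
```

Terms with a later average year would be drawn toward the yellow end of the gradient, and terms with an earlier average year toward the dark end.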

Figure 4 High-frequency MeSH keywords-time dual-map and density map of high-frequency co-occurrence substance (chemical, drug, protein) of multidrug-resistant pulmonary tuberculosis. (A) Overlay visualization; (B) density visualization.

Based on the bibliographic data, VOSviewer (Version 1.6.13) was used to generate a high-frequency MeSH term density map that excluded “antitubercular agents”. In the density map (Figure 4B), each point on the map is colored according to the density of the elements around it. The higher the density, the deeper the red, and the lower the density, the deeper the blue. The density depends on the number of elements in the surrounding area and the importance of those elements. The density map can be used to discover the research focus and hotspots in a research field. As shown in Figure 4B, the research focuses on rifampin and isoniazid. Rifampin is linked to 39 drugs, with a total link strength of 502. Isoniazid is linked to 38 drugs, with a total link strength of 464. This means that almost all drugs co-occur with both drugs. In the context of this study, this indicates that they are currently the main drugs affected by resistance, and that most studies focus on resistance to these two drugs.
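The two network measures quoted above, the number of links and the total link strength, can be derived directly from a symmetric co-occurrence matrix. The base R sketch below uses an invented 4 × 4 drug-drug matrix; it illustrates the definitions only and does not reproduce the study's values.

```r
# Toy symmetric drug-drug co-occurrence matrix; the counts are invented.
drugs <- c("Rifampin", "Isoniazid", "Linezolid", "Clofazimine")
co <- matrix(c(0, 9, 3, 1,
               9, 0, 2, 0,
               3, 2, 0, 0,
               1, 0, 0, 0),
             nrow = 4, byrow = TRUE, dimnames = list(drugs, drugs))

diag(co) <- 0                       # self co-occurrence is not a link

links <- rowSums(co > 0)            # number of distinct partners per drug
total_link_strength <- rowSums(co)  # summed co-occurrence counts per drug
```

In this toy matrix, “Rifampin” has 3 links with a total link strength of 13, which is the same kind of summary as the 39 links and total link strength of 502 reported for rifampin in the text.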

Burst Detection for Hotspots on Medications of MDR-PTB

There were 15 top keywords (Figure 5A) and 37 top terms (Figure 5B) with the strongest citation bursts extracted by CiteSpace (Version 5.7.R2). The keywords given by an author are the core summary of a paper, and the analysis of the keywords can provide some insight into the topics in the field. Among the top 15 keywords with the strongest citation bursts, the burst time of 11 keywords lasted until 2012, and the remaining four keywords showed a burst state in recent years. In the field of MDR-PTB treatment, research on gene mutations of drug-resistant MTB has always been a hotspot and an urgent problem to be solved. As shown in Figure 5A, rpoB, katG, and inhA are the most prominent genes whose mutations are associated with MTB resistance to rifampicin and isoniazid.40 After 2014, MDR-TB-related studies attracted renewed attention.

Figure 5 Top keywords and top terms with the strongest citation bursts in the research field of medications for multidrug-resistant pulmonary tuberculosis. (A) Top 15 keywords with the strongest citation bursts; (B) top 37 terms with the strongest citation bursts.

To obtain more comprehensive burst themes, noun phrase was selected as the term type in CiteSpace (Version 5.7.R2) to visualize the co-occurrence network of terms. Figure 5B shows the top 37 terms with the strongest citation bursts by burst start time. The burst terms of the most recent 10 years show that researchers have studied MDR-TB from the aspects of diagnosis, treatment, laboratory examination and so on. The Xpert MTB/RIF assay is effective in the diagnosis of drug resistance; it enables rapid diagnosis of rifampicin resistance and timely initiation of second-line TB treatment.41 Directly Observed Therapy (DOT) has been proved to be a cost-effective and health-improving treatment against MDR-PTB.42 Culture conversion has become a burst term again in recent years. As a prognostic marker of treatment outcome in patients with MDR-TB, culture conversion is essential in the development of new drugs and in trials of drug combinations.43

Discussion

In this paper, we proposed a model of MeSH term co-occurrence based on specific semantic types to discover the knowledge structure of a domain. We verified the effectiveness of the method by biclustering diseases (category C) and substances (chemicals, drugs, proteins; category D) to reveal the drug composition characteristics of diseases. To our knowledge, this is the first bibliometric analysis model to evaluate the status of MDR-PTB medications from the perspective of semantic types based on the microscopic characteristics of the data.

It should be noted that the choice of data sources is limited when this method is used for biclustering analysis. A classification system without a clear hierarchy, such as the Web of Science (WOS) subject classification, is not suitable for this model.44 The WOS subject classification system divides the included journals into 252 subjects. The relationship between subject categories and journals is many-to-many: a journal can belong to more than one WOS subject category at the same time, and one WOS subject category corresponds to more than one journal. The categories stand in an equal rather than a hierarchical relationship to one another. Without upper-level classes to summarize and distinguish the objects, the semantic matrix cannot be formed.

Category D covers chemicals, drugs, and proteins. The extracted high-frequency terms fall broadly into four groups: relatively high-level MeSH terms that represent a broad class of drugs, such as “Anti-Infective Agents”; terms for a specific chemical class of drugs, such as “Oxazoles”; specific drugs, such as “Rifampin”; and drug ingredients or components, such as “Bacterial Proteins”. New drugs that have not yet become MeSH terms cannot be captured by pubMR. However, if their ingredients or components are MeSH terms, those can be extracted by pubMR. To a certain extent, this keeps newly marketed drugs from being overlooked in this study.

Except for the first layer of the MeSH tree, all MeSH terms have hypernyms. In the literature collection on medications for MDR-PTB, all hyponyms of category C and category D were extracted. However, within the same category, the extracted MeSH terms may themselves be hierarchically related. Among the 42 substances (chemicals, drugs, proteins) extracted by pubMR (Table 2), “Antitubercular Agents” is the hypernym of “Antibiotics, Antitubercular”, “Anti-Bacterial Agents” is the hypernym of “Antitubercular Agents”, and “Anti-Infective Agents” is the hypernym of “Anti-Bacterial Agents”. They all belong to category D, the “Chemicals and Drugs Category”. Because terms at several levels of the same branch were extracted, the frequency counts for that branch may be scattered across those terms. For example, “Linezolid” is a hyponym of “Acetamides”, and both “Acetamides” and “Linezolid” are high-frequency MeSH terms extracted in this study. Such splitting can lower the frequency of individual MeSH terms enough that they are not selected as high-frequency words, which may introduce some bias into the results.
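The nesting described above can be detected mechanically, because in the MeSH tree a descendant's tree number extends its parent's tree number by a further dot-separated segment, and a term may carry more than one tree number. The following is a minimal sketch under that assumption. The tree numbers are placeholders invented for illustration (they are not the real MeSH tree numbers), and `is_hypernym` and `nested_pairs` are hypothetical helper names, not functions of pubMR.

```python
# Placeholder tree numbers for illustration only; real values would come
# from the MeSH browser. A term may have several tree numbers.
TREE_NUMBERS = {
    "Anti-Infective Agents": ["D00.100"],
    "Anti-Bacterial Agents": ["D00.100.200"],
    "Antitubercular Agents": ["D00.100.200.300"],
    "Antibiotics, Antitubercular": ["D00.100.200.300.400"],
    "Rifampin": ["D00.900.500"],
}


def is_hypernym(broader, narrower, tree=TREE_NUMBERS):
    """True if some tree number of `broader` is a strict prefix
    (at a dot boundary) of some tree number of `narrower`."""
    return any(
        n.startswith(b + ".")
        for b in tree[broader]
        for n in tree[narrower]
    )


def nested_pairs(terms, tree=TREE_NUMBERS):
    """List every (hypernym, hyponym) pair among the given terms."""
    return [
        (a, b)
        for a in terms
        for b in terms
        if a != b and is_hypernym(a, b, tree)
    ]


pairs = nested_pairs(list(TREE_NUMBERS))
for broader, narrower in pairs:
    print(f"{broader} > {narrower}")
```

Flagging such pairs before applying the high-frequency threshold would make it visible where counts are being divided between a term and its descendants; whether to merge them is a separate analytical decision that this sketch does not make.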

This study presented the research trends of drugs used in the treatment of MDR-PTB in recent years (Figure 4A) and explored the burst keywords and burst terms related to medications (Figure 5). However, when the goal is to mine candidate drugs from the literature, the burst detection analysis in CiteSpace may not be as effective as applying Kleinberg’s burst detection algorithm directly to specific drugs.26 If the natural language semantic relationship mining tool SemRep were used to extract drugs for MDR-PTB and the burst weight index of these drugs were calculated with Kleinberg’s algorithm, the resulting burst drugs would be more specific and more practically meaningful than the burst keywords and burst terms extracted by CiteSpace.45 This approach is computationally expensive and is the next direction for improving the present study.
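To make the idea of a burst weight concrete, the following is a minimal sketch of a two-state burst detector in the spirit of the batched model in Kleinberg's paper: each year is a batch, a baseline state and an elevated state compete to explain how many documents mention a drug, and entering the elevated state carries a penalty. It is not the implementation used by CiteSpace or in reference 45. The function name `two_state_bursts`, the defaults `s=2.0` and `gamma=1.0`, and the yearly counts are all assumptions made for illustration.

```python
import math


def two_state_bursts(r, d, s=2.0, gamma=1.0):
    """Return (start, end, weight) per burst, with 0-based batch indices.

    r[t] = documents mentioning the drug in batch t, d[t] = all documents.
    """
    n = len(r)
    p0 = sum(r) / sum(d)   # baseline rate over the whole period
    p1 = s * p0            # elevated rate of the burst state
    if not 0.0 < p0 < p1 < 1.0:
        raise ValueError("need 0 < p0 < s * p0 < 1")

    def cost(p, rt, dt):
        # Negative log-likelihood of rt hits among dt documents at rate p.
        # The binomial coefficient is dropped: it is the same in both states.
        return -(rt * math.log(p) + (dt - rt) * math.log(1.0 - p))

    up = gamma * math.log(n)   # penalty for entering the burst state
    best = [0.0, math.inf]     # cheapest path cost ending in state 0 / 1
    back = []                  # back[t][j]: previous state on best path to j
    for rt, dt in zip(r, d):
        prev0 = 0 if best[0] <= best[1] else 1   # leaving a burst is free
        prev1 = 0 if best[0] + up < best[1] else 1
        new0 = best[prev0] + cost(p0, rt, dt)
        new1 = best[prev1] + (up if prev1 == 0 else 0.0) + cost(p1, rt, dt)
        back.append((prev0, prev1))
        best = [new0, new1]

    # Trace the cheapest state sequence backwards.
    state = 0 if best[0] <= best[1] else 1
    states = [0] * n
    for t in range(n - 1, -1, -1):
        states[t] = state
        state = back[t][state]

    # Runs of state 1 are bursts; weight = cost saved relative to state 0.
    bursts, start, weight = [], None, 0.0
    for t, st in enumerate(states + [0]):   # sentinel closes a trailing burst
        if st == 1:
            if start is None:
                start, weight = t, 0.0
            weight += cost(p0, r[t], d[t]) - cost(p1, r[t], d[t])
        elif start is not None:
            bursts.append((start, t - 1, weight))
            start = None
    return bursts


# Invented yearly counts for one hypothetical drug (not data from this study).
counts = [1, 1, 1, 1, 10, 12, 11, 1, 1, 1]
totals = [100] * 10
bursts = two_state_bursts(counts, totals)
print(bursts)
```

Ranking drugs by the weight of their bursts is one way to obtain a burst index per drug. The computational burden noted above comes mainly from extracting the drug mentions, since this pass would have to be repeated for every extracted drug.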

Although MDR-PTB is resistant to two or more antituberculosis drugs, including at least isoniazid and rifampicin, isoniazid combined with rifampicin can prevent acquired drug resistance in MTB.46 Monotherapy with rifampicin or isoniazid may lead to clinically relevant drug resistance, whereas a combination of different antibiotics may prevent the emergence of mutants resistant to a single antibiotic.47 A drug to which a strain is resistant may, when combined with other drugs, show enhanced antibacterial activity and achieve a better therapeutic effect. Pyrazinamide is a first-line anti-TB drug, and multidrug-resistant strains of MTB are highly resistant to it. Dawson et al proposed pyrazinamide combined with moxifloxacin and PA-824 for the treatment of TB as a first step toward a single treatment regimen for DR-TB.48 In this multicomponent regimen, pyrazinamide and moxifloxacin are MeSH terms extracted in the present study (Table 2). PA-824 is not a MeSH term and cannot be captured by pubMR. However, PA-824 is a new nitroimidazole compound, and nitroimidazoles are listed in Table 2. At present, the model of this study cannot identify a multicomponent treatment regimen as such; it can only extract the individual drugs in the combination.

We constructed a MeSH term co-occurrence matrix of specific semantic types and exhibited the relationships between entities in the field of MDR-PTB drug therapy through biclustering analysis. The two specific semantic types in this study are category C for diseases and category D for substances (chemicals, drugs, proteins) in the MeSH tree. This research model can be extended to construct co-occurrence networks of other semantic types, for example to mine the relationships between drugs and targets or between genes and tumors.
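As an illustration of the matrix construction, the sketch below counts article-level co-occurrence between category C and category D terms. It is a simplified stand-in under stated assumptions, not the pubMR implementation: the three article records and their counts are invented, and `cooccurrence` is a hypothetical helper name.

```python
from collections import Counter

# Invented records: each article lists its MeSH terms by tree category.
articles = [
    {"C": {"Tuberculosis, Multidrug-Resistant", "HIV Infections"},
     "D": {"Rifampin", "Isoniazid"}},
    {"C": {"Tuberculosis, Multidrug-Resistant"},
     "D": {"Rifampin", "Linezolid"}},
    {"C": {"HIV Infections"},
     "D": {"Isoniazid"}},
]


def cooccurrence(records):
    """Build a disease-by-substance matrix of article-level co-occurrence."""
    counts = Counter()
    for rec in records:
        for disease in rec["C"]:
            for substance in rec["D"]:
                counts[(disease, substance)] += 1
    rows = sorted({disease for disease, _ in counts})
    cols = sorted({substance for _, substance in counts})
    # Counter returns 0 for pairs that never co-occurred.
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix


rows, cols, matrix = cooccurrence(articles)
print(cols)
for name, row in zip(rows, matrix):
    print(name, row)
```

Because rows and columns are drawn from different semantic types, the matrix is rectangular rather than symmetric, which is what makes it suitable as input to a biclustering step. Swapping the two category keys for other tree categories would give the other semantic pairings mentioned above.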

Limitation

Nevertheless, our analysis has some limitations. First, owing to constraints of the analytical method, searches were conducted in a single database, PubMed. It would be better to combine these results with those from other databases such as Web of Science and Scopus. In addition, MeSH terms co-occur at the level of the article rather than the sentence, which may lead to mismatching. As a next step, we will consider retaining the original subheadings of MeSH terms to achieve more specific and explicit semantic relationships. Second, owing to the collection characteristics of PubMed, most of the literature finally collected was in English, so publications in other languages are to some extent overlooked. Third, the number of clusters in the clustering analysis can be adjusted according to the authors’ subjective judgment. Fourth, after the COVID-19 pandemic, the statistical analysis of MDR-PTB was somewhat disrupted, and we may have underestimated the contribution of different analyses to the recently published research.

Conclusion

This study evaluated the literature related to MDR-PTB drug therapy, providing a co-occurrence matrix model based on specific semantic types and offering a new approach to text knowledge mining. Compared with macro-level knowledge structure or hot spot analysis, this method may have a wider scope of application and allow a more in-depth analysis.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 71974199), the Liaoning Social Science Planning Fund (Grant No. L21BTQ008) and the Liaoning Natural Science Foundation of China (2020-MS-159). The authors thank Professor Lei Cui and Professor Xiaobei Zhou from China Medical University for their contributions to the R coding.

Disclosure

The authors declare no conflicts of interest in this work.

References

1. Zink AR, Sola C, Reischl U, et al. Characterization of Mycobacterium tuberculosis Complex DNAs from Egyptian Mummies by Spoligotyping. J Clin Microbiol. 2003;41(1):359–367. doi:10.1128/JCM.41.1.359-367.2003

2. Furin J, Cox H, Pai M. Tuberculosis. Lancet. 2019;393(10181):1642–1656. doi:10.1016/S0140-6736(19)30308-3

3. World Health Organization. Global Tuberculosis Report 2022. Geneva: World Health Organization; 2022. Available from: https://www.who.int/teams/global-tuberculosis-programme/tb-reports/global-tuberculosis-report-2022. Accessed June 29, 2023.

4. World Health Organization. The End TB Strategy. Geneva: World Health Organization; 2015. Available from: https://www.who.int/teams/global-tuberculosis-programme/the-end-tb-strategy. Accessed June 29, 2023.

5. Conradie F, Diacon AH, Ngubane N, et al. Treatment of Highly Drug-Resistant Pulmonary Tuberculosis. N Engl J Med. 2020;382(10):893–902. doi:10.1056/NEJMoa1901814

6. World Health Organization. WHO Consolidated Guidelines on Tuberculosis: Module 4: Treatment: Drug-Resistant Tuberculosis Treatment. Geneva: World Health Organization; 2020. Available from: https://www.who.int/publications/i/item/9789240007048. Accessed June 29, 2023.

7. Liu YG, Matsumoto M, Ishida H, et al. Delamanid: from discovery to its use for pulmonary multidrug-resistant tuberculosis (MDR-TB). Tuberculosis. 2018;111:20–30. doi:10.1016/j.tube.2018.04.008

8. Ernst JD. Macrophage receptors for Mycobacterium tuberculosis. Infect Immun. 1998;66(4):1277–1281. doi:10.1128/IAI.66.4.1277-1281.1998

9. Singh L, Dua K, Kumar S, Kumar D, Majhi S. Targeting Molecular and Cellular Mechanisms in Tuberculosis. In: Dua K, Löbenberg R, Malheiros Luzo ÂC, Shukla S, Satija S, editors. Targeting Cellular Signalling Pathways in Lung Diseases. Singapore: Springer; 2021:37–353. doi:10.1007/978-981-33-6827-9_14

10. Borah P, Deb PK, Venugopala KN, et al. Tuberculosis: an update on pathophysiology, molecular mechanisms of drug resistance, newer anti-tb drugs, treatment regimens and host-directed therapies. Curr Top Med Chem. 2021;21(6):547–570. doi:10.2174/1568026621999201211200447

11. Ntoumi F, Kaleebu P, Macete E, et al. Taking forward the World TB Day 2016 theme ‘Unite to End Tuberculosis’ for the WHO Africa Region. Int J Infect Dis. 2016;46:34–37. doi:10.1016/j.ijid.2016.03.003

12. Grange J, Story A, Zumla A. Tuberculosis in disadvantaged groups. Curr Opin Pulm Med. 2001;7(3):160–164. doi:10.1097/00063198-200105000-00008

13. Conradie F, Bagdasaryan TR, Borisov S, et al. Bedaquiline-pretomanid-linezolid regimens for drug-resistant tuberculosis. N Engl J Med. 2022;387(9):810–823. doi:10.1056/NEJMoa2119430

14. Edwards BD, Field SK. The struggle to end a millennia-long pandemic: novel candidate and repurposed drugs for the treatment of tuberculosis. Drugs. 2022;82(18):1695–1715. doi:10.1007/s40265-022-01817-w

15. Dheda K, Gumbo T, Maartens G, et al. The Lancet Respiratory Medicine Commission: 2019 update: epidemiology, pathogenesis, transmission, diagnosis, and management of multidrug-resistant and incurable tuberculosis. Lancet Respir Med. 2019;7(9):820–826. doi:10.1016/S2213-2600(19)30263-2

16. Kamara RF, Saunders MJ, Sahr F, et al. Social and health factors associated with adverse treatment outcomes among people with multidrug-resistant tuberculosis in Sierra Leone: a national, retrospective cohort study. Lancet Glob Health. 2022;10(4):e543–e554. doi:10.1016/S2214-109X(22)00004-3

17. Sweileh WM, AbuTaha AS, Sawalha AF, et al. Bibliometric analysis of worldwide publications on multi-, extensively, and totally drug-resistant tuberculosis (2006-2015). Multidiscip Respir Med. 2017;11:45. doi:10.1186/s40248-016-0081-0

18. Kashyap V. The UMLS semantic network and the semantic web. AMIA Annu Symp Proc. 2003;2003:351–355.

19. Fang L, Zhou X, Cui L. Biclustering high-frequency MeSH terms based on the co-occurrence of distinct semantic types in a MeSH tree. Scientometrics. 2020;124:1179–1190. doi:10.1007/s11192-020-03496-4

20. Yu G. Using meshes for MeSH term enrichment and semantic analyses. Bioinformatics. 2018;34(21):3766–3767. doi:10.1093/bioinformatics/bty410

21. National Library of Medicine. Tree View. Available from: https://meshb.nlm.nih.gov/treeView. Accessed July 6, 2023.

22. Zhou XB. User’s guide to the “pubMR” package. GitHub; 2021. Available from: https://github.com/xizhou/pubMR. Accessed June 29, 2023.

23. Liu Y, Wang Y, Li M. An empirical analysis for the applicability of the methods of definition of high-frequency words in word frequency analysis. Digi Lib Forum. 2017;9:42–49. doi:10.3772/j.issn.1673-2286.2017.09.007

24. Li F, Li M, Guan P, Ma S, Cui L. Mapping publication trends and identifying hot spots of research on Internet health information seeking behavior: a quantitative and co-word biclustering analysis. J Med Internet Res. 2015;17(3):e81. doi:10.2196/jmir.3326

25. van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. 2010;84(2):523–538. doi:10.1007/s11192-009-0146-3

26. Kleinberg J. Bursty and hierarchical structure in streams. Data Min Knowl Disc. 2003;7(4):373–397. doi:10.1023/a:1024940629314

27. Chen CM. CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J Am Soc Inf Sci Technol. 2006;57(3):359–377. doi:10.1002/asi.20317

28. Yitbarek K, Abraham G, Girma T, Tilahun T, Woldie M. The effect of Bacillus Calmette-Guérin (BCG) vaccination in preventing severe infectious respiratory diseases other than TB: implications for the COVID-19 pandemic. Vaccine. 2020;38(41):6374–6380. doi:10.1016/j.vaccine.2020.08.018

29. Wangoo A, Brown IN, Marshall BG, Cook HT, Young DB, Shaw RJ. Bacille Calmette-Guérin (BCG)-associated inflammation and fibrosis: modulation by recombinant BCG expressing interferon-gamma (IFN-gamma). Clin Exp Immunol. 2000;119(1):92–98. doi:10.1046/j.1365-2249.2000.01100.x

30. Somoskovi A, Parsons LM, Salfinger M. The molecular basis of resistance to isoniazid, rifampin, and pyrazinamide in Mycobacterium tuberculosis. Respir Res. 2001;2(3):164–168. doi:10.1186/rr54

31. Coban AY, Deveci A, Sunter AT, Martin A. Nitrate reductase assay for rapid detection of isoniazid, rifampin, ethambutol, and streptomycin resistance in Mycobacterium tuberculosis: a systematic review and meta-analysis. J Clin Microbiol. 2014;52(1):15–19. doi:10.1128/JCM.01990-13

32. Garcia-Contreras L, Sung JC, Muttil P, et al. Dry powder PA-824 aerosols for treatment of tuberculosis in Guinea pigs. Antimicrob Agents Chemother. 2010;54(4):1436–1442. doi:10.1128/AAC.01471-09

33. Ignatius EH, Abdelwahab MT, Hendricks B, et al. Pretomanid Pharmacokinetics in the Presence of Rifamycins: interim Results from a Randomized Trial among Patients with Tuberculosis. Antimicrob Agents Chemother. 2021;65(2):e01196–20. doi:10.1128/AAC.01196-20

34. Stancil SL, Mirzayev F, Abdel-Rahman SM. Profiling Pretomanid as a Therapeutic Option for TB Infection: evidence to Date. Drug Des Devel Ther. 2021;15:2815–2830. doi:10.2147/DDDT.S281639

35. Ignatius EH, Dooley KE. New Drugs for the Treatment of Tuberculosis. Clin Chest Med. 2019;40(4):811–827. doi:10.1016/j.ccm.2019.08.001

36. Mirsaeidi SM, Tabarsi P, Edrissian MO, et al. Primary multi-drug resistant tuberculosis presented as lymphadenitis in a patient without HIV infection. Monaldi Arch Chest Dis. 2004;61(4):244–247. doi:10.4081/monaldi.2004.690

37. Goh TL, Towns CR, Jones KL, Freeman JT, Wong CS. Extensively drug-resistant tuberculosis: New Zealand’s first case and the challenges of management in a low-prevalence country. Med J Aust. 2011;194(11):602–604. doi:10.5694/j.1326-5377.2011.tb03115.x

38. Chahed H, Hachicha H, Berriche A, et al. Paradoxical reaction associated with cervical lymph node tuberculosis: predictive factors and therapeutic management. Int J Infect Dis. 2017;54:4–7. doi:10.1016/j.ijid.2016.10.025

39. Yang Y, Xu D, Chen S, Han S, Xu S. Analysis of Research Hotspots in the Field of Global Medical Research Based on Natural Index. J China Soc Sci Tech Info. 2019;38(11):1129–1137. doi:10.3772/j.issn.1000-0135.2019.11.001

40. Isakova J, Sovkhozova N, Vinnikov D, et al. Mutations of rpoB, katG, inhA and ahp genes in rifampicin and isoniazid-resistant Mycobacterium tuberculosis in Kyrgyz Republic. BMC Microbiol. 2018;18(1):22. doi:10.1186/s12866-018-1168-x

41. Mahwire TC, Zunza M, Marukutira TC, Naidoo P. Impact of Xpert MTB/RIF assay on multidrug-resistant tuberculosis treatment outcomes in a health district in South Africa. S Afr Med J. 2019;109(4):259–263. doi:10.7196/SAMJ.2019.v109i4.13180

42. Wilton P, Smith RD, Coast J, Millar M, Karcher A. Directly observed treatment for multidrug-resistant tuberculosis: an economic evaluation in the United States of America and South Africa. Int J Tuberc Lung Dis. 2001;5(12):1137–1142.

43. Franke MF, Khan P, Hewison C, et al. Culture Conversion in Patients Treated with Bedaquiline and/or Delamanid. A Prospective Multicountry Study. Am J Respir Crit Care Med. 2021;203(1):111–119. doi:10.1164/rccm.202001-0135OC

44. Crespo JA, Herranz N, Li YR, Ruiz-Castillo J. The effect on citation inequality of differences in citation practices at the web of science subject category level. J Assoc Inf Sci. 2014;65(6):1244–1256. doi:10.1002/asi.23006

45. Xu S, Xu D, Wen L, et al. Integrating Unified Medical Language System and Kleinberg’s Burst Detection Algorithm into Research Topics of Medications for Post-Traumatic Stress Disorder. Drug Des Devel Ther. 2020;14:3899–3913. doi:10.2147/DDDT.S270379

46. Moulding TS, Le HQ, Rikleen D, Davidson P. Preventing drug-resistant tuberculosis with a fixed dose combination of isoniazid and rifampin. Int J Tuberc Lung Dis. 2004;8(6):743–748.

47. Kayigire XA, Friedrich SO, van der Merwe L, Diacon AH. Acquisition of Rifampin Resistance in Pulmonary Tuberculosis. Antimicrob Agents Chemother. 2017;61(4):e02220–16. doi:10.1128/AAC.02220-16

48. Dawson R, Diacon A. PA-824, moxifloxacin and pyrazinamide combination therapy for tuberculosis. Expert Opin Investig Drugs. 2013;22(7):927–932. doi:10.1517/13543784.2013.801958

Creative Commons License © 2023 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.