Genomic and “Polyomic” Studies of Cardiovascular and Inﬂammatory Diseases Microarray proﬁling analysis and validation of novel long noncoding RNAs and mRNAs as potential biomarkers and their functions in atherosclerosis

Microarray proﬁling analysis and validation of novel long noncoding RNAs and mRNAs as potential biomarkers and their functions in atherosclerosis. Physiol Genomics 51: 644–656, 2019. 00077.2019.—Long noncoding (lnc)RNAs have been implicated in the development and progression of atherosclerosis. However, the expression and mechanism of action of lncRNAs in atherosclerosis are still unclear. We implemented microarray analysis in human advanced atherosclerotic plaques and normal arterial intimae to detect the lncRNA and mRNA expression proﬁle. Gene Ontology functional enrichment and pathway analyses were applied to explore the potential functions and pathways involved in the pathogenesis of atherosclerosis. A total of 236 lncRNAs and 488 mRNAs were selected for further Ingenuity Pathway Analysis. Moreover, quantitative RT-PCR tests of most selected lncRNAs and mRNAs with high fold changes were consistent with the microarray data. We also performed ELISA to investigate the corresponding proteins levels of selected genes and showed that serum levels of SPP1, CD36, ATP6V0D2, CHI3L1, MYH11, and BDNF were differentially expressed in patients with coronary heart disease compared with healthy subjects. These proteins correlated with some biochemical parameters used in the diagnosis of cardiovascular diseases. Furthermore, receiver operating characteristic analysis showed a favorable diagnostic performance. The microarray proﬁling analysis and validation of differentially-expressed lncRNAs and mRNAs in atherosclerosis not only provide new insights into the pathogenesis of this disease but may also reveal new biomarkers for its diagnosis and treatment.


INTRODUCTION
Atherosclerosis is the common pathological basis of coronary heart disease and many other cardiovascular diseases (3,17). The pathogenesis of atherosclerosis is complex, involving lipid deposition, vascular smooth muscle cell proliferation and apoptosis, endothelial cell dysfunction, and macrophage foam cell formation (7,9). A large number of molecules such as transcription factors, cytokines, and kinases participate in this progress.
Long noncoding RNAs (lncRNAs) are noncoding RNAs more than 200 nucleotides in length located in the cell nucleus or cytoplasm (19,30). With the exploration of lncRNAs, several hypotheses that lncRNAs are associated with mRNA expression have been proposed (27). lncRNAs can interfere with nearby promoters by cis-or trans-targeting, causing transcriptional interference (6,16,19). As an enhancer, lncRNA stabilizes chromatin loops, leading to a distal enhancer to interact with a promoter. In posttranscriptional regulation, lncRNAs combine with mRNA through microRNAs (miRNAs) to synthesize lncRNA-miRNA-mRNA axis, affecting mRNA stability or translation. Additionally, lncRNAs act as miRNA precursors or miRNA sponges to regulate corresponding miRNA, affecting in posttranscriptional regulation of gene expression (15,24). Moreover, lncRNAs are also involved in many biological processes such as chromatin modification, transcription, splicing, translation, degradation, and transport and have effects on growth, metabolism, and senescence through cell proliferation, differentiation, and apoptosis (8,31,33,34,41). Accumulating studies indicate that lncRNAs are involved in the process of injury and repair of endothelial cells, migration and proliferation of smooth muscle cells, lipid loading and inflammatory response in macrophages, and lipid deposition and the formation of plaques, all of which affect the progression of atherosclerosis and other cardiovascular diseases (4,21,23). For example, the lncRNA ANRIL (antisense noncoding RNA in the INK4 locus) is associated with coronary atherosclerosis, carotid atherosclerosis, and peripheral artery disease (1,11,26). The lncRNA MALAT1 (metastasis-asso-ciated lung adenocarcinoma transcript 1), on the other hand, has reduced expression in patients with coronary artery disease (1,10). Using microarray analysis, we recently found that the lncRNA RP5-833A20.1/has-miR-382-5p/nuclear factor IA pathway is essential for the regulation of cholesterol homeostasis and inflammatory responses in THP-1 macrophages (14). Furthermore, we found that lncRNA NEXN-AS1 (nexilin Factin binding protein antisense RNA 1) inhibited TLR4 oligomerization and NF-B activity, suppressed monocyte adhesion to endothelial cells, and prevented against atherosclerosis (13). However, studies of lncRNA in atherosclerosis are in their infancy, and our understanding of the role of lncRNA in this disease remains limited.
To systematically study the role of lncRNAs in atherosclerosis, we examined lncRNA and mRNA expression profiles in samples of atherosclerotic plaque and normal arterial intima. This study was designed to identify lncRNAs and mRNAs implicated in the pathogenesis of atherosclerosis and to establish candidate biomarkers for prevention and treatment of atherosclerosis.

MATERIALS AND METHODS
Patients and samples. For microarray analysis, six tissue samples were collected from patients at Nanfang Hospital, Guangzhou, China. According to the 1990 classifications of the Committee on Vascular Lesions of the Council on Atherosclerosis, American Heart Association, advanced abdominal aortic plaques or common carotid plaques were taken from patients diagnosed with grade V or VI atherosclerosis. As a normal arterial intima control, the abdominal aortas of three patients without atherosclerosis were obtained. One of the normal arterial intimae and advanced plaque tissue were collected from the same individual. Before the first surgery, no patient had received previous atherosclerotic treatment. Samples were obtained and frozen in liquid nitrogen. To confirm and validate the expression of differentially expressed lncRNAs and mRNAs in clinical samples, we also obtained 10 normal intima, and 15 plaque tissues from patients at Nanfang Hospital matched in age and sex. Between October 2015 and July 2016, serum samples were collected from patients with chronic coronary heart disease (CCHD; n ϭ 108), acute coronary syndrome (ACS; n ϭ 93), and heart failure (HF; n ϭ 75), as well as from healthy controls (n ϭ 103). Diagnosis of these cardiovascular diseases was made according to the clinical guidelines standardized by the American College of Cardiology/American Heart Association (22, 39a). Patients diagnosed with any form of cancer were excluded from the study group. Serum was collected before treatment with any medications and stored at Ϫ80°C. Basic clinical test results were also collected including serum total cholesterol (TC), triglyceride (TG), high density lipoprotein (HDL), low density lipoprotein (LDL), apolipoprotein A-I (ApoA-1), ApoB, lipoprotein a (LPa), homocysteine (HCY), myoglobin (Myo), cardiac troponin I (cTnI), creatine kinase (CK), CK-MBmass (CK-MBm), CK-MB, C-reactive protein (CRP), lactate dehydrogenase (LDH), and aspartate transaminase (AST). All samples were collected with informed written consent from patients, and ethical consent was granted from the Committees for Ethical Review of Research involving Human Subjects of Southern Medical University, Guangzhou, China([2013]111).
Microarray analysis. In accordance with the manufacturer's instructions, total RNA was extracted from frozen tissue with TRIzol reagent (Invitrogen, Carlsbad, CA). RNA quality was assessed by NanoDrop ND-1000, while RNA integrity was evaluated with standard denaturing agarose gel electrophoresis. Microarray analysis was performed with the Agilent Array platform (Agilent Technologies). Under the manufacturer's standard protocols with minor modifications, sample preparation and microarray hybridization were performed. In short, mRNA was purified from total RNA after removal of rRNA (mRNA-ONLY Eukaryotic mRNA Isolation Kit, Epicenter). Thereafter, employing a random primer method, we amplified all samples and transcribed them into fluorescent cRNA along the whole length of the transcript without 3=-bias. The labeled cRNAs were hybridized onto the Human lncRNA Array v3.0 (8 ϫ 60K, Arraystar). After washing the slides, we scanned the arrays with the Agilent Scanner G2505C. Agilent Feature Extraction software (version Fig. 1. Profiles of lncRNA and mRNA microarray data. Volcano plots of lncRNA expression profiles (A) and mRNA expression profiles (B) were used for visualizing differential expression between two different conditions. Each red point in the plot represents differentially expressed RNA with statistical significance. The vertical lines represent 2.0-fold up-and downregulation and the horizontal lines indicate a P value of 0.05. 11.0.1.1) was utilized to analyze acquired array images, and the GeneSpring GX v11.5.1 software package (Agilent) was applied to quantile normalization and subsequent data processing. LncRNAs and mRNAs from at least three out of six samples flagged as present or marginal ("All Targets Value") were chosen for further data analysis after quantile normalization of the raw data. Volcano plot filtering was applied to identify the differentially expressed lncRNAs and mRNAs statistically different between the two groups. The roles these differentially expressed mRNAs play in terms of biological pathways or gene ontology (GO) terms were determined through pathway analysis and GO analysis. Finally, the distinguishable lncRNA and mRNA expression patterns among samples were showed by hierarchical clustering. All of the microarray data were upload to the National Center for Biotechnology Information Gene Expression Omnibus (GEO) with a GEO accession number (GSE97210).
Ingenuity Pathway Analysis. Biological function enrichment, canonical pathway analysis, and upstream regulator analysis were identified applying the Ingenuity Pathway Analysis database (IPA, Ingenuity Systems Inc., Redwood City, CA).
Network analysis. GeneGo MetaCore data mining software was applied to construct gene networks. Enrichment analysis included mapping gene IDs of the data set onto gene IDs in entities of built-in functional ontologies represented in MetaCore by pathway maps and networks. IPA linked specific genes to a database of gene functions gleaned from the biomedical research literature and was used to generate additional network associations. More information about the two kinds of software was acquired from http://www.geneontology. org/ and https://www.ingenuity.com.
RNA isolation and real-time quantitative PCR analysis. Total RNA from tissues was extracted with TRIzol reagent (TaKaRa Bio, Inc., Statistical analysis. Data analyses were performed using Statistical Package for the Social Sciences (version13.0) software (SPSS, Inc, Chicago, IL). Data are presented as means Ϯ SD or medians (interquartile range) unless otherwise indicated. Results were analyzed by one-way analysis of variance followed by the Bonferroni test, or by an unpaired Student's t test when continuous variables were normally distributed, then multiple testing correction was adopted by Benjamini and Hochberg false discovery rate. When the data were not normally distributed, the Mann-Whitney U test was utilized to analyze continuous variables. The correlations between selected proteins levels and biochemical parameters were determined by Pearson or Spearman correlation analysis. After adjusting for sex, age, and known cardiovascular risk factors to estimate the magnitude of associations between the selected proteins and cardiovascular diseases, we employed logistic regressions to calculate the odds ratio (OR) and 95% confidence interval (CI). The receiver operating characteristic (ROC) curve was generated to determine the diagnostic performance of selected proteins. A two-tailed probability P value Ͻ0.05 was considered statistically significant.

RESULTS
Overview of lncRNA and mRNA expression profiles. In this research, we have characterized the lncRNA and mRNA profiles in three advanced atherosclerosis samples and in three normal intima tissues. A total of 30,586 lncRNAs and 26,109 coding transcripts were detected by microarray analysis. As shown in Fig.  1, volcano plot filtering found that 5,461 lncRNAs were upregulated and 6,115 lncRNAs were downregulated in advanced atherosclerotic plaques compared with normal arterial intima (fold change Ͼ2 and P Ͻ 0.05; Fig. 1A). On the other hand, 4,230 mRNAs were upregulated and 4,128 mRNAs were downregulated (Fig. 1B). Scatter plots were used to visualize the expression variation between the two groups and showed a Ͼ2.0-fold change in abundant lncRNAs and mRNAs ( Supplementary Fig. S1, A and B; all supplementary figures are available at https:// figshare.com/articles/Supplementary_Figures/9601130). Hierarchical clustering exhibits the relationships among lncRNAs as well as mRNA expression patterns present in the samples ( Supplementary Fig. S1, C and D). Based on the magnitude of change, we listed the top 10 upregulated and downregulated lncRNAs and mRNAs in Table 1 and Table 2.
GO and Kyoto Encyclopedia of Genes and Genomes pathway analysis of differentially-expressed mRNAs. GO analysis revealed the functions of differentially expressed mRNAs. It covers three domains: biological process, cellular component, and molecular function. We first investigated the upregulated mRNAs. In terms of biological process ( Fig. 2A), the most significant categories were immune response, immune system process, defense response, and related immune functions. The top three categories of the cellular component (Fig. 2B) were cell periphery, plasma membrane, and plasma membrane part. With regard to molecular function (Fig. 2C), the most enriched molecules were receptors, chemokines, and cytokines. The most downregulated mRNAs were involved in cellular component organization or biogenesis at the cellular level (biological process), intracellular part (cellular component), and binding (molecular function; Fig. 2, D-F).
Significant pathways of differentially expressed genes were compared with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to further specify and identify target mRNAs. Pathway analysis showed that 65 pathways corresponded to upregulated mRNAs and 52 pathways corresponded to downregulated mRNAs. The top 10 enriched pathways for upregulated and downregulated mRNAs are shown in Supplementary Fig. S2, A and B, respectively.
IPA and network build. After removing duplicates and no values, we selected a total of 17,173 lncRNAs and 18,972 mRNAs for further analysis. Without any filtering, these ln-cRNAs and mRNAs showed small correlations (Fig. 3A). We then focused on differentially expressed lncRNAs and mRNA. The criteria were as follows: absolute fold change Ͼ2.5 and P value Ͻ0.01. After correlation analysis, we selected a total of 346 mRNAs and 236 lncRNAs that showed high correlations to one another (ϩ1, positive correlation; Ϫ1, negative correlation; Fig. 3B), indicating the presence of enriched signaling between the biological genes and lncRNAs. Hierarchical clustering based on the selected 236 lncRNAs and 346 mRNAs is shown in Fig. 3C. Moreover, we selected an additional 142 mRNAs transcripts located near to the differentially expressed lncRNAs (within 100k bp) that may function as cis-regulated mRNAs. A total of 488 mRNAs constitute our gene set for function/pathway analysis by IPA. The results indicate that these mRNAs participated in various biological processes ( Supplementary Fig. S3), such as cellular movement, cancer, cardiovascular development, and cardiovascular disease, etc. Upstream regulators such as transcription factors, cytokines, and kinases play important roles in the development and progression of arteriosclerosis. More information about these key genes from the upstream factor list, as well as the networks they regulate, is provided in Supplementary Table S3.
We continued our analysis by building networks according to these 488 selected genes. There was evidence indicating connections among these molecules in 25 networks. The highest scoring gene networks were associated with cellular growth and proliferation, cancer, and respiratory disease (Supplementary Table S4). Adhesion and migration genes such as integrin are also involved in these networks. The top six networks are shown in Supplementary Fig. S4. Among them, network 4 was associated with cardiovascular disease, cardiovascular system development and function, and cellular development (Supplementary Fig. S4D).
Coexpression analysis of lncRNA and nearby protein-coding genes. An increasing number of studies indicate that lncRNA function can be inferred from its location and will probably affect the expression of nearby protein-coding genes (2,40). To demonstrate their interrelationships, a total of 197 lncRNAs and 180 mRNAs were chosen to perform the coexpression analysis ( Supplementary Fig. S5). This figure shows that a gene may be associated with multiple neighboring lncRNAs and vice versa. Detailed information about these probes is presented in Supplementary Table S5. The top 10 upregulated and downregulated lncRNAs in this analysis are also listed in Table 3. Additionally, we found 5,279 pairs of upregulated differentially expressed lincRNAs and nearby coding genes and 5,201 pairs of downregulated lincRNAs and nearby coding genes (distance Ͻ300 kb, fold change Ͼ2, P Ͻ 0.05). Based on the fold change of lincRNAs, we have listed the top 10 differentially expressed upregulated and downregulated lin-cRNAs (Supplementary Table S6). Moreover, some lncRNAs have a similar function as an enhancer and increase the expression of nearby protein-coding genes (12). We selected 731 pairs of upregulated and 635 pairs of downregulated enhancer lncRNAs with their nearby coding genes (distance Ͻ300 kb, fold change Ͼ2, P Ͻ 0.05). The top 10 differentially expressed enhancer-like lncRNAs and their nearby coding genes are shown in Supplementary Table S7.
Detection of human serum levels of selected proteins by ELISA. Next, we performed ELISA to investigate the corresponding protein levels of selected genes in serum samples from patients with CCHD, ACS, or HF and from healthy individuals. Table 4 shows the serum levels of these proteins and the levels of a number of biochemical measurements including TC, TG, HDL, LDL, ApoA-1, ApoB, LPa, HCY, Myo, cTnI, CK, CK-MBm, CK-MB, CRP, LDH, and AST. Furthermore, we evaluated the relationship between serum concentration of selected proteins and clinical biochemical indices of study subjects ( Table 5). The specificity and sensitivity of selected differentially expressed protein measurements were assessed by ROC curves as shown in Fig. 5. The serum levels of SPP1 in patients with HF and ACS were markedly  Continued higher than those in the healthy group. Moreover, SPP1 was negatively correlated with ApoA-1 in the healthy group and positively correlated with CRP in ACS patients (Table 5). When HF patients and ACS patients were separately compared with the healthy group, the area under the curve (AUC) for SPP1 was 0.837 (P ϭ 0.011) and 0.883 (P Ͻ 0.001), respectively. CD36 were significantly higher in patients with CCHD or ACS (P Ͻ 0.01). The expression level of CD36 was positively correlated with ApoB in HF patients, and its AUC was 0.784 (P ϭ 0.002) when healthy individuals and ACS patients were evaluated and 0.757 (P ϭ 0.002) when healthy individuals and CCHD patients were evaluated. In additional, four protein in patients with HF, ACS, or CCHD, which include ATP6VOD2, CHI3L1, MYH11, and brain-derived neurotrophic factor (BDNF), were significantly increased compared with the healthy group, respectively (all at P Ͻ 0.05; Table 4). Their AUC values are separately exhibited in Fig. 5, showing favorable diagnostic performance. Moreover, serum CHI3L1 levels were positively associated with HCY in the HF group and with Myo as well as CRP in ACS patients. The positive connections were found between MYH11 with LDL, ApoB, and CK in CCHD patients (all at P Ͻ 0.05). Additionally, BDNF was negatively correlated with CK-MB in the healthy group and with Myo in the CCHD. Several selected proteins were associated with cardiovascular disease in logistic regression models adjusted for age, sex, and known cardiovascular risk factors ( Table 6). The expression of CHI3L1, CD36, and ATP6V0D2 was significantly associated with the increased risk of ACS and CCHD (all at P Ͻ 0.05),while the MYH11 level was slightly connected with the risk of CCHD (OR 1.002, 95% CI 1.000 -1.005, P ϭ 0.039). On the other hand, the ACADL level was inversely associated with the risk of HF (OR 0.763, 95% CI 0.597-0.975, P ϭ 0.03; Table 6).

DISCUSSION
Atherosclerosis, the underlying cause of cardiovascular disease, represents the leading cause of morbidity and mortality throughout industrialized societies (3,17). Due to the complexity of its pathogenesis, effective diagnosis and therapeutic strategies have not yet been applied in routine clinical practice. We focused on plaques obtained from patients and investigated differentially expressed RNAs by microarray analysis and verified the expression levels of those RNAs and proteins that may play important roles in atherosclerosis and other acute cardiovascular events. In our study, we investigated differences in lncRNA and mRNA profiles between advanced atherosclerotic plaques and the normal arterial intima. Given that ln-cRNAs can play regulatory roles through interaction with target genes, we also systematically investigated differentially expressed mRNAs. Some of these genes have been verified as pivotal to the occurrence and development of atherosclerosis and other cardiovascular diseases, but most of them and their associated lncRNAs have not been well studied, and their potential functions remain unclear. Therefore, our study provides novel information about these specific lncRNAs and target genes, which may contribute to the pathogenesis of atherosclerosis and identifies potential therapeutic targets.
Numerous dysregulated lncRNAs and mRNAs in plaques tissues were detected by performing microarray profiling. We systematically investigated these differentially expressed mR-NAs in their function, pathway, and network. GO functionalenrichment analysis of genes showed that several differentially expressed mRNAs were related with cytokine and chemokine activity, which participates in all phases of atherosclerotic lesion development. Additionally, other molecules involved in vascular trauma including NF-B, alpha-catenin, vascular endothelial growth factor, and extracellular regulated protein kinases (ERK) were also activated. From function analysis, we speculate that the selected genes may be related to cardiovascular development and blood cell movement.
The pathway analysis shed light on which pathways were enriched in genes contributing to the development of advanced atherosclerotic plaques. They include the MAPK signaling pathway, Toll-like receptor signaling pathway, and NF-signaling pathway, which have previously been reported to be associated with atherosclerosis pathogenesis (25,28,32,37). Moreover, we undertook a network analysis to predict pathways that may regulate a panel of selected genes and observed high-scoring gene networks related to cellular growth and proliferation, cancer, respiratory disease, and others. Many of these networks are associated with cardiovascular anomalies such as lipid metabolism, cardiovascular system development, and function. The molecules in these networks may play important roles in regulating the selected genes. Many of them are consistent with regulators in the upstream regulator analysis such as ERK, P38, and SPP1. The network analysis provides a direction for further function and mechanism studies and indicates potential regulatory factors and downstream genes that may play a key role in atherosclerosis.
In our analysis, a number of lncRNAs and mRNAs were found to be related to atherosclerosis. We found that SPP1, which displayed the largest fold change in the microarray analysis, was differentially expressed in the serum of patients with HF or ACS when compared with healthy controls. Studies have shown that SPP1 is an integrin-binding ligand and Nlinked glycoprotein participating in atherosclerotic inflammation (35). Our results support a role for SPP1 in atherosclerosis. Additionally, we found that CD36, a scavenger receptor, was at higher levels in the serum of patients with ACS or CCHD than in normal subjects. It is widely accepted that CD36 mediates Data are presented as means Ϯ SD or medians (interquartile range) unless otherwise indicated. ACS, acute coronary syndrome; CCHD, chronic coronary heart disease; HF, heart failure. Compared with healthy group: *P Ͻ 0.05; †P Ͻ 0.01; ‡P Ͻ 0.001. foam cell formation and promotes inflammatory response and oxidative stress (29,39). Our results are consistent with a role for CD36 in promoting atherosclerosis. ATP6V0D2, which is required for osteoclast differentiation as well as the boneresorptive function of mature osteoclasts (5,20), was found to be expressed at higher levels in the serum of patients with HF, ACS, or CCHD than in the healthy group. CHI3L1, which has been widely investigated in cell proliferation, differentiation, apoptosis, angiogenesis, inflammation, and extracellular tissue remodeling (18), also had higher levels in patients with HF, ACS, or CCHD. It is also worth noting that serum levels of BDNF, which has gained strong interest for its relevance to the incidence of psychiatric syndromes and neurocognitive functioning (38), were lower in patients with CCHD, ACS, or HF than in healthy subjects. Furthermore, we found that BDNF protein levels were significantly correlated with the levels of CK-MBm and Myo, which have been widely used in the diagnosis of cardiovascular diseases. These ELISA results demonstrated that some of our selected proteins were differentially expressed not only in plaque tissues and normal intimae, but also in the serum of patients with different cardiovascular diseases and may, therefore, serve as biomarkers for the diagnosis and prognosis monitoring of diseases. In summary, our results support the conclusion that these proteins (i.e., SPP1, ATP6V0D2, CHI3L1, and BDNF) likely play important roles in the development of atherosclerosis-related diseases and further indicate that they are potential diagnostic and therapeutic biomarkers.
Our study has several limitations. First, the atherosclerotic plaques used in the microarray analysis were of grade VI, while the plaques used in the qRT-PCR validation analysis varied from grade IV to VI. Due to the complexity of plaque composition and tissue variability, the fold changes of the selected lncRNAs and mRNAs in the qRT-PCR analysis were not as large as those seen in the microarray experiment. It is likely that the expression levels of these genes vary depending on the grade of atherosclerotic plaque. Second, probably because of the insufficient sample size in our pilot study, some lncRNAs and mRNAs, which expressed significant differentially between the health group and the disease group by the chip analysis, did not show such significant differences in verification of qPCR and ELISA. So even though we expanded the sample size for qPCR validation and ELISA testing, the six tissue samples we used for pilot studies was slightly small. Third, the comparison of plaque/control sample has some limitation, as the samples come from whole tissue and do not provide information on which cell type has the highest expression; this may limit the application of the present study.

Conclusions
In conclusion, our study provides a distinct expression profile of lncRNAs and mRNAs in advanced atherosclerotic plaques compared with normal arterial intimae and identifies the pathways involved. Furthermore, our study indicates that SPP1, CD36, ATP6V0D2, CHI3L1, MYH11, and BDNF, which showed favorable diagnostic performance and high correlations with several clinical biochemical parameters, may potentially serve as biomarkers to diagnose and monitor the prognosis of atherosclerosis.