Differential circRNA expression profiles in latent human cytomegalovirus infection and validation using clinical samples.

Human cytomegalovirus (HCMV) is an opportunistic prototypic beta-herpesvirus that can cause severe and even fatal diseases in immune-naive newborns and immunocompromised adults. Host-virus interactions occurring at the transcriptional and posttranscriptional levels are critical for establishing an HCMV latent or lytic infection, but the mechanisms remain poorly understood. Herein, we investigated the expression of circRNAs in human leukemia monocytes (THP-1 cells) latently infected with HCMV and explored the diagnostic value of circRNAs in children with HCMV infection. A total of 2,110 and 1,912 circRNAs were identified in mock-infected and HCMV latent-infected THP-1 cells, respectively. Of these, we identified 1,421 differently expressed circRNAs, of which 650 were upregulated and 771 were downregulated. The host genes corresponding to the differentially expressed circRNAs were mainly involved in the regulation of host cell secretion pathways, cell cycle, and cell apoptosis. The differentially expressed circRNAs had binding sites for microRNAs, suggesting an important role in the mechanism of HCMV latent infection. Furthermore, a clinical analysis showed that the expression levels of hsa_circ_0001445 and hsa_circ_0001206 were statistically significantly different in HCMV-infected patients vs. normal controls, suggesting that these circRNAs could potentially serve as biomarkers of HCMV-infection.


INTRODUCTION
Human cytomegalovirus (HCMV) is an opportunistic prototypic beta-herpesvirus that is widespread in the human population and can lead to severe and even fatal diseases in immune-naive newborns and immunocompromised adults (12,32,45). Following the initial infection, HCMV is most likely to establish a lifelong latent infection and cause reactivated infection in immunodeficient hosts (13,31). Host-virus interactions occur at both the transcriptional and posttranscriptional levels and are critical for establishing either a latent or lytic HCMV infection, but the mechanism underlying this disparity remains poorly understood. HCMV is the only beta herpes virus known to encode microRNAs that are involved in the regulation of diverse cell processes such as cell proliferation, apoptosis, differentiation, and carcinogenesis. The microRNAs encoded by HCMV target and suppress different molecules in the human immune system by means of transcriptional and epigenetic regulation to achieve immune escape, so as to provide favorable conditions for HCMV latent infection (29,43,44). The effects of cell-encoded microRNAs on transcriptome changes during latent infection are currently unclear (34). Therefore, the transcriptional and epigenetic mechanisms of microRNAs in an HCMV latent infection need to be studied in detail.
CircRNAs are single-stranded transcripts that can contain multiple types of RNA in a covalently closed loop. In higher eukaryotes, the downstream 3=-splice site of the linear precursor mRNA is trans-spliced onto the upstream 5=-splice site to form the circRNA. CircRNAs were discovered 20 yr ago but were considered to be an uncommon byproduct of abnormal splicing. Following the development of RNA sequencing (RNA-Seq), a variety of algorithms for detecting genome-wide RNA expression have shown that eukaryotes generally express circRNAs (8,11,27,38,51). For example, it is estimated that Ͼ10% of the expressed genes can produce circRNAs in cells and tissues (8,11,14). Recent studies on the mechanism of cyclization suggest that reverse splicing cyclization is handled by the splice and is regulated by cis-and trans-modulators (5,8,15,19,42). Most reverse splicing that has been reported so far occurs at the outer boundary of annotated exons or at the position of the classical splice signal identified by the splice. The size of circRNA molecules can range from Ͻ100 nt to Ͼ4 kb (18), but the most common size in human cells appears to be in the hundreds of nucleotides (11,46). circRNAs can be formed by spliced introns or exons with the most common in humans being formed by two or three exons. Although several genes can produce dozens of different cyclic products, most genes with circular isoforms produce only one or two different circRNAs (10,37,46,47,51).
Although the functions of most circRNAs are unknown, studies have shown that they may have important effects on transcriptional and epigenetic regulation when expressed in tissue-and developmental-specific ways (3,24,25,27,38,55). Recent studies have reported that circRNAs harbor binding sites for the microRNAs, thereby acting as competing endog-enous RNAs to affect the ability of miRNAs to regulate gene expression. In addition, circRNA may also be associated with the development of many different human diseases. Recently, researchers have found abnormal expression of circRNA in cancer (21,33) nervous system disease (36,52), cardiovascular system disease (3), and autoimmune diseases (50). However, there are few studies on circRNA in infectious diseases such as those caused by viruses. In this study, we used a high-throughput sequencing (HTS) technology to obtain the expression profile of circRNA in HCMV latent-infected vs. mock-infected THP-1 cells and performed bioinformatics and a clinical analysis of the differentially expressed circRNAs to provide insights into the mechanism of HCMV infection.

MATERIALS AND METHODS
Cells culture and virus infections. THP-1 cells were cultured at 37°C under an atmosphere of 5% CO 2 with RPMI 1640 medium (GIBCO, San Jose, CA) supplemented with 10% (vol/vol) fetal bovine serum, 100 units/ml penicillin, and 100 g/ml Streptomycin (GIBCO). THP-1 cells were latently infected with HCMV as previously described (6). The mock-infected THP-1 cells not infected with HCMV are the control, but the other culture conditions are the same as with the infected THP-1 cells.
RNA extraction, Illumina library construction, and sequencing. Total RNA was extracted from triplicate cultures of THP-1 cells (either latent infected or mock-infected) with RNA-Solv Reagent following the RNA-Solv protocol. The quality of extracted total RNA was assessed with a Qubit HS RNA Assay Kit (Thermo Scientific, Carlsbad, CA), 1% agarose gel electrophoresis, and a 2100 Bioanalyzer RNA 6000 Nano Assay Kit (Agilent Technologies, Santa Clara, CA). Ribosomal RNA was removed with Ribo-Zero (Epicentre, Madison, WI), and linear RNA was removed with RNase R. Ribosomal and linear-depleted RNA from each of the two treatment groups was randomly fragmented into fragments with a size of 200 -500 nt. cDNA was generated by reverse-transcription using a TruSeq RNA Sample Prep Kit (V2-Set A; Illumina, San Diego, CA), and two libraries, one from each treatment group, were constructed by amplification with the TruSeq PE Cluster Kit v3-cBot-HS (Illumina). The libraries were subjected to double-ended sequencing by using Illumina HiSeq 4000 (Illumina), with a single-ended read length of 150 bp.
circRNA detection pipeline. The main steps of the circRNA detection pipeline are briefly described as follows. First, the original sequencing data, transformed by base identification (base calling) analysis from the HTS of the original image data files, were subjected to a quality control analysis to obtain clean sequence data. The clean sequence data were mapped to the human genome by the Bowtie2 comparison tool as previously reported (9,17), and unmapped reads were extracted from the comparison file. The unmapped reads were analyzed using "find_circ_enhance" based on the "find_circ" (27). Anchor sequences (20 nt) at the two ends of the extraction were denoted as the 5=-anchor and 3=-anchor, respectively, and each pair of anchor sequences was again mapped to the reference sequence. If the two anchor sequences could be mapped to the reference genome in a chiastic manner (i.e., the 5=-anchor was on the right end, and the 3=-anchor was on the left) and the complete reads of the anchor could find the splice site at the match position and were perfectly matched, the complete read, with a read count of at least two, was detected as a candidate circRNA. Finally, to ensure the reliability of the results, multiple screening conditions were used to obtain the final circRNAs. The circRNA filtration conditions used were as follows: 1) numReads: the total reads of the detected circRNA should be at least 2; 2) numUniq: the specifically matched reads of the detected circRNA should be at least 2; 3) bestQual (A/B): the optimal matched quality of the 5= anchor or the 3= anchor of circRNA should be at least 35; 4) edits: edit distance should be less than or equal to 2; 5) anchor_overlap: the number of repeats of the anchor in the circRNA should be less than or equal to 2; 6) bPnums: the number of breakpoints in circRNA need to be less than or equal to 1; 7) the length of the circRNA genome should be shorter than 1,000,000 bp and longer than 100 bp; and 8) for the circRNAs located at positions within 20 bp, only retain the circRNA having the larger unique read.
Validation of circRNA using Sanger sequencing. Total RNA from each sample was reverse transcribed to cDNA using a HiScript 1st Strand cDNA Synthesis Kit (Vazyme Biotech, Nanjing, Jiangsu, China), according to the manufacturer's protocols. Divergent primers were designed using an NCBI primer design tool (data not shown). Amplification of the cDNA was performed using a 2ϫ Phanta Master Mix (Vazyme Biotech). The PCR products were subjected to electrophoresis on a 2% agarose gel. The PCR products were purified with a TaKaRa MiniBEST Agarose Gel DNA Extraction Kit Ver. 4.0 (TaKaRa Bio, Kusatsu, Shiga, Japan) and then cloned using pEASY-Blunt Zero Cloning kit (TransGen Biotech, Beijing, China) with Trans1-T1 Phage Resistant Chemically Competent Cells (TransGen Biotech). Sequencing of the cloned product was performed by Sanger sequencing.
Identification of differentially expressed circRNAs. Gene expression was assessed by the TPM method (transcripts per million) (56). The steps used for the TPM calculation were as follows: 1) Divide the read counts by the length of each gene in kilobases to obtain the reads per kilobase (RPK); 2) Count up all the RPK values in a sample and divide this number by 1,000,000 to obtain the "per million" scaling factor; and 3) Divide the RPK values by the "per million" scaling factor to get TPM (data not shown). The normalized expression level of each circRNA was further analyzed with hierarchical clustering.
Validation of differential expression with qPCR. Validation of the differential expression of circRNAs identified by RNA-Seq analysis was performed with quantitative real-time PCR (qPCR) in 11 randomly selected RNA samples. Divergent primers were designed with the NCBI primer design tool (data not shown). The housekeeping gene glyceraldehyde 3-phosphate dehydrogenase was used as the control. To ensure the accuracy of the results, we verified the target specificity of the PCR primers with BLAST (https://blast. ncbi.nlm.nih.gov/Blast.cgi). After reverse transcribing total RNA extracted using RNA-Solv Reagent (Omega, Radnor, PA) to produce cDNA, we performed an amplification using Takara SYBR Fast qPCR Mix (TaKaRa Bio) on a Bio-Rad CFX96 (Bio-Rad, Des Plaines, IL) with three replicates each. The reaction conditions were as follows: 95°C for 3 min, followed by 35 cycles of 95°C for 5 s, 60°C for 30 s. The appearance of a single peak in the melting curve indicated the specificity of the PCR results. The data were analyzed by Ϫ⌬⌬CT.
Function prediction for the differentially expressed circRNAs. Gene Ontology (GO, http://www.geneontology.org/) and KEGG (Kyoto Encyclopedia of Genes and Genome) analyses were carried to predict the function of the host genes corresponding to the differentially expressed circRNAs (top 300) in the HCMV latent-infected vs. mock-infected THP-1 cells. A P value denotes the significance of the GO term and pathway that correlated to the conditions, and a P value Ͻ0.05 was considered significantly different. Finally, microRNA response elements as regulators of mRNAs were predicted by a combination of TargetScan and miRanda.
Patient sample collection and qPCR. On the basis of the validation of differential expression of the circRNAs by qPCR, we chose hsa_circ_0001445 and hsa_circ_0001206 for further analysis by qPCR in patient samples. A total of 64 whole blood samples were collected from the participants (including 30 HCMV-infected patients and 34 normal controls) (data not shown) with EDTA-coated Vacutainers and stored on ice until used for RNA preparation. This study was conducted with the approval of the institutional ethics committee of the Second Hospital affiliated to Wenzhou Medical University. The parents or legitimate guardians of all pediatric subjects had signed an informed consent agreement to participate in the study. Total RNA was then isolated from whole blood samples using 1 ml RNA-Solv Reagent to extract the RNA from 100 l of blood. Samples were homogenized by vortexing for 30 s; after being incubated at room temperature for 15 min they were then centrifuged for 10 min (room temperature, 12,000 g); the supernatant was transferred to a new EP tube (typically 1 ml), and 200 l of chloroform was added. The sample tubes were capped securely and vortexed vigorously for 15 s and then incubated on ice for 10 min. After centrifugation at 12,000 g for 15 min at 4°C, the aqueous phase was collected in a new tube (typically 500 l). RNA was precipitated by addition of 800 l of cold isopropanol per 1 ml of RNA-Solv Reagent used for the initial homogenization and incubated for 1 h at Ϫ80°C. RNA pellets were recovered by centrifuging at 12,000 g for 10 min at 4°C. RNA pellets were washed with 1 ml of 80% ethanol and subsequently air-dried at room temperature for 5 min. The RNA was resuspended in 20 l RNase-free water, which was incubated at 60°C for 5 min. Quantification and quality evaluation of the RNA were performed with NanoDrop 2000 (Thermo Scientific). After reverse-transcription to cDNA, PCR amplification of the cDNA was conducted with Takara SYBR Fast qPCR Mix on a Bio-Rad CFX96 with three replicates each. The data were analyzed by the Ϫ⌬⌬CT method.
Statistical analysis. A Pearson correlation analysis between the data obtained from the qPCR and RNA-Seq analyses was performed with SPSS version 23.0 software. The Ϫ⌬⌬CT results from qPCR data comparing patients and healthy controls were tested by an unpaired Student's t-test and visualized with GraphPad Prism5.0. The diagnostic value of circRNAs was analyzed by receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) with SPSS version 23.0 software. P values Ͻ 0.05 (two-tailed) were considered as statistically significant.

Differential expression of circRNAs in HCMV latent-infected vs. mock-infected THP-1 cells.
To obtain a comprehensive profile of differentially expressed circRNAs in HCMV latent-infected vs. mock-infected THP-1 cells, we performed an RNA-Seq analysis. As a result, we identified 1,912 and 2,110 specific candidate circRNAs, respectively (Fig. 1A). Between the two sets of THP-1 cells, 1,559 (63%) circRNAs were found to be in common, 353 (15%) circRNAs were specifically expressed in HCMV latent-infected THP-1 cells, whereas 551 (22%) circRNAs were specifically expressed in mock-infected THP-1 cells. The genomic coordinates and additional information such as read count and annotation are available at circbase.org (35). We then experimentally validated 17 of these circRNAs. The end-joining regions that were unique to these circRNAs were amplified by reverse transcription and PCR using divergent primers, followed by standard Sanger sequencing (data not shown). All the circRNAs we sequenced had a reversed order in the genome as expected, indicating that they were head-to-tail splicing and were indeed circular. Four of them were formed by a single exon. Moreover, one gene may have produced several types of circRNAs containing different exons, such as hsa_circ_0000284 and hsa_circ_0008887.
We next compared the circRNA expression levels in HCMV latent-infected vs. mock-infected THP-1 cells. There were 303 upregulated circRNAs and 227 downregulated circRNAs that had a Ͼ2-fold change (FC) in an expression level comparing the latent-infected vs. mockinfected THP-1 cells (data not shown). The FC values for the circRNA expression in latent-infected vs. mock-infected THP-1 cells group ranged from 10.73 to 0.07. A scatter plot visually shows the variation in circRNA expression in latent-infected vs. mock-infected THP-1 cells group (Fig.  1B). To validate the differential expression of these cir-cRNAs identified by RNA-Seq, we randomly selected 11 circRNAs (four upregulated and seven downregulated cir-cRNAs) for qPCR validation. We found that there was a significant correlation between the qPCR data and the RNA-Seq data (Pearson correlation coefficient ϭ 0.88; P Ͻ 0.001) (Fig. 1C). The high correlation between two techniques demonstrates the accuracy of the RNA-Seq data.
Assignment of GO terms and KEGG pathways. On the basis of sequence homology, we assigned the host genes corresponding to the top 300 differentially expressed circRNAs in HCMV latent-infected vs. mock-infected THP-1 cells GO terms. The GO terms were classified as "biological processes" (n ϭ 170, P Ͻ 0.05), "cellular components" (n ϭ 55, P Ͻ 0.05), or "molecular functions" (n ϭ 63, P Ͻ 0.05), respectively (data not shown). The results of a GO analysis showed that "biological process" mainly included endoplasmic reticulum (ER) stress response, protein phosphorylation, kinase activity regulation, vesicle transport, and ubiquitin, whereas cellular component mainly included the presence of "vesicles," COPI clothing, cytoplasmic ribonucleoprotein particles, Golgi-related vesicles, transport vesicles, intermediates, and spindles, and molecular function mainly included enzyme activity, ubiquitin transferase activity, DNA-dependent ATPase activity, nuclear transport activity, tRNA binding, GTPase binding, protein threonine kinase activity, and nuclease activity (Fig. 2,  A-C). We then performed a pathway analysis and found that the host genes corresponding to the top 300 differentially expressed circRNAs in HCMV latent-infected vs. mock-infected THP-1 cells could be assigned to 18 pathways (P Ͻ 0.05) (data not shown). The most enriched KEGG pathways were ubiquitin-mediated proteolysis, protein processing in ER, Fc gamma R-mediated phagocytosis, and HTLV-I infection (Fig. 2D).
MicroRNAs target prediction for circRNAs. As circRNAs likely serve as microRNA sponges through microRNA response elements (MREs), we analyzed putative MREs for the circRNAs identified in these two groups. We found that 1,119 and 1,130 circRNAs had potential target sites for one or more microRNAs in mock-infected and latent-infected THP-1 cells, respectively, of which most circRNAs had only one to three putative MREs, with a maximum of 98 (data not shown). The predicted MREs for the top 10 upregulated and top 10 downregulated circRNAs in latent-infected vs. mock-infected THP-1 cells are listed in Tables 1 and 2. Three of the top 10 upregulated circRNAs had putative MREs, with hsa_circ_ 0007648 having the most potential target sites with five microRNAs (i.e., hsa-miR-4716-3p, hsa-miR-1200, hsa-miR-4725-5p, hsa-miR-1233-3p, and hsa-miR-324-3p). In addition, six of the top 10 downregulated circRNAs had putative MREs, with hsa_circ_0007637 having the most potential target sites with three microRNAs (i.e., hsa-miR-650, hsa-miR-4755-3p, and hsa-miR-3154). Similarly, the same microRNA could bind to multiple circRNAs, and these include the top 10 microRNAs differently expressed in mock-infected vs. latent-infected THP-1 cells we had detected previously (38), which are listed in Table 3. Potential diagnostic values of hsa_circ_0001445 and hsa_circ_0001206 in HCMV infection. A quantitative analysis of hsa_circ_0001445 and hsa_circ_0001206 in the two groups (HCMV-infected patients and normal controls) was conducted by qPCR, as well as validation by Sanger sequencing (Fig. 3, A-D). The expression of hsa_circ_0001445 and hsa_circ_0001206 was downregulated during HCMV infection, and both of these differences were statistically significant (P Ͻ 0.001). The accuracy of the test was evaluated by calculating the area under the ROC curve (AUC) (Fig. 3E). The values were 0.779 (95% CI 0.660 -0.899) and 0.840 (95% CI 0.741-0.939), respectively.

DISCUSSION
HCMV has evolved multiple strategies to establish latent infection in host cells by controlling the expression of immediate early genes, which involves various viral and cellular factors (35,39,41). However, the mechanisms underlying this infection type are not clearly understood. A previous study has shown that the expression of cellular genes is rapidly altered during an HCMV latent infection, including the expression of mRNAs and long noncoding RNAs (lncRNAs) (53). Numer-ous other studies have also focused on microRNAs, such as hcmv-miR-UL112-1 (7, 29) and cellular microRNAs (1,39). However, the occurrence of circRNAs in HCMV infection has not been examined. In this study, we profiled the expression of circRNAs in HCMV latent-infected THP-1 cells in comparison with mock-infected THP-1 cells. A number of differentially expressed circRNAs were revealed by HTS, suggesting that circRNAs might play a significant role in the mechanism of HCMV latent infection.
CircRNAs have been demonstrated across species. Altered expression of circRNAs has been reported in several diseases including cancer, heart disease, and neurological disorders, although the underlying mechanisms are yet to be disclosed. Analysis of circRNAs in esophageal carcinoma tissue has revealed that circRNA ITCH expression in tumor tissue is significantly lower than in peritumoral tissue (20). Similarly, it has been reported that the ratio of circular to linear RNA isoforms is significantly lower in colon tumor compared with normal colon samples and even lower in colorectal cancer cell lines (2). In this study, we found that the expression of circRNAs differed between HCMV latent-infected and mockinfected THP-1 cells. Through a KEGG analysis of the differentially expressed circRNA host genes, we identified 18 pathways that were enriched in the top 300 differentially expressed circRNA host genes (Figs. 1 and 2). We found that the host genes corresponding to the differentially expressed circRNAs were mainly involved in the regulation of host cell secretion pathways, cell cycle, and cell apoptosis. These results suggest that differentially expressed circRNAs likely participate in the regulation of these cellular processes. Since the altered expression of circRNAs was related to HCMV infection, we speculate that circRNAs interact with HCMV DNA replication and latent infection. Our findings may provide a new avenue to explore the pathogenesis of HCMV infection. Column "MicroRNA Name" represents the theoretical microRNAs response to the circRNAs.  CircRNAs act as "microRNA sponges" to competitively modulate expression levels of microRNAs, thereby inhibiting microRNA function. By virtue of their great stability and elaborate regulatory mechanisms of gene expression, circRNAs play important roles in certain physiological and pathogenic activities. However, there have been few studies on the link to viral diseases. In this study, we found that most of the differentially expressed circRNAs, identified in HCMV latent-infected THP-1 cells compared with mock-infected THP-1 cells, had potential target sites for microRNAs. We found that the expressions of hsa_circ_ 0062984, hsa_circ_0001119, hsa_circ_0001118, and hsa_circ_ 0041539 were downregulated, while the expressions of hsa_ circ_0002857 and hsa_circ_0008832 were upregulated in latentinfected THP-1 cells (Tables 1-3). Although the target microR-NAs of these circRNAs have not been identified yet, we are interested in studying the interaction of hsa-miR-21 with these circRNAs, as hsa-miR-21 is an important intrinsic antiviral factor that is regulated by HCMV infection (16). Studies have reported that miR-21 is depressed in HCMV-infected NPCs (26) or fibroblasts (48) and that miR-21 plays an important role in regulating the cell cycle, controlling tumor growth, and reducing apoptosis (16). Therefore, altered expressions of cir-cRNAs in HCMV-infected cells may exert their biological functions through interacting with cell-encoded microRNAs such as hsa-miR-21.
According to current research, the latent infection-related genes such as UL138, LUNA, and US28 were less expressed in HCMV latent-infected individuals in clinics that lack reliable detection methods. Therefore, the detection of host biomarkers like circRNA becomes an alternate way to identify latent infection effectively. Unlike linear RNAs, circRNAs are usually stable, abundant, conserved RNA molecules and often exhibit tissue/developmental-stage specific expression (40,49). Thus, circRNAs have distinct advantages as new biomarkers for predicting prognostic and therapeutic responses over other noncoding RNA molecules such as lncRNAs and microRNAs. To this end, we collected blood samples from HCMV-infected patients and normal subjects to assess cir-cRNA profiles and analyze the relationship between circRNAs and HCMV infection. Our qPCR results revealed that hsa_ circ_0001445 and hsa_circ_0001206 were dysregulated in HCMV infection as they were significantly downregulated in HCMV-infected individuals compared with normal controls (Fig. 3). In this study, we used qPCR to carry out the semiquantitative detection of circRNA as qPCR has better specificity and stability than ordinary PCR. And the accuracy of the test was evaluated by calculating the AUC in which the values were 0.779 (95% CI 0.660 -0.899) and 0.840 (95% CI 0.741-0.939), respectively. Thus, our results suggest that altered hsa_circ_0001445 and hsa_circ_0001206 detected through qPCR may be potentially used as biomarkers for the diagnosis or prognosis of HCMV infection. Previous studies have evaluated circRNAs as potential biomarkers for gastric cancer (22), atherosclerosis (3), and major depressive disorder (4). Studies have also indicated that the hsa_circ_0001445 is downregulated in hepatocellular carcinoma (HCC) tissues and regulates HCC development, so that it could serve as a potential diagnostic biomarker for HCC (54). However, current research on hsa_circ_0001445 and hsa_circ_0001206 in association with HCMV infection is still insufficient. To further understand the biological role of circRNAs in HCMV infection, both in vitro and in vivo assays should be performed in the future after successfully engineering these circRNAs. In summary, our data demonstrate that the expression of circRNAs was altered during HCMV latent infection. The differentially expressed circRNAs might be involved in the mechanisms of immune escape during HCMV latent infection and could act as microRNA sponges to regulate the expression of microRNAs, thereby affecting HCMV infection. In addition, the clinical analysis suggests that circRNAs could be of great significance in the diagnosis or prognosis of HCMV infection.