Genomic and “Polyomic” studies of Cardiovascular and Inﬂammatory Diseases Differentially expressed genes for atrial ﬁbrillation identiﬁed by RNA sequencing from paired human left and right atrial appendages

Differentially expressed genes for atrial ﬁbrillation identiﬁed by RNA sequencing from paired human left and right atrial appendages. Physiol 51: 323–332, ﬁbrillation is a signiﬁcant worldwide contributor to cardiovascular morbidity and mortality. Few studies have investigated the differences in gene expression between the left and right atrial appendages, leaving their characterization largely unexplored. In this study, differential gene expression was investigated in atrial ﬁbrillation and sinus rhythm using left and right atrial appendages from the same patients. RNA sequencing was performed on the left and right atrial appendages from ﬁve sinus rhythm (SR) control patients and ﬁve permanent AF case patients. Differential gene expression in both the left and right atrial appendages was analyzed using the Bioconductor package edgeR. A selection of differentially expressed genes, with relevance to atrial ﬁbrillation, were further validated using quantitative RT-PCR. The distribution of the samples assessed through principal component analysis showed distinct grouping between left and right atrial appendages and between SR controls and AF cases. Overall 157 differentially expressed genes were identiﬁed to be downregulated and 90 genes upregulated in AF. Pathway enrichment analysis indicated a greater involvement of left atrial genes in the Wnt signaling pathway whereas right atrial genes were involved in clathrin-coated vesicle and collagen formation. The differing expression of genes in both left and right atrial appendages indicate that there are different mechanisms for development, support and remodeling of AF within the left and right atria.


INTRODUCTION
Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia worldwide and a significant contributor to cardiovascular morbidity and mortality. Increased incidence of AF is associated with increasing age and male sex as well as other risk factors, such as hypertension, diabetes, obesity, and smoking, that act to induce structural and histopathological changes to the atrium subsequently increasing susceptibility to AF (39). AF is characterized by high-frequency excitation of the atrium resulting in both dys-synchronous atrial contraction and irregular ventricular excitation (39). Initiation of AF is likely triggered via abnormal calcium handling that causes spontaneous myocyte depolarization in and around the pulmonary veins. Re-entry and ectopic impulse formation then promote the stabilization of AF (23,39) with the left atria playing a predominant role in AF initiation and maintenance (23). Subsequent changes in atrial structure and function due to remodeling either by altered ion channel expression and/or fibrosis increase re-entry and vulnerability to AF (23,39).
The genetics of AF is complex with both rare and common variants that increase susceptibility to AF being described (9). Familial AF has been linked to various mutations in K ϩ and Na ϩ channel genes and their accessory proteins; for example, mutations in KCNJ2, KCNE1, and SCN5A have been identified (9,39). More recently, genome-wide association studies (GWAS) have identified around 200 loci, which may also be involved in AF susceptibility (10,14,19,30,31,34,38,39). For example, the PITX2 locus on chromosome 4q25 has been associated with AF. The PITX2 gene encodes the paired-like homeodomain transcription factor that is one of the isoforms expressed in the heart and may play a role in the transcription regulation of atrial regulation genes (9). Other examples are ZFHX3 and NEURL; decrease in ZFHX3 expression decreases the levels of Ca 2ϩ and shortens the action potential duration (39). The knockdown of NEURL, an E3 ubiquitin ligase, ortholog in zebrafish led to significant prolongation of the atrial action potential (38). Other loci associated with AF include HCN4, KCNN3, and PRRX1 (9, 10, 14, 38,39) Previous studies have used either left atrial appendage (LAA) (12, 21,35,41) or right atrial appendage (RAA) (3,8,15) samples to study gene expression and microRNAs differences in atrial fibrillation. There are limited studies using both the left and right atrial appendages (11,22). In this study, we use RNA sequencing (RNA-Seq) to assess differential gene expression in AF using paired left and right atrial appendage samples. Understanding any differences in expression may help to further define the pathophysiology of AF.

MATERIALS AND METHODS
Patient selection. Patients undergoing coronary artery bypass grafting and/or atrial/mitral valve repair or replacement at Barts Heart Centre, Barts Health National Health Service (NHS) Trust gave informed consent for samples of their left and right atrial appendages to be taken during surgery. Initially the samples were collected as part of the Inflammation in Atrial Fibrillation Study at Barts Health NHS Trust [approved by the East London Research Ethics Committee (10/H0704/43)] and latterly in collaboration with the National Institutes for Health Research (NIHR) Barts Biomedical Research Centre Cardiovascular Bio-Registry (approved by the East of England-Cambridge Central Research Ethics Committee (14/EE/0007), https:// www.qmul.ac.uk/whri/research/core-facilities/nihr-bioinformaticsand-bio-repository/). All patients were nondiabetic males with a heart rhythm classified as permanent AF or a normal regular heart rhythm, sinus rhythm (SR). Patients in SR who developed postoperative AF were excluded from the study. Samples were immediately immersed in RNAlater to preserve RNA quality. From the samples obtained, two groups were formed. The first group comprised left and right atrial appendage from five SR control patients and five AF patients (Table  1, top). This group was used for RNA-Seq and quantitative RT-PCR validation. A second group that comprised left and right atrial appendage from six SR control patients and seven AF patients (Table 1, bottom) was used for independent replication by quantitative RT-PCR. Due to the difficulty in obtaining left atrial appendage from some AF patients, this independent validation group was formed of four left atrial appendage and seven right atrial appendage samples.
Isolation of RNA from human left and right atrial appendage. RNA was isolated from 30 mg of human left and right atrial appendages, with a RNeasy mini tissue kit (Qiagen). RNA was also subjected to DNase treatment. The resulting concentration of RNA was determined by NanoDrop 1000 or Agilent Bioanalyzer before downstream applications. Only RNA of suitable quality (i.e., RNA integrity number Ͼ 8, rRNA ratio (28s/18s) Ͼ2, concentration Ͼ 50 nl/l) was used for RNA sequencing.
RNA sequencing and differential gene expression analysis of group 1 samples. Libraries were generated with TruSeq Stranded Total RNA library prep with RiboZero (Illumina). RNA sequencing was performed at the Queen Mary University of London Genome Centre (http://www.smd.qmul.ac.uk/gc/) with Illumina NextSeq500 with Ͼ100 million short reads generated per sample. The quality of the sequencing reads (fastq files) was assessed by FastQC (version 0.11.2). The reads were trimmed for adaptor sequences and poor quality reads with Trim Galore (version 0.3.7). Two trimming phases were applied, the first to remove adaptors and the second to remove poly G sequences. The quality of the trimmed sequences was reassessed with FastQC (version 0.11.2). After satisfactory quality control, trimmed sequences were aligned to the coding regions of the human reference genome (GRCh37) with TopHat2 (version 2.0.13) and bowtie2 (version 2.2.3). Transcript abundance was then calculated by HTSeq-counts software (version 0.6.0) (2). Unadjusted transcript abundance was then exported to the R environment (version 3.1.2) for exploratory data analysis and differential expression analyses. The principal component analysis (PCA) and distance between samples from DESeq2 (version 1.6.3) were used to assess the dispersion and categorization of samples. Differential expression analysis was investigated with edgeR (version_3.8.6) (28). Genes with low counts and expressed in only one sample per category were removed from further analysis (n ϭ 14,823 investigated genes). The calcNormFactors function was used to calculate the normalization factors to account for library sizes. Samples were then investigated for differences either between SR and AF or differences between the left and right atrium. Dispersion was calculated by using the functions estimateCommon-Disp and estimateTagwiseDisp. The exact test was applied (exactTest) to obtain genes differentially expressed between SR and AF samples for either LAA or RAA. The data has been deposited in the National Center for Biotechnology Information's Gene Expression Omnibus (GEO) (13) and are accessible through GEO Series accession number GSE128188 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?accϭ GSE128188).
Downstream and enrichment analyses. Downstream analyses were performed on the differentially expressed genes with a false discovery rate (FDR) of 5% or less. We investigated transcription, pathway, and cell type enrichment with EnrichR (http://amp.pharm.mssm.edu/ Enrichr) (5,25). Furthermore, we have queried the GWAS catalog and the Stanford Global Biobank Engine to identify variants associated to AF that were also differentially expressed in this RNA-Seq study.

RESULTS
Comparison of left and right atrial appendage RNA sequence data across AF cases and SR controls. The patient characteristics of group 1 are described in Table 1, top. Differential gene expression analysis using edgeR was performed for AF cases and SR controls. PCA analysis and inspection of the first two principal components illustrate the presence of four groups of samples; corresponding to SR-LAA, SR-RAA, AF-LAA, and AF-RAA (Fig. 1). The PCA plot shows samples from the AF cases are clustered on the top region of the plot and differentiating between left and right atrial appendage, indicating a similarity between AF samples but at the same time a distinction between the two atrial appendages (left and right). The control group, SR-LAA, generally clustered together. However, the SR-RAA samples showed more variability. A similar pattern is shown in the heat map (Supplementary Fig. S1; for all supplementary material see https://figshare. com/s/8483d57e6e79fe95e3c9). Typically, similar biological samples would be expected to cluster together as we see for the AF samples and generally for the SR samples. Further observation of the sample variability can also be visualized by MD plots and volcano plots (Supplementary Figs. S2 and S3). Using a 5% FDR threshold, we found 128 genes to be differentially expressed within the LAA between the SR controls and the AF cases (Supplementary Table S1). Of these 52 genes were upregulated and 76 were downregulated in AF. In the RAA, 173 genes were differentially expressed with a 5% FDR threshold between the SR controls and the AF cases (Supplementary Table S2). Of these, 49 genes were upregulated and 124 genes were downregulated in AF. Overall from both LAA and RAA, 157 genes were found to be downregulated in AF within the 5% FDR threshold. Of these, 27% (43 genes) of these were downregulated in both AF-LAA and AF-RAA. In the AF-LAA, 21% (33 genes) were found to be downregulated, whereas a greater proportion of genes, 52% (81 genes), were found to be downregulated in the AF-RAA. A smaller number of upregulated genes were identified (90 genes) to be within the 5% FDR threshold, and only 12% (11 genes) of these were upregulated in both AF-LAA and AF-RAA. The proportion of genes that were upregulated in either AF-LAA or AF-RAA, 46% (41 genes) vs. 42% (38 genes) respectively was more equally spread (Fig. 2). The gene expression hierarchical clustering of differentially expressed genes [FDR Յ5%, log fold change (LogFC) Ն1] from LAA and RAA is shown in Supplementary Figs. S4 and S5 respectively.
Validation by quantitative RT-PCR of select differentially expressed genes. Down-and upregulated genes with relevance to AF differentially expressed below the 5% FDR threshold and with a logFC Ն 1 ( Table 2) were selected for further validation by quantitative RT-PCR in group 1 samples. In brief, downregulated genes in AF in both LAA and RAA (MT1X, AGTR1, and PPP1R1A), LAA (BMP7 and CACNA1G), and RAA (GPR183, GPR83, GREM1, and KCNK2) were selected. Upregulated genes in AF in LAA (MYH7, GRIA1, and KCNJ2) and RAA (ATP1B4, COMP, and STYL5) were also selected. The results of the quantitative RT-PCR generally follow the differential expression observed with the RNA-Seq data (Fig. 3, Table 2). The expression of AGTR1, MT1X ,and PPP1R1A in both AF-LAA and AF-RAA was significantly decreased as was CACNA1G gene expression in AF-LAA and GPR183 and GPR83 gene expression in AF-RAA (Fig. 3, A and C). This was consistent with a significant decrease in the expression of these genes in the edgeR differential expression analysis between the LAA of SR and AF and the RAA of SR and AF. The gene expression of MYH7 in AF-LAA and COMP, STYL5, and ATP1B4 in the AF-RAA was significantly increased, consistent with the increase in gene expression shown by edgeR differential expression analysis between the LAA of SR and AF and the RAA of SR and AF (Fig. 3, B and D). However, there were some exceptions. BMP7 showed upregulation and GRIA1 and KCNJ2 downregulation in AF-LAA in the edgeR differential expression analysis (Table 2), whereas quantitative RT-PCR did not detect any significant changes in expression (Fig.  3, A and B). Similarly the edgeR differential expression analysis indicated that KCNK2 and GREM1 would be significantly downregulated in AF-RAA; however, this was not shown by quantitative RT-PCR (Fig. 3C). Using the counts from edgeR analysis, we find a high level of variation within each SR and AF LAA and RAA set for these genes. Reads per kilobase million (RPKM) values from the GTEx database (18a) indicate that these genes also have relatively low levels of expression within the RAA. Therefore, less sensitive methods of detection such as quantitative RT-PCR may not be able to detect these small levels and their subsequent changes. Interestingly, MYH7 was significantly increased, and GPR83 and CACNA1G were found to be significantly decreased across both AF-LAA and AF-RAA (Fig. 3, A, C, and D). Independent replication. To confirm differential expression of selected candidate genes, we used quantitative RT-PCR to test samples from an independent cohort (group 2, Table 1, bottom). KCNK2, GREM1, and SYTL5 were removed from the candidate genes set due to low levels of expression in the validation cohort and high variability observed in the count data. The expression pattern (Fig. 4) was found to be the same as group 1 for the following genes: MT1X, PPP1R1A, AGTR1, CACNA1G, GPR83, MYH7, COMP, and ATP1B4. However, in group 2, a significant increase in GRIA1 was observed in AF-LAA, while BMP7 was significantly decreased in both AF-LAA and AF-RAA. Expression of both GRIA1 and BMP7 was not observed in the validation group, group 1. However the increased GRIA1 gene expression and decreased BMP7 gene expression observed within this independent group, group 2, are consistent with the edgeR differential expression analysis data. No decrease in expression was observed for GPR183 in AF-RAA in this group. This may be accounted for by varying mRNA levels of these genes in different patient samples.
Enrichment analysis. Enrichment analyses were conducted on the genes differentially expressed at a 5% FDR between SR and AF samples for the left and right atrial appendage. Supplementary Table S3 presents the enrichment results with an adjusted P value Յ0.10 for Reactome, KEGG, gene ontology, and MGI mammalian phenotype. The enriched ontologies and pathways differed greatly between appendages, with only five categories shared between the left and the right atrial appendage. These five categories consisted of the neuronal system; the transmission across chemical synapses; the WNT5A-dependent internalization of FZD2, FZD5, and ROR2 Reactome pathways; the integral component of plasma membrane gene ontology term; and the hyperactivity term in the mammalian phenotype models. The differentially expressed genes in the LAA showed enrichment on the circadian entrainment and nicotine addiction pathways and the coreceptor activity involved in the Wnt signaling pathways and ionotropic glutamate receptor gene ontology terms (Fig. 5A). The RAA showed enrichment in the extracellular matrix organization and collagen formation and collagen degradation pathways and the clathrin-coated endocytic vesicle membrane gene ontology term (Fig. 5B) overall enrichment results, there is much variability between the enrichment observed in the KEGG results for the left and right atrial appendages. In addition, to highlight pathways specific to each atrial appendage, genes that are only differentially expressed between SR and AF in each specific atrial appendage were investigated (i.e., genes only observed to be differ-entially expressed in the RAA, or only differentially expressed in the LAA). The results were consistent with the overall analysis, where a range of extracellular matrix pathways are enriched in the RAA, including collagen formation and degradation and clathrincoated endocytic vesicle membrane gene ontology terms. In the LAA, enrichment in the circadian entrainment and nicotine ad-  diction pathways and in the coreceptor activity involved in the Wnt signaling pathway, transforming growth factor-␤ (TGF-␤) receptor binding, and BMP receptor binding gene ontology terms are observed (Supplementary Table S5).
Furthermore, we queried the GWAS catalog and the Stanford Global Biobank Engine for variants reported to be associated with "atrial fibrillation." More than 250 variants mapping to 212 genes are found to be associated with AF at a genome wide significance threshold of P value Յ5e-08 (Supplementary Table S6). Seven of these genes were found to show differential expression in AF-LAA and/or AF-RAA in this edgeR data set. MYH7 and KCNJ2 were upregulated in AF-LAA; RPL3L was upregulated in AF-RAA; COG5, SYNE2, and C10orf11 were downregulated in AF-RAA; and DGKB was downregulated in both AF-LAA and AF-RAA.

DISCUSSION
In this study, 247 genes were identified to be differentially expressed between SR and AF. The initial cohort consisted of the left and right atrial appendage of five patients with permanent AF and five patients in SR, the control group. Additionally, the average age of the AF group was found to be~10 yr more than that of the SR group; this was expected as the incidence of AF increases with advancing age (39). All patients in the study were selected to be nondiabetic as there is evidence that pathogenesis of AF is also caused by structural and electrical changes similar to those observed in the hearts of diabetic patients (40). The selection of only nondiabetic patients meant that changes observed would be more likely due to progression of AF.
Most of the studies that have been previously published have been limited by using either left or right atrial appendage (3,8,12,15,21,35,41). One of the main advantages of this study is that it allowed for the direct comparison of genes expressed in the left and right atria in AF. Specifically it revealed significant differential expression in left and right atrial appendages in SR and AF. Only 12 and 27% of genes were up and downregulated respectively in both AF-LAA and AF-RAA, indicating that there are a greater number of genes with altered expression on only one side of the atrium. This supports the potential for significant differences in biology between the two chambers and in how they remodel in AF. Although the left atrium is thought to have a more significant role in the initiation and maintenance of AF (38), overall a greater number of genes were identified to be differentially expressed in the right atrium. In particular, 52% of genes were specifically downregulated in AF-RAA compared with 21% of genes downregulated in AF-LAA. These results could suggest that downregulation of genes, particularly in the right atria, plays a greater role in AF than upregulation. However, the variability observed in the PCA (Fig. 1) within the SR-RAA and AF-RAA samples compared with the SR-LAA and AF-LAA samples may result in a higher number of differentially expressed genes identified in the RAA samples.
Pathway and gene ontology enrichment analysis suggest that there are a multitude of pathways involved that differ between the two sides of the atrial appendages. The RAA demonstrates enrichment in pathways involved in extracellular matrix organization and degradation and collagen formation and degradation. Among the genes identified in these pathways are a member of the matrix metalloprotease family, MMP3, cartilage oligomeric matrix protein, COMP, and collagen-encoding genes, COL12A1 and COL23A1. These could play a key role in atrial fibrosis leading to structural remodeling in the right atria and consequently the promotion of AF. Increased expression in AF-RAA in this study of the COMP gene warrants further investigation as increased levels of COMP have also been found in coronary heart disease patients (44). The LAA also shows enrichment for atrial fibrosis via BMP receptor and TGF-␤ receptor binding. Genes involved in these pathways, the bone morphogenetic proteins, BMP7, BMP8A, and the inhibitory SMAD6 are all involved in the TGF-␤ signaling pathway contributing to atrial fibrosis. BMP7 has been shown to have a protective effect on fibrosis (6), and its downregulation in AF-LAA in this study may allow for increased fibrosis to occur in the left atria. Additionally in the LAA, circadian entrainment and nicotine addiction pathways involving the key genes, GRIN2A encoding a NMDA receptor subunit NR2A, and GRIA1 and GRIA3, encoding AMPA-sensitive glutamate receptor subunits GluR1 and GluR3, respectively, were identified. It is possible that nerve endings were harvested with the atrial appendage, and these may reflect genes differentially expressed in this tissue. The role of glutamate receptors in AF is unclear; however, there is evidence that activation of NMDA receptors results in an increase in sympathetic activity, decreased heart rate variability, and promotion of AF (37). There is also evidence that the expression of GluR1 glutamate receptors encoded by GRIA1 are altered in ischemic cardiomyopathy (18), and GluR1-containing complexes are associated with cyclic AMP generation via ␤-adrenergic stimulation that itself may be involved in the initiation and/or maintenance of AF (45). As GRIA1 is increased only in the AF-LAA this may indicate that it does indeed play a role in AF. Further study into the role of these glutamate receptors in atrial fibrillation is necessary. The noncanonical Wnt signaling pathway was also identified to be enriched in the LAA, and the receptor tyrosine kinase orphan receptor 2, ROR2, gene is significantly downregulated in both AF-LAA and AF-RAA in this study.
Given the large variety of genes that were identified from the differential gene analysis and subsequent enrichment analysis a subset of genes including those previously linked to AF and those who had no links were selected for further validation by quantitative RT-PCR to include genes differentially expressed in either LAA and/or RAA. Selected genes that had previously been implicated in AF included the ion channels, CACNA1G (15), KCNK2 (36) and KCNJ2 (9), signaling proteins and receptors, AGTR1 (17) and PPP1R1A (7), heart structural components, MYH7 (31,34) and BMP7 (6), and a cardioprotective protein, MT1X (24). Abnormal calcium handling plays an important role in AF (20,29) and the significantly decreased expression in both AF-LAA and AF-RAA of CACNA1G that encodes a T-type calcium channel, Cav3.1 (32), is worth investigating further. Changes in other calcium channels e.g., voltage-gated calcium channels Cav1.2, Cav 1.3, and Cav 3.2, were not found to be significant. It is known that remodeling Cav1.2 is important in chronic AF (16). Our data suggest that these changes as not as pronounced as others that might be occurring with other calcium handling pathways and that transcriptional remodeling might be modest with other posttranscriptional mechanisms in play. Another gene of par- ticular interest is PPP1R1A that encodes for I-1 and acts as an inhibitor of protein phosphatase 1 (PP1) (8). A link with an increase in PP1 activity and increased phosphorylation of ryanodine receptor 2 (RYR2) leading to AF has been identified (7); therefore, the observed downregulation of PPP1R1A and therefore I-1 could very likely play a role in progression of AF.
Of the familial genes commonly found in AF (9), only KCNJ2 showed upregulation in AF-LAA. Overexpression of KCNJ2 has been shown to lead to an increase in I K1 that causes hyperpolarization of the resting membrane potential resulting in stabilization and promotion of re-entry leading to sustained AF (46). Additionally, a selection of genes that have not been previously implicated in AF were also validated. These reflected our personal interests and included orphan G-protein coupled receptors for example. The genes selected were ATP1B4, COMP, GREM1, GRIA1, GPR183, GPR83, and SYTL5. Expression of ATP1B4 was increased only in AF-RAA and encodes a muscle-specific ␤M protein found in the inner nuclear membrane that may play a role in the regulation of gene expression through TGF-␤ (33).
In recent years, the number of genes identified by GWAS to be associated with AF has increased from 30 to around 212 potential genes (Supplementary Table S6) (2,10,14,19,21,26,30,31,34,38,43). Seven of these potential genes were found to be differentially expressed in this edgeR data set. One of these, MYH7 encodes for ␤-MyHC protein, the heavy chain subunit of cardiac myosin encoding the adult slow twitch isoform, and has recently been identified to be a functional candidate gene for AF involvement (31,34). Expression of MYH7 has been shown to be increased in atrial myocytes of patients with chronic AF (4), and variants in the gene have been linked to hypertrophic cardiomyopathy, where AF is a known complication (27). In a rabbit ischemic heart failure model, MYH7 expression was only detectable in the failing left atrium with a heterogeneous distribution (31). The increased presence of ␤-MyHC could cause the atria to become arrhythmogenic and predispose to permanent AF. Other genes found in this study that have previously been associated with AF include SYNE2, which encodes for synaptic nuclear envelope protein 2 (21,34), and RPL3L, which encodes a ribosomal protein (ribosome protein like 3L) that is specifically expressed in skeletal muscle and the heart unlike most ribosomal proteins, which are ubiquitously expressed (42).
Limitations of this study include the small number of samples available for RNA-Seq. This was due to difficulties during surgery in obtaining both left and right atrial appendages from each patient. However, the benefit of using paired left and right atrial appendages and a high sequencing depth somewhat counteracts a small sample size. Due to practical limitations only a few differentially expressed genes were able to be validated. However, not all of the genes that were selected for validation could be reliably detected by quantitative RT-PCR. RNA-Seq allows for subtle changes in gene expression to be detected as it identifies novel transcripts, spliced isoforms, and splice sites; these changes may not be observed when using quantitative RT-PCR as it is limited to the detection of known sequences. Further studies are required to understand the significance of these genes in AF.
In summary, using RNA-Seq data from left and right atrial appendages of SR and AF patients can give a further insight into potential new genes and pathways that may be involved in the progression of this complex and multifactorial disease. The left and right atria may play different roles in the development, support, and remodeling of AF as shown by the changes in gene expression levels and their involvement in different pathways between AF-LAA and AF-RAA. These data can be used to study these genes and their potential pathways in further detail.