Voltage Distributions in Extracellular Brain Recordings.

Extracellular recordings of brain voltage signals have many uses, including the identification of spikes and the characterization of brain states via analysis of local field potential (LFP) or EEG recordings. Though the factors underlying the generation of these signals are time-varying and complex, their analysis may be facilitated by an understanding of their statistical properties. To this end, we analyzed the voltage distributions of high-pass extracellular recordings from a variety of structures, including cortex, thalamus and hippocampus, in monkeys, cats and rodents. We additionally investigated LFP signals in these recordings as well as human EEG signals obtained during different sleep stages. In all cases, the distributions were accurately described by a Gaussian within ± 1.5 standard deviations from zero. Outside these limits, voltages tended to be distributed exponentially, i.e. they fell off linearly on log-linear frequency plots, with variable heights and slopes. A possible explanation for this is that sporadically and independently occurring events with individual Gaussian size distributions can sum to produce approximately exponential distributions. For the high-pass recordings, a second explanation results from a model of the noisy behaviour of ion channels which produce action potentials via Hodgkin-Huxley kinetics. The distributions produced by this model, relative to the averaged potential, were also Gaussian with approximately exponential flanks. The model also predicted time-varying noise distributions during action potentials, which were observed in the extracellular spike signals. These findings suggest a principled method for detecting spikes in high-pass recordings and transient events in LFP and EEG signals.


INTRODUCTION
Recordings of extracellular voltages in the brain include electroencephalographic (EEG) signals recorded from the surface of the skull, local field potential signals (LFP) recorded from electrodes inserted into the brain, and recordings of extracellular potentials in a higher frequency range (typically 500-8,000 Hz) made with microelectrodes or recording arrays (MEAs) with the goal of identifying signals corresponding to the action potentials (spikes) of individual neurons. Motivated initially by the desire to put the detection of spike signals on a firmer statistical basis than exists at present (1, 2), we examined the distribution of voltage signals in MEA recordings to see, for example, if the distribution was bimodal, in which case there would be a principled basis for placing a detection threshold. We did not find this, but instead noted that distributions appeared to follow a common pattern, with a Gaussian center and heavy tails. We then extended the analyses to LFP (cat, rat, mouse, and monkey) and human EEG recordings. In all three types of recording, we found that the distributions were well described by a Gaussian within ±1.5 standard deviations of the means. Outside of this range the distribution was heavy-tailed and was well described by an exponential function. Although the heights (intersection points of the tails with the Gaussian center) and rate constants of the exponential could vary within and between recordings, the exponential relationship was found to hold, often to within 10% of the individual bin counts, over up to five orders of magnitude. We explored two possible models that might explain this. One model postulates that distributions result from two processes: one, a continuous source of (effectively) random signals which sum as predicted by the central limit theorem to produce a Gaussian distribution, to which is added a second set of randomly occurring, independent "events" with a Gaussian height variation. If these events occur frequently enough that they often overlap, the resulting distributions can resemble the observed ones in having Gaussian centers and approximately exponential tails. A second model, which would apply only to the high-pass (spike containing) recordings, is based on the nonlinear mechanisms of action potential generation. We found that the voltage distribution generated by Hodgkin-Huxley kinetics which incorporate a model of molecular noise also have approximately exponential tails.
Although the shapes of the tails of distributions that are otherwise well described by a Gaussian might seem of little interest, for the high-pass recordings the tails are where most of the biological interest lies, as they contain the spikes. The same may be hypothesized to be true for other types of brain recordings. Characterization of distribution tails may offer a more principled way of setting event detection thresholds in high-pass recordings. It also may offer a means for identifying transient signals in LFP and EEG recordings which are event-like and may correlate with other neural phenomena. Characterization of distribution parameters might also lead to additional ways of defining brain states.

High-Pass Extracellular Recordings
Data were obtained from a set of eight extracellular recordings made with multichannel array (MEA) electrodes in a variety of brain structures and in different laboratories (Table 1). They include: 1) two recordings (R1 and R2) made simultaneously in different cortical areas in an awake macaque monkey (laboratory of Dr. C. Pack); 2) two recordings (R3 and R4) from anesthetized cat visual cortex, made in the laboratory of Dr. N. Swindale; 3) a recording (R5) from anesthetized cat lateral geniculate nucleus (LGN) [laboratory of Dr. C. Baker, methods given in Gharat and Baker (3)]; 4) a recording (R6) from mouse visual cortex made by M. Spacek and G. Born (laboratory of Dr. L. Busse), and 5) two recordings (R7 and R8) made simultaneously from hippocampal area CA1 and entorhinal cortex of an awake rat [laboratory of Drs. K. Mizuseki and G. Buzsáki (experimental id ec014.716 from data set hc3, available at www.crcns.org)]. The number of channels used in the different recordings ranged from 32 to 96 and the recording durations were between 18 min and 42 min. Further details are given in Table 1. The recordings from the Swindale laboratory had been bandpass hardware filtered between 0.5 KHz and 6 KHz before being digitized at 12-bit resolution (4). All the other recordings had been digitized with 16-bit resolution and included the local field potential. To obtain a high-pass signal containing spikes, low frequencies were removed digitally with a noncausal (Fourier domain) high-pass 4-pole Butterworth filter with f 0 = 0.5 KHz. A recording of 13-min duration made with a 54channel Neuronexus electrode placed in a beaker of saline was used as a control (Table 1). Sample voltage waveforms from a high-pass recording are shown in Fig. 1 (top 4 traces).

Local Field Potentials
For investigations of the voltage distribution in the LFP recordings other than the two made in the Swindale laboratory, we low-pass filtered the original data using a filter that was the complement of the one used to obtain the high-pass spike signal. The resulting signal was then subsampled to obtain recordings with a sample frequency of 1 KHz. A 60 Hz (plus harmonics) notch filter was applied to several of the recordings to remove artifacts. The other two recordings were obtained from separately filtered and recorded LFP data from a subset of 10 channels, as described elsewhere (5). These recordings also had a sample frequency of 1 KHz. Sample LFP waveforms from a recording are shown in Fig. 1 (middle trace).

VOLTAGE DISTRIBUTIONS IN EXTRACELLULAR RECORDINGS
states that, "The 16 healthy subjects included in the study did not present any neurological disorders and were free of drugs affecting the central nervous system." Sleep-stage scoring consisted of tags registered every 30 s indicating the sleep stage [S1-S4 and rapid eye movement (REM)] as well as waking and movement states. The analyses done here were confined to data recorded during sleep stages S2, S4, and REM. These were chosen because they constituted the bulk of the recording time and are likely to be distinct in terms of EEG patterns. Various EEG channel locations are available in the data. For the results shown in Fig. 12, we chose data from a single channel pair, A1-C4, that was common to all the data sets and spanned a larger distance on the head than the other pairs. Sample EEG waveforms from sleep stages S2, S4, and REM are shown in Fig. 1 (bottom 3 traces).

Analysis Methods
Histograms were compiled by binning the raw integer voltage values using bins of integer width to avoid aliasing. Care was taken to avoid inaccuracy resulting from integer truncation of floating-point voltage values (obtained, e.g., after software filtering). The standard integer truncation function, int(x), often implemented silently by compilers, has int (À1 < x 0) = int(0 < x < 1) = 0, which will yield artifactually inflated numbers of values in the bin for values that include zero. For the high-pass recordings, bin widths were normally chosen to be close to 2 mV with the exact value depending on the conversion factor between the digital voltage value and microvolts. For the low-pass (LFP) recordings, bin widths were chosen to yield histograms with around 20 bins per standard deviation of the central part of the voltage distribution, with the same bin width used for each channel. This was obtained with bin values between 2 mV and 20 mV. For the EEG data, bin widths were chosen to yield histograms with around 10 bins per standard deviation. Periods of silence (zero or constant voltage) present at the beginning and ends of some recordings were masked from analysis. A few brief periods with artifactual signals (large amplitude voltage fluctuations present on all the channels) were also masked from analysis. Faulty channels likely to lack biological signals were identified on the basis of the tests described by Swindale and Spacek (8) or by visual inspection of the waveforms and were excluded from analysis. There were normally three or fewer such channels per recording.
Histograms were quantified by first doing a least-squares fit of a Gaussian with variable height, center position and width (r), using the Levenberg-Marquardt algorithm (9). Rough initial values were derived from the histogram. In order that the fits be influenced as little as possible by the values in the extremes of the distribution, the goodness-offit values used for optimization were only calculated over xvalues in the range of ±1.5r, where r was the current value of the Gaussian width parameter. Following this, negative and positive shoulder points, S L and S R , were defined as the smallest absolute voltage values where the histogram bin count exceeded the value obtained from the Gaussian fit by 10% of the fit value or more. These points were normally in the range 1.7r-5r. Outer limits of the flanks were defined as the largest absolute voltage values for which the bin values were !10. Fits to the negative (left) and positive (right) flank histogram values were done for values between these limits and the shoulder points. Flank distributions were characterized by linear regression to the logarithm of the raw histogram values taken between the limits as just described. The analysis was restricted to cases where more than 10 bins lay between these limits. Positive and negative flank distributions were also characterized by fitting a parabola to each. Each parabola was constrained to have its peak at zero. In this case, the values obtained from the fit of the central Gaussian were subtracted from the bin values before fitting. The rationale for doing this for the parabola fits, and not for the linear fits is that each strategy resulted in better fit values. This effectively means that two different types of model are being compared: one, a piecewise set of functions (exponential, Gaussian, exponential) and the other the sum of two Gaussians. Note that each parabola had the same number of free parameters (2) as the straight-line fits, allowing an unbiased comparison of the goodness-of-fit values. The bestfitting parabola, like the best-fitting straight line, could be calculated explicitly. Fitting a parabola to the logarithm of the distribution values is comparable to fitting a Gaussian to the raw values, the hypothesis being that the overall distribution is the sum of two Gaussians of different widths. Fitting a straight line to the logarithm of the distribution is comparable to fitting an exponential function to the raw bin values. Fits of straight lines and parabolas to the logs of the distribution values were done in preference to fits of exponential and Gaussian functions to the raw values because Flank width was measured in terms of the sigma obtained from the best fitting parabola. This was done separately for the positive and negative flanks. An alternative description of the widths can be obtained from the exponential fits which resemble a Boltzmann distribution of the form P(v) $ e Àv/T , where v is voltage and T is a "temperature." A high temperature means a shallow slope (i.e., a flank that extends a large distance away from the origin) and vice versa. However, the sigma measure was used in preference to temperature because of its simpler interpretability. Independently of either of these measures, the "heaviness" of the flanks was measured by summing, over the intervals v < S L and v > S R , the difference between the measured bin value and that given by the fitted central Gaussian. This value was then divided by the sum of all the histogram bins and expressed as a percentage. Separate measures were also calculated for the positive and negative flanks.

The Template Deviation Histogram
In analyses of data containing spike waveforms, we calculated the distribution of voltages relative to the mean spike waveform (the "template") of a particular sorted unit. To do this, individual spike waveforms from sorted units (see Spike Sorting for details of the sorting) were aligned (slid sideways with a small temporal displacement) to the mean waveform, so as to minimize the root-mean-square (RMS) difference between the waveform and the template. After this was done for all the spikes in the unit, the template was recalculated and the procedure repeated until further movements were of vanishingly small size. Interpolation of waveform voltages was used to do this (5). Aligning to the interpolated trough of each spike waveform, was also tested and found to yield similar results, though it yielded overall slightly larger RMS error values and was not used for that reason.
Following alignment, the distribution of voltages relative to the mean was then calculated using a window that included most of the spike waveform. In addition, the distribution was calculated as a function of time during the spike, using 1 bin per temporal sample.

Hodgkin-Huxley Noise
To examine the possible effects on noise of nonlinear voltage generating mechanisms in the brain, we examined the behavior of simulated action potentials generated by the Hodgkin-Huxley (HH) equations (31) taking into account the noisy behavior expected when channel densities are relatively low, as is likely in the cortex (10). Thus we obtained solutions V(t) to where the symbols have the standard interpretation but the quantities g Na and g K were varied according to a probabilistic model of the opening and closing of the sodium and potassium channels at voltage-dependent rates. The model was a standard one based on Markov network models for the opening and closing of channels (11). The voltage-dependent conductances for the sodium and potassium currents were given by the following: g Na ¼ Q Na N Na g Na and g K ¼ where N Na is the total number of sodium channels, N K is the total number of potassium channels, Q Na and Q K are the numbers of sodium and potassium channels in the open state, and g Na and g K are the maximum sodium and potassium conductances. Further details of the model, parameter values, and the computational methods used can be found in Rowat (12). Table 2 lists the primary parameters that were used.
Points at intervals of 0.08 ms were exported, corresponding to a sample frequency of 12.5 KHz. The resulting voltage signals were subjected to the same procedures of event detection and RMS alignment of individual waveforms relative to the mean waveform as were used for the real recording data.
An additional set of simulations was carried out in which channel states were assumed to be non-noisy (i.e., the limiting case for high densities) but the input current I inj had a Gaussian noise distribution. The results from these simulations were similar in character to those reported here for the channel noise model, but further description will be omitted for the sake of brevity. Solutions with noisy input currents were also obtained using the Fitzhugh-Nagumo (13) and  models and likewise yielded similar results.

Spike Sorting
Analyses of raw voltage signals on specific electrode channels did not require spike sorting. However, analysis of the template deviation histograms relies on definition of units with a defined and stable shape via spike sorting. For these purposes, the extracellular high-pass recordings were sorted using SpikeSorter (5), which uses a clustering method applied to the principal components (PC) distributions of spike waveforms. Because of our interest in extreme, uncommon, voltage fluctuations it seemed best not to make assumptions (e.g., of a Gaussian distribution of PC values) that might bias the selection of spikes against those containing outlying values. This was checked, first by the use of a clustering method based on density gradients in PC distributions that specifically does not exclude points in the tails of distributions (5). Second, we found that the results were largely robust to the exclusion of outliers in the PC distributions. It was not the case that points in the flanks of the voltage distributions were also outliers in the PC distributions. Conversely, voltage distributions obtained from points taken from the centers of the PC distribution were similar to those obtained from the entire distribution. This may be because very few extreme values will be present for more than one or two points in any waveform and may have little actual impact on the PC value for a specific spike. We attempted to exclude both short-and long-term sources of variability in spike shape from the analysis. Shortterm variability, on a timescale of a few milliseconds, is most often manifest as a decrease in spike height at very short interspike intervals. This type of behavior was not common among the units studied here but we attempted to exclude it by measuring, for each spike pair in a given unit that was less than 10 ms apart, the height of the second spike as a percentage of the first. The mean and standard deviation of this percentage was taken across all such intervals and a measure, R, calculated as the ratio between the mean À 100 and the standard deviation. In six units (out of 180), R was less than À0.5, and these units were excluded from further analyses. Spike shapes can also be variable over periods of minutes or hours and this presumably reflects factors such as tissue movement or other sources of instability that are outside the current scope of interest. To minimize the impact of this, we used the following measure of spike height variability, termed Q. The recording period was divided into 10 consecutive, nonoverlapping, periods with equal numbers of spikes in each. For each period, the mean of the spike peak-to-trough voltage was calculated. Next, the overall standard deviation of the height relative to each mean was calculated for the whole recording. The variability, Q, was defined as the difference between the maximum and minimum of the mean values divided by the standard deviation. Units were excluded from analysis if this measure was !1.0, that is, if the means varied by more than the standard deviation. Though there are more formal tests of uniformity of variance, this measure was found to be effective at identifying and excluding units that clearly showed long-term variability in spike height. Nevertheless, it cannot be assumed that units that passed the test did not exhibit forms of shortand long-term variability in spike shape that might have affected the measures that were made. In some cases, spikes from short periods where spike shape exhibited systematic changes over time were masked from analysis to allow a unit to pass the test.

Statistical Tests
The data analyzed here are temporally correlated, however temporal structure in the signals was ignored in the analyses and normally every point in the waveform was included. The following argument can be made to justify this. Assuming that correlations fall to zero within a time period given by N samples, it is possible to compile a single histogram taking every Nth sample to obtain a distribution unaffected by correlations. N such unbiased histograms could be obtained by taking samples beginning at points 1, 2, 3 . . . N. The sum of these will of course be the complete histogram which should be similarly unbiased though now with correlations among the bin values. Notwithstanding this, estimates of the mean value of individual histogram bins are expected to be improved by including all the samples, even though this complicates other statistical interpretations such as v 2 tests of goodness-of-fit, or Kolmogorov-Smirnov tests of normality. The argument was tested by occasionally compiling histograms from coarsely sampled data and noting their similarity to the histograms from the complete time series. Examination of the autocorrelation functions of the raw data showed that correlations decayed to close to zero (r < 0.05) within 30-50 sample points ($1-2 ms) for the high-pass recordings and 500-1,000 sample points ($0.5-1 s) for the LFP and EEG recordings. For these reasons, in cases where Kolmogorov-Smirnov tests of normality were done, sample points were taken at intervals of 2,000 to achieve uncorrelated samples.

Goodness-of-Fit of a Gaussian
A Gaussian was an exceptionally good fit to the central portion of the voltage histograms in all three types of data examined-the high-pass recordings containing spikes, the LFP recordings, and the sleep EEG data (Fig. 2, A-C). The accuracy of the fits was quantified by measuring the standard deviation of the residual errors, taken across bins, with each residual error expressed as the percentage difference between the histogram bin count and the fitted value ( Fig.  2). An overall error value for the central portions was calculated for those histogram bins that lay within ±1.5r of the center of the fitted Gaussian. This calculation was done for all the unmasked channels in each of the high-pass and LFP recordings and for all the EEG channels. Standard deviations averaged 1.2% (range 0.36%-3.3%) in the high-pass data; 3.4% (range 1.5%-8.9%) in the LFP recordings, and 2.6% in the EEG data.
Although a Gaussian is an excellent fit to the centers of the distributions, it does not fit the flanks (tails) well (Fig. 3). On linear plots, like those shown in Fig. 2, A-C, the differences are small visually and might be regarded as insignificant except that, for the high-pass channels, they are a priori likely to include signals from spikes where most of the biological interest lies. The heavy tails are more clearly shown using logarithmic scaling on the y-axis (Fig. 2, D-H). This transforms a Gaussian into a parabola and makes the heavy positive and negative tails much more evident. We confirmed that the tails in the high-pass data mainly, if not exclusively, contain signals resulting from spikes by comparing the histograms derived from the entire recording with the voltage histogram derived from portions of the recordings containing spikes. To detect the spikes, we used a dynamic multiphasic filter method (2) with a window width of 0.3 ms and a detection threshold of Â4.5 a robust estimate of the voltage standard deviation on each channel, obtained as the median of the absolute values of the voltages on the channel divided by 0.6745 (1). Comparison of the event histogram with the complete distributions (Fig. 2, G and H, solid red lines) showed that the tails on either side were accounted for, over most of their range, by portions of the recordings containing spike events, as defined by the detection method. The few exceptions were points in regions bounded by the raw voltage histogram, the event histogram and the fitted Gaussian (Fig. 2, G and H, arrows). These points are likely to have fallen below the threshold used for event detection. The implication of this is that a more carefully chosen detection threshold might have detected more valid events. The correspondence between outlying points in the histograms and points considered to be events is almost by definition likely to be true, because a voltage threshold is used to define the events. However, events had to pass an additional test where the voltage had to change by an amount equal to twice the threshold within a short temporal window (in this case, 0.3 ms). Hence very few points with voltages in the range of the tails fail to pass this additional test for spikes.
Control voltage histograms from a recording made with a 54-channel electrode placed in a beaker of saline showed a Gaussian distribution across the entire voltage range on 51 of the channels (Fig. 2I). The voltages on these channels all passed a test for normality (Kolmogorov-Smirnov, with P > 0.1). The remaining three channels, one of which may have been faulty, showed heavy tails not unlike those seen in the biological recordings, and failed the test.

Modeling Noise Distributions
To better understand why voltage histograms in brain signals might have Gaussian centers and non-Gaussian flanks, we considered a hypothetical situation in which, to begin with, many continuously ongoing and overlapping sources of signals combine linearly. As predicted by the central limit theorem, their different individual distributions are likely to combine to produce a signal that is Gaussian. We will characterize this central distribution by its standard deviation, r c . To this signal, we hypothesize the addition of sporadically occurring "events" with a different, and assumed larger, Gaussian standard deviation, r e . The manner in which these events are added to the background noise has an impact on the resulting distribution. If the events are added in such a way that they rarely or never overlap, for example, if they occur at fixed intervals or some refractory process stops them from co-occurring, or if the rate is simply very low, so that coincidences are rare, then the resulting distribution will be the sum of two Gaussian distributions, the initial one with an unchanged central width, r c , and a second flanking one with a variance = r c 2 þ r e 2 . Hence the flanks produced by the events will have a Gaussian distribution and flank . Representative samples of voltage histograms from the three main data types analyzed. A: high-pass recording from data set R4 (channel 9). B: a local field potential (LFP) recording from data set R3 (channel 1). C: EEG data during stage 2 (S2) slow wave sleep (data set N1, channel pair F4-C4). The thicker black line is a Gaussian fit to the center of each distribution, as described in the METHODS. Numbers on the vertical axes are counts per bin. D-F: the same data as in A, B, and C, respectively, but using logarithmic scaling on the y-axis to make the flank distributions more visible. G: a single channel in data set R4 showing a stepped distribution of flanks often observed in the highpass (spike) recordings. H: a single highpass recording channel in data set R5 showing, in contrast, strikingly linear flanks over four orders of magnitude of density. The red lines in G and H show the histograms derived separately from all the events recorded on each channel. Arrows indicate regions possibly corresponding to missed events. I: a control histogram obtained from a recording made in a beaker of saline, together with the Gaussian fit.

VOLTAGE DISTRIBUTIONS IN EXTRACELLULAR RECORDINGS
parameters can be obtained by fitting a Gaussian, first to the center, and then to the flanks, as described in the METHODS (it is assumed that the central values of the flanking distribution have only a small effect on the central values of the main distribution). If, however, events are added in such a way that there is a reasonable probability of overlap, this is no longer the case. The probability of k events occurring and adding at any single point is given by a Poisson distribution, and the variance due to each value of k is equal to kr e 2 . The overall distribution of voltage, v, is given as follows: where k is the rate of occurrence of the events, that is, the total number of events divided by the number of points in the overall sample. The Gaussian term is the distribution expected from the convolution of k independent Gaussian distributions belonging to each event. Convolution is involved because the sum of two or more independent processes will give rise to a distribution that is the convolution of their distributions. This fact is made use of later on in the section dealing with template deviations. Given that P(v) is a sum of Gaussians of different widths it is not immediately obvious what the shape of the extremes of P(v) might be. However, the shape can be studied, either by calculating model distributions directly from Eq. 3 or by carrying out simulations with values derived from random number generators (Fig. 4, A and B). The resulting values can then be analyzed in the way described in the METHODS, by fitting either straight lines or parabolas to the flanks of the logarithm of the bin values (simulated or directly calculated). (Note that, as described in the METHODS, the same number of free parameters is involved in either type of fit, so the comparison is unbiassed in that respect.) This was done with simulations, and, as might be expected, for low values of k, where overlaps would be rare, a parabola gives a better fit (Fig. 4, A and C). For values of 0.1 < k < 10, the logarithms of the flank values are better fit by straight lines than parabolas (Fig. 4, C and D), that is, the distributions are better described as exponential than Gaussian. For still larger values of k, the distribution approximates a single Gaussian (not shown), which is also expected given that the signal is now largely composed of many randomly added events at each sample point. Hence, although it is not the only possible explanation, the presence of exponentially distributed voltage flanks is consistent with the presence of sparsely occurring "events" with Gaussian size distributions.
Although for real data, a value of k cannot be estimated directly, a proxy for it is the heaviness of the flanks, which can be measured directly. A graph from the simulated data showing the difference between the linear (i.e., exponential) and parabolic (i.e., Gaussian) fit values, versus the percentage of the values that lie in the flanks (as defined in the METHODS) shows a switch from parabolic to linear (whichever gives lower fit values) as flank percentage increases (Fig. 5A). A similar switch was seen in the high-pass recording data (Fig. 5B) though not in the LFP data. Overall, however, the differences between linear and parabolic fits to the flanks in the high-pass recordings were small (Fig. 6A) with only a slight advantage of linear over parabolic. The respective fit values were 18.7% versus 19.5% for the negative flanks and 13.4% versus 14.3% for the positive flanks. The difference was not significant for the negative flank (Wilcoxon matched pair test, two-tailed; n = 427) but was significant for the positive flank (P < 5%, Wilcoxon matched pair test, two-tailed; n = 395). Fits were significantly worse for the negative flanks compared to the positive flanks (Wilcoxon matched pair test, two tailed; P < 0.1% for both linear and parabolic cases; n = 395) consistent with the more variable shape observed for negative flanks.
Because of the greater familiarity with width as determined from the fit to a parabola, which yields a standard deviation for the positive and negative flanks, we used that measure to characterize flank width in preference to the temperature measure. Positive and negative flank widths, measured in units of the standard deviation of the central Gaussian for each channel, were correlated (Fig. 6B), with negative widths on average about twice those of positive widths. This difference would be expected given that spikes are usually asymmetric in voltage with larger negative than positive deflections. Likewise, flank weights, expressed as the percentage of counts found in the flanks relative to the entire distribution, were positively correlated, with larger negative weights (Fig. 6C). Both the width and the weight differences between flanks were highly significant (P ( 0.001).

The Influence of Temporal Structure in the Signals
Although one might expect that there should be a direct connection between the second order, temporal structure (derived from pairs of points) of a signal and its first-order (derived from single point samples) distribution, in general a simple connection cannot be made. For example, the same first-order distribution can result from many different temporally structured signals: taking a signal and reordering the samples in time can change the second-order but not the first-order structure. Conversely, a particular spectral shape does not simply translate to a related first-order distribution. For example, values sampled from a signal with a 1/f spectrum can have a Gaussian distribution (15) as does white  As described in the text, distributions were generated by simulating the addition, at random independent times, of points with a Gaussian distribution in voltage, r e to a continuous background Gaussian signal with a standard deviation r c . The rate of addition k is plotted on the horizontal axis. In this simulation, r e = 2r c . For low values of k, a parabola is a better fit, but for higher rates the distribution is better described by a straight line. D: example of the best fitting straight line and parabola to the positive flank of a model distribution where k = 0.75.

VOLTAGE DISTRIBUTIONS IN EXTRACELLULAR RECORDINGS
noise with a flat spectrum. There is, however, sometimes an obvious direct relationship. For example, a sine wave gives rise to a U-shaped distribution with cutoffs at ±1. However, given that most biological signals are likely to be the sum of many different processes with different individual distributions, the end result is likely to be Gaussian because of the central limit theorem. As argued above, the non-Gaussian flanks most likely indicate the presence of episodic signals whose effects on the power spectrum might be small and not necessarily distinctive (e.g., in the form of peaks). Hence, differences in the power spectrum might have little influence on the shape of the first-order distribution. We confirmed these arguments for the present data by taking a Gaussian noise signal and applying a 1/f filter, after which the distribution remained Gaussian. We also applied different filters to the raw data, including a 1/f filter, a low-pass filter with f 0 = 2 KHz and a high-pass filter with f 0 = 2 KHz. In each case, the linear flanks were preserved, albeit with slightly different slopes (Supplemental Fig. S2).

The Template Deviation Histogram
Given that spikes originate from the firing of a discrete and possibly small number of units that happen to be close to each recording site, the contribution to the channel histogram from a particular unit will be the convolution of the voltage distribution of the unit (its average waveform, or template) with the noise of the signal relative to the template. Figure 7 illustrates this process. The middle section of the figure shows a sample template waveform, with voltage on the horizontal axis and time on the vertical axis, together with the histogram of its voltage values below. On the top panel, the black curve shows the distribution of voltages relative to the template. This will be referred to here as the template deviation histogram. It can be computed, for individual units, after first carefully aligning each waveform to the template, and then taking the difference between the spike waveform voltage and the value of the template for that point in the waveform. The convolution of this histogram with the template voltage histogram (bottom) gives the contribution of the spikes in the unit in question to the raw channel voltage histogram (curve shown in red). Note that the straight flanks of the template deviation histogram are preserved (but shifted sideways) by the convolution. A reason for this is given in the section Resistance of the Exponential Distribution to Convolution.
The template deviation histograms, which show the noise relative to the waveform of a given template, were found to be similar in overall shape to the histograms for the raw channel waveforms (Fig. 8A) but lacking the stepped distribution of the negative flanks (e.g., Fig. 2G). It was noted that

VOLTAGE DISTRIBUTIONS IN EXTRACELLULAR RECORDINGS
the widths of the central Gaussian distributions for the template deviations were larger on average, by $33%, than the width determined from spike-free sections of signal from the same channel as the spike (Fig. 8B). This suggests that, independently of the flank values, many, though not all, spikes contribute a separate additional source of Gaussian noise to the channel noise, which is therefore smaller in the intervals between spikes. Negative flank widths, measured from the parabolic fits, were larger than positive flank widths (Fig. 8C, P < 0.001, Wilcoxon matched pairs, n = 160). Fit values were similar for positive and negative flanks (Fig. 8D) and linear fit values were significantly smaller than parabolic fit values (P ( 0.001 for both positive and negative flanks, Wilcoxon matched pairs, n = 162). Positive and negative flank widths, expressed in units of r c (Fig. 8E) were significantly correlated (Pearson correlation, P ( 0.001, n = 160). Negative flank weights were larger on average than positive flank widths (Fig. 8F), consistent with the widths being larger (Fig. 8C).

Resistance of the Exponential Distribution to Convolution
A resistance of slope to sideways shifting is a property of the exponential function. For example, shifting the function e Àx/T sideways by an amount s gives e ÀðxÀsÞ=T ¼ e s=T e Àx=T , which is a constant times the original function. Likewise, the slope is resistant to convolution with an arbitrary function f(x): as the term on the right-hand side is also a constant times the original function. This argument does not apply with the same force to a distribution that is not exponential over its entire range (as is the case here) but, if the range of values of x 0 for which f(x 0 ) is nonzero is less than the full width of the distribution, the slopes of both flanks will still tend to be preserved. This property of the exponential function is likely to be a factor in explaining the similarity between the template deviation flanks and the raw channel flanks.

The Exponential Flanks May Be a Consequence of Noisy Channel Dynamics
It was suggested earlier that a voltage distribution with a Gaussian center and exponential flanks might be the result of the random addition of events with a Gaussian size distribution to an ongoing Gaussian process (Eq. 3). In what follows, another possible explanation is explored. As the spike waveform is the direct result of the generation of an action potential by a neuron, it seemed appropriate to ask how the intrinsically noisy behavior of the channels involved in spike generation might influence the noise distribution.
To that end, we examined the behavior of action potentials generated by the HH equations in which the stochastic behavior of sodium and potassium channels was taken into account. With enough sodium and potassium channels the behavior of the stochastic HH equations is closely approximated by the deterministic equations, but when the channel numbers are in the hundreds or low thousands it is necessary to consider the stochastic opening and closing of channels. Following details given in the METHODS, we generated voltage records containing $1,000 spikes with an interval of $15 ms between spikes. These were detected, aligned using an RMS measure, as described in the METHODS, to the mean waveform and analyzed in the same way used to generate the template deviation histograms. The histogram was plotted as a function of time during the spike (Fig. 9) using a gray-scale representation to show the distribution of voltage relative to the mean action potential over time. Five stages, labeled A to E on the figure, were identified, defined by changes in the variance relative to the mean: stage A was a resting phase before the spike, where the distribution was approximately Gaussian, and decreased in width during the initial phase of the spike. Stage B was marked by a period of transiently increased noise coinciding with the rising phase of the spike and the rapid opening of Na channels. The noise profile was non-Gaussian and asymmetric. Stage C was a period of intermediate and roughly constant noise occurring during the falling phase of the spike. In stage D, there was a second transient increase in noise, coinciding with the peak of potassium conductance. Finally, stage E was a period of very low noise coinciding with the gradual repolarization of the membrane potential. The individual distributions during these different stages were varied in shape and often asymmetric (Fig. 9, lower left), but the overall distribution had relatively straight (i.e., exponential) flanks on a log scale.
Because extracellular potentials approximate the temporal derivative of the intracellular (transmembrane) potential (16), we also repeated the measures on the differentiated voltage waveform (Fig. 9, upper right). The resulting histogram (Fig. 9, lower right) can be compared in shape with the distribution shown in Fig. 8A. The slight asymmetries in flank widths seen in the real data (Fig. 8, C, E, and F) were not present in the model data. Further interpretation of these, and other differences from the real data may be limited, given that the negative slope is probably only an approximation of the extracellular waveform which can also be affected by the distance from the cell, extracellular matrix geometry and other factors.
Results from simulations in which currents with a Gaussian noise distribution were used to drive deterministic HH spikes were similar to those obtained with channel noise, with noise peaks during the rising and start of the repolarization phases of the spike and with exponential distributions resembling those shown in Fig. 9. Rowat (12) also observed a similarity between the effects of Gaussian input currents and channel noise.

Noise during the Course of the Spike
Given that noise varies with time in the HH spikes, with two periods of increased noise alternating with periods of reduced noise level during the spike, we asked if a related variation could be detected in real spike waveforms. Given that that the extracellular voltage is the negative derivative of the intracellular voltage and that the simulations show increases in noise levels that coincide with the rising and falling phases of the action potential, increases in the variance of the noise might be expected to coincide with peaks and troughs in the extracellular spike waveform. This expectation was confirmed in many units (Fig. 10). Most commonly, a period of increased noise coincided with the negative trough of the spike. Often, though not always, there was a second period of increased noise which could coincide with a positive peak.
Brief periods of reduced noise also occurred (e.g., Fig. 10D) which normally coincided with the most rapidly changing parts of the spike waveform. These changes might possibly be artifactual as it seemed a priori possible that the alignment procedure used to decrease the overall sum-of-squares difference between each spike and the template might tend to align spikes in such a way as to preferentially reduce noise in regions where small displacements might produce large variations in the sum-of-squares error. Control calculations (Fig. 10F) in which unchanging (i.e., noiseless) spikes, equal in shape to particular units in the data, were added to noise with a Gaussian distribution and then subjected to the same alignment procedures as were used for the real data, sometimes showed reduced noise levels coinciding with periods of rapidly changing spike voltage. However, periods of reduced noise were not always seen in these regions in the real spike data (e.g., Fig. 10E). The types of behaviors seen, including some cases where there was little or no obvious change in the noise distribution during the spike, did not appear to differ among the various datasets.

The Flank Distributions in LFP and EEG Recordings
Data from LFP and EEG recordings, examples of which were given earlier in Fig. 2, B and C, and Fig. 12, are included here to show the generality of the observations made on the high-pass recordings. Even though the factors giving rise to variability seem likely to be very different, distributions similar to those of the high-pass data were observed in both the LFP and EEG recordings. As for the high-pass data, a Gaussian was a good fit to the central distribution, within a range of ±1.5r (Fig. 3, B and C). Outside this range, flanks were often well described by an exponential, though less often for the sleep data. Figure 11 summarizes the observations made on the LFP data. For simplicity, all the channels in each recording (Table 1) were analyzed. However, because LFP signals were normally highly correlated across neighboring channels, different channels are not independent samples. For this reason, points tend to fall into clusters belonging to the same recording and normal statistical tests will be invalid. These are hence omitted. Figure 11A shows that, as with the high-pass recordings (Fig. 6C), the distributions had flanks that were well described by straight lines or parabolas, with straight lines giving better fits overall but with little obvious difference between the fit values for the positive or negative flanks. Figure 11B shows that the flank widths, measured in units of standard deviation of the central distributions, were similar for positive and negative flanks, and also similar in value to the flanks observed in the template deviation histograms (Fig. 7E). Figure 11C shows that flank weights, expressed a percentage of the total histogram, covered a similar range of values as in the high-pass recordings, however, unlike the high-pass recordings (Fig.  6C), any overall correlation in the weights of positive and negative flanks appeared to be weak or absent. The correlation that was observed within single recordings was most likely the result of similarity in waveform shapes within the individual recordings.
Examples of voltage distributions in EEG recordings made during various sleep stages in normal human subjects are shown in Fig. 12. These data were not analyzed in detail, but in all cases, heavy flanks were present. Based on casual observations, these were most linear, and similar in shape to the template deviation histograms during sleep stage S4, with the flanks joining the central distribution without any inflection. Histograms from stages S2 and REM (rapid eye Figure 11. Graphs of parameters derived from histograms of local field potential (LFP) voltages obtained, as described in the METHODS, from the eight data sets listed in Table 1. Each point is derived from the data on one channel. All the channels in each recording were analyzed, but as signals were normally highly correlated across neighboring channels, points tend to fall into clusters belonging to the same recording.  LGN)], n = 4,153 spikes; D: a unit in data set R6 (rat visual cortex), n = 3831 spikes; E: a unit in data set R7 (rat CA1), n = 2,658 spikes. F: a control calculation in which 5,000 identical (i.e., noiseless) spike waveforms taken from a unit in data set R4 were added to Gaussian noise with a standard deviation of 10 mV.

VOLTAGE DISTRIBUTIONS IN EXTRACELLULAR RECORDINGS
movement) sleep had flanks that were sometimes linear in shape on log plots (e.g., Fig. 3C) but were usually noisier and less easily characterized than for S4.

DISCUSSION
This paper has examined the voltage distributions measured in a variety of extracellular recording types and shows that although the centers of the distributions are almost always quite exactly described by a Gaussian, the flanksspecifically regions more than 1.5 standard deviations from the mean-are normally heavy-tailed and often strikingly exponential in shape over many orders of magnitude of P(v). There appear to have been relatively few previous investigations of this nature. Some EEG studies [e.g., Bender et al. (17) and Chen (18)] have suggested that the EEG is Gaussian whereas others have found that it is not [e.g., Weiss (19)]. Studies examining the impact of noise on spike sorting in high-pass recordings have tended to assume that noise is Gaussian [e.g., Pouzat et al. (20)] but this hypothesis does not appear to have been carefully tested. It might be argued that the percentage of points in our data not accounted for by a Gaussian description is generally low and can be ignored for practical purposes. For example, Fig. 8F shows that non-Gaussian deviations from average spike waveforms are usually present in less than 5% of the samples. However, this figure is quite variable and moreover is for single flanks and hence the overall percentage of points that was non-Gaussian could sometimes be 10% or more. And although an exhaustive set of tests was not carried out, in every case where they were, the distributions failed a Kolmogorov-Smirnov test for normality, usually with P ( 0.05. Perhaps the most important point is that, in general, distributions tails are likely to contain interesting, or informative, signals. There is no argument about this for the high-pass recordings, because the tails contain the spikes (Fig. 2, G and H). It may also be true that the tails of the LFP and EEG distributions mark signals that are especially informative. As an example, it was recently shown that a thresholding procedure similar to that used to identify spikes could be used to identify transient signals in the LFP during synchronous states. These signals correlated with up-state transitions and, like spikes, also had stereotyped shapes (21). More generally, it can be conjectured that sparse signals that appear in distribution tails may have a high information content. Theories of sparse coding (22) turn this argument around and postulate the existence in the brain of coding methods that result in sparse signals-those in the tails of distributions-containing as much information as possible. It does not seem likely that these coding methods are related to noise in spike waveforms but it is possible that their operation might have something to do with the distribution of signals in LFP and EEG waveforms which are ultimately the outcome of neural information processing at a much higher level.
Exponential distributions occur in other contexts relating to brain signals. For example, an exponential distribution maximizes the entropy of signals which are constrained to a particular mean firing rate (23,24). Experimentally, firing rate distributions of neurons in V1 and inferotemporal (IT) cortical areas responding to natural scene videos appear to be approximately exponential (25). Exponential distributions of interspike intervals are also expected if firing is governed by Poisson statistics. However, in neither case is a direct effect on the voltage histogram expected. In a thought experiment, portions of the signal could be moved, changing the interval distributions but leaving the voltage distribution the same. Analyses of LFP signals in cortical tissue cultures and slices have revealed the presence of large amplitude events, termed "avalanches" with size property distributions that can be described by power laws (26). In this case, the presence of events with stereotypical shapes will influence the shape of the voltage distribution (Fig. 7). However, further calculations would be required to show that events with shapes governed by power laws might produce exponentially distributed voltages.
It is possible that unresolved background spikes from cells at varying distances from the recording electrode might have contributed to the tails of the template deviation histograms. In a modeling study, Pettersen and Einevoll (27) examined various factors that influenced the relationship between extracellular spike amplitude and distance, r, from the cell. The relationship varied as 1/r a with 1 a 2 close to soma and a ! 2 far away. The number of individual neuronal sources contributing to a signal recorded at a given position can be assumed to increase as r 2 which gives P(v) $ 1/r aÀ2 . This distribution is not exponential and can fit the data in the present study only over short ranges of magnitude of P (v). Hence, although it is possible that unresolved background spikes contributed some components of the tails of the template deviation histograms, Pettersen and Einevoll's findings do not lead to a simple explanation of the exponential shape of the tails. However, more exploration of detailed biophysically and anatomically based models of cortical tissue might be informative.
As argued in the section Modeling Noise Distributions, the tails are likely to represent episodic contributions to the signals. Given that the sources of these episodic signals are likely to be quite varied, no specific form might be expected a priori. Nevertheless, the exponential distribution occurred in very different types of recording. This paper explored a simple statistical model which proposes that two separate processes contribute to brain signals. The first is multifactorial and continuous and, by virtue of the central limit theorem, generates a Gaussian signal. The second is episodic and also random and adds signals with a Gaussian height distribution (around a zero mean) to the ongoing signal. If this happens at an intermediate range of rates such that signals sometimes overlap, the resulting distribution can have tails that are well described by an exponential function (Fig. 4D). This model also makes the prediction that the relative goodness-of-fit of an exponential versus a second Gaussian to the tail will vary with the heaviness of the tail, that is, the frequency with which the episodic events occur. If the rate of addition of events to the background signal is low, the flanks will be weak and will (trivially) have a Gaussian shape. If it is high, the distribution will also tend to be Gaussian. This prediction was confirmed for the high-pass recordings though not for the LFP recordings (Fig. 5). In this model, events were considered to consist of a single voltage sample. If, however, the events are spread out in time but still stereotyped in shape, the resulting distribution will be the convolution of the shape histogram and the height distribution. As shown above (Eq. 4), an exponential function can be preserved after convolution with a second function. An explanation for the failure of the model to predict the behavior of the flanks of the LFP recordings is that the "events" in these recordings have a more complex and variable structure and do not satisfy the simple assumptions of the model.

The Noisy HH Model
A second model, applicable only to the spike-containing recordings, explored the possibility that the exponential tails originate in the nonlinear kinetics of action potential generation. This was shown by examining the distribution of signals generated by the Hodgkin-Huxley equations in the presence of either a model of molecular channel noise, or of noisy input currents with a Gaussian distribution. The resulting voltage distribution, relative to the mean action potential waveform, was also Gaussian with approximately exponential tails. These models, however, do not offer a particular explanation of why the tails should have the shape they do. Perhaps a better understanding might come from a mathematical analysis of the HH model's behavior in the presence of these sources of noise. The fact that the shape of the distribution varies over time might also suggest that it is more of a coincidence than anything else that the summed distribution has a similar shape to the one observed in highpass recordings. These limitations notwithstanding, the results suggest that the nonlinear behavior of spike-generating mechanisms may be a source of non-Gaussian signals in the extracellular recordings. The observation of bursts of noise coinciding with the rising and falling phases of the modeled action potential (Fig. 9) and the presence of similar bursts coinciding with the peaks and troughs of extracellularly recorded action potentials (Fig. 10) further supports this conclusion.
The finding of periods of increased noise in the spike waveform in the positions predicted by the noisy HH model support the suggestion that stochastic channel noise is a partial cause of voltage noise observed in extracellular recordings. This would be in addition to functionally significant consequences of channel noise, such as the production of temporally precise spikes (10), limiting wiring density (28) and determining interspike interval distributions (12). However, the observation that Gaussian input currents and deterministic channel properties can also result in noisy distributions with exponential flanks means that molecular noise is not the only possible explanation for this observation.
Further consideration can be given to the factors determining the distribution of noise during the course of the spike (Figs. 9 and 10). Other things being equal, noise should be correlated with membrane resistance, being lower when resistance is low, and vice versa. More noise might also be expected during periods of transition, when channels are opening or closing, rather than being fully open or closed. In combination, these factors may explain the low noise seen during the repolarization phase of the action potential (stage E, Fig. 9) when K þ channels are fully open and membrane resistance is likely to be low. They may also explain the overall higher noise seen during the course of the spike (stages B, C, and D, Fig. 9) assuming that the noisiness of the channels outweighs the effects of increased conductances and lower membrane resistance. Although the first noise peak (stage B, Fig. 9) might be a consequence of noise among rapidly opening Na þ channels at the beginning of the action potential, it is harder to explain the second peak (stage D, Fig. 9) which occurs toward the end of the repolarization phase. One factor that may play a role in these observations is the method used to align the spikes, that is, to define a precisely fixed temporal origin for each. The RMS minimization procedure (see METHODS) is justified provided that, at each time point, the signal is the sum of noise plus a fixed expectation voltage, that is, that there is noise only in the voltage values. However, this assumption is probably not correct. If the spike is assumed to be noisy in all its dimensions (including, e.g., its width) then it becomes plausible that measured noise would be greater in regions where voltage was changing more rapidly, as a given small horizontal displacement would produce a change proportional to the gradient. This relationship is evident to some extent in Fig. 9, since both noise bursts occur during the steepest positive and negative gradients of the spike. A more principled analysis might be done using higher-dimensional phase-plane methods. These considerations mean that it is probably over-simple to characterize spikes as being most noisy during their rising and falling phases (and extracellular spikes as most noisy at their peaks and troughs). However, the analysis done here does have a practical use inasmuch as it is able to reveal a similarity in the behaviors of real spike waveforms (Fig. 10) and the noisy HH model (Fig. 9). Conceivably, further analyses might be able to reveal channel-based differences among cell types, though this possibility remains speculative.
Given that noise in the spike shape can be expected to make sorting less accurate, its impact might be reduced by omitting those parts of the waveform, namely, the peaks and troughs, likely to be most noisy (Fig. 10). We have not so far attempted to evaluate the consequences of this strategy.

Further Defining Events
The findings offer a principled way of defining "events" in extracellular recordings. Given an accurate model of the central Gaussian distribution, the placement of a threshold will allow calculation of a false positive rate for a given number of events falling above a positive threshold (or below a negative one). Conversely, a threshold can be adjusted to give a desired low false positive rate. A method for event detection based on this procedure has been in use in the laboratory of the first author for some time and has been found to work well. A similar approach might be applied to the detection of episodic features in LFP and EEG signals with a view to further characterizing them. As mentioned in the first paragraph of the DISCUSSION, spike-like events can be detected in LFP recordings this way and these can be shown to be of different types (21). We are not sure if EEG signals have been analyzed with a similar approach, beyond the aim of detecting artifacts in the signals. However, decomposition of the EEG in this way might be insightful. Given that the EEG changes over time, depending on brain state, it might also be characterized in terms of flanking distribution parameters derived from fits to short recording periods, in combination with the measured width of the central Gaussian distribution. This might lead to new ways of characterizing sleep stages and brain states in general. Conceivably, first-order distributions might be less noisy and idiosyncratic than the second-order ones normally used to define brain states.
The observation that voltage distributions are Gaussian over only a limited range of about ±1.5 SD (Fig. 3) may have consequences for spike sorting, especially those based on the common approach that explicitly models distributions as multivariate Gaussians (e.g., 29). Although the principal component (PC) distributions normally used for sorting are not trivially guaranteed to have the same distributions as those of the voltage values from which they are derived, some similarity might be expected given that the distributions will be a linearly weighted sum of the individual voltage distributions. Examination of examples of PC distributions for spikes in data sets R4 and R6 showed heavytailed distributions similar to those of the raw voltage values (Supplemental Fig. S3). It can be argued that the impact on sorting will usually be small, given that the tails are unlikely to have much influence on the estimation of the central Gaussian parameters and that the fits are used only to establish clustering domain boundaries. However, rejection of candidate spikes based on the probability of their belonging to an estimated distribution might lead to an unjustified rejection of spikes from sorted units on the grounds that they are outliers. A related caveat applies to the estimation of event detection accuracy, or the removal of outliers based on the assumption of Gaussian distributions (30). Accurate knowledge of voltage distribution parameters is clearly an important prerequisite for reliable spike sorting.