Data Mining Schemes for Discovering Functional Connectivity Patterns in Multi-neuronal Spike Trains

Inferring the functional connectivity in a neural tissue by analyzing simultaneously recorded spike trains is an interesting but challenging problem. Current techniques, based on cross-correlograms, principal component analysis, etc., are computationally intensive when used to discover patterns of connectivity involving large numbers of neurons. We present temporal data mining techniques that overcome many of these difficulties.

Multi-neuronal spike train data

Multi-neuronal spike train data is obtained from experiments using micro-electrode arrays (MEAs) and from imaging techniques with indicator dyes for ions such as Ca2+. Cross-correlograms and PSTHs reveal pair-wise interactions between the spike trains of neurons, but such methods become computationally infeasible when extended to more than 3 or 4 neurons.

Discovery vs. Search: Looking for (? → ? → ... → ?)

The challenge is to discover all frequent/significant patterns in the data without exhaustively enumerating all possibilities.

No. of neurons in a pattern (N)    No. of possible patterns
 2                                 3,600
 5                                 777,600,000
 7                                 2,799,360,000,000
10                                 604,661,760,000,000,000

Table 1: Combinatorially possible patterns of size N neurons in a 60-neuron system

We present techniques, based on principles of data mining, that overcome this difficulty and make the discovery of higher-order interactions feasible.

Frequent Episode Discovery Framework

In this framework, the data is viewed as a sequence of time-ordered events. Episodes are time-constrained, partially ordered sets of event types (e.g., A x x x B x x C, where x denotes other intervening events).

Figure 1: Spike train as event sequence for frequent episode discovery

Serial episode: an ordered set of events (A → C → B).
Parallel episode: events may occur in any order (B C D).

Figure 2: Occurrences of serial and parallel episodes

Solving the combinatorial explosion

The curse of dimensionality is overcome using a three-fold approach.

• Level-wise discovery: For large patterns, the number of candidates grows too fast, but we only want patterns that occur sufficiently often. Given the frequent patterns of size N, we generate only those size-(N+1) candidate patterns that could be frequent.

Figure 3: Apriori-like mining of patterns. (a) Level-wise discovery. (b) Efficient candidate generation.

• Frequency measure: Counting all possible occurrences of an episode is inefficient. We instead use the non-overlapped count [1] as the frequency measure, which admits efficient algorithms for frequent episode discovery.

Figure 4: Efficient counting algorithm

The runtimes of this class of automaton-based algorithms scale linearly with data length, i.e. O(n), where n is the number of spikes in the spike trains.

• Temporal constraints: Even so, we may end up with too many patterns, because the discovery is completely unsupervised. We use temporal constraints both to make the discovery efficient and to focus the search on relevant patterns (see Figure 5). Serial episode mining can automatically select the appropriate inter-event intervals between consecutive events.

Figure 5: Inter-event and expiry-time constraints for serial and parallel episodes, respectively

Details of efficient algorithms for mining episodes under temporal constraints are given in [2].

Network structure through episodes
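As a rough, illustrative sketch of how network structure might be read off discovered episodes: one simple convention (an assumption made here for illustration, not necessarily the procedure used in this work) is to count non-overlapped occurrences of candidate serial episodes and to interpret each pair of consecutive neurons in a frequent episode as a putative directed connection. The Python sketch below follows this convention; the function names, the max_gap inter-event constraint, and the single greedy automaton are simplifications, whereas the actual algorithms [1, 2] track multiple candidate automata and handle more general temporal constraints.

    # Minimal illustrative sketch (not the authors' implementation).

    def count_nonoverlapped_serial(events, episode, max_gap):
        """Count non-overlapped occurrences of a serial episode in an event sequence.

        events  : list of (time, neuron_id) pairs, sorted by time
        episode : tuple of neuron ids, e.g. ('A', 'B', 'C') for A -> B -> C
        max_gap : maximum allowed time between consecutive episode events
        """
        count = 0
        pos = 0        # index of the next episode event we are waiting for
        last_t = 0.0   # time of the previously matched episode event
        for t, nid in events:
            if pos > 0 and t - last_t > max_gap:
                pos = 0                    # inter-event constraint violated; abandon partial match
            if nid == episode[pos]:
                pos, last_t = pos + 1, t   # advance the automaton
                if pos == len(episode):
                    count += 1             # completed occurrence; its events are not reused
                    pos = 0
        return count

    def episodes_to_edges(frequent_episodes):
        """Read consecutive neurons of each frequent serial episode as putative directed edges."""
        edges = set()
        for ep in frequent_episodes:
            edges.update(zip(ep, ep[1:]))
        return edges

    # Toy usage: two non-overlapped occurrences of A -> B -> C within a 5 ms inter-event gap
    events = [(1, 'A'), (3, 'B'), (4, 'C'), (10, 'A'), (11, 'B'), (13, 'C'), (20, 'B')]
    print(count_nonoverlapped_serial(events, ('A', 'B', 'C'), max_gap=5.0))  # -> 2
    print(episodes_to_edges([('A', 'B', 'C')]))                              # -> {('A', 'B'), ('B', 'C')}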