Extracting Spatiotemporal Human Activity Patterns in Assisted Living using a Home Sensor Network

Dimitrios Lymberopoulos, Athanasios Bamis and Andreas Savvides
Embedded Networks and Applications Lab, ENALAB
New Haven, CT 06520, USA
{dimitrios.lymberopoulos,athanasios.bamis,andreas.savvides}@yale.edu

ABSTRACT This paper presents an automated methodology for extracting the spatiotemporal activity model of a person using a wireless sensor network deployed inside a home. The sensor network is modeled as a source of spatiotemporal symbols whose output is triggered by the monitored person’s motion over space and time. Using this stream of symbols, we formulate the problem of human activity modeling as a spatiotemporal pattern-matching problem on top of the sequence of symbolic information the sensor network produces and solve it using an exhaustive search algorithm. The effectiveness of the proposed methodology is demonstrated on a real 30-day dataset extracted from an ongoing deployment of a sensor network inside a home monitoring an elder. Our algorithm examines the person’s data over these 30 days and automatically extracts the person’s daily pattern.

Categories and Subject Descriptors I.5 [Pattern Recognition]: Models

General Terms Algorithms, Design, Experimentation

Keywords Human activity model, spatiotemporal activity patterns

1. INTRODUCTION

The growing number of aging baby boomers and the increasing cost of healthcare underscore the need for automated services that will increase the independence and autonomy of elders living at home. Wireless sensor networks offer a promising technology for realizing such services. On one hand, small wearable devices can collect biometric information, provide feedback and automatically update medical records. On the other hand, devices deployed inside the living environment can monitor people over space and time, understand their activities and behaviors, and provide responsive services to them. For instance, an intelligent sensor network could be used to guard against unsafe activities inside the house, post reminders, automate tasks and even initiate conversation with the monitored person. Instead of passively monitoring the home environment, the wireless sensor network could actively use the sensing information to enforce a set of rules such as: make sure that the person engages (or does not engage) in the activities that his doctor recommends (or does not recommend); predict the activities that will be performed next and automatically prepare the house for them; and detect unusual or abnormal activity and notify the necessary medical personnel, family and close friends with detailed information about the status of the monitored person over the last few minutes, hours or days.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PETRA 2008 July 15-19, 2008, Athens, Greece Copyright 2008 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

In this paper we explore the problem of creating a human activity model from data collected by a sensor network deployed inside a home. We do so by deploying sensors in the house infrastructure without requiring the inhabitant to wear any sensors. Our reasoning for constructing the models is based on the fact that human activity is a sequence of actions over space and time. All humans execute a daily cycle in which many activities are periodic, and elders living alone in particular tend to have a highly periodic set of routines that they execute from day to day. As the monitored person moves from room to room inside the house, a sequence of detected sensing features is produced over time. This sequence represents the monitored person’s activity signature and it is composed of a set of triplets containing location, time and duration information. To better illustrate this, consider the 7-day data trace shown in Figure 1, which we acquired through an actual sensor network deployment in an elderly person’s house. By simply inspecting the sequence of rooms that the person visited over time inside the house, it is clear that patterns strongly related to the person’s activities start emerging.
The sleeping pattern, the bathroom visits, the time that the person is not home and many more patterns provide invaluable information about the person’s daily living habits.

Figure 1: 7-day window of the data trace recorded from our home network deployment.

The main contribution of this paper is the development of a methodology for automatically identifying the activity model of a person (like the one shown in Figure 7) using a wireless sensor network. First, we model the sensor network as a spatiotemporal symbol generator that is triggered by the monitored person as he moves over space and time. Based on our network model, we formulate the problem of finding the daily activity model of a person as the problem of finding the most probable network-level sequences of node-level sensing features, namely location, time and duration. We propose a methodology for encoding the detected sensing features (location, time and duration) in a way that allows us to apply an exhaustive, yet very efficient, algorithm for automatically discovering sequential patterns based on how frequently they appear in a given data trace. The effectiveness of the proposed methodology is demonstrated using a data trace collected over a period of 30 days by an actual sensor network deployment of Passive Infrared (PIR) sensors in the house of an elderly person living alone. Our results show that: 1) there is a daily activity pattern and 2) we can automatically generate this daily activity model while taking into account both its spatial and temporal characteristics.

The rest of the paper is organized as follows. Section 2 provides an overview of the related work. In Section 3 we present our testbed along with the type of information it records and we describe in detail the proposed network model. In Section 4 we formulate the problem of human activity model generation as a human spatiotemporal pattern discovery problem. Section 5 describes an exhaustive, yet efficient, pattern finding algorithm and Section 6 provides a novel, simple and scalable way for jointly considering space and time information in the pattern discovery process. In Section 7 we present the results of applying the proposed method on a 30-day dataset, recorded from a sensor network deployed in an actual elderly person’s home. Section 8 concludes the paper.

2. RELATED WORK

The problem of human activity recognition has been considered in several different domains [7, 12, 11] including wireless sensor networks [6, 14, 17, 13, 1]. These approaches usually focus on the robust detection of a single activity either through specification or typical learning techniques on top of labeled data. Our work differs in the sense that (1) it provides a general method for discovering multiple activities given a large data trace that contains unlabeled data and (2) it takes special care of both spatial and temporal characteristics of different activities.

The most closely related work to ours is the work done by Agrawal et al. [2, 3]. There, the notion of frequent itemset and frequent sequential pattern discovery in a sequence of events is introduced. Based on the apriori principle, Agrawal et al. were the first to propose efficient algorithms for discovering spatiotemporal sequences of events in large event sequences [3]. Since then, several research efforts have led to minor [16, 5, 15] or major [10] modifications of these algorithms. The major difference in our work lies in the way we encode and use time information in the pattern discovery phase. In particular, we propose to encode time information in the actual input events and on a per-event basis. By doing so, we can combine the efficiency of the apriori-based sequential pattern mining algorithms with the flexibility of exploiting different temporal characteristics on different input events. Our work also differs from previous research efforts [5, 15] in the sense that we focus on discovering sequential patterns that are closely related to human activities and not abstract, statistical correlations of events in the input sequence that might not be directly mapped to human activities.

Our previous work in hierarchical human activity recognition based on user-defined probabilistic context-free grammars [8, 9] is complementary to the work presented here. Instead of relying on a user-defined specification of an activity, as was done in previous papers [8], in this paper we automatically extract the activity as a spatiotemporal pattern given a recorded data set that might contain one or more activities. The two approaches are complementary in the sense that the automatically discovered patterns could be used to refine the user-defined activity specifications so that more robust activity detection is achieved.

3. DEPLOYMENT AND NETWORK MODEL

The data considered for this work comes from an ongoing sensor network deployment that monitors an elder living alone. The testbed includes a wide variety of sensors including tracking cameras, door sensors and passive infrared sensors. To derive the activity models presented in this paper we only used the PIR measurements. Every room in the house contains PIR sensors placed in a pattern that can capture the elder’s transitions from room to room. Each time a sensor gets triggered it transmits its ID to a home gateway that timestamps and records the sensor ID and uses the data to compute a room-transition function. A 7-day time window of the room transition plot from the actual testbed is shown in Figure 1. From this high-level view it is already apparent that the daily activity of the person under observation has regular recurring patterns. The method proposed in this paper will be applied to the complete dataset to extract these recurring patterns and construct a model of the person’s daily pattern.

Based on our deployment, the network generates a sequence of triplets of the form $\{P, T, D\}$, where $P$ is the phoneme detected by the sensor node (a room identifier for this discussion), $T$ is the timestamp at which this phoneme was detected and $D$ is the duration of the phoneme. According to this definition, the output of sensor node $i$ over time is a time-ordered sequence of triplets $S^i$. Assuming that in a given time frame sensor node $i$ has generated $N_i$ triplets, its output can be denoted as follows:

$$S^i = \langle \{P_1^i, T_1^i, D_1^i\}, \{P_2^i, T_2^i, D_2^i\}, \ldots, \{P_{N_i}^i, T_{N_i}^i, D_{N_i}^i\} \rangle, \quad \text{where } T_1^i < T_2^i < \ldots < T_{N_i}^i$$

Consequently, the output $O$ over time of a sensor network with $n$ nodes becomes a collection of such time-ordered sequences of triplets:

$$O = \{S^1, S^2, \ldots, S^n\}, \quad |S^1| = N_1, \; |S^2| = N_2, \; \ldots, \; |S^n| = N_n$$

where $S^i$ is the time-ordered output sequence at node $i$, containing $N_i$ triplets. For instance, in Figure 1, the waveform at the bottom represents the output $O$ of the sensor network while the two waveforms on the top represent the outputs of the nodes monitoring the bedroom and the bathroom respectively.
We define an episode $E(T_{start}, T_{stop})$ as the time-ordered sequence of all triplets in the output sequences of every node in the network that were recorded between $T_{start}$ and $T_{stop}$. Formally, the episode $E(T_{start}, T_{stop})$ is defined as follows:

$$E(T_{start}, T_{stop}) = \{ (P_j^i, T_j^i, D_j^i) \mid (P_j^i, T_j^i, D_j^i) \in S^i, \; T_{start} \le T_j^i \le T_{stop}, \; \forall i, j \}$$

Note that each episode is nothing more than a temporal clustering of triplets that might be recorded by one or more sensor nodes. Given the definition of episodes we can express the output $O$ of the sensor network as a collection of episodes:

$$O = \{E(T_1, T_2), E(T_2, T_3), E(T_3, T_4), \ldots\} \qquad (1)$$
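The network model above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Triplet` class, the `episode` helper and the calendar dates are assumptions introduced here and do not come from the deployment itself. Because the episode bounds are inclusive, as in the formal definition, a triplet stamped exactly on a shared boundary would fall into two consecutive episodes.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Triplet:
    phoneme: str         # P: room identifier
    timestamp: datetime  # T: time at which the phoneme was detected
    duration: int        # D: duration in minutes


def episode(outputs, t_start, t_stop):
    """Return E(t_start, t_stop): all triplets of all nodes, ordered by time."""
    selected = [t for node_seq in outputs for t in node_seq
                if t_start <= t.timestamp <= t_stop]
    return sorted(selected, key=lambda t: t.timestamp)


# Hypothetical per-node outputs S^i (dates are made up for illustration).
bed = [Triplet("Bed", datetime(2008, 1, 1, 23, 0), 300),
       Triplet("Bed", datetime(2008, 1, 2, 4, 5), 300)]
bath = [Triplet("Bath", datetime(2008, 1, 2, 4, 0), 5),
        Triplet("Bath", datetime(2008, 1, 2, 9, 5), 10)]
kitchen = [Triplet("Kitchen", datetime(2008, 1, 2, 9, 15), 30)]

network_output = [bed, bath, kitchen]  # O = {S^1, S^2, S^3}
night = episode(network_output,
                datetime(2008, 1, 1, 22, 0), datetime(2008, 1, 2, 10, 0))
rooms = [t.phoneme for t in night]
```

Merging the per-node sequences and sorting by timestamp is all that is needed to move from the node-level view to the network-level view.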

To better illustrate our network model, let us consider the simple night/morning activity of the elderly person monitored in our deployment. According to the data trace collected, the person will go to sleep around 11pm, wake up to go to the bathroom in the middle of the night and then return to sleep, then wake up again in the morning to visit the bathroom and then visit the kitchen to have breakfast. Given that the basic phonemes generated by our sensor network are rooms, a typical sequence of time-ordered phoneme triplets would be the following (duration is in minutes):

< {Bed, "11:00pm", 300}, {Bath, "4:00am", 5}, {Bed, "4:05am", 300}, {Bath, "9:05am", 10}, {Kitchen, "9:15am", 30} >

The above sequence represents an instance of the night/morning activity pattern of the elderly person. If we define an episode as the time-ordered sequence of all phoneme triplets that take place between 10pm and 10am every day, then different episodes would correspond to different instances of the same activity pattern:

E^1(10pm, 10am) = < {Bed, "11:00pm", 300}, {Bath, "4:00am", 5}, {Bed, "4:05am", 300}, {Bath, "9:05am", 10}, {Kitchen, "9:15am", 30} >
E^2(10pm, 10am) = < {Bed, "10:30pm", 270}, {Bath, "3:00am", 3}, {Bed, "3:03am", 330}, {Kitchen, "8:33am", 20} >
E^3(10pm, 10am) = ...

Note that in different episodes the absolute time and duration characteristics of the sequences, or even the sequences themselves, might be different.

4. EXTRACTING ACTIVITY PATTERNS FROM DATA SEQUENCES

The output $O$ of the sensor network is a collection of triplet sequences over time that are temporally organized in episodes. Each episode encodes the spatiotemporal activity of the monitored person in a specific time window. Different episodes provide different instances of the monitored person’s activity at different points in time. As a result, discovering the similarities across a large set of episodes amounts to discovering the basic activity model of the monitored person. The type of the model depends on the time duration of an episode. For instance, when an episode is defined as a single day, week or month of activity, the process of discovering activity patterns across a large set of episodes yields the daily, weekly or monthly activity model respectively. In particular, the more frequently a sequence of phoneme triplets appears in a sequence of episodes, the more important it is for the activity model. In general, given a sequence of episodes, we define the frequency $f_s$ of a phoneme triplet sequence $s$ as:

$$f_s = \frac{N_s^E}{N^E}$$

where $N_s^E$ is the number of episodes where the sequence $s$ appears at least once and $N^E$ is the total number of episodes. Note that $0 \le f_s \le 1$ and therefore the frequency $f_s$ can be seen as the appearance probability of the sequence $s$ in the input sequence of episodes.

Problem Statement: Given a sequence of episodes generated by a sensor network, find all the sequences $s_i$ of triplets with frequency $f_{s_i} \ge f_{th}$.

The goal of the above problem statement is to find the most frequent sequences of triplets in a given sensor network output. $f_{th}$ is a user-specified threshold that defines what "most frequent" actually means: all sequences with a frequency of at least $f_{th}$ are considered to be frequent. Frequency can be considered as a measure of how often a sequence of triplets appears in the output of the network. The more often a sequence appears, the more probable it is that this sequence encodes one of the core (most-performed) activities of the monitored person. Being able to find all these core activities will allow us to build the spatiotemporal model of the person’s daily, weekly or monthly activity habits. The ability to construct a model with spatiotemporal characteristics lies in the fact that triplets encode both spatial (phoneme $P$) and temporal (timestamp $T$ and duration $D$) information.
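As a minimal sketch, the frequency $f_s$ can be computed as follows. The text does not spell out what it means for a sequence to "appear" in an episode; the code assumes an order-preserving but not necessarily contiguous match, which is the usual convention in sequential pattern mining. The helper names and the third episode are hypothetical; the first two episodes are the room sequences of $E^1$ and $E^2$ with time and duration dropped.

```python
def contains(episode, pattern):
    """True if `pattern` occurs in `episode` in order (gaps are allowed)."""
    remaining = iter(episode)
    # `symbol in remaining` advances the iterator until the symbol is found,
    # so later symbols can only match at later positions of the episode.
    return all(symbol in remaining for symbol in pattern)


def frequency(pattern, episodes):
    """f_s = (number of episodes containing s) / (total number of episodes)."""
    if not episodes:
        return 0.0
    return sum(contains(e, pattern) for e in episodes) / len(episodes)


episodes = [
    ["Bed", "Bath", "Bed", "Bath", "Kitchen"],  # rooms of E^1
    ["Bed", "Bath", "Bed", "Kitchen"],          # rooms of E^2
    ["Bed", "Kitchen"],                         # hypothetical third episode
]

f_bed_kitchen = frequency(["Bed", "Kitchen"], episodes)       # in all three
f_bed_bath_bed = frequency(["Bed", "Bath", "Bed"], episodes)  # in two of three
```

With a threshold of, say, $f_{th} = 0.6$, both sequences above would count as frequent under this matching assumption.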

    // Find all frequent sequences of size 1
    L = 1
    F_L = {s | f_s ≥ f_th}
    while (F_L ≠ ∅) {
        L = L + 1
        // Generate candidate frequent sequences of size L
        C_L = candidate_generation(F_{L−1})
        for every episode E {
            // Find all candidate frequent sequences
            // that appear in episode E
            C_E = find_sequences(C_L, E)
            for every sequence s in C_E {
                f_s = f_s + 1
            }
        }
        F_L = {s | s ∈ C_L and f_s ≥ f_th}
    }
    Frequent Sequences = ∪_L F_L

Figure 2: The a-priori algorithm for discovering the frequent sequences in a set of episodes.

5. HUMAN ACTIVITY DISCOVERY

To simplify our discussion, in this section we will ignore the temporal information included in each triplet generated by a sensor node. In the next section we demonstrate how the proposed approach can be transparently used on top of the spatiotemporal triplets. A brute-force approach to the problem would be to generate all possible sequences of all possible lengths for all the different episodes, then compute the frequency of each sequence and choose those sequences that have a frequency of at least $f_{th}$. Even though this is an exhaustive search over the input that will find all frequent patterns, it requires first generating a huge list of candidate frequent sequences and then, for each one of these candidates, parsing all the episodes to compute the candidate’s frequency. Even worse, as the number of different phonemes increases and/or the number of observed phonemes in an episode increases, the number of candidate frequent sequences explodes. Our goal is to reduce the total number of candidate frequent sequences before we even try to compute their frequencies while making sure that all frequent patterns will be discovered. To deal with this problem, we take advantage of the apriori principle [2, 3]: if a sequence is frequent then all of its subsequences must be frequent. This argument is very similar to the shortest path argument in a network of nodes: the path between any pair of nodes located on the shortest path between two nodes is itself a shortest path. This observation is very important for two reasons. First, it indicates that all the candidate frequent sequences of size $L$ should be generated from the frequent sequences of size $L - 1$. This is due to the fact that the supersequence of any non-frequent sequence will also be a non-frequent sequence.
Second, after generating the candidate frequent sequences of size $L$, every sequence that contains a non-frequent subsequence should be automatically eliminated because it cannot be frequent. Based on these two fundamental observations, Agrawal et al. have designed an efficient algorithm, called the a-priori algorithm, for exhaustively searching the input set of episodes to find all the frequent sequences [2, 3].

Figure 2 shows the basic steps of this algorithm. First, the frequency of every sequence of length one is computed. In practice, the frequency of every phoneme is computed and the set of frequent phonemes $F_1$ is formed by choosing all the phonemes with a frequency of at least $f_{th}$. At the next step, the set of frequent phonemes is used to generate the candidate frequent sequences of length two. In general, the algorithm will iteratively generate all candidate frequent sequences of size $L$ using as input the frequent sequences of size $L - 1$. This is done in two discrete steps that have been extensively studied in the data mining domain [2, 3, 16, 18, 4]: the candidate generation and pruning steps. First (candidate generation step), every frequent sequence of size $L - 1$ is expanded by one frequent phoneme. If there is more than one frequent phoneme, then every frequent sequence of size $L - 1$ generates a candidate frequent sequence of size $L$ for every frequent phoneme. Second (candidate pruning step), the candidate frequent sequences of size $L$ that contain a non-frequent subsequence of size $L - 1$ are immediately eliminated and $C_L$ is formed. As soon as $C_L$ is defined, we pass over each episode to find which of the sequences in $C_L$ are contained in that episode. Every time a sequence in $C_L$ is contained in an episode, its frequency count is increased by one. After we examine all episodes, the set of frequent sequences of size $L$ ($F_L$) is formed by keeping only those candidate frequent sequences in $C_L$ with a frequency of at least $f_{th}$. This process continues iteratively until the set of candidate frequent sequences is the empty set. The final output of the algorithm consists of all the frequent sequences of different sizes. Note that at each iteration of the algorithm the new candidate frequent sequences are generated from the frequent sequences identified in the previous step. In that way, the overall number of sequences for which we have to compute a frequency is drastically reduced at every step.
This reduces the number of passes we have to perform over the set of episodes, which in general might be quite large. Hence, the cost of finding the most frequent patterns while making sure that all existing frequent patterns will be discovered is dramatically reduced.
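The steps of Figure 2 can be sketched in Python as shown below. This is a simplified illustration rather than the authors’ implementation: it assumes that a pattern "appears" in an episode when its phonemes occur in order (gaps allowed), it normalizes counts by the number of episodes before comparing against $f_{th}$, and the function names and the third episode are hypothetical.

```python
def contains(episode, pattern):
    """True if `pattern` occurs in `episode` in order (gaps are allowed)."""
    remaining = iter(episode)
    return all(symbol in remaining for symbol in pattern)


def apriori_sequences(episodes, f_th):
    """Return {pattern: frequency} for every pattern with frequency >= f_th."""
    if not 0 < f_th <= 1:
        raise ValueError("f_th must lie in (0, 1]")  # f_th = 0 would never terminate
    if not episodes:
        return {}
    n = len(episodes)

    def freq(pattern):
        return sum(contains(e, pattern) for e in episodes) / n

    # L = 1: frequent phonemes
    frequent = {}
    for phoneme in sorted({s for e in episodes for s in e}):
        f = freq((phoneme,))
        if f >= f_th:
            frequent[(phoneme,)] = f
    frequent_phonemes = [p[0] for p in frequent]
    result = dict(frequent)

    while frequent:
        # Candidate generation: extend each frequent sequence by one frequent phoneme.
        candidates = [s + (p,) for s in frequent for p in frequent_phonemes]
        # Candidate pruning: every subsequence of size L-1 must itself be frequent.
        candidates = [c for c in candidates
                      if all(c[:i] + c[i + 1:] in frequent for i in range(len(c)))]
        # Counting pass over the episodes.
        frequent = {}
        for c in candidates:
            f = freq(c)
            if f >= f_th:
                frequent[c] = f
        result.update(frequent)
    return result


episodes = [
    ["Bed", "Bath", "Bed", "Bath", "Kitchen"],  # rooms of E^1
    ["Bed", "Bath", "Bed", "Kitchen"],          # rooms of E^2
    ["Bed", "Kitchen"],                         # hypothetical third episode
]
patterns = apriori_sequences(episodes, f_th=0.6)
longest = max(patterns, key=len)
```

On this toy input the longest frequent pattern is `("Bed", "Bath", "Bed", "Kitchen")`. The pruning step is what keeps the search tractable: a candidate such as `("Bed", "Bath", "Bath")` is discarded without a counting pass because its subsequence `("Bath", "Bath")` is not frequent.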

6. HANDLING SPACE AND TIME

The methodology presented in the previous section allows for an exhaustive search over the output of the sensor network for discovering sequences of triplets with a frequency of at least a predefined threshold. Even though each triplet contains both location and time information (start time and duration), so far we have only used the time in a primitive way for sorting the locations produced by the different nodes in the network. As a result, the sequences we discover are nothing more than sequences of spatial features over time. It is apparent that essential temporal information included in the triplets is ignored. For instance, consider the simple example where we want to monitor the bathroom usage of an elderly person living inside the house. Knowing that the monitored person visited the bathroom is a useful piece of information. However, knowing when this visit took place is even more important: it is normal for an elderly person to visit the bathroom regularly during the day and right after he wakes up, but when a bathroom visit takes place in the middle of the night its meaning might be totally different. In the same sense, the duration of this activity is also very important. For example, a couple of bathroom visits in the evening or during the day might be considered normal activity; lengthier bathroom visits over the night, however, can be used to identify abnormal or possibly emergency situations. It is clear that even though location, time and duration of an event/activity can independently provide useful information, when these features are combined we can interpret the same event in a totally different way.

Given this, the following question arises: How can we combine the different spatial and temporal information that a triplet provides with the methodology presented in the previous section in a scalable way? Answering this question is challenging for the following reasons: (1) Different activities and even different phonemes require different time and duration resolutions. For instance, monitoring bedroom activity requires a duration resolution that can vary from tens of minutes to several hours. In the same sense, bathroom monitoring requires a duration resolution that varies from a few minutes up to tens of minutes. For the proposed sequence pattern discovery method to discover informative patterns, we have to concurrently support all these different resolutions. (2) The support of different time and duration resolutions must have the minimum possible impact on the size of the data set over which the proposed method is executed. Increasing the size of the data set can lead to extremely large execution times of the proposed method and therefore limit its applicability.

Current state-of-the-art methods take advantage of user-defined time windows to guide the pattern discovery process in the sequence of episodes [16, 10]. These time windows are used to constrain the scope of the pattern search algorithm in terms of the actual time and duration of a pattern.
Even though this approach does not increase the size of the input to the search algorithm, it has a significant drawback: it is not flexible enough to adjust to the different temporal characteristics of different patterns. Using a fixed time window in the search algorithm prevents us from efficiently searching for temporal patterns. On the other hand, using variable window sizes can significantly increase the complexity of the pattern search algorithm.

To deal with this issue, we have designed a flexible, user-configured, hierarchical temporal abstraction layer that encodes both the spatial and temporal information of a triplet into a single spatiotemporal symbol/phoneme. Executing the method described in the previous section on top of these symbols allows us to discover spatiotemporal sequences. Figure 3 shows the main components of the proposed temporal abstraction layer. Note that besides the sensor network input of triplet sequences, the user of the system provides two sets of condition parameters. These are used to hierarchically condition the input triplets based on their timestamps and/or their duration. Both parameter sets include user-specified conditions that could be applied to all, a subset or only a specific triplet of the input sequence. The filtering process for identifying which triplets are subject to a condition parameter is done using their phoneme fields.

Figure 3: Outline of the temporal abstraction layer. [Diagram, bottom to top: the sensor network output < {Bed, "11:00pm", 510}, {Bath, "7:30am", 25}, {Bed, "7:55am", 15}, {Bath, "8:10am", 5}, {Kitchen, "8:15am", 30} > passes through absolute-time-based conditioning, driven by the user’s absolute time condition parameters, giving < {Bed_Night, 510}, {Bath_Morning, 25}, {Bed_Morning, 15}, {Bath_Morning, 5}, {Kitchen, 30} >; duration-based conditioning, driven by the user’s duration condition parameters, then gives < {Bed_Night_Sleep}, {Bath_Morning_Shower}, {Bed_Morning}, {Bath_Morning}, {Kitchen_Short} >.]

Initially, every triplet in the input sequence that satisfies the filtering criteria will be conditioned on the actual timestamp at which it was recorded, using the user-defined parameters. This process will convert all or a subset of the input triplets $\{P, T, D\}$ into a tuple $\{P^T, D\}$, where $P^T$ is the new phoneme name (provided by the user) that embeds both spatial and absolute time information. Note that if a condition parameter for a specific type of phoneme does not exist, then the input triplet is simply converted to a tuple by ignoring the timestamp field of the triplet. At the immediate next level, the tuples that satisfy the filtering criteria of the user-specified duration condition parameters will be conditioned on their duration values. This process will convert all or a subset of the input tuples $\{P^T, D\}$ into a single symbol/phoneme $\{P^{T,D}\}$. Again, if a condition parameter for a specific type of phoneme does not exist, then the input tuple is simply converted to a spatiotemporal phoneme by ignoring the duration field of the tuple. Note that $P^{T,D}$ now embeds spatial, absolute time and duration information into a single spatiotemporal phoneme.

Running the method described in Section 5 on the output sequence of spatiotemporal phonemes allows us to identify spatiotemporal patterns without the need to explicitly process absolute time or duration information. In this way, we manage to identify spatiotemporal patterns by simply conditioning the input triplets on absolute time, on duration or on both. This approach can be seen as a phoneme renaming process that allows us to keep the size of the input data set intact while encoding all the necessary information we need. The only incurred overhead is an increase in the number of different phonemes used; however, this has no effect on the size of the input to the algorithm and therefore on its complexity. In addition, it gives the user the flexibility to apply different condition parameters to different phonemes, or even different condition parameters to the same phoneme, according to the requirements of different activities. Hence, the proposed scheme scales well with both the size of the input data set and the number of activities and phonemes we want to exploit.
To demonstrate how this temporal abstraction operates, consider the following input sequence of room, time and duration triplets (duration is expressed in minutes): < {Bed, ”11 : 00pm”, 510}, {Bath, ”7 : 30am”, 25}, {Bed, ”7 : 55am”, 15}, {Bath, ”8 : 10am”, 5}, {Kitchen, ”8 : 15am”, 30} > This sequence shows a typical morning activity. The person went to sleep at 11pm, slept for 8, 5 hours, then woke up, took a shower, then returned to the bedroom to get dressed, quickly visited the bathroom and finally went to the kitchen

to get breakfast. A possible set of absolute time condition parameters in this case could be the following: Phonemes Time Spatiotemporal Range Phoneme {Bed} (”8:00pm”,”12:30am”) {Bed Night} {Bed} (”5:00am”,”11:00am”) {Bed Morning} {Bath} (”5:00am”,”11:00am”) {Bath Morning} After applying this conditioning to the input sequence of triplets we get the following sequence of tuples: < {Bed N ight, 510}, {Bath M orning, 25}, {Bed M orning, 15}, {Bath M orning, 5}, {Kitchen, 30} > Note that all absolute time references have been eliminated and the Kitchen phoneme remained the same since there was no conditioning parameters defined for it. At the immediate next step the set of duration condition parameters is applied. Such a simple set of parameters can be seen next: Phonemes Duration Spatiotemporal Range Phoneme {Bed} (120,700) {Bed Sleep} {Bed Night} (120,700) {Bed Night Sleep} {Bath} (15,30) {Bath Shower} {Bath Morning} (15,30) {Bath Morning Shower} {Kitchen} (5,40) {Kitchen Short} By applying this set of duration condition parameters on the input sequence of tuples we get the following sequence of spatiotemporal symbols: < {Bed N ight Sleep}, {Bath M orning Shower}, {Bed M orning}, {Bath M orning}, {Kitchen Short} > Note that each symbol in the new output sequence encodes absolute time, location and duration information. For instance, the Bath Morning Shower indicates a bathroom visit that took place in the morning with a duration large enough to indicate shower activity. In the same way, the Bed Night Sleep phoneme provides information of a sleeping activity that took place during the night. The time and duration condition parameters for the different phonemes can be either speciffied by the user or extracted for exploitoing the statistics of the raw sensing data as it is demonstrated in Section 7.2. 
Due to the semantic information embedded in these symbols, we can use the proposed method to find much more fine-grained, and thus more informative, sequences over space and time. In this way, instead of modifying the algorithm described in Section 5 to explicitly process time information, something that could significantly increase its complexity and would scale poorly with the size of the input data set, we encode the spatiotemporal information into the input phonemes, which allows us to implicitly discover spatiotemporal patterns.
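
The two conditioning steps of the example above can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation used in our system: the function names are hypothetical, ranges are assumed to be inclusive, a time range is assumed to wrap past midnight when its start is later than its end, and the first matching rule is assumed to win.

```python
def to_minutes(clock):
    """Convert a 12-hour clock string such as '11:00pm' to minutes after midnight."""
    digits, suffix = clock[:-2], clock[-2:].lower()
    hours, minutes = (int(part) for part in digits.split(":"))
    return (hours % 12 + (12 if suffix == "pm" else 0)) * 60 + minutes


def in_time_range(clock, start, end):
    """True if clock lies in [start, end]; the range may wrap past midnight (assumption)."""
    t, lo, hi = to_minutes(clock), to_minutes(start), to_minutes(end)
    return lo <= t <= hi if lo <= hi else (t >= lo or t <= hi)


# Condition parameters copied from the two tables of the example.
TIME_RULES = [
    ("Bed", "8:00pm", "12:30am", "Bed Night"),
    ("Bed", "5:00am", "11:00am", "Bed Morning"),
    ("Bath", "5:00am", "11:00am", "Bath Morning"),
]
DURATION_RULES = [
    ("Bed", 120, 700, "Bed Sleep"),
    ("Bed Night", 120, 700, "Bed Night Sleep"),
    ("Bath", 15, 30, "Bath Shower"),
    ("Bath Morning", 15, 30, "Bath Morning Shower"),
    ("Kitchen", 5, 40, "Kitchen Short"),
]


def condition_on_time(triplets):
    """Replace (room, time, duration) by (phoneme, duration); unmatched rooms are kept."""
    result = []
    for room, clock, duration in triplets:
        label = next((new for old, start, end, new in TIME_RULES
                      if old == room and in_time_range(clock, start, end)), room)
        result.append((label, duration))
    return result


def condition_on_duration(pairs):
    """Replace (phoneme, duration) by a single symbol; unmatched phonemes are kept."""
    return [next((new for old, lo, hi, new in DURATION_RULES
                  if old == label and lo <= duration <= hi), label)
            for label, duration in pairs]


triplets = [("Bed", "11:00pm", 510), ("Bath", "7:30am", 25), ("Bed", "7:55am", 15),
            ("Bath", "8:10am", 5), ("Kitchen", "8:15am", 30)]
symbols = condition_on_duration(condition_on_time(triplets))
print(symbols)
```

Because the output is a plain sequence of symbols, it can be passed to a frequent pattern mining algorithm without any change to that algorithm.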

7. EXPERIMENTAL RESULTS

We evaluated the proposed spatiotemporal, activity-based frequent pattern mining method on a 30-day data trace collected using the home sensor network deployment described in Section 3. A network of 15 PIR sensors was used to monitor the occupancy of all the different rooms in the house. The set of phonemes in this case became the actual rooms that were visited by the monitored elderly person over time, and every such phoneme was associated with an actual timestamp and a duration interval. To better highlight the value of the proposed approach and to simplify our discussion, we opted to operate on a slightly processed sequence of the visited rooms. In particular, we map the generated sequence of rooms into a sequence of primitive activities by applying a simple set of rules. This process has been successfully demonstrated in our previous work [8, 9]. Thus, the actual input phonemes become the different activities: Sleep, Bath, Breakfast, GetReady, Hangout, and Out, which provide information about when the person sleeps, visits the bathroom, has breakfast, gets ready for the day, spends time in the living-room watching TV and is out of the house, respectively. This mapping is equivalent to the transformation of the raw sensing data (red waveform at the bottom) to the simple activity waveforms (green and blue waveforms on the top) shown in Figure 1.

Figure 4: A subset of the most frequent spatial sequential patterns discovered in the collected trace.

7.1 Spatial Pattern Discovery

We applied the proposed method to the 30-day sequence of primitive activities to extract the daily living model of the monitored person. Since we were interested in the daily living model, we defined the duration of an episode to be the duration of a single 24-hour day. In that way, the input sequence of activities was expressed as a sequence of 30 episodes, where each episode contained an ordered sequence of the six different primitive activities. First, we applied our method while ignoring the temporal characteristics of the input sequence (i.e. no time or duration conditioning was applied). A subset of the extracted frequent sequential patterns can be seen in Figure 4. To facilitate the interpretation of the patterns, we have organized all the similar patterns into chronologically ordered groups. Even though no temporal information was considered, the daily living model of the elderly person begins to emerge. From the Sleep and Bath sequences (pattern groups 1 and 2 in Figure 4), one can see that the sleeping pattern of the monitored person consists of more than one Sleep and Bath activity. After the person wakes up (alternating sequence of Sleep and Bath activities), he will have breakfast and then spend most of his time in the living-room watching TV (pattern group 3 in Figure 4). Besides some bathroom visits, the person continues to hang out in the living area until he eventually leaves the house. After the person returns, he will continue to hang out in the living-room and occasionally visit the bathroom. The day ends with one last visit to the bathroom before going to bed (pattern group 4 in Figure 4).
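
To make the notion of episodes and pattern probability concrete, the following Python sketch splits a timestamped activity sequence into daily episodes and computes the fraction of episodes that contain a given pattern. It is a simplified illustration under stated assumptions and not the algorithm of Section 5: episodes are assumed to start at midnight, a pattern is assumed to match when its activities appear in the same order (not necessarily back to back), and the events and function names are hypothetical.

```python
from datetime import datetime


def split_into_episodes(events):
    """Group (timestamp, activity) events into one ordered activity list per calendar day."""
    episodes = {}
    for timestamp, activity in sorted(events):
        episodes.setdefault(timestamp.date(), []).append(activity)
    return [episodes[day] for day in sorted(episodes)]


def contains_in_order(episode, pattern):
    """True if the pattern's activities occur in the episode in the same order."""
    remaining = iter(episode)
    # 'symbol in remaining' consumes the iterator up to the first match,
    # so later symbols can only match later positions in the episode.
    return all(symbol in remaining for symbol in pattern)


def pattern_probability(episodes, pattern):
    """Fraction of episodes that contain the pattern (assumed definition)."""
    hits = sum(contains_in_order(episode, pattern) for episode in episodes)
    return hits / len(episodes)


# Hypothetical three-day trace, for illustration only.
events = [
    (datetime(2008, 1, 1, 0, 30), "Sleep"), (datetime(2008, 1, 1, 3, 0), "Bath"),
    (datetime(2008, 1, 1, 3, 10), "Sleep"), (datetime(2008, 1, 1, 8, 0), "Breakfast"),
    (datetime(2008, 1, 1, 9, 0), "Hangout"),
    (datetime(2008, 1, 2, 0, 15), "Sleep"), (datetime(2008, 1, 2, 4, 0), "Bath"),
    (datetime(2008, 1, 2, 4, 5), "Sleep"), (datetime(2008, 1, 2, 10, 0), "Hangout"),
    (datetime(2008, 1, 3, 1, 0), "Sleep"), (datetime(2008, 1, 3, 8, 30), "Breakfast"),
    (datetime(2008, 1, 3, 9, 30), "Hangout"),
]
episodes = split_into_episodes(events)
print(pattern_probability(episodes, ["Sleep", "Bath", "Sleep"]))
```

In this toy trace the pattern Sleep, Bath, Sleep appears in two of the three episodes.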

7.2 Spatiotemporal Pattern Discovery

Figure 5: Time and duration characteristics of the Sleep activity. Time is divided into 4 time windows of 6 hours each. Duration is divided into 30-minute windows. The z-axis represents the number of times the Sleep activity appeared in a specific time and duration window.

Figure 6: A subset of the most frequent sequential patterns discovered in the collected trace when both time and duration conditioning have been applied. The extensions "_M", "_A", "_E", and "_N" stand for morning, afternoon, evening and night.

While the patterns in Figure 4 provide basic information about the monitored person's daily living habits, they lack significant temporal information. Without absolute time or duration information the importance of the discovered patterns degrades. For instance, knowing that the person wakes up regularly to visit the bathroom is useful information, but it would be even more informative if we knew when this happens (in the middle of the night or in the morning) and whether it is periodic or not. In addition, it is important to know that the person leaves the house, but it is even more important to know when and for how long. To highlight the importance of time and duration information in the pattern discovery process, consider Figure 5, where the time and duration characteristics of the Sleep activity are shown. Figure 5 shows the number of times the Sleep activity appears in a specific time window during a day (4 time windows of 6 hours each) and with a specific duration. Note that the total number of Sleep instances is greater than the total number of days in the recorded data trace. This clearly shows that the sleeping pattern of the elderly person consists of multiple Sleep instances due to regular bathroom visits. Also, from the time windows where the Sleep instances appear we can infer when the person goes to sleep (last time window) and when he interrupts his sleep to visit the bathroom (the first two time windows correspond to bathroom visits in the night and in the morning respectively). The fact that sleep is interrupted by frequent bathroom visits can also be seen in the duration of the different Sleep instances.
In general, a Sleep activity can last from approximately one hour up to approximately 10 hours; however, in most cases the duration is between 2 and 4 hours. Figure 5 also shows the correlation between the absolute time when the Sleep activity happens and its duration. Sleep activities in different time windows have different duration characteristics. As Figure 5 shows, the duration of the Sleep activity instances is usually large when the elderly person first goes to sleep, a little before midnight, and it gradually decreases after each bathroom visit during the night or in the morning. This combination of spatial and temporal information provides a more detailed insight into the person's daily activity and therefore it should also appear in the discovered patterns.

Using the time abstraction layer described in Section 6 we were able to extract more information-rich spatiotemporal patterns by conditioning the sequence of input activities on absolute time and duration. In particular, we partitioned the day into 4 time windows (morning, afternoon, evening and night) and used a rough classification of the input activities into short and long according to their duration. Since different activities have different duration characteristics, we used activity-specific duration parameters that we extracted from statistical information, like that shown in Figure 5, for all the different input activities. Figure 6 shows a subset of the discovered spatiotemporal frequent patterns after running the proposed method on the conditioned input sequence of activities. Again, in order to facilitate the interpretation of the patterns, we have organized all the similar patterns into chronologically ordered groups. The information that can be extracted now is more valuable. By looking at pattern groups 1 and 2 in Figure 6, we can clearly see that the bathroom visits happen once during the night and once during the morning. The same pattern groups show that after the bathroom visit in the morning the person will go back to sleep for a short period of one to two hours, in contrast to the sleep instances of the preceding night.
After waking up and having a breakfast that is long most of the time, the person will spend most of his time in the living-room watching TV until the afternoon (pattern groups 3 and 4 in Figure 6). It is during the afternoon, usually around 3pm, that the person will leave the house for approximately 3 hours, and he continues watching TV in the living room as soon as he comes back (pattern groups 4 and 5 in Figure 6). During the evening, the person will eventually visit the bathroom one last time before going to sleep (pattern group 5 in Figure 6).
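
The kind of time and duration statistics shown in Figure 5 can be computed with a simple two-dimensional histogram. The Python sketch below is only an illustration: the Sleep instances are hypothetical, the four 6-hour windows are assumed to start at midnight, and the median-based split into short and long durations is one possible choice that we introduce here for the example, not necessarily the rule used to produce Figure 6.

```python
from collections import Counter
from statistics import median

WINDOW_HOURS = 6   # 4 time windows of 6 hours each, as in Figure 5
BIN_MINUTES = 30   # 30-minute duration windows, as in Figure 5


def time_duration_histogram(instances):
    """Count activity instances per (time window, duration bin).

    instances: iterable of (start_minute_of_day, duration_minutes).
    Window 0 is assumed to cover 00:00-06:00, window 3 covers 18:00-24:00.
    """
    counts = Counter()
    for start, duration in instances:
        window = (start // 60) // WINDOW_HOURS
        duration_bin = duration // BIN_MINUTES
        counts[(window, duration_bin)] += 1
    return counts


def short_long_threshold(instances):
    """Hypothetical rule: durations above the median are labeled long."""
    return median(duration for _, duration in instances)


# Hypothetical Sleep instances: (start minute of day, duration in minutes).
sleep_instances = [
    (23 * 60, 240),       # falls asleep at 11:00pm
    (3 * 60 + 10, 150),   # back to bed after a night bathroom visit
    (7 * 60, 90),         # short morning sleep
    (23 * 60 + 15, 250),
    (3 * 60, 120),
]
sleep_histogram = time_duration_histogram(sleep_instances)
threshold = short_long_threshold(sleep_instances)
print(sleep_histogram, threshold)
```

Each key of the resulting counter corresponds to one bar of a plot like Figure 5, and an activity-specific threshold of this kind could serve as a duration condition parameter for the time abstraction layer of Section 6.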

Figure 7: The generated daily activity model of the monitored elderly person using all frequent patterns with a probability higher than 15%. The states of the model are Sleep_N_Long, Bath_N, Sleep_M_Long, Sleep_M_Short, Bath_M, GetReady_M, Breakfast_M_Short, Breakfast_M_Long, Hangout_M_Long, Bath_A, GetReady_A, Out_A_Long, Hangout_A_Short, Hangout_A_Long, Hangout_E_Long, Bath_E and Sleep_E_Long.

These spatiotemporal frequent patterns represent the elderly person's core daily activities. By combining all these patterns together we can build the daily activity model of the monitored person. Figure 7 shows the activity model that was generated by combining all the frequent spatiotemporal patterns discovered by our method with a probability higher than 15%. Such a model could be used to predict the monitored person's activity at runtime or to detect unusual or abnormal activity.

8. CONCLUSION

We have introduced a method for extracting spatiotemporal human activity patterns by properly encoding location, time and duration information into a single phoneme. Our method can be transparently used on different people to automatically extract their daily activity model. The 30-day data trace collected from our home sensor network deployment was invaluable for understanding the process, its bottlenecks and requirements, and for evaluating the effectiveness of the proposed approach. Our exposure to the real data revealed that our previous, complementary work on grammars [8, 9] is crucial for transforming raw sensing data into a higher-level form that is more appropriate for discovering meaningful patterns. The reason is that, due to the noise that low-level data always include, there can potentially be infinitely many permutations, which makes the pattern discovery process extremely difficult. As part of our future work, we will focus more on the automatic discovery of the temporal properties of the model.

Acknowledgments

This work was partially funded by the National Science Foundation under projects CNS 062682 and IIS 075180. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

9. REFERENCES

[1] Alarmnet: Assisted-living and residential monitoring network - a wireless sensor network for smart healthcare, http://www.cs.virginia.edu/wsn/medical/.
[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB '94: Proceedings of the 20th International Conference on Very Large Data Bases, pages 487–499, San Francisco, CA, USA, 1994. Morgan Kaufmann Publishers Inc.
[3] R. Agrawal and R. Srikant. Mining sequential patterns. In Proceedings of ICDE, pages 3–14, Washington, DC, USA, 1995. IEEE Computer Society.
[4] M. Hasan, V. Chaoji, S. Salem, N. Parimi, and M. Zaki. DMTL: A generic data mining template library. In Workshop on Library-Centric Software Design (LCSD'05), 2005.
[5] E. Heierman, M. Youngblood, and D. J. Cook. Mining temporal sequences to discover interesting patterns. In KDD Workshop on Mining Temporal and Sequential Data, 2004.
[6] S. S. Intille, K. Larson, and E. M. Tapia. Designing and evaluating technology for independent aging in home. In International Conference on Aging, Disability and Independence, 2003.
[7] Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):852–872, 2000.
[8] D. Lymberopoulos, A. Ogale, A. Savvides, and Y. Aloimonos. A sensory grammar for inferring behaviors in sensor networks. In Proceedings of Information Processing in Sensor Networks, IPSN 2006, April 2005.
[9] D. Lymberopoulos, T. Teixeira, and A. Savvides. Detecting patterns for assisted living using sensor networks. In Proceedings of SensorComm, October 2007.
[10] H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1(3):259–289, 1997.
[11] D. Moore and I. Essa. Recognizing multitasked activities from video using stochastic context-free grammar. Pages 770–776, Menlo Park, CA, USA, 2002. AAAI.
[12] A. S. Ogale, A. Karapurkar, and Y. Aloimonos. View-invariant modeling and recognition of human actions using grammars. ICCV'05, October 2005.
[13] D. J. Patterson, D. Fox, H. Kautz, and M. Philipose. Fine-grained activity recognition by aggregating abstract object usage. In IEEE International Symposium on Wearable Computers, October 2005.
[14] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz, and D. Hahnel. Inferring activities from interactions with objects. IEEE Pervasive Computing, 03(4):50–57, 2004.
[15] K. Romer. Distributed mining of spatio-temporal event patterns in sensor networks. In EAWMS / DCOSS, 2006.
[16] P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison-Wesley, 2006.
[17] E. M. Tapia, S. S. Intille, and K. Larson. Activity recognition in the home setting using simple and ubiquitous sensors. In PERVASIVE 2004, 2004.
[18] M. J. Zaki, N. Parimi, N. De, F. Gao, B. Phoophakdee, J. Urban, V. Chaoji, M. A. Hasan, and S. Salem. Towards generic pattern mining. In ICFCA, 2004.
