Trending ▼ ICSE CBSE 10th ISC CBSE 12th CTET GATE UGC NET Vestibulares ResFinder

networking projects

55 pages, 0 questions, 0 questions with responses, 0 total responses,

	vpnjain	+Fave Message
	Profile Timeline Uploads

Home > vpnjain >

Formatting page ...

IT 09 056 Examensarbete 30 hp November 2009 Wireless Sensor Network Deployment in Mobile Phones Environment Zheng Ruan Institutionen f r informationsteknologi Department of Information Technology Abstract Wireless Sensor Network Deployment in Mobile Phones Environment Zheng Ruan Teknisk- naturvetenskaplig fakultet UTH-enheten Bes ksadress: ngstr mlaboratoriet L gerhyddsv gen 1 Hus 4, Plan 0 Postadress: Box 536 751 21 Uppsala Telefon: 018 471 30 03 Participatory sensing is a rising and promising field, which utilizes mobile phone users as mobile wireless sensors to gather data. However, because of the randomness of its participants, it is necessary to deploy wireless sensors in the sensing area at the same time, in order to gather enough quantity data with satisfactory quality. The deployment becomes a challenge because participatory sensing processes are dynamic and wireless sensor networks are relatively static. In this thesis, we propose a framework for the wireless sensors deployment in the participatory sensing campaigns. The aim is to minimize the number of sensors deployed, while providing enough satisfactory quality of data. Our framework consists of several sub-models and has a great flexibility. The experiments prove a good performance for our deployment framework. Telefax: 018 471 30 00 Hemsida: http://www.teknat.uu.se/student Handledare: Edith Ngai mnesgranskare: Lars- ke Larzon Examinator: Anders Jansson IT 09 056 Tryckt av: Reprocentralen ITC Contents 1 Introduction 1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 5 5 2 Related Work 2.1 Participatory sensing as a dynamic process . . . . . . 2.2 Wireless sensor network as a static deployment . . . . 2.3 Necessity of combination . . . . . . . . . . . . . . . . . 2.4 Researches on participatory sensing . . . . . . . . . . . 2.5 Researches on deployment of wireless sensor networks 6 6 8 8 9 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3 Problem Formulation 4 Deployment Algorithm 4.1 Model for sensors and terrain of sensing eld . . . . . . . . . . 4.2 Evaluation of performance of participants . . . . . . . . . . . . 4.2.1 Beta distribution . . . . . . . . . . . . . . . . . . . . . . 4.2.2 Participants performance model by beta distribution . . 4.2.3 Aging factor . . . . . . . . . . . . . . . . . . . . . . . . 4.2.4 More general model . . . . . . . . . . . . . . . . . . . . 4.3 Predication of participants actions . . . . . . . . . . . . . . . . 4.3.1 Model for participants actions . . . . . . . . . . . . . . 4.3.2 Markov chains . . . . . . . . . . . . . . . . . . . . . . . 4.3.3 Initialization of Markov transition matrix . . . . . . . . 4.3.4 Predication of participators actions . . . . . . . . . . . 4.3.5 Update of Markov transition matrix . . . . . . . . . . . 4.4 Determination of grid points which need extra wireless sensors 4.5 Deployment of wireless sensor network . . . . . . . . . . . . . . 4.5.1 Solve case 1 by constraint programming . . . . . . . . . 4.5.2 Solve case 2 by greedy algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 15 17 17 18 19 20 20 21 22 23 23 25 26 27 27 36 5 Experimental results 43 5.1 Lower bound for performance . . . . . . . . . . . . . . . . . . . . 43 5.2 Experiment with motion patterns . . . . . . . . . . . . . . . . . . 45 1 6 Conclusion 48 2 List of Figures 2.1 Procedure of participatory sensing . . . . . . . . . . . . . . . . . 7 3.1 Example of grid points in sensing eld . . . . . . . . . . . . . . . 11 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 4.10 4.11 4.12 Example of sensing quality probability distribution Models in framework . . . . . . . . . . . . . . . . . Example of grid points, obstacles and a sensor . . Example of Beta distributions . . . . . . . . . . . . Examples of two participants . . . . . . . . . . . . Examples of motions sequence of a participant . . Examples of starting and ending state . . . . . . . View as bipartite graph . . . . . . . . . . . . . . . Results for Experiment 1 . . . . . . . . . . . . . . Results for Experiment 2 . . . . . . . . . . . . . . Results for Experiment 3 . . . . . . . . . . . . . . Results for Experiment 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 15 16 18 19 21 24 28 39 40 41 42 5.1 5.2 5.3 5.4 Number of Sensors Deployed . . . . . . . Percentage of Coverage . . . . . . . . . . . Running results without Wireless Sensors Running results with Wireless Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 45 46 47 3 . . . . . . . . . . . . . . . . . . . . List of Tables 4.1 Performance comparison . . . . . . . . . . . . . . . . . . . . . . . 4 35 Chapter 1 Introduction 1.1 Background A wireless sensor network (WSN) is a network consisting of spatially distributed autonomous devices, and use them to cooperatively monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, motion or pollutants at di erent locations. The applications for WSNs are many and varied, typically involved some kind of monitoring, tracking and controlling. Speci c applications for WSNs include habitat monitoring, object tracking, nuclear reactor control, fore detection, and tra c monitoring. Mobile phones could also be used as kind of wireless sensors. They have the advantage of numerous users around the world. People could collect data when they travel around by mobile phones, which is a rising and promising eld called participatory sensing . 1.2 Motivation In a typical application, a WSN is scattered in a region to collect data through its sensor nodes. Although individual sensor node is not very expensive, large deployment of sensor nodes in the network makes the total cost considerable expensive. In the meantime, there are many mobile phone users around the world, who could act as data collector sometimes. As a result, quite a lot of money could be saved by decreasing the number of of sensor nodes in locations where mobile phone users exist. However, the quality of sensing results by human di er much from each other, which may cannot satisfy the requirement of applications. So wireless sensors still need to be deployed as a complementary approach to participatory sensing. And the locations of sensors deployed should be calculated carefully. The aim is to minimize the number of sensors, while providing enough satisfactory quality of data. 5 Chapter 2 Related Work In this chapter, we introduce some related work. It is organized as follows: Section 1 introduces the participatory sensing, including its aim, procedure and also several examples. Section 2 compares the participants with wireless sensors. And the reasons why they need to be combined together are explained in section 3. Section 4 and 5 introduce researches on participatory sensing and wireless sensor networks deployment respectively. 2.1 Participatory sensing as a dynamic process The rapid adoption of mobile phones over the last decade and an increasing ability to capture, classify, and transmit a wide variety of data (image, audio, and location) have enabled a new sensing paradigm - participatory urban sensing - where humans carrying mobile phones act as, and contribute to, sensing systems. In participatory sensing, mobile phone-based data gathering is coordinated across a potentially large number of participants over wide spans of space and time[1]. CENS (Center for Embedded Networked Sensing) has put participatory sensing into action in some projects, for example: 1. Dietsense: Participants can upload photos of their daily meals onto the server. In addition to photos, they can also annotate photos with more information, such as voice or text messages. Then dietary specialists will provide further analysis and give some comments. 2. Garbage watch: Students in UCLA campus collect and upload photos of the contents of garbage bins, in order to help UCLA Facilities determine where new recycle bins should be placed, the e ectiveness of existing recycling infrastructure, and to learn more about when, where, and what materials get thrown away on campus. 3. Cyclesense: Cyclers use an application which runs on mobile phones to log their routes using GPS and provide geo-tagged annotations, along with 6 other data. The information is used to infer the roughness and tra c density of the road. And it can give suggestions and feedbacks on their routes. Although di erent participatory sensing campaigns exist, typically they have the following di erent stages: (a) De nition: Organizers de ne a sensing campaign, including its aim, coverage and lifetime, etc. (b) Recruitment and re nement: Organizers recruit some participants according to some criterions, such as number of campaigns volunteered for, number of campaigns accepted for, number of campaigns participated in, and number of campaigns abandoned[1]. (c) Execution and publishing: Organizers gather uploaded data from participants, then analyze them to get some useful results. The results will be published to the participants or to the public. It is important to note that the above stages are not totally sequential. These stages have feedbacks to each other. For example, the performance of participants during the execution will give some feedbacks to the recruitment. The organizers may decide to recruit more participants or replace some of them. The above stages during the whole campaign are shown in the following gure[7]: Sensing Campaign Participant Recuritment , Coordination , & Monitoring Campaign Execution Participant Profile definition Recruitment & refinement Execution & publishing Figure 2.1: Procedure of participatory sensing 7 Because Participatory sensing campaign involves a lot of human activities, it is very dynamic: (a) The organizers and participants are all human beings, whose activities are random to some extent. The participants may nd that the campaign does not suit their daily schedules, or some accidental events may happen to them. During a campaign, organizers can recruit more participants or replace some of them, according to their performance. (b) The campaign itself is also dynamic. As the campaign processes, the organizers may nd that the content of campaign must be changed. For example, the campaign may require more tasks, in order to gather enough data for analysis. The aim of campaign may also be changed because some new situations occur. For example, new aims are proposed because new problems are discovered in the data analysis. 2.2 Wireless sensor network as a static deployment Compared with the dynamic aspect of participatory sensing campaigns, wireless sensor network is relatively static. In most applications, after the WSNs are deployed, the topologies almost remain the same and their behaviors are more predictable. Although there are some random or unpredictable factors, such as damages of sensors, running out of energy, and data inaccuracy during transition, its performance can be analyzed. For example, the amount of data being gathered can be estimated. 2.3 Necessity of combination Participatory sensing campaigns which only rely on humans have some disadvantages: 1. Sometimes the campaign cannot attract enough participants, maybe because of their consideration of privacy. For example, in the Cyclesense, a lot of data is required such as GPS information along the routes, which is considered as personal secrets for many people. 2. If some critical data cannot be gathered, because of participants factors, the campaign has to be prolonged to get enough data for analysis. That means the cost of campaign will increase, which may exceed the budget. Even worse, the campaign may fail at last. On the other hand, the drawback of wireless sensor network is also obvious: 1. For some interesting research, the data need to be gathered is far more complex than temperatures or vibrations, such as images and audio. As 8 a result, the price of sensor for such tasks is quite expensive. It is almost impossible to deploy them in large amount. 2. After deployment of sensor network, its topology and sensing area remains the same (if we don t take the factors of damages and short of energy into account). If the sensing task changes, the sensor network need to be redeployed. It will cost quite a lot, if the number of sensors is huge. As a result, we should combine the advantages of them. That is, combine humans activities with wireless sensor networks. Wireless sensor network can be used to increase the coverage and quality of data gathering. Meanwhile, participants can decrease the number of sensors required and the cost when the network need to be redeployed. 2.4 Researches on participatory sensing Researches on participatory sensing have di erent focuses: privacy mechanisms, context-annotated mobility pro les for recruitment, performance evaluation for feedback, incentives and recruitment, etc. In our thesis, we mainly focus on two aspects: evaluating of participants performance and availability. Reddy proposed a model for evaluating participation and performance in participatory sensing, based on Beta distribution[1]. Reddy proposed a recruitment engine that uses campaign speci cations provided by an organizer to select a limited set of potential volunteers based on participants previous-gathered mobility pro les[8]. Their framework focuses on the geographic and temporal availability of participants. 2.5 Researches on deployment of wireless sensor networks Wireless sensor network deployment is one of main problems in wireless sensor network researches. Because it a ects the cost and performance of network. A good deployment strategy can reduce cost of network, save the energy for communication and increase robustness of the network. Because of its various applications, wireless sensor network deployment is researched in di erent situations. In some applications, the sensors are deployed at random, for example, by plane. Thus the coverage and connectivity are the most important factors. And in most of these applications, the sensors may be damaged by anthropogenic or natural factors, such as battle, earthquakes or ood. As a result, the network can lose its connectivity at any time. In some other applications, the deployment can be calculated before hand and the sensors can survive long time in the environment. Thus performances of deployments of this kind of applications are more predictable. The researches also focus on di erent aspects. Tian proposed a node-scheduling scheme to reduce system overall energy consumption and increase system lifetime[11]. 9 Their scheme turns o some redundant nodes and guarantees that the original sensing coverage is maintained. Dhillon proposed two greedy algorithms for deployment of wireless sensor network[12]. They built a probability model for wireless sensors based on a grid sensing eld. Chakrabarty proposed a deployment strategy to reduce cost for wireless sensor network who has di erent kind of wireless sensors[13]. They formalized it as a integer linear programming problem. Poduri proposed an algorithm based on arti cial potential elds for the self-deployment of a mobile sensor network[14]. Their deployment strategy is researched in a network with the constraint that each of the nodes has at least K neighbors. 10 Chapter 3 Problem Formulation In our thesis, the sensing eld is represented as a grid of two or three dimensional points gi . The distance between adjacent grid points is d. It is assumed that sensors can only be deployed in grid points, and the participants sensing actions are also performed in grid points. The number of grid points in sensing eld is denoted by N . The following gure shows an example: d d Sensor Grid points Figure 3.1: Example of grid points in sensing eld Di erent points in sensing area can have di erent importance due to di erent reasons. For example, some points are critical to the sensing project whose data need to be sensed in high priority. And such importance can be changed during the progress of participatory sensing, from period to period. Thus every grid point gi is associated with a pair < qi , pi >. qi indicates the lowest quality of data required by the campaign, which is a real number in the range of [0, 1]. The quality of sensing result is judged by organizes or experts. And pi indicates the lowest coverage probability required for that point. By coverage probability, we mean the probability that a grid point gi is sensed by participants or wireless 11 sensors. At the beginning of each period, quality and probability vectors Q =< q1 , q2 , ..., qN > and P =< p1 , p2 , ..., pN > are given as input parameters. Wireless sensor network should act as a complementary method in participatory sensing, to make sure enough data could be sensed. And it is not wise to deploy the network once and remain it the same during the whole campaign. It should be combined with human actions. The participatory sensing campaign can be divided into several periods. Before each period, the wireless sensor network is deployed (again), according to the information of participatory sensing campaign and its participants. There are many factors to be considered when a wireless sensor network is deployed, such as energy saving, connectivity, total cost. In most of participatory sensing campaigns, the sensors required are expensive because of their advanced functions. As a result, the most important factor is the number of sensors deployed. Our aim is to deploy minimum number of sensors, and provide enough coverage for every grid point in sensing grid at same time. 12 Chapter 4 Deployment Algorithm We propose a framework for deployment of wireless sensor network in participatory sensing environment. Our framework has a high level of exibility. It consists of several sub-models, and they communicate with each other by parameters. By exibility, we mean every model can be replaced by another providing the interfaces between models remain the same. This gives our deployment algorithm a good generality, which is important due to diversity of participatory campaigns. Our framework bases on the assumption that the sensing eld consists of two of three dimensional grid points, as it is described in the problem formulation. Our framework concentrates on the two-dimensional cases. However, it can be generalized into three-dimensional cases straightforwardly. The distance d between adjacent grid points is determined by di erent campaigns. The smaller d is, the more precise the frame becomes. However, too small d will result in large number of points in sensing eld, which increases the cost for computation. As a result, d should be chosen according to practical situations. Besides this, our framework consists of following sub-models: The model for sensors gives the sensing ability, by providing a detection probability matrix D. Each entry di,j gives the probability that grid point gi can be sensed by a sensor deployed in grid point gj . The terrain model gives information of the sensing eld, such as obstacles and climates. It a ects the detection probability matrix D. For example, obstacles block the visions of some kinds of sensors and this will set some entries di,j to be 0. The climates, such as fog, can decrease the detection probability. As a result, it works with sensors model together to provide the detection probability matrix D. The performance of every participant is described by quality evaluation model. The sensing quality is represented by marks as a real number q in the range of [0, 1]. This sub-model gives the probability distribution p(q )for sensing quality of every participant, based on historical data. The 13 sensing quality probability of next sensing action in the range [q1 , q2 ] can q be calculated by q12 p(q )dq . For example, if a sensing quality probability distribution of one participant is: 3.5 3 2.5 p 2 1.5 1 0.5 0 0 0.1 0.2 0.3 0.4 0.5 q 0.6 0.7 0.8 0.9 1 Figure 4.1: Example of sensing quality probability distribution Then when this participant performs a sensing action next time, the probability that the quality of sensing result lies in the range of [0.5, 1] is given 1 by 0.5 p(q )dq . The probability that a participant performs a sensing action in grid point gi in next period is predicted by the participant s actions predication model. This model provides a vector V , in which Vi gives the probability that grid point gi will be visited next period. The deployment algorithm takes the input from above sub-models, then calculates the grid points which need to be monitored by extra sensor(s) and gives the minimum wireless sensors required to make sure every grid point is covered by its minimum sensing probability, as well as the locations where the sensors are deployed. The whole framework can be shown by following gure: 14 S ensor Model Terrain Model Quality Evaluation Model Participant ' s Actions Predication Model Detection probability matrix Visiting probability vector Sensing quality probability distribution Deployment Algorithm Deployment Figure 4.2: Models in framework This chapter is organized as follows: In section 1, we build models for sensors and terrain of sensing eld. Then we build the quality evaluation model in section 2. Section 3 explains the participant s actions predication model. Then we use these models to predicate participants motions and nd out the area which need to be monitored by extra sensors in section 4. Finally, we give di erent deployments algorithms for two di erent cases in section 5, together with their performance analysis. 4.1 Model for sensors and terrain of sensing eld In a participatory sensing campaign, there are two kinds of sensors: mobile phones and wireless sensors. We represent sensors as a disk centered in one point with a positive detection radius r, which represent the range of sensing. Because participants can have di erent kinds of mobile phones, their sensing ranges di er from each other. Generally speaking, there are two kinds of sensing - precise and imprecise, which corresponds two di erent cases in our algorithm: Case 1: The sensors can produce perfect detections. That means, the result produced by the sensors is either yes or no , as a binary result. For example, in harbor communities study, participants are asked to gather data of gaseous and particulate pollutants. In the sensing range of a sensor, the data won t di er much, thus it can be regarded as percise . Case 2: The detections of sensors are imprecise. The precision of data is a ected by some factors, such as target distance to the sensor. For example, in the cyclesense campaign, participants are asked to gather acoustic data which are used to analyze the noises near the cycle route. 15 The sound becomes weaker as the target distance increases. As a result, the detection probability decreases. Dhillon proposed a model for probability of detection of a target by a sensor, together with a model for terrain of sensing area[12]. We adapt their models for detection probability and terrain in our thesis. They assume that the detection probability varies exponentially with the distance between the target and the sensor. A target at distance d from sensor is detected by that sensor with probability e d if 0 d r p(d) = 0 otherwise However, the choice of a sensor detection model is a parameter to our algorithms. It can be changed without a ecting the algorithms. It can be noted that case 1 is a specialization of case 2, by setting p(d) = if 0 d r otherwise 1 0 Terrain is an important factor in wireless sensor networks, which heavily a ects the range of sensors. For example, obstacles such as buildings can block the vision of sensors. We represent the sensing eld of interested as a grid (twodimensional) of points. However, our algorithm can be generalized into threedimensional cases straightforwardly. The following gure shows an example of sensing eld with obstacles: Obstacles r d d Sensor Grid points Figure 4.3: Example of grid points, obstacles and a sensor Detection probability matrix D describes the probability of detection between sensors and targets: d1,1 d1,2 d1,n d2,1 d2,2 d2,n D= . . . .. . . . . . . . dn,1 dn,2 dn,n 16 in which di,j indicates the sensing probability of a target in grid point j by a sensor in grid point i. The probability matrix can be calculated according to knowledge of sensor and terrain models. We let dis(i, j ) denote the distance from grid point i to grid point j . Then in our thesis, entries of D are calculated as follows: di,j = 4.2 p(dis(i, j )) 0 if vision of grid point j from i cannot be blocked otherwise Evaluation of performance of participants Evaluating performance of participants is similar to evaluating reputation of transaction parties in e-commerce. The participants act as merchants who sell goods, and the organizers or experts act as the customers. In reputation systems for e-commerce, the reputation of merchants are calculated according to the feedbacks and remarks from customers. Thus in participatory sensing, the organizers can also use similar systems to evaluating performance of participants. Existing reputation systems used in applications include: cumulative, where a user s reputation ratings are summed; average, where the reputation score is computed by averaging all scores; blurred, where a weighted sum is used to down weight old ratings; and adaptive, where the current reputation score a ects to what degree new observations make a di erence[2]. Because analysis of pilot campaign participation has shown that it is important to monitor participant contributions while a campaign is running, Sasank Reddy proposed a mathematical model for evaluating participation and performance in participatory sensing[1]. It adopted the Beta distribution to model the performance of a participant, which has solid foundation on the theory of statistics. We adapt their model in our thesis for evaluating participants performance. To be more precisely, our model evaluates the quality of sensing result of participants. 4.2.1 Beta distribution The beta distribution f (p| , ) can be expressed by using the gamma function as: ( + ) 1 f (p| , ) = p (1 p) 1 ( ) ( ) where 0 p 1, > 0, > 0. Beta distribution can be used to represent probability distributions of binary events. It is indexed by two parameters, and . If a process has two possible outcomes {x, x}. We let r denote the number of outcome x and let s denote the number of outcome x. Then the probability density function of outcome x in the future is a beta distribution by setting = r + 1 and = s + 1. The shape of Beta distribution is determined by and , the following gure shows some examples with di erent and : 17 12 2 2.5 3.5 1.8 10 3 1.6 2 1.4 2.5 8 1.2 1.5 2 6 1 1.5 0.8 1 4 0.6 1 0.4 0.5 2 0.5 0.2 0 0 0.2 =1 0.4 , 0.6 = 1 0.8 1 0 0 0 0.2 0.4 = 10 0.6 , 0.8 0 1 = 1 0 0.2 = 5 0.4 , 0.6 0.8 0 0.2 0.4 0.6 0.8 1 1 = 5 = 5 , = 10 Figure 4.4: Example of Beta distributions The exception of p is: E (p) = + The con dence factor is the posterior probability given the actual exception value lies within an acceptable level of error[4]. At the beginning, and are initialized to be 1, which results in a uniform distribution as it is shown by above gure. 4.2.2 Participants performance model by beta distribution The results of participants performance can be represented as a stochastic process which has two possible outcomes (x, x). x means a successful action and x means an unsuccessful action. For ith outcome, we de ne random variables ri and si as following: ri = 1 0 if ith action is successful otherwise si = 0 1 if ith action is successful otherwise It can be observed that ri + si = 1. Beta distribution can be used to model participants performance by setting r be number of successful actions and s be number of unsuccessful actions: r= ri , s = si Then a participant s performance can be found by calculating the expectation of the Beta distribution: E ( , ) = r+1 = + r+s+2 1 At the beginning, r and s are initialized to be 0. Thus E (1, 1) = 2 . 18 4.2.3 Aging factor In the above model, every action of participant has the same weight, without taking account of the time factor. However, as campaign progresses, the ability of participant is being improved, because of practice or taking advices and feedback from organizers. Intuitively, the participants will perform better and better. As a result, the recent performances represent more than the old ones. And the old performances should have less weights than recent ones. We introduce aging factor to show this e ect, by assigning di erent weights ki to di erent actions results: r= ki ri , s = ki si People has proposed several di erent methods to make aging factors. Josang used exponential function in [3]: ri (n i) , s = r= si (n i) , 0 1 It has one advantage that it doesn t need to remember all the actions results in the sequence, because it can be calculated recursively as follows: r = r + ri , s = s + si Reddy used a constant aging factor 0.8 in the experiment of PEIR[1]: r= ri 0.8(n i) , s = si 0.8(n i) Constant aging factor has the advantage that it can be calculated fast. However, the above methods only emphasize the order of actions results in the sequence, but omit that the performance of participant won t be changed too much during the same day. Meanwhile, the performance of a participant may decrease after taking no actions for several days. Taking the following two participants as an example: Participants 1 2 Days Successful No Sensing Actions Unsuccessful Figure 4.5: Examples of two participants The two participants both contributed 8 successful and 2 unsuccessful results during the campaign, and both in same order. They have same performance 19 evaluations, by using above aging factor system. If we use aging factor 0.8, both of them has the performance evaluation 0.7723. However, participant 1 should have higher probability to contribute successful results in the immediate future. So we use the following aging factor in our thesis: ki = (t ti ) in which t is the current day, ti is the day when the action is performed. After using this aging factor system, participant 1 has performance evaluation 0.8050, and participant 2 has performance evaluation 0.6257. Meanwhile, it still has the advantage of can be calculated recursively: r = r (ti ti 1 ) + ri , s = s (ti ti 1 ) + si 4.2.4 More general model In practical campaign, the feedback from organizers for participants are not just binary. Because the result of an action cannot be only judged as successful or unsuccessful. The organizers give the feedback in form of a pair of real numbers (ri , si ), in which ri means the satisfaction degree and si means the dissatisfaction degree. Moreover, it is more convenient to give the feedback by just one real number vi . Then ri and si can be calculated as follows: ri = 1 vi 1 + vi , si = 2 2 Because di erent tasks have di erent di culties, it is straightforward that it can be indicated by a positive weight wi to show the di culty. More important the task is, the larger its weight is. Then ri and si can be calculated as follows: ri = wi (1 + vi ) wi (1 vi ) , si = 2 2 Together with the aging factor, the parameters of and can be calculated as follows: wi (1 + vi ) (t ti ) =1+ ri = 1 + 2 wi (1 vi ) (t ti ) =1+ si = 1 + 2 Then we can get the probability that next sensing action of a participant 1 whose result is better than Q by calculating Q f (q |( , ))dq . 4.3 Predication of participants actions Another dynamical aspect of participants is their motions, because nobody knows what exactly he/she will do tomorrow. However, their motions are not totally random, because everybody has his/her schedule everyday, or places they 20 used to go to. As a result, their motions patterns could be learned and predicted by some models. Reddy proposed a coverage-based recruitment system that consists of three distinct stages: the quali er, interview, and progress view, modeled after realworld recruitment processes[8]. Although they made quite a lot of e orts in providing a privacy mechanism, it is limited in practical campaigns because of some reasons: 1. In the pre-processing of building mobility pro les, a lot of GPS logs are required. The engine generates a likely route using Google Router generator, which is not accurate. 2. The data of participants resides in a private data store and the queries can made by organizers are limited, but it is not easy to reduce participants concerns on privacy. 3. In some campaigns, the data required is not that much. What we need is just the geo-tagged data and its rough time. For example, in the Campus garbage campaign, the required data contains the locations where the photos are taken and their rough time. In our thesis, we propose a model based on Markov chains to learn and predict participants behaviors, which only needs data from sensing results uploaded by participants. 4.3.1 Model for participants actions Because most every person cares about his/her privacy, the almost only reliable way to collect information of users motions is using their uploaded geo-tagged data. The locations can be obtained from the uploaded data. So their motions during one day can be described by a sequence L. Every element li describes the location where the task is performed. The following gure shows an example: B A E D C Motion of Participant Figure 4.6: Examples of motions sequence of a participant This motions sequence in the gure is represent as [A, D, C, E, B ]. 21 4.3.2 Markov chains Markov chains is a stochastic process named after Andrey Markov. A Markov chain can be described as follows: We have a set of states, S = {s1 , s2 , ..., sn }. The process starts in one of these states and moves successively from one state to another. Each move is called a step. If the chain is currently in state si , then it moves to sj at next step with a probability denoted by pi,j , and this probability does not depend on which states the chain was in before the current state. This property is called Markov property. A transition matrix is used to describe the transitions of a Markov chain with n states. If the probability of moving from si to sj in one step is pi,j , the transition matrix P is given by using pi,j as the ith row and j th column element, e.g., p1,1 p1,2 p1,n p2,1 p2,2 p2,n P = . . . .. . . . . . . . pn,1 pn,2 pn,n It is obvious that the entry pi,j has the following properties: 0 pi,j 1 pi,j = 1 1 j n (n) Given P as the transition matrix of a Markov chain, pi,j of matrix P n gives the probability that the Markov chain, starting in state si , will be in state sj after n steps. One special type of Markov chains is the absorbing Markov chain. A state si of a Markov chain is called absorbing if it is impossible to leave it (i.e., pi,i = 1). A Markov chain if absorbing is it has at least one absorbing state, and if from every state it is possible to go to an absorbing state (not necessarily in one step). In an absorbing Markov chain, a state which is not absorbing is called transient[5]. In real life, where will people go next always has some relationship with where he/she is now. For example, if a student is in the teaching building, then he/she will most likely go to the canteen or the library. If a o ce lady is at work, then she will most likely go to some supermarkets for shopping. Meanwhile, the future motion of participants have few relationships with his/her historical motions, which is the Markov property. As a result, the sequence of participants motions can be modeled as Markov chains, as follows: The states set S consists of {s1 , s2 , ..., sn }, in which each state si corresponds to grid point gi in the sensing eld. 22 4.3.3 Initialization of Markov transition matrix In mathematics, the entries of Markov transition matrix is always set to be equal at the beginning, if no probability information about the system is known, e.g., 1/n 1/n 1/n 1/n 1/n 1/n P = . . . .. . . . . . . . 1/n 1/n 1/n In practical campaign, the transition matrix can be initialized by doing surveys among participants. Participants are asked about their daily schedule, then a transition matrix can be built for every participant. And this can also be used as a criterion for recruitment, together with participants previous campaign performance. Because people in the same class, such as colleague students, they have similar daily schedules, the transition matrix can be calculated from empirical data. For example, if we have previous campaign whose participants are students in same area, we can use the same transition matrix. In order to generate campaign-speci c participation and performance measures, campaign organizers could choose several mechanisms. Campaigns could incorporate a calibration phase paired with reoccurring check-ups where experts or campaign organizers obtain ground truth information for a particular area of interest. Participants would then be coordinated to monitor the same area. The observations made by participants could then be compared to the ground truth to obtain a reliability measure[1]. During this calibration phase, the motions of every participant can be collected and the transition matrix can be calculated. 4.3.4 Predication of participators actions (n) According to the property of Markov transition matrix, pi,j gives the probability that the Markov chain, starting in stat si , will be in state sj after n steps. However, because the number of actions taken by a participant may di er from days to days. We use discrete random variable T to denote the number of taken by participant during one day. Thus we need the discrete probability distribution of T . To accomplish this, we append a virtual ending state to the end of every motions sequence. Meanwhile, we pre x a virtual starting state to head of every sequence. The following gure shows an example with two motions sequences: 23 Ending B A Starting State E D State C Motion of Participant Vitural State Figure 4.7: Examples of starting and ending state Then the Markov transition matrix P is augmented from P , by adding starting state as 0th element and ending state as the (n + 1)th element, as follows: 0 p0,1 p0,2 p0,n , p0,n+1 0 p1,1 p1,2 p1,n , p1,n+1 P = 0 p2,1 p2,2 p2,n , p2,n+1 . . . .. . . . . . . . 0 0 0 0, 1 Every entry p0,i in the rst row denotes the probability that the process starts at state si . Meanwhile, every entry pi,n+1 in the last column denotes the probability that the process ends at state si . When P is initialized at beginning of campaign, if no information about participant s motions is known, it can be initialized as follows: 1 1 1 1 0 n+1 n+1 n+1 , n+1 1 1 1 1 0 n+1 , n+1 n+1 n+1 1 1 1 1 P = 0 n+1 n+1 n+1 , n+1 . . . .. . . . . . . . 0 0 0 0, 1 That means at beginning of a campaign, a participant has equal probability to start his/her sensing actions at any location. And he/she has equal probability to perform next sensing action in each locations. It can be noted that in the last row pn+1,i = 0, where 0 i n pn+1,n+1 = 1 24 That means, once a process enter the state , it will stay there forever, which indicates the motions sequences has been ended. Now this Markov chain becomes an absorbing Markov chain. The only absorbing state is , which is the (n + 1)th state. Absorbing Markov has the following theorem: In an absorbing Markov chain, the probability that the process will be absorbed is 1. Translated into our cases, it says (t) lim p t 0,n+1 =1 Now we can get the probability that a participant takes t actions during one day with help of P : p0,n+1 P r(T = t) = (t) if t = 0 (t 1) p0,n+1 p0,n+1 if t > 0 Meanwhile, it can be noted that: (t) (t 1) lim P r(T = t) = lim p0,n+1 lim p0,n+1 = 0 t t t That means a sensing sequence becomes more and more impossible when its length increases. By combining the discrete probability distribution of T and Markov transition matrix P , we can calculate the probability of one grid point gi being visited by a participant during one day: 1 (t) (1 P0,i ) t=0 In reality, the calculation is stopped after 1 threshold. 4.3.5 P r(T = t) is smaller than some Update of Markov transition matrix Similar with the performance of participants, after a participatory sensing period, the Markov transition matrix need also to be updated. During a period, a participant has motions sequence seqi = [g1 , g2 , ..., gm ] during a day. We let seq n denote the sequence consisting of the rst n elements of s, and let seq n denote the sequence consisting of s with the rst n elements removed. Then we can de ne a sequence seq2 is part of seq1 if seq2 = seq1 i j, where i 0 j 0 25 First we calculate the Markov transition matrix P period, as following: if pi,j = 0 p = i,j [gi ,gj ]is part of seq 1/ [gi ]is part of seq 1 if pi,j = 0 if if pi,j = 1 from information of this [gi ] is not part of seq si = si = sj = si = sj = That is, pi,j is the ratio of moves from state si to state sj to all the moves from si . Similar with the performance of participants, recently data should have more weight than the old ones. Thus aging factor is also used in the transition matrix. And it still has the advantage that the entries of matrix can be calculated recursively. Then it need not to remember all the participants motions. Assume that before this period, the transition matrix is P , then after received new information about the participant s motions, it can be updated as follows: pi,j = pi,j + pi,j +1 It can be rewritten in matrix notations form: P= 4.4 P + P +1 Determination of grid points which need extra wireless sensors For every grid point gi which has the sensing requirement < qi , pi >, the probability that its data can be sensed by any participant with required quality can be calculated in following steps: The probability that participant x perform a sensing action with required quality is 1 fx (q | , )dq qi The probability that grid point gi can be sensed when action is performed in grid point j is dj,i As a result, the probability that grid point i data can be sensed by participant x with required quality is j =N j =1 1 (1 (dj,i 1 fx (q | , )dq (1 qi (t) (1 P0,j )))) t=0 26 Now we can get the probability that data grid point i can be sensed by any participant with required quality is j =N (1 (dj,i pri = 1 1 fx (q | , )dq (1 qi x paritipants j =1 (t) (1 P0,j )))) t=0 If the result is smaller than pi , it means grid point i need be monitored by extra wireless sensors. And it s not hard to get that the new lowest required probability that grid point i is monitored by wireless sensors is pi = pi pri 1 pri if pri pi otherwise 0 After this step, we get a new vector P which indicates the probability that a grid point gi need to be covered by extra wireless sensors. 4.5 Deployment of wireless sensor network We give di erent algorithms for the two di erent cases. Unfortunately, both cases are NP-hard. Firstly, we propose a constraint programming algorithm to solve case 1, which aim at nding the optimal solution. Secondly, a greedy heuristic algorithm is proposed for case 2, which can nd an approximate solution. The performances of these algorithms are studied and compared in several di erent study cases. 4.5.1 Solve case 1 by constraint programming Minimum set covering The set covering problem (SC) can be de ned as follows: let U = {e1 , e2 , ..., em } be a set of m elements. Let X be a collection of subsets of U , i.e. X = {S1 , ..., Sn } where Si U, (1 i n) and let cj be a weight associated to each subset Sj , 1 j n. SC calls for a subset T of indexes of X covering U : T {1, ..., n} i T Si = U and such that j T cj is minimum. In this thesis, we only consider the cases when cj = 1. Case 1 can be modeled as a minimum set covering problem as follows: We let U to be the set of grid points which need to be monitored by extra wireless sensors. For each grid point gi in U , a subset Si of U is built by adding all the grid points gj such that di,j = 1. There are several techniques could be used to solve the minimum set covering problem. For example, constraint programming, integer programming and local search. We choose constraint programming to solve case 1 in our thesis. 27 Constraint programming solution Sebastien Mouthy, Yves Deville and Gregorie Dooms proposed a global constraint for the set covering problem in [9]. They also proposed a propagator for this constraint. It uses a shaving technique and a lower bound based on an IP relaxation. As far as we know, it is the only paper on how to solve the minimum set covering problem by constraint programming directly. Here we propose our method, based on some observations of the problem. In contrast to their shaving technique, we construct the set covering from empty set. The subsets are chosen to try enlarging the covering set. And their members are removed. Our algorithm is studied by using the same test data as [9], and the results turn out to be very good. Although the algorithm proposed here only deals with the cases when cj = 1, it is straightforward to be extended for general cases. Preliminaries The minimum set covering problem can be viewed more straightforwardly as a bipartite graph < A, B, E >, by constructing vertex set A in which ai represents Si , and vertex set B in which bj represents ej , and connecting vertex ai with bj if ej Si . Then the minimum set covering is converted to nding smallest subset A A to cover all vertexes in B . By cover we mean every vertex in B has at least one neighbor in A . For example, U = {1, 2, 3, 4, 5}, S1 = {1, 3, 5}, S2 = {1, 4}, S3 = {5, 2}, S4 = {1, 3, 2}. It could be viewed as following bipartite graph: S1 S2 S3 S4 A B e1 e2 e3 e4 e5 Figure 4.8: View as bipartite graph One of the smallest A is {a1 , a2 , a4 }. We let msc(S, U ) denote the cardinality of smallest set cover of S with the universe U . Meanwhile, we let mscG(A, B, E ) denote the cardinality of smallest subset of A which covers all vertexes in B , with the edges set E . And we may also use mscG(G) if the detailed information about graph G is not cared. Let S be a subset of U . By del(X, S ) we denote the collection which is obtained from X by removing the elements of S from every Si in X : del(X, S ) = {S |S = Si \ S Si X }. Some properties With the help of viewing as bipartite graph, we can get the following properties: 28 Property 1. If there is such vertex bj in B that no vertexes ai in A could cover it, then there is no solution. Property 2. We denote CC the set of connected components of the bipartite graph, then mscG(A, B, E ) = cc CC mscG(cc). Property 3. If there is a set Si in X which is a subset of Sj , then msc(X, U ) = msc(X \ {Si }, U ). Property 4. If there is an element e of U which belongs to a unique S X , then msc(X, U ) = 1 + msc({del(X, S )}, U \ S ). Property 5. For general msc(X, U ), the following holds: msc(X, U ) = min{msc(X \ {S }, U ), 1 + msc(del(X \ {S }, U \ S ))}. Property 6. If there is some vertex ai which covers no vertexes bj in B , then it could be removed. That is mscG(A, B, E ) = mscG(A \ {ai }, B, E ). Property 7. If a subset A A covers B , and all the vertexes in B are covered by more than one ai A , then there must be some A A which also covers B. Proof. We let coverage(bj ) = (ai ,bj ) E 1. Then if coverage(bj ) 2 for every bj , remove any ai A will decrease every coverage(bj ) at most by one. Thus A = A \ {ai } will still can cover cover B . Basic CP algorithm First we propose a basic algorithm for searching for a minimum set covering. MinimumSetCovering will nd the minimum set covering in bipartite graph (A, B, E ). Current is the vertex set which has been chosen as part of set covering. K is the minimum set covering found so far. FilterAndPropagate will do the propagation, it returns false if this search branch fails. At the beginning, it is called by CP (A, B, E ). CP(A,B ,E ) if A cannot cover B then no solution else MinimumSetCovering( ,A,B ,A) end if MinimumSetCovering works as follows: rst it will try to select vertex ai from A by some strategy, until a covering set is found or the search fails. After selecting ai , it will remove all bj in B which is covered by ai . Then it will call FilterAndPropagate to perform propagations. If FilterAndPropagate returns false, it increase the f ails counter. Otherwise, it checks whether B is empty, if so, Current will be a set covering and K will be replaced if |K | > |Current|. Otherwise it calls MinimumSetCovering to recursive. MinimumSetCovering(Current,A,B ,K ) 29 while A is not empty do save B select a from A by some strategy and remove it as well as the covered elements in B save A add a to Current if FilterAndPropagate(Current,A,B ,K ) then if B is empty then if |K | > |Current| then K =Current end if else MinimumSetCovering(Current,A,B ,K ) end if else f ails=f ails+1 end if remove a from Current restore A restore B end while FilterAndPropagate does the propagations: FilterAndPropagate(Current,A,B ,K ) repeat continue=f alse for each b in B do if there is no a in A which can cover b then return f alse end if end for for each a in A do if a cannot cover any b then remove a from A continue=true end if end for for each a in A do if the elements covered by a is subset of another aa then remove a from A continue=true end if end for for each b in B do if there is only one a in A which can cover b then remove a from A and all the b in B which is covered by a 30 add a to Current continue=true end if end for until continue is f alse There are several possible strategies could be used to select the next Ai to enlarge Current: 1. One possible strategy is to select ai which covers the most bj . 2. Another way is rst looking for bj which is covered by fewest ai , then choose one ai which covers it. 3. If after selecting some ai , the graph has more than one connected components, then we get several subproblems with smaller scales. Thus we could solve them individually, then combine the solutions according to property 2. In our algorithm, we use the rst strategy. More properties Although the above algorithm works, it is too basic. We will explore more properties about this problem. At any time, all the vertexes in A can be divided into four sets: 1. Candidate Set: The vertexes set Candidate A in which every ai can be selected to enlarge the covering set. 2. Current Set: The vertexes set Current A in which every ai is currently in the partial covering set. 3. Checked Set: We de ne Checked as the vertexes set in which every ai has been checked in current branch. That means, every set covering contains ai has been learned by the algorithm, directly or be pruned. 4. Removed Set: The Removed is the vertexes set in which every ai has been removed during the propagation. We could observe that: Candidate Current Checked Removed = A Candidate Current = Candidate Checked = Candidate Removed = Current Checked = 31 Current Removed = Checked Removed = For simplicity, we de ne more notations: 1. Covers(A ) = {bj | ai (ai A (ai , bj ) E )}, that is the subset of B which is covered by set A . 2. RemainingB = B \ Covers(Current), that is the subset of B which is not covered by Current. 3. U nique(A , A ) = Covers(A ) \ Covers(A \ A ), that is the set of bj which is only covered by A . Then we could have following properties immediately: Property 8. If there is some vertex ai Candidate, and Covers({ai }) RemainingB is subset of some Covers({cj }) and cj Checked. Then ai could be removed safely from Candidate. Proof. Assume that we nd a solution which contains {ai } Current, ai could be replaced by cj , and the result set is still a solution. Meanwhile, by the de nition of Checked, all the covering sets which contain cj have been learned by the algorithm. As a result, ai could be removed from Candidate safely. Property 9. If there is a subset C Current and subset Ch Checked, and U nique(C , Current) Covers(Ch ), and |Ch | |C |. Then this search tree branch could be pruned safely. Proof. Because in this branch of search tree, all solutions will contain Current as a subset, and we could get a equivalent or even better solution by replacing C with Ch . Meanwhile, all the covering sets which contain Ch have been learned by the algorithm. As a result, this branch could be pruned safely. Unfortunately, this constraint is not easy to be checked. Thus we just check the special case when |Ch | = |C | = 1. Lower bounds [9] proposed several methods to calculate lower bounds for minimum set covering: 1. Linear relaxation of SC (SCL*) 2. Ordered interval algorithm (OI) 3. Greedy algorithm (MD - for Minimum Degree) 4. Turan s approximation (TA) 32 We use Greedy algorithm (MD) as a lower bound for our algorithm. It works as follows: an undirected graph G =< V, E > can be built by creating vi for every bi . Then an edge is connected between vi with vj if they can be covered by the same a A. It could be observed that the independent number (G ) is a lower bound of minimum set covering. We use a greedy algorithm to nd (G ): Let (v ) be the set of neighbors of node v and (S ) = v S (v ). Initially, set S = . Choose the node v V \ (S (S )) of minimum degree. Add v to S and loop until S (S ) = V . At the end S will be an independent set of G and |S | will be a lower bound of (G ). Improved CP algorithm Now we propose our algorithm based on above properties: MinimumSetCovering(Current,Candidate,Checked,Removed,RemainingB ,K ) while Candidate is not empty do save Current save Candidate save Checked save Removed save RemainingB select a in Candidate which covers most vertexes in RemainingB and then remove it as well as the covered elements add a to Current remove a from Candidate if FilterAndPropagate(Current,Candidate,Checked,Removed,RemainingB ,K ) then if RemainingB is empty then K =Current else MinimumSetCovering(Current,Available,Checked,RemainingB ,K ) end if else f ails=f ails+1 end if restore RemainingB restore Removed restore Checked restore Candidate restore Current remove a from Candidate add a to Checked end while FilterAndPropagate(Current,Candidate,Checked,Removed,RemainingB ,K ) repeat continue=f alse for each b in B do 33 if there is no a in A which can cover b then return f alse end if end for for each a in A do if a cannot cover any b then remove a from A continue=true end if end for for each a in A do if the elements covered by a is subset of another aa then remove a from A continue=true end if end for for each b in RemainingB do if there is only one a in A which can cover b then remove a from A and all the b in B which is covered by a add a to Current continue=true end if end for if |Current| + lowerbound |K | then return f alse end if for each a in Candidate do for each c in Checked do if Covers({a}) RemainingB Covers({c}) then remove a from Candidate add a to Removed continue=true end if end for end for for each c in Current do for each ch in Checked do if U nique({c}, Current) Covers({ch}) then return f alse end if end for end for until continue is f alse 34 Running results We use the same test data as in[9]. M is the number of elements to cover (i.e. The cardinality of U ). N is the size of the collection of subsets of U . S + (resp. S is the maximum cardinality of subset S i (resp. the minimum cardinality of Si ). Sol is the smallest solution found for N . T ime is the total search time (search+propagation). A 30 seconds limit was used. f ailures is the number of times the algorithm failed during search. nodes is the number of nodes in the search tree. [9] gives three running result, based on lower bounds of 2SC , SCL , and M D. Here we compare our running result with theirs which uses M D as lower bound. Table 4.1: Performance comparison M N S- S+ 10 10 10 10 20 20 20 20 20 20 50 50 50 50 50 50 10 100 100 100 100 100 100 50 50 50 50 50 50 400 400 400 400 400 400 400 400 400 400 50 50 50 50 50 200 200 100 100 400 50 50 50 500 500 200 200 200 200 200 200 200 200 500 500 500 500 500 500 50 500 500 500 500 500 500 200 200 200 200 200 200 1000 1000 1000 1000 1000 1000 500 500 500 500 50 50 50 50 50 200 200 50 50 200 20 20 20 2 2 2 2 2 2 2 4 8 8 2 2 2 4 8 8 2 2 2 2 4 8 8 2 2 2 4 8 8 2 2 2 4 8 8 2 4 8 8 2 2 4 8 8 4 8 4 8 8 4 8 8 6 10 6 10 4 6 10 14 10 14 4 6 10 14 10 14 10 4 6 10 14 10 14 4 6 10 14 10 14 4 6 10 14 10 14 10 14 10 14 6 10 14 10 14 14 14 14 14 14 14 10 14 Sol MD in [9] Time Failures Sol Our algorithm with MD Time Failures Nodes 2 1 2 1 9 5 3 2 3 2 39 28 19 12 15 11 1 199 104 94 45 62 42 36 24 15 10 12 8 980 964 958 930 940 934 462 462 463 446 21 12 8 10 7 138 84 23 29 171 7 9 7 12,4 8,46 1,66 1,18 > 30 > 30 > 30 8,97 > 30 13,13 > 30 > 30 > 30 > 30 > 30 > 30 0,09 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 30 0,14 1,42 2,25 2 1 2 1 5 4 3 2 3 2 15 11 8 5 7 5 1 32 23 16 12 14 12 15 11 8 6 7 6 132 18 71 55 62 51 80 61 68 56 14 9 6 8 6 31 29 17 16 67 7 9 7 0,504 0,076 0,122 0,025 8,369 9,061 4,959 0,547 14,882 0,921 > 30 > 30 > 30 > 30 > 30 > 30 0,007 > 30 > 30 > 30 > 30 > 30 > 44, 33 > 30 > 30 > 30 > 30 > 30 > 30 > 30 > 42.769 > 72.862 > 146.072 > 187.058 > 228.67 > 30 > 38.489 > 49.25 > 59.783 0,223 1,016 0,44 12,602 10,782 > 30 > 30 0,306 2,553 > 30 0,027 0,07 0,097 263731 127222 45025 20875 771069 619822 398993 49310 266804 102288 551144 446423 318013 226824 212295 174868 1446 511838 382419 269540 167095 176621 153117 749486 500720 393054 203992 243080 159690 117592 128167 88089 61146 64338 49467 121025 89955 73104 55121 473335 272193 183474 201884 143366 91178 89203 130224 95007 65742 775 9487 9991 96 0 54 0 9598 4848 1233 110 2919 167 0 0 0 166 0 0 0 0 113 10982 0 270 0 1450 81 0 0 324 0 1 18 0 0 0 0 9 38 0 0 562 1673 540 17263 12944 1947 0 371 2836 0 28 121 213 98 1 56 1 9894 5000 1283 112 3044 169 15 11 8 185 7 5 1 27 148 11508 12 313 12 1548 99 8 6 356 6 74 92 57 52 58 47 62 91 52 45 584 1736 563 17850 13463 2026 22 395 2920 33 31 133 230 From above running results, we could observe several advantages of our 35 algorithm: 1. When our algorithm reaches a successful leaf, the solution is almost optimal. 2. When the scale of problem increases, our algorithm spends more time on propagations. Although sometimes it spends a little more than 30 seconds to nd the rst solution, the solution is much better than the previous work. 3. If the problem has a solution, At any node of the search tree, Current Candidate is a covering set. Thus our algorithm can be stopped at any time, it will always give a covering set. 4.5.2 Solve case 2 by greedy algorithm Case 2 is also NP-hard, simply because it is a generalization of case 1. Firstly we propose an algorithm based on integer linear programming, then a polynomial greedy heuristic algorithm is proposed to solve it. Integer linear programming formalization Case 2 can be formalized by integer linear programming, as follows: Let ai be a binary variable de ned as follows: ai = 1 0 if a wireless sensor is deployed in grid point gi otherwise Then for grid point gi , the probability that it cannot be monitored by any wireless sensors is: j =N (1 aj dj,i ) j =1 Thus case 2 can be formalized as: objective: Minimize ai subject to: for every grid point i, (1 aj dj,i ) 1 pi 0 ai 1 However, this formalization is not in an integer linear programming form, because some of its constrains are not linear. With help of some tricks, we can convert them into linear form: First, with help of logarithmic function, the constrains can be converted into: lg (1 aj dj,i ) lg (1 pi ) 36 The only problem is that logarithmic function f (x) = lg (x) is unde ned for x = 0. But this can be solved by adding a negligibly small value to each item. For example: lg (1 + 10 100 aj dj,i ) lg (1 + 10 100 pi ) Then the integer variables are moved outside the logarithmic functions: aj lg (1 + 10 100 dj,i ) lg (1 + 10 100 pi ) Now the constrains become linear. As a result, case 2 can be formalized as: objective: Minimize ai subject to: for every grid point i, aj lg (1 + 10 100 dj,i ) lg (1 + 10 100 pi ) 0 ai 1 Although case 2 can be formalized into an integer linear programming form, it cannot be solved e ciently when the scale of problem increases. As a result, we propose a greedy heuristic algorithm here. Greedy algorithms Dhillon proposed two greedy algorithms to solve case 2[12]. One of them is called M AX AV G COV , and the other one is M AX M IN COV . Firstly we explain their algorithms. Then we propose our own method M AX V ALID COV . The rst procedure M AX AV G COV attempts to maximize the average coverage of the grid points. The second procedure M AX M IN COV attempts to maximize the coverage of the grid point that is covered least e ectively[12]. The sensor detection probability matrix D = [di,j ] is generated for all pairs of grid points in the sensing eld. From the sensor detection probability matrix D, the miss probability matrix is M = mi,j , where mi,j = 1 pi,j . Both of their algorithms use a heuristic method to determine the best placement of next sensor at a time. The algorithms are iterative and terminated when every grid point is covered with required probability. The vector M = (M1 , M2 , ..., MN ) is de ned as the miss probability of every grid point. An entry Mi in this vector denotes the probability that grid point gi is not collectively covered by any sensors. The pseudocode steps of M AX AV G COV are outlined as follows: M AX AV G COV (M, M ) num sensors := 1 repeat for 1:=1 to N do i = mi1 + mi2 + ... + miN end for 37 Place sensor on grid point k such that k is minimum for 1:=1 to N do Mi = Mi mki end for Delete k th row and column from the M matrix num sensors := num sensors + 1 until Mi < 1 pi for all i, 1 i N An alternative method for sensor placement is to place a sensor at each iteration at the grid point with minimum coverage. This process determined when the missing probability of every grid point is lower than its threshold. It is outlined as follows: M AX M IN COV (M, M ) Place rst sensor randomly num sensors := 1 repeat for 1:=1 to N do Mi = Mi mki end for Place sensor on grid point k such that Mk is maximum Delete k th row and column from the M matrix num sensors := num sensors + 1 until Mi < 1 pi for all i, 1 i N Our greedy algorithm M AX V ALID COV is based on M AX AV G COV . Because it is useless to cover gird points which has been covered by required probability. We remove such grid points from sensing eld after each iteration. Our algorithm works as follows: M AX V ALID COV (M, M ) num sensors := 1 repeat for 1:=1 to N do i = mi1 + mi2 + ... + miN end for Place sensor on grid point k such that k is minimum for 1:=1 to N do Mi = Mi mki end for Delete k th row and column from the M matrix for 1:=1 to N do if Mi 1 pi then Delete ith row and column from the M matrix end if end for num sensors := num sensors + 1 38 until M is empty The performance of their algorithm is studied in several cases, together with our proposed algorithm. Experiment on regular square The rst experiment is done on a regular square sensing eld with size n, from 5 to 50. is set to be 0.6 in the detection probability function. The distance between adjacent grid points is 1 unit. The detection probability function of wireless sensors is e 0.6d if 0 d 3 p(d) = 0 otherwise The sensing probability thresholds for every grid points are generated at random. The following gure shows the running results. The x-axis denotes the number of grid points in the sensing eld. And the y-axis denotes the number of sensors deployed. It compares our algorithm with Dhillon s two greedy algorithms. 250 Number of Sensors Deployed 200 150 100 50 0 0 500 MAX_AVG_COV 1000 1500 Number of Grid Points 2000 MAX_MIN_COV 2500 MAX_VALID_COV Figure 4.9: Results for Experiment 1 Experiment on regular square with obstacles This experiment is the same with previous one, except that n obstacles are placed randomly. The following gure show the comparisons between these 39 algorithms. 250 Number of Sensors Deployed 200 150 100 50 0 0 500 1000 1500 Number of Grid Points MAX_AVG_COV MAX_MIN_COV 2000 2500 MAX_VAILD_COV Figure 4.10: Results for Experiment 2 Experiment on irregular sensing eld The experiment is done on an irregular sensing eld, whose grid points are generated randomly by removing 80% percent grid points from a square sensing eld randomly. The following gure shows the running results of these three algorithms: 40 150 Number of Sensors Deployed 100 50 0 0 50 100 MAX_AVG_COV 150 200 250 300 Number of Grid Points MAX_MIN_COV 350 400 450 500 MAX_VALID_COV Figure 4.11: Results for Experiment 3 From these 3 case studies, we can nd that our algorithm M AX V ALID COV performs better than M AX AV G COV and M AX M IN COV in all the cases. Performance of MAX VALID COV When the scale of problem increases, our constraint programming algorithm takes more and more time to nd the optimal solution. However, our greedy algorithm has a time complexity of O(N 2 ), which is suitable for large scale of deployment. As a result, we can use greedy algorithm to get an approximate solution for case 1. In this experiment, we compare the performance of our greedy algorithm with constraint programming algorithm. And we use the same settings as in previous experiment. The results show that our greedy algorithm is a good approximation of the optimal solutions. 41 80 70 Number of Sensors Deployed 60 50 40 30 20 10 0 0 50 100 150 200 Number of Grid Points MAX_VALID_COV 250 300 Constraint Programming Figure 4.12: Results for Experiment 4 42 350 Chapter 5 Experimental results In this chapter, we do some experiments to show the performance our framework. 5.1 Lower bound for performance In this section, we show a performance lower bound for our deployment framework. By lower bound, we mean that in this experiment, all the participants behaviors are totally random, including their motions patterns and their sensing quality. The sensing eld is a irregular area containing 100 grid points, and each grid point has a required coverage probability of 0.8. The distance between adjacent grid points is 1 unit. There are 10 participants in the campaign. The campaign consists of 10 period and each period consists 20 days. And it is assumed that the mobile phones and wireless sensors have the same detection probability functions: p(d) = e 0.6d 0 if 0 d 3 otherwise Meanwhile, we only simulate case 2, because it is a generalized situation of case 1. The rst gure shows the number of wireless sensors deployed in the sensing eld. It can be shown by the gure that the maximum deployed sensors is 10, which is 10% of the grid points. At it happens at the beginning of the campaign. 43 10 Number of Sensors Deployed 9 8 7 6 5 4 0 1 2 3 4 5 6 7 8 9 Period Figure 5.1: Number of Sensors Deployed The second gure shows the percentage of grid points which are covered with required probability, together with the percentage of grid points which are not covered with required probability. It can be shown that in average, 94.97% of grid points are covered with required probability. The maximum coverage is 100%, and the minimum one is 89%. 44 1 0.9 0.8 0.7 Percentage 0.6 0.5 0.4 0.3 0.2 0.1 0 0 20 40 60 80 100 Day Percentage of Covered Grid Points 120 140 160 180 200 Percentage of Uncovered Grid Points Figure 5.2: Percentage of Coverage The experiments show that our deployment framework can cover about 94.97% grid points with required probability. And it uses relatively small number of wireless sensors, which is at most 10% of the grid points. And the results will become better in real campaigns because the behaviors of participants are not totally random. 5.2 Experiment with motion patterns In this experiment, we limit the motions of every participant into part of grid points. And the number of actions taken by one participant during one day is at most 10. The following gure compares the sensing results with and without wireless sensors: 45 0.9 0.8 0.7 Percentage 0.6 0.5 0.4 0.3 0.2 0.1 0 20 40 60 80 100 Day Percentage of Covered Grid Points 120 140 160 180 200 Percentage of Incovered Grid Points Figure 5.3: Running results without Wireless Sensors When the sensing results reply only on participants behaviors, the percentage of grid points which can gather enough quantity of data is very low. In average, 25.21% of grid points are covered with required probability. 46 1 0.9 0.8 0.7 Percentage 0.6 0.5 0.4 0.3 0.2 0.1 0 0 20 40 60 80 100 Day Percentage of Covered Grid Points 120 140 160 180 200 Percentage of Uncovered Grid Points Figure 5.4: Running results with Wireless Sensors As it can be shown in the gures, the sensing results are improved huge after the wireless sensors are deployed in the sensing eld. In average, 83.30% of grid points are covered with required probability. And it can also be shown that in the rst period, the coverage is still low. However, after the rst period, the coverage is improved after our framework learns the motions patterns of participants. And the rst period can be avoid by adding a calibration phase, in which the motions patterns can be learned by our framework before the rst deployment. The results will become even better in real campaigns, because the quality of sensing results of participants are not totally random, and the motions patterns of participants contains more information than just random walk. 47 Chapter 6 Conclusion In our thesis, we propose a deployment framework for wireless sensor network in participatory sensing. Our framework combines the participants together with the wireless sensors network. It has a high level of exibility and can be adapt to di erent kinds of campaigns after replacing some sub-models. The experiments show that our framework has a good performance. It covers most of the grid points with required sensing probability. And the performance will become even better in real campaigns. Because the participants behaviors have some hidden patterns and contain a lot of information which can be learned by our framework. 48 Acknowledgement I would like to express my deep and sincere gratitude to my supervisor, Edith Ngai, who introduced me to this interesting topic and gave me a lot of important suggestions. I also would like to thank Sebastien Mouthuy who shared their test data on solving minimum set covering by constraint programming with me. The special thanks go to my dear girlfriend Yi Zhao as well as our parents who encourage me all the time. 49 Reference [1] Sasank Reddy, Katie Shilton, Je Burke, Deborah Estrin, Mark Hansen, Mani Srivastava Evaluating Participation and Performance in Participatory Sensing. 2008 [2] A. Schlosser, M. Voss, and L. Bruckner Comparing and Evaluating Metrics for Reputation Systems by Simulation Paolucci, 2004 [3] Audun Josang, Roslan Ismail The beta Reputation System 15th Bled Electronic Commerce Conference, 2002 [4] W. Teacy, J. Patel, N. Jennings, and M. Luck Coping with inaccurate reputation sources: experimental analysis of a probabilistic trust model Conference on Automous Agents and Multiagent Systems, pp. 997-1004, 2005 [5] Charles M. Grinstead, J. Laurie Snell Introduction to Probability Chapter 11, 1997 [6] Je rey Goldman, Katie Shiton, Je Burke, Deborah Estrin, Mark Hansen, Nithya Ramanathan, Sasank Reddy, Vids Samanta, Mani Srivastava, and Ruth West Participatory Sensing 2009 [7] Mani Srivastava MobiSense - Mobile Network Services for Coordinated Participatory Sensing 2009 [8] Saank Reddy, Katie Shilton, Je Burke, Deborah Estrin, Mark Hansen, and Mani Srivastava Using Context Annotated Mobility Pro les to Recurit Data Collectors in Participatory Sensing 2009 [9] Sebastien Mouthy, Yves Deville and Gregorie Dooms Global constraint for the set covering problem. Actes JFPC, 2007 [10] Fabrizio Grandoni A Note on the Complexity of Minimum Dominating Set. Journal of Discrete Algorithms, 2005 [11] Di Tian, Nicolas D. Georganas A Coverage-Preserving Node Scheduling Scheme for Large Wireless Sensor Networks. Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, 2002 50 [12] Santpal Singh Dhillon, Krishnendu Charkrabarty Sensor Placement for E ective Coverageand Surveillance in Distributed Sensor Networks. Proc. of IEEE Wireless Communications and Networking Conference, 2003 [13] Krishnendu Chakrabarty, S. Sitharama Iyengar, Hairong Qi, Eungchun Cho Grid coverage for surveillance and target location in distributed sensor netorks. IEEE Transactions on Computers, 2002 [14] Sameera Poduri, Gaurav S. Sukhatme Constrained Coverage for Mobile Sensor Networks. IEEE Transactions on Computers, 2004 [15] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cli ord Stein Introduction to Algorithms, 2nd edition. The MIT Press, 2001 [16] Jean-Charles Regin Solving the Maximum Clique Problem with Constraint Programming. Proceedings CPAIOR, 2003 51

Formatting page ...

vpnjain chat

Horizontal lines at:
Guest Horizontal lines at:
AutoRM Data:
Box geometries: