On Packing R-trees. Department of CS. University of Maryland. College Park, MD Abstract
|
|
- Ἠλύσια Πυλαρινός
- 6 χρόνια πριν
- Προβολές:
Transcript
1 On Packing R-trees Ibrahim Kamel and Christos Faloutsos Department of CS University of Maryland College Park, MD Abstract We propose new R-tree packing techniques for static databases. Given a collection of rectangles, we sort them and we build the R-tree bottom-up. There are several ways to sort the rectangles; the innovation of this work is the use of fractals, and specically the hilbert curve, to achieve better ordering of the rectangles and eventually better packing. We proposed and implemented several variations and performed experiments on synthetic, as well as real data (TIGER les from the U.S. Bureau of Census). The winning variation (`2D-c') was the one that sorts the rectangles according to the hilbert value of the center. This variation consistently outperforms the packing method of Roussopoulos and Leifker [24], as well as other R-tree variants. The performance gain of the our method seems to increase with the skeweness of the data distribution; specically, on the (highly skewed) TIGER dataset, it achieves up to 58% improvement in response time over the older packing algorithm and 36% over the best known R-tree variant. We also, introduce an analytical formula to compute the average response time of a range query as a function of the geometric characteristics of the R-tree. 1 Introduction One of the requirements for the database management systems (DBMSs) of the near future is the ability to handle spatial data. Spatial data arise in many applications, including: Cartography [27]; Computer-Aided Design (CAD) [22] [12]; computer vision and robotics [2]; traditional databases, where a record with k attributes corresponds to a point in a k-d space; temporal databases, where time can be considered as one more dimension [18]; scientic databases, with spatial-temporal data, etc. Currently on sabbatical at IBM Almaden Research Center. This research was partially funded by the Systems Research Center (SRC) at the University of Maryland, and by the National Science Foundation under Grant IRI (PYI), with matching funds from EMPRESS Software Inc. and Thinking Machines Inc. 1
2 In the above applications, one of the most typical queries is the range query: Given a rectangle, retrieve all the elements that intersect it. A special case of the range query is the point query or stabbing query, where the query rectangle degenerates to a point. We focus on R-trees, which is one of the most ecient methods that support range queries. The original R-tree [13] and almost all of its variants are designed for a dynamic environment, being able to handle insertions and deletions. Operating in this mode, the R-tree guarantees that the space utilization is at least 50%; experimental results [3] showed that the average utilization is 70%. However, for a static set of data rectangles, we should be able to do better. Improving the space utilization through careful packing will have a two benets: Not only does it reduce the space overhead of the R-tree index, but it can also oer better response time, because the R-tree will have a higher fanout and it will possibly be shorter. Static data appear in several applications. For example, in cartographic databases, insertions and deletions are rare; databases with census results are static; the same is true for databases that are published on CD-ROMs; databases with spatio/temporal meteorological and environmental data are seldom modied, too. The volume of data in these databases is expected to be enormous, in which case it is crucial to minimize the space overhead of the index. In this paper we study the problem of how to create a compressed R-tree for static data. Our goal is to develop a packing method such that the resulting R-tree (a) will have 100% space utilization and (b) will respond to range queries at least as fast as the R-tree of any other known method, (e.g., the R tree or the quadratic-split R-tree). We design and study several heuristics to build the R-tree bottom-up. Most of these heuristics are based on space lling curves, and specically, the Hilbert curve. We report experiments from 2-dimensional data, although our method can handle higher dimensionalities. The experimental results showed that, the most eective of our heuristics is the one that sorts the data rectangles according to the hilbert value of their centers (`2D-c' heuristic). This heuristic consistently gives signicant saving in response time over all the known R-tree variants, namely, the quadratic-split R-tree, the R tree, as well as the method proposed by Roussopoulos and Leifker [24], which is the only R-tree packing known up to now. We also propose an analytical formula to compute the average response time of a range query by using the area and perimeter of the R-tree nodes. This formula works for any R-tree variation, either static or dynamic. The paper is organized as follows. Section 2 gives rst a brief description of the R-tree and its variants. Section 3 describes our proposed heuristics to build the compressed R-tree. In section 4, we introduce the analytical formula to compute the average response time for a given R-tree instance, given some information about the minimum bounding rectangles of its nodes. Section 5 presents 2
3 our experimental results that compare the proposed schemes with others. conclusions and directions for future research. Section 6 gives the 2 Survey Several spatial access methods have been proposed. A recent survey can be found in [25]. These methods fall in the following broad classes: methods that transform rectangles into points in a higher dimensionality space [15]; methods that use linear quadtrees [8] [1] or, equivalently, the z- ordering [21] or other space lling curves [7] [17]; and nally, methods based on trees (R-tree [13], k-d-trees [4], k-d-b-trees [23], hb-trees [19], cell-trees [11] e.t.c.) One of the most promising approaches in the last class is the R-tree [13]. It is the extension of the B-tree for multidimensional objects. A geometric object is represented by its minimum bounding rectangle (MBR). Non-leaf nodes contain entries of the form (ptr,r) where ptr is a pointer to a child node in the R-tree; R is the MBR that covers all rectangles in the child node. Leaf nodes contain entries of the form (obj-id, R) where obj-id is a pointer to the object description, and R is the MBR of the object. The main innovation in the R-tree is that father nodes are allowed to overlap. This way, the R-tree can guarantee at least 50% space utilization and remain balanced Figure 1: Data (dark rectangles) organized in an R-tree with fanout=3 Figure 1 illustrates data rectangles (in black), organized in an R-tree with fanout 3. Figure 2 shows the le structure for the same R-tree, where nodes correspond to disk pages. In the rest of this paper, the term `node' and the term `page' will be used interchangeably. Guttman proposed three splitting algorithms, the linear split, the quadratic split and the exponential split. Their names come from their complexity; among the three, the quadratic split algorithm is the one that achieves 3
4 Root Figure 2: The le structure for the R-tree of the previous gure (fanout=3) the best trade-o between splitting time and search performance. Subsequent work on R-trees includes the work by Greene [9], the R + -tree [26], R-trees using Minimum Bounding Polygons [16], and nally, the R -tree [3], which seems to have the best performance among the R-tree variants. The main idea in the R -tree is the concept of forced re-insert, which is analog to the deferred-splitting in B-trees. When a node overows, some of its children are carefully chosen and they are deleted and re-inserted, usually resulting in a R-tree with better structure. All the above R-tree variants support insertions and deletions, and are thus suitable for dynamic environments. For a static environment, the only R-tree packing scheme we are aware of is the method of Roussopoulos and Leifker [24]. They proposed a method to build a packed R-tree that achieves (almost) 100% space utilization. The idea is to sort the data on the x or y coordinate of one of the corners of the rectangles. The sorted list of rectangles is scanned; successive rectangles are assigned to the same R-tree leaf node, until this node is full; then, a new leaf node is created and the scanning of the sorted list continues. Thus, the nodes of the resulting R-tree will be fully packed, with the possible exception of the last node at each level. Thus, the utilization is 100%. Their experimental results on point data showed that their packed R-tree performs much better than the linear split R-tree for point queries. In our experiments (section 5), their packed R-tree outperformed the quadratic split R-tree and the R tree as well, for point queries on point data. However, it does not perform that well for region queries and/or rectangular data. For the rest of this paper, we shall refer to this method by lowx packed R-tree. In our implementation of their method, we sort the rectangles according to their x value of the lower left corner (`lowx'). Sorting on any of the other three values gives similar results; thus our implementation does not impose an unfair disadvantage to the lowx packed R-tree. 4
5 3 Proposed design We assume that the data are static, or that the frequency of modication is low. Our goal is to design a simple heuristic to construct an R-tree with 100% space utilization, which, on the same time, will have as good response time as possible. The lowx packed R-tree [24] is a step towards this goal. However, it suers from a subtle pitfall. Although it performs very well for point queries on point data, its performance degrades for larger queries. Figures 3 and 4 highlight the problem. Figure 4 shows the leaf nodes of the R-tree that the lowx packing method will create for the points of gure 3. The fact that the resulting father nodes cover little area explains why the lowx packed R-tree achieves excellent performance for point queries; the fact that the fathers have large perimeters, in conjunction with the upcoming Eq. 3 (section 4), explains the degradation of performance for region queries. Intuitively, the packing algorithm should ideally assign nearby points to the same leaf node; ignoring the y-coordinate, the lowx packed R-tree to tends violate this empirical rule Figure 3: 200 points uniformly distributed In order to cluster the data in a better way than the lowx packed R-trees, we propose to use space lling curves (or fractals), and specically, the Hilbert curve. A space lling curve visits all the points in a k-dimensional grid exactly once and never crosses itself. The Z-order (or Morton key order, or bit-interleaving, or Peano curve), the Hilbert curve, and the Gray-code curve [6] are examples of space lling curves. In [7], it was shown experimentally that the Hilbert curve achieves the best clustering among the three above methods. Next we provide a brief introduction to the Hilbert curve: The basic Hilbert curve on a 2x2 grid, denoted by H 1, is shown in Figure 5. To derive a curve of order i, each vertex of the basic curve is replaced by the curve of order i 1, which may be appropriately rotated and/or reected. Figure 5 also shows the Hilbert curves of order 2 and 3. When the order of the curve tends to 5
6 Figure 4: MBR of nodes generated by the `lowx packed R-tree' algorithm H H H Figure 5: Hilbert Curves of order 1, 2 and 3 innity, the resulting curve is a fractal, with a fractal dimension of 2 [20]. The Hilbert curve can be generalized for higher dimensionalities. Algorithms to draw the two-dimensional curve of a given order, can be found in [10], [17]. An algorithm for higher dimensionalities is in [5]. The path of a space lling curve imposes a linear ordering on the grid points, which may be calculated by starting at one end of the curve and following the path to the other end. Figure 5 shows one such ordering for a 4 4 grid (see curve H 2 ). For example the point (0,0) on the H 2 curve has a hilbert value of 0, while the point (1,1) has a hilbert value of 2. After this preliminary material, we are in a position now to describe the proposed methods. Exploiting the superior clustering that the Hilbert curve can achieve, we impose a linear ordering on the data rectangles and then we traverse the sorted list, assigning each set of C rectangles to a node in the R-tree. The nal result is that the set of data rectangles on the same node will be close to each other in the linear ordering, and most likely in the native space; thus the resulting R-tree nodes will have smaller areas. Figure 5 illustrates the intuitive reasons that our hilbert-based methods will result in good performance. The data are points (the same points with Figure 3 and Figure 4). We see that, by grouping the points according to their hilbert values, the MBRs of the resulting R-tree leaf nodes tend to be small square-like rectangles. This indicates that the nodes 6
7 will likely have small area and small perimeter. Small area results in good performance for point queries; small area and small perimeter leads to good performance for larger queries. Eq. 3 conrms the above claims. Algorithm Hilbert-Pack: packs rectangles into an R-tree Step 1. Calculate the hilbert value for each data rectangle Step 2. Sort data rectangles on ascending hilbert values Step 3. /* Create leaf nodes (level l=0) */ While (there are more rectangles) generate a new R-tree node assign the next C rectangles to this node Step 4. /* Create nodes at higher level (l + 1) */ While ( there are > 1 nodes at level l) sort nodes at level l 0 on ascending creation time repeat Step 3 Figure 6: Pseudo-code of the packing algorithm We studied several methods to sort the data rectangles. All of them use the same algorithm (see Figure 6) to build the R-tree. The only point that the proposed hilbert-based methods distinguish themselves from each other is on the way they compute the hilbert value of a rectangle. We examine the following alternatives: 4d Hilbert through corners ("4D-xy"): Each data rectangle is mapped to a point in four dimensional space formed by the lower left corner and the upper right corner namely (lowx, lowy, highx, highy). The hilbert value of this 4-dimensional point is the hilbert value of the rectangle. 4-d Hilbert through center and diameter ("4D-cd"): Each data rectangle is mapped to the 4-dimensional point (c x ; c y ; d x ; d y ). where c x ; c y are the coordinates of the center of the rectangle and d x ; d y the `diameters' or sides of the rectangle. As in 4D-xy, the hilbert value of this 4-dimensional point is the hilbert value of the rectangle. 2-d Hilbert through Centers Only ("2D-c"): Each data rectangle is represented by its center only; the hilbert value of the center is the hilbert value of the rectangle. 7
8 Y X Figure 7: Peano (or z-order) curve of order 3 Method name Description 2D-c sorts on the 2d-hilbert value of the centers (c x ; c y ) 4D-xy sorts on the 4-d hilbert value of the two corners, i.e., (low x ; low y ; high x ; high y ) 4D-cd sorts on 4-d hilbert value of the center and diameters, i.e., (c x ; c y ; d x ; d y ) 2Dz-c sorts on the z-value of the center (c x ; c y ) lowx packed R-tree [24] linear-split R-tree [13] quadratic-split R-tree [13] R -tree [3] sorts on the x coordinate of the lower left corner Guttman's R-tree, with linear split Guttman's R-tree with quadratic split R-tree variant, better packing, forced reinsert Table 1: List of methods - the proposed ones are in italics For the sake of comparison, we also examined a method that uses the Peano curve, or `zordering', despite the fact that the z-ordering achieves inferior clustering compared to the hilbert curve. The z-value of a point is computed by bit-interleaving the binary representation of its x and y coordinates. For example, in Figure 7, the point (0,0) has a z-value of 0, while the point (1,3) has a z-value of 7. Z-order through Centers only ("2Dz-c"): center. The value of the rectangle is the z-value of its Table 1 gives a list of the methods we compared, along with a brief description of each. The new methods are in italics; R-tree methods for static environments are above the double horizontal line; the rest can be applied for dynamic environments, as well. 8
9 Symbols Denitions p page size, in bytes C page capacity (max. number of rectangles per page) P (q x ; q y ) avg. pages retrieved by a q x q y query N d number of data rectangles N number of tree nodes d density of data n i node i in the R-tree n i;x length of node i in x direction n i;y length of node i in y direction L x sum of x-sides of all nodes in the tree L y sum of y-sides of all nodes in the tree T otalarea sum of areas of all nodes in the tree q x length of the query in x direction length of the query in y direction q y Table 2: Summary of Symbols and Denitions 4 Analytical formula for the response time In this section we introduce an analytical formula to evaluate the average response time for a query of size q x q y as a function of the geometric characteristics of the R-tree. This means that once we built the R-tree we can estimate the average response time of the query q x q y without generating random queries and computing the average and variance of their response times. In this discussion we assume that queries are rectangles uniformly distributed over the unit square address space. Without loss of generality we consider a 2-dimensional space. The same idea can be generalized to higher dimensionalities. The response time of a range query is mainly aected by the time required to retrieve the nodes touched by the query plus the time required by the CPU to process the nodes. Since the CPU is much faster than the disk, we assume that the CPU time is negligible (=0) compared to the time required by a disk to retrieve a page. Thus, the measure for the response is approximated the number of nodes (pages) that will be retrieved by the range query. The next lemma forms the basis for the analysis: Lemma 1 If the node n i of the R-tree has an MBR of n i;x n i;y, then the probability DA(n i;x ; n i;y ) that this node will contribute one disk access to a point query is DA(n i;x ; n i;y ) = Prob(a point query retrieves node n i ) = n i;x n i;y (1) Proof: Since we assume that the (point) queries are uniformly distributed in the address space 9
10 and the address space is the unit square. The probability that a random point fall in the rectangle (n i;x ; n i;y ) is the area of the rectangle n i;x n i;y. DA() is the expected number of disk accessed that the specic node will contribute in an arbitrary point query. Notice that the level of the node in the R-tree is immaterial. The next two lemmas calculate the expected number of disk accesses for point and rectangular queries respectively. Lemma 2 (Point query) For a point query, the expected number of disk accesses P (0; 0) is given by P (0; 0) = NX i=1 n i;x n i;y (2) Proof: Every node n i in the R-tree is represented in the native space by its minimum bounding rectangle (MBR) of size say n i;x, n i;y in the x, y direction respectively. Given Lemma 1, each node of the R-tree contributes DA() disk access; to calculate the average number of disk accesses that all the nodes of the R-tree will result into, we have to sum Eq. 1 over all the nodes. q y q x Q q y Q q x (a) (b) Figure 8: a) original nodes along with rectangular query q x q y query Q b) Extended nodes with point Lemma 3 (Rectangular query) For a rectangular query q x q y, the expected number of disk accesses P (q x ; q y ) is given by P (q x ; q y ) = NX n i;x n i;y + q y NX n i;x + q x NX i=1 i=1 i=1 n i;y + N q x q y (3) Proof: A rectangular query of size q x q y is equivalent to a point query, if we `inate' the nodes of the R-tree by q x and q y in the x- and y-directions respectively. Thus, the node n i with size n i;x n i;y behaves like a node of size (n i;x + q x ) (n i;y + q y ). Figure 8 illustrates the idea: Figure 8(a) shows a 10
11 range query q x q y with upper-left corner at Q; this query is equivalent to a point query anchored at Q, as long as the data rectangles are `inated' as shown by the dotted lines in Figure 8(b). Applying (Eq. 2) on Figure 8(b) we obtain: P (q x ; q y ) = which gives (Eq. 3), after trivial mathematical manipulations. Notice that Lemma 3 gives NX i=1 (n i;x + q x ) (n i;y + q y ) (4) P (q x ; q y ) = T otalarea + q x L y + q y L x + N q x q y (5) where T otalarea = P (0; 0) is the sum of all the areas of the nodes of the tree, and L x, L y respectively the sums of x and y extents of all nodes in the R-tree. There are several comments and observations with respect to the above formulas: are The formula is independent of the details of the R-tree creation/insertion/split algorithms; it holds for packed R-trees, for R -trees e.t.c. Notice that Eq. 3 for range queries reduces to Eq. 2 for point queries, if q x = q y = 0, as expected Figure 9: MBR of nodes generated by 2D-c (2-d Hilbert through centers), for 200 random points The last equation illustrates the importance of minimizing the perimeter of the R-tree nodes, in addition to the area. The larger the queries, the more important the perimeter becomes. This explains why the lowx packed R-tree performs well for point queries (q x = q y = 0), but not so well for larger queries. The nodes produced by the lowx packed R-tree (gure 4) have small area but large perimeters. Figure 9 shows the leaf nodes produced by the `2D-c' hilbert packing method, for the set of points of gure 3. Notice that the resulting nodes have smaller perimeters. 11
12 Eq. 3 has theoretical as well as practical value: From a practical point of view, it can assist with the cost estimation and query optimization for spatial queries [1]: Maintaining only a few numbers about the R-tree (total area, total perimeter), a query optimizer can make a good estimate for the cost of a range query. Moreover, researchers working on R-trees can use Eq 3 to avoid issuing queries in their simulation studies. This eliminates one randomness factor (the query), leaving the generation and insertion order of the data as random variables. From a theoretical point of view, it makes one step towards the theoretical analysis of the R-trees; the only missing step is a good estimate of the average size and perimeter of nodes, for a given data set and a given packing/insertion/split algorithm. 5 Experimental results To assess the merit of our proposed hilbert-based packing methods, we ran simulation experiments on a two-dimensional native space. The code for the packed R-tree is written in C under UNIX and the simulation experiments ran on a SUN SPARC station. Since the CPU time required to process the node is negligible, we based our comparison on the number of nodes (=pages) retrieved by range queries. Without loss of generality, the address space was normalized to the unit square. There are several factors that aect the search time; we studied the following ones: Data items: points and/or lines and/or rectangles File size: ranged from 10, ,000 records Query area Q area = q x q y : ranged from of the space size Page size p: varied in the range 1Kb - 4Kb Another important factor, which is derived from N and the average area a of the data rectangles, is the `data density' d (or `cover quotient') of the data rectangles. This is the sum of the areas of the data rectangles in the unit square, or equivalently, the average number of rectangles that cover a randomly selected point. Mathematically: d = N a. For the selected values of N and a, the data density ranges from The rectangles for the synthetic datasets were generated as follows: Their centers were uniformly distributed in the unit square; their x and y sizes were uniformly distributed in the range [0,max], where max was chosen to achieve the desired data density. Data or query rectangles that were not completely inside the unit square were clipped. In addition to synthetic data, we also used real data from the TIGER system of the U.S. Bureau of Census. One data set consisted of line segments, representing the roads of Mont- 12
13 gomery county in Maryland. Using the minimum bounding rectangles of the segments, we obtained rectangles, with data density d = We refer to this dataset as the `MGCounty' dataset. Another dataset consists of line segments, representing the roads of Long Beach, California. The data density of the MBR that covers these line segments is d = 0:15. We refer to this dataset as the `LBeach' dataset. An important observation is that the data in the TIGER dataset follow a highly skewed distribution. In the following subsections we present experiments (a) comparing the response time of the best of our methods (2D-c) with the response time of older R-tree variants (dynamic or static) (b) comparing our hilbert-based packing schemes against each other, to pinpoint the best. 5.1 Comparison of 2D-c hilbert packed R-tree vs. older R-tree variants Pages Touched Montgomery County, MD: line segements Hilbert 2D-c R*-tree Quad R-tree Lowx Qarea x 10-3 Figure 10: Montgomery county, Maryland - real data Here we illustrate that the 2D-c packing method gives better response times than older R-tree variants. The fact that lowx packed R-tree performs worse than Dynamic designs (e.g. R tree) mandated on us to compare our new packing methods with both Static and Dynamic designs, namely lowx packed R-tree, Guttman R-tree and R tree. In our plots we omit the results 13
14 Pages Touched TIGER: Long Beach, CA : line segements Hilbert 2D-c R*-tree Quad R-tree Lowx Qarea x 10-3 Figure 11: Long Beach, California - real data of the linear-split R-tree, because the quadratic-split R-tree consistently outperformed it. The exponential-split R-tree was very slow in building the tree, and it was not used. To avoid cluttering the plots, we only plot the best of our proposed algorithms, namely the one using the `2D-c' heuristic. The detailed results for the other, hilbert-based packing algorithms are presented in the next subsection. We performed experiments with several data sets using dierent parameters. For brevity we present experiments with four sets of data. The rst two data sets are from the 'TIGER' system, while the last two are synthetic data. The reason for using synthetic data is that we can control the parameters (data density, number of rectangles, ratio of points to rectangles etc.). Figure 10, 11 show the results for the TIGER data sets, they represent the roads of Montgomery county of Maryland and the roads of Long Beach of California respectively. Figure 12 shows the results for a synthetic data set, which contains a mix of points and rectangles; specically 50,000 points and 10,000 rectangles; the data density is d = and the page size p = 1Kb. The forth set (Figure 13) contains no points; only 100,000 rectangles; the data density d = 1 and the page size p = 1Kb. All Figures plot the number of pages retrieved by a range query (from Eq. 3) as a 14
15 Pages Touched Mixed 50k Points + 10k Rect s Hilbert 2D-c R*-tree Quad R-tree Lowx Qarea x 10-3 Figure 12: Page accesses per query vs. Query area - Rectangles + Points function of the area of the query. A common observation is that, for point queries, all methods performs almost the same, with small dierences. However, for slightly larger queries, the proposed 2D-c hilbert packed R-tree is the clear winner. The performance gap increases with the area of the queries. The second important observation is that the performance gap seems to increase with the skeweness of the data distribution: for the TIGER data sets, the proposed method achieves up to 36% improvement over the next best method (R -tree), and up to 58% improvement over the lowx packed R-tree. Since the performance of the Static lowx packed R-tree (100% space utilization) is worse than the performance of the Dynamic designs (e.g. quadratic split R-tree and R tree) we ascribe the good performance of our proposed methods not only to the higher space utilization but also to the superior clustering property of the Hilbert curve. 5.2 Comparison of hilbert-based packing schemes Here we compare all the packing heuristics that we have introduced in this paper, namely 15
16 Pages Touched k.2000.ana Data Rectangles Hilbert 2D-c R*-tree Quad R-tree Lowx Qarea x 10-3 Figure 13: Page accesses per query vs. Query area - Rectangles Only 2D-c, 4D-xy, 4D-cd and the only heuristic that uses the z-ordering, 2Dz-c. Table 1 contains a list of these methods, along with a brief description. Table 3 gives the response time versus the query area for all of these heuristics that use Hilbert order. The (synthetic) data le consists of 50K points and 10K rectangles. The page size p = 1Kb. The dierences between the alternative methods are small. However, from Table 3 we see that (2D-c) does better, especially for large queries. The next best method is the (4D-cd), which uses a 4-d hilbert curve on the parameter space (center-x, center-y, diameter-x, diameter-y). The query area Hilbert Hilbert Hilbert Q area 2D-c 4-cd 4D-xy Table 3: Comparison Between Dierent Schemes that uses Hilbert order 16
17 query area Hilbert Z-order Q area 2D-c 2Dz-c Table 4: Schema that uses Hilbert order vs. the one uses z-order last contender is the 4D-xy. For the same setting, table 4 compares 2D-c that sorts the data on the 2d hilbert-value of the centers of the data rectangles and 2Dz-c that sorts on the 2d z-value of the center. Table 4 shows that 2D-c that uses hilbert order always performs better than 2Dz-c that uses z-order. The relative ranking of the methods was the same for every dataset we tried; we omit the results for brevity. 6 Conclusions We have examined algorithms to generate packed R-trees for static databases. Our algorithms try to exploit the superior clustering properties of the Hilbert curve. We propose several schemes to sort the data rectangles, before grouping them into R-tree nodes. We performed experiments with all these methods and the most promising competitors; the major conclusion is that the proposed algorithms result in better R-trees. Specically, the most successful variation (2D-c = 2-d hilbert curve through centers) consistently outperforms the best dynamic methods, namely, the R -trees and the quadratic-split R-trees, as well as the only previously known static method (lowx packed R-tree). More importantly, the performance gap seems to be wider for real, skewed data distributions. An additional, smaller contribution, is the derivation of (Eq. 3), From a practical point of view, it can help a query optimizer [14] give a good estimate for the cost of an R-tree index. Moreover, it makes the simulation analysis of R-trees easier and more reliable, eliminating the need to ask queries. Future research could examine methods to achieve provably optimal packing, as well as the corresponding lower bound on the response time. 17
18 References [1] Walid G. Aref and Hanan Samet. Optimization strategies for spatial query processing. Proc. of VLDB (Very Large Data Bases), pages 81{90, September [2] D. Ballard and C. Brown. Computer Vision. Prentice Hall, [3] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The r*-tree: an ecient and robust access method for points and rectangles. ACM SIGMOD, pages 322{331, May [4] J.L. Bentley. Multidimensional binary search trees used for associative searching. CACM, 18(9):509{517, September [5] T. Bially. Space-lling curves: Their generation and their application to bandwidth reduction. IEEE Trans. on Information Theory, IT-15(6):658{664, November [6] C. Faloutsos. Gray codes for partial match and range queries. IEEE Trans. on Software Engineering, 14(10):1381{1393, October early version available as UMIACS-TR-87-4, also CS-TR [7] C. Faloutsos and S. Roseman. Fractals for secondary key retrieval. Eighth ACM SIGACT- SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 247{252, March also available as UMIACS-TR and CS-TR [8] I. Gargantini. An eective way to represent quadtrees. Comm. of ACM (CACM), 25(12):905{ 910, December [9] D. Greene. An implementation and performance analysis of spatial data access methods. Proc. of Data Engineering, pages 606{615, [10] J.G. Griths. An algorithm for displaying a class of space-lling curves. Software-Practice and Experience, 16(5):403{411, May [11] O. Gunther. The cell tree: an index for geometric data. Memorandum No. UCB/ERL M86/89, Univ. of California, Berkeley, December [12] A. Guttman. New Features for Relational Database Systems to Support CAD Applications. PhD thesis, University of California, Berkeley, June [13] A. Guttman. R-trees: a dynamic index structure for spatial searching. Proc. ACM SIGMOD, pages 47{57, June
19 [14] L. M. Haas, J. C. Freytag, G. M. Lohman, and H. Pirahesh. Extensible query processing in starburst. Proc. ACM-SIGMOD 1989 Int'l Conf. Management of Data, pages 377{388, May [15] K. Hinrichs and J. Nievergelt. The grid le: a data structure to support proximity queries on spatial objects. Proc. of the WG'83 (Intern. Workshop on Graph Theoretic Concepts in Computer Science), pages 100{113, [16] H. V. Jagadish. Spatial search with polyhedra. Proc. Sixth IEEE Int'l Conf. on Data Engineering, February [17] H.V. Jagadish. Linear clustering of objects with multiple attributes. ACM SIGMOD Conf., pages 332{342, May [18] Curtis P. Kolovson and Michael Stonebraker. Segment indexes: Dynamic indexing techniques for multi-dimensional interval data. Proc. ACM SIGMOD, pages 138{147, May [19] David B. Lomet and Betty Salzberg. The hb-tree: a multiattribute indexing method with good guaranteed performance. ACM TODS, 15(4):625{658, December [20] B. Mandelbrot. Fractal Geometry of Nature. W.H. Freeman, New York, [21] J. Orenstein. Spatial query processing in an object-oriented database system. Proc. ACM SIGMOD, pages 326{336, May [22] J. K. Ousterhout, G. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor. Magic: a vlsi layout system. In 21st Design Automation Conference, pages 152 { 159, Alburquerque, NM, June [23] J.T. Robinson. The k-d-b-tree: a search structure for large multidimensional dynamic indexes. Proc. ACM SIGMOD, pages 10{18, [24] N. Roussopoulos and D. Leifker. Direct spatial search on pictorial databases using packed r-trees. Proc. ACM SIGMOD, May [25] H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, [26] T. Sellis, N. Roussopoulos, and C. Faloutsos. The r+ tree: a dynamic index for multidimensional objects. In Proc. 13th International Conference on VLDB, pages 507{518, England,, September also available as SRC-TR-87-32, UMIACS-TR-87-3, CS-TR
20 [27] M. White. N-Trees: Large Ordered Indexes for Multi-Dimensional Space. Application Mathematics Research Sta, Statistical Research Division, U.S. Bureau of the Census, December
Physical DB Design. B-Trees Index files can become quite large for large main files Indices on index files are possible.
B-Trees Index files can become quite large for large main files Indices on index files are possible 3 rd -level index 2 nd -level index 1 st -level index Main file 1 The 1 st -level index consists of pairs
HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:
HOMEWORK 4 Problem a For the fast loading case, we want to derive the relationship between P zz and λ z. We know that the nominal stress is expressed as: P zz = ψ λ z where λ z = λ λ z. Therefore, applying
ER-Tree (Extended R*-Tree)
1-9825/22/13(4)768-6 22 Journal of Software Vol13, No4 1, 1, 2, 1 1, 1 (, 2327) 2 (, 3127) E-mail xhzhou@ustceducn,,,,,,, 1, TP311 A,,,, Elias s Rivest,Cleary Arya Mount [1] O(2 d ) Arya Mount [1] Friedman,Bentley
ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007
Οδηγίες: Να απαντηθούν όλες οι ερωτήσεις. Αν κάπου κάνετε κάποιες υποθέσεις να αναφερθούν στη σχετική ερώτηση. Όλα τα αρχεία που αναφέρονται στα προβλήματα βρίσκονται στον ίδιο φάκελο με το εκτελέσιμο
2 Composition. Invertible Mappings
Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan Composition. Invertible Mappings In this section we discuss two procedures for creating new mappings from old ones, namely,
Approximation of distance between locations on earth given by latitude and longitude
Approximation of distance between locations on earth given by latitude and longitude Jan Behrens 2012-12-31 In this paper we shall provide a method to approximate distances between two points on earth
derivation of the Laplacian from rectangular to spherical coordinates
derivation of the Laplacian from rectangular to spherical coordinates swapnizzle 03-03- :5:43 We begin by recognizing the familiar conversion from rectangular to spherical coordinates (note that φ is used
Problem Set 3: Solutions
CMPSCI 69GG Applied Information Theory Fall 006 Problem Set 3: Solutions. [Cover and Thomas 7.] a Define the following notation, C I p xx; Y max X; Y C I p xx; Ỹ max I X; Ỹ We would like to show that C
Other Test Constructions: Likelihood Ratio & Bayes Tests
Other Test Constructions: Likelihood Ratio & Bayes Tests Side-Note: So far we have seen a few approaches for creating tests such as Neyman-Pearson Lemma ( most powerful tests of H 0 : θ = θ 0 vs H 1 :
CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS
CHAPTER 5 SOLVING EQUATIONS BY ITERATIVE METHODS EXERCISE 104 Page 8 1. Find the positive root of the equation x + 3x 5 = 0, correct to 3 significant figures, using the method of bisection. Let f(x) =
5.4 The Poisson Distribution.
The worst thing you can do about a situation is nothing. Sr. O Shea Jackson 5.4 The Poisson Distribution. Description of the Poisson Distribution Discrete probability distribution. The random variable
4.6 Autoregressive Moving Average Model ARMA(1,1)
84 CHAPTER 4. STATIONARY TS MODELS 4.6 Autoregressive Moving Average Model ARMA(,) This section is an introduction to a wide class of models ARMA(p,q) which we will consider in more detail later in this
Areas and Lengths in Polar Coordinates
Kiryl Tsishchanka Areas and Lengths in Polar Coordinates In this section we develop the formula for the area of a region whose boundary is given by a polar equation. We need to use the formula for the
Homework 3 Solutions
Homework 3 Solutions Igor Yanovsky (Math 151A TA) Problem 1: Compute the absolute error and relative error in approximations of p by p. (Use calculator!) a) p π, p 22/7; b) p π, p 3.141. Solution: For
C.S. 430 Assignment 6, Sample Solutions
C.S. 430 Assignment 6, Sample Solutions Paul Liu November 15, 2007 Note that these are sample solutions only; in many cases there were many acceptable answers. 1 Reynolds Problem 10.1 1.1 Normal-order
EE512: Error Control Coding
EE512: Error Control Coding Solution for Assignment on Finite Fields February 16, 2007 1. (a) Addition and Multiplication tables for GF (5) and GF (7) are shown in Tables 1 and 2. + 0 1 2 3 4 0 0 1 2 3
Lecture 2. Soundness and completeness of propositional logic
Lecture 2 Soundness and completeness of propositional logic February 9, 2004 1 Overview Review of natural deduction. Soundness and completeness. Semantics of propositional formulas. Soundness proof. Completeness
Assalamu `alaikum wr. wb.
LUMP SUM Assalamu `alaikum wr. wb. LUMP SUM Wassalamu alaikum wr. wb. Assalamu `alaikum wr. wb. LUMP SUM Wassalamu alaikum wr. wb. LUMP SUM Lump sum lump sum lump sum. lump sum fixed price lump sum lump
Areas and Lengths in Polar Coordinates
Kiryl Tsishchanka Areas and Lengths in Polar Coordinates In this section we develop the formula for the area of a region whose boundary is given by a polar equation. We need to use the formula for the
ΑΠΟΔΟΤΙΚΗ ΑΠΟΤΙΜΗΣΗ ΕΡΩΤΗΣΕΩΝ OLAP Η ΜΕΤΑΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ ΕΞΕΙΔΙΚΕΥΣΗΣ. Υποβάλλεται στην
ΑΠΟΔΟΤΙΚΗ ΑΠΟΤΙΜΗΣΗ ΕΡΩΤΗΣΕΩΝ OLAP Η ΜΕΤΑΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ ΕΞΕΙΔΙΚΕΥΣΗΣ Υποβάλλεται στην ορισθείσα από την Γενική Συνέλευση Ειδικής Σύνθεσης του Τμήματος Πληροφορικής Εξεταστική Επιτροπή από την Χαρά Παπαγεωργίου
Section 8.3 Trigonometric Equations
99 Section 8. Trigonometric Equations Objective 1: Solve Equations Involving One Trigonometric Function. In this section and the next, we will exple how to solving equations involving trigonometric functions.
Block Ciphers Modes. Ramki Thurimella
Block Ciphers Modes Ramki Thurimella Only Encryption I.e. messages could be modified Should not assume that nonsensical messages do no harm Always must be combined with authentication 2 Padding Must be
Section 9.2 Polar Equations and Graphs
180 Section 9. Polar Equations and Graphs In this section, we will be graphing polar equations on a polar grid. In the first few examples, we will write the polar equation in rectangular form to help identify
Statistical Inference I Locally most powerful tests
Statistical Inference I Locally most powerful tests Shirsendu Mukherjee Department of Statistics, Asutosh College, Kolkata, India. shirsendu st@yahoo.co.in So far we have treated the testing of one-sided
HISTOGRAMS AND PERCENTILES What is the 25 th percentile of a histogram? What is the 50 th percentile for the cigarette histogram?
HISTOGRAMS AND PERCENTILES What is the 25 th percentile of a histogram? The point on the horizontal axis such that of the area under the histogram lies to the left of that point (and to the right) What
department listing department name αχχουντσ ϕανε βαλικτ δδσϕηασδδη σδηφγ ασκϕηλκ τεχηνιχαλ αλαν ϕουν διξ τεχηνιχαλ ϕοην µαριανι
She selects the option. Jenny starts with the al listing. This has employees listed within She drills down through the employee. The inferred ER sttricture relates this to the redcords in the databasee
ST5224: Advanced Statistical Theory II
ST5224: Advanced Statistical Theory II 2014/2015: Semester II Tutorial 7 1. Let X be a sample from a population P and consider testing hypotheses H 0 : P = P 0 versus H 1 : P = P 1, where P j is a known
Math221: HW# 1 solutions
Math: HW# solutions Andy Royston October, 5 7.5.7, 3 rd Ed. We have a n = b n = a = fxdx = xdx =, x cos nxdx = x sin nx n sin nxdx n = cos nx n = n n, x sin nxdx = x cos nx n + cos nxdx n cos n = + sin
SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018
Journal of rogressive Research in Mathematics(JRM) ISSN: 2395-028 SCITECH Volume 3, Issue 2 RESEARCH ORGANISATION ublished online: March 29, 208 Journal of rogressive Research in Mathematics www.scitecresearch.com/journals
Fractional Colorings and Zykov Products of graphs
Fractional Colorings and Zykov Products of graphs Who? Nichole Schimanski When? July 27, 2011 Graphs A graph, G, consists of a vertex set, V (G), and an edge set, E(G). V (G) is any finite set E(G) is
6.3 Forecasting ARMA processes
122 CHAPTER 6. ARMA MODELS 6.3 Forecasting ARMA processes The purpose of forecasting is to predict future values of a TS based on the data collected to the present. In this section we will discuss a linear
3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β
3.4 SUM AND DIFFERENCE FORMULAS Page Theorem cos(αβ cos α cos β -sin α cos(α-β cos α cos β sin α NOTE: cos(αβ cos α cos β cos(α-β cos α -cos β Proof of cos(α-β cos α cos β sin α Let s use a unit circle
Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)
Phys460.nb 81 ψ n (t) is still the (same) eigenstate of H But for tdependent H. The answer is NO. 5.5.5. Solution for the tdependent Schrodinger s equation If we assume that at time t 0, the electron starts
ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006
Οδηγίες: Να απαντηθούν όλες οι ερωτήσεις. Ολοι οι αριθμοί που αναφέρονται σε όλα τα ερωτήματα είναι μικρότεροι το 1000 εκτός αν ορίζεται διαφορετικά στη διατύπωση του προβλήματος. Διάρκεια: 3,5 ώρες Καλή
b. Use the parametrization from (a) to compute the area of S a as S a ds. Be sure to substitute for ds!
MTH U341 urface Integrals, tokes theorem, the divergence theorem To be turned in Wed., Dec. 1. 1. Let be the sphere of radius a, x 2 + y 2 + z 2 a 2. a. Use spherical coordinates (with ρ a) to parametrize.
Congruence Classes of Invertible Matrices of Order 3 over F 2
International Journal of Algebra, Vol. 8, 24, no. 5, 239-246 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.2988/ija.24.422 Congruence Classes of Invertible Matrices of Order 3 over F 2 Ligong An and
Math 6 SL Probability Distributions Practice Test Mark Scheme
Math 6 SL Probability Distributions Practice Test Mark Scheme. (a) Note: Award A for vertical line to right of mean, A for shading to right of their vertical line. AA N (b) evidence of recognizing symmetry
On a four-dimensional hyperbolic manifold with finite volume
BULETINUL ACADEMIEI DE ŞTIINŢE A REPUBLICII MOLDOVA. MATEMATICA Numbers 2(72) 3(73), 2013, Pages 80 89 ISSN 1024 7696 On a four-dimensional hyperbolic manifold with finite volume I.S.Gutsul Abstract. In
Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.
Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο The time integral of a force is referred to as impulse, is determined by and is obtained from: Newton s 2 nd Law of motion states that the action
Strain gauge and rosettes
Strain gauge and rosettes Introduction A strain gauge is a device which is used to measure strain (deformation) on an object subjected to forces. Strain can be measured using various types of devices classified
9.09. # 1. Area inside the oval limaçon r = cos θ. To graph, start with θ = 0 so r = 6. Compute dr
9.9 #. Area inside the oval limaçon r = + cos. To graph, start with = so r =. Compute d = sin. Interesting points are where d vanishes, or at =,,, etc. For these values of we compute r:,,, and the values
Every set of first-order formulas is equivalent to an independent set
Every set of first-order formulas is equivalent to an independent set May 6, 2008 Abstract A set of first-order formulas, whatever the cardinality of the set of symbols, is equivalent to an independent
Test Data Management in Practice
Problems, Concepts, and the Swisscom Test Data Organizer Do you have issues with your legal and compliance department because test environments contain sensitive data outsourcing partners must not see?
Partial Differential Equations in Biology The boundary element method. March 26, 2013
The boundary element method March 26, 203 Introduction and notation The problem: u = f in D R d u = ϕ in Γ D u n = g on Γ N, where D = Γ D Γ N, Γ D Γ N = (possibly, Γ D = [Neumann problem] or Γ N = [Dirichlet
ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ
ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ ΕΛΕΝΑ ΦΛΟΚΑ Επίκουρος Καθηγήτρια Τµήµα Φυσικής, Τοµέας Φυσικής Περιβάλλοντος- Μετεωρολογίας ΓΕΝΙΚΟΙ ΟΡΙΣΜΟΙ Πληθυσµός Σύνολο ατόµων ή αντικειµένων στα οποία αναφέρονται
The Simply Typed Lambda Calculus
Type Inference Instead of writing type annotations, can we use an algorithm to infer what the type annotations should be? That depends on the type system. For simple type systems the answer is yes, and
Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1
Eon : Fall 8 Suggested Solutions to Problem Set 8 Email questions or omments to Dan Fetter Problem. Let X be a salar with density f(x, θ) (θx + θ) [ x ] with θ. (a) Find the most powerful level α test
TMA4115 Matematikk 3
TMA4115 Matematikk 3 Andrew Stacey Norges Teknisk-Naturvitenskapelige Universitet Trondheim Spring 2010 Lecture 12: Mathematics Marvellous Matrices Andrew Stacey Norges Teknisk-Naturvitenskapelige Universitet
ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών
ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ Ψηφιακή Οικονομία Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών Τέλος Ενότητας Χρηματοδότηση Το παρόν εκπαιδευτικό υλικό έχει αναπτυχθεί
EPL 603 TOPICS IN SOFTWARE ENGINEERING. Lab 5: Component Adaptation Environment (COPE)
EPL 603 TOPICS IN SOFTWARE ENGINEERING Lab 5: Component Adaptation Environment (COPE) Performing Static Analysis 1 Class Name: The fully qualified name of the specific class Type: The type of the class
Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1
Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1 A Brief History of Sampling Research 1915 - Edmund Taylor Whittaker (1873-1956) devised a
Section 7.6 Double and Half Angle Formulas
09 Section 7. Double and Half Angle Fmulas To derive the double-angles fmulas, we will use the sum of two angles fmulas that we developed in the last section. We will let α θ and β θ: cos(θ) cos(θ + θ)
Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3
Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3 1 State vector space and the dual space Space of wavefunctions The space of wavefunctions is the set of all
ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ - ΤΜΗΥΠ ΒΑΣΕΙΣ ΔΕΔΟΜΕΝΩΝ ΙI
ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ - ΤΜΗΥΠ ΒΑΣΕΙΣ ΔΕΔΟΜΕΝΩΝ ΙI Β. Μεγαλοοικονόμου Μέθοδοι Προσπέλασης Χωρικών Δεδομένων ΙΙ Spatial Access Methods (SAMs) II (παρουσίαση βασισμένη εν μέρη σε σημειώσεις των Silberchatz,
Reminders: linear functions
Reminders: linear functions Let U and V be vector spaces over the same field F. Definition A function f : U V is linear if for every u 1, u 2 U, f (u 1 + u 2 ) = f (u 1 ) + f (u 2 ), and for every u U
Example Sheet 3 Solutions
Example Sheet 3 Solutions. i Regular Sturm-Liouville. ii Singular Sturm-Liouville mixed boundary conditions. iii Not Sturm-Liouville ODE is not in Sturm-Liouville form. iv Regular Sturm-Liouville note
the total number of electrons passing through the lamp.
1. A 12 V 36 W lamp is lit to normal brightness using a 12 V car battery of negligible internal resistance. The lamp is switched on for one hour (3600 s). For the time of 1 hour, calculate (i) the energy
Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών Άνοιξη 2009. HΥ463 - Συστήματα Ανάκτησης Πληροφοριών Information Retrieval (IR) Systems
Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών Άνοιξη 2009 HΥ463 - Συστήματα Ανάκτησης Πληροφοριών Information Retrieval (IR) Systems Στατιστικά Κειμένου Text Statistics Γιάννης Τζίτζικας άλ ιάλεξη :
k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +
Chapter 3. Fuzzy Arithmetic 3- Fuzzy arithmetic: ~Addition(+) and subtraction (-): Let A = [a and B = [b, b in R If x [a and y [b, b than x+y [a +b +b Symbolically,we write A(+)B = [a (+)[b, b = [a +b
The challenges of non-stable predicates
The challenges of non-stable predicates Consider a non-stable predicate Φ encoding, say, a safety property. We want to determine whether Φ holds for our program. The challenges of non-stable predicates
Potential Dividers. 46 minutes. 46 marks. Page 1 of 11
Potential Dividers 46 minutes 46 marks Page 1 of 11 Q1. In the circuit shown in the figure below, the battery, of negligible internal resistance, has an emf of 30 V. The pd across the lamp is 6.0 V and
Srednicki Chapter 55
Srednicki Chapter 55 QFT Problems & Solutions A. George August 3, 03 Srednicki 55.. Use equations 55.3-55.0 and A i, A j ] = Π i, Π j ] = 0 (at equal times) to verify equations 55.-55.3. This is our third
Solutions to Exercise Sheet 5
Solutions to Eercise Sheet 5 jacques@ucsd.edu. Let X and Y be random variables with joint pdf f(, y) = 3y( + y) where and y. Determine each of the following probabilities. Solutions. a. P (X ). b. P (X
Jesse Maassen and Mark Lundstrom Purdue University November 25, 2013
Notes on Average Scattering imes and Hall Factors Jesse Maassen and Mar Lundstrom Purdue University November 5, 13 I. Introduction 1 II. Solution of the BE 1 III. Exercises: Woring out average scattering
Concrete Mathematics Exercises from 30 September 2016
Concrete Mathematics Exercises from 30 September 2016 Silvio Capobianco Exercise 1.7 Let H(n) = J(n + 1) J(n). Equation (1.8) tells us that H(2n) = 2, and H(2n+1) = J(2n+2) J(2n+1) = (2J(n+1) 1) (2J(n)+1)
Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in
Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in : tail in X, head in A nowhere-zero Γ-flow is a Γ-circulation such that
6.1. Dirac Equation. Hamiltonian. Dirac Eq.
6.1. Dirac Equation Ref: M.Kaku, Quantum Field Theory, Oxford Univ Press (1993) η μν = η μν = diag(1, -1, -1, -1) p 0 = p 0 p = p i = -p i p μ p μ = p 0 p 0 + p i p i = E c 2 - p 2 = (m c) 2 H = c p 2
Lecture 34 Bootstrap confidence intervals
Lecture 34 Bootstrap confidence intervals Confidence Intervals θ: an unknown parameter of interest We want to find limits θ and θ such that Gt = P nˆθ θ t If G 1 1 α is known, then P θ θ = P θ θ = 1 α
The Probabilistic Method - Probabilistic Techniques. Lecture 7: The Janson Inequality
The Probabilistic Method - Probabilistic Techniques Lecture 7: The Janson Inequality Sotiris Nikoletseas Associate Professor Computer Engineering and Informatics Department 2014-2015 Sotiris Nikoletseas,
Démographie spatiale/spatial Demography
ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΙΑΣ Démographie spatiale/spatial Demography Session 1: Introduction to spatial demography Basic concepts Michail Agorastakis Department of Planning & Regional Development Άδειες Χρήσης
Uniform Convergence of Fourier Series Michael Taylor
Uniform Convergence of Fourier Series Michael Taylor Given f L 1 T 1 ), we consider the partial sums of the Fourier series of f: N 1) S N fθ) = ˆfk)e ikθ. k= N A calculation gives the Dirichlet formula
Right Rear Door. Let's now finish the door hinge saga with the right rear door
Right Rear Door Let's now finish the door hinge saga with the right rear door You may have been already guessed my steps, so there is not much to describe in detail. Old upper one file:///c /Documents
ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ "ΠΟΛΥΚΡΙΤΗΡΙΑ ΣΥΣΤΗΜΑΤΑ ΛΗΨΗΣ ΑΠΟΦΑΣΕΩΝ. Η ΠΕΡΙΠΤΩΣΗ ΤΗΣ ΕΠΙΛΟΓΗΣ ΑΣΦΑΛΙΣΤΗΡΙΟΥ ΣΥΜΒΟΛΑΙΟΥ ΥΓΕΙΑΣ "
ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΚΑΛΑΜΑΤΑΣ ΣΧΟΛΗ ΔΙΟΙΚΗΣΗΣ ΟΙΚΟΝΟΜΙΑΣ ΤΜΗΜΑ ΜΟΝΑΔΩΝ ΥΓΕΙΑΣ ΚΑΙ ΠΡΟΝΟΙΑΣ ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ "ΠΟΛΥΚΡΙΤΗΡΙΑ ΣΥΣΤΗΜΑΤΑ ΛΗΨΗΣ ΑΠΟΦΑΣΕΩΝ. Η ΠΕΡΙΠΤΩΣΗ ΤΗΣ ΕΠΙΛΟΓΗΣ ΑΣΦΑΛΙΣΤΗΡΙΟΥ ΣΥΜΒΟΛΑΙΟΥ
Second Order Partial Differential Equations
Chapter 7 Second Order Partial Differential Equations 7.1 Introduction A second order linear PDE in two independent variables (x, y Ω can be written as A(x, y u x + B(x, y u xy + C(x, y u u u + D(x, y
Matrices and Determinants
Matrices and Determinants SUBJECTIVE PROBLEMS: Q 1. For what value of k do the following system of equations possess a non-trivial (i.e., not all zero) solution over the set of rationals Q? x + ky + 3z
ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?
Teko Classes IITJEE/AIEEE Maths by SUHAAG SIR, Bhopal, Ph (0755) 3 00 000 www.tekoclasses.com ANSWERSHEET (TOPIC DIFFERENTIAL CALCULUS) COLLECTION # Question Type A.Single Correct Type Q. (A) Sol least
Instruction Execution Times
1 C Execution Times InThisAppendix... Introduction DL330 Execution Times DL330P Execution Times DL340 Execution Times C-2 Execution Times Introduction Data Registers This appendix contains several tables
Figure A.2: MPC and MPCP Age Profiles (estimating ρ, ρ = 2, φ = 0.03)..
Supplemental Material (not for publication) Persistent vs. Permanent Income Shocks in the Buffer-Stock Model Jeppe Druedahl Thomas H. Jørgensen May, A Additional Figures and Tables Figure A.: Wealth and
ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω
0 1 2 3 4 5 6 ω ω + 1 ω + 2 ω + 3 ω + 4 ω2 ω2 + 1 ω2 + 2 ω2 + 3 ω3 ω3 + 1 ω3 + 2 ω4 ω4 + 1 ω5 ω 2 ω 2 + 1 ω 2 + 2 ω 2 + ω ω 2 + ω + 1 ω 2 + ω2 ω 2 2 ω 2 2 + 1 ω 2 2 + ω ω 2 3 ω 3 ω 3 + 1 ω 3 + ω ω 3 +
SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions
SCHOOL OF MATHEMATICAL SCIENCES GLMA Linear Mathematics 00- Examination Solutions. (a) i. ( + 5i)( i) = (6 + 5) + (5 )i = + i. Real part is, imaginary part is. (b) ii. + 5i i ( + 5i)( + i) = ( i)( + i)
ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 24/3/2007
Οδηγίες: Να απαντηθούν όλες οι ερωτήσεις. Όλοι οι αριθμοί που αναφέρονται σε όλα τα ερωτήματα μικρότεροι του 10000 εκτός αν ορίζεται διαφορετικά στη διατύπωση του προβλήματος. Αν κάπου κάνετε κάποιες υποθέσεις
Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.
Bayesian statistics DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Frequentist vs Bayesian statistics In frequentist
1) Abstract (To be organized as: background, aim, workpackages, expected results) (300 words max) Το όριο λέξεων θα είναι ελαστικό.
UΓενικές Επισημάνσεις 1. Παρακάτω θα βρείτε απαντήσεις του Υπουργείου, σχετικά με τη συμπλήρωση της ηλεκτρονικής φόρμας. Διευκρινίζεται ότι στα περισσότερα θέματα οι απαντήσεις ήταν προφορικές (τηλεφωνικά),
ΠΟΛΥΤΕΧΝΕΙΟ ΚΡΗΤΗΣ ΣΧΟΛΗ ΜΗΧΑΝΙΚΩΝ ΠΕΡΙΒΑΛΛΟΝΤΟΣ
ΠΟΛΥΤΕΧΝΕΙΟ ΚΡΗΤΗΣ ΣΧΟΛΗ ΜΗΧΑΝΙΚΩΝ ΠΕΡΙΒΑΛΛΟΝΤΟΣ Τομέας Περιβαλλοντικής Υδραυλικής και Γεωπεριβαλλοντικής Μηχανικής (III) Εργαστήριο Γεωπεριβαλλοντικής Μηχανικής TECHNICAL UNIVERSITY OF CRETE SCHOOL of
Second Order RLC Filters
ECEN 60 Circuits/Electronics Spring 007-0-07 P. Mathys Second Order RLC Filters RLC Lowpass Filter A passive RLC lowpass filter (LPF) circuit is shown in the following schematic. R L C v O (t) Using phasor
Μηχανική Μάθηση Hypothesis Testing
ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ Μηχανική Μάθηση Hypothesis Testing Γιώργος Μπορμπουδάκης Τμήμα Επιστήμης Υπολογιστών Procedure 1. Form the null (H 0 ) and alternative (H 1 ) hypothesis 2. Consider
Numerical Analysis FMN011
Numerical Analysis FMN011 Carmen Arévalo Lund University carmen@maths.lth.se Lecture 12 Periodic data A function g has period P if g(x + P ) = g(x) Model: Trigonometric polynomial of order M T M (x) =
Homework 8 Model Solution Section
MATH 004 Homework Solution Homework 8 Model Solution Section 14.5 14.6. 14.5. Use the Chain Rule to find dz where z cosx + 4y), x 5t 4, y 1 t. dz dx + dy y sinx + 4y)0t + 4) sinx + 4y) 1t ) 0t + 4t ) sinx
Abstract Storage Devices
Abstract Storage Devices Robert König Ueli Maurer Stefano Tessaro SOFSEM 2009 January 27, 2009 Outline 1. Motivation: Storage Devices 2. Abstract Storage Devices (ASD s) 3. Reducibility 4. Factoring ASD
Tridiagonal matrices. Gérard MEURANT. October, 2008
Tridiagonal matrices Gérard MEURANT October, 2008 1 Similarity 2 Cholesy factorizations 3 Eigenvalues 4 Inverse Similarity Let α 1 ω 1 β 1 α 2 ω 2 T =......... β 2 α 1 ω 1 β 1 α and β i ω i, i = 1,...,
DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.
DESIGN OF MACHINERY SOLUTION MANUAL -7-1! PROBLEM -7 Statement: Design a double-dwell cam to move a follower from to 25 6, dwell for 12, fall 25 and dwell for the remader The total cycle must take 4 sec
ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ ΒΑΛΕΝΤΙΝΑ ΠΑΠΑΔΟΠΟΥΛΟΥ Α.Μ.: 09/061. Υπεύθυνος Καθηγητής: Σάββας Μακρίδης
Α.Τ.Ε.Ι. ΙΟΝΙΩΝ ΝΗΣΩΝ ΠΑΡΑΡΤΗΜΑ ΑΡΓΟΣΤΟΛΙΟΥ ΤΜΗΜΑ ΔΗΜΟΣΙΩΝ ΣΧΕΣΕΩΝ ΚΑΙ ΕΠΙΚΟΙΝΩΝΙΑΣ ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ «Η διαμόρφωση επικοινωνιακής στρατηγικής (και των τακτικών ενεργειών) για την ενδυνάμωση της εταιρικής
Chapter 6: Systems of Linear Differential. be continuous functions on the interval
Chapter 6: Systems of Linear Differential Equations Let a (t), a 2 (t),..., a nn (t), b (t), b 2 (t),..., b n (t) be continuous functions on the interval I. The system of n first-order differential equations
[1] P Q. Fig. 3.1
1 (a) Define resistance....... [1] (b) The smallest conductor within a computer processing chip can be represented as a rectangular block that is one atom high, four atoms wide and twenty atoms long. One
Parametrized Surfaces
Parametrized Surfaces Recall from our unit on vector-valued functions at the beginning of the semester that an R 3 -valued function c(t) in one parameter is a mapping of the form c : I R 3 where I is some
ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ Η/Υ & ΠΛΗΡΟΦΟΡΙΚΗΣ. του Γεράσιμου Τουλιάτου ΑΜ: 697
ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ Η/Υ & ΠΛΗΡΟΦΟΡΙΚΗΣ ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ ΣΤΑ ΠΛΑΙΣΙΑ ΤΟΥ ΜΕΤΑΠΤΥΧΙΑΚΟΥ ΔΙΠΛΩΜΑΤΟΣ ΕΙΔΙΚΕΥΣΗΣ ΕΠΙΣΤΗΜΗ ΚΑΙ ΤΕΧΝΟΛΟΓΙΑ ΤΩΝ ΥΠΟΛΟΓΙΣΤΩΝ του Γεράσιμου Τουλιάτου
Συστήματα Διαχείρισης Βάσεων Δεδομένων
ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ Συστήματα Διαχείρισης Βάσεων Δεδομένων Φροντιστήριο 9: Transactions - part 1 Δημήτρης Πλεξουσάκης Τμήμα Επιστήμης Υπολογιστών Tutorial on Undo, Redo and Undo/Redo
GPU. CUDA GPU GeForce GTX 580 GPU 2.67GHz Intel Core 2 Duo CPU E7300 CUDA. Parallelizing the Number Partitioning Problem for GPUs
GPU 1 1 NP number partitioning problem Pedroso CUDA GPU GeForce GTX 580 GPU 2.67GHz Intel Core 2 Duo CPU E7300 CUDA C Pedroso Python 323 Python C 12.2 Parallelizing the Number Partitioning Problem for
A Note on Intuitionistic Fuzzy. Equivalence Relation
International Mathematical Forum, 5, 2010, no. 67, 3301-3307 A Note on Intuitionistic Fuzzy Equivalence Relation D. K. Basnet Dept. of Mathematics, Assam University Silchar-788011, Assam, India dkbasnet@rediffmail.com
ΠΕΡΙΕΧΟΜΕΝΑ. Κεφάλαιο 1: Κεφάλαιο 2: Κεφάλαιο 3:
4 Πρόλογος Η παρούσα διπλωµατική εργασία µε τίτλο «ιερεύνηση χωρικής κατανοµής µετεωρολογικών µεταβλητών. Εφαρµογή στον ελληνικό χώρο», ανατέθηκε από το ιεπιστηµονικό ιατµηµατικό Πρόγραµµα Μεταπτυχιακών