Web search basics. Content. History Web Size Spam Link Analysis. Information Retrieval
- Παρθενορή Δημητρίου
- 8 years ago
Transcript
1 Web search basics 1 Content History Web Size Spam Link Analysis 2
2 History 3 Brief (non-technical) history Hypertext In the 1990s: (1) The server communicates with the client via a protocol (HTTP) that is lightweight, simple, and asynchronous, using a simple markup language (HTML) (2) The client (browser) ignores what it does not understand 4
3 Brief (non-technical) history Making web content discoverable (1) Full-text index (Altavista, Excite, Infoseek) (2) Taxonomies populated with pages in categories, such as Yahoo!: (a) Manual (difficult to scale) (b) The user needs to know which subtrees to seek 5 Brief (non-technical) history First challenge: scale Then: quality and relevance of query results 6
4 Brief (non-technical) history 1998+: Link-based ranking pioneered by Google Blew away all early engines save Inktomi Great user experience in search of a business model Meanwhile Goto/Overture's annual revenues were nearing $1 billion 7 Brief (non-technical) history Early keyword-based engines: Altavista, Excite, Infoseek, Inktomi, ca Sponsored search ranking: Goto.com (morphed into Overture.com, then Yahoo!) Your search ranking depended on how much you paid Auction for keywords: casino was expensive! 8
5 Advertising Graphical banner advertisements on web pages at popular websites (news and entertainment sites such as MSN, CNN, etc.) Purpose: branding Cost Per Mil (CPM) model: the cost of having a banner advertisement displayed 1000 times (each display is also called an impression) Cost Per Click (CPC) model Purpose: make a purchase > transaction-oriented 9 Advertising Goto (later Overture) Not a search engine For every query term q, it accepted bids from companies who wanted their web page shown on the query q As results, Goto returned the pages of all advertisers who bid for q When the user clicked, the advertiser would pay Sponsored search or search advertising 10
6 Advertising Combine: Pure search engines (aka algorithmic search results) Sponsored search engines (displayed separately and distinctively to the right of the algorithmic results) 11 [Figure: results page with ads beside the algorithmic results] 12
7 Advertising Paid inclusion: pay to have one's web page included in the search engine's index; it may or may not affect ranking 13 The Web document collection The Web: No design/coordination Distributed content creation, linking, democratization of publishing Content includes truth, lies, obsolete information, contradictions Unstructured (text, HTML, ...), semi-structured (XML, annotated photos), structured (databases) Scale much larger than previous text collections Growth has slowed down from the initial volume doubling every few months, but the Web is still expanding Content can be dynamically generated 14
8 Static web page: its content does not vary from one request for that page to the next Dynamic web page: typically generated by an application server in response to a query to a database; often recognizable by the character ? in its URL Example: an airport's flight status page 15 Web search basics [Figure: architecture of a web search engine. The user sends queries to the search component; a web spider crawls the Web and feeds the indexer, which builds the indexes; ad indexes are kept separately. The screenshot shows results 1-10 of about 7,310,000 for the query miele (0.12 seconds), with algorithmic results from Miele sites in several countries and sponsored links alongside.] 16
9 User 17 User Needs [Brod02, RL04] 1. Informational: want to learn about something (~40% / 65%), e.g., low hemoglobin Not a single web page; the user assimilates information from many sites 2. Navigational: want to go to that page (~25% / 15%), e.g., United Airlines What matters most: precision at 1 18
10 User Needs 3. Transactional: want to do something (web-mediated) (~35% / 20%) Access a service (e.g., Seattle weather) Downloads (e.g., Mars surface images) Shop (e.g., Canon S410) Results: sites listing interfaces for such services Gray areas Find a good hub (e.g., car rental Brasil) Exploratory search: see what's there 19 User Needs The type of query influences both the algorithmic search results and the sponsored search results Other characteristics: Average number of keywords between 2 and 3 Syntax operators are seldom used 20
11 How far do people look for results? (Source: iprospect.com WhitePaper_2006_SearchEngineUserBehavior.pdf) 21 Users empirical evaluation of results Quality of pages varies widely Relevance is not enough Other desirable qualities (non IR!!) Content: Trustworthy, diverse, non duplicated, well maintained Web readability: display correctly & fast No annoyances: pop ups, etc Precision vs. recall On the web, recall seldom matters What matters Precision at 1? Precision above the fold? Comprehensiveness must be able to deal with obscure queries Recall matters when the number of matches is very small User perceptions may be unscientific, but are significant over a large aggregate 22
12 Users empirical evaluation of engines Relevance and validity of results UI Simple, no clutter, error tolerant Trust Results are objective Coverage of topics for polysemic queries Pre/Post process tools provided Mitigate user errors (auto spell check, search assist, ) Explicit: Search within results, more like this, refine... Anticipative: related searches Deal with idiosyncrasies Web specific vocabulary Impact on stemming, spell check, etc Web addresses typed in the search box 23 Spam 24
13 The trouble with sponsored search It costs money. What's the alternative? Search Engine Optimization: tuning your web page to rank highly in the algorithmic search results for selected keywords An alternative to paying for placement; thus, intrinsically a marketing function Performed by companies, webmasters and consultants (search engine optimizers) for their clients Some perfectly legitimate, some very shady 25 Simplest forms First-generation engines relied heavily on tf/idf The top-ranked pages for the query maui resort were the ones containing the most occurrences of maui and resort SEOs responded with dense repetitions of chosen terms, e.g., maui resort maui resort maui resort Often, the repetitions would be in the same color as the background of the web page Repeated terms got indexed by crawlers but were not visible to humans in browsers Pure word density cannot be trusted as an IR signal 26
14 Variants of keyword stuffing Misleading meta tags, excessive repetition Hidden text with colors, style sheet tricks, etc. Meta-Tags = London hotels, hotel, holiday inn, hilton, discount, booking, reservation, sex, mp3, britney spears, viagra, 27 Motives Search engine optimization (Spam) Commercial, political, religious, lobbies Promotion funded by advertising budget Operators Contractors (Search Engine Optimizers) for lobbies, companies Web masters Hosting services Forums E.g., Web master world ( ) Search engine specific tricks Discussions about academic papers 28
15 Cloaking Serve fake content to the search engine spider DNS cloaking: switch IP address, impersonate Get indexed under misleading keywords [Figure: Is this a search engine spider? Y: serve the SPAM page. N: serve the real document.] 29 More spam techniques Doorway pages: pages optimized for a single keyword that redirect to the real target page Link spamming: mutual admiration societies, hidden links, awards (more on these later) Domain flooding: numerous domains that point or redirect to a target page Robots: Fake query stream, rank checking programs Curve-fit the ranking programs of search engines Millions of submissions via Add-Url Click spam 30
16 The war against spam Quality signals Prefer authoritative pages based on: Votes from authors (linkage signals) Votes from users (usage signals) Policing of URL submissions Anti robot test Limits on meta keywords Robust link analysis Ignore statistically implausible linkage (or text) Use link analysis to detect spammers (guilt by association) Spam recognition by machine learning Training set based on known spam Family friendly filters Linguistic analysis, general classification techniques, etc. For images: flesh tone detectors, source text analysis, etc. Editorial intervention Blacklists Top queries audited Complaints addressed Suspect pattern detection 31 More on spam Web search engines have policies on SEO practices they tolerate/block Adversarial IR: the unending (technical) battle between SEO s and web search engines Research 32
17 Web as a graph 33 Web Graph As a directed graph Nodes: static HTML pages Edges: hyperlinks between pages Anchor text: the text of the <a> element whose href attribute gives the link target 34
18 Web Graph Not strongly connected In-links, in-degree: 8-15 Out-links, out-degree Power law distribution of in-degree: the number of web pages with in-degree i is proportional to $1/i^{\alpha}$ for a constant $\alpha$ Web Graph Bowtie structure SCC: strongly connected component IN and OUT are roughly equal in size, SCC is somewhat larger Most web pages are in one of the three sets Tubes (small sets outside SCC that lead directly from IN to OUT) Tendrils (lead nowhere from IN, or from nowhere to OUT) 36
19 Size of the web What is the size of the web? Issues The web is really infinite Dynamic content, e.g., calendar Soft 404: is a valid page Static web contains syntactic duplication, mostly due to mirroring (~30%) Some servers are seldom connected Who cares? Media, and consequently the user Engine design Engine crawl policy. Impact on recall. 38
20 What can we attempt to measure? The relative sizes of search engines The notion of a page being indexed is still reasonably well defined. Already there are problems Document extension: e.g. engines index pages not yet crawled, by indexing anchortext. Document restriction: All engines restrict what is indexed (first n words, only relevant words, etc.) The coverage of a search engine relative to another particular crawling process. 39 New definition? (IQ is whatever the IQ tests measure.) The statically indexable web is whatever search engines index. Different engines have different preferences max url depth, max count/host, anti spam rules, priority rules, etc. Different engines index different things under the same URL: frames, meta keywords, document restrictions, document extensions,... 40
21 Relative Size from Overlap Given two engines A and B: sample URLs randomly from A and check if they are contained in B, and vice versa. Example: $|A \cap B| = \tfrac{1}{2}\,|A|$ and $|A \cap B| = \tfrac{1}{6}\,|B|$, so $\tfrac{1}{2}\,|A| = \tfrac{1}{6}\,|B|$ and $|A| / |B| = (1/6)/(1/2) = 1/3$. Each test involves: (i) sampling (ii) checking 41 Sampling URLs Assumption: A and B are independent and uniform random subsets of the Web Two settings: with or without access to the search engine How to obtain a sample? 42
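The overlap arithmetic above can be checked with a few lines of Python. This is only a sketch under the slide's independence assumption; the function names `overlap_fraction` and `relative_size` and the toy index are illustrative, not part of the original slides.

```python
from fractions import Fraction

def overlap_fraction(sample, contains):
    """Fraction of sampled URLs for which the membership check succeeds."""
    hits = sum(1 for url in sample if contains(url))
    return Fraction(hits, len(sample))

def relative_size(frac_a_in_b, frac_b_in_a):
    """Estimate Size A / Size B from the two overlap fractions.

    frac_a_in_b: fraction of URLs sampled from A that are also in B
    frac_b_in_a: fraction of URLs sampled from B that are also in A
    Both fractions estimate the same intersection, so
    frac_a_in_b * Size A = frac_b_in_a * Size B.
    """
    if frac_a_in_b == 0:
        raise ValueError("no sampled URL from A was found in B")
    return frac_b_in_a / frac_a_in_b

# Numbers from the slide: half of A's sample is in B, one sixth of B's is in A.
ratio = relative_size(Fraction(1, 2), Fraction(1, 6))
print(ratio)  # 1/3

# Toy membership check against a hypothetical index for engine B.
index_b = {"u1", "u2"}
sample_from_a = ["u1", "u3", "u2", "u4"]
frac = overlap_fraction(sample_from_a, index_b.__contains__)
```

In practice the `contains` check would be a query to the other engine, which is where the checking problems discussed later come in.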
22 Sampling URLs Ideal strategy: Generate a random URL and check for containment in each index. Problem: Random URLs are hard to find! Enough to generate a random URL contained in a given Engine. Approach 1: Generate a random URL contained in a given engine Suffices for the estimation of relative size Approach 2: Random walks / IP addresses In theory: might give us a true estimate of the size of the web (as opposed to just relative sizes of indexes) 43 Statistical methods Approach 1 Random queries Random searches Approach 2 Random IP addresses Random walks 44
23 Random URLs from random queries Generate a random query: how? By picking random terms from, say, Webster's dictionary Problems: not all terms occur equally often (so this is not the same as choosing documents uniformly at random from a search engine), and many web terms are not in the dictionary Thus, use a sample web dictionary: Lexicon: 400,000+ words from a web crawl Conjunctive queries: $w_1$ AND $w_2$, e.g., vocalists AND rsi 45 Random URLs from random queries Get 100 result URLs from engine A Choose a random URL p as the candidate to check for presence in engine B This distribution induces a probability weight W(p) for each page. Conjecture: $W(SE_A) / W(SE_B) \approx |SE_A| / |SE_B|$ How to test for the presence of p (document D) in B? 46
24 Query-Based Checking Strong query to check whether an engine B has a document D: Download D. Get its list of words. Use 6-8 low-frequency words as an AND query to B. Check if D is present in the result set. Problems: near duplicates, frames, redirects, engine time-outs Is an 8-word query good enough? 47 Advantages & disadvantages Statistically sound under the induced weight. Biases induced by random queries: Query bias: favors content-rich pages in the language(s) of the lexicon Ranking bias: solution: use conjunctive queries & fetch all results Checking bias: duplicates, impoverished pages omitted Document or query restriction bias: the engine might not deal properly with an 8-word conjunctive query Malicious bias: sabotage by the engine Operational problems: time-outs, failures, engine inconsistencies, index modification. 48
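The strong-query construction described above can be sketched as follows. The tokenizer, the `term_frequency` table (a hypothetical lexicon of collection frequencies), and the function name are assumptions made for illustration; sending the query to engine B and matching D in the results is left out.

```python
import re

def strong_query(document_text, term_frequency, k=8):
    """Build a conjunctive (AND) query from the k rarest words of a document.

    term_frequency: dict word -> collection frequency (hypothetical lexicon).
    Words missing from the lexicon are treated as frequency 0, i.e. rarest.
    """
    words = set(re.findall(r"[a-z]+", document_text.lower()))
    # Sort by frequency, breaking ties alphabetically for a stable result.
    rarest = sorted(words, key=lambda w: (term_frequency.get(w, 0), w))[:k]
    return " AND ".join(rarest)

lexicon = {"the": 1000, "of": 900, "network": 50,
           "fat": 30, "dimension": 20, "shattering": 2}
query = strong_query("The fat shattering dimension of the network", lexicon, k=3)
```

The slide suggests 6 to 8 words; `k=3` is used here only to keep the toy example short.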
25 Random searches Choose random searches extracted from a local log [Lawrence & Giles 97] or build random searches [Notess] Use only queries with small results sets. Count normalized URLs in result sets. Use ratio statistics 49 Advantages & disadvantages Advantage Might be a better reflection of the human perception of coverage Issues Samples are correlated with source of log Duplicates Technical statistical problems (must have non zero results, ratio average not statistically sound) 50
26 Random searches 575 & 1050 queries from the NEC RI employee logs 6 Engines in 1998, 11 in 1999 Implementation: Restricted to queries with < 600 results in total Counted URLs from each engine after verifying query match Computed size ratio & overlap for individual queries Estimated index size ratio & overlap by averaging over all queries 51 Queries from Lawrence and Giles study adaptive access control neighborhood preservation topographic hamiltonian structures right linear grammar pulse width modulation neural unbalanced prior probabilities ranked assignment method internet explorer favourites importing karvel thornber zili liu softmax activation function bose multidimensional system theory gamma mlp dvi2pdf john oliensis rieke spikes exploring neural video watermarking counterpropagation network fat shattering dimension abelson amorphous computing 52
27 Random IP addresses Generate random IP addresses Find a web server at the given address If there s one Collect all pages from server From this, choose a page at random 53 Random IP addresses HTTP requests to random IP addresses Ignored: empty or authorization required or excluded [Lawr99] Estimated 2.8 million IP addresses running crawlable web servers (16 million total) from observing 2500 servers. OCLC using IP sampling found 8.7 M hosts in 2001 Netcraft [Netc02] accessed 37.2 million hosts in July 2002 [Lawr99] exhaustively crawled 2500 servers and extrapolated Estimated size of the web to be 800 million Estimated use of metadata descriptors: Meta tags (keywords, description) in 34% of home pages, Dublin core metadata in 0.3% 54
28 Advantages & disadvantages Advantages: Clean statistics Independent of crawling strategies Disadvantages: Doesn't deal with duplication Many hosts might share one IP, or not accept requests No guarantee that all pages are linked to the root page, e.g., employee pages The power law for # pages/host generates a bias towards sites with few pages, but the bias can be accurately quantified if the underlying distribution is understood Potentially influenced by spamming (multiple IPs for the same server to avoid an IP block) 55 Random walks View the Web as a directed graph Build a random walk on this graph Includes various jump rules back to visited sites Does not get stuck in spider traps! Can follow all links! Converges to a stationary distribution Must assume the graph is finite and independent of the walk. These conditions are not satisfied (cookie crumbs, flooding) Time to convergence not really known Sample from the stationary distribution of the walk Use the strong query method to check coverage by the search engine 56
29 Advantages & disadvantages Advantages Statistically clean method at least in theory! Could work even for infinite web (assuming convergence) under certain metrics. Disadvantages List of seeds is a problem. Practical approximation might not be valid. Non uniform distribution Subject to link spamming 57 Conclusions No sampling solution is perfect. Lots of new ideas......but the problem is getting harder Quantitative studies are fascinating and a good research problem 58
30 59 Link Analysis 60
31 Content Introduction PageRank HITS 61 The Web as a Directed Graph Page A Anchor hyperlink Page B Assumption 1: The anchor of the hyperlink describes the target page (textual context) 62
32 The Web as a Directed Graph Page A Anchor hyperlink Page B Assumption 2: A hyperlink between pages denotes author-perceived relevance (quality signal): an endorsement of page B by the creator of A 63 Anchor Text WWW Worm - McBryan [Mcbr94] For the query ibm, how to distinguish between: IBM's home page (mostly graphical) IBM's copyright page (high term freq. for ibm) Rival's spam page (arbitrarily high term freq.) [Figure: anchor texts ibm, ibm.com, and IBM home page pointing to the IBM home page] A million pieces of anchor text with ibm send a strong signal 64
33 Indexing anchor text When indexing a document D, include anchor text from links pointing to D. [Figure: pages linking to D, with surrounding text such as: Armonk, NY-based computer giant IBM announced today; Joe's computer hardware links: Sun, HP, IBM; Big Blue today announced record profits for the quarter] 65 Indexing anchor text Can sometimes have unexpected side effects, e.g., evil empire. Can score anchor text with a weight depending on the authority of the anchor page's website E.g., if we were to assume that content from cnn.com or yahoo.com is authoritative, then trust the anchor text from them 66
34 Anchor Text Gap between the terms in a web page and how users would describe this web page Index also the window of text surrounding the anchor text Anchor text terms are weighted based on frequency, similar to idf, so that very frequent terms such as Click Here count little 67 Query-independent ordering First generation: using link counts as simple measures of popularity. Two basic suggestions: Undirected popularity: each page gets a score = the number of in-links plus the number of out-links (3+2=5). Directed popularity: score of a page = number of its in-links (3). 68
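The two link-count scores above are easy to compute from an edge list. A minimal sketch, assuming a hypothetical page p with three in-links and two out-links to reproduce the slide's numbers (3+2=5 and 3); the function name and page names are illustrative.

```python
from collections import defaultdict

def popularity_scores(edges):
    """Return (undirected, directed) popularity dictionaries.

    edges: iterable of (source, target) hyperlinks.
    undirected[p] = in-links + out-links of p; directed[p] = in-links of p.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    pages = set(in_deg) | set(out_deg)
    directed = {p: in_deg[p] for p in pages}
    undirected = {p: in_deg[p] + out_deg[p] for p in pages}
    return undirected, directed

# Page "p" has three in-links and two out-links.
edges = [("a", "p"), ("b", "p"), ("c", "p"), ("p", "x"), ("p", "y")]
undirected, directed = popularity_scores(edges)
```

Both scores are trivially spammable: out-links are under the page author's control, and in-links can be manufactured from other pages the spammer owns.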
35 Query processing First retrieve all pages meeting the text query (say venture capital). Order these by their link popularity (either variant on the previous page). 69 Spamming simple popularity How do you spam each of the following heuristics so your page gets a high score? Each page gets a static score = the number of in links plus the number of out links. Static score of a page = number of its in links. 70
36 Citation Analysis Citation frequency Co citation coupling frequency Cocitations with a given author measures impact Cocitation analysis Bibliographic coupling frequency Articles that co cite the same articles are related Citation indexing Who is author cited by? (Garfield 1972) Pagerank preview: Pinsker and Narin 60s 71 PageRank 72
37 PageRank scoring Not all links to a page are equal Links from important pages (i.e., pages with many links) count more Assign to every node (page) in the web graph a numerical score between 0 and 1: its PageRank Given a query: compute a composite score for each web page that combines relevance with PageRank 73 PageRank definition: example Suppose there is a total quantity of PR that is shared among the pages of the system. Take 4 pages: A, B, C and D. Initial approximate value for each: PR = 0.25 Suppose B, C, and D link only to A; then all of their PageRank would accumulate at A Suppose now that B also links to C, and D also links to both B and C The PR value of a page is divided among its outgoing edges So the vote of B counts for both A and C. Likewise, only 1/3 of the PageRank of D counts towards the PageRank of A (about 0.083). 74
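The four-page example can be worked through numerically. This sketch assumes, as the example implies, that B links to A and C, C links only to A, and D links to A, B and C; A's own out-links are not given, so it is left without any. The names `out_links` and `one_step_share` are illustrative.

```python
from fractions import Fraction

# Each page maps to the pages it links to (assumed encoding of the example).
out_links = {
    "A": [],
    "B": ["A", "C"],
    "C": ["A"],
    "D": ["A", "B", "C"],
}
# Initial approximate value: 0.25 for each of the four pages.
pr = {page: Fraction(1, 4) for page in out_links}

def one_step_share(target):
    """Sum PR(T) / C(T) over every page T that links to target."""
    return sum(pr[t] / len(links)
               for t, links in out_links.items() if target in links)

share_from_d = pr["D"] / len(out_links["D"])   # D's vote for A
new_pr_a = one_step_share("A")                 # B/2 + C + D/3
print(float(share_from_d), float(new_pr_a))
```

D passes one third of its 0.25 to A, which is the "about 0.083" quoted in the example; after this single step A holds 11/24 of the total.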
38 PageRank definition General definition of PageRank for a page A: Suppose A has pages T1, ..., Tn pointing to it (i.e., citations) Let C(T) be the number of outgoing edges of a page T Then $PR(A) = PR(T_1)/C(T_1) + \dots + PR(T_n)/C(T_n)$ 75 Simple flow model: computing PageRank The web in 1839 [Figure: three pages, Yahoo (y), Amazon (a), and M'soft (m). Yahoo links to itself and to Amazon; Amazon links to Yahoo and to M'soft; M'soft links to Amazon.] Flow equations: $y = y/2 + a/2$, $a = y/2 + m$, $m = a/2$ 76
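39 Computing PageRank: matrix formulation Flow equations for Yahoo (y), Amazon (a), and M'soft (m): $y = y/2 + a/2$, $a = y/2 + m$, $m = a/2$ Adjacency matrix, with rows and columns ordered y, a, m: $\begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 1 \\ 0 & 1/2 & 0 \end{pmatrix}$ Sum 1 (the votes of y) 77 Computing PageRank: matrix formulation (example) r is the rank vector, $r = [y, a, m]$; A is the adjacency matrix; the flow equations $y = y/2 + a/2$, $a = y/2 + m$, $m = a/2$ become $r = A r$: $\begin{pmatrix} y \\ a \\ m \end{pmatrix} = \begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 1 \\ 0 & 1/2 & 0 \end{pmatrix} \begin{pmatrix} y \\ a \\ m \end{pmatrix}$ 78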
40 Computing PageRank: eigenvectors The flow equations can be written $r = Mr$ That is, the rank vector is an eigenvector of the stochastic adjacency matrix of the web Specifically, it is the principal eigenvector (the one corresponding to the eigenvalue $\lambda = 1$) 79 Computing PageRank: power iteration method A simple iterative scheme (relaxation) Let there be N web pages Initialization: $r^{0} = [1/N, \dots, 1/N]^T$ Iteration: $r^{k+1} = M r^{k}$ Stop when $\|r^{k+1} - r^{k}\|_1 < \varepsilon$, where $\|x\|_1 = \sum_{1 \le i \le N} |x_i|$ is the L1 norm Other metrics can also be used, e.g., Euclidean 80
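41 Computing PageRank: example For Yahoo (y), Amazon (a), and M'soft (m), with rows and columns ordered y, a, m: $M = \begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 1 \\ 0 & 1/2 & 0 \end{pmatrix}$ Successive iterates of (y, a, m): (1/3, 1/3, 1/3), (1/3, 1/2, 1/6), (5/12, 1/3, 1/4), (3/8, 11/24, 1/6), ..., (2/5, 2/5, 1/5) Does it converge? Is the solution unique? 81 PageRank scoring Intuitively, the probability that a random surfer would visit the node 82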
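The iterates in the Yahoo / Amazon / M'soft example can be reproduced with a short power-iteration sketch. Exact fractions are used so the first steps match the slide digit for digit; the function names and the stopping threshold are illustrative choices.

```python
from fractions import Fraction as F

# Matrix of the example, rows and columns ordered y, a, m.
# Column j holds the shares that page j passes on to each page.
M = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 2), F(0),    F(1)],
    [F(0),    F(1, 2), F(0)],
]

def step(M, r):
    """One iteration r_next = M r."""
    return [sum(M[i][j] * r[j] for j in range(len(r))) for i in range(len(M))]

def power_iteration(M, eps=1e-10, max_iter=1000):
    """Iterate from the uniform vector until the L1 change drops below eps."""
    n = len(M)
    r = [F(1, n)] * n
    for _ in range(max_iter):
        nxt = step(M, r)
        if sum(abs(a - b) for a, b in zip(nxt, r)) < eps:
            return nxt
        r = nxt
    return r

r0 = [F(1, 3)] * 3
r1 = step(M, r0)
r2 = step(M, r1)
r3 = step(M, r2)
limit = [float(x) for x in power_iteration(M)]
```

For this particular graph the iteration does converge, to (2/5, 2/5, 1/5); graphs with dead ends or traps need the teleporting fix discussed later.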
42 PageRank scoring Imagine a browser doing a random walk on web pages: 1/3 1/3 Start at a random page 1/3 At each step, go out of the current page along one of the links on that page, equiprobably In the steady state each page has a long term visit rate use this as the page s score. 83 Not quite enough The web is full of dead ends (no out links). Random walk can get stuck in dead ends. Makes no sense to talk about long term visit rates.?? 84
43 Teleporting At a dead end, jump to a random web page. At any non dead end, with probability α, say α = 10%, jump to a random web page. With remaining probability (90%), go out on a random link. 10% (α) a parameter. If N total number of web pages: teleport with 1/N 85 Result of teleporting Now cannot get stuck locally. There is a long term rate at which any page is visited How do we compute this visit rate? 86
44 Markov chains A Markov chain consists of n states, plus an $n \times n$ transition probability matrix P. At each step, we are in exactly one of the states. For $1 \le i, j \le n$, the matrix entry $P_{ij}$ tells us the probability of j being the next state, given that we are currently in state i (transition probability; by the Markov property it depends only on i). $P_{ii} > 0$ is OK. 87 Markov chains Clearly, for all i, $\sum_{j=1}^{n} P_{ij} = 1$. Markov chains are abstractions of random walks. [Figure: example chain with states A, B, C] 88
45 Random Surfer and Markov chains State: a web page Transition probability: the probability of moving from one page to another Adjacency matrix A of the web: $A_{ij} = 1$ if there is a link from i to j, 0 otherwise 89 Random Surfer and Markov chains From the adjacency matrix A of the web to the probability matrix P: Divide each 1 in A by the number of 1s in its row Multiply the resulting matrix by $(1 - \alpha)$ Add $\alpha / n$ to every entry of the resulting matrix 90
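The three steps above translate directly into code. A minimal sketch: the function name, the toy graph, and the value alpha = 0.1 (taken from the teleporting slide) are illustrative, and the handling of rows without any 1 (dead ends jump uniformly at random) is an assumption borrowed from the teleporting rule rather than something stated on this slide.

```python
def transition_matrix(A, alpha=0.1):
    """Turn a 0/1 adjacency matrix A into a transition probability matrix P."""
    n = len(A)
    P = []
    for row in A:
        out_degree = sum(row)
        if out_degree == 0:
            # Dead end (assumed rule): jump to a page chosen uniformly at random.
            P.append([1.0 / n] * n)
        else:
            # Divide each 1 by the row's out-degree, scale by (1 - alpha),
            # then add alpha / n to every entry.
            P.append([(1 - alpha) * a / out_degree + alpha / n for a in row])
    return P

# Hypothetical 3-page graph: 0 -> 1, 1 -> 0, 1 -> 2, and page 2 is a dead end.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
P = transition_matrix(A, alpha=0.1)
```

Every row of `P` sums to 1 and every entry is positive, which is what makes the resulting chain ergodic.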
46 Random Surfer and Markov chains Example: three nodes 1, 2 and 3, with links 1 → 2, 3 → 2 and 2 → 3, and teleport probability α Random Surfer and Markov chains The surfer's position at any time is described by a probability vector x At t = 0 the surfer is at one state (x has 1 at the corresponding state and 0 at all others) At t = 1 the distribution is $xP$ At t = 2 it is $(xP)P = xP^2$, and so on Does it converge? PageRank of each node u = steady-state visit frequency 92
47 Probability vectors A probability (row) vector $x = (x_1, \dots, x_n)$ tells us where the walk is at any point. E.g., $(0, \dots, 0, 1, 0, \dots, 0)$ with the 1 in position i means we're in state i. More generally, the vector $x = (x_1, \dots, x_n)$ means the walk is in state i with probability $x_i$, where $\sum_{i=1}^{n} x_i = 1$. Ergodic Markov chains A Markov chain is ergodic if you have a path from any state to any other and, for any start state, after a finite transient time $T_0$, the probability of being in any state at a fixed time $T > T_0$ is nonzero. 94
48 Ergodic Markov chains For any ergodic Markov chain, there is a unique long-term visit rate for each state: the steady-state probability distribution. Over a long time period, we visit each state in proportion to this rate. It doesn't matter where we start. 95 Steady state example The steady state looks like a vector of probabilities $a = (a_1, \dots, a_n)$: $a_i$ is the probability that we are in state i. [Figure: a two-state chain with transition probabilities 1/4 and 3/4] For this example, $a_1 = 1/4$ and $a_2 = 3/4$. 96
49 How do we compute this vector? Let $a = (a_1, \dots, a_n)$ denote the row vector of steady-state probabilities. If our current position is described by a, then the next step is distributed as $aP$. But a is the steady state, so $a = aP$. Solving this matrix equation gives us a. So a is the (left) eigenvector for P. (It corresponds to the principal eigenvector of P, the one with the largest eigenvalue.) Transition probability matrices always have largest eigenvalue 1. One way of computing a Recall: regardless of where we start, we eventually reach the steady state a. Start with any distribution (say $x = (1, 0, \dots, 0)$). After one step, we're at $xP$; after two steps at $xP^2$, then $xP^3$ and so on. Eventually means: for large k, $xP^k = a$. Algorithm: multiply x by increasing powers of P until the product looks stable. 98
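The algorithm above (multiply x by P until the product looks stable) can be sketched in plain Python. The two-state matrix below is an assumption: it is one chain whose steady state is the (1/4, 3/4) quoted in the steady state example, since the figure itself is not recoverable. The function name and tolerance are illustrative.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Repeatedly compute x <- xP, starting from x = (1, 0, ..., 0)."""
    n = len(P)
    x = [1.0] + [0.0] * (n - 1)
    for _ in range(max_iter):
        # Row vector times matrix: next[j] = sum_i x[i] * P[i][j].
        nxt = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# Assumed two-state chain: from either state, go to state 1 with
# probability 1/4 and to state 2 with probability 3/4.
P = [
    [0.25, 0.75],
    [0.25, 0.75],
]
a = steady_state(P)

# A chain where convergence takes more than one step (also hypothetical).
Q = [
    [0.9, 0.1],
    [0.3, 0.7],
]
b = steady_state(Q)
```

For `Q` the fixed point of b = bQ is (3/4, 1/4), which can be verified by hand from 0.1 * b1 = 0.3 * b2 and b1 + b2 = 1.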
50 Pagerank summary Preprocessing: Given graph of links, build matrix P. From it compute a. The entry a i is a number between 0 and 1: the pagerank of page i. Query processing: Retrieve pages meeting query. Rank them by their pagerank. Order is query independent. 99 The reality Pagerank is used in google, but so are many other clever heuristics. 100
51 Pagerank: Issues and Variants How realistic is the random surfer model? What if we modeled the back button? Surfer behavior sharply skewed towards short paths Search engines, bookmarks & directories make jumps nonrandom. Biased Surfer Models Weight edge traversal probabilities based on match with topic/query (non uniform edge selection) Bias jumps to pages on topic (e.g., based on personal bookmarks & categories of interest) 101 PageRank Topic-Specific PageRank 102
52 Topic Specific Pagerank Idea: Teleport to a random page non uniformly How? 103 Topic Specific Pagerank Conceptually, we use a random surfer who teleports, with say 10% probability, using the following rule: Selects a category (say, one of the 16 top level ODP categories) based on a query & user specific distribution over the categories Teleport to a page uniformly at random within the chosen category Sounds hard to implement: can t compute PageRank at query time! 104
53 Topic Specific Pagerank Offline:Compute pagerank for individual categories Query independent as before Each page has multiple pagerank scores one for each ODP category, with teleportation only to that category Online: Distribution of weights over categories computed by query context classification Generate a dynamic pagerank score for each page weighted sum of category specific pageranks 105 Influencing PageRank ( Personalization ) Input: Web graph W influence vector v v : (page degree of influence) Output: Rank vector r: (page page importance wrt v) r = PR(W, v) 106
54 Non-uniform Teleportation [Figure: teleport with 10% probability to a Sports page] 107 Interpretation of Composite Score For a set of personalization vectors $\{v_j\}$: $\sum_j [w_j \cdot PR(W, v_j)] = PR(W, \sum_j [w_j \cdot v_j])$ A weighted sum of rank vectors itself forms a valid rank vector, because PR() is linear with respect to $v_j$ 108
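The linearity claim above can be checked numerically on a small graph. This is a sketch under stated assumptions: the four-page graph, the page names, the 0.9 / 0.1 weights, and the function `personalized_pagerank` are all hypothetical, and the sketch requires every page to have at least one out-link so that dead ends do not complicate the teleport step.

```python
def personalized_pagerank(out_links, v, alpha=0.1, iterations=300):
    """Power iteration for PageRank with teleport distribution v.

    out_links: dict page -> list of pages it links to (no dead ends)
    v: dict page -> teleport probability, summing to 1
    """
    pages = list(out_links)
    if any(not targets for targets in out_links.values()):
        raise ValueError("this sketch assumes every page has an out-link")
    r = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        nxt = {p: alpha * v[p] for p in pages}          # teleport mass
        for p, targets in out_links.items():
            share = (1 - alpha) * r[p] / len(targets)   # follow a random link
            for q in targets:
                nxt[q] += share
        r = nxt
    return r

graph = {"s1": ["s2", "h1"], "s2": ["s1"], "h1": ["h2"], "h2": ["s1", "h1"]}
v_sports = {"s1": 0.5, "s2": 0.5, "h1": 0.0, "h2": 0.0}
v_health = {"s1": 0.0, "s2": 0.0, "h1": 0.5, "h2": 0.5}
w_sports, w_health = 0.9, 0.1

# Left-hand side: weighted sum of the two topic-specific rank vectors.
pr_sports = personalized_pagerank(graph, v_sports)
pr_health = personalized_pagerank(graph, v_health)
weighted_sum = {p: w_sports * pr_sports[p] + w_health * pr_health[p] for p in graph}

# Right-hand side: PageRank computed once with the mixed teleport vector.
v_mixed = {p: w_sports * v_sports[p] + w_health * v_health[p] for p in graph}
pr_mixed = personalized_pagerank(graph, v_mixed)
```

This is why the per-category vectors can be computed offline and only combined at query time.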
55 Interpretation Sports 10% Sports teleportation 109 Interpretation Health 10% Health teleportation 110
56 Interpretation $pr = 0.9\,PR_{sports} + 0.1\,PR_{health}$ gives you: 9% sports teleportation, 1% health teleportation [Figure: Health and Sports page clusters] 111 HITS 112
57 The hope Query: Long distance telephone companies Hubs Alice Bob Use hubs to discover authorities AT&T Sprint MCI Authorities 113 Hyperlink-Induced Topic Search (HITS) In response to a query, instead of an ordered list of pages each meeting the query, find two sets of inter related pages: Hub pages are good lists of links on a subject. e.g., Bob s list of cancer related links. Authority pages occur recurrently on good hubs for the subject. Each page has two scores for each query: a hub score and an authority score 114
58 Hubs and Authorities Thus, a good hub page for a topic points to many authoritative pages for that topic. A good authority page for a topic is pointed to by many good hubs for that topic. Circular definition will turn this into an iterative computation. 115 The hope Hubs Alice Bob AT&T Sprint MCI Authorities Query: Long distance telephone companies 116
59 Hyperlink-Induced Topic Search (HITS) Best suited for broad topic queries rather than for page finding queries. Gets at a broader slice of common opinion. 117 High-level scheme Extract from the web a base set of pages that could be good hubs or authorities. From these, identify a small set of top hub and authority pages; iterative algorithm. 118
60 Base set Query specific Given text query (say browser), use a text index to get all pages containing browser. Call this the root set of pages. Add in any page that either points to a page in the root set, or is pointed to by a page in the root set. Call this the base set. 119 Visualization Root set Base set 120
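The root set and base set construction above can be sketched as follows. The page texts, the link tables, and the function names are hypothetical; a real system would use a text index and a connectivity server rather than in-memory dictionaries.

```python
def root_set(pages, term):
    """Pages whose text contains the query term (stand-in for a text index)."""
    return {url for url, text in pages.items() if term in text.lower().split()}

def invert(out_links):
    """Build the in-link table from the out-link table."""
    in_links = {}
    for src, targets in out_links.items():
        for dst in targets:
            in_links.setdefault(dst, []).append(src)
    return in_links

def base_set(root, out_links, in_links):
    """Root set plus every page that links to it or is linked from it."""
    base = set(root)
    for page in root:
        base.update(out_links.get(page, ()))
        base.update(in_links.get(page, ()))
    return base

pages = {"p1": "A fast browser", "p2": "browser reviews", "p3": "cooking tips"}
out_links = {"p1": ["p3"], "p4": ["p2"], "p5": ["p3"]}
root = root_set(pages, "browser")
base = base_set(root, out_links, invert(out_links))
```

Here p3 enters the base set because a root page points to it, and p4 because it points to a root page; p5 stays out because it only touches p3.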
61 Assembling the base set Root set typically nodes. Base set may have up to 5000 nodes. How do you find the base set nodes? Follow out-links by parsing root set pages. Get in-links (and out-links) from a connectivity server. (Actually, it suffices to text-index strings of the form href=URL to get in-links to URL.) 121 Distilling hubs and authorities Compute, for each page x in the base set, a hub score h(x) and an authority score a(x). Initialize: for all x, $h(x) \leftarrow 1$; $a(x) \leftarrow 1$; Iteratively update all h(x), a(x); After iterations, output the pages with the highest h() scores as top hubs and those with the highest a() scores as top authorities. 122
62 Iterative update Repeat the following updates, for all x: $h(x) \leftarrow \sum_{x \to y} a(y)$ and $a(x) \leftarrow \sum_{y \to x} h(y)$ 123 Matrix representation Let the base set of pages be {1, 2, ..., n} Adjacency matrix B ($n \times n$): B[i, j] = 1 if page i contains a link pointing to page j Let $h = \langle h_1, h_2, \dots, h_n \rangle$ be the vector of hub scores and $a = \langle a_1, a_2, \dots, a_n \rangle$ the vector of authority scores (the counterpart of the r vector) 124
63 Matrix representation The update rules: Initially: $h = B a$ and $a = B^T h$ 1st iteration: $h = B B^T h = (B B^T) h$ and $a = B^T B a = (B^T B) a$ 2nd iteration: $h = (B B^T)^2 h$ and $a = (B^T B)^2 a$ Convergence to the eigenvectors of $B B^T$ and $B^T B$, if the scores are normalized initially 125 Matrix representation [Figure: example with pages Netscape (n), M'soft (m), and Amazon (a), showing the matrices $B$, $B^T$, and $B B^T$ and the iterates of $h = B B^T h$]
64 Scaling To prevent the h() and a() values from getting too big, we can scale down after each iteration. The scaling factor doesn't really matter: we only care about the relative values of the scores. 127 Proof of convergence $n \times n$ adjacency matrix A: each of the n pages in the base set has a row and a column in the matrix. Entry $A_{ij} = 1$ if page i links to page j, else 0.
65 Hub/authority vectors View the hub scores h() and the authority scores a() as vectors with n components. Recall the iterative updates: $h(x) \leftarrow \sum_{x \to y} a(y)$ and $a(x) \leftarrow \sum_{y \to x} h(y)$ 129 Rewrite in matrix form $h = Aa$ and $a = A^T h$, where $A^T$ is the transpose of A. Substituting, $h = AA^T h$ and $a = A^T A a$. Thus, h is an eigenvector of $AA^T$ and a is an eigenvector of $A^T A$. Further, our algorithm is a particular, known algorithm for computing eigenvectors: the power iteration method. Guaranteed to converge. 130
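The hub and authority updates can be sketched on a toy graph. The link structure below (Alice and Bob as hub pages pointing to AT&T, Sprint and MCI) is a hypothetical stand-in for the long-distance telephone example, and scaling by the maximum score is just one of the possible scaling choices.

```python
def scale(scores):
    """Divide by the largest score so values stay bounded."""
    top = max(scores.values()) or 1.0
    return {page: value / top for page, value in scores.items()}

def hits(out_links, iterations=5):
    """Iterate h(x) <- sum of a(y) for x -> y, a(x) <- sum of h(y) for y -> x."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    h = {p: 1.0 for p in pages}
    a = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Hub update: add up the authority scores of the pages x points to.
        h = {x: sum(a[y] for y in out_links.get(x, ())) for x in pages}
        # Authority update: add up the hub scores of the pages pointing to x.
        a = {x: 0.0 for x in pages}
        for y, targets in out_links.items():
            for x in targets:
                a[x] += h[y]
        h, a = scale(h), scale(a)
    return h, a

# Hypothetical base set: two hub pages pointing at three company pages.
links = {
    "Alice": ["AT&T", "Sprint"],
    "Bob": ["AT&T", "Sprint", "MCI"],
}
hub, auth = hits(links)
```

Five iterations follow the slide's rule of thumb; because only relative values matter, any positive scaling factor gives the same ranking.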
66 How many iterations?
Claim: the relative values of the scores will converge after a few iterations: in fact, suitably scaled, the h() and a() scores settle into a steady state!
We only require the relative order of the h() and a() scores, not their absolute values.
In practice, ~5 iterations get you close to stability.
131
Japan Elementary Schools
[Two-column example of Hubs and Authorities returned for this query; the column assignment and the Japanese-language page titles are not recoverable. Legible entries: schools; The American School in Japan; LINK Page-13; The Link Page; Kids' Space; 100 Schools Home Pages (English); K-12 from Japan; KEIMEI GAKUEN Home Page (Japanese); Shiranuma Home Page; fuzoku-es.fukui-u.ac.jp; Koulutus ja oppilaitokset; welcome to Miasa E&J school; TOYODA HOMEPAGE; Education; Cay's Homepage (Japanese); fukui haruyama-es HomePage; UNIVERSITY; Torisu primary school; DRAGON97-TOP; goo; Yakumo Elementary, Hokkaido, Japan; FUZOKU Home Page; Kamishibun Elementary School.]
67 Things to note
Pulled together good pages regardless of the language of the page content.
Use only link analysis after the base set is assembled: iterative scoring is query-independent.
Iterative computation after text index retrieval: significant overhead.
133
Issues
Topic drift
Off-topic pages can cause off-topic authorities to be returned.
E.g., the neighborhood graph can be about a "super topic".
Mutually reinforcing affiliates
Affiliated pages/sites can boost each other's scores.
Linkage between affiliated pages is not a useful signal.
134
68 Resources IIR Chap xhtml/index.html mccurley.html 135