The International Symposium on Web Algorithms (iSWAG) is dedicated to all academic and industrial researchers working on algorithmic problems related to the web. The aim of iSWAG 2015 is to cover as completely as possible the field of research on algorithms for solving web related problems.
Count-Min Sketch  is a widely adopted algorithm for approximate event counting in large scale processing. However, the original version of the Count-Min-Sketch (CMS) suffers of some deficiences, especially if one is interested in the low-frequency items, such as in textmining related tasks. Several variants of CMS  have been proposed to compensate for the high relative error for low-frequency events, but the proposed solutions tend to correct the errors instead of preventing them. In this paper, we propose the Count-Min-Log sketch, which uses logarithm-based, approximate counters [7, 4] instead of linear counters to improve the average relative error of CMS at constant memory footprint.paper
Users struggle with keyword based search engines like Google or Bing because queries can have multiple interpretations and search engines fail to understand the context in which the user is looking for information. This failure leads to search results that are either inappropriate or contextually irrelevant. In this paper we describe algorithms which, utilizing information about the user’s context, scrape the web and process/filter candidate sites that could be used to create a customized context specific search engine for the user. We used the algorithm to create Medtree, a customized medical search engine for doctors at Stanford Hosptial. In evaluations we demonstrate Medtree’s superiority to Google for medical queries.paper
Web site internal mesh optimization is at the very heart of search engine optimization. One prominent way to get the best web search engine visibility for your site is to build the adequate internal linking to promote your naturally popular pages. The definition of popular might be the transformation rate for e-commerce, the traffic from logs for a common site or even rather a specific semantic where you want to be visible. We here propose an algorithm to automatically compute the optimal internal mesh for your web site. The idea behind being simple : the higher you value your URL, the more in rank you want to give it. We tackle the challenges met both at a theory and software implementation level. We’ll more specifically deal with big data issues for an e-commerce web site.paper
We have made experiments with reordering algorithms on sparse very large graphs (VLGs), we have considered only undirected, unweighted, sparse and huge graphs, i.e. G = (V, E) with n = |V | from million to billion of nodes and with |E| = O(|V |). The problem of reordering a matrix to enhance the computation time (and sometimes the memory) is traditional in numerical algorithms but we focus on this short paper on results obtained for the approximate computation of the diameter of a sparse VLG (with some graphs on various different computers). The problem of reordering a graph has already been pointed, explicitly or implicitly by a lot of people, from the numerical community but also from the graph community, like the authors of the Louvain algorithm when they write that choosing an order is thus worth studying since it could accelerate the Louvain algorithm. Our experimental results show clearly that it can be worth (and simple) to preprocess a sparse VLG with a reordering algorithm.paper
Massive Open Online Courses (MOOC) appeared fairly recently in distance education. Unlike traditional elearning classes, they are intended for an unlimited number of participants and successful ones can gather hundreds of thousands of participants. Basically they consist on a video class per week that the learners can watch when it is most fitting for them. It is a very useful tool for learners who wish to learn at their own pace. One of the biggest problems facing MOOCs is the high drop ratio. Indeed even for the very successful ones it is common to see only 10% of the participants following the class until the end. As shown in previous studies on online learning, collaboration between the participants is an effective way to solve this problem. This paper introduces a group partitioning scheme based on subjective logic operators that intends to satisfy the learners by reforming successful alliances and splitting unfulfilling ones.paper