Applying language modeling to session identification from database trace logs
Abstract: A database session is a sequence of requests presented to the database system by a user or an application to achieve a certain task. Session identification is an important step in discovering...
Hierarchical Density-Based Clustering of Categorical Data and a Simplification
Abstract: A challenge involved in applying density-based clustering to categorical datasets is that the ‘cube’ of attribute values has no ordering defined. We propose the HIERDENC algorithm for...
Translation and Rotation Invariant Mining of Frequent Trajectories:...
Abstract: We present a framework for mining frequent trajectories, which are translated and/or rotated with respect to one another. We then discuss a multiresolution methodology, based on the wavelet...
Integration of Genomic, Proteomic and Biomedical Information on the Semantic Web
Abstract: Researchers are faced with the challenge of integrating, on the basis of a common semantic web framework, the information on biological processes resulting from genomic and proteomic...
Promoting Diversity in Top Hits for Biomedical Passage Retrieval
Abstract: With the volume of biomedical literature exploding, such as BMC or PubMed, it is of paramount importance to have scalable passage retrieval systems that allow researchers to quickly find...
Blog Data Mining: The Predictive Power of Sentiments
In this chapter, we study the problem of mining sentiment information from online resources and investigate ways to use such information to predict product sales performance. In particular, we conduct...
Rule Quality Measures Improve the Accuracy of Rule Induction: An Experimental...
Abstract: Rule quality measures can help to determine when to stop generalization or specification of rules in a rule induction system. Rule quality measures can also help to resolve conflicts among...
The Effect of Sequence Complexity on the Construction of Protein-Protein...
Abstract: In this paper, the role of sequence complexity in the construction of important nodes in protein-protein interaction (PPI) networks is investigated. We use two complexity measures, linguistic...
Detecting Web Crawlers from Web Server Access Logs with Data Mining Classifiers
Abstract: In this study, we introduce two novel features: the consecutive sequential request ratio and standard deviation of page request depth, for improving the accuracy of malicious and non-malicious...
Towards Automatic Acquisition of a Fully Sense Tagged Corpus for Persian
Abstract: Sense tagged corpora play a crucial role in Natural Language Processing, particularly in Word Sense Disambiguation and Natural Language Understanding. Since semantic annotations are usually...
Cross-Lingual Word Sense Disambiguation for Languages with Scarce Resources
Abstract: Word Sense Disambiguation has long been a central problem in computational linguistics. Word Sense Disambiguation is the ability to identify the meaning of words in context in a computational...
Finding best evidence for evidence-based best practice recommendations in...
Abstract: A major problem for Canadian health organizations is finding best evidence for evidence-based best practice recommendations. Medications are not always effectively used and misuse may harm...
Efficient Bi-objective Team Formation in Social Networks
Abstract: We tackle the problem of finding a team of experts from a social network to complete a project that requires a set of skills. The social network is modeled as a graph. A node in the graph...
Riding the tide of sentiment change: sentiment analysis with evolving online...
Abstract: The last decade has seen a rapid growth in the volume of online reviews. A great deal of research has been done in the area of opinion mining, aiming at analyzing the sentiments expressed in...
Topic Modeling Using Collapsed Typed Dependency Relations
Abstract: Topic modeling is a powerful tool to uncover hidden thematic structures of documents. Many conventional topic models represent documents as a bag-of-words, where the important linguistic...
Finding top-$k$ $r$-cliques for keyword search from graphs in...
Abstract: Keyword search over structured data offers an alternative method to explore and query databases for users that are not familiar with the structure of the data and/or a query language....