April 28, 2012
Disney's Paradise Pier Hotel
to be held in conjunction with SDM 2012
History of the Workshop
This is the tenth in the series of Text Mining workshops held in conjunction with SDM. Previous workshops took place in 2001, 2002, 2003, 2006, 2007, 2008, 2009, and 2010. At the most recent workshop (2011) in Mesa, AZ, 25 authors representing industry, academia and national research laboratories from 3 different countries submitted a total of 9 papers. After careful review, 7 papers were selected for publication and presentation. In addition, SAS and the Center for Intelligent Systems and Machine Learning (CISML) at UT-Knoxville sponsored the workshop and provided funds for student travel expenses.
The proliferation of digital computing devices and their use in
communication has resulted in an increased demand for systems and
algorithms capable of mining textual data. Thus, the development of
techniques for mining unstructured, semi-structured, and fully
structured textual data has become quite important in both academia and
industry. As a result, this Workshop tracks new developments in the field of
Text Mining - the application of techniques of machine learning in
conjunction with natural language processing, information extraction
and algebraic/mathematical approaches to computational information
retrieval. Issues addressed range from
the development of new learning approaches to the parallelization of
existing algorithms. The goal of this workshop is to provide a venue
for researchers to share initial approaches and preliminary results of
recent research in Text Mining. Through the careful selection and
review of submitted workshop papers, we hope to provide a suitable
selection of topics that will both generate interest and provide
insight into the state of the field of Text Mining.
Special Topics - Text Mining with Email (Enron), Blogs, Tweets, and VAST 2008/2009/2010/2011 Contest Data
Because of the continued interest generated from the availability of the Enron
data set of 1.3 million email messages
and its versatility in terms of potential research topics (link
analysis, pattern matching),
researchers are encouraged to submit papers on this data set to this workshop. In addition,
the text-based datasets of news events and scenario definition used in the
IEEE Symposium on Visual Analytics Science and Technology (VAST)
2008 and 2009 Contests are an interesting corpus for research in topic
detection/tracking, role playing, and scenario analysis (see the
VAST 2008 and VAST 2009
contests for more details on those datasets). Text classification and
clustering models for social media repositories such as Twitter and
Facebook are also encouraged.
Other Specific Topics of Interest Include:
Algorithms and Models
Workshop attendees are required to register for SDM 2012 so that no separate registration is
needed for this workshop.
A one-day registration for the conference is available. Workshop attendees do not have to register at the complete conference rate.
To submit a paper, upload your paper in PDF format. Papers should be printable
on 8.5 × 11 inch paper only and be roughly 10 pages in length, using an
11pt font in two-column format with 1-inch margins. SIAM LaTeX macros
(soda2e.all) are available for download and can be used to format
your two-column paper.
Abstracts and manuscripts are uploaded through the MyReview system.
Note: You must create a MyReview account for uploading your files. In the Authors section you will find the instructions:
1. Use the abstract submission interface to provide the main information on your paper. You will be given an id/password which must later be used to access the system during the following steps, so save the login information message that you will receive from the system.
2. Once an abstract has been submitted, you can upload your paper.
To guarantee consideration, manuscripts must be received by January 13, 2012. Submission of work in progress is also encouraged.
January 13, 2012 Deadline passed.
February 3, 2012 Deadline passed.
Camera ready (final papers) due to workshop:
February 10, 2012 Deadline passed.
Title of Presentation:
Tapping Social Media for Sentiments with Live Customer Intelligence (LCI)
The explosion of Web opinion data brought about by Web 2.0 and its increasingly popular social sites like Twitter, Facebook, blogs and review sites has created a need for automatic tools to analyze and understand sentiments toward different topics. This has fueled the emerging field known as sentiment analysis, whose goal is to translate the vagaries of human emotion into hard data. Live Customer Intelligence (LCI) is a system that taps into what is being said to understand the sentiment, with the particular ability of doing so in near real-time. LCI integrates novel algorithms for sentiment analysis and a configurable dashboard with different kinds of charts, including dynamic ones that change as new data is ingested. LCI has been researched and prototyped at HP Labs in close interaction with business divisions and a few selected customers. In this talk I give an overview of LCI, focusing in particular on challenging issues and illustrating its capabilities with selected use cases.
Dr. Malu Castellanos is a senior researcher in the Information Analytics Lab at Hewlett-Packard Laboratories in Palo Alto, CA, USA. Since 1998 she has been applying data management and data analytics technologies to develop novel solutions to different kinds of business-related problems. She received a B.S. in Computer Engineering at the National University of Mexico and a Ph.D. in Computer Science from the Polytechnic University of Catalunya. Prior to joining Hewlett-Packard she was on the faculty at the Information Systems Department of the Polytechnic University of Catalunya. She has more than 60 publications in international conferences, journals and book chapters and has served on numerous program committees and journal review boards. She has participated in the organization of prestigious international conferences and workshops in different areas of data management, occupying different chairing roles including General Chair for ICDE 2008. Her current interests are new technologies or methods to gain insights from big data, real-time business intelligence, text analytics, automatic database tuning, business process intelligence and data interoperability related technologies. She is a member of the Executive Committee for the IEEE Technical Committee on Data Engineering (TCDE).
Michael W. Berry, University of Tennessee
and Jacob Kogan, University of Maryland, Baltimore County
Loulwah AlSumait, Kuwait University
Brett Bader, DigitalGlobe
Malu Castellanos, Hewlett-Packard Laboratories
Efstratios Gallopoulos, University of Patras, Greece
Wilfried Gansterer, University of Vienna
Efim Gendler, iboogie.tv
April Kontostathis, Ursinus College
Choudur Lakshminarayan, Hewlett-Packard Laboratories
Alan Ratner, Northrop Grumman
Andrea Tagarelli, University of Calabria, Italy
Dvora Toledano-Kitai, Ort Braude College, Israel
Judith Vogel, Stockton College
Zeev Volkovich, Ort Braude College, Israel
Michael W. Berry
Department of Electrical Engineering & Computer Science
Min H. Kao Building, Suite 401
1520 Middle Drive
University of Tennessee
Knoxville, TN 37996
Phone: (865) 974-3838
Fax: (865) 974-4404
berry AT eecs DOT utk DOT edu
Jacob Kogan
Department of Mathematics and Statistics
University of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066
kogan AT math DOT umbc DOT edu