Hilton Phoenix East/Mesa
April 30, 2011
To be held in conjunction with SDM 2011
History of the Workshop
This is the ninth in the series of Text Mining workshops held in conjunction with SDM. Previous workshops took place in 2001, 2002, 2003, 2006, 2007, 2008, 2009, and 2010. At the most recent workshop (2010) in Columbus, OH, 39 authors representing industry, academia, and national research laboratories from 4 different countries submitted a total of 14 papers. After careful review, 10 papers were selected for publication and presentation. In addition, SAS, Catalyst Repository Systems, Inc., and Small Bear Consulting, LLC sponsored the workshop and provided funds for student travel expenses.
The proliferation of digital computing devices and their use in
communication has resulted in an increased demand for systems and
algorithms capable of mining textual data. Thus, the development of
techniques for mining unstructured, semi-structured, and fully
structured textual data has become quite important in both academia and
industry. As a result, this Workshop tracks new developments in the field of
Text Mining - the application of techniques of machine learning in
conjunction with natural language processing, information extraction
and algebraic/mathematical approaches to computational information
retrieval. Issues addressed range from
the development of new learning approaches to the parallelization of
existing algorithms. The goal of this workshop is to provide a venue
for researchers to share initial approaches and preliminary results of
recent research in Text Mining. Through the careful selection and
review of submitted workshop papers, we hope to provide a suitable
selection of topics that will both generate interest and provide
insight into the state of the field of Text Mining.
Special Topics - Text Mining with the Enron Data Set and VAST 2008/2009/2010 Contest Data
Because of the continued interest generated by the availability of the Enron data set of 1.3 million email messages (see Enron Email Dataset) and its versatility in terms of potential research topics (link analysis, pattern matching), researchers are encouraged to submit papers that use it to this workshop. In addition, the text-based datasets of news events and scenario definition used in the IEEE Symposium on Visual Analytics Science and Technology (VAST) 2008 and 2009 Contests are interesting corpora for research in topic detection/tracking, role playing, and scenario analysis (see the VAST 2008, VAST 2009, and VAST 2010 contests for more details on those datasets).
Other Specific Topics of Interest Include:
Workshop attendees are required to register for SDM 2011, so no separate
registration is needed for this workshop.
A one-day registration for the conference is available. Workshop attendees do not have to register at the complete conference rate.
To submit a paper, upload it in PDF format. Papers should be printable
on 8.5 × 11 inch paper and be roughly 10 pages in length,
using an 11pt font in two-column format with 1 inch margins. Abstracts and manuscripts are uploaded through the MyReview system.
Note: You must create a MyReview account for uploading your files. In the Authors section you will find the instructions:
1. Use the abstract submission interface to provide the main information
on your paper. You will be given an id/password which must later be used
to access the system during the following steps, so save the login information message that you will receive from the system.
2. Once an abstract has been submitted, you can upload your paper.
To guarantee consideration, manuscripts must be received by January 14, 2011. Submission of work in progress is also encouraged.
Manuscripts due: January 14, 2011. Deadline has passed.
February 4, 2011. Deadline has passed.
Camera ready (final papers) due to workshop: February 11, 2011. Deadline has passed.
Title of Presentation:
Visualization for Text Analysis
Human-generated, typically unstructured, content has exploded due to the popularity and availability of Internet-based technologies: social networking sites, blogs and microblogs, shared documents, and ever-increasing reliance on email, instant messaging, and SMS. These represent treasure troves of potentially useful data to reveal patterns and trends, to understand how people foster relationships and how they use language to communicate. Text mining promises to expose interesting characteristics, hidden patterns and key relationships within these textual corpora. In order to make the results of text mining and analytics understandable to humans, these algorithms would benefit from integration with information visualization techniques. This talk will address some of the research and practice in information visualization that may be useful in interactions with textual data sets, as well as in exploring the results of text mining algorithms that expose important concepts and relationships.
Biography of John R. Goodall: John R. Goodall is a Chief Scientist with the Information Dominance Research team in the Computational Sciences and Engineering Division at Oak Ridge National Laboratory. His research experience and interests include: visual analytics, information visualization, human-computer interaction, computer network defense, and computer-supported cooperative work; he is particularly interested in the intersection between these areas. His work has included research into the work practice and collaborative work flows among Computer Network Defense analysts and the design of systems to facilitate the exploration and knowledge building activities inherent in that domain. He has a Ph.D. and M.S. in Information Systems from University of Maryland, Baltimore County.
Michael W. Berry, University of Tennessee
and Jacob Kogan, University of Maryland, Baltimore County
Loulwah AlSumait, Kuwait University
Brett Bader, Sandia National Laboratories
Murray Browne, Turner Broadcasting Systems, Inc.
Malu Castellanos, Hewlett-Packard Laboratories
Carlotta Domeniconi, George Mason University
Efstratios Gallopoulos, University of Patras, Greece
Wilfried Gansterer, University of Vienna
Efim Gendler, iboogie.tv
Peg Howland, Utah State University
Eric Jiang, University of San Diego
Mei Kobayashi, IBM Research - Tokyo
April Kontostathis, Ursinus College
Choudur Lakshminarayan, Hewlett-Packard Laboratories
Doug Oard, University of Maryland
Padma Raghavan, Penn State University
Alan Ratner, Northrop Grumman
Andrea Tagarelli, University of Calabria, Italy
Dvora Toledano-Kitai, Ort Braude College, Israel
Judith Vogel, Stockton College
Zeev Volkovich, Ort Braude College, Israel
Michael W. Berry
Department of Electrical Engineering & Computer Science
203 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-3838
Fax: (865) 974-4404
berry AT eecs DOT utk DOT edu
Jacob Kogan
Department of Mathematics and Statistics
University of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066
kogan AT math DOT umbc DOT edu