The Columbus,
A Renaissance Hotel
Columbus, OH

May 1, 2010

to be held in conjunction with

Tenth SIAM International Conference on Data Mining (SDM 2010)




History of the Workshop

This is the eighth in the series of Text Mining workshops held in conjunction with SDM. Previous workshops took place in 2001, 2002, 2003, 2006, 2007, 2008, and 2009. At the most recent workshop (2009) in Sparks, NV, 41 authors representing industry, academia, and national research laboratories from 6 different countries submitted a total of 15 papers. After careful review, 8 papers were selected for publication and presentation. In addition, SAS and Small Bear Consulting, LLC sponsored the workshop and provided funds for student travel expenses.

General Topics

The proliferation of digital computing devices and their use in communication has resulted in an increased demand for systems and algorithms capable of mining textual data. Thus, the development of techniques for mining unstructured, semi-structured, and fully structured textual data has become important in both academia and industry. As a result, this workshop will survey the emerging field of Text Mining: the application of machine learning techniques in conjunction with natural language processing, information extraction, and algebraic/mathematical approaches to computational information retrieval. Many issues are being addressed in this field, ranging from the development of new learning approaches to the parallelization of existing algorithms. The goal of this workshop is to provide a venue for researchers to share initial approaches and preliminary results of recent research in Text Mining. Through the careful selection and review of submitted workshop papers, we hope to provide a suitable selection of topics that will both generate interest and provide insight into the state of the field of Text Mining.

Special Topics - Text Mining with the Enron Data Set and VAST 2008/2009 Contest Data

Because of the continued interest generated by the availability of the Enron data set of 1.3 million email messages (see Enron Email Dataset) and its versatility in terms of potential research topics (link analysis, pattern matching), researchers are encouraged to submit papers that use this data set to this workshop. In addition, the text-based datasets of news events and scenario definition used in the IEEE Symposium on Visual Analytics Science and Technology (VAST) 2008 and 2009 Contests are an interesting corpus for research in topic detection/tracking, role playing, and scenario analysis (see VAST 2008 or VAST 2009 contests for more details on those datasets).

Other Specific Topics of Interest Include:

    Algorithms and Models

  • Bayesian Models
  • Concept Decomposition
  • Orthogonal Decomposition
  • Probabilistic Models
  • Vector Space Models
  • Latent Semantic Indexing
  • Graph-based Models
  • Text Streaming Models
    Applications
  • Clustering
  • Factor Analysis
  • Visualization Techniques
  • Metadata Generation
  • Information Extraction
  • Text Classification
  • Text Purification
  • Text Segmentation
  • Text Summarization
  • Query Structures
  • Trend Detection
  • Distributed Storage and Retrieval

Registration

Attendees are required to register for SDM 2010; no separate registration is needed for this workshop.
A one-day registration for the conference is available. Workshop attendees do not have to register at the complete conference rate.
Click here for more details.


Submission Requirements

To submit a paper, upload your paper in PDF format by accessing the review system via http://www.cs.utk.edu/TextMiningPapers. Papers should be printable on 8.5 × 11 inch paper and be roughly 10 pages in length, using an 11pt font in two-column format with 1-inch margins.

In the Authors section you will find the instructions:

1. Use the abstract submission interface to provide the main information on your paper. You will be given an id/password which must later be used to access the system during the following steps, so save the login information message that you will receive from the system.

2. Once an abstract has been submitted, you can upload your paper.

To guarantee consideration, manuscripts must be received by January 15, 2010. Submission of work in progress is also encouraged.


Important Dates

Papers Due: January 15, 2010

Notifications sent: February 5, 2010

Camera-ready final papers due: February 12, 2010


Keynote speaker: TBD

Sponsors: TBD


Program Committee (as of 10/02/09)

Co-Chairs: Michael W. Berry, University of Tennessee and Jacob Kogan, University of Maryland, Baltimore County

Loulwah AlSumait, Kuwait University
Murray Browne, Turner Broadcasting Systems, Inc.
Malu Castellanos, Hewlett-Packard Laboratories
Carlotta Domeniconi, George Mason University
Kyle Gallivan, Florida State University
Efstratios Gallopoulos, University of Patras, Greece
Wilfried Gansterer, University of Vienna
Efim Gendler, iboogie.tv
Peg Howland, Utah State University
April Kontostathis, Ursinus College
Choudur Lakshminarayan, Hewlett-Packard Laboratories
Bill Pottenger, DIMACS, Rutgers
Padma Raghavan, Penn State University
Andrea Tagarelli, University of Calabria, Italy
Judith Vogel, Stockton College
Zeev Volkovich, Ort Braude College, Israel
Yu Xia, INRIA Grenoble, France


Organizational Committee

Co-Chairs:
Michael W. Berry
Department of Electrical Engineering & Computer Science
203 Claxton Complex
University of Tennessee
Knoxville, TN 37996-3450
Phone: (865) 974-3838
Fax: (865) 974-4404
berry AT eecs DOT utk DOT edu

Jacob Kogan
Department of Mathematics and Statistics
University of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066
kogan AT math DOT umbc DOT edu


Last modified on October 1, 2009