KNIME is Gartner's Cool Vendor 2010

KNIME (Konstanz Information Miner)

Every year Gartner selects its “Cool Vendors”: small vendors that offer new, innovative products or services. This year’s star is the data mining software KNIME (short for Konstanz Information Miner), a comprehensive open-source platform for data integration, processing, analysis, and exploration.

KNIME has been selected in the key technology areas Analytics, Business Intelligence, and Performance Management.

Gartner considers “Cool Vendors” to have the potential to significantly influence a market.

Text 2.0: the text that knows it is read

DFKI GmbH

The “Text 2.0” project offers a framework that tracks the movement of a reader’s eyes and optimizes the presentation of the text being read. It will even be possible to integrate these features into websites: all that is needed is eye-tracking hardware, which may become as common in the future as webcams already are.

Continue reading Text 2.0: the text that knows it is read

Easy Data Mining by drag’n’drop: FastStats Modelling

FastStats (c) Apteco

Apteco offers FastStats, a software tool for selecting addresses for marketing campaigns. The .NET tool comes with multiple plugins for specific tasks.
It works well with text files, which it sorts and links. A major goal for the developers seems to have been an easy drag’n’drop environment for users, and they have done this quite well: selecting and handling data is simple and efficient. The most important part for us is the “FastStats Modelling” module, since this is where the data mining takes place.

Continue reading Easy Data Mining by drag’n’drop: FastStats Modelling

Oracle Cloud Mining at Amazon AWS

Amazon AWS

Cloud Mining with Oracle’s ODM (Oracle Data Mining) has been available on the Amazon cloud since the end of February 2010, as shown on Oracle’s website. A pre-installed Oracle 11gR2 database and sample datasets are ready to use. Using the Oracle 11gR2 Data Mining Amazon Machine Image (AMI), users can now launch an Oracle Cloud Mining enabled instance directly through Amazon Web Services (AWS). Standard Amazon EC2 charges apply.

With that, Oracle has definitely won the race to the cloud against the data mining experts from SAS.

Data Applied's Cloud Mining with new functions

Data Applied.com

Data Applied, one of the world’s first Cloud Mining providers, adds new capabilities to its data mining and data visualization suite. The new data transformation feature complements the company’s existing data visualization, data mining and reporting features.

Using a step-by-step wizard, users can define transformation steps allowing them to process rows of data and create new data sets. Metadata transformation steps include creating, renaming, converting, and deleting fields. Row transformation steps include filtering, sampling, ranking, and scrambling rows. Fields can also be set to calculated values by referencing other fields or by invoking built-in mathematical, statistical, or text functions.
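To make the kinds of steps described above more concrete, here is a minimal Python sketch of a few of them (renaming a field, filtering, sampling, ranking, and a calculated field). It is not Data Applied’s API: the sample rows, the field names, and every function name are assumptions made up for illustration.

```python
import random

# Hypothetical sample rows; field names are assumptions for illustration only.
rows = [
    {"customer": "A", "revenue": 100.0, "orders": 4},
    {"customer": "B", "revenue": 300.0, "orders": 10},
    {"customer": "C", "revenue": 45.0, "orders": 1},
    {"customer": "D", "revenue": 245.0, "orders": 7},
]

def rename_field(rows, old, new):
    """Metadata step: rename a field in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]

def filter_rows(rows, predicate):
    """Row step: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]

def sample_rows(rows, n, seed=0):
    """Row step: draw n rows at random (seeded so the result is repeatable)."""
    return random.Random(seed).sample(rows, n)

def rank_rows(rows, field):
    """Row step: add a 1-based rank by descending value of the given field."""
    ordered = sorted(rows, key=lambda r: r[field], reverse=True)
    return [{**r, "rank": i} for i, r in enumerate(ordered, start=1)]

def add_calculated(rows, name, fn):
    """Set a new field to a value calculated from the row's other fields."""
    return [{**r, name: fn(r)} for r in rows]

# Chain the steps the way a wizard might apply them one after another.
step1 = rename_field(rows, "revenue", "turnover")
step2 = filter_rows(step1, lambda r: r["orders"] >= 4)
step3 = add_calculated(step2, "avg_order", lambda r: r["turnover"] / r["orders"])
result = rank_rows(step3, "avg_order")
```

Each step takes a data set and returns a new one, which mirrors the idea of processing rows to create new data sets rather than changing the source data in place.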

In addition, the company announced other features including geo-mapping and view sharing. The new geo-mapping feature allows widgets such as pie charts or bar charts to be mapped to geographical locations, while the new view sharing feature allows users to securely share and embed visualizations in any web page. For more information, visit www.data-applied.com.

RapidMiner from Rapid-I at CeBIT 2010

RapidMiner is a well-known open-source data mining tool from the company Rapid-I and is used many thousands of times all over the world. At CeBIT I had the opportunity to talk to co-founder Ralf Klinkenberg about his software and to pick up some interesting information, for example whether RapidMiner is ready for Cloud Mining.

RapidMiner at CeBIT 2010

RapidMiner, formerly known as YALE, has been developed since 2001, starting at the University of Dortmund in Germany. Since then it has clearly proved its impressive functionality. I first used it myself in a data mining contest in 2006, quite successfully. It is now hosted on the open-source development platform SourceForge, where development continues. Version 5 is currently available.

Continue reading RapidMiner from Rapid-I at CeBIT 2010

CeBIT 2010: Cloud Mining soon at SAS?

SAS Institute Inc.

At CeBIT 2010 I visited the SAS Institute booth. Once again it was a rich exchange of information, and I finally got my very own SAS mug! My contact forwarded my question, whether SAS is doing anything in the direction of Cloud Mining, to the press department. But then my contact made a quick comment that made me listen attentively: “… but yes, SAS has a Private Cloud on the roadmap”. For this reason they are building a huge data processing center, as my inquiry revealed. I guess they will enhance SAS Enterprise Miner to make it capable of Cloud Mining. My contact was unable or unwilling to say when or how, so I will wait for the answer from the press department. Still, I find it quite interesting that the world leader in data mining is not sleeping through the buzz around the cloud.

SAS Institute booth at CeBIT 2010: Cloud Mining on the roadmap?

Cloud Mining - CRM Data Mining in the Cloud

Cloud Mining for CRM

Cloud Mining is a new approach to applying Data Mining to customer data. This article gives a quick overview of Cloud Mining.

Data Mining is an established technique for analysing data in CRM, marketing and distribution. For example, it helps to optimize customer interaction and reveals customers’ buying potential and churn probability by applying statistical and mathematical methods to large amounts of data. This lets companies target their marketing efforts more precisely: they spend less and achieve better results.

Continue reading Cloud Mining – CRM Data Mining in the Cloud

Genetic Algorithms with T-SQL

Diligent Data Mining Bot!

Data Mining with nothing but SQL is rarely done, but it can actually solve some problems. As you can read in the article “A Genetic Algorithm Sample in T-SQL” by William Talada, it can solve quite an interesting one.

A field of 10×10 squares is half covered with empty cans and bordered by an impassable wall. Now a poor tiny robot is sent to clear the whole field. He can walk in any direction or pick up a can. His view is limited to 5 squares: the one he stands on and the four adjoining squares.

Continue reading Genetic Algorithms with T-SQL
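The setup described above can be sketched in a few lines of Python. This is only an illustration of the field and the robot’s limited view, not Talada’s T-SQL implementation and not the genetic algorithm itself. The constants, the function names, and the choice to place cans on exactly half of the squares are assumptions.

```python
import random

SIZE = 10
EMPTY, CAN, WALL = 0, 1, 2  # assumed encoding of what a square can contain

def make_field(seed=0):
    """Build a 10x10 field with cans on exactly half of the squares (assumption)."""
    rng = random.Random(seed)
    cells = [CAN] * (SIZE * SIZE // 2) + [EMPTY] * (SIZE * SIZE // 2)
    rng.shuffle(cells)
    return [cells[r * SIZE:(r + 1) * SIZE] for r in range(SIZE)]

def look(field, row, col):
    """Content of a square; everything outside the field counts as wall."""
    if 0 <= row < SIZE and 0 <= col < SIZE:
        return field[row][col]
    return WALL

def view(field, row, col):
    """The five squares the robot sees: current, north, south, east, west."""
    return (
        look(field, row, col),
        look(field, row - 1, col),
        look(field, row + 1, col),
        look(field, row, col + 1),
        look(field, row, col - 1),
    )

def pick_up(field, row, col):
    """Remove a can from the robot's square; return True if there was one."""
    if field[row][col] == CAN:
        field[row][col] = EMPTY
        return True
    return False

field = make_field()
corner_view = view(field, 0, 0)  # in the top-left corner, north and west are wall
```

With three possible contents per square there are at most 3^5 = 243 different views, so one plausible way to encode a robot strategy for a genetic algorithm is a lookup table that maps each view to an action.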

Data Mining in the KDD Environment

In 1996 Usama Fayyad proposed a very popular process for making a company’s data useful for business needs. Data Mining is described as one part of this KDD process; in this definition it is quite a small part, but a very important one. After reading this article, you will understand why Data Mining needs pre- and post-processing and simply cannot stand alone against all the data.

Fayyad separated the approach to gaining knowledge from a set of data into individual steps. The steps are distinct because each one uses different tools and has to deliver a different outcome.

KDD process by Fayyad 1996

KDD stands for Knowledge Discovery in Databases. According to Fayyad there are five steps: Selection, Pre-processing, Transformation, Data Mining and Interpretation. These five steps are passed through iteratively. Each step can be seen as a work phase that requires the supervision of a user and can lead to multiple results. The best of these results is used for the next iteration; the others should be documented. In the following, the steps are briefly described.

Continue reading Data Mining in the KDD Environment
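As a rough illustration of how the five steps chain together, here is a minimal Python sketch of a single pass through the process. The sample records, the field names, and the content of each step are assumptions chosen to keep the example small; in particular, the “data mining” step is only a group average standing in for a real algorithm, and the iteration between steps is left out.

```python
from statistics import mean

# Hypothetical customer records; all fields and values are made up.
raw = [
    {"customer": "A", "age": 34, "spend": 120.0, "region": "north"},
    {"customer": "B", "age": None, "spend": 80.0, "region": "south"},
    {"customer": "C", "age": 51, "spend": 300.0, "region": "north"},
    {"customer": "D", "age": 29, "spend": 60.0, "region": "south"},
]

def selection(records):
    """Step 1: keep only the fields relevant to the question."""
    return [{"age": r["age"], "spend": r["spend"]} for r in records]

def preprocessing(records):
    """Step 2: clean the data, here by dropping records with missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def transformation(records):
    """Step 3: derive a feature, here an age band."""
    return [
        {**r, "age_band": "under_40" if r["age"] < 40 else "40_plus"}
        for r in records
    ]

def data_mining(records):
    """Step 4: extract a pattern, here the average spend per age band."""
    groups = {}
    for r in records:
        groups.setdefault(r["age_band"], []).append(r["spend"])
    return {band: mean(values) for band, values in groups.items()}

def interpretation(pattern):
    """Step 5: turn the pattern into a statement a user can evaluate."""
    best = max(pattern, key=pattern.get)
    return f"Highest average spend: {best} ({pattern[best]:.2f})"

# One pass through the process; each step consumes the previous step's result.
data = raw
for step in (selection, preprocessing, transformation, data_mining):
    data = step(data)
report = interpretation(data)
```

Even in this toy version, the mining step only works because the earlier steps removed the incomplete record and created the field it groups by.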