The real Data Mining battle: Watson vs Google
Who is more powerful: IBM's Watson-in-a-box or Google's Cloud Grid? As we saw yesterday, Watson was able to beat two of the best human Jeopardy players.
But how would Google have performed? Assuming that Google did not intentionally tweak its results to do well in the following test, we can check what ranking Google would have given to the Jeopardy questions. The J! Archive is a good source for the questions. All results relating to “jeopardy” itself are excluded from this little test by adding the “-jeopardy” tag to the query.
Continue reading The real data mining battle: Watson vs Google
SAS Institute Inc.
Data Mining experts are a rare species in a company's business habitat, in contrast to the rather common analyst. That is why more and more data mining tools are emerging that try to bypass the experts and put smaller or even more complicated reports into the hands of the business analyst. Big vendor SAS now offers a new product that fits this scheme: the SAS Rapid Predictive Modeler.
SAS stressed that this tool does indeed focus on the business analyst, and says it will enable even subject-matter experts with limited statistical expertise to build reliable and robust data mining models. This should eventually lead to useful and descriptive reports and graphs.
Continue reading Mining data without being an expert
KNIME (Konstanz Information Miner)
Every year Gartner selects the “Cool Vendor”, that is, a cool small vendor that offers new, innovative products or services. This year's starlet is the Data Mining software KNIME (short for Konstanz Information Miner), a comprehensive open-source data integration, processing, analysis, and exploration platform.
KNIME has been selected in the key technology areas Analytics, Business Intelligence, and Performance Management.
Gartner considers “Cool Vendors” to have the potential to significantly influence a market.
The “Text 2.0” project offers a framework that makes it possible to track the movement of a reader's eyes and to optimize the presentation of the text being read. It will even be possible to integrate these features into websites; eye-tracking hardware is all that will be needed (and it may become as commonplace in the future as the webcam already is).
Continue reading Text 2.0: the text that knows it is read
FastStats (c) Apteco
The company Apteco offers FastStats, a software tool for selecting addresses for campaigns. The .NET tool comes with multiple plugins for specific tasks.
It works well with text files, which it sorts and links. Providing users with an easy drag’n’drop environment seems to have been a big deal for the developers, and they did this quite well. Selecting and handling data is simple and efficient. The most important part for us is the module “FastStats Modelling”, since this is where the Data Mining takes place.
Continue reading Easy Data Mining by drag’n’drop: FastStats Modelling
Cloud Mining with Oracle's ODM has been available on the Amazon Cloud since the end of February 2010, as shown on Oracle's website. There is a pre-installed Oracle 11gR2 database with sample datasets ready to use. Using the Oracle 11gR2 Data Mining Amazon Machine Image (AMI), users can now launch an Oracle Cloud Mining enabled instance directly through Amazon Web Services (AWS). The standard Amazon EC2 charges apply.
That definitely lets Oracle win the race to the Cloud against the Data Mining Experts from SAS.
Data Applied, one of the world's first Cloud Mining providers, has added new capabilities to its data mining and data visualization suite. The new data transformation feature complements the company’s existing data visualization, data mining and reporting features.
Using a step-by-step wizard, users can define transformation steps allowing them to process rows of data and create new data sets. Metadata transformation steps include creating, renaming, converting, and deleting fields. Row transformation steps include filtering, sampling, ranking, and scrambling rows. Fields can also be set to calculated values by referencing other fields or by invoking built-in mathematical, statistical, or text functions.
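To make the transformation steps above more concrete, here is a minimal sketch in plain Python. It is not Data Applied's API, which this post does not show; all function names, field names, and sample rows are hypothetical, and "scrambling" is assumed here to mean shuffling the row order.

```python
import random

# Hypothetical sample data; the field names are illustrative only.
rows = [
    {"customer": "A", "revenue": 120.0, "orders": 4},
    {"customer": "B", "revenue": 80.0, "orders": 2},
    {"customer": "C", "revenue": 200.0, "orders": 5},
    {"customer": "D", "revenue": 50.0, "orders": 1},
]

def rename_field(rows, old, new):
    """Metadata step: rename a field in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]

def filter_rows(rows, predicate):
    """Row step: keep only the rows for which predicate(row) is true."""
    return [r for r in rows if predicate(r)]

def sample_rows(rows, n, seed=0):
    """Row step: draw a random sample of at most n rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))

def rank_rows(rows, field, rank_field="rank"):
    """Row step: sort by field (descending) and add a 1-based rank."""
    ordered = sorted(rows, key=lambda r: r[field], reverse=True)
    return [{**r, rank_field: i} for i, r in enumerate(ordered, start=1)]

def scramble_rows(rows, seed=0):
    """Row step: shuffle the row order (assumed meaning of 'scrambling')."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled

def add_calculated_field(rows, name, func):
    """Set a new field to a value calculated from the other fields."""
    return [{**r, name: func(r)} for r in rows]

def run_pipeline(rows):
    """Chain several steps, each producing a new data set."""
    renamed = rename_field(rows, "revenue", "sales")
    filtered = filter_rows(renamed, lambda r: r["orders"] >= 2)
    enriched = add_calculated_field(
        filtered, "sales_per_order", lambda r: r["sales"] / r["orders"]
    )
    return rank_rows(enriched, "sales")

result = run_pipeline(rows)
for row in result:
    print(row)
```

Each step returns a new list instead of changing its input, which mirrors the idea of creating new data sets from existing ones.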
In addition, the company announced other features including geo-mapping and view sharing. The new geo-mapping feature allows widgets such as pie charts or bar charts to be mapped to geographical locations, while the new view sharing feature allows users to securely share and embed visualizations in any web page. For more information, visit www.data-applied.com.
RapidMiner is a well-known open-source Data Mining tool from the company Rapid-I, and is in use many thousands of times all over the world. At CeBIT I had the opportunity to talk to co-founder Ralf Klinkenberg about his software and to get some interesting information, for example on whether RapidMiner is ready for Cloud Mining.
RapidMiner at CeBIT 2010
RapidMiner, formerly known as YALE, was developed at the University of Dortmund in Germany, beginning in 2001. Since then it has definitely proved its impressive functionality; I myself first used it for a Data Mining contest in 2006 (quite successfully). It is now hosted on the open-source development platform SourceForge, where it is also developed further. Version 5 is currently available.
Continue reading RapidMiner from Rapid-I at CeBIT 2010
SAS Institute Inc.
At CeBIT 2010 I visited the booth of the SAS Institute. Once again it was a rich exchange of information, and I finally got my very own SAS mug! My question, whether SAS is doing anything in the direction of Cloud Mining, was forwarded to the press department by the person I spoke with. But then my contact made a quick comment that made me listen attentively: “… but yes, SAS has a Private Cloud on the roadmap”. My inquiry revealed that they are building a huge data processing center for this purpose. I guess they will enhance the SAS Enterprise Miner to make it capable of Cloud Mining. My contact was unable or unwilling to say when or how, so I will wait for the answer from the press department. But I think it is quite interesting that the world leader in Data Mining is not sleeping through the buzz around the Cloud.
SAS Institute booth at CeBIT 2010: Cloud Mining on the roadmap?