PMML 4.0 Released
June 18th, 2009 · Industry, News, Software
The DMG has released a new version of PMML, the open format for representing predictive models. The new version includes support for ensembles, new model types and more built-in functions, to name just a few of the enhancements. For a detailed summary, see the Zementis blog.
→ No Comments
Tags: DMG · PMML · Zementis
Ten Data Mining Mistakes to Avoid
May 15th, 2009 · Tips & Tutorials
Some really good advice here from John Elder in a series of video tutorials on data mining mistakes to avoid. Tip #5, on contaminating the project with future data, is a good one, although sometimes it can be quite tricky (if not impossible) to ‘rewind’ the data! I believe the video series is part of the launch of The Handbook of Statistical Analysis and Data Mining Applications. You can watch part one below or head over to YouTube for the entire series.
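To make the "future data" problem a bit more concrete, here is a minimal Python sketch of one common safeguard: split the data by time and compute anything the model learns from the pre-cutoff records only. The records, the field names (`date`, `spend`), the cutoff date and both helper functions are made up for illustration; none of it comes from Elder's videos.

```python
from datetime import date


def split_by_time(records, cutoff):
    """Split records into (train, test) so training never sees dates on or after cutoff."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def mean_spend(records):
    """Average spend; a stand-in for any statistic a model would learn from data."""
    return sum(r["spend"] for r in records) / len(records)


# Hypothetical customer records, purely illustrative.
records = [
    {"date": date(2009, 1, 5), "spend": 10.0},
    {"date": date(2009, 2, 9), "spend": 20.0},
    {"date": date(2009, 3, 2), "spend": 30.0},
    {"date": date(2009, 4, 20), "spend": 100.0},
]

cutoff = date(2009, 4, 1)
train, test = split_by_time(records, cutoff)

# Contaminated: the April record (the "future") leaks into the statistic.
leaky_mean = mean_spend(records)

# Clean: only records dated before the cutoff are used.
clean_mean = mean_spend(train)

print(leaky_mean, clean_mean)
```

The split itself is the easy part. The tricky cases are fields that were overwritten after the cutoff, which is where the data can't simply be 'rewound'.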
→ No Comments
Tags: john elder · video
RapidMiner to get dual GUIs
May 14th, 2009 · Software
A forum post by Ingo Mierswa of Rapid-I indicates the upcoming RapidMiner v5 will feature two GUIs: the existing tree-based designer and a new graph-based designer! I’m quite excited about this because I’ve personally found the existing UI a bit clunky. Details and screenshots over at the user forum.
→ 1 Comment
Tags: open source · rapid miner
SAS hints at future R integration
February 17th, 2009 · News, Software
In more R news, it appears SAS isn’t as worried about airplane safety as originally thought, and has indicated they will include R support in an upcoming update to the SAS/IML product. For details see NYTimes & Adventures in Consulting.
→ No Comments
Tags: R · SAS
R in the New York Times
January 8th, 2009 · Software
The New York Times has an interesting story on the increasing use of R for data analysis within academia and industry. Several large corporates are cited as having selected R over commercial counterparts such as S and SAS.
Update: For more R news, see also Ajay Ohri’s interview with Dr Graham Williams, the author of Rattle.
→ 3 Comments
Tags: R · S-Plus · SAS
RapidMiner 4.3 Released
November 28th, 2008 · Software, Tips & Tutorials
Rapid-I has released a new and improved version of the open source data mining suite RapidMiner (formerly called YALE). I’ve been evaluating RapidMiner lately as a possible addition to my data mining toolbox. I’ve found the biggest hurdle in learning how to use it is probably the GUI. It is a tree-based GUI, which I find much harder to understand than the graph-style approach used by many others. However, RapidMiner is quite a powerful tool, and the Community Edition is free, so there is probably a lot of benefit in getting used to the strange GUI.
The built-in tutorial is a really good way to get a grasp of the system and I highly recommend spending some time on it if you are interested in learning RapidMiner. I would also recommend the series of RapidMiner video tutorials over at Neural Market Trends.
→ 3 Comments
Tags: lift chart · open source · rapid miner · video
SAS Forum (Australia) presentations available online
September 30th, 2008 · Australia, Industry, Software
The SAS Forum (Australia) was held in Sydney back in August. I was unable to attend but luckily the presentations have been put online. Here are some that I found interesting:
- Make Sure Your Insight is Insightful: Analytical Marketing at NAB by Antony Ugoni (National Australia Bank)
- Model Deployment and Management - The ATO Story by Warwick Graco (Australian Taxation Office)
- Putting Cheques in Place to Identify Fraud by Dr Paul Bracewell (Offlode NZ) and Flavio Palaci (Marsh Australia)
- Customer Value Creation Using Analytics by Arun VS (Satyam)
- Analysing Performance and Tuning your SAS Application by Bill Gibson (SAS)
→ No Comments
Tags: banking · case study · customer analytics · fraud · government · SAS
Welcome to Data Mining, Down Under
September 26th, 2008 · General
I’ve decided to start afresh with a new blog, Data Mining Down Under. While I have blogged previously on my personal site, I thought it would be good to begin again with a new focus and more regular posts. This site will therefore not be about personal posts but rather a journal of my thinking around various data mining topics. Generally speaking, this blog will cover a wide range of data mining topics, from the latest research and development efforts to trends and best practices in industry. Hopefully it makes for some interesting reading!
→ No Comments
Tags: welcome
Data Mining the Financial Markets
April 25th, 2008 · Industry, Tips & Tutorials
Thomas A. Rathburn has written a series of three articles on data mining the financial markets. Rathburn takes a detailed look at the successes and failures of his efforts in the markets, and with 10-year US bonds in particular. You can check it out here: part 1, part 2, and part 3. The articles are also available as a podcast here: 1, 2, 3.
[via KDnuggets]
→ No Comments
Tags: bonds · finance
Experian Bolsters Data With Hitwise Acquisition
May 4th, 2007 · Industry
Tim O’Reilly points to the news that Experian has made a significant move to improve the quality of their online and demographic data with the acquisition of Hitwise for US$240 million. Hitwise collects user traffic from ISPs in several countries including Australia and uses that information to provide companies with insight into their online market share. Although not mentioned in the press release, the Hitwise data will likely be a huge boon for Experian’s marketing services, and will probably allow them to develop more accurate geo-demographic profiles.
