Wednesday, September 16, 2009

My R Research

This is the first note in my research on using R to analyze data sets with a very large number of rows, and on the best way to use it as a data mining tool.

First of all, it's clear that R is not well suited to data sets with many rows when you don't have enough RAM, because it keeps the data it works on in memory (http://datamining.togaware.com/). So the immediate solution is to increase the RAM to 32 GB (ideally on a 64-bit system). But I'm looking for other solutions.
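One family of "other solutions" is to avoid loading every row at once and to compute results while streaming through the file. The following is only a minimal sketch of that idea, not something taken from any of the tools discussed here. It is written in Python to stay self-contained, and the function name `streaming_mean`, the column name `amount`, and the tiny in-memory sample are all hypothetical stand-ins for a real large CSV file.

```python
import csv
import io


def streaming_mean(lines, column):
    """Return the mean of one numeric column, holding only one row in memory at a time."""
    reader = csv.DictReader(lines)  # reads rows lazily from any iterable of text lines
    count = 0
    total = 0.0
    for row in reader:
        total += float(row[column])
        count += 1
    # An input with a header but no data rows has no defined mean.
    return total / count if count else float("nan")


# Hypothetical stand-in for a very large file; a real run would pass an open file object.
sample = io.StringIO("id,amount\n1,10.0\n2,20.0\n3,60.0\n")
result = streaming_mean(sample, "amount")
print(result)
```

Memory use here depends on the size of a single row rather than on the number of rows, which is the property I am after. The limitation is that only statistics that can be accumulated incrementally (counts, sums, means) fit this pattern directly.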

As for a data mining suite, I'm investigating the best available options:
- Weka
- RapidMiner
- KNIME
- Rattle

Maybe I also need to consider standards like PMML (SourceForge). I found an integration for it within Pentaho.

I'm looking for something like WebFOCUS RStat but open source.

1 comment:

  1. The ROOT system can work with very large compressed data files on the fly

    http://root.cern.ch/drupal/content/documentation
