Wednesday, 26 June 2013

IBM SPSS Data Modeler

My Review of IBM SPSS Data Modeler

Usability and interface:

Overall, the software is very easy to use. You can mine data without knowing SQL syntax or writing scripts against tables. It also works easily with Excel and SPSS files, and it is compatible with SAS files.


Tools 

The Auto Classifier and Auto Cluster nodes are very helpful for the lazy data miner who wants to compare the accuracy of three or more models on the same data mart or data set.

There are a lot of models to choose from: CHAID and C5.0 decision trees, Logistic Regression, Cox Regression, Generalized Linear Models and Generalized Linear Mixed Models. There are also nodes suited to the financial sector, such as the Recency, Frequency and Monetary node, otherwise known as the RFM node.
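To make the RFM idea concrete, here is a minimal Python sketch of how recency, frequency and monetary values can be derived from a transaction list. It is not how the RFM node is implemented; the sample transactions, the `rfm_summary` function and the `as_of` date are all made up for illustration.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    (100, date(2013, 6, 1), 50.0),
    (100, date(2013, 6, 20), 25.0),
    (200, date(2013, 3, 15), 300.0),
    (300, date(2013, 1, 5), 10.0),
]


def rfm_summary(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    recency_days: days since the customer's most recent purchase
    frequency:    number of purchases
    monetary:     total amount spent
    """
    summary = {}
    for customer_id, purchased_on, amount in rows:
        days_ago = (as_of - purchased_on).days
        # Start from this purchase if the customer has not been seen yet.
        recency, frequency, monetary = summary.get(customer_id, (days_ago, 0, 0.0))
        summary[customer_id] = (
            min(recency, days_ago),
            frequency + 1,
            monetary + amount,
        )
    return summary


rfm = rfm_summary(transactions, as_of=date(2013, 6, 26))
for customer_id, values in sorted(rfm.items()):
    print(customer_id, values)
```

In practice the three values are usually binned into scores (for example 1 to 5 each) before customers are ranked, but the raw values above are where any such scoring starts.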

It also features the Anomaly node for fraud detection, which is very useful for credit card and loan portfolios and for detecting default risk.
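As a rough illustration of what "anomaly detection" means, the sketch below flags values that sit far from the mean of a list of transaction amounts. This is a much simpler stand-in, not the method the Anomaly node uses; the amounts, the `flag_outliers` function and the threshold of 2 standard deviations are assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical card transaction amounts; the last one is deliberately unusual.
amounts = [42.0, 38.5, 40.0, 41.2, 39.9, 43.1, 40.5, 950.0]


def flag_outliers(values, threshold=2.0):
    """Return the values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]


suspicious = flag_outliers(amounts)
print(suspicious)
```

A single-variable rule like this only catches extreme amounts. Real fraud screening usually looks at many fields at once, which is where a dedicated modeling node earns its keep.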

I still have to see what SAS and SAP have to offer in terms of the range of models available in their predictive analytics software. Some say they are number one in business intelligence, but each package has its strengths and weaknesses depending on the user.

Monday, 24 June 2013

Data Modeling Tips for the Newbie

What tips can we give a newbie on data modeling?

Here is some simple advice:

Create and plan your Data Warehouse, Data Structure and Architecture

  1. Scope and plan your data. Data architecture matters because it lets you plan well ahead of data troubles such as integrity and compatibility issues. Decide what kinds of files you will be dealing with, from flat files to cubes.
  2. Have a background in statistics. Knowing a little about normal curves and testing for normal distributions will help, but you have to keep reading to understand complex models such as Neural Networks and Bayesian Networks.
  3. Know a little bit of scripting. Learning SQL helps with extracting, transforming and loading (ETL) your data. Start with simple select statements such as "select * from table where customerid = 200".
  4. Know your business objectives. Do you want models that give you leverage over your competitors, or do you want customer relationship management?
  5. Lastly, execute your model. Your boss will be happy if you are able to deploy your complex model and show it to the board of trustees.
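
The select statement from tip 3 can be tried out without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` table, its columns other than `customerid`, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical customers table with a few sample rows.
conn.execute(
    "CREATE TABLE customers (customerid INTEGER PRIMARY KEY, name TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(100, "Ana", 1200.0), (200, "Budi", 350.5), (300, "Citra", 80.0)],
)

# Extract only the customer we care about; the ? placeholder keeps the value
# out of the SQL string itself.
rows = conn.execute(
    "SELECT customerid, name, balance FROM customers WHERE customerid = ?",
    (200,),
).fetchall()
print(rows)

conn.close()
```

The same select, filter and load pattern carries over to larger databases. Only the connection line changes.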
Overall, data mining is not as simple as it seems, but at least you can get some pointers from these :)

Happy Data Mining!