SEMMA is an acronym introduced by SAS which stands for:
Sample, Explore, Modify, Model and Assess.
I recently posted about the Data Mining & Knowledge Discovery Process, which has the following sequential steps:
Raw Data => cleaning => sampling => Modeling => Testing
SEMMA follows sequential steps similar to those we saw in the data mining process. So while the data mining process applies to any data mining tool out there, SEMMA helps when you use SAS Enterprise Miner. In fact, it has helped me quickly find the data mining functions available in the SAS tool.
Once in a while I go back to basics to revisit some of the fundamental technology concepts that I’ve learned over the past few years. Today, I want to revisit the Data Mining and Knowledge Discovery Process:
Here are the steps:
1) Raw Data
2) Data Preprocessing (cleaning, sampling, transformation, integration, etc.)
3) Modeling (Building a Data Mining Model)
4) Testing the Model, a.k.a. assessing the Model
5) Knowledge Discovery
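To make the steps above a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data set, the function names (preprocess, build_model, assess) and the choice of a straight-line fit as the "model". A real project would use a proper data mining tool rather than hand-written code like this.

```python
import random
from statistics import mean

def preprocess(raw, sample_size, seed=0):
    """Step 2: drop rows with missing values (cleaning), then sample."""
    cleaned = [row for row in raw if row[0] is not None and row[1] is not None]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    return rng.sample(cleaned, min(sample_size, len(cleaned)))

def build_model(train):
    """Step 3: fit y = a*x + b by ordinary least squares."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

def assess(model, test):
    """Step 4: mean absolute error on rows the model has not seen."""
    a, b = model
    return mean(abs(a * x + b - y) for x, y in test)

# Step 1: raw data (made up for this example), including two incomplete rows
raw = [(x, 2 * x + 1) for x in range(20)] + [(None, 3), (5, None)]

data = preprocess(raw, sample_size=16)
train, test = data[:12], data[12:]
model = build_model(train)
error = assess(model, test)

# Step 5: the "knowledge" is the relationship the model recovered
print(f"y = {model[0]:.2f}*x + {model[1]:.2f}, mean abs error {error:.3f}")
```

Holding back a few rows for the assessment step is the important design choice here: the model is tested on data it was not built from.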
Here is the visualization:
In the world of Data Mining and Knowledge Discovery, we’re looking for a specific type of intelligence in the data: patterns. This is important because patterns tend to repeat, so if we find patterns in our data, we can predict or forecast that similar things may happen in the future.
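As a small illustration of "patterns tend to repeat", here is a sketch that counts which event usually follows which in a history, and then uses those counts as a naive forecast. The event list and the function names (find_patterns, predict_next) are hypothetical, and counting successors is just one simple way to look for a pattern, not a standard data mining algorithm.

```python
from collections import Counter, defaultdict

def find_patterns(events):
    """Count how often each event is immediately followed by each other event."""
    followers = defaultdict(Counter)
    for current, nxt in zip(events, events[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, current):
    """Forecast the most frequent follower seen so far, or None if unseen."""
    if current not in followers:
        return None
    return followers[current].most_common(1)[0][0]

# Made-up history of events, e.g. items bought one after another
history = ["bread", "butter", "milk", "bread", "butter", "eggs", "bread", "butter"]

patterns = find_patterns(history)
print(predict_next(patterns, "bread"))  # "bread" was always followed by "butter"
```

The forecast is only as good as the history: an event that never appeared before gives no prediction at all.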
In this blog post, we saw the Knowledge Discovery and Data Mining process.