Thursday, May 23, 2013

Data Mining the 2013 COMELEC Results

by Albert Anthony D. Gavino

All the trending hype about the 60-30-10 pattern comes down to simple math. From the viewpoint of mathematics and statistics there is no conspiracy in it, although of course there will be variability from precinct to precinct.

Things you should consider before drawing your conclusions:

1. Provide Statistical Power Analysis Estimates like Sample Power

A simple explanation from the Indiana University website says that "statistical power analysis estimates the power of the test to detect a meaningful effect, given sample size, test size (significance level), and standardized effect size." 


Sample Power used for two-sample proportions, two-tailed, with alpha at 0.10

Sample power software is a tool for showing general readers how much statistical confidence can be placed in the COMELEC results, given COMELEC data from each precinct. (No, I won't do that myself since it would be tedious, but a team of individuals could.)
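For readers without the software, the same kind of estimate can be approximated by hand. The sketch below is only an illustration under stated assumptions: it uses the normal approximation for a two-tailed test of two proportions at alpha 0.10, and the function name, the proportions (60% versus 55%) and the precinct sample sizes are hypothetical values, not COMELEC figures or the output of the Sample Power program.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.10):
    """Approximate power of a two-tailed z-test comparing two proportions.

    Normal approximation with an unpooled standard error; a rough
    stand-in for dedicated power software, not a replacement for it.
    """
    std_normal = NormalDist()
    # Critical value for a two-tailed test at the given alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    shift = abs(p1 - p2) / se
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs: a 60% bloc share in one precinct, 55% in another.
    for n in (100, 500, 2000):
        print(n, round(power_two_proportions(0.60, 0.55, n, n), 3))
```

The loop shows how power grows with the number of ballots sampled per precinct, which is the main thing a power analysis would tell a reader about precinct-to-precinct comparisons.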
2. Show the Data sheet


NAME                             PARTY LIST         TOTAL        %
HONTIVEROS, RISA (AKBAYAN)       INDEPENDENT    8,900,861    3.73%
HAGEDORN, ED                     INDEPENDENT    6,876,841    2.88%
VILLANUEVA, BRO.EDDIE (BP)       INDEPENDENT    5,603,663    2.35%
CASIÑO, TEDDY (MKB)              INDEPENDENT    3,491,581    1.46%
DELOS REYES, JC (KPTRAN)         INDEPENDENT      988,795    0.41%
ALCANTARA, SAMSON (SJS)          INDEPENDENT      957,212    0.40%
BELGICA, GRECO (DPP)             INDEPENDENT      898,719    0.38%
PENSON, RICARDO                  INDEPENDENT      825,149    0.35%
DAVID, LITO (KPTRAN)             INDEPENDENT      821,033    0.34%
MONTAÑO, MON                     INDEPENDENT      777,484    0.33%
LLASOS, MARWIL (KPTRAN)          INDEPENDENT      564,291    0.24%
SEÑERES, CHRISTIAN (DPP)         INDEPENDENT      561,041    0.23%
FALCONE, BAL (DPP)               INDEPENDENT      516,863    0.22%
POE, GRACE                       PNOY          16,340,333    6.84%
LEGARDA, LOREN (NPC)             PNOY          14,942,824    6.26%
ESCUDERO, CHIZ                   PNOY          14,137,127    5.92%
CAYETANO, ALAN PETER (NP)        PNOY          14,129,783    5.92%
ANGARA, EDGARDO (LDP)            PNOY          12,853,305    5.38%
AQUINO, BENIGNO BAM (LP)         PNOY          12,376,372    5.18%
PIMENTEL, KOKO (PDP)             PNOY          11,846,088    4.96%
TRILLANES, ANTONIO IV (NP)       PNOY          11,389,173    4.77%
VILLAR, CYNTHIA HANEPBUHAY (NP)  PNOY          11,070,265    4.64%
ENRILE, JUAN PONCE JR. (NPC)     PNOY           9,167,583    3.84%
MAGSAYSAY, RAMON JR. (LP)        PNOY           9,153,842    3.83%
MADRIGAL, JAMBY (LP)             PNOY           5,409,440    2.26%
BINAY, NANCY (UNA)               UNA           13,310,851    5.57%
EJERCITO ESTRADA, JV (UNA)       UNA           11,010,630    4.61%
HONASAN, GRINGO (UNA)            UNA           10,620,981    4.45%
GORDON, DICK (UNA)               UNA           10,160,019    4.25%
ZUBIRI, MIGZ (UNA)               UNA            9,490,215    3.97%
MAGSAYSAY, MITOS (UNA)           UNA            4,484,515    1.88%
MACEDA, MANONG ERNIE (UNA)       UNA            2,746,359    1.15%
COJUANGCO, TINGTING (UNA)        UNA            2,405,682    1.01%
TOTAL                                         238,828,920  100.01%

PARTY LIST                                          TOTAL        %
INDEPENDENT                                    31,783,533      13%
PNOY                                          142,816,135      60%
UNA                                            64,229,252      27%
TOTAL                                         238,828,920     100%

Give or take a small margin, you would probably get the same results by computing this in Excel. Yes, it is close to 60-30-10 once you add up all the votes (60-27-13, going by the totals above). It's just pure math, not based on conspiracy theories.
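The same arithmetic can be reproduced outside Excel. The snippet below is a minimal sketch that takes the three bloc subtotals from the data sheet and divides each by the grand total; the variable and function names are illustrative only.

```python
# Bloc subtotals copied from the data sheet above.
bloc_totals = {
    "INDEPENDENT": 31_783_533,
    "PNOY": 142_816_135,
    "UNA": 64_229_252,
}


def bloc_shares(totals):
    """Return each bloc's fraction of all votes cast."""
    grand_total = sum(totals.values())
    return {bloc: votes / grand_total for bloc, votes in totals.items()}


shares = bloc_shares(bloc_totals)
# Print blocs from largest to smallest share, as percentages.
for bloc, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{bloc:<12}{share:6.1%}")
```

Rounded to whole percentages, the shares come out to 60, 27 and 13, which is where the "60-30-10" impression comes from.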


3. Compare your results with the top 12 Senatoriables

NAME                             PARTY LIST   TOTAL VOTES
POE, GRACE                       PNOY          16,340,333
LEGARDA, LOREN (NPC)             PNOY          14,942,824
ESCUDERO, CHIZ                   PNOY          14,137,127
CAYETANO, ALAN PETER (NP)        PNOY          14,129,783
BINAY, NANCY (UNA)               UNA           13,310,851
ANGARA, EDGARDO (LDP)            PNOY          12,853,305
AQUINO, BENIGNO BAM (LP)         PNOY          12,376,372
PIMENTEL, KOKO (PDP)             PNOY          11,846,088
TRILLANES, ANTONIO IV (NP)       PNOY          11,389,173
VILLAR, CYNTHIA HANEPBUHAY (NP)  PNOY          11,070,265
EJERCITO ESTRADA, JV (UNA)       UNA           11,010,630
HONASAN, GRINGO (UNA)            UNA           10,620,981
PNOY                                          119,085,270  77%
UNA                                            34,942,462  23%
TOTAL                                         154,027,732  100%

The results show that about 77 percent of the votes for the top 12 went to candidates on the PNOY party list (9 of the 12 seats), while only about 23 percent went to the opposition, UNA (3 seats).
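The top-12 comparison can be re-derived directly from the twelve rows listed above. The following is a minimal sketch that counts seats and sums votes per bloc; the list literal simply restates the table, and the variable names are illustrative.

```python
from collections import Counter

# (candidate, bloc, votes) for the twelve leading candidates in the table above.
top12 = [
    ("POE, GRACE", "PNOY", 16_340_333),
    ("LEGARDA, LOREN", "PNOY", 14_942_824),
    ("ESCUDERO, CHIZ", "PNOY", 14_137_127),
    ("CAYETANO, ALAN PETER", "PNOY", 14_129_783),
    ("BINAY, NANCY", "UNA", 13_310_851),
    ("ANGARA, EDGARDO", "PNOY", 12_853_305),
    ("AQUINO, BENIGNO BAM", "PNOY", 12_376_372),
    ("PIMENTEL, KOKO", "PNOY", 11_846_088),
    ("TRILLANES, ANTONIO IV", "PNOY", 11_389_173),
    ("VILLAR, CYNTHIA", "PNOY", 11_070_265),
    ("EJERCITO ESTRADA, JV", "UNA", 11_010_630),
    ("HONASAN, GRINGO", "UNA", 10_620_981),
]

# Seats won per bloc and total votes per bloc.
seats = Counter(bloc for _, bloc, _ in top12)
votes = Counter()
for _, bloc, total in top12:
    votes[bloc] += total

grand_total = sum(votes.values())
for bloc in sorted(votes, key=votes.get, reverse=True):
    print(f"{bloc:<6}{seats[bloc]:>3} seats {votes[bloc]:>13,} {votes[bloc] / grand_total:6.1%}")
```

Counting seats alongside votes is a deliberate choice here: the share of seats and the share of votes are two different measures of how dominant a bloc is.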

4. Plan for future research

Purugganan says that there is much that we can learn from Comelec's election data. "In all this furor, I think political analysts are missing something important: that, while votes for individual senators may be different in different regions, the country may be pretty much coalesced into clear pro- and anti-administration voting blocs. This may actually mean that we do have such a thing as national parties that voters across the country vote for," he said.


Going through these statistics, the party list has become an important variable in predicting who is most likely to land in the senatorial top 12. Other factors also have to be considered, like the impact of positive brand family names such as BINAY and POE.




Wednesday, May 15, 2013

My Non-Linear Career Life

by Albert Anthony D. Gavino, MBA

Boring as it seems, I am fascinated by how we jump from one career to another.

My Non-Linear Career
If you plot your career as a straight line on a graph, that's not really good, because it won't give you many options. The best is to have a non-linear career, entering all sorts of fields like Management, Statistics, Web Analytics, Behavioral Psychology and Law. These fields will widen your horizons toward new types of careers. Five years ago, I did not imagine myself going back into statistics, but if you write down all your domains, they turn out to be somewhat related to each other: web analytics to statistics to web development to research.

What I am saying is that graduates should not limit themselves to the traditional types of careers; sometimes we have to cross-sell ourselves to other industries. Cross over from a manufacturing industry to a banking industry to an academic environment. This will broaden your horizons, your contacts and even your opportunities. Career life is ever changing with the demands of industry. Reading new kinds of books gets you ahead, and so will a master's degree or a doctoral degree, for that matter. Yes, do not be afraid to get into new things like cloud computing, data mining and predictive modeling. Do not fear the unknown; get interested in reading more about new things. As long as we are learning, we improve ourselves and become more open-minded to new ideas, new cultures and new learnings.


My Business Intelligence Framework (4 simple steps on how to get there)

by Albert Anthony D. Gavino

These days, everyone seems to be an expert at business intelligence, but what is business intelligence all about? It's not just about data analytics, some graphs and some more pie charts. Business intelligence involves a meticulous process that starts with crafting a business strategy.

Business Strategy
This business strategy comes from the company's vision and mission statements, aligned with its objectives. There are several ways to attack business strategy: you can use Michael Porter's model, or analyze your competition through market niche or market segmentation.



B.I. framework by Albert Anthony D. Gavino

Performance Management
Performance management is the field that deals with the nitty-gritty stuff, including the dreaded Balanced Scorecard, where every organizational unit is defined by a performance metric such as sales performance, sales quota or number of calls made per day. These metrics are defined by upper management and will in turn drive customer value and shareholder value. This is somewhat taken from the Japanese quality methodologies that we don't want to read about, much like the Six Sigma belts who pursue utmost quality by reducing the number of defects to almost zero.

Data Warehousing
Of course you can't analyze your data if you don't have a good data warehouse. A good data warehouse involves flat files and cubes that are interconnected by a snowflake schema or a starflake schema. These are queried through your SQL Server, or MySQL if you're on a budget, and in some cases you will need stored procedures with scripts and subscripts nested within each other. Talk about five pages of code with your database administrator (good luck with that).
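To make the schema idea concrete, here is a small sketch of a star-style layout: one fact table joined to two dimension tables. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration; the table names, columns and sample rows are all hypothetical, and a real warehouse on SQL Server or MySQL would be far larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star layout: fact_sales in the middle, two dimensions around it.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Loans'), (2, 'Deposits');
INSERT INTO dim_date VALUES (1, '2013-04'), (2, '2013-05');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 80.0);
""")

# Typical warehouse query: join the fact table to its dimensions, then aggregate.
rows = conn.execute("""
SELECT p.category, d.month, SUM(f.amount) AS total
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.month
ORDER BY p.category, d.month
""").fetchall()

for category, month, total in rows:
    print(category, month, total)
```

A snowflake schema would go one step further and split a dimension such as dim_product into further linked tables, at the cost of more joins per query.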

And lastly...

Advanced Statistics
Your company statistician is best consulted about what kind of data you will be handling. No, he doesn't care where you put your string fields; he only has three kinds of variables: nominal, ordinal and scale. For each there is a specific test in elementary statistics, such as the t-test, z-test or chi-square test. Then there is advanced statistics: linear regression, logistic regression, time series analysis, cluster analysis, CHAID, C5 trees and neural networks, which involve complex statistical models computed through business intelligence software such as IBM SPSS Modeler or whichever competing product you want to use.




Wednesday, May 1, 2013

Basic Stat Tools to use for Research


Basic Statistical Chart by Albert Anthony D. Gavino
Basic Decision Tree Chart for Statistical Analysis

How to use:

Step 1: Consider your Independent Variable
Step 2: Indicate the number of experimental conditions
Step 3: Indicate if the groups are related (dependent) or independent
Step 4: Identify whether the variable is Nominal, Ordinal or Interval
Step 5: Use the appropriate Statistical Tool for your Research
Step 6: Get ready to use your data with IBM-SPSS software
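The steps above can be sketched as a simple lookup. Note the assumptions: the chart itself is not reproduced in the text, so the mapping below follows the common textbook pairing of tests to measurement level, number of conditions and group relatedness, and may differ from the original chart. The function name and its parameters are illustrative only.

```python
# Assumed textbook mapping, keyed by
# (measurement level, number of conditions, groups related?).
# The nominal entries for related groups assume a yes/no (binary) outcome.
TEST_CHART = {
    ("nominal", "two", False): "Chi-square test of independence",
    ("nominal", "two", True): "McNemar test",
    ("nominal", "three_or_more", False): "Chi-square test of independence",
    ("nominal", "three_or_more", True): "Cochran's Q test",
    ("ordinal", "two", False): "Mann-Whitney U test",
    ("ordinal", "two", True): "Wilcoxon signed-rank test",
    ("ordinal", "three_or_more", False): "Kruskal-Wallis test",
    ("ordinal", "three_or_more", True): "Friedman test",
    ("interval", "two", False): "Independent-samples t-test",
    ("interval", "two", True): "Paired-samples t-test",
    ("interval", "three_or_more", False): "One-way ANOVA",
    ("interval", "three_or_more", True): "Repeated-measures ANOVA",
}


def choose_test(level, n_conditions, related):
    """Walk Steps 2 to 5: conditions, relatedness, level, then the tool."""
    if n_conditions < 2:
        raise ValueError("need at least two experimental conditions")
    bucket = "two" if n_conditions == 2 else "three_or_more"
    key = (level.lower(), bucket, bool(related))
    if key not in TEST_CHART:
        raise ValueError(f"unknown measurement level: {level!r}")
    return TEST_CHART[key]


if __name__ == "__main__":
    print(choose_test("interval", 2, related=False))
    print(choose_test("ordinal", 3, related=True))
```

A plain dictionary keeps every branch of the decision tree visible in one place, which makes it easy to check against whichever chart you actually follow.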