What is statistical analysis

Statistical Analysis - What is it?

There is more to good statistical analysis than the correct choice and application of methods. It is a complex process that is full of stumbling blocks. These start with the research design of your questionnaire, continue with sampling, and often extend to the graphical representation in SPSS. The following post describes the basics that you should know before you even start calculating averages.

Statistical analysis: data as a basis

To carry out statistical tests and obtain reliable results from them, you need good data. The most beautiful results are useless if they are not meaningful because of poor data quality. Good data quality includes a coherent, complete and up-to-date data set. It is therefore important to make sure that your data is suitable for answering your question. This means that you have to record all relevant variables with the correct precision. An example: You are interested in the height of adolescents between 13 and 17 years of age. It is of no use to not record height at all, to record it only for adults, or to round it to the nearest meter. But there are also pitfalls in the measurement itself. A statistics service can help you track down these pitfalls.

Statistical analysis quality criteria: objectivity, reliability and validity

The so-called quality criteria are intended to ensure that your results are reproducible, correct in terms of content and applicable to the population.

The weakest quality criterion here is objectivity. It is given when measurement results can be reproduced independently of whoever performs the measurement (cf. Diekmann 2007). For our example, this means that it doesn't matter who holds the tape measure up to the adolescents in your study. This may seem trivial here, but think of personal interviews, for example. For statistical procedures, objectivity means that the interviewer must not have any influence on the answers given by the test subjects.

Reliability, which focuses on the measuring instrument itself, goes one step further. Here, too, the goal is reproducibility. In our example, two measuring tapes have to measure the same height. This is relatively easy for physical quantities, but it is more difficult for abstract concepts such as "happiness". So-called reliability analyses are often carried out here, which check the internal consistency of a measuring instrument.
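To make the idea of a reliability analysis more concrete, here is a minimal sketch in Python. It assumes that internal consistency is measured with Cronbach's alpha, a common coefficient that the article itself does not name. The function name cronbach_alpha and the questionnaire answers are hypothetical and serve only as an illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    Each element of `items` holds the scores of all respondents on one
    questionnaire item, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(column) != n for column in items):
        raise ValueError("every item needs one score per respondent")

    # Sum of the sample variances of the individual items
    item_variance_sum = sum(variance(column) for column in items)
    # Sample variance of each respondent's total score across all items
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents to three items (scale 1 to 5)
item_1 = [4, 5, 2, 3, 1]
item_2 = [4, 4, 2, 3, 2]
item_3 = [5, 5, 1, 3, 1]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values close to 1 indicate that the items vary together and plausibly measure the same concept. Which threshold counts as acceptable depends on the field and is not settled by this sketch.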

Validity: It's about the essentials

Finally, there is validity. This is the most logical, but also the most complex quality criterion. Imagine you want to measure the height of the adolescents and place them on a scale. Well-calibrated scales show the same weight, and different interviewers read off the same value. This fulfills objectivity and reliability. The problem: You actually wanted to know the height and not the weight. So validity is about measuring what you want to measure, and this also applies to online surveys. This can be difficult, especially with abstract characteristics.

The external validity even goes one step further: A measurement is only externally valid if it can be extended to the population. Medical studies with an experimental setup and strict choice of subjects often show a high internal validity. Unfortunately, their results cannot always be transferred to all other patients. The external validity is therefore not given.

Samples for statistical analysis

That brings us to the next topic. Full surveys are available as the data basis for very few analyses. They are usually very costly to carry out. Imagine if you had to interview all young people between the ages of 13 and 17!

Therefore you almost always work with samples. Ideally, you can use them to draw conclusions about the population (see external validity). For this to work, your sample must be of a certain quality. Of course, there is also the option of using a data analysis service.

Usually, "sample" means a random sample, in which individual test subjects are drawn from the population. Each person should have the same probability of being selected (cf. Fahrmeir et al., 2012).

There are also different approaches here. In addition to the simple random sample, in which a certain number of subjects is drawn from the population, there are stratified samples. The test persons are first stratified according to characteristics, then a sample is drawn from each stratum. In cluster samples, pre-existing groups, for example school classes or localities, are often surveyed as a whole.
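The three sampling approaches described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the population of 200 adolescents, the field names and the function names are all hypothetical, and the stratified variant draws the same number of people from every stratum, which is just one of several possible allocation rules.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n people, each with the same selection probability."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group people into strata, then draw n_per_stratum from each stratum."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(population, cluster_of, n_clusters, rng):
    """Draw whole pre-existing groups and survey everyone in them."""
    clusters = {}
    for person in population:
        clusters.setdefault(cluster_of(person), []).append(person)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for key in chosen for person in clusters[key]]


# Hypothetical population: 200 adolescents aged 13 to 17 in 10 school classes
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [
    {"id": i, "age": 13 + i % 5, "school_class": i // 20}
    for i in range(200)
]

srs = simple_random_sample(population, 20, rng)
stratified = stratified_sample(population, lambda p: p["age"], 4, rng)
clustered = cluster_sample(population, lambda p: p["school_class"], 2, rng)

print(len(srs), len(stratified), len(clustered))
```

The stratified sample is guaranteed to contain every age group, while the cluster sample contains only two complete school classes. That makes cluster samples cheaper to collect but usually less precise.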

Which type of sample is selected and how large it is mostly depends on the question, but also on time and financial restrictions.

Statistical Analysis and Causality

Statistical data must of course also be analyzed after the survey. Perhaps you will occasionally analyze your data in an exploratory manner, i.e. look for patterns that you did not have on your radar before. More often, however, you will pursue a specific question. Then the goal of statistical analysis is to examine your hypotheses more closely.

Of course you are happy when your own assumptions are (apparently) confirmed. That is why you have to be particularly careful not to fall into the next trap and confuse causality with, for example, a correlation computed in SPSS. Otherwise you would discover connections between variables that are actually not (directly) related. If, for example, one looks at the shoe size and gross earnings of employees, it becomes apparent that these correlate positively. So do you earn more when you have big feet? Do your feet grow together with your savings account? Of course not. In this case, there is a common factor that affects both variables: gender. Women usually have smaller feet and earn less. Sometimes, as in Figure 1, connections are also simply coincidental, no matter how beautiful they look.
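The shoe size example can be reproduced with a small simulation. The following Python sketch uses purely invented numbers (group means, spreads and group sizes are assumptions, not real earnings or shoe size data). Within each gender, shoe size and earnings are generated independently, so any overall correlation can only come from the group difference.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable


def simulate_group(n, shoe_mean, pay_mean):
    """Shoe size and pay are drawn independently within one group."""
    shoes = [rng.gauss(shoe_mean, 1.5) for _ in range(n)]
    pay = [rng.gauss(pay_mean, 800) for _ in range(n)]
    return shoes, pay


# Invented parameters: the groups differ in both averages
women_shoes, women_pay = simulate_group(2000, 38, 3400)
men_shoes, men_pay = simulate_group(2000, 43, 4000)

r_all = pearson(women_shoes + men_shoes, women_pay + men_pay)
r_women = pearson(women_shoes, women_pay)
r_men = pearson(men_shoes, men_pay)

print(f"all employees: r = {r_all:.2f}")
print(f"women only:    r = {r_women:.2f}")
print(f"men only:      r = {r_men:.2f}")
```

In the pooled data the correlation is clearly positive, while within each gender it is close to zero. Checking a correlation separately within the levels of a suspected common factor is one simple way to spot this kind of trap.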

Figure 1: Random correlation for a statistical analysis, source: http://www.tylervigen.com/spurious-correlations

Graphical representations for statistical analysis

Graphics are one of the most common sources of misinterpretation. They can be used deliberately to manipulate, but sometimes also arise from inattention. For example, look at Figure 2.

Figure 2: Barplot of gender ratio for a statistical analysis, source: own illustration

Both graphs show the gender distribution between male and female participants. In fact, both graphs show the same distribution. The apparent difference arises from a manipulation of the y-axis. Statistics tutoring can help you learn how to meet this challenge.
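How strongly a cut-off y-axis distorts the picture can be worked out with simple arithmetic. The Python sketch below uses hypothetical shares of 48 and 52 percent, since the exact values behind Figure 2 are not stated, and a hypothetical helper named bar_height_ratio. It compares how tall the larger bar is drawn relative to the smaller one.

```python
def bar_height_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must lie below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical shares in percent
female_share = 48.0
male_share = 52.0

honest = bar_height_ratio(female_share, male_share)
truncated = bar_height_ratio(female_share, male_share, axis_start=45.0)

print(f"y-axis starts at 0:  larger bar is {honest:.2f} times as tall")
print(f"y-axis starts at 45: larger bar is {truncated:.2f} times as tall")
```

With the axis starting at zero, the bars differ by about 8 percent in height. With the axis starting at 45, the larger bar is drawn more than twice as tall, although the data is identical.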

Cutting off the axes, omitting them or failing to label them is one of the most common sources of error in graphics.
But also make sure to select graphics that are easy for the human eye to read. For example, we quickly run into problems when the angles in a pie chart are very similar. 3D graphics also give us difficulties, especially when volumes have to be compared. On the website of the Johannes Kepler University there are a lot of other examples of unsuccessful graphics.

Statistical analysis is more than applying the correct methods. You have to proceed carefully when collecting the data so that your results are meaningful. The presentation of these results should also be well thought out so as not to inadvertently make statements that were never intended. The goal is therefore to formulate statistical results that originate from an in-depth analysis. If you follow the advice in this article, you can avoid the usual pitfalls.


Fahrmeir, Ludwig / Heumann, Christian / Künstler, Rita / Pigeot, Iris / Tutz, Gerhard (2012): Statistics: The way to data analysis, 7th edition, Berlin.

Diekmann, Andreas (2007): Empirical social research: Fundamentals, methods, applications, 4th edition, Reinbek / Berlin.