Multivariate Statistics
The research question is the following:
Which combination of variables [crime, low income, gender, age of driver, location, time of year] best predicts the rate of car accidents in Tucson, Arizona?
Motivation
My motivation for conducting this multivariate study is that I believe two or more of these variables affect my dependent variable, and I want to investigate precisely which ones they are.
Conversely, my motivation could be skepticism toward studies that insist that one or more of the above-mentioned independent variables cause changes in the dependent variable; in that case I am interested in investigating whether my assumption is correct, i.e. that none of these variables has a predictive influence.
Alternatively, I may believe that a multivariate regression using these variables will show that they have the opposite effect on the dependent variable from what is commonly assumed, because earlier researchers relied on a simple bivariate (correlation) study.
Objective
My objective will be to test the degree of relationship between these variables (crime, low income, gender, age of driver, location, time of year) and the rate of car accidents in Tucson. In other words, I wish to assess whether a relationship exists between one or more of these variables and the accident rate. I would also like to test the direction of each effect and the strength of each association, to see whether the effects differ in size across variables, and to determine whether any variables can be excluded on the grounds that they have minimal or no effect on the dependent variable.
Alternatively, my objective might be to test previous research and see whether one or more particular variables are indeed associated with the accident rate as claimed. I might believe that other variables (such as gender and low income) have a more significant impact on the frequency of car accidents than, for instance, crime and location, which are generally believed to induce them.
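As a rough illustration of how the direction and strength of each effect could be examined, the sketch below fits an ordinary least squares regression using only the Python standard library. Everything in it is an assumption made for illustration: the data are synthetic (not Tucson records), only three of the six predictors are simulated as standardized numbers, and the names fit_ols, solve_linear_system and TRUE_COEFS are hypothetical. In an actual study a statistics package would also report standard errors and significance tests, which this sketch omits.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] from the normal equations X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data with made-up "true" effects, chosen only for this sketch:
# intercept, crime, low income, driver age.
random.seed(0)
TRUE_COEFS = [5.0, 2.0, 0.0, -0.5]
rows, accidents = [], []
for _ in range(500):
    crime = random.gauss(0, 1)
    low_income = random.gauss(0, 1)
    driver_age = random.gauss(0, 1)
    noise = random.gauss(0, 0.5)
    rows.append((crime, low_income, driver_age))
    accidents.append(
        TRUE_COEFS[0]
        + TRUE_COEFS[1] * crime
        + TRUE_COEFS[2] * low_income
        + TRUE_COEFS[3] * driver_age
        + noise
    )

coefs = fit_ols(rows, accidents)
for name, value in zip(["intercept", "crime", "low_income", "driver_age"], coefs):
    print(f"{name:>10}: {value: .3f}")
```

The sign of each fitted coefficient indicates the direction of the effect, and its size (on standardized predictors) gives a rough sense of relative strength. A coefficient close to zero, as simulated here for low income, marks a candidate for exclusion.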
Potential pitfalls
First, I have to make sure that my operational definitions and data are accurate and thorough, since the outcome depends on the data that are fed into the analysis.
Most significantly, however, the major conceptual limitation is that I can only establish a relationship; I can never be sure of the underlying causal factor. The true cause may be a variable that seems unrelated to the independent variables mentioned here and has therefore been overlooked. For instance, it is possible that menopause is responsible for inducing accidents; in that case, even though age may have emerged as significant, the real factor, menopause, would remain concealed.
Another pitfall is that outliers in one or more of my variables may distort the statistical results. In addition, the sample may be insufficiently controlled or randomized, so that 'noise' distorts the data.
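To make the outlier problem concrete, the following minimal sketch computes a Pearson correlation on a small invented data set, first without and then with a single extreme observation (for example, a data-entry error). The numbers and the names pearson_r, predictor and accident_rate are hypothetical and serve only to illustrate the point.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented values with a clear positive relationship.
predictor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
accident_rate = [2, 4, 5, 4, 5, 7, 8, 9, 10, 12]
r_clean = pearson_r(predictor, accident_rate)

# The same data plus one extreme, implausible observation.
r_with_outlier = pearson_r(predictor + [11], accident_rate + [-40])

print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
```

In this invented example, a single extreme point is enough to turn a strong positive correlation into a negative one, which is why the data should be screened for outliers before any regression results are interpreted.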