Statistics 101: Measures of central tendency

 
More Than Mortal | d-d-d-DANK ✡ 🔥🔥🔥 🌈
 
Steam: MetaCognition
ID: Meta Cognition

15,062 posts
This is the way the world ends. Not with a bang but a whimper.
Maths is one of those subjects where you either love it, or hate it. I tend to hate it, but I can get along with it well enough if it has some kind of practical application. Yet, despite many people hating it, a lot of people also wish they were better at it; it's obviously a skill people desire.

Since I do statistics (econometrics, actually, but it's pretty broad) I figured I'd address the desire to be better at maths by explaining its practical application in a specific area of study. So, if you want to know more about statistics, then hopefully this will be a pretty decent guide. It will follow the same trajectory as my lectures have taken, and I will be using old notes and my textbook as a way of guiding myself. Also, being able to explain certain things will probably help me.

Measures of Central Tendency

So, we're starting with the really basic stuff. Let's get some simple notation down:

- Observations of a variable are denoted by a letter such as X (or Y, or Z).

- The index "i" denotes a generic observation of that variable. i takes on the values 1, 2, 3, 4 and so on. So, X1 would be the first observation. X4 the fourth.

When it comes to measures of central tendency, we are concerned with what the typical value of X is in a given data set. The most common answer is to compute the mean of the variable's observations, which is denoted by X with a bar above it. Or, when typed, as Xbar.

As most of you probably know, the mean is defined as the sum of all the given values (or observations) divided by the overall quantity of those values.

Written mathematically, Xbar = ΣXi / n.

The E-looking letter is a capital sigma, which is a summation notation. All it means is that we sum every given instance of X in the data set. Then, of course, we divide it by n which is the quantity of observations.
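If you'd rather see that as code, here's a rough Python sketch of the same formula. The function name `mean` and the numbers are made up for illustration; they aren't from any real data set.

```python
def mean(xs):
    """Xbar = (sum of all Xi) / n."""
    n = len(xs)          # n: the quantity of observations
    return sum(xs) / n   # ΣXi divided by n


# Hypothetical observations X1..X5
x = [2, 4, 6, 8, 10]
print(mean(x))  # 6.0
```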

The summation notation is a very useful tool; it can help organise calculations into a much more manageable layout. Say we have some constant, "a" (usually, constants are denoted by Greek letters such as alpha, but I can't be fucked to copy-paste it every time). If, for instance, we have Σ a Xi, this is essentially the equivalent to aX1 + aX2 + aX3 + . . . + aXn.

This, however, can be re-arranged into the much more manageable aΣXi. This saves you having to compute every instance of aXi individually. This is called simplifying the expression. Knowing how to rearrange equations in order to simplify them can be very useful and time saving, as I will later demonstrate. But, for now, have a go at rearranging some yourself and I'll put the answers in spoilers.

A) Simplify the expression Σ(8 + 3Xi + 7Yi - 5Zi).

Spoiler
First of all, the summation notation can be placed in front of each part of the expression. Thus, it becomes:

- Σ8 + Σ3Xi +Σ7Yi - Σ5Zi.

It can thus be simplified further to:

- 8n + 3ΣXi + 7ΣYi - 5ΣZi.

Remember, Σ8 simply means we add up one 8 for every observation, and the number of observations is denoted by n. Accordingly, Σ8 can simply be reduced to "an eight for each individual observation"; or, 8n. It can be difficult to remember that the summation notation covers the entire range of observations involved (unless denoted otherwise). In order to make this clearer, it is acceptable to write Σ with a subscript "i". This makes it clear you are summing over all instances.

For the final three parts of the simplification, it is worth moving the constant (either 3, 7 or 5 in this case) to before the summation notation. Allow me to prove they are equivalent, if you cannot see the logic:

Say we have three observations on variable X, and their values are 1, 2 and 3. And, we have a constant: 3.

- Written as "Σ3Xi", we are essentially performing this calculation: (1 x 3) + (2 x 3) + (3 x 3) = 3 + 6 + 9 = 18.

Or, we can simply move the constant to before the summation to make it (1 + 2 + 3) x 3 which again equals 18. This saves you multiplying every instance of X by 3, and allows you to simply multiply the entire summation.
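If you want to check this kind of rearrangement numerically, here's a minimal Python sketch. The three lists are hypothetical values picked purely for illustration; any three lists of equal length should behave the same way.

```python
# Hypothetical observations on X, Y and Z (illustrative values only).
x = [1, 2, 3]
y = [4, 5, 6]
z = [7, 8, 9]
n = len(x)

# Unsimplified form: evaluate the whole expression for each i, then sum.
lhs = sum(8 + 3 * xi + 7 * yi - 5 * zi for xi, yi, zi in zip(x, y, z))

# Simplified form: 8n + 3ΣXi + 7ΣYi - 5ΣZi.
rhs = 8 * n + 3 * sum(x) + 7 * sum(y) - 5 * sum(z)

print(lhs, rhs)  # 27 27
```

Both sides come out the same, but the simplified form only needs one multiplication per variable rather than one per observation.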

So, let's return to our definition of the mean:

Xbar = ΣXi / n.

This, however, is not the only measure of central tendency. The other common answer is the median, which is simply the middle value of a ranked set of observations. The mean is used more commonly than the median, but it's important to remember that sometimes the latter may be preferable; the mean is more easily distorted by extreme values.

For instance, say you have some data on income in a given town and you want to find the typical value. Yet, unfortunately, Donald Trump lives in this town. The mean would be skewed upwards due to the large value of Trump's income, whereas the median would remain the same as the middle observation remains the middle observation regardless of how high Trump's income may be in a given set of values.
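Here's a rough Python sketch of that effect, using the standard library's `statistics` module. The income figures are invented for illustration, and the names `town` and `town_with_trump` are just labels for this example.

```python
from statistics import mean, median

# Made-up incomes for five residents (illustrative only).
town = [300, 350, 400, 450, 500]

# Same town, but the top earner's income is now enormous.
town_with_trump = [300, 350, 400, 450, 5_000_000]

print(mean(town), median(town))
print(mean(town_with_trump), median(town_with_trump))
```

The mean jumps from 400 to over a million, while the median stays at 400 because the middle ranked observation hasn't moved.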

If n (the number of observations) is odd, then the median is as follows: M = X[(n + 1) / 2], with the index in square brackets. Say n = 11, then M = X[(11 + 1) / 2] = X6. The median, therefore, is the sixth observation once the values have been ranked.

If n is even, then M = (X[n / 2] + X[(n / 2) + 1]) / 2, with the index in square brackets. If n = 126, then n / 2 = 63 and (n / 2) + 1 = 64. Therefore, M = (X63 + X64) / 2. Or, the sum of the 63rd and 64th ranked observations divided by 2.
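The odd and even cases can be sketched in Python as below. The name `median_of` is made up for this example, and the only fiddly part is that Python lists count from 0, so the kth ranked observation lives at index k - 1.

```python
def median_of(xs):
    """Median via the odd/even rules: rank the values, then pick the middle."""
    ranked = sorted(xs)   # the observations must be ranked first
    n = len(ranked)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ranked observation.
        return ranked[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)th and ((n / 2) + 1)th ranked observations.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


print(median_of([7, 1, 5]))     # 5
print(median_of([7, 1, 5, 3]))  # 4.0
```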

Now that you've read through all of that, try some questions:

B) The percentage marks of a class of 12 students are as follows: 80, 16, 11, 71, 85, 95, 12, 71, 8, 15, 31, 25. Calculate both the mean and the median.

C) The amount of benefits, X, received by fifteen individuals in a given street, in a given week, in a given currency is: 67.73, 121.36, 54.32, 36.24, 176.56, 201.34, 97.26, 168.93, 35.61, 145.57, 76.58, 213.06, 232.55, 69.47 and 215.95. Calculate the mean and the median.

I'll wait for somebody to hit on the correct answers before posting them in a spoiler, so don't be lazy cunts. Next post, whenever it is, will deal with measures of variance and dispersion.
Last Edit: November 15, 2015, 07:15:50 AM by Meta as Fuck


Anonymous (User Deleted) | Legendary Invincible!
 
ID: Kupo

6,364 posts
 
>mfw



Spoiler
also bump so a qualified individual notices this thread


rC | Mythic Inconceivable!
 
ID: RC5908

10,792 posts
ayy lmao
i love me some stats nigga, let's take the natural log of our data and analyze that shit with the normal model cuz


Oh | Elite Four Invincible!
 
ID: Simseo

3,641 posts
 
Spoiler
43.333..., 28, 127.50, 121.36
Assuming I didn't shit things up when I put it into my calculator.

Although I'm doing maths at uni, so it's sort of cheating given I already know all that.