Stats Exam 1
Terms in this set (29)
Population
a population consists of an entire set of objects, observations or scores that have something in common. POPULATION MEASUREMENTS ARE PARAMETERS
Examples of Parameters
the mean and standard deviation
Sample
a subset of a population. Gathering data from a sample is usually a better approach than measuring the population one person at a time. SAMPLE MEASUREMENTS ARE STATISTICS
Inferential Statistics
using sample data to draw conclusions about a population. Relies on random sampling so that the selected portion is representative of the population.
Nominal
qualitative, attribute, characteristic and categorical data, SUCH AS gender, marital status, religion, political party, YES OR NO RESPONSES. USE BAR GRAPH OR PIE GRAPH.
Ordinal
ranking, rating, Likert scales (strongly disagree to strongly agree, least important to most important); variables such as income level, level of happiness (SCALE 1-5), etc. USE BAR GRAPH OR PIE GRAPH.
Scale
quantitative, numeric such as GPA, Age , weight, etc. USE HISTOGRAM, STEM&LEAF, BOX PLOT.
Central tendency
A number that describes something about the "average" score of a distribution. (mean median mode)
Mean
sample symbol is x̄ (x-bar); population symbol is μ (mu); arithmetic average found primarily for scale data; affected by outliers and skewed distributions
Arithmetic Mean
The arithmetic mean is what is commonly called the average: When the word "mean" is used without a modifier, it can be assumed that it refers to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula in summation notation is:
μ = ΣX/N where μ is the population mean and N is the number of scores.
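A minimal sketch of the formula above in Python. The scores are hypothetical, invented only for illustration, and the function name `arithmetic_mean` is an assumption rather than anything from the card.

```python
# Hypothetical scores, invented purely for illustration.
scores = [70, 80, 85, 90, 100]

def arithmetic_mean(values):
    """Return the sum of all the scores divided by the number of scores."""
    return sum(values) / len(values)

mean_score = arithmetic_mean(scores)
print(mean_score)  # 85.0
```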
Median
sample symbol is Q2, the 50th percentile; used for scale and ordinal data. The median is the middle of a distribution: half the scores are above the median and half are below. The median is less sensitive to extreme scores than the mean, which makes it a better measure than the mean for highly skewed distributions. For example, the median income is usually more informative than the mean income.
REMEMBER THIS
The mean, median, and mode are equal in symmetric normal distributions. The mean is typically higher than the median in positively skewed distributions and lower than the median in negatively skewed distributions, although this may not be the case in bimodal distributions
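A small illustration of the positively skewed case, using Python's standard `statistics` module. The income figures are hypothetical (in thousands), with one extreme value added to create the skew.

```python
from statistics import mean, median

# Hypothetical incomes in thousands; the 300 is a deliberate outlier
# that pulls the distribution's tail in the positive direction.
incomes = [30, 32, 35, 38, 40, 45, 300]

mean_income = mean(incomes)      # pulled upward by the outlier
median_income = median(incomes)  # middle value, unaffected by the outlier

print(median_income)                # 38
print(mean_income > median_income)  # True
```

Here the single large income raises the mean to roughly 74 while the median stays at 38, which is why the median is usually preferred for skewed data such as income.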
MODE
has no symbol. It is the most frequently occurring score in a distribution and is used as a measure of central tendency. The mode can be found for all types of data. The mode is greatly subject to sample fluctuations and is therefore not recommended as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode; these are called multimodal.
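A short sketch of a distribution with more than one mode, using `statistics.multimode` from the Python standard library (available in Python 3.8 and later). The scores are hypothetical.

```python
from statistics import multimode

# Hypothetical scores in which both 3 and 7 occur twice.
scores = [2, 3, 3, 5, 7, 7, 9]

# multimode returns every most-frequent value, in order of first appearance.
modes = multimode(scores)
print(modes)  # [3, 7]
```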
Trimmed mean
A trimmed mean is calculated by discarding a certain percentage of the lowest and the highest scores and then computing the mean of the remaining scores. For example, a 5% trimmed mean is computed by discarding the lowest 2.5% and the highest 2.5% of the scores and taking the mean of the remaining scores. Trimmed means are often used in Olympic scoring to minimize the effects of extreme ratings possibly caused by biased judges.
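One way to sketch the 5% trimmed mean described above. The function name `trimmed_mean`, the integer-percent argument, and the choice to round the number of discarded scores down are all assumptions made for this illustration; the data are hypothetical.

```python
def trimmed_mean(values, total_percent):
    """Mean after discarding total_percent/2 percent of scores from each end.

    total_percent is an integer (5 means a 5% trimmed mean). The number of
    scores dropped from each end is rounded down, which is an assumption.
    """
    ordered = sorted(values)
    k = len(ordered) * total_percent // 200  # scores to drop per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data: 1 through 39 plus one extreme score of 1000 (40 scores).
scores = list(range(1, 40)) + [1000]

plain_mean = sum(scores) / len(scores)
print(plain_mean)               # 44.5
print(trimmed_mean(scores, 5))  # 20.5
```

With 40 scores, 2.5% per end is one score, so the lowest (1) and highest (1000) are discarded and the extreme rating no longer dominates the result.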
Range
no symbol. The range is the simplest measure of spread or dispersion. It is equal to the difference between the largest and smallest values. The range can be a useful measure of spread because it is so easily understood. However, it is very sensitive to extreme scores since it is based on only two values.
Standard Deviation
symbol is s; population parameter is σ. The SD is very useful in that it can be added to or subtracted from the mean to interpret variability, and it underlies the empirical rule and z scores.
Variance
symbol is s^2 ; population parameter is σ^2
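A brief sketch contrasting the sample statistics (s, s^2) with the population parameters (σ, σ^2), using Python's `statistics` module. The data are hypothetical; the sample versions divide by n - 1 and the population versions divide by N.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical scores with mean 5 and squared deviations summing to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = pvariance(data)   # sigma^2: 32 / 8
pop_sd = pstdev(data)       # sigma: square root of sigma^2
sample_var = variance(data) # s^2: 32 / 7
sample_sd = stdev(data)     # s: square root of s^2

print(pop_var, pop_sd)      # population variance 4, standard deviation 2.0
print(round(sample_var, 3), round(sample_sd, 3))  # 4.571 2.138
```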
Standard error of the mean
symbol is s with subscript x-bar (s_x̄). Standard error of the mean = s/√n, the sample standard deviation divided by the square root of n. Measures sampling error and is used to establish confidence intervals.
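A minimal sketch of the standard error formula s/√n. The data and the function name `standard_error` are assumptions for illustration only.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical sample of 8 scores.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

se = standard_error(sample)
print(round(se, 3))  # 0.756
```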
Quartiles
Q1= 25th percentile, Q2= 50th percentile (median), Q3= 75th percentile displayed in Boxplots
Interquartile Range
IQR = Q3 - Q1
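A short sketch of quartiles and the IQR using `statistics.quantiles` (Python 3.8 and later). The data are hypothetical. Note that textbooks and software use several slightly different quartile conventions; the default "exclusive" method used here is only one of them, so hand calculations from a course may differ a little.

```python
from statistics import quantiles

# Hypothetical scores: the integers 1 through 11.
data = list(range(1, 12))

# n=4 splits the data into four equal groups, returning Q1, Q2, Q3.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3)  # 3.0 6.0 9.0
print(iqr)         # 6.0
```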
Empirical Rule
68-95-99.7% Rule. If the histogram of the data is approximately normal shaped, then
• mean ± 1 standard deviation will contain about 68% of the data
• mean ± 2 standard deviations will contain about 95% of the data
• mean ± 3 standard deviations will contain about 99.7% of the data.
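The three percentages can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```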
z score
standard score or standardized score. Tells how many standard deviations are added to or subtracted from the mean to arrive at a given value. For example, if the mean = 100 and the standard deviation = 15, then a value of 130 has a z-score of +2.0 (2 standard deviations above the mean) and a value of 85 has a z-score of -1.0 (1 standard deviation below the mean).
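The worked example on this card corresponds to z = (value - mean) / standard deviation. A minimal sketch, with the function name `z_score` chosen here for illustration:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

print(z_score(130, mean=100, sd=15))  # 2.0
print(z_score(85, mean=100, sd=15))   # -1.0
```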
Skewness
is the degree of departure from symmetry of a distribution. A positively skewed distribution has a "tail" which is pulled in the positive direction; for a distribution of exam scores, it means there are many more lower scores than in a bell-shaped normal distribution. A negatively skewed distribution has a "tail" which is pulled in the negative direction; for exam scores, it means there are more higher scores than in a normal distribution.
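The card gives no formula for skewness. The sketch below assumes one common definition, the moment coefficient (third central moment divided by the 1.5 power of the second), purely to show the sign behaviour; other definitions exist. All data are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (one common definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
tail_right = [1, 2, 2, 3, 10]             # tail pulled in the positive direction
tail_left = [-v for v in tail_right]      # mirror image: negative tail

print(skewness(symmetric))        # 0.0
print(skewness(tail_right) > 0)   # True
print(skewness(tail_left) < 0)    # True
```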
Kurtosis
is the degree of peakedness of a distribution. A normal distribution is a mesokurtic distribution. A pure leptokurtic distribution has a higher peak than the normal distribution and heavier tails; lepto is a Greek prefix meaning thin. A pure platykurtic distribution has a lower peak than the normal distribution and lighter tails; platy is a Greek prefix meaning flat.
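The card does not give a formula for kurtosis. As a hedged sketch, the code below assumes the common moment-based "excess kurtosis" (fourth central moment over the squared second moment, minus 3), under which a normal distribution scores 0, heavier-tailed data score above 0, and flatter data score below 0. The data are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (one common definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                          # evenly spread, light tails
heavy = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]       # sharp peak with extreme tails

print(excess_kurtosis(flat) < 0)    # True  (platykurtic direction)
print(excess_kurtosis(heavy) > 0)   # True  (leptokurtic direction)
```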
Correlation
describes the strength of an association between two variables and is completely symmetrical: the correlation between A and B is the same as the correlation between B and A. However, if the two variables are related, it means that when one changes by a certain amount the other changes on average by a certain amount. For example, in the children described earlier, greater height is associated, on average, with greater anatomical dead space. If y represents the dependent variable and x the independent variable, this relationship is described as the regression of y on x.
regression equation
the relationship can be represented by a simple equation, the regression equation. It means that the average value of y is a "function" of x, that is, it changes with x. The equation represents how much y changes with any given change of x and can be used to construct a regression line on a scatter diagram; in the simplest case this is assumed to be a straight line. The direction in which the line slopes depends on whether the correlation is positive or negative. When the two variables increase or decrease together, the slope is positive; when one decreases as the other increases, the slope is negative. BEST FIT LINE
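The equation itself does not appear on this card. The sketch below assumes the usual straight-line form y = a + b·x fitted by ordinary least squares, where b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄. The data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
print(round(intercept, 2), round(slope, 2))  # 2.2 0.6
```

The positive slope says that y rises by about 0.6 on average for each one-unit increase in x, which is the "how much y changes with any given change of x" idea in the card.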
CAUTION: correlation/regression analysis
A significant result tells us little about the strength of a relationship. One of the flaws is that even with a very weak relationship (say r = 0.1) we would get a significant result (p < 0.05) with a large enough sample (say n over 1000). ----> Correlation and linear regression analysis do not prove a causal relationship between x and y, as the relationship could be only coincidental (spurious).
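A rough numerical check of the r = 0.1 claim. The test statistic for a correlation is t = r·√((n - 2) / (1 - r²)); an exact p-value needs the t distribution, which the Python standard library lacks, so this sketch substitutes a normal approximation that is reasonable only for large n. The function name `approx_p_value` is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def approx_p_value(r, n):
    """Two-sided p-value for a correlation r from n pairs.

    Uses a normal approximation to the t distribution (an assumption of
    this sketch), so values for small n are only rough.
    """
    t = r * sqrt((n - 2) / (1 - r ** 2))
    return 2 * (1 - NormalDist().cdf(abs(t)))

p_large = approx_p_value(0.1, 1000)  # weak correlation, large sample
p_small = approx_p_value(0.1, 30)    # same correlation, small sample

print(p_large < 0.05)  # True: "significant" despite r = 0.1
print(p_small < 0.05)  # False
```

The correlation is equally weak in both cases (it explains only 1% of the variation); only the sample size changes the verdict.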
Coefficient of Determination *
r^2. A part of the variation in one of the variables (as measured by its variance) can be thought of as being due to its relationship with the other variable and another part as due to undetermined (often "random") causes. The part due to the dependence of one variable on the other is measured by r^2.
r^2 measures the % variation in the predicted variable (y) that is explained by the predictor variable (x) and is the correlation coefficient squared.
It is converted to a % such that if r^2 = 0.6 then it can be said that the predictor variable x explains 60% of the variation in the predicted variable y.
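A small sketch that computes Pearson's r by hand and squares it. The data are hypothetical and were picked so that r^2 comes out at 0.6, the value used in the card's example; the function name `pearson_r` is an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2

print(round(r, 3))          # 0.775
print(round(r_squared, 2))  # 0.6, i.e. x explains about 60% of the variation in y
```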
Significance p-value
We most commonly declare statistical significance when p < 0.05 (less than 5%). The p-value is the probability of finding a relationship at least as strong as the one in the data purely because of an unlucky sample, when no relationship actually exists, such that if we took another sample we might find nothing. The significance level is therefore the chance of a Type I error that we accept: the chance of concluding we have a relationship when we do not. Social scientists often use the .05 level as a cutoff, i.e., a result counts as significant when a relationship this strong would arise by chance alone 5% of the time or less.