Descriptive Statistics Basics
20
Flashcards
Percentile
The value below which a given percentage of the observations in a data set fall; for example, the 50th percentile is the median.
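Several conventions exist for computing percentiles. The sketch below assumes linear interpolation between the two nearest sorted values; the function name `percentile` and the sample `scores` are illustrative only.

```python
def percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fractional position of the percentile within the sorted list.
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    # Interpolate between the two neighbouring values.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


scores = [2, 4, 4, 5, 7, 9, 10]
print(percentile(scores, 50))  # the 50th percentile equals the median, 5
```

Other conventions, such as nearest-rank, can give slightly different results on small data sets.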
Bimodal Distribution
A frequency distribution having two different values that are heavily populated with cases and therefore has two modes.
Median
The middle value in a list of numbers ordered from smallest to largest; if the list has an even number of values, it is the average of the two middle numbers.
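As a minimal sketch of the odd and even cases described above (the function name `median` is chosen here for illustration; Python's standard library also provides `statistics.median`):

```python
def median(values):
    """Return the middle value, averaging the two middle values if needed."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("values must not be empty")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 3]))     # 3
print(median([7, 1, 3, 5]))  # 4.0
```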
Standard Deviation
A measure that quantifies the amount of variation or dispersion in a set of values, calculated as the square root of the variance.
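The relationship to the variance can be checked with Python's `statistics` module. This sketch assumes the data are a whole population, so it uses `pvariance` and `pstdev`; the sample data are made up for illustration.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
std_dev = math.sqrt(variance)          # square root of the variance

print(variance, std_dev)
# The library's own population standard deviation agrees with the square root.
print(math.isclose(std_dev, statistics.pstdev(data)))
```

For this data set the variance is 4 and the standard deviation is 2.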
Z-score
The number of standard deviations a data point is from the mean; calculated as $z = \frac{x - \mu}{\sigma}$, where $x$ is the data point, $\mu$ the mean, and $\sigma$ the standard deviation.
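A short sketch of the calculation, assuming the mean and standard deviation are taken over the whole data set (population values); the helper name `z_score` is illustrative.

```python
import statistics


def z_score(x, values):
    """Return how many standard deviations x lies from the mean of values."""
    mu = statistics.fmean(values)     # mean
    sigma = statistics.pstdev(values) # population standard deviation
    # Note: sigma is zero when all values are identical, so no z-score exists.
    return (x - mu) / sigma


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, data))  # 2.0: the value 9 is two standard deviations above the mean
```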
Frequency Distribution
A summary of how often each value occurs in a data set, usually depicted in a table or graph such as a histogram.
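One simple way to build such a table is with `collections.Counter`; the `grades` list below is invented for illustration.

```python
from collections import Counter

grades = ["B", "A", "C", "B", "B", "A"]

# Count how often each distinct value occurs.
frequency = Counter(grades)

# Print the table in sorted order of the values.
for value, count in sorted(frequency.items()):
    print(value, count)
```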
Nominal Scale
A scale used for labeling variables that have no quantitative value, also known as categorical data.
Ordinal Scale
A scale that depicts the order of values but doesn't define the difference between each one, such as a ranking or rating scale.
Mean
The average of a set of numbers, calculated by adding them all together and dividing by the number of values.
Correlation Coefficient
A measure of the strength and direction of the linear association between two variables, ranging from -1 to 1, where 1 indicates a perfect positive relationship, -1 indicates a perfect negative relationship, and 0 indicates no linear relationship.
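The card does not name a specific coefficient; the sketch below assumes the Pearson correlation coefficient, the most common choice. The function name `pearson_r` and the sample data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


hours = [1, 2, 3, 4, 5]
print(pearson_r(hours, [2, 4, 6, 8, 10]))  # perfect positive relationship
print(pearson_r(hours, [10, 8, 6, 4, 2]))  # perfect negative relationship
```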
Variance
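A measure of how spread out a data set is, calculated as the average of the squared differences from the mean. For a sample rather than a whole population, the sum of squared differences is usually divided by n - 1 instead of n.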
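A minimal sketch of the calculation. The `sample` flag is an addition made here to show the common n - 1 divisor used for samples; with the default it computes the plain average of squared differences.

```python
def variance(values, sample=False):
    """Return the variance; divide by n - 1 instead of n when sample=True."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough values")
    mean = sum(values) / n
    # Squared difference of each value from the mean.
    squared_diffs = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return sum(squared_diffs) / divisor


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))               # 4.0 (population variance)
print(variance(data, sample=True))  # slightly larger, since it divides by 7
```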
Interquartile Range (IQR)
A measure of variability, calculated as the difference between the 75th and 25th percentiles of the data set.
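Quartiles can be computed under several conventions. This sketch assumes the "inclusive" method of `statistics.quantiles` (Python 3.8 or later); the data are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 3.0 7.0 4.0
```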
Box and Whisker Plot
A graphical depiction of data that includes the median, quartiles, and potentially outliers, with 'whiskers' extending to show the variability outside the upper and lower quartiles.
Scatter Plot
A type of data graph that uses Cartesian coordinates to display values of typically two variables for a set of data.
Skewness
A measure of the asymmetry of the probability distribution of a real-valued random variable. A distribution with a longer right tail is skewed to the right (positive skew); one with a longer left tail is skewed to the left (negative skew).
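The card gives no formula. One common choice, assumed here, is the moment-based population skewness: the third central moment divided by the variance raised to the power 1.5. The function name `skewness` and the data are illustrative.

```python
def skewness(values):
    """Return the moment-based (population) skewness of values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    # Undefined (division by zero) when all values are identical.
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric))     # 0.0 for a symmetric data set
print(skewness(right_tailed))  # positive: the long tail is on the right
```

Statistical packages often apply a small-sample correction, so their values may differ slightly from this sketch.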
Range
The difference between the highest and lowest values in a data set, calculated as $\text{Range} = x_{\max} - x_{\min}$.
Mode
The value that occurs most frequently in a data set. A set of data may have one mode, more than one mode, or no mode at all.
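The three cases can be sketched as follows. The convention that a data set in which every value occurs exactly once has no mode is an assumption made here; the function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of modes; empty if every value occurs only once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: no value repeats, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3]))     # [2]     one mode
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  more than one mode
print(modes([1, 2, 3]))        # []      no mode
```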
Kurtosis
A measure of the 'tailedness' of the probability distribution of a real-valued random variable, indicating how heavy the tails are compared to a normal distribution.
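The card gives no formula. A common choice, assumed here, is moment-based excess kurtosis: the fourth central moment divided by the squared variance, minus 3 so that a normal distribution scores 0. The function name and data are illustrative.

```python
def excess_kurtosis(values):
    """Return the moment-based (population) excess kurtosis of values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    # Subtracting 3 makes the normal distribution the zero reference point.
    return m4 / m2 ** 2 - 3


flat = [1, 2, 3, 4, 5]
heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

print(excess_kurtosis(flat))         # negative: lighter tails than a normal
print(excess_kurtosis(heavy_tails))  # positive: heavier tails than a normal
```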
Histogram
A graphical representation of the distribution of data, where the data is divided into bins and the frequency of the data in each bin is represented by the height of the bar.
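The binning step can be sketched in a few lines. This example assumes equal-width bins that include their left edge; `histogram_counts`, its parameters, and the `ages` data are all illustrative names.

```python
def histogram_counts(values, bin_width, start):
    """Count values per equal-width bin, keyed by each bin's left edge."""
    counts = {}
    for x in values:
        # Index of the bin that contains x (left edge inclusive).
        index = int((x - start) // bin_width)
        left = start + index * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


ages = [21, 22, 25, 31, 34, 38, 39, 45]

# Draw a text histogram: one '#' per observation in the bin.
for left, count in histogram_counts(ages, bin_width=10, start=20).items():
    print(f"{left}-{left + 10}: " + "#" * count)
```

Bins that contain no observations are simply omitted in this sketch.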
Outlier
An observation in a data set that is distant from other observations. Often classified as a data point more than 1.5 IQRs below the first quartile or above the third quartile.
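The 1.5 IQR rule above can be sketched as follows, assuming quartiles from the "inclusive" method of `statistics.quantiles` (other quartile conventions may flag slightly different points). The function name `find_outliers` is illustrative.

```python
import statistics


def find_outliers(values):
    """Return values more than 1.5 IQRs below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]


data = [1, 2, 3, 4, 5, 6, 7, 8, 50]
print(find_outliers(data))  # [50]
```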
© Hypatia.Tech. 2024 All rights reserved.