Descriptive Statistics Basics
Mean
The average of a set of numbers, calculated by adding them all together and dividing by the number of values.
Median
The middle value in a list of numbers ordered from smallest to largest; if the list has an even number of values, it is the average of the two middle numbers.
Mode
The value that occurs most frequently in a data set. A set of data may have one mode, more than one mode, or no mode at all.
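A minimal sketch of the three measures of central tendency defined above, assuming Python's standard `statistics` module (3.8 or later). The sample values are made up for illustration.

```python
import statistics

# Illustrative data only; six values, so the median averages the two middle ones.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # sum of the values divided by their count
median = statistics.median(data)    # average of the middle pair (3 and 5)
modes = statistics.multimode(data)  # list of every most-frequent value

print(mean, median, modes)
```

Note that `multimode` returns a list, so a bimodal data set yields two entries. When every value occurs exactly once it returns all of them, which is one way to recognize the "no mode" case.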
Range
The difference between the highest and lowest values in a data set, calculated as $\text{Range} = x_{\max} - x_{\min}$.
Variance
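A measure of how spread out a data set is, calculated as the average of the squared differences from the mean. For a sample rather than a whole population, the sum of squared differences is usually divided by n - 1 instead of n.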
Standard Deviation
A measure that quantifies the amount of variation or dispersion in a set of values, calculated as the square root of the variance.
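The relationship between variance and standard deviation can be sketched as follows, assuming Python's standard `statistics` module and made-up sample values. The `p`-prefixed functions treat the data as a whole population (divide by n); the unprefixed ones treat it as a sample (divide by n - 1).

```python
import math
import statistics

# Illustrative data only; its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = statistics.pvariance(data)  # mean of squared deviations
population_std = statistics.pstdev(data)          # square root of the above
sample_variance = statistics.variance(data)       # divides by n - 1 instead

# The standard deviation is the square root of the variance.
assert math.isclose(population_std, math.sqrt(population_variance))

print(population_variance, population_std, sample_variance)
```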
Interquartile Range (IQR)
A measure of variability, calculated as the difference between the 75th and 25th percentiles of the data set.
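As a rough illustration, the IQR can be computed from the quartiles returned by `statistics.quantiles`. This sketch assumes the "inclusive" interpolation method; other quartile conventions exist and can give slightly different numbers for the same data. The sample values are made up.

```python
import statistics

# Illustrative data only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # 75th percentile minus 25th percentile

print(q1, q3, iqr)
```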
Skewness
A measure of the asymmetry of the probability distribution of a real-valued random variable. A distribution with a longer right tail is skewed to the right (positive skew); one with a longer left tail is skewed to the left (negative skew).
Kurtosis
A measure of the 'tailedness' of the probability distribution of a real-valued random variable, indicating how heavy the tails are compared to a normal distribution.
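One common way to put numbers on skewness and kurtosis is through central moments. The sketch below assumes the population (uncorrected) moment formulas and reports excess kurtosis, meaning kurtosis minus 3, so that a normal distribution scores 0. The helper names and sample values are hypothetical.

```python
def central_moment(values, k):
    """Average of (x - mean)**k over the data (population form)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)


def skewness(values):
    """Third central moment scaled by the variance to the power 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]      # no asymmetry, so skewness is 0
right_skewed = [1, 1, 1, 2, 10]  # long right tail, so skewness is positive

print(skewness(symmetric), skewness(right_skewed), excess_kurtosis(symmetric))
```

Statistical packages often apply small-sample bias corrections, so their output may differ slightly from these uncorrected formulas.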
Z-score
The number of standard deviations a data point is from the mean; calculated as $z = \frac{x - \mu}{\sigma}$, where $x$ is the data point, $\mu$ the mean, and $\sigma$ the standard deviation.
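A short sketch of the z-score calculation, subtracting the mean from each data point and dividing by the standard deviation. It assumes the population standard deviation (`statistics.pstdev`); the function name `z_scores` and the sample values are hypothetical.

```python
import statistics


def z_scores(values):
    """Return (x - mean) / standard deviation for every x in values."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Illustrative data only; mean is 5 and population standard deviation is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(data)

print(scores)
```

With this data, the value 9 lies two standard deviations above the mean and the value 2 lies one and a half below it.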
Percentile
The value below which a given percentage of observations in a group of observations falls, e.g., the 50th percentile is the median.
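Several conventions exist for computing percentiles. The sketch below assumes linear interpolation between the two nearest ordered values, which is only one of them; the function name `percentile` and the sample values are hypothetical.

```python
import statistics


def percentile(values, p):
    """Estimate the p-th percentile (0 <= p <= 100) by linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100       # fractional position in the list
    lower = int(rank)                         # index at or just below the rank
    upper = min(lower + 1, len(ordered) - 1)  # neighbor above, clamped at the end
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Illustrative data only.
data = [15, 20, 35, 40, 50]

# The 50th percentile coincides with the median.
print(percentile(data, 50), statistics.median(data))
```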
Histogram
A graphical representation of the distribution of data, where the data is divided into bins and the frequency of the data in each bin is represented by the height of the bar.
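The binning step behind a histogram can be sketched without any plotting library. This version assumes equal-width bins spanning the minimum to the maximum value, with the maximum placed in the last bin; the function name `histogram_counts` and the sample values are hypothetical.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("values must not all be equal")
    width = (highest - lowest) / num_bins
    counts = [0] * num_bins
    for value in values:
        # The maximum would land one past the end, so clamp it to the last bin.
        index = min(int((value - lowest) / width), num_bins - 1)
        counts[index] += 1
    return counts


# Illustrative data only; three bins cover [1, 4), [4, 7), and [7, 10].
data = [1, 2, 2, 3, 5, 6, 8, 9, 9, 10]
counts = histogram_counts(data, 3)

# A text rendering: one '#' per observation stands in for bar height.
for count in counts:
    print("#" * count)
```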
Box and Whisker Plot
A graphical depiction of data that includes the median, quartiles, and potentially outliers, with 'whiskers' extending to show the variability outside the upper and lower quartiles.
Outlier
An observation in a data set that is distant from other observations. Often classified as a data point more than 1.5 IQRs below the first quartile or above the third quartile.
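The 1.5 IQR rule described above can be sketched as follows. It assumes quartiles from `statistics.quantiles` with the "inclusive" method; a different quartile convention may move the fences slightly. The function name `iqr_outliers` and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values):
    """Return values more than 1.5 IQRs below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Illustrative data only; 30 sits far above the rest.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

print(iqr_outliers(data))
```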
Frequency Distribution
A summary of how often each value occurs in a data set, usually depicted in a table or graph such as a histogram.
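A frequency table can be built with `collections.Counter` from Python's standard library. This is a minimal sketch; the sample responses are made up.

```python
from collections import Counter

# Illustrative data only.
responses = ["A", "B", "A", "C", "B", "A"]

frequency = Counter(responses)       # maps each value to its count
ordered = frequency.most_common()    # (value, count) pairs, most frequent first

for value, count in ordered:
    print(value, count)
```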
Scatter Plot
A type of data graph that uses Cartesian coordinates to display values of typically two variables for a set of data.
Correlation Coefficient
A measure of the strength and direction of the linear association between two variables, ranging from -1 to 1, where 1 indicates a perfect positive relationship, -1 indicates a perfect negative relationship, and 0 indicates no linear relationship.
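As an illustration, the sketch below computes the Pearson correlation coefficient, one common correlation measure that spans -1 to 1. The function name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return co_variation / math.sqrt(spread_x * spread_y)


# Illustrative data only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases in step with x
falling = [10, 8, 6, 4, 2]  # decreases in step with x

print(pearson_r(x, rising), pearson_r(x, falling))
```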
Bimodal Distribution
A frequency distribution having two different values that are heavily populated with cases and therefore has two modes.
Nominal Scale
A scale that labels variables with categories that have no quantitative value or inherent order; data measured this way is also known as categorical data.
Ordinal Scale
A scale that depicts the order of values but doesn't define the difference between each one, such as a ranking or rating scale.
© Hypatia.Tech. 2024 All rights reserved.