Statistical Measures

Range

Formula: $\text{Range} = \max(x_i) - \min(x_i)$ Explanation: The range is the difference between the highest and lowest values in the dataset.

Spearman's Rank Correlation Coefficient

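Formula: $\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$ Explanation: Spearman's rank correlation coefficient is a nonparametric measure of rank correlation that assesses how well the relationship between two variables can be described by a monotonic function. Here $d_i$ is the difference between the two ranks of observation $i$ and $n$ is the number of observations; this shortcut formula is exact only when there are no tied ranks.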
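Worked example: a minimal Python sketch of the rank-difference formula, assuming the data contain no tied values. The function name `spearman_rho` and the sample data are illustrative only.

```python
def spearman_rho(x, y):
    """Spearman's rho via the sum of squared rank differences (no ties assumed)."""
    n = len(x)
    # Rank 1 is assigned to the smallest value; distinct values are assumed.
    rank_x = {value: rank for rank, value in enumerate(sorted(x), start=1)}
    rank_y = {value: rank for rank, value in enumerate(sorted(y), start=1)}
    d_squared = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
rho = spearman_rho(x, y)  # rank differences 0, -1, 1, -1, 1 give sum d^2 = 4
print(rho)
```

With these values the sum of squared rank differences is 4, so the result is 1 - 24/120 = 0.8.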

T-Statistic

Formula: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ Explanation: The t-statistic measures how many estimated standard errors the observed sample mean $\bar{x}$ lies from the hypothesized population mean $\mu_0$. It is used when the population standard deviation is unknown and is estimated by the sample standard deviation $s$, which matters most when the sample size is small.
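Worked example: a short Python sketch of the one-sample t-statistic using only the standard library. The function name `t_statistic`, the sample, and the hypothesized mean of 4 are made up for illustration.

```python
import math
import statistics


def t_statistic(sample, mu0):
    """One-sample t-statistic: (sample mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (x_bar - mu0) / (s / math.sqrt(n))


sample = [2, 4, 4, 4, 5, 5, 7, 9]
t = t_statistic(sample, mu0=4)
print(round(t, 4))
```

Here the sample mean is 5 and the standard error is about 0.756, so the sample mean sits roughly 1.32 standard errors above the hypothesized mean.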

Mode

Formula: (No formal formula, mode is the most frequent value in the dataset) Explanation: The mode is the value that appears most often in a set of data values.

Correlation Coefficient

Formula: $r = \frac{\text{cov}(X, Y)}{s_X s_Y}$ Explanation: The correlation coefficient measures the strength and direction of the linear relationship between two variables. Values range from -1 to 1.

Standard Deviation

Formula: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$ Explanation: Standard deviation is a measure of the amount of variation or dispersion of a set of values; it is the square root of the variance.
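Worked example: a small Python sketch that computes the sample standard deviation by hand and compares it with `statistics.stdev` from the standard library. The data values are illustrative.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Manual computation following the formula: divide by n - 1, then take the root.
x_bar = sum(data) / len(data)
squared_deviations = sum((x - x_bar) ** 2 for x in data)
s_manual = math.sqrt(squared_deviations / (len(data) - 1))

# Library computation; statistics.stdev also uses the n - 1 divisor.
s_library = statistics.stdev(data)
print(s_manual, s_library)
```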

Coefficient of Variation (CV)

Formula: $CV = \frac{s}{\bar{x}} \times 100\%$ Explanation: The coefficient of variation measures the dispersion of the data points relative to their mean, expressed as a percentage. It is meaningful only when the mean is nonzero.
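Worked example: a minimal Python sketch of the coefficient of variation, assuming a nonzero mean. The function name `coefficient_of_variation` and the data are illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


data = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(data)
print(f"{cv:.2f}%")
```

For this data the standard deviation is about 2.14 and the mean is 5, so the CV is roughly 42.76%.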

Kurtosis

Formula: $\text{Kurtosis} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^4 / n}{\left(\sum_{i=1}^{n}(x_i - \bar{x})^2 / n\right)^2} - 3$ Explanation: Kurtosis is a measure of the 'tailedness' of the probability distribution of a real-valued random variable. Because of the $-3$ term, this formula gives the excess kurtosis, which is 0 for a normal distribution. Higher values indicate heavier tails and more extreme outliers, not necessarily a sharper peak.
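Worked example: a short Python sketch of the formula above, using the fourth and second central moments with an n divisor. The function name `excess_kurtosis` and the data are illustrative.

```python
def excess_kurtosis(values):
    """Fourth central moment over squared second central moment, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3


data = [1, 2, 3, 4, 5]
kurt = excess_kurtosis(data)  # m2 = 2, m4 = 6.8
print(kurt)
```

Evenly spread data like this has lighter tails than a normal distribution, so the result is negative (6.8 / 4 - 3 = -1.3).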

Z-score

Formula: $z = \frac{x - \bar{x}}{s}$ Explanation: The z-score is the number of standard deviations a data point lies from the mean. A positive z-score means the value is above the mean, and a negative z-score means it is below.
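Worked example: a minimal Python sketch that standardizes one value against a sample. The function name `z_score` and the data are illustrative.

```python
import math
import statistics


def z_score(x, values):
    """Distance of x from the sample mean, in sample standard deviations."""
    return (x - statistics.mean(values)) / statistics.stdev(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]
z_high = z_score(9, data)   # above the mean, so positive
z_mean = z_score(5, data)   # exactly at the mean, so zero
print(round(z_high, 3), z_mean)
```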

Percentile

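Formula: $P_k = \frac{k}{100}(N + 1)$ Explanation: The $k$-th percentile is the value below which $k$ percent of the observations in a group fall. The formula gives the position (rank) of that value in the sorted data of $N$ observations, not the value itself; if the position is not a whole number, interpolate between the two neighboring values. This is one of several conventions in use.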
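Worked example: a Python sketch of the $(N + 1)$ position rule with linear interpolation between neighboring values. The interpolation step and the clamping at the ends are assumptions of this sketch; other percentile conventions give slightly different answers. The function name `percentile` is illustrative.

```python
def percentile(values, k):
    """k-th percentile using position k/100 * (N + 1) in the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = k / 100 * (n + 1)  # 1-based position in the sorted data
    if position <= 1:
        return ordered[0]         # clamp below the first observation
    if position >= n:
        return ordered[-1]        # clamp above the last observation
    lower = int(position)         # whole part: index of the lower neighbor
    fraction = position - lower   # fractional part: how far to interpolate
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [15, 20, 35, 40, 50]
p50 = percentile(data, 50)  # position 3.0, the third value
p25 = percentile(data, 25)  # position 1.5, halfway between 15 and 20
print(p50, p25)
```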

Probability Density Function (PDF)

Formula: (Varies depending on the distribution) Explanation: A probability density function describes the relative likelihood of a continuous random variable taking values near a given point. The probability of any single exact value is zero; the area under the PDF curve over an interval gives the probability of the variable falling within that interval, and the total area is 1.
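Worked example: a Python sketch using the normal distribution as one concrete PDF (the choice of distribution is an assumption, since the card does not name one). It evaluates the density by hand and gets an interval probability from the cumulative distribution function of `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


std_normal = NormalDist(mu=0.0, sigma=1.0)

# Height of the curve at 0: a density, not a probability.
density_at_zero = normal_pdf(0.0)

# Area under the curve between -1 and 1: a probability.
prob_within_one_sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)
print(round(density_at_zero, 4), round(prob_within_one_sd, 4))
```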

Mean

Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ Explanation: The mean is the sum of all the data points divided by the number of data points, representing the average value.

Covariance

Formula: $\text{cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$ Explanation: Covariance is a measure of the joint variability of two random variables. If the greater values of one variable mainly correspond with the greater values of the other variable, the covariance is positive.
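Worked example: a minimal Python sketch of the sample covariance with the $n - 1$ divisor. The function name `sample_covariance` and the paired data are illustrative.

```python
def sample_covariance(x, y):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)  # products of deviations sum to 6
print(cov_xy)
```

Larger x values here tend to pair with larger y values, so the covariance is positive (6 / 4 = 1.5).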

Pearson Correlation Coefficient

Formula: $r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$ Explanation: The Pearson correlation coefficient is a measure of linear correlation between two sets of data. It is the covariance of the two variables divided by the product of their standard deviations.
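Worked example: a Python sketch that computes Pearson's r from the sums in the formula above and checks that it matches covariance divided by the product of the sample standard deviations. The function name `pearson_r` and the data are illustrative.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's r from sums of deviations, following the formula term by term."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    numerator = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sum_sq_x = sum((a - x_bar) ** 2 for a in x)
    sum_sq_y = sum((b - y_bar) ** 2 for b in y)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Equivalent form: sample covariance over the product of sample std devs.
cov_xy = sum((a - 3) * (b - 4) for a, b in zip(x, y)) / (len(x) - 1)  # means are 3 and 4
r_from_cov = cov_xy / (statistics.stdev(x) * statistics.stdev(y))
print(round(r, 4), round(r_from_cov, 4))
```

The $n - 1$ divisors cancel between numerator and denominator, which is why both forms agree.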

Chi-squared Test Statistic

Formula: $\chi^2 = \sum\frac{(O_i - E_i)^2}{E_i}$ Explanation: The chi-squared test statistic is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.
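Worked example: a minimal Python sketch of the statistic for three categories. The function name `chi_squared` and the observed and expected counts are made up for illustration.

```python
def chi_squared(observed, expected):
    """Sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [50, 30, 20]
expected = [40, 40, 20]
chi2 = chi_squared(observed, expected)  # 100/40 + 100/40 + 0/20
print(chi2)
```

Judging significance would additionally require comparing this value against a chi-squared distribution with the appropriate degrees of freedom, which this sketch does not do.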

P-value

Formula: (No simple formula, depends on the test statistic distribution) Explanation: The p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct.
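Worked example: a Python sketch for one specific case, a two-sided test whose statistic follows a standard normal distribution under the null hypothesis. That choice of test is an assumption made for illustration; other tests use other reference distributions. The function name `two_sided_p_value` is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as |z|."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail  # both tails count as "at least as extreme"


p = two_sided_p_value(1.96)
print(round(p, 4))
```

A statistic of 1.96 gives a p-value of about 0.05, which is why 1.96 is the familiar cutoff for a two-sided test at the 5% level.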

Variance

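Formula: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ Explanation: Variance measures how spread out the numbers in a dataset are. The sample variance is the sum of the squared differences from the mean divided by $n - 1$, which is close to, but slightly larger than, the plain average of those squared differences.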
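Worked example: a Python sketch contrasting the $n - 1$ divisor in the formula above with the $n$ divisor, using `statistics.variance` and `statistics.pvariance` from the standard library. The data are illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sum of squared deviations from the mean (the mean is 5, the sum is 32).
mean = statistics.mean(data)
sum_sq = sum((x - mean) ** 2 for x in data)

sample_var = statistics.variance(data)       # divides by n - 1
population_var = statistics.pvariance(data)  # divides by n
print(sample_var, population_var)
```

The sample variance (32 / 7) is slightly larger than the population-style variance (32 / 8), and the gap shrinks as n grows.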

Skewness

Formula: $\text{Skewness} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^3 / n}{\left(\sum_{i=1}^{n}(x_i - \bar{x})^2 / n\right)^{3/2}}$ Explanation: Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean.
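Worked example: a short Python sketch of the moment-based formula above. The function name `skewness` and both datasets are illustrative.

```python
def skewness(values):
    """Third central moment over the second central moment to the power 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


symmetric = skewness([1, 2, 3, 4, 5])      # deviations cancel, so zero
right_skewed = skewness([1, 2, 3, 4, 10])  # one large value pulls the tail right
print(symmetric, round(right_skewed, 4))
```

A long right tail gives a positive value, a long left tail a negative one, and symmetric data a value of zero.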

Median

Formula: (No simple formula for median, it is the middle value after sorting the data) Explanation: The median is the middle value in a list of numbers sorted in ascending or descending order. If there's an even number of observations, the median is the average of the two middle numbers.
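Worked example: a minimal Python sketch of the sort-and-pick-the-middle procedure, covering both the odd and even cases. The function name `median` and the data are illustrative; the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two


odd_case = median([3, 1, 4, 1, 5])  # sorted: 1, 1, 3, 4, 5
even_case = median([3, 1, 4, 1])    # sorted: 1, 1, 3, 4
print(odd_case, even_case)
```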

Interquartile Range (IQR)

Formula: $IQR = Q_3 - Q_1$ Explanation: The interquartile range is the range between the first quartile (25th percentile) and the third quartile (75th percentile). It is used as a measure of the spread of the middle half of a dataset.
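Worked example: a Python sketch that gets the quartiles from `statistics.quantiles` (Python 3.8+) and subtracts them. Quartile conventions differ between tools; this sketch uses the function's default "exclusive" method, so other software may report slightly different quartiles for the same data. The data are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

# n=4 splits the data into four equal-probability groups,
# returning the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
print(q1, q3, iqr)
```

For these seven values the quartiles fall exactly on the second and sixth observations, so the IQR is 6 - 2 = 4.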

© Hypatia.Tech. 2024 All rights reserved.