From Wikipedia, the free encyclopedia
In statistics, a quartile is a type of quantile which divides the number of data points into four parts, or quarters, of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three main quartiles are as follows:
 The first quartile (Q_{1}) is defined as the middle number between the smallest number (minimum) and the median of the data set. It is also known as the lower quartile, as 25% of the data is below this point.
 The second quartile (Q_{2}) is the median of a data set; thus 50% of the data lies below this point.
 The third quartile (Q_{3}) is the middle value between the median and the highest value (maximum) of the data set. It is known as the upper quartile, as 75% of the data lies below this point.^{[1]}
Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above provide a five-number summary of the data. This summary is important in statistics because it provides information about both the center and the spread of the data. Knowing the lower and upper quartile provides information on how big the spread is and whether the dataset is skewed toward one side. Since quartiles divide the number of data points evenly, the range is generally not the same between adjacent quartiles (i.e., usually Q_{3} − Q_{2} ≠ Q_{2} − Q_{1}). The difference Q_{3} − Q_{1} is known as the interquartile range (IQR). While the maximum and minimum also show the spread of the data, the upper and lower quartiles can provide more detailed information on the location of specific data points, the presence of outliers in the data, and the difference in spread between the middle 50% of the data and the outer data points.^{[2]}
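Definitions
| Symbol | Names | Definition |
| --- | --- | --- |
| Q_{1} | first quartile, lower quartile | splits off the lowest 25% of data from the highest 75% |
| Q_{2} | second quartile, median | cuts data set in half |
| Q_{3} | third quartile, upper quartile | splits off the highest 25% of data from the lowest 75% |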
Computing methods
Discrete distributions
For discrete distributions, there is no universal agreement on selecting the quartile values.^{[3]}
Method 1
 Use the median to divide the ordered data set into two halves.
 If there is an odd number of data points in the original ordered data set, do not include the median (the central value in the ordered list) in either half.
 If there is an even number of data points in the original ordered data set, split this data set exactly in half.
 The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.
This rule is employed by the TI-83 calculator boxplot and "1-Var Stats" functions.
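The steps of Method 1 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `quartiles_method1` is hypothetical, and the sketch assumes the input holds at least two data points.

```python
from statistics import median


def quartiles_method1(data):
    """Return (Q1, Q2, Q3) using Method 1.

    When the number of points is odd, the median is excluded from
    both halves before the half-medians are taken.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Skip the central value when n is odd; otherwise split exactly in half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(xs), median(upper)


print(quartiles_method1([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]))  # (15, 40, 43)
print(quartiles_method1([7, 15, 36, 39, 40, 41]))  # (15, 37.5, 40)
```

The two calls use the data sets of Example 1 and Example 2 in this article.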
Method 2
 Use the median to divide the ordered data set into two halves.
 If there is an odd number of data points in the original ordered data set, include the median (the central value in the ordered list) in both halves.
 If there is an even number of data points in the original ordered data set, split this data set exactly in half.
 The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.
The values found by this method are also known as "Tukey's hinges";^{[4]} see also midhinge.
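A comparable sketch for Method 2 differs only in how an odd-sized data set is split: the median is kept in both halves. The function name `quartiles_method2` is hypothetical, and a non-empty input is assumed.

```python
from statistics import median


def quartiles_method2(data):
    """Return (Q1, Q2, Q3) using Method 2 (Tukey's hinges).

    When the number of points is odd, the median is included in
    both halves before the half-medians are taken.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # For odd n the lower half runs up to and including the central value.
    lower = xs[:half + 1] if n % 2 else xs[:half]
    # xs[half:] starts at the central value for odd n, and at the
    # first element of the upper half for even n.
    upper = xs[half:]
    return median(lower), median(xs), median(upper)


print(quartiles_method2([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]))  # (25.5, 40, 42.5)
print(quartiles_method2([7, 15, 36, 39, 40, 41]))  # (15, 37.5, 40)
```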
Method 3
 1. If there is an even number of data points, then Method 3 starts off the same as Method 1 or Method 2 above, and you can choose to include or not include the median as a data point. If you choose to include the median as a new data point, proceed to step 2 or 3 of Method 3, because you now have an odd number of data points.
 2. If there are (4n+1) data points, then the lower quartile is 25% of the nth data value plus 75% of the (n+1)th data value; the upper quartile is 75% of the (3n+1)th data point plus 25% of the (3n+2)th data point.
 3. If there are (4n+3) data points, then the lower quartile is 75% of the (n+1)th data value plus 25% of the (n+2)th data value; the upper quartile is 25% of the (3n+2)th data point plus 75% of the (3n+3)th data point.
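The weighted averages of Method 3 can be sketched in Python as below. The function name `quartiles_method3` is hypothetical. For an even number of points the sketch assumes the option of not adding the median as a new data point, so the data set is simply split in half; a single data point is rejected because the (4n+1) rule would need a 0th value.

```python
from statistics import median


def quartiles_method3(data):
    """Return (Q1, Q2, Q3) using Method 3."""
    xs = sorted(data)
    size = len(xs)
    if size < 2:
        raise ValueError("need at least two data points")

    if size % 2 == 0:
        # Assumption: even-sized data is split exactly in half.
        half = size // 2
        return median(xs[:half]), median(xs), median(xs[half:])

    def x(i):
        """1-based access to the ordered data, matching the text."""
        return xs[i - 1]

    n, remainder = divmod(size, 4)
    if remainder == 1:  # size == 4n + 1
        q1 = 0.25 * x(n) + 0.75 * x(n + 1)
        q3 = 0.75 * x(3 * n + 1) + 0.25 * x(3 * n + 2)
    else:  # size == 4n + 3
        q1 = 0.75 * x(n + 1) + 0.25 * x(n + 2)
        q3 = 0.25 * x(3 * n + 2) + 0.75 * x(3 * n + 3)
    return q1, median(xs), q3


print(quartiles_method3([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]))  # (20.25, 40, 42.75)
print(quartiles_method3([7, 15, 36, 39, 40, 41]))  # (15, 37.5, 40)
```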
Method 4
If we have an ordered dataset x_{1}, x_{2}, ..., x_{n}, we can interpolate between data points to find the pth empirical quantile, taking x_{i} to be the i/(n+1) quantile. If we denote the integer part of a number a by [a], then the empirical quantile function is given by
q(p) = x_{k} + α (x_{k+1} − x_{k}),
where k = [p(n+1)] and α = p(n+1) − [p(n+1)].^{[1]}
To find the first, second, and third quartiles of the dataset we would evaluate q(0.25), q(0.5), and q(0.75) respectively.
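The interpolation of Method 4 can be sketched in Python as follows, assuming the rule q(p) = x_{k} + α (x_{k+1} − x_{k}) with k equal to the integer part of p(n+1) and α equal to its fractional part. The function name `empirical_quantile` is hypothetical, and raising an error when p(n+1) falls outside the range 1 to n is a choice made for this sketch, not something the method prescribes.

```python
import math


def empirical_quantile(data, p):
    """Interpolated empirical quantile q(p) for 0 < p < 1 (Method 4)."""
    xs = sorted(data)
    n = len(xs)
    h = p * (n + 1)       # fractional position in the ordered data
    if h < 1 or h > n:
        # Assumption of this sketch: do not extrapolate beyond the data.
        raise ValueError("p is outside the range covered by the data")
    k = math.floor(h)     # integer part [p(n+1)]
    alpha = h - k         # fractional part
    if k == n:            # h == n exactly, so no x_{k+1} is needed
        return float(xs[-1])
    # xs is 0-based, so x_{k} is xs[k - 1] and x_{k+1} is xs[k].
    return xs[k - 1] + alpha * (xs[k] - xs[k - 1])


example1 = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
example2 = [7, 15, 36, 39, 40, 41]
print([empirical_quantile(example1, p) for p in (0.25, 0.5, 0.75)])  # [15.0, 40.0, 43.0]
print([empirical_quantile(example2, p) for p in (0.25, 0.5, 0.75)])  # [13.0, 37.5, 40.25]
```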
Example 1
Ordered Data Set: 6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49
| Quartile | Method 1 | Method 2 | Method 3 | Method 4 |
| --- | --- | --- | --- | --- |
| Q_{1} | 15 | 25.5 | 20.25 | 15 |
| Q_{2} | 40 | 40 | 40 | 40 |
| Q_{3} | 43 | 42.5 | 42.75 | 43 |
Example 2
Ordered Data Set: 7, 15, 36, 39, 40, 41
As there is an even number of data points, the first three methods all give the same results.
| Quartile | Method 1 | Method 2 | Method 3 | Method 4 |
| --- | --- | --- | --- | --- |
| Q_{1} | 15 | 15 | 15 | 13 |
| Q_{2} | 37.5 | 37.5 | 37.5 | 37.5 |
| Q_{3} | 40 | 40 | 40 | 40.25 |
Continuous probability distributions
If we define a continuous probability distribution as P(X), where X is a real-valued random variable, its cumulative distribution function (CDF) is given by
F_{X}(x) = P(X ≤ x).^{[1]}
The CDF gives the probability that the random variable X is less than or equal to the value x. Therefore, the first quartile is the value of x when F_{X}(x) = 0.25, the second quartile is x when F_{X}(x) = 0.5, and the third quartile is x when F_{X}(x) = 0.75.^{[5]} The values of x can be found with the quantile function Q(p), where p = 0.25 for the first quartile, p = 0.5 for the second quartile, and p = 0.75 for the third quartile. The quantile function is the inverse of the cumulative distribution function if the cumulative distribution function is strictly monotonically increasing.
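As an illustration of a quantile function acting as the inverse of a CDF, the sketch below uses the exponential distribution, whose CDF 1 − exp(−λx) is strictly increasing for x ≥ 0 and can be inverted in closed form. The choice of distribution, the rate value, and the function names `exp_cdf` and `exp_quantile` are assumptions made only for this example.

```python
import math


def exp_cdf(x, rate):
    """CDF of an exponential distribution: 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exp_quantile(p, rate):
    """Quantile function Q(p) = -ln(1 - p) / rate, valid for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log1p(-p) / rate


rate = 2.0  # hypothetical rate parameter, chosen only for illustration
for p in (0.25, 0.5, 0.75):
    q = exp_quantile(p, rate)
    # Feeding the quartile back into the CDF recovers p (up to rounding).
    print(p, q, exp_cdf(q, rate))
```

The second quartile returned here equals ln(2)/rate, the familiar median of the exponential distribution.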
Outliers
There are methods by which to check for outliers in the discipline of statistics and statistical analysis. Outliers could be the result of a shift in the location (mean) or in the scale (variability) of the process of interest.^{[6]} Outliers could also be evidence of a sample population that has a non-normal distribution or of a contaminated population data set. Consequently, as is the basic idea of descriptive statistics, when encountering an outlier, we have to explain this value by further analysis of the cause or origin of the outlier. In cases of extreme observations, which are not an infrequent occurrence, the typical values must be analyzed. In the case of quartiles, the interquartile range (IQR) may be used to characterize the data when there may be extremities that skew the data; the interquartile range is a relatively robust statistic (also sometimes called "resistant") compared to the range and standard deviation. There is also a mathematical method to check for outliers by determining "fences", upper and lower limits beyond which data points are treated as outliers.
After determining the first and third quartiles and the interquartile range as outlined above, the fences are calculated using the following formulas:
Lower fence = Q_{1} − 1.5 × IQR
Upper fence = Q_{3} + 1.5 × IQR
where Q_{1} and Q_{3} are the first and third quartiles, respectively, and IQR = Q_{3} − Q_{1}. The lower fence is the "lower limit" and the upper fence is the "upper limit" of data, and any data lying outside these defined bounds, that is, below the lower fence or above the upper fence, can be considered an outlier. The fences provide a guideline by which to define an outlier, which may be defined in other ways. The fences define a "range" outside which an outlier exists; a way to picture this is a boundary of a fence, outside which are "outsiders" as opposed to outliers.
It is common for the lower and upper fences along with the outliers to be represented by a boxplot. For a boxplot, only the vertical heights correspond to the visualized data set, while the horizontal width of the box is irrelevant. Outliers located outside the fences in a boxplot can be marked with any choice of symbol, such as an "x" or "o". The fences are sometimes also referred to as "whiskers", while the entire plot visual is called a "box-and-whisker" plot.
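The fence calculation can be sketched in Python as below, assuming the conventional multiplier of 1.5 times the interquartile range. The function names `tukey_fences` and `find_outliers` are hypothetical, and the quartiles are passed in explicitly so that the choice of quartile method stays with the caller.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the given quartiles.

    k is the IQR multiplier; 1.5 is assumed here as the usual choice.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, q1, q3, k=1.5):
    """Return the values lying strictly outside the fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return [x for x in data if x < lower or x > upper]


# Example 1 data with its Method 1 quartiles (Q1 = 15, Q3 = 43).
data = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
print(tukey_fences(15, 43))          # (-27.0, 85.0)
print(find_outliers(data, 15, 43))   # []
```

With these fences no value of Example 1 is flagged; a hypothetical extra observation of 90 would be, since it exceeds the upper fence of 85.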
When spotting an outlier in the data set by calculating the interquartile range and boxplot features, it might be easy to mistakenly view it as evidence that the population is non-normal or that the sample is contaminated. However, this method should not take the place of a hypothesis test for determining normality of the population. The significance of the outliers varies depending on the sample size. If the sample is small, then it is more probable to get interquartile ranges that are unrepresentatively small, leading to narrower fences. Therefore, it would be more likely to find data that are marked as outliers.^{[7]}
Computer software for quartiles
| Environment | Function | Quartile Method |
| --- | --- | --- |
| Microsoft Excel | QUARTILE.EXC | Method 4 |
| Microsoft Excel | QUARTILE.INC | Method 3 |
| TI-8X series calculators | 1-Var Stats | Method 1 |
| R | fivenum | Method 2 |
| Python | numpy.percentile | Method 3 |
| Python | pandas.DataFrame.describe | Method 3 |
Excel:
The Excel function QUARTILE(array, quart) provides the desired quartile value for a given array of data, using Method 3 from above. In the QUARTILE function, array is the dataset of numbers being analyzed and quart is one of the following five values, depending on which quartile is being calculated.^{[8]}
| Quart | Output QUARTILE Value |
| --- | --- |
| 0 | Minimum value |
| 1 | Lower Quartile (25th percentile) |
| 2 | Median |
| 3 | Upper Quartile (75th percentile) |
| 4 | Maximum value |
MATLAB:
To calculate quartiles in MATLAB, the function quantile(A,p) can be used, where A is the vector of data being analyzed and p is the cumulative probability corresponding to the desired quartile, as listed below.^{[9]}
| p | Output QUARTILE Value |
| --- | --- |
| 0 | Minimum value |
| 0.25 | Lower Quartile (25th percentile) |
| 0.5 | Median |
| 0.75 | Upper Quartile (75th percentile) |
| 1 | Maximum value |
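As a further illustration beyond the environments listed in this section, Python's standard-library function `statistics.quantiles` (available since Python 3.8) can also return quartiles. The sketch below applies it to the data sets of Example 1 and Example 2; with `method="exclusive"` the results coincide with the Method 4 column of those examples. How the `"inclusive"` option relates to the numbered methods is not asserted here.

```python
from statistics import quantiles

example1 = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
example2 = [7, 15, 36, 39, 40, 41]

# n=4 asks for the three cut points that split the data into quarters.
print(quantiles(example1, n=4, method="exclusive"))  # [15.0, 40.0, 43.0]
print(quantiles(example2, n=4, method="exclusive"))  # [13.0, 37.5, 40.25]

# The "inclusive" option interpolates differently and can give other values.
print(quantiles(example1, n=4, method="inclusive"))  # [25.5, 40.0, 42.5]
```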
See also
 Five-number summary
 Range
 Box plot
 Interquartile range
 Summary statistics
 Quantile
References
 1. Dekking, Michel (2005). A modern introduction to probability and statistics: understanding why and how. London: Springer. pp. 236–238. ISBN 9781852338961. OCLC 262680588.
 2. Knoch, Jessica (February 23, 2018). "How are Quartiles Used in Statistics?". Magoosh. Archived from the original on December 10, 2019. Retrieved February 24, 2023.
 3. Hyndman, Rob J; Fan, Yanan (November 1996). "Sample quantiles in statistical packages". American Statistician. 50 (4): 361–365. doi:10.2307/2684934. JSTOR 2684934.
 4. Tukey, John Wilder (1977). Exploratory Data Analysis. ISBN 9780201076165.
 5. "6. Distribution and Quantile Functions" (PDF). math.bme.hu.
 6. Walfish, Steven (November 2006). "A Review of Statistical Outlier Method". Pharmaceutical Technology.
 7. Dawson, Robert (July 1, 2011). "How Significant is a Boxplot Outlier?". Journal of Statistics Education. 19 (2). doi:10.1080/10691898.2011.11889610.
 8. "How to use the Excel QUARTILE function - Exceljet". exceljet.net. Retrieved December 11, 2019.
 9. "Quantiles of a data set – MATLAB quantile". www.mathworks.com. Retrieved December 11, 2019.
External links
 Quartile – from MathWorld. Includes references and compares various methods to compute quartiles
 Quartiles – From MathForum.org
 Quartiles calculator – simple quartiles calculator
 Quartiles – An example of how to calculate it