Excel and Statistics, everybody should know. : Contents
IF function
The most useful function in Excel is probably the IF function, and if I had to recommend just one, it would be the IF function.
Let’s open ‘example1.csv’ and divide it into 2 groups according to the numbers in column D. Enter in the E2 field as shown in the figure above. This means that if the value of C2 is less than 0, the value of D2 is low, otherwise it is entered as high.
Characters have quotation marks.
=IF(condition, value if true, value if false)
In this way, the IF function has three arguments, and until you get used to the function, you just enter it like this and hit enter.
Now it looks low, but the actual value (formula) is shown in the formula bar above. Then double-click the small square in the lower-right corner of the cell.
Let’s call this ‘autofill’ and the small square ‘Autofill Handle’. You can also drag the ‘Autofill Handle’ to move it to the desired position, and when you double-click it, it will automatically fill in until a blank appears.
Now, instead of going to low and high, how do we separate them into three levels: low, medium, and high?
=IF(condition, value if true, IF(condition, value if true, value if false))
You can solve it by putting the IF inside of IF.
Note that there are two parentheses on the right, and if you put the wrong parentheses, Excel will fill them in automatically.
Put a formula like this, ‘autofill’ down and you can make it whatever you want. When you’re done typing the function, be sure to press enter.
I thought it was complicated to put IF in IF like this, but from Excel 2016 or later, there is an IFS function, so there are conditions, results, conditions, results, conditions, results...... In this way, it can be further simplified. The last condition must contain TRUE.
The IFS function can avoid the more IF is in the IF, the more complicated it becomes.
By the way, I don’t use the IFS function very often. This is partly because many people do not have Excel 2016, and others have to use the Excel file I created, and at the same time, it is not too difficult if you get used to the IF function.
Copy (Ctrl C) the resulting value and select <Paste Value> from <Paste> to <Paste Value> to remove the formula and insert the actual apparent value as it is. Not only that, but <paste value> can be used in a wide variety of ways. If column D is removed without <paste value> the values in column E will show an error.
The IF function can be applied not only to turn a continuous variable into an interval as it is now, but also to a wide variety of applications. I used the simplest example, and once you get used to it, you will be able to apply it in a variety of ways.
Plus Alpha
We learned to divide a continuous variable by an interval variable as an example of using the IF function, but let’s learn a little more.
Visit https://tinyurl.com/conti2ord. There is an example data, and we want to divide values in the 2nd column into an interval variable into three groups. The criterion is rank. This means that the values will be ordered from small to large, and the number of the three groups that can be appropriated according to their rank will be similar.
In the second tab (reformed data), a new column is added, assigning 1,2,3 according to rank.
The third tab (count) displays the numbers assigned to 1,2,3. Preferably, you try to assign the value as a number, but it is not exactly correct because they have the same value.
By specifying Method as value, we decided to divide it into 3 groups, so we set 2 cut-lines. Rather than the number of groups to divide, one smaller cut-line will be set.
2nd cut is 3.5, so if the value is 3.5, it will be assigned number 3. 3.5 and below are assigned into 2, and 2.5 or less are assigned into 1.
If you divide by Value, it is easy to see a difference in the number assigned to each group. The split results can be downloaded as csv files by clicking ‘Download’.
Dividing a continuous variable into intervals is not very desirable because it would lose the original information. However, in practical cases, such as taking on a train if you are over a certain height, or over a certain age, you should pay more.
Name a variable |
Required to collect data with Excel |
Naming nominal variables |
|
Data validation |
|
Filter |
Required in the preprocessing step before analyzing the created data |
Find and Replace |
|
IF function |
|
Text to a table |
If the results of the analysis are in text (usually from R or web), tabulate them. |
Part 1 is essential for any statistical program.
And the next ‘install’ is a preparatory work for parts 2.
=================================================
- R statisics portal https://tinyurl.com/stat-portal
- R data visualization book 1 https://tinyurl.com/R-plot-I (chart)
- R data visualization book 2
https://tinyurl.com/R-plot-II-3-4 many variables / map
https://tinyurl.com/R-plot-II-5-6 time related / statistics related
https://tinyurl.com/R-plot-II-7-8 others / reactive chart
- R data visualization book 3 https://tinyurl.com/R-data-Vis3
- R data visualization book 4 R 데이터 시각화 4권
- Meata Analysis book 1 https://tinyurl.com/MetaA-portal
- Meata Analysis book 2 https://tinyurl.com/MetaA-portal(2)
- Preciction Model and Machine Learning https://tinyurl.com/Machine-Learning-EZ
- Sample Size Calculations https://tinyurl.com/MY-sample-size
- Sample data https://tinyurl.com/data4edu
댓글 없음:
댓글 쓰기