Adelyn's Blog

My Riverside Rapid Digital Portfolio

Statistics Project

1. Collection of data. This is done all over the world, every day, for many reasons. Sometimes the way data is collected is influenced by the situation, and this can change the results, the chart or graph made from them, or the way a survey is conducted.

One of the problems is called bias. Bias happens when the wording of a question pushes responses in favour of, or against, the topic of the data collection. For example, let’s say that I think there are too many homeless cats in Greece. Because I am against the current number of cats in Greece, the question I would ask in my survey might be, “Don’t you think there are too many homeless cats in Greece?” Wording like this really influences people to give a certain answer. A better question would be, “Do you think the number of homeless cats in Greece is too high, too low, or fair?”

The collection of data also involves ethics. When you conduct a survey, you need to make sure you use your results only for the purpose that you told the participants. Imagine you ask your classmates what their favourite gum is, and tell them that the purpose is to decide which one you should get. It would be unethical for you to then sell your classmates their favourite gum during lunch, because that is not what you told them you collected the data for in the first place. You also need to be respectful while collecting data by remembering that people are part of different cultures, and not all of them do the same things. For example, my question might be, “How do you eat ham?” Some cultures do not allow people to eat ham, so a better and more culturally respectful question would be, “If you eat ham, how do you prefer it?” This lets people know that you have put thought into the fact that not everyone eats ham.

As I said earlier, data is collected around the world for many different purposes, and many companies do this too. For big companies that need to collect a lot of data, cost is something they plan for before they actually collect it. Often they collect data for advertising, so they have to pay people to go out and ask questions, pay to print the results, and then pay to make an advertisement. For smaller purposes it doesn’t usually cost a lot, but it is definitely something large companies think about before collecting data.

When you collect the data can really influence the results. If it is July and I go out and ask people, “Do you need a snow jacket?”, I would probably get a “No” from most people. If I asked the exact same question in January, most people would probably say “Yes”.

If you are creating a survey and you want a lot of participants so that your results are more reliable, think about how long it takes to complete your survey. Many people would not want to participate if the survey took an hour to complete. Gather your most important questions, sort them, and see whether you could combine a couple of them. You could also divide your questions into two groups and make two surveys. This will increase the number of people willing to give their answers, and a larger sample makes your results more reliable.

Everybody likes privacy, and the thought that they can keep some things to themselves. If you are conducting a survey that involves private questions, there could be a couple of hurdles. The first is that people might just not want to participate, and this will make your results less reliable. The second is that participants might lie, and this would not give you correct results. If you asked, “How many days a week do you work out?”, some people would say 3 times a week when actually they don’t work out at all. A way you could change that is by making the survey anonymous. People would feel that their answers are more secure and confidential, so more of them will feel safe giving honest answers.

2. Let’s say that we asked dentists what toothpaste they recommend. We asked 50 dentists, and all of them recommended Colgate, so our result was that 100% of dentists recommend Colgate toothpaste. Based on our sample, the result is technically correct, but it does not hold for the whole population. The population in this case is all of the dentists in the world, and the 50 we asked are only a sample of that population. The conclusion is right for our sample, because all 50 dentists recommended Colgate, but not all the dentists in the world recommend the same toothpaste.
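The gap between a sample and its population can be sketched in Python. The numbers below are assumptions made up for illustration (a hypothetical population of 10,000 dentists, 60% of whom recommend Colgate); they are not real survey data. The helper name `sample_share` is also just for this sketch.

```python
import random

def sample_share(population, sample_size, seed=None):
    """Return the share of True answers in a random sample of the population."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical population: 10,000 dentists, 60% of whom recommend Colgate.
population = [True] * 6000 + [False] * 4000
population_share = sum(population) / len(population)

# Ask only 50 dentists, several times over, and compare each sample to the population.
for seed in range(5):
    share = sample_share(population, 50, seed=seed)
    print(f"sample of 50: {share:.0%} recommend it (population: {population_share:.0%})")
```

Each run of 50 gives a slightly different percentage, which is why a result from one sample should not be stated as a fact about every dentist.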

3. Convenience sampling is a type of non-probability sampling where the members chosen are the ones most conveniently available to the researcher. The sample is made up of people who are easy to gather, whether they are close by or readily available. The advantage of this type of sampling is how quickly the data can be obtained. However, the sample may not accurately represent the population as a whole, and it could be biased. Example: A researcher wants to determine the average age of shoppers at a shopping mall. The study is conducted from 9 to 5 on school days, Monday to Friday. Senior citizens, who have retired and no longer have a day job, could very well be overrepresented in the sample. People who are working a day job at those hours, and students too, would be underrepresented.
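Here is a small sketch of how the mall example could go wrong. The ten shoppers and their ages are invented for illustration only; the point is just to compare the average age of everyone with the average age of the people who happen to be there on a weekday during the day.

```python
from statistics import mean

# Hypothetical shoppers: (age, shops on weekdays between 9 and 5). Invented data.
shoppers = [
    (16, False), (17, False), (25, False), (31, False), (38, False),
    (42, False), (45, True), (67, True), (71, True), (74, True),
]

all_ages = [age for age, _ in shoppers]
daytime_ages = [age for age, daytime in shoppers if daytime]

population_mean = mean(all_ages)       # average age of every shopper
convenience_mean = mean(daytime_ages)  # average age of the convenient sample only

print(f"all shoppers: {population_mean:.1f}, weekday daytime sample: {convenience_mean:.1f}")
```

With this made-up data the convenience sample comes out much older than the full group, because the students and day workers are missing from it.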

Random sampling, also known as “simple random sampling”, is a probability sampling method that is meant to provide an unbiased, highly representative sample of a population. Every individual is chosen solely by chance and has an equal chance of being selected to participate in the study. Some advantages are:

The ease of assembling a well-rounded sample, because of the randomness with which the individuals are selected.
It’s a fair, unbiased way (when done right) of selecting a sample from a population, since each person is given an equal opportunity to be selected.
Its representativeness of the population tends to be accurate. (The only thing that can compromise its representativeness is chance. When a sample doesn’t accurately reflect the population, the difference is called sampling error.)

Even though there are many advantages, it doesn’t come without disadvantages. Some include:

It is time-consuming and tedious to ensure that the sample is representative.
It needs a complete, up-to-date list of all members of the population. A list like that is most likely not available for large populations, so other sampling techniques are often preferred.

Example of simple random sampling: At a carnival, every person who enters the gates receives a pair of tickets with the same number. The number is different for every person. Each person holds on to one ticket, while the other is deposited into a large jar, so everyone is entered for the chance to win the grand prize. A number is drawn at random at the end of the night, and the owner of the winning number receives the prize.
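The carnival draw can be sketched with Python's standard `random` module. The 500 ticket numbers are a made-up figure, and the function names `draw_winner` and `simple_random_sample` are only for this sketch.

```python
import random

def draw_winner(ticket_numbers, seed=None):
    """Pick one ticket so that every ticket has the same chance of winning."""
    rng = random.Random(seed)
    return rng.choice(ticket_numbers)

def simple_random_sample(members, k, seed=None):
    """Pick k different members, each equally likely, without repeats."""
    rng = random.Random(seed)
    return rng.sample(members, k)

# Hypothetical jar: 500 visitors, one numbered ticket each.
tickets = list(range(1, 501))

print("grand prize goes to ticket", draw_winner(tickets))
print("ten people picked for a survey:", simple_random_sample(tickets, 10))
```

Passing a `seed` only makes a run repeatable for checking; for a real draw it would be left out.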

Stratified random sampling is a type of probability sampling used when the population contains several different groups (strata) that it can be partitioned into. A random sample is then taken from each group, so, as in simple random sampling, individuals are chosen by chance. An advantage of this method is that it ensures every group in the population is represented in the sample.

Example of stratified sampling: The U.S. population is divided into 4 groups by time zone, and 200 people are randomly selected from each group to state their favourite food.
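The time zone example can be sketched like this. The list of 4,000 people and the zone names are assumptions for illustration; `stratified_sample` is a hypothetical helper, not a library function.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, per_stratum, seed=None):
    """Group people into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)
    # rng.sample raises ValueError if a stratum has fewer than per_stratum members.
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

# Hypothetical population: 4,000 people spread evenly over four time zones.
zones = ["Eastern", "Central", "Mountain", "Pacific"]
people = [(f"person{i}", zones[i % 4]) for i in range(4000)]

sample = stratified_sample(people, stratum_of=lambda p: p[1], per_stratum=200, seed=0)
for zone, members in sample.items():
    print(zone, len(members))
```

Every zone ends up with exactly 200 people, which is the point of stratifying: no group can be left out by chance.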

Systematic sampling is a type of probability sampling where a sample is created from a list of the population by selecting members at a regular interval (for example, every 10th name on the list). With this method, there is no need to use random numbers for each pick. Systematic sampling is less random than simple random sampling, which may introduce bias if the list has a repeating pattern. Separately, dividing a population into groups called strata (strata are based on characteristics that group members have in common) may increase the precision of the results.
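Picking members from a list at a fixed interval can be sketched as follows. The roster of 100 students is made up, and the optional random starting point is a common variation that I am assuming here, not something the description above requires.

```python
import random

def systematic_sample(members, interval, start=None, seed=None):
    """Pick every interval-th member of the list, beginning at index start."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if start is None:
        # Assumed variation: choose the starting position at random within the first interval.
        start = random.Random(seed).randrange(interval)
    return members[start::interval]

# Hypothetical roster of 100 students.
roster = [f"student{i:03d}" for i in range(1, 101)]

picked = systematic_sample(roster, interval=10, start=4)
print(picked)
```

With `interval=10` and `start=4`, the sample is the 5th, 15th, 25th student and so on, giving 10 students in total.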

Example of dividing a population into strata (not based on real figures, just for the purpose of explaining): 80% of high school graduates in British Columbia move on to higher education. Of that 80%, 35% are Caucasian, 15% are African-American, 20% are Asian, 5% are of Aboriginal descent, 10% are Hispanic, 5% are Pacific Islander, and 10% are multiracial. Dividing the population into strata gives the researcher more specific results.
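One way to use percentages like these is to give each group a share of the sample that matches its share of the population. This is only a sketch: the sample size of 200 is my assumption, and `proportional_allocation` is a hypothetical helper name.

```python
def proportional_allocation(shares_percent, sample_size):
    """Split sample_size across strata in proportion to each whole-number percentage."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("stratum shares must add up to 100%")
    # Floor division: if a share does not divide evenly, the remainder is left unassigned.
    return {name: sample_size * pct // 100 for name, pct in shares_percent.items()}

# Percentages from the example above (explicitly not real figures).
shares = {
    "Caucasian": 35,
    "African-American": 15,
    "Asian": 20,
    "Aboriginal": 5,
    "Hispanic": 10,
    "Pacific Islander": 5,
    "Multiracial": 10,
}

allocation = proportional_allocation(shares, sample_size=200)
for group, count in allocation.items():
    print(group, count)
```

The check at the top also confirms that the seven percentages add up to 100% before anything is calculated.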

A voluntary response sample is a type of non-probability sample made up of volunteers. Unlike random samples, these samples are biased. An advantage is that the study might be cheap to conduct and the data might be easy to gather, but there are many disadvantages as well. The researcher doesn’t have control over who is in the sample, and the people who volunteer are likely to be very opinionated.

For example, people who attend and show support at a Donald Trump rally would most probably have strong opinions and skew the data, leading to inaccurate results. The supporters will want Donald Trump as president, and since they make up the majority of the sample, the results are biased.

Using an inappropriate sampling method may bias the data because it can give an inaccurate representation of the population.

Example: If the coach of the Oklahoma City Thunder basketball team were to do a survey using a voluntary response sample, and he was collecting data on which basketball team his players liked best, it is safe to assume that the results wouldn’t be very accurate. It is probable that most of the team would pick the Oklahoma City Thunder as their favourite team. It would be better to use random sampling, which gives a much smaller chance of bias.

4. Imagine I told you that I flipped a coin 14 times. How many times would you expect it to land on heads? You would probably say 50% of the time, or 7 times, because there are only two equally likely outcomes on a coin: heads or tails. That is theoretical probability, where you logically predict the outcome. Experimental probability is when you actually conduct a test to get an outcome. So I actually flip a coin 14 times and record my results, and the number of heads divided by 14 is the experimental probability. The difference between theoretical and experimental probability is that one is what you expect to happen, and the other is what happens when you actually test it.
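The coin example can be tried out with a short simulation. This is a sketch that assumes a fair coin; `experimental_probability` is a name made up for it.

```python
import random

def experimental_probability(flips, seed=None):
    """Flip a simulated fair coin `flips` times and return the share of heads."""
    rng = random.Random(seed)
    heads = sum(rng.choice(("H", "T")) == "H" for _ in range(flips))
    return heads / flips

# Theoretical probability: one favourable outcome (heads) out of two.
theoretical = 1 / 2
expected_heads = theoretical * 14  # 7 heads expected in 14 flips

print("theoretical:", theoretical, "expected heads in 14 flips:", expected_heads)
for flips in (14, 1_000, 100_000):
    print(flips, "flips -> experimental:", experimental_probability(flips))
```

With only 14 flips the experimental result often lands away from 0.5, while with many more flips it usually settles close to the theoretical value.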

5. [Image: line graph of “Cancer Screening and Prevention Services” and “Abortions”]

This graph is misleading because the y-axis is inaccurate. Both lines are going in the correct direction: the “Cancer Screening and Prevention Services” line goes down as its numbers go down, and the “Abortions” line goes up as its numbers go up. But the lines shouldn’t intersect, because at the end of the lines the “Abortions” number is smaller and should be at a lower point on the graph than the “Cancer Screening and Prevention Services” line. This graph is trying to make its audience think that the abortion rate is rising very fast, and has even surpassed the cancer screening and prevention services rate.

[Image: pie chart with “Yes” at 50% and “No” at 49%]

This pie chart is misleading, since the “Yes” total was 50% but on the chart it takes up less space than the “No” with 49%. Also, 50% and 49% do not add up to 100%. This chart is trying to convince the audience that more people said levels haven’t become harder to pass, and that fewer people said they are easier now. In reality, more people said yes than no.

[Image: pie chart in which every slice is labelled 26%]

This pie chart is similar to the one in the previous example, but in this one all of the parts are labelled with an equal percentage of 26%. Even so, the quadrants are all different sizes, and four parts of 26% add up to 104% instead of 100%. The creator of this chart is trying to create an illusion that there are more juniors and fewer sophomores in the school. They used 26% instead of 25% to make it appear that all of the parts have more value.
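Both pie charts can be checked with a little arithmetic: the percentages should add up to 100, and each slice should get an angle that matches its share. In the sketch below, the slice labels for the second chart are placeholders, since I am only assuming four slices from the word “quadrants”. The function names are made up for this sketch.

```python
def check_pie(slices):
    """Return the total of the percentages and whether that total is exactly 100."""
    total = sum(slices.values())
    return total, total == 100

def slice_angles(slices):
    """Return the angle in degrees each slice should take up, based on its share of the total."""
    total = sum(slices.values())
    return {label: 360 * value / total for label, value in slices.items()}

survey_chart = {"Yes": 50, "No": 49}
# Placeholder labels: the chart is described as four parts of 26% each.
class_chart = {"Part 1": 26, "Part 2": 26, "Part 3": 26, "Part 4": 26}

print(check_pie(survey_chart))
print(check_pie(class_chart))
print(slice_angles(survey_chart))
```

A fair drawing of the first chart would give “Yes” the bigger slice, and a fair drawing of the second would make all four slices the same size.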


adelynh2015 • June 13, 2016

