Do there seem to be differences in age, length of loan, or amount of loan between those who repaid their loans and those who defaulted?

Statistics Problem
Paper, Order, or Assignment Requirements

Problem 3 (45 marks)

Dataset: credit.csv. The description of the variables is in an Excel file named descriptionCreditScoringData.xls.

This data set consists of genuine credit records from a South German bank. The aim would generally be to predict which customers will repay the loan in full and which will not. There are 1000 records, and all amounts are in Deutschmarks. Answer the following using suitable approaches, whether descriptive/graphical or inferential, and using a suitable package, e.g. StatTools. Justify your answers in the main text and include all workings as an appendix.

a) Wherever possible and meaningful, provide a brief analysis of each variable, including its distribution, outliers, etc.

b) Do there seem to be differences in age, length of loan, or amount of loan between those who repaid their loans and those who defaulted?

c) Explore and describe the association of each variable with the credit status.
d) Does the Length of the loan vary with the use of the loan?
e) Determine relationships, if any, between Age, Length of loan and Amount of loan.
f) Construct a 3-way contingency table from the factors credit, record and use, and analyse it. You
must state your final conclusions in detail.
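For part c), one possible approach for a categorical variable is a chi-square test of independence against credit status. The sketch below is only an illustration: the counts are made up and are not taken from credit.csv, and the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical counts (NOT from credit.csv):
# rows = two levels of some factor, columns = repaid / defaulted.
toy_table = [[10, 20], [20, 10]]
chi2, dof = chi_square_statistic(toy_table)
print(f"chi-square = {chi2:.3f}, df = {dof}")
```

The statistic would then be compared with a chi-square critical value for the given degrees of freedom, for example in StatTools.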

 

"Looking for a Similar Assignment? Get Expert Help at an Amazing Discount!"

How many samples should the DMF test in order to conclude that the sampling distribution will be approximately Normal if the population is distinctly non-Normal? Why?

Statistics Problem

“Red tide” is a bloom of poison-producing algae, a few different species of a class of plankton called dinoflagellates. When the weather and water conditions cause these blooms, shellfish such as clams living in the area develop dangerous levels of a paralysis-inducing toxin. In Massachusetts, the Division of Marine Fisheries (DMF) monitors levels of the toxin in shellfish by regular sampling of shellfish along the coastline. If the mean level of toxin in clams exceeds 800 μg (micrograms) of toxin per kg of clam meat in any area, clam harvesting is banned there until the bloom is over and levels of toxin in clams subside. During a bloom, the distribution of toxin levels in clams on a single mudflat is distinctly non-Normal.

How many samples should the DMF test in order to conclude that the sampling distribution will be approximately Normal if the population is distinctly non-Normal? Why?

Define the parameter of interest and state the appropriate hypotheses for the DMF to test.

Describe a Type I and a Type II error in this situation and the consequences of each.
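The sample-size question rests on the Central Limit Theorem, for which n of about 30 is the usual rule of thumb. The simulation below is a rough sketch of that idea, assuming an exponential population as a stand-in for the skewed toxin distribution (the problem does not say what the real distribution is). The function names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Third standardized moment (population form)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def skew_of_sample_means(n, reps=5000, seed=1):
    """Skewness of `reps` sample means, each from n exponential draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return skewness(means)

# Assumed stand-in population: exponential with rate 1 (strongly right-skewed).
skew_small = skew_of_sample_means(2)
skew_large = skew_of_sample_means(30)
print(f"skewness of sample means, n=2:  {skew_small:.2f}")
print(f"skewness of sample means, n=30: {skew_large:.2f}")
```

The skewness of the sample means shrinks toward zero as n grows, which is why larger samples justify treating the sampling distribution as approximately Normal.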


Define the parameter and population of interest for the hypothesis test.

Statistics Problem

The amount of lead in a certain type of soil, when released by a standard extraction method, averages 86 parts per million (ppm). Developers of a new extraction method wondered if their method would extract a significantly different amount of lead. Forty specimens were obtained, with a mean of 83 ppm lead and a standard deviation (SD) of 10 ppm.

What type of statistical test is this?

What are the conditions for the test and are they satisfied?

Define the parameter and population of interest for the hypothesis test.

Write the null and alternative hypotheses.

What is the test statistic and the p-value of the test?

Write the conclusion in the context of the problem, first stating whether you will reject or fail to reject the null hypothesis and what that means.

Construct a 95% confidence interval for the mean amount of lead extracted using the new method.
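One way to check the arithmetic for these questions is a two-sided one-sample t-test on the summary figures given (n = 40, mean 83, SD 10, reference mean 86). The sketch below is not the only valid approach. It uses only the Python standard library, so the t distribution is evaluated by numerical integration, and the helper names `t_pdf`, `t_cdf`, and `t_ppf` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF of Student's t via Simpson's rule on [0, |x|] (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, df):
    """Quantile for p >= 0.5, found by bisection on t_cdf."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary figures from the problem statement.
n, x_bar, s, mu0 = 40, 83.0, 10.0, 86.0
df = n - 1
se = s / math.sqrt(n)                      # standard error of the mean
t_stat = (x_bar - mu0) / se                # test statistic
p_value = 2 * (1 - t_cdf(abs(t_stat), df)) # two-sided p-value
t_crit = t_ppf(0.975, df)                  # critical value for a 95% interval
ci = (x_bar - t_crit * se, x_bar + t_crit * se)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Under these assumptions the statistic is roughly -1.90 with a p-value near 0.065, and the interval runs from about 79.8 to 86.2 ppm. A statistics package would give the same figures directly.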


Using the information provided in Table 1, and the critical path identified in the previous question, calculate the standard deviation for the project.

Project Management Statistics

PERT Network Diagram 1 illustrates the ten interrelated activities that comprise a certain project.

PERT Network Diagram 1

The information presented in Table 1 represents the optimistic activity time (a), most probable activity time (m), and pessimistic activity time (b) in weeks for each activity associated with the project.

Table 1

Activity  Optimistic (a)  Most Probable (m)  Pessimistic (b)
A         2               3                  4
B         1               3                  5
C         2               5                  8
D         4               10                 16
E         2               4                  6
F         3               6.25               8
G         2               4                  6
H         1               3.25               4
I         1               2                  3
J         3               5                  7

PERT Network Diagram 2

Using the information provided in Table 1, calculate the expected activity time (t) in weeks for each activity and then enter this information in the appropriate data field provided for each activity in PERT Network Diagram 2 in order to complete this diagram. Use the Excel sheet provided by the Instructor.

Using the information provided in Table 1, calculate the variance for the estimated activity time for each activity.
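The two calculations above can be sketched in Python using the conventional PERT formulas, t = (a + 4m + b) / 6 and variance = ((b - a) / 6)^2. The assignment does not state which formulas to use, so treat these as the standard assumption. The function names are illustrative.

```python
# (optimistic a, most probable m, pessimistic b) in weeks, copied from Table 1.
estimates = {
    "A": (2, 3, 4),
    "B": (1, 3, 5),
    "C": (2, 5, 8),
    "D": (4, 10, 16),
    "E": (2, 4, 6),
    "F": (3, 6.25, 8),
    "G": (2, 4, 6),
    "H": (1, 3.25, 4),
    "I": (1, 2, 3),
    "J": (3, 5, 7),
}

def expected_time(a, m, b):
    """Conventional PERT weighted average."""
    return (a + 4 * m + b) / 6

def activity_variance(a, m, b):
    """Conventional PERT variance: one sixth of the range, squared."""
    return ((b - a) / 6) ** 2

expected = {k: expected_time(*v) for k, v in estimates.items()}
variances = {k: activity_variance(*v) for k, v in estimates.items()}

for name in estimates:
    print(f"{name}: t = {expected[name]:.2f} weeks, variance = {variances[name]:.4f}")
```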

3 – 14. Using the information in the completed PERT Network Diagram 2, calculate the overall estimated activity time in weeks for each of the 12 possible paths through the network. The following table may help you organize your data. (1 point each, 12 points total)

Question 3 P1 =
Question 4 P2 =
Question 5 P3 =
Question 6 P4 =
Question 7 P5 =
Question 8 P6 =
Question 9 P7 =
Question 10 P8 =
Question 11 P9 =
Question 12 P10 =
Question 13 P11 =
Question 14 P12 =

Identify the Critical Path. (3 points)

Using the information provided in Table 1, and the critical path identified in the previous question, calculate the standard deviation for the project.
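The project standard deviation is usually taken as the square root of the summed variances along the critical path. The sketch below assumes the critical path is A-B-D-F-G-I-J, which is what the zero-slack rows of Table 2 suggest. The network diagram itself is not reproduced here, so that path is an assumption.

```python
import math

# (optimistic a, pessimistic b) in weeks for the assumed critical activities,
# copied from Table 1.
critical_ranges = {
    "A": (2, 4),
    "B": (1, 5),
    "D": (4, 16),
    "F": (3, 8),
    "G": (2, 6),
    "I": (1, 3),
    "J": (3, 7),
}

# Conventional PERT variance per activity: ((b - a) / 6) squared.
path_variance = sum(((b - a) / 6) ** 2 for a, b in critical_ranges.values())
project_sd = math.sqrt(path_variance)

print(f"critical path variance = {path_variance:.4f}")
print(f"project standard deviation = {project_sd:.2f} weeks")
```

With these inputs the result agrees with the 2.50 weeks that the later probability questions tell you to assume.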

Using the information provided in Table 1 and completed Project Network Diagram 2 calculate the probability of the project critical path activities being completed in less than or equal to 30 weeks. (Assume a standard deviation of 2.50 weeks)

Using the information provided in Table 1 and completed Project Network Diagram 2 calculate the probability of the project critical path activities being completed in more than 35 weeks. (Assume a standard deviation of 2.50 weeks)

Using the information provided in Table 1 and completed Project Network Diagram 2 calculate the probability of the project critical path activities being completed between 31 and 36 weeks. (Assume a standard deviation of 2.50 weeks)
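The three probability questions can be sketched with a Normal model for the project completion time. The mean of 33 weeks is an assumption taken from the earliest finish of activity J in Table 2. The standard deviation of 2.50 weeks is the value the questions say to assume.

```python
from statistics import NormalDist

# Assumed mean: 33 weeks (earliest finish of J in Table 2); SD given as 2.50.
completion = NormalDist(mu=33, sigma=2.5)

p_le_30 = completion.cdf(30)                       # P(T <= 30)
p_gt_35 = 1 - completion.cdf(35)                   # P(T > 35)
p_31_36 = completion.cdf(36) - completion.cdf(31)  # P(31 <= T <= 36)

print(f"P(T <= 30)       = {p_le_30:.4f}")
print(f"P(T > 35)        = {p_gt_35:.4f}")
print(f"P(31 <= T <= 36) = {p_31_36:.4f}")
```

These correspond to z-scores of -1.2, 0.8, and the interval from -0.8 to 1.2, so the results can be checked against a standard Normal table.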

The information presented in Table 2 represents the calculated Earliest Start, Earliest Finish, Latest Start and Latest Finish times for each activity.

Table 2

Activity  Earliest Start  Earliest Finish  Latest Start  Latest Finish
A         0               3                0             3
B         3               6                3             6
C         6               11               11            16
D         6               16               6             16
E         16              20               18            22
F         16              22               16            22
G         22              26               22            26
H         22              25               23            26
I         26              28               26            28
J         28              33               28            33

Using the information provided in Table 2, calculate the slack time in weeks for each activity by filling out the PERT chart provided by the Instructor.
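Slack is conventionally the latest start minus the earliest start (equivalently, latest finish minus earliest finish). A short sketch using the Table 2 values:

```python
# (earliest start, earliest finish, latest start, latest finish) from Table 2.
schedule = {
    "A": (0, 3, 0, 3),
    "B": (3, 6, 3, 6),
    "C": (6, 11, 11, 16),
    "D": (6, 16, 6, 16),
    "E": (16, 20, 18, 22),
    "F": (16, 22, 16, 22),
    "G": (22, 26, 22, 26),
    "H": (22, 25, 23, 26),
    "I": (26, 28, 26, 28),
    "J": (28, 33, 28, 33),
}

# Slack = latest start - earliest start.
slack = {name: ls - es for name, (es, ef, ls, lf) in schedule.items()}

# Activities with zero slack are candidates for the critical path.
zero_slack = [name for name, s in slack.items() if s == 0]

for name, s in slack.items():
    print(f"{name}: slack = {s} weeks")
print("zero-slack activities:", zero_slack)
```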

The information presented in Table 3 represents the total budgeted cost for each activity associated with the project.

Table 3

Activity  Total Budgeted Cost
A         $45,500
B         $38,750
C         $34,500
D         $13,750
E         $11,250
F         $10,900
G         $15,250
H         $14,750
I         $23,100
J         $15,650

Using the information provided in Table 2 and Table 3, calculate the budgeted costs at the end of week seven of the project based upon using the Earliest Start Date for each activity and calculate the budgeted costs at the end of week seven of the project based upon using the Latest Start Date for each activity in order to then calculate the difference in the cash flow at the end of week seven between these two scenarios. Use the Excel sheet provided by the Instructor. (10 points)

Using the information provided in Table 2 and Table 3, calculate the budgeted costs at the end of week eleven of the project based upon using the Earliest Start Date for each activity and calculate the budgeted costs at the end of week eleven of the project based upon using the Latest Start Date for each activity in order to then calculate the difference in the cash flow at the end of week eleven between these two scenarios. Use the Excel sheet provided by the Instructor. (10 points)
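The two cash-flow questions can be sketched as follows. The key assumption, which the assignment does not state, is that each activity's budget is spent evenly over its duration. The instructor's Excel sheet may use a different spending pattern. The function name `budgeted_cost` is hypothetical.

```python
# (earliest start, earliest finish, latest start, budget) per activity,
# copied from Table 2 and Table 3.
activities = {
    "A": (0, 3, 0, 45500),
    "B": (3, 6, 3, 38750),
    "C": (6, 11, 11, 34500),
    "D": (6, 16, 6, 13750),
    "E": (16, 20, 18, 11250),
    "F": (16, 22, 16, 10900),
    "G": (22, 26, 22, 15250),
    "H": (22, 25, 23, 14750),
    "I": (26, 28, 26, 23100),
    "J": (28, 33, 28, 15650),
}

def budgeted_cost(week, use_latest_start):
    """Cumulative budgeted cost at the end of `week`, assuming even spending."""
    total = 0.0
    for es, ef, ls, budget in activities.values():
        duration = ef - es
        start = ls if use_latest_start else es
        # Weeks of work done by `week`, clamped to [0, duration].
        elapsed = min(max(week - start, 0), duration)
        total += budget * elapsed / duration
    return total

for week in (7, 11):
    early = budgeted_cost(week, use_latest_start=False)
    late = budgeted_cost(week, use_latest_start=True)
    print(f"week {week}: earliest = {early:,.0f}, latest = {late:,.0f}, "
          f"difference = {early - late:,.0f}")
```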

Table 4 represents the percentage of completion and actual cost of work performed data for each activity associated with the project at a certain point in time during the project.

Table 4

Activity  Total Budgeted Cost  Percent of Completion  Actual Cost of Work Completed
A         $45,500              100%                   $46,750
B         $38,750              100%                   $44,500
C         $34,500              100%                   $33,250
D         $13,750              80%                    $8,750
E         $11,250              70%                    $8,800
F         $10,900              65%                    $7,450
G         $15,250              30%                    $4,850
H         $14,750              25%                    $3,000
I         $23,100              0%                     $0
J         $15,650              0%                     $0

Using the information provided in Table 4, calculate the dollar value of work completed and the difference between Budgeted and Actual Cost for each activity. Which activity evidences the single largest activity difference (regardless of whether it is a cost overrun or cost underrun) at this particular point in the project?

Using the information provided in Table 4 and the answer to the previous question, calculate the overall cost overrun or cost underrun for the project at this particular point in the project.
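A minimal sketch of these two calculations follows. It takes the value of work completed as budget times percent complete, and it treats the actual cost for activities I and J as $0, an assumption based on their 0% completion.

```python
# (total budgeted cost, percent complete, actual cost) from Table 4.
# Actual cost for I and J is assumed to be 0.
status = {
    "A": (45500, 100, 46750),
    "B": (38750, 100, 44500),
    "C": (34500, 100, 33250),
    "D": (13750, 80, 8750),
    "E": (11250, 70, 8800),
    "F": (10900, 65, 7450),
    "G": (15250, 30, 4850),
    "H": (14750, 25, 3000),
    "I": (23100, 0, 0),
    "J": (15650, 0, 0),
}

# Value of work completed = budget x percent complete.
value_completed = {k: budget * pct / 100 for k, (budget, pct, actual) in status.items()}

# Positive difference = actual cost exceeds value of work (overrun).
difference = {k: status[k][2] - value_completed[k] for k in status}

largest = max(difference, key=lambda k: abs(difference[k]))
overall = sum(difference.values())

for k in status:
    print(f"{k}: value completed = {value_completed[k]:,.2f}, "
          f"difference = {difference[k]:+,.2f}")
print(f"largest single difference: {largest}")
print(f"overall difference: {overall:+,.2f}")
```

A positive overall figure indicates a cost overrun, a negative one a cost underrun.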

Table 5 represents the estimated crash time in weeks and associated total crash cost for each activity.

Table 5

Activity  Crash Time (Weeks)  Crash Cost
A         2                   $52,500.00
B         2                   $40,000.00
C         4                   $36,000.00
D         9                   $14,900.00
E         3                   $12,250.00
F         2                   $23,000.00
G         3                   $18,500.00
H         2                   $18,000.00
I         0                   $30,000.00
J         4                   $20,000.00

Using the information in the completed PERT Network Diagram 2 and Table 5, crash the network in order to reduce the estimated project duration to 31 weeks while minimizing the overall crash cost. Which critical path activity would be the most logical choice for the first activity to crash?

Which of the following critical path activities would be the most logical choice for the second activity to crash?

What is the total estimated cost associated with crashing the network to reduce the estimated project duration to 31 weeks at the lowest possible overall crash cost?

After successfully crashing the network to reduce the estimated project duration to 31 weeks, identify the critical path of the crashed network.
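A common first step in crashing is to rank the critical activities by crash cost per week saved. The sketch below rests on several assumptions not stated in the assignment: normal times equal the activity durations implied by Table 2, normal costs are the Table 3 budgets, crash cost grows linearly with weeks saved, and the critical path is A-B-D-F-G-I-J.

```python
# (normal time, crash time, normal cost, crash cost) per activity.
# Normal times come from Table 2 durations, normal costs from Table 3,
# crash times and crash costs from Table 5.
crash_data = {
    "A": (3, 2, 45500, 52500),
    "B": (3, 2, 38750, 40000),
    "C": (5, 4, 34500, 36000),
    "D": (10, 9, 13750, 14900),
    "E": (4, 3, 11250, 12250),
    "F": (6, 2, 10900, 23000),
    "G": (4, 3, 15250, 18500),
    "H": (3, 2, 14750, 18000),
    "I": (2, 0, 23100, 30000),
    "J": (5, 4, 15650, 20000),
}

# Assumed critical path (zero-slack activities in Table 2).
critical = ["A", "B", "D", "F", "G", "I", "J"]

def cost_per_week(normal_time, crash_time, normal_cost, crash_cost):
    """Extra cost for each week saved, assuming a linear cost slope."""
    return (crash_cost - normal_cost) / (normal_time - crash_time)

slopes = {k: cost_per_week(*v) for k, v in crash_data.items()}
ranked = sorted(critical, key=lambda k: slopes[k])

for k in ranked:
    max_weeks = crash_data[k][0] - crash_data[k][1]
    print(f"{k}: {slopes[k]:,.0f} per week, up to {max_weeks} week(s)")
```

Whether shortening the cheapest activities actually brings the project to 31 weeks depends on the other paths in the network diagram, so the ranking is only a starting point.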


What do the areas in the intervals µ – σ to µ + σ, µ – 2σ to µ + 2σ and µ – 3σ to µ + 3σ represent as far as areas under the normal curve?

Statistics Discussions

Discussion 1 300 Words

When one thinks of the normal distribution, the first thing that comes to mind is the bell curve and grades. While this is one example of a normal curve that is widely recognized, it is not the only one. Try to come up with a unique normal distribution that your classmates have not posted already. Explain your curve with items such as the mean and standard deviation, if available. What do the areas in the intervals µ – σ to µ + σ, µ – 2σ to µ + 2σ and µ – 3σ to µ + 3σ represent as far as areas under the normal curve? If you have the mean and standard deviation, calculate what the actual intervals are for your curve. Please include any citations regarding where you obtained your data for the curve.
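As a quick check on the interval question, the areas within one, two, and three standard deviations of the mean can be computed from the standard Normal CDF. This is a small sketch using Python's standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve between mu - k*sigma and mu + k*sigma.
areas = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

These are the familiar 68%, 95%, and 99.7% figures of the empirical rule. For a specific curve, the actual interval endpoints are the mean plus or minus k times the standard deviation.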

*******

Discussion 2 300 Words

One can calculate the 95% confidence interval for the mean with the population standard deviation known. This will give us an upper and a lower confidence limit. What happens if we decide to calculate the 99% confidence interval? Describe how the increase in the confidence level has changed the width of the confidence interval. Do the same for the confidence interval set at 80%. Please include an example with actual numerical values for the intervals in your post.
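The effect of the confidence level on interval width can be illustrated with a short sketch. The sample mean, population standard deviation, and sample size below are made-up values chosen only for illustration, and `z_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical inputs: sample mean 50, known sigma 10, sample size 25.
widths = {}
for level in (0.80, 0.95, 0.99):
    low, high = z_interval(50, 10, 25, level)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.2f}, {high:.2f}), width = {widths[level]:.2f}")
```

Raising the confidence level widens the interval, and lowering it narrows the interval, because the critical z value changes while the standard error stays the same.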

******

Discussion 3 300 Words

Share with your peers the null and alternative hypotheses for a decision that is relevant to your personal or professional life. Additionally, identify the Type I and Type II errors that could occur with your decision‐making process.

******

Discussion 4 300 Words

Regression analysis can be used to analyze how a change in one variable impacts another variable, such as an increase in marketing budget increasing sales. Find a unique area of your life where one measurable variable impacts another measurable variable and do a regression analysis on it. Be sure to include the coefficient of determination as well as the test of significance. Share your results and comment on whether there are potential problems (causation or extrapolation) with your results.
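The quantities requested in Discussion 4 can be sketched by hand for a simple linear regression. The x and y values below are invented purely for illustration. A real post would use measured data and a package's regression output.

```python
import math

# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r_squared = sxy ** 2 / (sxx * syy)         # coefficient of determination

# t statistic for H0: slope = 0, with n - 2 degrees of freedom.
sse = syy - slope * sxy                    # residual sum of squares
slope_se = math.sqrt(sse / (n - 2) / sxx)
t_slope = slope / slope_se

print(f"y = {intercept:.3f} + {slope:.3f} x")
print(f"r^2 = {r_squared:.4f}, t = {t_slope:.2f} on {n - 2} df")
```

The t statistic would be compared with a t critical value for n - 2 degrees of freedom to complete the test of significance.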


Discuss the sampling method. If it wasn’t a simple random sample of likely voters, why not?

Statistics

Option A. Write a paper examining and discussing the methods and results of political polling for this presidential election. This should be research-based and not simply offer your opinion of the validity of the polls as reported. Find and summarize relevant details for one particular poll or polling organization. Some issues to consider and investigate:

Discuss the sampling method. If it wasn’t a simple random sample of likely voters, why not? If the poll over-sampled Democratic voters and under-sampled Republican voters using stratified sampling (some polls reported 55% and 45%, respectively), why? What effect did this have?

Was it a telephone survey of land-line numbers? Cell phones? Internet? A voluntary response sample? Is there possible bias?

Discuss the sample size and the related margin of error that was reported.

What questions were asked? Did the wording or sequencing of the questions affect the responses or results?

Could the responses have been biased due to what has been termed “social desirability bias”? Perhaps the media created a national climate that suggested you didn’t have half a brain if you didn’t support their preferred candidate. So maybe respondents answered what would be acceptable, which may have differed from their actual views.

Other important or relevant factors?

I would guess that a concise and well-written paper between two and four pages could adequately address these issues.

Option B. This option extends the philosophy that the best way to learn a subject is to teach it. That is one reason I encourage you to discuss the class ideas with your classmate contacts. Perhaps we can extend this to the assessment of statistical understanding. Write eight original multiple choice questions covering material from Chapters 4 through 7 in the text. There should be two questions from each of the four chapters. The questions should have well-chosen choices in addition to the correct response. Each question should be substantial and assess one or more important statistical concepts. Put some quality thought into this. Writing your eight questions and the responses should be your individual effort, not group work. For each question, discuss why you felt that topic was important and how your question assesses it well. Also explain the rationale for the incorrect choices you provided.
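For the sample size and margin of error issue in Option A, the usual formula for a simple random sample is z times the square root of p(1 - p) / n. The sketch below uses a hypothetical poll of 1,000 respondents and the conservative p = 0.5. Real polls that weight or stratify their samples report margins computed with adjustments not shown here.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Margin of error for a proportion under simple random sampling."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample size, not taken from any particular poll.
moe = margin_of_error(1000)
print(f"n = 1000: margin of error = +/- {moe:.1%}")
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.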


If you are hypothesis testing, what are your null and alternative hypotheses?

Biostatistics Process and Calculations

Process and Calculations

Overview: Your task is to help the organization answer the health question by critically analyzing the data. You will run statistical tests, interpret the results, and present the results and recommendations to non-technical decision makers in the form of a statistical report. Keep in mind that it is your job to do this from a statistical standpoint. Be sure to justify your conclusions and recommendations with appropriate statistical support.
Prompt: In order to successfully explore the health question, you need to plan what tests you need to run. Create a table in which you propose the calculations and graphs you will need to perform to answer the health question you are investigating. In crafting your table, consider the following:
· Watch Choosing a Statistical Test (12:31).
· What tests will you need to run?
· If you are hypothesis testing, what are your null and alternative hypotheses?
· Are there any tests that you want to use that you have not learned about yet? What are they? What is your plan for researching these tests?
Then explain why you chose these calculations to explore your health question.
Specifically address this critical element:
· Process: Propose how you will go about answering the health question you were asked to address based on the data set provided.

Data Analysis
A. Graphs: In this section, you will use graphical displays to examine the data and formulate an initial hypothesis. In particular, you should:
1. Create key graphical displays that give a sense of potential relationships between variables. Include the graphs and discuss why you selected these graphical displays as opposed to others.
B. Conduct appropriate hypothesis tests, simple regressions, and other tests to analyze the data set.
C. Explain why these tests are the best choice in this context and how they compare with established best practices.
D. Analysis of Biostatistics: Use this section to describe your findings from a statistical standpoint. Be sure to:
1. Present key biostatistics from the graphs, tests, and regressions performed, and explain what they mean. Be sure to include a spreadsheet showing your work as an appendix.
2. What statistical inferences or conclusions can you draw based on the hypothesis tests and simple regression analyses performed? Justify your response.


What are the two Branches of Statistics?


Q2: Theorize what the distributions would approximately look like for both the right-hand and left-hand digits of the populations of towns and cities from your selected state. Show a sketch in the rectangles below.

You have to choose one of three options to complete this activity. Your total score depends on the option chosen.

Option 1: Regular score (7.0 points)

Which Option did you choose?

If Option 1, which State?

Analysis: Now with both side-by-side bar graphs completed, answer the questions below.

What type of distribution shape would you say fits (models) the following counts?

Which of the two types of actual digits fits its theoretical count distribution the best?

Referring to the one you did not circle above, describe in a brief sentence what characteristic(s) of its shape kept it from being the best fit.

Discussion: Before addressing the items below, modify the formula for the Left-hand Theoretical Count by using the logarithmic formula from Benford’s Law. A brief Internet search on “Benford’s Law” and/or “first digit phenomenon” would be very helpful.

Why do you suppose the actual count distributions are not as smooth as the theoretical?

By changing the left-hand theoretical count to reflect the logarithmic formula from Benford’s Law, discuss how the shape improved the fit of the actual count.

Why would the right-hand digit’s distribution be approximately uniform (flat)?

Why would the left-hand digit’s distribution be roughly right-skewed? (see #5 below)

To better understand why there is a built-in bias for the lower digits in the left-hand distribution, scan the sorted populations of your state from low to high. Discuss why a city, as it grows in population, would remain with a left-hand digit of a 1 longer than a 2, or why longer with a 2 than a 3, etc. You may come to a better appreciation for the first digit phenomenon that occurs in certain kinds of data by noting how there is a 100% increase from 1 to 2 but then a dramatically tapering percentage thereafter. Fill in the rest of the table and discuss how this might apply to population changes in a town or city.

Based on your Internet research, discuss a practical application of Benford’s Law that interested you and why. Also, include what the Benford ratios are for digits 1 through 9.
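The Benford ratios for a leading digit d follow the logarithmic formula log10(1 + 1/d). The sketch below computes them, along with the percentage growth a population needs before its leading digit moves from d to the next digit, which is the tapering pattern the earlier discussion item describes.

```python
import math

# Benford's Law: probability that the leading (left-hand) digit is d.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Percentage growth needed to move from leading digit d to the next one,
# e.g. 1 -> 2 is a 100% increase, 2 -> 3 is a 50% increase.
growth_needed = {d: 100 / d for d in range(1, 10)}

for d in range(1, 10):
    print(f"digit {d}: Benford ratio = {benford[d]:.3f}, "
          f"growth to next digit = {growth_needed[d]:.1f}%")
```

A theoretical left-hand count for a state would be each ratio multiplied by the number of towns and cities in the list.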


Calculate the mean, median, and standard deviation for ounces in the bottles.

Statistics

Write a two to three (2-3) page report in which you:

1. Calculate the mean, median, and standard deviation for ounces in the bottles.
2. Construct a 95% Confidence Interval for the ounces in the bottles.
3. Conduct a hypothesis test to verify if the claim that a bottle contains less than sixteen (16) ounces is supported. Clearly state the logic of your test, the calculations, and the conclusion of your test.
4. Provide the following discussion based on the conclusion of your test:
a. If you conclude that there are less than sixteen (16) ounces in a bottle of soda, speculate on three (3) possible causes. Next, suggest strategies to avoid the deficit in the future.

Or

b. If you conclude that the claim of less soda per bottle is not supported or justified, provide a detailed explanation to your boss about the situation. Include your speculation on the reason(s) behind the claim, and recommend one (1) strategy geared toward mitigating this issue in the future.
