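CITS 4009 Lab 3 - Exploring Data

General Instructions

  • Your lab sheets will be structured with complementary information. The labs will closely follow the structure of the book "Practical Data Science with R" by Nina Zumel and John Mount.
  • For each lab you are expected to answer all the questions presented with a question number.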

Aims of the Lab

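In this lab you will learn and practice how to get a sense of (explore) data using various visualization and summary-statistics techniques before modeling. In particular, we will cover the following objectives:

  • Using summary statistics to explore data
  • Exploring data using visualization
  • Finding problems and issues during data exploration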

We will use the same dataset and example as the reference book (https://github.com/WinVector/zmPDSwR/tree/master/Custdata). A copy of the dataset is in the ‘data’ subfolder of your working directory. Open the data file in text

Preliminaries

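The aim of the ‘custdata’ example is to build a model that predicts which customers do not have health insurance. You have identified and collected data (this has been done for you) that may help you build a model to achieve this goal.

  • Q1 Read the data in custdata.tsv, stored in the ‘data’ subfolder, into a variable called ‘custdata’. Hint: use the read.table() function.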
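
As a minimal sketch of loading a tab-separated file, the block below writes a tiny made-up table to a temporary file and reads it back with read.table(). The temporary file, its four columns, and the name toy are assumptions made so that the example runs on its own; they are not the real custdata file.

```r
# Stand-in for the real file: write a tiny tab-separated table to a temp file.
# The column names mirror a few custdata fields; the values are made up.
toy_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "custid\tsex\tincome\tage",
  "1\tM\t22000\t24",
  "2\tF\tNA\t82",
  "3\tF\t-500\t31"
), toy_path)

# header = TRUE: the first line holds the column names.
# sep = "\t": fields are separated by tabs.
toy <- read.table(toy_path, header = TRUE, sep = "\t")

class(toy)  # the kind of object read.table() returns
str(toy)    # column types and the first few values
```

For the lab data, the same call would presumably point at the file in the ‘data’ subfolder instead of the temporary path.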

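A dataset may contain inconsistencies and may not be ready to use as-is for a given task because of missing values, outliers, and irrelevant values. Hence we use data exploration to get an idea of what a dataset contains.

  • Q2 Find the type of the ‘custdata’ variable using the class command.

  • Q3 In production environments, data are usually stored in SQL databases. Why does a typical data exploration task in R use R's built-in data structures rather than SQL queries?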

1. Working with Summary Statistics

To get a quick understanding of a dataset, we can use summary statistics.

  • Q4 Use the following code snippet to get the summary of the customer data.

    summary(custdata)

     custid        sex     is.employed         income                   marital.stat health.ins                            housing.type recent.move      num.vehicles  
 Min.   :   2068   F:440   Mode :logical   Min.   : -8700   Divorced/Separated:155   Mode :logical   Homeowner free and clear    :157   Mode :logical   Min.   :0.000  
 1st Qu.: 345667   M:560   FALSE:73        1st Qu.: 14600   Married           :516   FALSE:159       Homeowner with mortgage/loan:412   FALSE:820       1st Qu.:1.000  
 Median : 693403           TRUE :599       Median : 35000   Never Married     :233   TRUE :841       Occupied with no rent       : 11   TRUE :124       Median :2.000  
 Mean   : 698500           NA's :328       Mean   : 53505   Widowed           : 96   NA's :0         Rented                      :364   NA's :56        Mean   :1.916  
 3rd Qu.:1044606                           3rd Qu.: 67000                                            NA's                        : 56                   3rd Qu.:2.000  
 Max.   :1414286                           Max.   :615000                                                                                               Max.   :6.000  
                                                                                                                                                        NA's   :56     
      age              state.of.res is.employed.fix   
 Min.   :  0.0   California  :100   Length:1000       
 1st Qu.: 38.0   New York    : 71   Class :character  
 Median : 50.0   Pennsylvania: 70   Mode  :character  
 Mean   : 51.7   Texas       : 56                     
 3rd Qu.: 64.0   Michigan    : 52                     
 Max.   :146.7   Ohio        : 51                     
                 (Other)     :600                     

  • Q5 From the output of Q4, answer the following:
  • 5.1 Are there invalid values in the ‘income’ summary? If so, name the field and give reasons for your conclusion.
  • 5.2 How many missing values are there in ‘is.employed’? How significant is this number as a percentage of the data?
  • 5.3 Comment on how to interpret the minimum, average, and maximum age of a person from the summary statistics. Are they plausible values?

1.1 Problems revealed by summary statistics

  • Missing Values
  • Q6 Which fields in custdata have a common number of missing values? Are the missing values significant as a percentage of the data?

  • Q7 Compare the percentage figures obtained from Q5.2 and Q6. What is your strategy to deal with missing values in each case?

  • Outliers and Invalid Values: outliers are data points that fall well outside the range you expect your data to be in.
  • Q8 Comment on what you observe in the summary of the income field.
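
The two kinds of problem above can also be checked numerically. The sketch below counts missing values per column and flags rows with implausible values. It uses a small invented data frame (toy) rather than custdata, and the cut-offs for "implausible" are assumptions chosen for illustration.

```r
# Toy stand-in for custdata; values are invented for illustration only.
toy <- data.frame(
  age          = c(0, 24, 51, 146.7, 38, 67),
  income       = c(-8700, 22000, 35000, 615000, 14600, NA),
  is.employed  = c(NA, TRUE, TRUE, NA, FALSE, NA),
  num.vehicles = c(1, NA, 2, 2, NA, 0)
)

# Number and percentage of missing values per column.
na_count   <- colSums(is.na(toy))
na_percent <- round(100 * colMeans(is.na(toy)), 1)
print(data.frame(na_count, na_percent))

# Rows whose values fall outside a plausible range.
# The cut-offs (income < 0, age <= 0 or age > 100) are assumptions.
suspect <- with(toy, (!is.na(income) & income < 0) |
                     (!is.na(age) & (age <= 0 | age > 100)))
print(toy[suspect, ])
```

Columns that share the same missing-value count are worth a second look, since the values may be missing for the same rows and for the same reason.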

2. Working with Visualizations

Spotting problems using graphics and visualisations

  • Q9 Name three graph visualisation packages in R.

  • Q10 Briefly write William Cleveland’s principles for scientific visualisations.

2.1 Visually checking distributions for a single variable

The visualisations we discuss in this section can answer questions like

  • What is the peak value of the distribution?
  • How many peaks are there in the distribution (unimodality versus bimodality)?
  • How normal (or lognormal) is the data?
  • How much does the data vary? Is it concentrated in a certain interval or in a certain category?

Histograms

A histogram discretizes the range of a variable into bins and presents the frequency of each bin as a visualisation.
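
To make the binning step concrete, the sketch below counts observations per bin for two different bin widths without drawing anything. The ages are randomly generated for illustration (an assumption, not the custdata values), and base R's hist() is used only to do the counting.

```r
set.seed(1)
# Invented ages for illustration; not the custdata values.
ages <- round(runif(200, min = 18, max = 90))

# Count observations per bin for two bin widths, without drawing anything.
bins_5  <- hist(ages, breaks = seq(15, 90, by = 5),  plot = FALSE)
bins_15 <- hist(ages, breaks = seq(15, 90, by = 15), plot = FALSE)

bins_5$counts    # 15 narrow bins: more detail, more noise
bins_15$counts   # 5 wide bins: smoother, less detail
```

Both sets of counts add up to the same total; only the grouping changes, which is why the choice of bin width affects the shape you see.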

  • Q11 Use the following code to generate the histogram of the age variable of the customer data.
  • 11.1 What does the binwidth parameter mean?
  • 11.2 Change the binwidth value to 2 and describe what happens to the histogram shape.
  • 11.3 Change the binwidth value to 10 and describe what happens to the histogram shape.
  • 11.4 Are there disadvantages of using histograms?

    library(ggplot2)
    ggplot(custdata) +
      geom_histogram(aes(x=age), binwidth=5, fill="gray")

  • Q12 Children under 5 years do not use healthcare, and people rarely live beyond 100 years. Based on these statements and the histogram in Q11, can you identify outliers and invalid values?

Density Plots

A density plot can be used to quickly get an idea of the distribution: whether the data is concentrated in one area or spread out.

  • Q13 Use the following code to generate the density plot for the income variable.
  • Give a rough estimate of the income range in which most of the population is concentrated. If you want to look more closely at this part of the population, you can use a logarithmic scale (e.g. scale_x_log10).
  • How many subpopulations can be found in the income data?

    library(ggplot2)
    library(scales)
    ggplot(custdata) + geom_density(aes(x=income)) +
      scale_x_continuous(labels=dollar)
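
The effect of a logarithmic scale can also be explored numerically. The sketch below, on invented right-skewed incomes (an assumption, not the custdata values), estimates the density on the raw scale and on the log10 scale and reports where each estimate peaks. Only positive values are used, because the logarithm of zero or a negative income is undefined.

```r
set.seed(2)
# Invented, right-skewed incomes (log-normal); not the custdata values.
income <- rlnorm(500, meanlog = log(40000), sdlog = 0.8)

# Kernel density estimate on the raw scale and on the log10 scale.
d_raw <- density(income)
d_log <- density(log10(income))

# Location of the highest peak on each scale, reported in dollars.
peak_raw <- d_raw$x[which.max(d_raw$y)]
peak_log <- 10^(d_log$x[which.max(d_log$y)])
round(c(peak_raw = peak_raw, peak_log = peak_log))
```

On skewed data the two peaks generally differ, because the log scale spreads out the crowded low end and compresses the long upper tail.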

Bar Charts

A bar chart is a histogram for discrete data.

  • Q14 Use the following code snippet, along with the code from the earlier questions, to generate the bar plot for marital status. If you are going to use marital status as one of the modeling variables for health insurance, it is worth checking that each category of such a variable is well represented in the population.

    ggplot(custdata) + geom_bar(aes(x=marital.stat), fill="gray")

  • Q15 Draw the bar chart for the state.of.res variable in custdata.
  • Flip the chart to a horizontal view using coord_flip().

  • Q16 In what circumstances are bar charts advantageous? Consider the type and nature of the data in your answer.
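
The bar heights in these charts are simply counts per category. The sketch below computes them directly for a handful of invented values (an assumption, not the custdata column) and orders the categories by frequency, which can make a chart with many categories easier to read.

```r
# Invented categorical values for illustration; not the custdata column.
state <- c("Ohio", "Texas", "Ohio", "California", "Texas",
           "California", "California", "New York")

# A bar chart draws one bar per category, with height equal to this count.
counts <- table(state)

# Sorting the counts first makes the most common categories easy to spot.
sorted_counts <- sort(counts, decreasing = TRUE)
sorted_counts

# A factor whose levels follow that order; plotting functions generally
# draw bars in level order.
state_ordered <- factor(state, levels = names(sorted_counts))
levels(state_ordered)
```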

2.2 Visually checking relationships between two variables

The visualisations we discuss in this section can answer questions like

  • Is there a relationship between the two inputs age and income in custdata?
  • What kind of relationship, and how strong?
  • Is there a relationship between the input marital status and the output health insurance? How strong?

Line Plots

Line plots work best when the relationship between two variables is relatively clean.

  • Q17 Use the following code to generate a line plot for synthetic data.

    x <- runif(100)
    y <- x^2 + 0.2*x
    ggplot(data.frame(x=x,y=y), aes(x=x,y=y)) + geom_line()

Scatter Plots and Smoothing Curves

Sometimes the relationship between two variables is not as clean (that is, not as strongly correlated) as in the synthetic data we generated in Q17. We can use the correlation summary statistic to measure the relationship between two variables. Scatter plots provide further information.

  • Q18 Use the following code to filter a sensible subset of the data from custdata. Then find the correlation between age and income using the cor() function.

    custdata2 <- subset(custdata, (custdata$age > 0 & custdata$age < 100 & custdata$income > 0))
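
As a small illustration of filtering followed by a correlation, the sketch below applies the same kind of filter to an invented data frame (toy, an assumption, not custdata) and computes the Pearson correlation on the rows that remain.

```r
# Invented data for illustration; not the custdata values.
toy <- data.frame(
  age    = c(0, 23, 35, 47, 52, 61, 150, 29, 44, 70),
  income = c(1000, 18000, 42000, 55000, 61000, 48000, 30000, -200, 39000, 27000)
)

# Keep only rows with a plausible age and a positive income,
# mirroring the filter used for custdata2.
toy2 <- subset(toy, age > 0 & age < 100 & income > 0)
nrow(toy2)

# Pearson correlation: +1 or -1 is a perfect linear relationship, 0 is none.
r <- cor(toy2$age, toy2$income)
r
```

If either column contained missing values, cor() would return NA unless told how to handle them, for example with use = "complete.obs".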

  • Q19 Create a scatter plot to explore the relationship between income and age using the following code. In addition to the scatter plot, the code draws a smoothing line that shows the (linear) relationship between the two variables.
  • Is the smoothing line helpful for seeing the relationship between the two variables in this example?

    ggplot(custdata2, aes(x=age, y=income)) + geom_point() +
      stat_smooth(method="lm") +
      ylim(0, 200000)

  • Q20 Draw the scatter plot and the smoothing curve without specifying the smoothing function, as follows.
  • What is the difference between the smoothing lines in Q19 and Q20?
  • What does the (shaded) ribbon around the smoothing curve mean?

    ggplot(custdata2, aes(x=age, y=income)) +
      geom_point() + geom_smooth() +
      ylim(0, 200000)
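
One way to explore how a straight-line fit differs from a flexible smoother is to fit both directly. The sketch below does this in base R on invented data with a deliberately curved relationship (an assumption, not custdata). The use of loess here is an illustrative choice of flexible smoother, not a statement about what any particular plotting call uses.

```r
set.seed(4)
# Invented data with a curved relationship; not the custdata values.
age    <- runif(300, min = 20, max = 80)
income <- 60000 - 40 * (age - 50)^2 + rnorm(300, sd = 8000)
toy    <- data.frame(age, income)

# Straight-line fit, the kind of curve method = "lm" draws.
fit_lm <- lm(income ~ age, data = toy)

# Local regression (loess), one common choice of flexible smoother.
fit_lo <- loess(income ~ age, data = toy)

# Compare the two fits at a few ages.
grid <- data.frame(age = c(25, 50, 75))
cbind(grid,
      lm    = predict(fit_lm, newdata = grid),
      loess = predict(fit_lo, newdata = grid))

# Pointwise 95% confidence interval for the straight-line fit.
ci <- predict(fit_lm, newdata = grid, interval = "confidence", level = 0.95)
ci
```

The straight line cannot follow the rise and fall in the middle of the age range, while the local fit can. The interval columns (lwr, upr) indicate how uncertain the fitted value is at each age.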

  • Q21 Change the scatter plot code in Q20 to plot health.ins and age. Comment on the shape and direction of the smoothing line.

Bar Charts for two categorical variables

We can use bar charts for two categorical variables to represent probabilities.

  • Q22 Draw two types of bar chart to present health insurance against marital status using the following code.
  • How do you interpret the height of the bars in each plot?
  • What are the advantages of drawing bar charts for two categorical variables? (Use these charts in your explanation.)

    ggplot(custdata) + geom_bar(aes(x=marital.stat, fill=health.ins))
    ggplot(custdata) + geom_bar(aes(x=marital.stat, fill=health.ins), position="dodge")
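
The numbers behind such charts can be tabulated directly. The sketch below, on a few invented rows (an assumption, not custdata), builds the two-way table of counts and then converts each row to proportions, which is one way to read the charts as probabilities.

```r
# Invented values for illustration; not the custdata columns.
toy <- data.frame(
  marital.stat = c("Married", "Married", "Married", "Never Married",
                   "Never Married", "Widowed", "Widowed", "Married"),
  health.ins   = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Counts for each combination: these are the bar (segment) heights.
counts <- table(toy$marital.stat, toy$health.ins)
counts

# Row-wise proportions: within each marital status, the share with
# and without insurance.
props <- prop.table(counts, margin = 1)
round(props, 2)
```

Each row of the proportion table sums to 1, so categories of very different sizes can be compared on the same footing.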
