Zekun Blog

Across the Great Wall, we can reach every corner in the world.

Think Strategically and Get Jobs - H1B Visa Analysis

Helping international data science students strategize their job search

H1B: The H1B visa is a non-immigrant visa that allows companies in the US to hire graduate-level workers in specialty occupations that require theoretical or technical expertise in specialized fie...

As data scientists, what should we do?

Data science application analysis, using product launch processes as an example

Slides. Summary: Data science work has three major parts. Use EXPLORATION to turn the unknown into the known. Use INFERENCE to help find something new from the old. Use PREDICTION to make be...

Power and Sample Size Calculations for Correlational Studies

Probability and Statistical Inference - 10

A common research objective is to demonstrate that two measurements are highly correlated. One measurement, call it A, may reflect the severity of disease but is difficult or costly to collect. Ano...

Central Limit Theorem - Approximation

Probability and Statistical Inference - 09

Import the packages first.
library(magrittr)
library(sn)
Situation Description: The central limit theorem is an important computational shortcut for generating and making inferences from the samp...

What makes a trending video?

An analysis of trending YouTube videos

Slides. Introduction: YouTube is one of the biggest media platforms of the 21st century. 5 billion videos are watched daily. 500 hours of video are uploaded every minute. 51% of users say that...

Simulation Method Comparison

Probability and Statistical Inference - 08

Situation Description: This time, I will perform a 2 × 4 × 2 factorial simulation study to compare the coverage probability of various methods of calculating 90% confidence intervals. The three fac...

Coverage Probability of MLE

Probability and Statistical Inference - 07

Introduction: In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the ass...

Which quantiles of a continuous distribution can one estimate with more precision?

Probability and Statistical Inference - 05

Introduction: The median is an important quantity in data analysis. It represents the middle value of a data distribution. Estimates of the median, however, have a degree of uncertainty because (a)...

Visualization Cheat Sheet

A quick reference for efficient data visualization

Created by Zekun Wang. Based on Storytelling With Data. Download: PDF version

If home-field advantage exists, how much of an impact does it have on winning the World Series?

Probability and Statistical Inference - 04

Introduction: In team sports, the term home advantage – also called home ground, home field, home-field advantage, home court, home-court advantage, defender’s advantage or home-ice advantage ...