* Suppose you have a data set, for example:

webuse mheart0, clear

* And you want to test how well an estimator will work on sampled data from that data set.

* There are obviously many ways to do this.

* One way is to draw 10,000 observations from the data with replacement, generate a dependent variable, and then test how well your estimator works.

sum

* We want to number the observations so that we can draw from them, but first note that bmi has some missing values.
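
* As a small check that is not part of the original steps, the following should report exactly how many bmi values are missing:

count if missing(bmi)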

* For our purposes we could either drop the observations for which bmi is missing or linearly impute bmi.

* Let's just impute bmi:

reg bmi age smokes attack female hsgrad marstatus alcohol hightar

predict bmi_fill

replace bmi = bmi_fill if bmi==.

sum bmi

drop bmi_fill

* This might not be the best technique for imputing values for econometric estimation, but since all we want is to generate data with reasonable ranges for the explanatory variables, it should work fine.

* Now let's number our observations so that each one can be drawn by its index

gen draw_num = _n

di "There are " _N " observations available to draw from"

* Let us save the number of observations available to draw from

gl max_draw = _N

tempfile draw

save `draw'

* Now we are ready to start drawing our data.

* Imagine we would like to draw 10,000 observations:

clear

set obs 10000
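
* Optionally, set the random-number seed first so that the draws can be reproduced. The value 12345 is arbitrary and chosen only for illustration.

set seed 12345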

* Now for the only trick: we need to assign each new observation a draw_num that ranges from 1 to ${max_draw}

gen draw_num = int(runiform()*${max_draw})+1

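* This is a little tricky but it should do the job. runiform() returns a number between 0 and 1 that never reaches 1, so multiplying by max_draw gives a value from 0 up to (but not including) max_draw. int() then truncates it to an integer from 0 to max_draw-1, and adding 1 gives an integer from 1 to max_draw.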
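
* A quick sanity check, added here as a sketch: every draw_num should fall between 1 and max_draw. In Stata 14 or later, runiformint(1, ${max_draw}) should produce the same kind of draw in one step.

assert inrange(draw_num, 1, ${max_draw})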

sort draw_num

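* Sorting was necessary for merge in older versions of Stata. The merge m:1 syntax used below does not require sorted data, so the sort is harmless but optional.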

* Now we just need to merge in our temporary data set:

merge m:1 draw_num using `draw'
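
* It may be worth checking the merge result; this is a sketch rather than part of the original steps. Every drawn observation should have found its source row, so _merge==1 should never occur. A source row that was never drawn comes in with _merge==2 and no draw of its own, so it should be dropped.

assert _merge != 1

drop if _merge == 2

drop _merge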

* Summarize the resampled data

sum

* Now we can generate a dependent variable with known coefficients and see how well the estimator recovers them

* For example:

gen u = 50*rnormal()

gen y = -5 + attack - 3*smokes + .15*age - 2*bmi + u

reg y attack smokes age bmi
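
* One simple way to judge the result, added here for illustration, is to test the estimates jointly against the true values used to build y. If the estimator is working, this test should usually fail to reject.

test (attack = 1) (smokes = -3) (age = .15) (bmi = -2)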
