Useful programs
These programs should (and will) be in a repository on GitHub.
Replace top-coded variables with Pareto distribution mean
The values of a variable are top-coded above a threshold, and the variable is known (or thought) to follow a Pareto distribution in its upper tail. The Pareto distribution is estimated on a range of values just below the top-coding threshold (for example, on values in the 90th to 99th percentile bin if the top-coded values correspond to the top 1 percent). Top-coded values are then replaced with the mean of the fitted Pareto distribution above the threshold.
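The reasoning behind the adjustment can be written out as a short derivation. The notation here is an assumption of this note rather than something taken from the original program: $y_m$ is the Pareto scale, $\alpha$ the shape (the quantity the program stores as b), and $c$ the top-coding threshold (the global topcode). The formula for the mean only holds when $\alpha > 1$.

```latex
% Survival function of a Pareto variable Y with scale y_m and shape \alpha
\Pr(Y > y) = \left(\frac{y_m}{y}\right)^{\alpha}, \qquad y \ge y_m

% Taking logs gives a line in \ln y with slope -\alpha,
% which is what a regression of the log fraction above y on \ln y estimates
\ln \Pr(Y > y) = \alpha \ln y_m - \alpha \ln y

% Conditional on exceeding the threshold c, Y is again Pareto with scale c,
% so its mean (for \alpha > 1) is
\mathbb{E}[\,Y \mid Y > c\,]
  = \int_{c}^{\infty} y \,\frac{\alpha\, c^{\alpha}}{y^{\alpha+1}}\, dy
  = c\,\frac{\alpha}{\alpha - 1}
```

If the estimated shape is at or below 1, the factor $\alpha/(\alpha-1)$ is undefined or negative, so it is worth checking the regression slope before trusting the replaced values.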
This program is taken from the Blundell, Pistaferri and Preston (2017) paper. It needs three globals: year (the year to process), cens_var (the name of the top-coded variable) and topcode (the top-coding threshold).
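cap program drop pareto
program define pareto
    * Requires the globals year, cens_var and topcode to be set beforehand
    preserve
    keep if year == $year
    * Fraction of observations with a value strictly above each observation
    egen fr_gt_y = rank($cens_var), field
    replace fr_gt_y = fr_gt_y - 1
    egen tot_y = total($cens_var != .)
    replace fr_gt_y = fr_gt_y / tot_y
    * Keep the top decile of the observations below the top-coding threshold
    keep if $cens_var < $topcode
    xtile qy = $cens_var, nq(10)
    keep if qy == 10
    collapse fr_gt_y, by($cens_var)
    * Regress the log fraction on the log value; the slope is minus the Pareto shape
    replace fr_gt_y = ln(fr_gt_y)
    replace $cens_var = ln($cens_var)
    reg fr_gt_y $cens_var
    restore
    * Keep the shape in a local (not a variable) so the program can be rerun for another year
    local b = -_b[$cens_var]
    * Replace top-coded values with the Pareto mean above the threshold
    replace $cens_var = $cens_var * (`b' / (`b' - 1)) if $cens_var == $topcode & year == $year
end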
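A minimal usage sketch, assuming the program above has already been run so that pareto is defined. The variable name income, the year 1998 and the threshold 99999 are hypothetical values chosen for illustration; they should be replaced with those of the dataset in memory.

```stata
* Hypothetical inputs: adjust the variable name, year and threshold to your data
global year 1998
global cens_var income
global topcode 99999

* Fit the Pareto tail for that year and replace the top-coded values
pareto

* Inspect the result: the maximum should now lie above the old threshold
summarize income if year == 1998, detail
```

Because the inputs are passed through globals, they need to be reset before each call if the program is applied to several years or variables.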