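April 7 Miscellaneous notes for my own future reference; readers, please ignore
First Call: CUSIP is the header CUSIP; cig_est is unadjusted; the actual value is adjusted and may be missing; internally, security_id is more reliable; some fiscal-year-end adjustments are ignored; the split factor seems fairly accurate.
IBES: CUSIP is the historical CUSIP; merging unadjusted data with the split factor can leave small discrepancies; unadjusted actuals are fairly complete, but pends and fpestats sometimes disagree, which is odd, and using SQL drops quite a few records; the price in the summary statistics may be missing; iclink matches over 15% more than using the header CUSIP directly.
CRSP: iperm has been discontinued, for unknown reasons; finding permno through stocknames (with the historical CUSIP) or finding npermno through cstlink (with gvkey) both work reasonably well.
CompuSTAT: compnames works well for finding gvkey, with good coverage.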
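The lookups above all amount to joining records to a link table. Here is a minimal sketch in plain Python of a join that keeps the unmatched records visible instead of dropping them silently. The field names (ticker, permno) and all values are made up for illustration; they are not the actual IBES or CRSP layouts.

```python
def left_join(records, link, key):
    """Left-join records to a link table on `key`, keeping unmatched rows."""
    index = {row[key]: row for row in link}
    matched, unmatched = [], []
    for rec in records:
        hit = index.get(rec[key])
        if hit is None:
            unmatched.append(rec)          # an inner join would silently drop this row
        else:
            matched.append({**rec, **hit})
    return matched, unmatched


# Hypothetical toy data: forecasts keyed by ticker, and a ticker -> permno link table.
forecasts = [
    {"ticker": "AAA", "eps": 1.10},
    {"ticker": "BBB", "eps": 0.45},
    {"ticker": "CCC", "eps": 2.30},
]
link_table = [
    {"ticker": "AAA", "permno": 10001},
    {"ticker": "CCC", "permno": 10003},
]

matched, unmatched = left_join(forecasts, link_table, "ticker")
print(len(matched), "matched;", len(unmatched), "unmatched:", [r["ticker"] for r in unmatched])
```

Counting the unmatched rows after every join is a cheap way to see how much a given link table loses.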
SQL is nice, but it drops records easily; sometimes a merge is still safer.
September 19 Inertia and Ambiguity
The following is based on my interpretation. I cannot guarantee that it is 100% correct.
A long time ago (around 80 years), Frank Knight discussed the difference between uncertainty and risk. If there are (or you know) objective probabilities for the uncertain events, that's risk; otherwise, it's uncertainty.
In the 1950s, Savage developed the theory of subjective probability and expected utility (Savage's axioms). Under some assumptions, individuals facing uncertainty behave as if they subjectively assign a probability to each possible outcome and try to maximize their expected utility. Then risk and uncertainty become the same thing, and we need not worry about the difference between the two concepts. However, the axioms were questioned from the very beginning. Ellsberg wrote a paper in 1961 (QJE) and presented the famous Ellsberg Paradox. In the experiment, even Savage himself violated the axioms......
Inertia and ambiguity are two possible ways to model behavior when individuals face uncertainty, and they are quite different.
Inertia means the disutility will be very high if the outcome is lower than what the individuals have now. So we can also say that those individuals are uncertainty averse. They are not sure what is going to happen, and they hesitate to move unless it is very unlikely that the results will be worse than their current situation. In JF 2002 or 2003, there is a paper using inertia to explain corporate investment behavior.
Ambiguity means that the individuals do not know the exact distribution of the possible outcomes. They have a set of distributions, all of which are possible. Then the individuals' utility function depends on the worst possible distribution of the outcomes. (If you think this is unreasonable, you can think of their utility function as a weighted average of the worst case and the mean over all the distributions; that won't hurt the analysis.) In this situation, the individuals try to find the safest way to go. The paper presented here by O'Hara is based on this assumption.
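To make the worst-case rule concrete, here is a minimal sketch using the standard three-color version of the Ellsberg urn (30 red balls, 60 black or yellow in unknown proportion). The function names, the alpha weight, and the use of a simple unweighted mean across the distributions are my own assumptions for illustration, not taken from the paper.

```python
def expected_payoff(bet, probs):
    """Expected payoff of a bet that pays 1 on the colors in `bet`."""
    return sum(probs[c] for c in bet)


def maxmin_value(bet, priors, alpha=1.0):
    """alpha * worst-case value + (1 - alpha) * average value over the prior set."""
    values = [expected_payoff(bet, p) for p in priors]
    return alpha * min(values) + (1 - alpha) * sum(values) / len(values)


# Urn with 90 balls: 30 red, and 60 black or yellow in unknown proportion.
priors = [{"red": 30 / 90, "black": b / 90, "yellow": (60 - b) / 90} for b in range(61)]

v_red = maxmin_value({"red"}, priors)
v_black = maxmin_value({"black"}, priors)
v_red_yellow = maxmin_value({"red", "yellow"}, priors)
v_black_yellow = maxmin_value({"black", "yellow"}, priors)

# Worst-case evaluation prefers red to black, and black-or-yellow to red-or-yellow.
print(v_red > v_black, v_black_yellow > v_red_yellow)   # True True
```

No single subjective probability for black can produce both preferences at once, which is why this pattern violates Savage's axioms while the worst-case rule reproduces it.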
Some parts of the model are still unclear to me. Maybe I will know more after I discuss it with some professors later.
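August 17 About Quarterly Data
OK, I admit I'm behind the times and only just learned this. But I'm writing it down so that PhD students who don't know it yet can avoid the mistake.
In quarterly data, the income statement figures are for the quarter itself, but the statement of cash flows figures are cumulated from the start of the fiscal year. That is, if you want the cash flow for a given quarter, you need to take differences.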
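A minimal sketch of the differencing, assuming the year-to-date values for one firm and one fiscal year are already sorted from fiscal Q1 to Q4. The function name and the numbers are hypothetical.

```python
def ytd_to_quarterly(ytd_values):
    """Convert year-to-date values for fiscal quarters 1..4 into per-quarter values."""
    quarterly = []
    previous = 0.0                     # nothing accumulated before fiscal Q1
    for value in ytd_values:
        quarterly.append(value - previous)
        previous = value
    return quarterly


# Hypothetical cumulative operating cash flow for one firm-year, fiscal Q1..Q4.
ytd_cash_flow = [10.0, 25.0, 45.0, 70.0]
print(ytd_to_quarterly(ytd_cash_flow))   # [10.0, 15.0, 20.0, 25.0]
```

The differencing has to restart for every firm and every fiscal year; differencing across a fiscal year boundary would subtract last year's Q4 total from this year's Q1.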
Damn, this cost me two days.
August 10 Notes on Panel Data (5)
Generalized to Panel Data (cont'd)
(1) Trimmed
The basic idea is to make the obs symmetric. If we compare the obs from two periods, then without truncation or censoring the dependent variable should be symmetrically distributed around the 45-degree line. We then find the intersection of that line with the truncation (censoring) line, and throw away the obs whose symmetric point would fall outside the region.
Also, as in the cross-section case, we can get many different betas if we throw away enough obs. So we need an additional restriction that tries to keep as many obs as possible.
(2) Type II Tobit
The same idea as the cross-sectional pairwise difference: allowing for an individual-specific effect, we take the time difference where the probability is the same. Under certain conditions, this gives us a consistent estimate.
3. Cross Dependence (cont'd)
(2) Factor Approach
We assume there are some factors which determine the cross dependence. If we know all the factors, it is OK. If not, someone has proved that as N->infinity and T->infinity, we can still get consistent estimates. The problem is that as N->infinity, the dimension of the matrix that needs to be computed also goes to infinity......
(3) Someone (I forgot who.... Pesaran? If anyone remembers, please tell me) proposes a very smart approach. We can represent the factors by the cross-sectional means of the obs and the means of the individual factor loadings. Substituting this back into the original equation means that we only need to control for some additional variables, the cross-sectional mean of the dependent variable and the cross-sectional means of the independent variables, and then we avoid cross dependence. This idea is hard to explain without formulas. Sorry.
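Since the idea is hard to state in words, here is a rough sketch in LaTeX. It assumes the approach meant is what is usually called common correlated effects, with a single factor for simplicity. The notation ($\alpha_i$, $\gamma_i$, $f_t$, bars for cross-sectional means, and the coefficients $a_i$, $c_i$, $d_i$) is mine, not from the lecture.

```latex
% Model with one unobserved common factor f_t and individual loadings \gamma_i
y_{it} = \alpha_i + \beta' x_{it} + \gamma_i f_t + \varepsilon_{it}

% Average over i at each t (bars denote cross-sectional means)
\bar{y}_t = \bar{\alpha} + \beta' \bar{x}_t + \bar{\gamma} f_t + \bar{\varepsilon}_t

% For large N, \bar{\varepsilon}_t \approx 0, so if \bar{\gamma} \neq 0
f_t \approx \frac{\bar{y}_t - \bar{\alpha} - \beta' \bar{x}_t}{\bar{\gamma}}

% Substituting back gives a regression augmented with the cross-sectional means
y_{it} = a_i + \beta' x_{it} + c_i \bar{y}_t + d_i' \bar{x}_t + u_{it}
```

The unobserved factor is replaced by observable averages, so no large covariance matrix has to be specified or estimated.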
5. Concerns of Dynamic Model
All of the above can only deal with static models. When we try to generalize Powell's (Honoré's) method to a dynamic model with fixed effects, we first need more data. For example, we need at least 4 periods of observed data for the same individual to cancel out the individual effect.
A more severe problem is that when we try to trim the obs, because of the dynamic structure, we have one inequality but two parameters to be determined. Then how can we get a consistent result? What criteria should we follow to determine the values of the two parameters?
End of The Course.
August 7 Notes on Panel Data (4)
4. Truncated and Censored Data (Sample Selection Model)
Back to linear model, only cross-sectional data
(1) Because the observations we get are not randomly chosen from the population, the basic assumption of OLS (E[e|x]=0) is violated, and we cannot get consistent estimates by OLS.
(2) MLE method.
The problem with this method is that the computation is very complicated. We can only use iteration to get the estimates.
(3) Heckman Two Stage
The basic idea of this method is to control for the correlation between X and the residuals. In the first stage, we estimate a discrete choice model of whether the data is censored. The so-called inverse Mills ratio can be estimated from that model. Then we control for that ratio in our main model to get consistent estimates.
Limitation: Can only deal with censored data (otherwise we don't have a choice model)
Assuming the residual of choice model is normally distributed
The ratio is highly correlated with X (the independent variables), which results in a multicollinearity problem in the OLS estimation. This problem makes the estimates very sensitive (unstable).
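A minimal sketch of the ratio itself, assuming a probit first stage so that the ratio is lambda(z) = phi(z) / Phi(z) evaluated at the first-stage index z. The grid of index values is hypothetical. It also shows why collinearity arises: over a typical range the ratio is almost a linear function of the index.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z), evaluated at the first-stage index z."""
    return normal_pdf(z) / normal_cdf(z)


def correlation(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical first-stage index values on a grid from -1 to 1.
index = [i / 10.0 for i in range(-10, 11)]
ratio = [inverse_mills(z) for z in index]

print(round(inverse_mills(0.0), 4))          # phi(0) / Phi(0)
print(round(correlation(index, ratio), 3))   # strongly negative: nearly linear in the index
```

If the first-stage index is built from the same X as the main equation, the added ratio is then close to a linear combination of X, which is the instability described above.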
(4) Powell Trimming Symmetric Method
Powell proposes a very smart idea. When the data are truncated or censored on one side, we throw away the data beyond the same distance on the other side to make the distribution of the residuals symmetric, and this restores the basic assumption of OLS. This method only needs the distribution of the residual conditional on X to be symmetric.
Then the estimated beta needs to satisfy (i) we throw away the data with y > 2*beta*X, and (ii) beta is the OLS estimate for the remaining sample.
Unfortunately, this problem does not have a unique solution. For example, let beta=0: we throw away all the obs, and this beta trivially satisfies the OLS condition (no obs at all......). So we impose another condition to ensure a consistent estimate.
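A toy check of the two conditions, as a sketch under strong simplifications: one regressor, no intercept, truncation at y > 0, and errors placed on a grid that is exactly symmetric around zero. All names and numbers are made up for illustration.

```python
def ols_slope(xs, ys):
    """OLS slope through the origin; returns None for an empty sample."""
    denom = sum(x * x for x in xs)
    if denom == 0:
        return None
    return sum(x * y for x, y in zip(xs, ys)) / denom


def trim(xs, ys, beta):
    """Drop observations with y > 2 * beta * x (symmetric trimming)."""
    kept = [(x, y) for x, y in zip(xs, ys) if y <= 2 * beta * x]
    return [x for x, _ in kept], [y for _, y in kept]


# Toy data: y = 1.0 * x + e, with e on a grid symmetric around 0, truncated at y > 0.
true_beta = 1.0
errors = [-1.25, -0.75, -0.25, 0.25, 0.75, 1.25]
sample = [(x, true_beta * x + e) for x in (1.0, 2.0) for e in errors]
sample = [(x, y) for x, y in sample if y > 0]      # truncation: y <= 0 is never observed
xs = [x for x, _ in sample]
ys = [y for _, y in sample]

naive = ols_slope(xs, ys)                          # biased upward by the truncation
xs_t, ys_t = trim(xs, ys, true_beta)
trimmed = ols_slope(xs_t, ys_t)                    # reproduces true_beta: a fixed point
xs_0, ys_0 = trim(xs, ys, 0.0)                     # beta = 0 throws away every obs

print(naive, trimmed, len(xs_0))
```

Trimming at the true beta gives back the true beta, while beta = 0 leaves an empty sample, which is the non-uniqueness that the extra condition has to rule out.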
There is a widely used condition. Its idea is to make the estimated line pass through the middle of the sample, which prevents beta from being too small. This condition ensures the consistency of the estimates, but says nothing about efficiency. My first impression was that the condition tries to keep as many obs as possible in the remaining data to get more information. But it turns out we are not sure about that.
(5) Type II Tobit Model
This model deals with a two-step choice. For example, in automobile consumption, the consumer first decides whether to buy, and if he buys, how much to spend.
The Heckman two-stage method can deal with this kind of problem.
Powell proposes another method, called pairwise difference estimation. The basic idea is that for the same value of the estimated probability in the choice model, we can take the cross-sectional difference to eliminate the selection term in the residual, so that we do not need to worry about truncation or censoring.
If we require the difference in probabilities to be exactly 0, we may have very few obs; on the other hand, if we allow the difference to be too large, we do not get a consistent estimate. So we need to trade off efficiency and consistency. We put a kernel function there (standard normal density?).
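A minimal sketch of the kernel weighting, assuming one regressor and a standard normal density as the kernel. The data, the selection term g, and the bandwidth values are hypothetical; this is only meant to show how the weights make the selection term cancel.

```python
import math


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def pairwise_difference_slope(x, y, p, bandwidth):
    """Kernel-weighted least squares on pairwise differences.

    Pairs with similar first-stage probability p get large weight, so the
    selection term (a function of p) roughly cancels in y_i - y_j.
    """
    num = den = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            w = gaussian_kernel((p[i] - p[j]) / bandwidth)
            dx, dy = x[i] - x[j], y[i] - y[j]
            num += w * dx * dy
            den += w * dx * dx
    return num / den


# Toy data: y = 2 * x + g(p), where the selection term g depends only on p.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
p = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
g = {0.2: 0.0, 0.8: 5.0}                 # hypothetical selection correction
y = [2.0 * xi + g[pi] for xi, pi in zip(x, p)]

tight = pairwise_difference_slope(x, y, p, bandwidth=0.01)    # only equal-p pairs count
loose = pairwise_difference_slope(x, y, p, bandwidth=100.0)   # all pairs count about equally
print(tight, loose)
```

With the tight bandwidth only pairs with equal p contribute and the slope of 2 is recovered; with the loose bandwidth the selection term leaks in and the slope is pulled away from 2.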
Generalized to Panel Data
We take the time-series difference, and truncate the obs based on the same idea as Powell's.
August 4 Notes on Panel Data (3)
I couldn't follow all the ideas today, -_-!.
2. Discrete Choice Model
(3) Manski Maximum Score
Only for static model, can't deal with time invariant variables either.
(4) Horowitz's Smoothed Maximum Score
A modification of Manski's method: the sign function is written as a step function, which is then approximated by a continuous function that becomes identical to the step function in the limit. Then we can take derivatives of that continuous function.
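A minimal sketch of the smoothing step, assuming two regressors with the first coefficient normalized to 1 and a logistic CDF as the smooth function. Both choices, and the toy data, are my own assumptions for illustration.

```python
import math


def step_score(b, data):
    """Maximum-score-type objective: sum of (2y - 1) * 1[x1 + b * x2 >= 0]."""
    return sum((2 * y - 1) * (1.0 if x1 + b * x2 >= 0 else 0.0) for x1, x2, y in data)


def smoothed_score(b, data, h):
    """Same objective with the indicator replaced by a logistic CDF of (x1 + b * x2) / h."""
    def smooth_step(u):
        return 0.5 * (1.0 + math.tanh(0.5 * u))     # logistic CDF, stable for large |u|
    return sum((2 * y - 1) * smooth_step((x1 + b * x2) / h) for x1, x2, y in data)


# Hypothetical observations (x1, x2, y) with a binary outcome y.
data = [
    (1.0, 0.5, 1),
    (-1.0, 0.2, 0),
    (0.5, -1.0, 0),
    (-0.3, 1.0, 1),
    (2.0, -0.5, 1),
]

print(step_score(1.0, data))
for h in (1.0, 0.1, 0.001):
    print(h, smoothed_score(1.0, data, h))   # approaches the step score as h shrinks
```

The smoothed objective is differentiable in b for any h > 0, so gradient-based optimizers can be used, and it converges to the step objective as h goes to 0.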
(5) Bias-Reduced Estimate
I cannot quite follow the ideas of this method. IT SEEMS (I am not sure of the statements that follow) that the basic idea is to get a better (lower variance) estimate instead of a consistent estimate. In the previous methods, we try to get consistent estimates while putting lots of restrictions on the data. In this method, we allow a biased estimate and try to balance bias and efficiency.
The basic idea of this method comes from a forthcoming paper in the Journal of Econometrics, and IT SEEMS that the author still has not developed a good estimate of the optimal bias. So I do not expect this method to be used in empirical research for several years.
Summary
At present we cannot get consistent estimates of the coefficients of time-invariant variables (for example, gender) in a discrete choice model if we add fixed effects to the model.
It is also very difficult to deal with dynamic model.
3. Cross Dependence
If we have cross dependence in the residuals, we won't get consistent estimates.
(1) Spatial Approach
I also get lost in this part, //blush.
The basic idea is that we can pre-specify the cross correlation, only the proportion of the correlation is unknown.
But this approach still has lots of problems. When N is small (small cross-section) and T is large (long time period), we can easily estimate the correlations directly. When N is large, it is too hard to pre-specify the covariance matrix.
Someone proposes decomposing the N units into R groups, with cross dependence within the groups but not across the groups. Then it is easier to pre-specify the covariance matrix. The critical condition for a consistent estimate in this method is R/N->0 as N goes to infinity. But if so, the data will also satisfy the so-called "mixing condition (?)", and can be estimated by other easier methods (?). I got lost here.
August 2 Notes on Panel Data (2)
2. Discrete Choice Model
We still have no way to deal with the time effect.
For the cross-sectional effect, if the panel is too thin (short time series), we cannot get consistent estimates with MLE (with short T, the incidental effects cannot be estimated consistently, and their estimates are correlated with those of the structural coefficients).
Transformation (getting consistent estimates)
(1) Static Model
The assumption needed: at least some switching of the decision
Limitations: Can not get coefficients of time invariant variables
If the differences of the variables do not vary much in the cross-section, we cannot get accurate estimates.
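A sketch in LaTeX of why switching is needed, assuming the transformation meant here is the conditional logit with two periods. The notation ($\alpha_i$ for the individual effect, $x_{it}$, $\beta$) is mine.

```latex
% Fixed-effects logit, t = 1, 2
P(y_{it} = 1 \mid x_{it}, \alpha_i)
  = \frac{\exp(\alpha_i + x_{it}'\beta)}{1 + \exp(\alpha_i + x_{it}'\beta)}

% Conditioning on a switch (y_{i1} + y_{i2} = 1) removes \alpha_i
P(y_{i1} = 0,\ y_{i2} = 1 \mid y_{i1} + y_{i2} = 1)
  = \frac{\exp\big((x_{i2} - x_{i1})'\beta\big)}{1 + \exp\big((x_{i2} - x_{i1})'\beta\big)}
```

Individuals who never switch contribute nothing to this conditional likelihood. Only the difference $x_{i2} - x_{i1}$ enters, so a time-invariant variable drops out, and if the differences barely vary across individuals there is little information about $\beta$.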
(2) Dynamic Model
The condition needed: at least 4 (3) time periods
The variable is constant in two consecutive periods, but different from the period before these two.
(A shock to the independent variable?)
very restricted
(3) Manski Maximum Score
A semiparametric model (sign function)
Do not put many restrictions on the data
Limitations: We cannot get exact coefficients (the results multiplied by any positive constant also satisfy the condition)
We can only get the relative coefficients. But because of the nonlinearity of the model, they do not tell us much.
The objective function is not continuous (sign function), so we cannot take derivatives of it (hard to solve).
The estimator converges rather slowly (rate N^(1/3)).
Notes for Panel Data (1)
Heihei. I find that I can write some course notes here to remind myself, because I never take notes in class. Also, if someone can give some suggestions or point out some mistakes, I can learn something. And I think the notes can provide some positive externalities.
1. Linear Model
We can control for both time and cross-sectional effects (no variation in the structural coefficients here).
Random Effect Model Vs. Fixed Effect Model
REM: Needs knowledge of the data generating process (usually we assume the effects are unrelated to the independent variables), but the number of parameters is fixed (more degrees of freedom)
FEM: Does not need knowledge of the data generating process, but the number of parameters increases as the number of observations increases (loss of degrees of freedom)
My interpretation: the effect is generated by some process. If we know the model specification, we can use REM to exploit more of the information in the observations; FEM in effect uses the realization rather than the real process to determine the effects (so we need to use the information in the observations to learn the realization).
Transformation: a linear model is easy to transform to cancel out the incidental effects, and the transformed model is essentially equivalent to FEM.
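A minimal sketch of such a transformation, assuming it is the usual within (demeaning) transformation with a single regressor. The toy panel, the effect values, and the function names are hypothetical.

```python
def within_slope(panel):
    """Fixed-effects (within) slope for one regressor.

    `panel` maps each individual to a list of (x, y) pairs over time.
    Demeaning by individual removes the individual effect before running OLS.
    """
    num = den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


def pooled_slope(panel):
    """Pooled OLS slope (with a common intercept), ignoring individual effects."""
    pairs = [pair for obs in panel.values() for pair in obs]
    x_bar = sum(x for x, _ in pairs) / len(pairs)
    y_bar = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    den = sum((x - x_bar) ** 2 for x, _ in pairs)
    return num / den


# Toy panel: y = effect_i + 2 * x, with effects correlated with x (hypothetical numbers).
effects = {"a": 0.0, "b": 10.0}
xs = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}
panel = {i: [(x, effects[i] + 2.0 * x) for x in xs[i]] for i in effects}

print(within_slope(panel), pooled_slope(panel))
```

Demeaning recovers the slope of 2 without estimating the effects themselves, while pooled OLS is biased because the effects are correlated with x. This is the case where the random effects assumption would fail.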