# Automated Table generation in Stata and integration into LaTeX (1)

I use estout to generate tables of summary statistics and regression results that can be easily imported into LaTeX. The advantage is that the whole system is dynamic. If you change something in your do-file (e.g. you omit a particular group), you don’t have to change anything: the results get automatically updated in LaTeX. That has saved me a lot of time, but setting everything up took a long long time. So hopefully this post is helpful for aspiring applied econometricians who want to automate output reporting from Stata into LaTeX.

I think it is easiest to explain everything with examples, for which I use some tables from my current working paper. First install estout in Stata (ssc install estout), then we can jump right into the examples. I explain three things: creating tables for 1. descriptive statistics and 2. regression output and 3. putting everything into LaTeX. Ready? Let’s start.

Edit: Have a look at my follow-up post if you encounter any problems!
Edit 2: Another follow-up to improve the design.

1. Descriptive Statistics

The principle of estout is simple: you run a command in Stata that generates some statistics, you tell estout to (temporarily) store those results and then you create a table.

Consider Table 2, which is simply a set of summarised variables, split into three categories. You cannot see it, but these are actually three tables appended together. Why? Because the first part (Age to Housing) consists of percentages and therefore has 2 decimal places, the second part (Household Finances) contains income variables with 0 decimal places, and the last part has 2 decimal places again. The complete code that generates this table is then:

* TABLE 2

estpost su $dem if coholder100==1
est store A

estpost su $dem if coholder500==1
est store B

estpost su $dem if coholder1500==1
est store C

esttab A B C using table2.tex, replace ///
refcat(age18 "\emph{Age}" male "\emph{Demographics}" educationage "\emph{Education}" employeddummy "\emph{Employment}" oowner "\emph{Housing}", nolabel) ///
mtitle("> \pounds100" "> \pounds500" "> \pounds1500") ///
cells(mean(fmt(2))) label booktabs nonum collabels(none) gaps f noobs

estpost su $fin if coholder100==1
est store A

estpost su $fin if coholder500==1
est store B

estpost su $fin if coholder1500==1
est store C

esttab A B C using table2.tex, append ///
refcat(hhincome "\emph{Household Finances}", nolabel) ///
nomtitles ///
cells(mean(fmt(0))) label booktabs nonum f collabels(none) gaps noobs plain

estpost su $risk if coholder100==1
est store A

estpost su $risk if coholder500==1
est store B


• Yes, I think this is what I mean. Can I then use the generated “global” variable for the refcat command?

• I don’t know, I just use refcat with one particular variable. Why don’t you try it and report back here?

• Everything works just fine now! I thought I had to declare a group to tell Stata where to put the heading. So in the end the whole refcat() thing is so much easier than I thought.
However, for my large table I did not use estwide or estauto, because I simply changed the style of the whole layout with
\geometry{a4paper,left=35mm,right=35mm, top=30mm, bottom=30mm} .
This is because the table has 8 columns and over 25 variables and I did not like the former layout anyways.
Thus, many thanks Jörg for the help and your quick responses.
Best wishes!
Molley

• You’re welcome. But the main purpose of \estauto and \estwide is to provide an environment that lets you include captions and labels in LaTeX as opposed to including them in Stata (changing them is a lot easier and faster in LaTeX). I am not sure I understand how that relates to changing the page dimensions with geometry.

3. Regarding \estauto{table4}{4}{S[table-format=4.4]S[table-format=4.4]}: Here I made a mistake and the correct syntax should be \estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]} – but this does not affect the output. But to clarify things:

The first curly bracket specifies the filename of the table (here “table4”, referring to “table4.tex” generated by estout).

The second curly bracket specifies the number of *data* columns. Have a look at table 4 above. This table has 3 data columns, (1), (2) and (3). The variable label column does not count, and neither do the “subcolumns”.

The third curly bracket specifies the alignment. We use the siunitx package to align numbers on their decimal point. Because we have two “subcolumns” (for β/SE and MfX) we need to specify the alignment for both, hence S[table-format=4.4]S[table-format=4.4]. Remember that we have 3 overall columns. Here we use the same alignment for all overall columns, but you could specify a different alignment for each sub- and overall column, e.g.:

S[table-format=4.4]S[table-format=4.4]S[table-format=1.1]S[table-format=2.2]S[table-format=6.6]S[table-format=4.4]

Again, using table 4 above, this would mean that the β/SE column of (1) has space for 4 digits before and 4 digits after the decimal point. The same holds for the MfX column of (1). The β/SE column of (2) has space for 1 digit before and 1 digit after the decimal point, the MfX column of (2) for 2 and 2 digits. The β/SE column of (3) has space for 6 and 6 digits and the MfX column for 4 and 4 digits.

I hope that helps.
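To make the column counting concrete, here is a hand-written sketch of the column specification that a call with 3 data columns and two S sub-columns amounts to. It assumes that \estauto builds its tabular preamble as l*{#2}{#3}; the row label and all numbers are placeholders, not results from the paper.

```latex
\documentclass{article}
\usepackage{booktabs}
\usepackage{siunitx}
\begin{document}
% Assumed equivalent of \estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]}:
% one label column (l), then the pair of S sub-columns repeated 3 times,
% i.e. the same as writing l S S S S S S with that table-format.
\begin{tabular}{l*{3}{S[table-format=4.4]S[table-format=4.4]}}
\toprule
 & \multicolumn{2}{c}{(1)} & \multicolumn{2}{c}{(2)} & \multicolumn{2}{c}{(3)} \\
 & {b/SE} & {MfX} & {b/SE} & {MfX} & {b/SE} & {MfX} \\ % text in S columns goes in braces
\midrule
Age & 0.1234 & 0.0123 & 0.2345 & 0.0234 & 0.3456 & 0.0345 \\ % placeholder values
\bottomrule
\end{tabular}
\end{document}
```

The label column is not counted in the second argument, which is why the table has 7 LaTeX columns but the call uses 3.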

4. Great work-around for posting comments below tables. My only problem is that during compilation, LaTeX stops and returns an error from the xkeyval package when it reaches the siunitx part. It returns “detect-mode undefined in families key”. Any suggestions?

• The option “detect-mode” tries to identify whether you are in math or text mode. It looks like that is not working for you. Instead of “detect-mode”, you could try “mode=text” or “detect-all”. If that doesn’t work, send me an MWE.
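A minimal sketch of how those alternatives could be set globally, assuming the option is passed through \sisetup in the preamble. Which options are accepted depends on the installed siunitx version, so treat this as something to try rather than a guaranteed fix; the number is a placeholder.

```latex
\documentclass{article}
\usepackage{siunitx}
% Alternatives to detect-mode named in the reply; enable one at a time.
\sisetup{mode=text}
% \sisetup{detect-all}
\begin{document}
\begin{tabular}{S[table-format=1.4]}
0.1234 \\ % placeholder value, typeset with the text font
\end{tabular}
\end{document}
```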

5. I am a total beginner with Stata and I am trying to put a prehead in the center. I am using estout…

My lines look like:

estout *, …. prehead (Table 1)

How can I do that?

• My solution does not use estout’s pre-/post-/note-function. I use estout to export only the pure table content using the “fragment” option and then include the table using my custom “estwide” or “estauto” commands. That solution allows for more flexibility, because you can add labels, post-/pre-text, notes etc. in LaTeX (where they belong) rather than changing the code in your do-file. Have a look at my examples above to see how to do that.

6. Thanks, this is very helpful! A question and two suggestions:

Q: For the line break in the column titles, is it possible to automatically add the titles within estout?

mtitle("\specialcell{Co-Holding\\> \pounds100}" "\specialcell{Co-Holding\\> \pounds500}") ///

maybe call the title of the header, and apply some rule like “linebreak if > 30 characters”?

Suggestion 1: instead of renaming all variables to add \hspace{0.1cm} you can presumably use the -prefix- suboption for labels and leave the varlabels unperturbed. From the estout help: “prefix(string) specifies a common prefix to be added to each label.”

Suggestion 2: You can add footnotes that are nicely aligned and don’t stretch the table using the LaTeX package -threeparttable-. I replace the table headers and footers within Stata to make that work. That way you can write the footnotes within Stata, which I find helpful for adding macros that are generated in the code, like “education uses `numbercats' categories” where numbercats adjusts according to my code.

• To answer my own question… if you want estout to break the column “model” titles you can use the following option to insert a LaTeX minipage. The advantage over the \specialcell approach is that the titles are filled in automatically and in the correct order, so you are not at risk of mislabeling a column, e.g. after changing the order of columns.

mlabels( ,titles prefix("\begin{minipage}{0.5in}") suffix("\end{minipage}") )

One could probably calculate the correct column width, e.g. by dividing the page width by the number of columns. Or by using estout’s “@span”. But at least any mismatch in the column width is visible.

@Line break in a column title: I don’t quite understand the advantage of your solution. Usually titles of columns are centered, so it does not really matter where the line break is. What do you mean by “the advantage over the \specialcell approach is that the titles are filled automatically and in correct order”? Could you post a code example of the estout call here (Hint: you can use the wrappers < "code">xxx< "/code"> to highlight code, but remove the “”).

@Suggestion 1: using estout’s prefix option definitely looks more elegant than my solution. I remember trying that when I created all this, but I ran into some trouble. I will have another look and update my post accordingly.

@Suggestion 2: I guess it boils down to personal preference where you want to add your notes and comments. I prefer it to be in LaTeX, but I am not using ‘dynamic’ notes like you do. The ‘threeparttable’ package looks interesting and is an alternative to my approach of creating custom environments. It is probably worth notifying the estout developer(s) that the built-in note function of estout breaks tables and that this can easily be overcome with ‘threeparttable’.

• @Line break. Sorry for the confusion. In your \specialcell solution, you manually specify the column titles and manually enter the line break. In my minipage approach, -estout- fills in the titles and LaTeX takes care of the line break. From my perspective the main benefit is that my approach is less susceptible to user mistakes. If I change the order of the models in the tables, I don’t need to worry about also (manually) changing the column titles.
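To illustrate the minipage idea on the LaTeX side, here is a small sketch of header cells wrapped in a fixed-width minipage so that LaTeX chooses the line break itself. The exact text that esttab writes may differ; the \centering, the 0.7in width (widened from the 0.5in in the comment so the words fit comfortably), the titles and the numbers are all assumptions for illustration.

```latex
\documentclass{article}
\usepackage{booktabs}
\begin{document}
\begin{tabular}{lcc}
\toprule
% Each title sits in a narrow box; LaTeX breaks the line where it no longer fits.
 & \begin{minipage}{0.7in}\centering Co-Holding $>$ \pounds100\end{minipage}
 & \begin{minipage}{0.7in}\centering Co-Holding $>$ \pounds500\end{minipage} \\
\midrule
Age & 0.12 & 0.34 \\ % placeholder row
\bottomrule
\end{tabular}
\end{document}
```

If the box is too narrow for a title, the mismatch shows up directly in the PDF, which is the visibility argument made above.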

7. Thanks for posting, this has been a great help!
So far everything is working out nicely, however I face a problem when it comes to very long (descriptive) tables that would better fit onto several pages.

Any attempts to modify your commands to capture the “longtable” option have failed… any ideas how to solve this issue?

Torben

• Torben, I have not tried longtable yet, but I did use threeparttable. Did you try wrapping the whole estout thing in \begin{longtable}...\end{longtable} instead of \begin{table}...\end{table}? I am very busy at the moment but I can look into this in a few weeks’ time. If you can email me a minimal working example I might be able to help you sooner.

I tried “longtable” and “threeparttable”, but the resulting table was split such that the caption covered most of the pages…never mind, because at the moment I am using a workaround: splitting the tables and adjusting their height with

\resizebox*{\textwidth}{\dimexpr\textheight-9\baselineskip\relax}{ \estwide{...} }

wrapped around the \estwide command (the package “calc” is needed for this.)

• I have tried to create a solution that involves longtable, but it is a bit more complicated than I anticipated and I am not able to provide a fix unless I encounter the issue myself :(.

8. Great resource – thank you! I am hoping you may be able to help me with a problem. I am using a set-up similar to yours to create a table of marginal effects. However, the significance stars generated are based on the coefficients rather than the marginal effects. I’ve tried using the pvalue option, but I’m not sure how to specify mypvalue to specify the p-value of the mfx. I understand the margins command will do this automatically, but that’s not how I’m generating the output and it doesn’t seem to work with my code. My code is below. Thank you in advance for any help you can offer.

svy, subpop(group_cso): logit partymember_strict2 civsoc_i_new2 gender_n age_clean_n extrovert _spline1 _spline2 _spline3;
estadd margins, dydx(*) vce(unconditional) subpop(if group_cso==1);
est store cso_res;

svy, subpop(group_cso): logit partymember_strict2 civsoc_i_new2 gender_n age_clean_n extrovert civsoc_net _spline1 _spline2 _spline3;
estadd margins, dydx(*) vce(unconditional) subpop(if group_cso==1);
est store cso_res2;

esttab cso_res cso_res2 using cso_res.tex, se scalars(N_sub) noobs nonumbers mlabels("" "",numbers) collabels("Marginal Effect (SE)",lhs(Party Membership)) drop(_spline1 _spline2 _spline3 _cons) star(* 0.10 ** 0.05 *** 0.001) cells("margins_b(fmt(2)star)" "margins_se(fmt(2)par)") stats(N_sub, fmt(0) labels("N")) eqlabels(none) rename(civsoc_mem_lag civsoc_i_new2) label replace booktabs ///
title(Social Networks, Organizational Membership and Party Membership Decisions (Restricted Model)\label{cso3});

• Hi Yael, I don’t have a quick answer for you, but it is possible to base the significance stars on the marginal effects. I will have a look at it in late August and post a reply, as I am too busy right now.

• Yael, I have thought about your problem, but it does appear to me that this is not an estout issue. At first I thought you meant that you don’t see significance stars, but you mean that “incorrect” significance stars are shown as they are based on the coefficients.

That sounds like a Stata issue to me. The line of relevance is cells("margins_b(fmt(2)star)" "margins_se(fmt(2)par)") and here you tell estout to take the dy/dx value and add significance stars according to its standard error. I don’t see what’s wrong with that.

Are you sure that the significance stars are incorrect? What do you mean by “pvalue option”?

9. Thanks so much, this was very helpful! I am trying to do exactly what you do. The table prints nicely in LaTeX; however, the table is “too big”: The “note” prints on top of the final line of the table, and the whole table is not centered (like the title and “note”) but a little to the left on the page. Finally, my significance stars are not small and pretty but the same size as the rest of the table and to the right of the coefficient. Do these problems arise because I am failing to include all the \usepackage{} lines that are required in the beginning? What are the \usepackage{} lines that should be included before your LaTeX crunch code? Now I have

\documentclass[11pt]{article}
\usepackage{booktabs}
\usepackage{caption}

Best,
Maria

10. Great post, thank you! It helps me a lot.
In order to shrink the size of the table I normally use resizebox or scale. My problem is that the fignote remains unaffected. Does anyone know how I can shrink the table including the fignote?

• I tried a few things just now, but I’m afraid I don’t have an answer to that. But why do you want to do that in the first place? Shrinking the table is typographically not very good as it gives you many different font sizes within one document. It’s better to use \footnotesize, \scriptsize etc. to adjust the font size. And resizing a caption/note is typographically even worse. Why can’t it be the same font size everywhere?
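A short sketch of the font-size route, assuming a plain table environment rather than the custom commands from the post. Placing \footnotesize after \caption keeps the caption at normal size while the table body and the note below it shrink together; the caption text, the note and the numbers are placeholders.

```latex
\documentclass{article}
\usepackage{booktabs}
\begin{document}
\begin{table}
  \centering
  \caption{Placeholder caption}
  \footnotesize % from here to \end{table}: table body and note share one size
  \begin{tabular}{lcc}
    \toprule
     & (1) & (2) \\
    \midrule
    Age & 0.12 & 0.34 \\ % placeholder row
    \bottomrule
  \end{tabular}

  \smallskip
  Notes: placeholder note text, set in the same size as the table.
\end{table}
\end{document}
```

Swapping \footnotesize for \scriptsize shrinks things further without introducing an arbitrary scaling factor.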

• My objective is to adjust the table to the page dimensions. Without shrinking it, it does not fit on one page, but maybe adjusting the font size will do the job as well.

I tried to adapt your code to a multiple-equation model (biprobit).
Everything works out fine except for the marginal effects. I am not able to include marginal effects for both equations because I don’t know how to use the margins command for both equations separately with esttab. Do you have any hints on that?

11. Thanks Jörg! This has been very helpful. My only issue was using \figtext often resulted in the first line of footnotes overlapping with the last horizontal line of the table. This was pretty easily fixed by diminishing or removing “\vspace{-1.9ex}” from the definition of \figtext.

• Ben, have a look at my follow-up post where I address exactly that issue. You can also consider the alternative commands I create there using the threeparttable package, which generates notes that are as wide as the table and look nicer.

12. Note: I’ve edited the comment to highlight the problem. ^JW

Thanks a lot Jörg !!
Unfortunately, even though I did exactly what you explain, LaTeX tells me “LaTeX Error: Missing \begin{document}” and stops the compilation right at the last “}”. After a lot of tests, I still don’t understand why it says that. I hope you can help me. The LaTeX file includes:

...
...
\newcommand{\estauto{table4.tex}{4}{S[table-format=4.4]S[table-format=4.4]}}[3]{
\vspace{.75ex}{
\begin{tabular*}{l*{#2}{#3}}
\toprule
\estinput{#1}
\bottomrule
\addlinespace[.75ex]
\end{tabular*}
}
}

...
...
\begin{document}
\begin{table}
\caption{Table de regression}
\estauto{table4}{4}{S[table-format=4.4]S[table-format=4.4]}
\starnote \fignote{}
\label{table4}
\end{table4}
\end{table}
\end{document}

• There are a few mistakes in this code. The definition of \estauto in the preamble is completely wrong. Just copy and paste it from my example. You also have \end{table4} at the bottom, which is wrong as well; just delete it.
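For orientation, here is a minimal sketch of what a corrected version of such a file might look like. It is a reconstruction under assumptions, not the exact code from the post: the \estinput helper, the use of tabular rather than tabular*, and the stand-in table4.tex written by filecontents (with placeholder numbers) are all illustrative choices, and \starnote and \fignote are left out because their definitions are not shown here.

```latex
% Stand-in for the file esttab would write with the "fragment" option;
% the row and its values are placeholders, not results from the post.
\begin{filecontents}{table4.tex}
Age & 0.1234 & 0.0123 & 0.2345 & 0.0234 & 0.3456 & 0.0345 \\
\end{filecontents}
\documentclass{article}
\usepackage{booktabs}
\usepackage{siunitx}
\usepackage{caption}

% Assumed helper: read the file with the primitive \input so that the
% booktabs rules before and after it still work inside tabular.
\makeatletter
\newcommand{\estinput}[1]{\@@input #1 }
\makeatother

% \newcommand takes the bare command name, then the argument count, then the body.
\newcommand{\estauto}[3]{%
  \vspace{.75ex}{%
    \begin{tabular}{l*{#2}{#3}}
      \toprule
      \estinput{#1}
      \bottomrule
      \addlinespace[.75ex]
    \end{tabular}%
  }%
}

\begin{document}
\begin{table}
  \centering
  \caption{Table de regression}
  \label{table4}
  % 3 data columns, each with two S sub-columns
  \estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]}
\end{table} % no \end{table4}: only environments that were opened get closed
\end{document}
```

The two changes relative to the broken file are that \newcommand receives only the command name \estauto, and that the stray \end{table4} is gone.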

• I was confused by that as well, since there is an extra \end{table4} in the example, maybe that should be deleted in the text?

One suggestion if you ever have the time:
Similar to Statalist, it would be nice to have a running example. I think it would help a lot in understanding your code if you could make up a simple example from Stata’s autouse datasets, so someone could run it right away and then change it.

Best

• Thanks, I did not realise before that the error was in my example.

Originally I wanted to show examples based on the autouse datasets, but as I don’t know them that well it would take me a while to do similar examples. Maybe if I can find the time at some point I will create a running example.

13. Do you see a way to adapt the summary tables example so that I just get 3 columns for mean, sd, and number of observations?
I just want to run

estpost su $Vars
est store A
esttab A using table2.tex, replace …

But I could not figure out a way to get it to print the mean, sd, and obs in different columns, since cells would put them into the same cell and I just get one column.

14. Hi Jörg,

Thanks a lot for this post, very useful. I’m encountering a problem with my descriptive tables. I have two columns (treatment and control) with mean test scores for every part of the test and want to create a third column next to them with the difference between these two columns. How can I get this most efficiently? When creating a “differences” variable in Stata, it shows up as new rows (in the 3rd column). This is my Stata code so far:

estpost su `prueba_ciencias' if tratamiento_lf==1
est store A

estpost su `prueba_ciencias' if control_lf==1
est store B

estpost su `prueba_diffs' //This contains my “difference”-variables
est store C

esttab A B C using `pruebas_ciencias_tex', replace addnotes(…) ///
mtitle("Tratamiento" "Control") ///
cells(mean(fmt(4))) noobs nonumbers

15. Thank you for this post! Finding information such as this has been difficult for me. I just installed the most recent l3 and siunitx packages (published on 7/28/2013) and am having problems compiling the LaTeX file. I was wondering if these new packages require an updated version of the code you’ve provided? I’ve copied the code presented in this post and updated the relevant portions (gave the table a new title, put in the .tex file name containing my Stata output generated by esttab, and updated the column number to 2 to reflect the fact that my table only contains 2 output columns). After doing this, I receive the following error “Undefined control sequence \estwide…tracolsep \fill }l*{#2}{#3}} \toprule \estinput {#1} \bottomrule…” and this links back to my line of code after \begin{table} that uses the \estwide command (“\estwide{ET_Table1.tex}{2}{c}”). I’d be happy to send you my code if that would be helpful.
Thank you so very much for any help you can provide!

• Sarah, the error message you describe sounds more like an issue with ‘booktabs’. Are you loading that package? If that is not the problem then I’ll update my TeX and check if the code is still working.

16. Hi Jorg, I just came across your blog this morning. I am trying to produce results tables (for LaTeX) either with margins or with margins on a separate page. I have tried the syntax above and I am only getting the table of coefficients without the margins included. Below is an excerpt of the syntax I used:

eststo clear
qui logit $ylist $xlist if c==1
estadd margins, dydx(*)
eststo est1

qui logit $ylist $xlist if c==2
estadd margins, dydx(*)
eststo est2

qui logit $ylist $xlist1
estadd margins, dydx(*)
eststo est3

esttab est1 est2 est3 using ....

17. Hi Jorg, I have been using your tricks for a while now, they are really great. I was wondering, do you know of any resources that would allow for similar commands in SAS? Thanks!

• Hi Phil, unfortunately not. I’ve never used SAS, and I don’t know if an ‘estout’ equivalent exists for it. What can SAS do better than Stata?

18. Excellent post. Thanks for sharing this. Can you suggest a way to implement “refcat(hhincome "\emph{Household Finances}", nolabel)” if, for example, my variable label is longer than “Household Finances” and I want it to span all the columns? I am using MS Word and I cannot use \multicolumn to get this. I need code suitable for .rtf output. Thanks for your help.

19. Many thanks for going to the effort to post this. It has been a great help.

20. Hi, Thanks for this useful tool but I can’t make it work completely so far.
I have copied everything into the preamble and have the following esttab command:

esttab A_1 A_2 using $csv\vam1.tex, replace label booktabs nomtitles title("Value Added Model 1: sub-score results") ///
drop(_cons ) se(3) b(3) ///
star(* 0.10 ** 0.05 *** 0.01) ///
stats(N r2 , fmt(0 3) labels(`"Observations"' `"\(R^{2}\)"' ))

with A_1 and A_2 being my two data columns

and have called for the table using the following

\begin{table}
\caption{VAM}
\estauto{Q:/data/CSV/vam1}{2}{S[table-format=4.4]}
\label{table4}
\end{table}

I get a “misplaced \noalign” error in TeXstudio. What’s surprising is that when I use a simple \input the table is properly imported into LaTeX.

What am I doing wrong?

Thank you so much for your help

• You need to use the fragment option, otherwise you are essentially using the table headers and footers in LaTeX twice. I.e.:

esttab A_1 A_2 using $csv\vam1.tex, f replace label...

(notice the “f”).

21. Hi Jorg,

I am implementing your code (thanks so much for posting it) and I have a few issues.

The first, is that I am getting this error

Latex Error: ./Housing_did_prelim.tex:143 Undefined control sequence. 
Line 143 is the one that has the estauto line

\estauto{ DiD.tex}{3}{S[table-format=4.4]S[table-format=4.4]} 
I only have 3 models to show so I know the number is not mistaken.

The second problem is that even when I compile it I find that there is a “[.75ex]” at the end of the table, right underneath the last line.

The third problem is that when I want to incorporate \starnote, before the note starts it says “Table 2:” when there is only one table and also to the right, it appears “justification=justification, font=footnotesize”, part of the declaration of \starnote.

Do you know what the solution to these issues might be?

Thank you again

• I was able to fix the last two issues. However, I still have [0.75ex] appearing to the right of the table right in the middle. I don’t know how to get it out.

Thanks

• I’m a bit confused, how can you compile the document when you get an “undefined control sequence” error? In any case, it sounds like you have not included the code that defines the “estauto” command, hence the error message that the command does not exist.

22. Hi Jörg,
this post is very useful, thank you.
I was wondering: how do you get the number of observations in the summary statistics table? Thank you,
Simone

• stats(N r2_p chi2 p pr, fmt(0 3) layout("\multicolumn{1}{c}{@}" "\multicolumn{1}{S}{@}") labels(`"Observations"'))

Notice the “N”.

23. Hi Jörg,

Do you know if estout can use the output from xtsum? I have not been able to find an answer for this anywhere so I guess not.

Thanks.

• Hi, I haven’t tried xtsum, but I think it should work. Have a look at the estout manual and try return list and ereturn list.

Unfortunately, I cannot resolve an issue I have with LaTeX. I cannot compile my PDF due to:

\bottomrule ->\noalign {\ifnum 0=`}\fi \@aboverulesep =\aboverulesep \global...
l.652 ...}{S[table-format=4.4]S[table-format=4.4]}

I am sure you have an idea about this? Thanks

25. Might it be possible in tex?

Thanks for such a great set of tools.

• Sorry, this did not appear in context as I had intended. I was following up on [Rijo on 21/01/2014 at 6:45 pm], the question about using \refcat to create notes across multiple columns. (For example “Panel A: Whatever” centered above one set of regressions, “Panel B: Whatever Else” centered above a second set in the same table.) You indicated you didn’t think it could be done in .rtf, I was wondering if you had any ideas for how to do it in a .tex file.

• Ah, okay. You can wrap the variable label that you want in \multicolumn. For example, something like: refcat(age18 "\multicolumn{3}{l}{\emph{This is a long Age variable name}}") would span the label for age18 over 3 columns.

• Thanks. That works pretty well, except I wind up with too many columns in that row. That is, the multicolumn is then followed by the same number of & & &, which leads to too many columns.

To be specific, if I add

refcat(dummy "\multicolumn{3}{l}{\emph{This is a long Age variable name}}", nolabel) 
to esttab, then the .tex output I get is

\multicolumn{3}{l}{\emph{This is a long Age variable name}}& & \\

rather than what I think I want:
\multicolumn{3}{l}{\emph{This is a long Age variable name}}\\ 
The pdf still compiles, but I get an error message:

Errors:
Extra alignment tab has been changed to \cr

Description: ...This is a long Age variable name}}& & ... You have given more \span or & marks than there were in the preamble to the \halign or \valign now in progress. So I'll assume that you meant to type \cr instead.

and then in the PDF there is an extra unwanted blank row after the refcat contents.

I could use the estout substitute option to get rid of these, although this depends on knowing the exact number of extra & and the number of spaces between them. Probably some perl wizardry could do a better job.

Anyway, I hope these comments are helpful and it’s a great contribution regardless.

• Ah yes, I forgot about that. You need to add \\ % at the very end of the refcat name, i.e. add the table delimiter and then comment the rest of the line out. Then everything after the end of that will be ignored in LaTeX. So in total:

refcat(dummy "\multicolumn{3}{l}{\emph{This is a long Age variable name}} \\ %", nolabel)

Don’t forget that you have to adjust \multicolumn{3} to the total number of columns that you have. Here it spans 3 columns and is left-aligned.

This is more of a workaround than a proper solution – ultimately the estout developers would need to provide a fix for it.
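To see why the trailing “\\ %” helps, here is a sketch of how the resulting heading row would sit inside a three-column table. It assumes esttab appends “& & \\” after the refcat text, as in the output quoted above; the variable label and numbers are placeholders.

```latex
\documentclass{article}
\usepackage{booktabs}
\begin{document}
\begin{tabular}{l*{2}{c}} % 3 columns in total: label + 2 data columns
\toprule
 & (1) & (2) \\
\midrule
% The row ends at the first \\; the leftover "& & \\" sits behind % and is ignored.
\multicolumn{3}{l}{\emph{This is a long Age variable name}} \\ %& & \\
Age 18--24 & 0.12 & 0.34 \\ % placeholder row
\bottomrule
\end{tabular}
\end{document}
```

Without the “%”, the extra alignment tabs would exceed the three declared columns, which is what triggers the “Extra alignment tab” message.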