Automated Table generation in Stata and integration into LaTeX (1)

I use estout to generate tables of summary statistics and regression results that can be easily imported into LaTeX. The advantage is that the whole system is dynamic: if you change something in your do-file (e.g. you omit a particular group), you don’t have to touch the LaTeX document, because the results are updated automatically. That has saved me a lot of time, but setting everything up took a long time. So hopefully this post is helpful for aspiring applied econometricians who want to automate output reporting from Stata into LaTeX.

I think it is easiest to explain everything with examples, for which I use some tables from my current working paper. First install estout in Stata (ssc install estout), then we can jump right into the examples. I explain three things: (1) creating tables of descriptive statistics, (2) creating tables of regression output and (3) putting everything into LaTeX. Ready? Let’s start.

Edit: Have a look at my follow-up post if you encounter any problems!
Edit 2: Another follow-up to improve the design.

1. Descriptive Statistics

The principle of estout is simple: you run a command in Stata that generates some statistics, you tell estout to (temporarily) store those results and then you create a table.
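As a minimal sketch of this run, store, tabulate cycle, the following uses Stata's built-in auto dataset rather than the data from my paper. The variables and the store name A are chosen only for illustration.

```stata
* Minimal sketch of the estout workflow (assumes estout is installed)
sysuse auto, clear

* 1. Run a command that produces statistics; estpost posts them to e()
estpost summarize price mpg weight

* 2. Store the posted results under a name
est store A

* 3. Build a table from the stored results (shown in the Results window)
esttab A, cells(mean(fmt(2))) label nonum noobs collabels(none)
```

Adding "using somefile.tex" and the booktabs option to the esttab call writes the table to a LaTeX file instead of the screen.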

Consider Table 2, which is simply a set of summarised variables, split into three categories. You cannot see it, but these are actually three tables appended together. Why? Because the first part (Age to Housing) contains percentages and therefore has 2 decimal places, the second part (Household Finances) contains income variables with 0 decimal places, and the last part has 2 decimal places again. The complete code that generates this table is:

* TABLE 2 

estpost su $dem if coholder100==1
est store A

estpost su $dem if coholder500==1
est store B

estpost su $dem if coholder1500==1
est store C

	esttab A B C using table2.tex, replace ///
		refcat(age18 "\emph{Age}" male "\emph{Demographics}" educationage "\emph{Education}" employeddummy "\emph{Employment}" oowner "\emph{Housing}", nolabel) ///
		mtitle("> \pounds100" "> \pounds500" "> \pounds1500") ///
		cells(mean(fmt(2))) label booktabs nonum collabels(none) gaps f noobs

estpost su $fin if coholder100==1
est store A

estpost su $fin if coholder500==1
est store B

estpost su $fin if coholder1500==1
est store C

	esttab A B C using table2.tex, append ///
		refcat(hhincome "\emph{Household Finances}", nolabel) ///
		nomtitles ///
		cells(mean(fmt(0))) label booktabs nonum f collabels(none) gaps noobs plain

estpost su $risk if coholder100==1
est store A

estpost su $risk if coholder500==1
est store B

estpost su $risk if coholder1500==1
est store C

	esttab A B C using table2.tex, append ///
		nomtitles ///
		refcat(redundant "\emph{Income and Expenditure Risk}" literacyscore "\emph{Behavioural Characteristics}", nolabel) ///
		stats(N, fmt(%18.0g) labels("\midrule Observations")) ///
		cells(mean(fmt(2))) label booktabs nonum f collabels(none) gaps plain

A brief explanation (for a full list of options see the estout/esttab manual):

  • refcat includes a heading for a group of variables
  • mtitle specifies the column headings
  • cells specifies the cell content (in the first part “mean” with 2 decimal places)
  • f creates a fragment of the table, i.e. only the table content is exported to the .tex (see below for more information)
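The globals $dem, $fin and $risk used above are defined elsewhere in my do-file. To try the append approach without my data, here is a sketch on the built-in auto dataset. The two globals, the group headings and the file name example_table.tex are made up for illustration.

```stata
* Sketch: two blocks with different number formats appended into one fragment
sysuse auto, clear

global small "mpg headroom"   // hypothetical group, reported with 2 decimals
global large "price weight"   // hypothetical group, reported with 0 decimals

* First block: creates the file and carries the column titles
estpost summarize $small if foreign == 0
est store A
estpost summarize $small if foreign == 1
est store B

esttab A B using example_table.tex, replace ///
	refcat(mpg "\emph{Performance}", nolabel) ///
	mtitle("Domestic" "Foreign") ///
	cells(mean(fmt(2))) label booktabs nonum collabels(none) gaps f noobs

* Second block: appended to the same file, no titles, observations at the end
estpost summarize $large if foreign == 0
est store A
estpost summarize $large if foreign == 1
est store B

esttab A B using example_table.tex, append ///
	refcat(price "\emph{Price and size}", nolabel) ///
	nomtitles ///
	stats(N, fmt(%18.0g) labels("\midrule Observations")) ///
	cells(mean(fmt(0))) label booktabs nonum collabels(none) gaps f plain
```

Only the last block reports N, so the number of observations appears once at the bottom of the combined table.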

2. Regression Results

Reporting regression results is not as simple, but we are jumping right into a fairly complicated example (Table 4).

As you can see, we are reporting coefficients, standard errors and marginal effects, hence each specification has two columns and each variable two rows. At the bottom we add a few additional statistics (estout can add any statistic that is stored in e()).

This table is generated by the following code:

* TABLE 4

quietly probit coholder $dem $fin $bev $risk
	predict pr2_coholder, pr
	quietly su pr2_coholder
	estadd scalar pr = r(mean)
	estadd margins, dydx(*) atmeans

est store A

quietly probit coholder500 $dem $fin $bev $risk
	predict pr2_coholder500, pr
	quietly su pr2_coholder500
	estadd scalar pr = r(mean)
	estadd margins, dydx(*) atmeans

est store B

quietly probit coholder1500 $dem $fin $bev $risk
	predict pr2_coholder1500, pr
	quietly su pr2_coholder1500
	estadd scalar pr = r(mean)
	estadd margins, dydx(*) atmeans

est store C

esttab A B C using table4.tex, replace f ///
	label booktabs b(3) p(3) eqlabels(none) alignment(S S) collabels("\multicolumn{1}{c}{$\beta$ / SE}" "\multicolumn{1}{c}{Mfx}") ///
	drop(_cons spouse*  ) ///
	star(* 0.10 ** 0.05 *** 0.01) ///
	cells("b(fmt(3)star) margins_b(star)" "se(fmt(3)par)") ///
	refcat(age18 "\emph{Age}" male "\emph{Demographics}" educationage "\emph{Education}" employeddummy "\emph{Employment}" oowner "\emph{Housing}" hhincome_thou "\emph{Household Finances}" redundant "\emph{Income and Expenditure Risk}" literacyscore "\emph{Behavioural Characteristics}", nolabel) ///
	stats(N r2_p chi2 p pr, fmt(0 3) layout("\multicolumn{1}{c}{@}" "\multicolumn{1}{S}{@}") labels(`"Observations"' `"Pseudo \(R^{2}\)"' `"LR chi2"' `"Prob > chi2"' `"Baseline predicted probability"'))

What do we do here? First we quietly run a probit model, then we generate the baseline predicted probability and store this using estadd. Finally, we calculate marginal effects and store them again using estadd.
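The same three steps can be tried on the built-in auto dataset. This is only a sketch: the model (foreign on mpg and weight) and the variable name pr_foreign are placeholders, not the specification from my paper.

```stata
* Sketch: probit, baseline predicted probability and marginal effects
sysuse auto, clear

quietly probit foreign mpg weight

* Baseline predicted probability: mean of the fitted probabilities
predict pr_foreign, pr
quietly summarize pr_foreign
estadd scalar pr = r(mean)

* Marginal effects at the means, added to e() as margins_b, margins_se, ...
estadd margins, dydx(*) atmeans

est store A

* Coefficient and marginal effect side by side, standard error underneath
esttab A, label eqlabels(none) drop(_cons) ///
	star(* 0.10 ** 0.05 *** 0.01) ///
	cells("b(fmt(3) star) margins_b(fmt(3) star)" "se(fmt(3) par)") ///
	collabels("b / SE" "Mfx") ///
	stats(N r2_p pr, fmt(0 3 3) labels("Observations" "Pseudo R2" "Baseline predicted probability"))
```

The order matters: predict and summarize leave the probit results in e() untouched, so both estadd calls still attach to the probit model before it is stored.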

As you can see, there are many options that generate that table. Step by step:

  • b(3) and p(3): 3 decimal places for coefficients and p-values (standard errors would be se(3)). According to the esttab help, specifying cells() disables b() and p(), so the formats given in cells() are what determine the displayed output here.
  • alignment(S S): alignment of the data columns. As we have two data columns per specification (one for ß/SE and one for Mfx), we need to specify the alignment for each column. Here we use the siunitx package (see below) to align the results at the decimal point. The alternative would be alignment(c c) for centred data entries.
  • collabels: labels for the data columns. As we want those centered we need the multicolumn option
  • drop: drop some results from the table
  • star: specify how you want to report significance levels
  • cells: specify the content for each cell
  • stats: adds statistics below the results. We add N (observations), r2_p (pseudo R2), chi2 (LR chi2), p (Prob > chi2) and pr (the baseline predicted probability, created before). Furthermore:
    • fmt specifies the number of decimal places (here: N has 0, all following statistics have 3)
    • layout: specify alignment. We want N to be centered and all the following to be decimal aligned.
    • labels: create some nice names

Refinements

If you are grouping variables with the refcat option, you may want to indent the variables to create a nicer design as in my example. The following Stata code adds a 0.1cm indent to all variable labels. If you want some labels not to have this indent, make sure to label (or relabel) these variables after you have run this code.

foreach v of varlist * {
	label variable `v' `"\hspace{0.1cm} `: variable label `v''"'
	}
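For instance, a sketch on the built-in auto dataset (the replacement label text is arbitrary):

```stata
sysuse auto, clear

* Add the indent to every variable label
foreach v of varlist * {
	label variable `v' `"\hspace{0.1cm} `: variable label `v''"'
}

* Relabel afterwards: this variable ends up without the indent
label variable foreign "Foreign car"

* Check the result: price is indented, foreign is not
describe price foreign
```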

If you have some long column labels you might have to insert a manual line break to prevent the column from becoming too wide. The usual command in LaTeX for this is \\, but inside a table that ends the row instead of breaking the line within the cell. We create a special LaTeX command (see below) that takes care of that issue. If you need to insert a line break in your table, wrap the text in a \specialcell field. Then you can use \\ as usual:

mtitle("\specialcell{Co-Holding\\> \pounds100}" "\specialcell{Co-Holding\\> \pounds500}" "\specialcell{Co-Holding\\> \pounds1500}") ///

3. Tables into LaTeX

Now begins the LaTeX hacking part. By default, estout generates a complete table and has the ability to include table titles above and notes below the table. But we are using the fragment option to generate the pure table content only. Why? Because it allows for much more flexibility, plus adding notes below the table with estout almost certainly breaks the width of the first column.

In order for this to work you need to add the following to your LaTeX preamble (i.e. before \begin{document}):

% *****************************************************************
% Estout related things
% *****************************************************************
\newcommand{\sym}[1]{\rlap{#1}}% Thanks to David Carlisle

\let\estinput=\input% define a new input command so that we can still flatten the document

\newcommand{\estwide}[3]{
		\vspace{.75ex}{
			\begin{tabular*}
			{\textwidth}{@{\hskip\tabcolsep\extracolsep\fill}l*{#2}{#3}}
			\toprule
			\estinput{#1}
			\bottomrule
			\addlinespace[.75ex]
			\end{tabular*}
			}
		}	

\newcommand{\estauto}[3]{
		\vspace{.75ex}{
			\begin{tabular}{l*{#2}{#3}}
			\toprule
			\estinput{#1}
			\bottomrule
			\addlinespace[.75ex]
			\end{tabular}
			}
		}

% Allow line breaks with \\ in specialcells
	\newcommand{\specialcell}[2][c]{%
	\begin{tabular}[#1]{@{}c@{}}#2\end{tabular}}

% *****************************************************************
% Custom subcaptions
% *****************************************************************
% Note/Source/Text after Tables
\newcommand{\figtext}[1]{
	\vspace{-1.9ex}
	\captionsetup{justification=justified,font=footnotesize}
	\caption*{\hspace{6pt}\hangindent=1.5em #1}
	}
\newcommand{\fignote}[1]{\figtext{\emph{Note:~}~#1}}

\newcommand{\figsource}[1]{\figtext{\emph{Source:~}~#1}}

% Add significance note with \starnote
\newcommand{\starnote}{\figtext{* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.}}

% *****************************************************************
% siunitx
% *****************************************************************
\usepackage{siunitx} % centering in tables
	\sisetup{
		detect-mode,
		tight-spacing		= true,
		group-digits		= false ,
		input-signs		= ,
		input-symbols		= ( ) [ ] - + *,
		input-open-uncertainty	= ,
		input-close-uncertainty	= ,
		table-align-text-post	= false
        }

These commands do the following:

  • Create two wrappers for estout generated tables. \estwide uses tabular* and fills the table to the width of the text, \estauto uses tabular and uses the “standard” table width (i.e. width adjusted to your content).
    • You need to specify three arguments immediately after the command (in the curly brackets): \estwide{the .tex of the table}{the number of data columns}{alignment}. Have a look at my comment below, which clarifies the syntax further.
    • In the second curly bracket enter the number of data columns, excluding the label column. In the example of Table 2 we have 3 data columns, in Table 4 we have 3 data columns as well, as the subcolumns “ß/SE” and “Mfx” are considered to be one column.
    • For alignment use S for decimal alignment using siunitx or c to centre the data. (Hint: decimal alignment looks much, much better. Just look at any journal.)
  • Add the \specialcell command for manual line break (see above).
  • Add commands for subcaptions to include simple notes below the table using the caption package.
    • \figtext adds some basic text
    • \fignote adds text with “Note: ” before.
    • \figsource adds text with “Source: ” before.
    • \starnote adds information about significance levels

You can then include the tables in your LaTeX document as follows:

% Table 2
\begin{table}
\caption{Sample Characteristics by the Amount of Co-Holding (\pounds)}
\estwide{table2.tex}{3}{c}
\label{table2}
\end{table}

% Table 4
\begin{table}
\caption{Probit Model for Characteristics of Co-Holders with Income Risk}
\estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]}
\starnote
\fignote{Omitted groups: \emph{Employment:} Student/Housewife/Disabled. \emph{Housing:} Private renter/Social renter. Further controls for spouse employment status.}
\label{table4}
\end{table}
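Put together, a complete document could look like the following sketch. It assumes that you saved the preamble definitions above in a file called estout-preamble.tex (a made-up name) and that table4.tex sits in the same folder. The wrappers use \toprule and \bottomrule from booktabs and \caption* and \captionsetup from caption, so both packages need to be loaded as well.

```latex
\documentclass[11pt]{article}
\usepackage{booktabs} % \toprule, \midrule, \bottomrule, \addlinespace
\usepackage{caption}  % \caption* and \captionsetup, used by \figtext

% Hypothetical file holding the preamble block from this post
% (\estwide, \estauto, \specialcell, \figtext, ... and the siunitx setup)
\input{estout-preamble}

\begin{document}

\begin{table}
\centering
\caption{Probit Model for Characteristics of Co-Holders with Income Risk}
\estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]}
\starnote
\label{table4}
\end{table}

\end{document}
```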

That’s it. Not so difficult after all! I should add that there is a typographical issue if you are using different text and math fonts, as the brackets and asterisks are set in math font, but the digits in text font. A workaround is provided here, thanks to David Carlisle.

Edit: Have a look at my comment below regarding the syntax of the \estauto \estwide commands. The number of data columns (the second entry) and the alignment (third entry) might be a bit confusing at first, I hope that comment clarifies things a bit.

138 thoughts on “Automated Table generation in Stata and integration into LaTeX (1)”

  1. Bookmarked – thanks for posting! I’ve used estout for regressions before, but didn’t realise it worked for descriptive stats as well.

    It’s probably worth adding that estout can also be used to output tables in RTF format – for those still plodding along in Microsoft Word…

  2. Your post was very helpful so far for me. However, I have a question regarding group variables in Stata.
    Maybe a quick background: I am writing my thesis about taxi drivers and their stopping behaviour. Hence, I have multiple dummies which I want to group, e.g.:

    – 12 dummies indicating the working hours, here I want the overall name for the variables to be Working Hours
    – 7 dummies indicating the level of income, again I want to have an overall name for these variables

    This is because I would like to make use of refcat, to have a more beautiful table in LaTeX.

    Your help is very much appreciated.

    Best wishes
    Molley

    • You just need to use the refcat command, as in my example. refcat(age18 "\emph{Age}") includes the emphasised heading “Age” before the variable “age18”.

      • yes that is pretty clear. But how can I create groups in Stata? I have several dummies which should be in one group.
        Best wishes

        • And one more thing: could you explain

          \estauto{table4}{4}{S[table-format=4.4]S[table-format=4.4]}

          I am sorry for all those questions but I would really really appreciate your help!

          Best wishes

          Molley

        • I am not sure I understand what you mean. Do you mean Stata’s global macro? You can define a global “group” of variables using global dem "age18 age25 age35 age55". This defines a group “dem” which incorporates all age variables. You can then call the group by $dem.

          • Yes, I think this is what I mean. Can I then use the generated “global” variable for the refcat command?

          • I don’t know, I just use refcat with one particular variable. Why don’t you try it and report back here?

          • Everything works just fine now! I thought I had to declare a group to tell stata where to put the heading. So at the end the whole refcat() thing is so much easier than I thought.
            However, for my large table I did not use estwide or estauto, because I simply changed the style of the whole layout with
            \geometry{a4paper,left=35mm,right=35mm, top=30mm, bottom=30mm} .
            This is because the table has 8 columns and over 25 variables and I did not like the former layout anyways.
            Thus, many thanks Jörg for the help and your quick responses.
            Best wishes!
            Molley

          • You’re welcome. But the main purpose of \estauto and \estwide is to provide an environment that lets you include captions and labels in LaTeX as opposed to including them in Stata (changing them is a lot easier and faster in LaTeX). I am not sure I understand how that relates to changing the page dimensions with geometry.

  3. Regarding \estauto{table4}{4}{S[table-format=4.4]S[table-format=4.4]}: Here I made a mistake and the correct syntax should be \estauto{table4}{3}{S[table-format=4.4]S[table-format=4.4]} – but this does not affect the output. But to clarify things:

    The first curly bracket specifies the filename of the table (here “table4”, referring to “table4.tex” generated by estout).

    The second curly bracket specifies the number of *data* columns. Have a look at table 4 above. This table has 3 data columns, (1), (2) and (3). The variable label column does not count, and neither do the “subcolumns”.

    The third curly bracket specifies the alignment. We use the siunitx package to align numbers on their decimal point. Because we have two “subcolumns” (for ß/SE and Mfx) we need to specify the alignment for both, hence S[table-format=4.4]S[table-format=4.4]. Remember that we have 3 overall columns, and \estauto repeats the third argument once for each of them (the column specification is l*{#2}{#3}). Here we use the same alignment for all overall columns, but you could specify a different alignment for every sub- and overall column by setting the second argument to 1 and listing all six columns in the third, e.g.:

    S[table-format=4.4]S[table-format=4.4]S[table-format=1.1]S[table-format=2.2]S[table-format=6.6]S[table-format=4.4]

    Again, using table 4 above, this would mean that the ß/SE column of (1) has space for 4 digits before and 4 digits after the decimal point. The same holds for the Mfx column of (1). The ß/SE column of (2) has space for 1.1 digits, the MfX column of (2) has space for 2.2 digits. The ß/SE column of (3) has space for 6.6 digits and the MfX column space for 4.4 digits.

    I hope that helps.

  4. Great work-around for posting comments below tables. My only problem is that during compilation, LaTeX stops and returns an error from the xkeyval package when it comes to the siunitx part. It returns “detect-mode undefined in families key”. Any suggestions?

    • The option “detect-mode” tries to identify whether you are in math or text mode. It looks like that is not working for you. Instead of “detect-mode”, you could try “mode=text” or “detect-all”. If that doesn’t work, send me an MWE.

      • Thank you for a great post!

        I have the same problem. After compiling, LaTeX returns an error “detect-mode undefined in families key.” I tried to change “detect-mode” to “detect-all,” and LaTeX returns: “detect-mode undefined in families key.” (I guess the problem is not resolved).

        After inserting “mode=text” it returns
        “`tight-spacing’ undefined in families `key'” and “‘group-digits’ undefined in families `key’.” Is there any way to correct this?

        Thank you!

        • Sorry, I did not see that you replied to a previous comment.

          This is a siunitx problem. I think you should try and update your tex distribution. The current version of siunitx is 2.5.

  5. I am a total beginner with stata and I am trying to put a prehead in the center.. I am using estout…

    My lines look like:

    estout *, …. prehead (Table 1)

    How can I do that?

    • My solution does not use estout’s pre-/post-/note-function. I use estout to export the pure table content only, using the “fragment” option, and then include the table using my custom “estwide” or “estauto” commands. That solution allows for more flexibility, because you can add labels, post-/pre-text, notes etc. in LaTeX (where they belong) rather than changing the code in your do-file. Have a look at my examples above to see how to do that.

  6. Thanks, this is very helpful! A question and two suggestions:

    Q: For the line break in the column titles, is it possible to automatically add the titles within estout?

    Instead of

    mtitle("\specialcell{Co-Holding\\> \pounds100}" "\specialcell{Co-Holding\\> \pounds500}") ///

    maybe call the title of the header, and apply some rule like “linebreak if > 30 characters”?

    Suggestion 1: instead of renaming all variables to add \hspace{0.1cm} you can presumably use the -prefix- suboption for labels and leave the varlabels unperturbed. From the estout help: “prefix(string) specifies a common prefix to be added to each label.”

    Suggestion 2: You can add footnotes that are nicely aligned and don’t stretch the table using the Latex package -threeparttable-. I replace the table headers and footers within Stata to make that work. That way you can write the footnotes within Stata, which I find helpful to add macros that are generated in the code like “education uses `numbercats’ categories” where numbercats adjusts according to my code.

    • To answer my own question… if you want estout to break the column “model” titles you can use the following option to insert a LaTeX minipage. The advantage over the \specialcell approach is that the titles are filled automatically and in correct order, so you are not at risk of mislabeling a column, e.g. after changing the order of columns.

      mlabels( ,titles prefix("\begin{minipage}{0.5in}") suffix("\end{minipage}") )

      One could probably calculate the correct column width, e.g. by dividing the page width by the number of columns. Or by using estout’s “@span”. But at least any mismatch in the column width is visible.

      • Bert, thanks a lot for your excellent comments and suggestions.

        @Line break in a column title: I don’t quite understand the advantage of your solution. Usually titles of columns are centered, so it does not really matter where the linebreak is. What do you mean with “the advantage over the \specialcell approach is that the titles are filled automatically and in correct order”? Could you post a code example of the estout call here (Hint: you can use the wrappers <"code">xxx<"/code"> to highlight code, but remove the “”).

        @Suggestion 1: using estout’s prefix option definitely looks more elegant than my solution. I remember trying that when I created all this, but I ran into some trouble. I will have another look and update my post accordingly.

        @Suggestion 2: I guess it boils down to personal preference where you want to add your notes and comments. I prefer it to be in LaTeX, but I am not using ‘dynamic’ notes like you do. The ‘threeparttable’ package looks interesting and is an alternative to my approach of creating custom environments. It is probably worth notifying the estout developer(s) that the built-in note-function of estout breaks tables and that this can easily be overcome with ‘threeparttable’.

        • @Line break. Sorry for the confusion. In your \specialcells solution, you manually specify the column titles and manually enter the linebreak. In my \minipage approach, -estout- fills in the titles and Latex takes care of the linebreak. From my perspective the main benefit is that my approach is less susceptible to user mistakes. If I change order of the models for the tables, I don’t need to worry about also (manually) changing the column titles.

  7. Thanks for posting, this has been a great help!
    So far everything is working out nicely, however I face a problem when it comes to very long (descriptive) tables that would better fit onto several pages.

    Any attempts to modify your commands to capture the “longtable” option have failed…any ideas how to solve this issue?

    Thanks for your help,
    Torben

    • Torben, I did not try longtable yet, but I did use threeparttable. Did you try wrapping the whole estout-thing in \begin{longtable}...\end{longtable} instead of \begin{table}...\end{table}? I am very busy at the moment but I can look into this in a few weeks’ time. If you can email me a minimal working example I might be able to help you sooner.

      • Jörg, thanks for your quick reply!
        I tried “longtable” and “threeparttable”, but the resulting table was split such that the caption covered most of the pages…never mind, because at the moment I am using a workaround: splitting the tables and adjusting their height with

        \resizebox*{\textwidth}{\dimexpr\textheight-9\baselineskip\relax}{
        \estwide{...}
        }

        wrapped around the \estwide command (the package “calc” is needed for this.)

        • I have tried to create a solution that involves longtable, but it is a bit more complicated than I anticipated and I am not able to provide a fix unless I encounter the issue myself :(.

  8. Great resource – thank you! I am hoping you may be able to help me with a problem. I am using a set-up similar to yours to create a table of marginal effects. However, the significance stars generated are based on the coefficients rather than the marginal effects. I’ve tried using the pvalue option, but I’m not sure how to specify mypvalue to specify the p-value of the mfx. I understand the margins command will do this automatically, but that’s not how I’m generating the output and it doesn’t seem to work with my code. My code is below. Thank you in advance for any help you can offer.

    svy, subpop(group_cso): logit partymember_strict2 civsoc_i_new2 gender_n age_clean_n extrovert _spline1 _spline2 _spline3;

    estadd margins, dydx(*) vce(unconditional) subpop(if group_cso==1);

    est store cso_res;

    svy, subpop(group_cso): logit partymember_strict2 civsoc_i_new2 gender_n age_clean_n extrovert civsoc_net _spline1 _spline2 _spline3;

    estadd margins, dydx(*) vce(unconditional) subpop(if group_cso==1);

    est store cso_res2;

    esttab cso_res cso_res2 using cso_res.tex,se scalars(N_sub) noobs nonumbers mlabels("" "",numbers) collabels("Marginal Effect (SE)",lhs(Party Membership)) drop(_spline1 _spline2 _spline3 _cons) star(* 0.10 ** 0.05 *** 0.001) cells("margins_b(fmt(2)star)" "margins_se(fmt(2)par)") stats(N_sub, fmt(0) labels("N")) eqlabels(none) rename(civsoc_mem_lag civsoc_i_new2) label replace booktabs ///
    title(Social Networks, Organizational Membership and Party Membership Decisions (Restricted Model)\label{cso3});

    • Hi Yael, I don’t have a quick answer for you, but it is possible to base the significance stars on the marginal effects. I will have a look at it in late August and post a reply, as I am too busy right now.

        • Yael, I have thought about your problem, but it does appear to me that this is not an estout issue. First I thought you meant that you don’t see significance stars, but you mean that “incorrect” significance stars are shown as they are based on the coefficients.

          That sounds like a Stata issue to me. The line of relevance is cells("margins_b(fmt(2)star)" "margins_se(fmt(2)par)") and here you tell estout to take the dy/dx value and add significance stars according to its standard error. I don’t see what’s wrong with that.

          Are you sure that the significance stars are incorrect? What do you mean with “pvalue option”?

          • Yael is right. If you use “estadd margins” and then esttab with cells("margins_b(fmt(2)star)"), you get the significance stars based on the coefficients rather than the marginal effects. Look at your table4: the stars from the coefficients and the mfx are always the same. Or compare the stars from the table with the p-values that “margins” prints to the screen. As far as I can see, it is a problem of the margins command.

            http://www.stata.com/statalist/archive/2011-10/msg00585.html

            http://www.stata.com/statalist/archive/2012-05/msg00871.html

            Any ideas how to solve that problem?

          • This is still on my mind – I’ve made it a habit to check whether the stars after “margins” are correct based on the p-values, and I never found them to be incorrect. In my table4 the stars from coefficients and mfx are the same because they are the same. Could someone provide me with an example (i.e. data) where they are not the same? Then I could look into it – perhaps it has been fixed in Stata 13 or newer versions of estout without us knowing.

  9. Thanks so much, this was very helpful! I am trying to do exactly what you do. The table prints nicely in LaTex; however, the table is “too big”: The “note” prints on top of the final line of the table, and the whole table is not centered (like the title and “note”) but a little to the left on the page. Finally, my significance stars are not small and pretty but the same size as the rest of the table and to the right of the coefficient. Do these problems arise because I am failing to include all the \usepackage{} that are required in the beginning? What are the \usepackage{} that should be included before your Latex crunch code? Now I have

    \documentclass[11pt]{article}
    \usepackage{booktabs}
    \usepackage{caption}

    Thank you in advance!
    Best,
    Maria

  10. Great post, thank you! It helps me a lot.
    In order to shrink the size of the table I normally use resizebox or scale. My problem is that the fignote remains unaffected. Does anyone know how I can shrink the table including the fignote?

    • I tried a few things just now, but I’m afraid I don’t have an answer to that. But why do you want to do that in the first place? Shrinking the table is typographically not very good as it gives you many different font sizes within one document. It’s better to use \footnotesize, \scriptsize etc. to adjust the font size. And resizing a caption/note is typographically even worse. Why can’t it be the same font size everywhere?

      • My objective is to adjust the table to the page dimensions. Without shrinking it, it does not fit on one page, but maybe adjusting the font size will do the job as well.

        I tried to adapt your code to a multiple equation model (biprobit).
        Everything works out fine except for the marginal effects. I am not able to include marginal effects for both equations because I don’t know how to use the margins command for both equations separately with esttab. Do you have any hint on that?

  11. Thanks Jörg! This has been very helpful. My only issue was using \figtext often resulted in the first line of footnotes overlapping with the last horizontal line of the table. This was pretty easily fixed by diminishing or removing “\vspace{-1.9ex}” from the definition of \figtext.

    • Ben, have a look at my follow-up post where I address exactly that issue. You can also consider the alternative commands I create there using the threeparttable package, which generates notes that are as wide as the table and look nicer.

  12. Note: I edited the comment to highlight the problem. JW

    Thanks a lot Jörg !!
    Unfortunately, even though I did exactly what you explain, LaTeX gives me “LaTeX Error: Missing \begin{document}” and stops the compilation just at the last “}”. After a lot of tests, I don’t understand why it says that. I hope you can help me. The LaTeX file includes:

    ...
    ...
    \newcommand{\estauto{table4.tex}{4}{S[table-format=4.4]S[table-format=4.4]}}[3]{
    \vspace{.75ex}{
    \begin{tabular*}{l*{#2}{#3}}
    \toprule
    \estinput{#1}
    \bottomrule
    \addlinespace[.75ex]
    \end{tabular*}
    }
    }

    ...
    ...
    \begin{document}
    \begin{table}
    \caption{Table de regression}
    \estauto{table4}{4}{S[table-format=4.4]S[table-format=4.4]}
    \starnote
    \fignote{}
    \label{table4}
    \end{table4}
    \end{table}
    \end{document}

    • There are a few mistakes in this code. The definition of \estauto in the preamble is completely wrong. Just copy and paste it from my example. You have \end{table4} at the bottom which is wrong as well, just delete.

      • I was confused by that as well, since there is an extra \end{table4} in the example, maybe that should be deleted in the text?

        One suggestion if you ever have the time:
        Similar to statalist it would be nice to have a running example. I think it would help a lot in understanding your code if you could make up a simple example from stata autouse datasets, so someone could run it right away and then change it.

        Best

        • Thanks, I did not realise before that the error was in my example.

          Originally I wanted to show examples based on the autouse datasets, but as I don’t know them that well it would take me a while to do similar examples. Maybe if I can find the time at some point I will create a running example.

  13. Do you see a way to adapt the summary tables example so that I get just 3 columns for mean, sd, and number of observations?
    I just wanna
    estpost su $Vars
    est store A
    esttab A using table2.tex, replace …

    But I could not figure out a way to get it to print the mean, sd, and obs in different columns, since cells would put them into the same cell and I just get one column.

  14. Hi Jörg,

    Thanks a lot for this post, very useful.

    I’m encountering a problem with my descriptive tables. I have two columns (treatment and control) with mean test scores for every part of the test and want to create a third column next to them with the difference between these two columns. How can I get this most efficiently? When I create a “differences” variable in Stata, it shows up as new rows (in the 3rd column).

    This is my Stata code so far:

    estpost su `prueba_ciencias' if tratamiento_lf==1
    est store A

    estpost su `prueba_ciencias' if control_lf==1
    est store B

    estpost su `prueba_diffs' //This contains my "difference"-variables
    est store C

    esttab A B C using `pruebas_ciencias_tex', replace addnotes(…) ///
    mtitle("Tratamiento" "Control") ///
    cells(mean(fmt(4))) noobs nonumbers

    • Gieltje, it’s probably too late for you, but not for others with the same problem in LaTeX-land. I too love these posts by Jorg and don’t think that people are coming here because of finals but because they are so useful for learning esttab, estpost and LaTeX table making and well explained and illustrated. These posts got me started!

      The problem seems to be that you are posting a slightly different set of variable names (var1_diff, var2_diff rather than var1, var2) and esttab doesn’t know how to line them up neatly in the proper rows. In other words, it doesn’t know your var1_diff and var2_diff should go in the var1 and var2 rows.

      Here’s how I resolved a similar problem for a table where I wanted to look at descriptive stats of a t=2 panel (hardly a panel!) in long form. My code had the same form as your
      estpost su `prueba_ciencias' if tratamiento_lf==1
      except mine looked something like:

      estpost su $Outcomes if program_eligible & quarter==1
      est store A
      and
      estpost su $Outcomes if program_eligible & quarter==2
      est store B

      I couldn’t get it to work the way you tried either, and I racked my brain. I came up with a workaround: use preserve/restore to temporarily save the original dataset in memory, then replace the original variable with the difference value (in my case the intertemporal difference between Q1 & Q2). You might do something like (forgive me, but this is not debugged Stata code, just off the top of my head):

      preserve
      foreach var of local prueba_diffs {
      * strip the "_diffs" suffix to recover the original variable name
      local originalvar = regexr("`var'", "_diffs", "")
      * replace var with its diff equivalent
      * depending on how your diff variable is stored you may have
      * to do some more wrangling here to make right thing appear
      replace `originalvar' = `var'
      }

      estpost su `prueba_ciencias'
      est store C

      restore

      voila! beautiful tables!

  15. Thank you for this post! Finding information such as this has been difficult for me. I just installed the most recent l3 and siunitx packages (published on 7/28/2013) and am having problems compiling the Latex file. I was wondering if these new packages require an updated version of the code you’ve provided? I’ve copied the code presented in this post, updated the relevant portions (gave the table a new title, put in the .tex file name containing my Stata output generated by esttab, and updated the column number to 2 to reflect the fact that my table only contains 2 output columns). After doing this, I receive the following error “Undefined control sequence \estwide…tracolsep \fill }l*{#2}{#3}} \toprule \estinput {#1} \bottomrule…” and this links back to my line of code after \begin{table} that uses the \estwide command (“\estwide{ET_Table1.tex}{2}{c}”). I’d be happy to send you my code if that would be helpful. Thank you so very much for any help you can provide!

    • Sarah, the error message you describe sounds more like an issue with ‘booktabs’. Are you loading that package? If that is not the problem then I’ll update my TeX and check if the code is still working.

  16. Hi Jorg,

    I just came across your blog this morning. I am trying to produce results tables (for LaTeX) either with margins or with margins on a separate page. I have tried the syntax above and I am only getting the table of coefficients without the margins included. Below is part of the syntax I used:


    eststo clear
    qui logit $ylist $xlist if c==1
    estadd margins, dydx(*)
    eststo est1

    qui logit $ylist $xlist if c==2
    estadd margins, dydx(*)
    eststo est2

    qui logit $ylist $xlist1
    estadd margins, dydx(*)
    eststo est3

    esttab est1 est2 est3 using ....
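
    • One route that may help, sketched on Stata’s bundled auto dataset (the models and names are placeholders): run margins with the post option so that the marginal effects replace the coefficients in e(b), and store the results only after that. This is the same margins, ... post pattern another commenter uses further down; it is a different technique from estadd margins, not a fix of it.

      ```stata
      * Sketch only: auto data stands in for $ylist / $xlist
      sysuse auto, clear
      eststo clear

      quietly logit foreign price mpg
      quietly margins, dydx(*) post    // e(b) now holds average marginal effects
      eststo est1

      quietly logit foreign price mpg weight
      quietly margins, dydx(*) post
      eststo est2

      * The table now shows marginal effects and their standard errors
      esttab est1 est2, b(3) se(3) mtitles("Model 1" "Model 2")
      ```

      Because post overwrites the logit results, the raw coefficients would need to be stored under a separate name before calling margins if both are wanted.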

  17. Hi Jorg,

    I have been using your tricks for a while now, they are really great. I was wondering, do you know of any resources that would allow for similar commands in SAS?

    thanks!

    • Hi Phil, unfortunately not. I’ve never used SAS, and I don’t know if an ‘estout’ equivalent exists for it. What can SAS do better than Stata?

  18. Excellent post. Thanks for sharing this. Can you suggest how I can implement refcat(hhincome "\emph{Household Finances}", nolabel) if, for example, my label is longer than “Household Finances” and I want it to span all the columns? I am using MS Word, so I cannot use \multicolumn for this. I need code suitable for .rtf output.

    Thanks for your help.

  19. Hi,

    Thanks for this useful tool, but I can’t make it work completely so far. I have copied everything into the preamble and have the following esttab command:
    esttab A_1 A_2 using $csv\vam1.tex, replace label booktabs nomtitles title("Value Added Model 1: sub-score results") ///
    drop(_cons ) se(3) b(3) ///
    star(* 0.10 ** 0.05 *** 0.01) ///
    stats(N r2 , fmt(0 3) labels(`"Observations"' `"\(R^{2}\)"' ))

    with A_1 A_2 my two data columns

    and have called for the table using the following

    \begin{table}
    \caption{VAM}
    \estauto{Q:/data/CSV/vam1}{2}{S[table-format=4.4]}
    \label{table4}
    \end{table}

    I get a “misplaced \noalign” error in TeXstudio. What’s surprising is that when I use a simple \input the table is properly imported into LaTeX.

    What am I doing wrong?

    Thank you so much for your help

    • You need to use the fragment option, otherwise you are essentially using the table headers and footers in LaTeX twice. I.e.:

      esttab A_1 A_2 using $csv\vam1.tex, f replace label...

      (notice the “f”).

  20. Hi Jorg,

    I am implementing your code (thanks so much for posting it) and I have a few issues.

    The first, is that I am getting this error

    Latex Error: ./Housing_did_prelim.tex:143 Undefined control sequence.

    Line 143 is the one that has the estauto line

    \estauto{ DiD.tex}{3}{S[table-format=4.4]S[table-format=4.4]}

    I only have 3 models to show so I know the number is not mistaken.

    The second problem is that even when I compile it I find that there is a “[.75ex]” at the end of the table, right underneath the last line.

    The third problem is that when I want to incorporate \starnote, before the note starts it says “Table 2:” when there is only one table and also to the right, it appears “justification=justification, font=footnotesize”, part of the declaration of \starnote.

    Do you know what the solution to these issues might be?

    Thank you again

    • I was able to fix the last two issues. However, I still have [0.75ex] appearing to the right of the table right in the middle. I don’t know how to get it out.

      Thanks

      • I’m a bit confused, how can you compile the document when you get an “undefined control sequence” error? In any case, it sounds like you have not included the code that defines the “estauto” command, hence the error message that the command does not exist.

        Are you loading the booktabs package in the preamble?

  21. Hi Jörg,
    this post is very useful, thank you.
    I was wondering: how do you get the number of observations in the summary statistics table? Thank you,
    Simone

  22. Hi Jörg,

    Do you know if estout can use the output from xtsum? I have not been able to find an answer for this anywhere so I guess not.

    Thanks.

    • Hi, I haven’t tried xtsum, but I think it should work. Have a look at the estout manual and try return list and ereturn list.

  23. This page is great! Thank you Jörg.
    Unfortunately, I cannot resolve an issue I have with LaTeX. I cannot compile my pdf due to:

    \bottomrule ->\noalign
    {\ifnum 0=`}\fi \@aboverulesep =\aboverulesep \global...
    l.652 ...}{S[table-format=4.4]S[table-format=4.4]}

    I am sure you have an idea about this matter? Thanks

      • Sorry, this did not appear in context as I had intended. I was following up on [Rijo on 21/01/2014 at 6:45 pm], the question about using \refcat to create notes across multiple columns. (For example “Panel A: Whatever” centered above one set of regressions, “Panel B: Whatever Else” centered above a second set in the same table.) You indicated you didn’t think it could be done in .rtf, I was wondering if you had any ideas for how to do it in a .tex file.

        • Ah, okay. You can wrap the variable label that you want in \multicolumn. For example, something like refcat(age18 "\multicolumn{3}{l}{\emph{This is a long Age variable name}}") would span the label for age18 over 3 columns.

          • Thanks. That works pretty well, except I wind up with too many columns in that row. That is, the multicolumn is then followed by the same number of & & &, which leads to too many columns.

            To be specific, if I add

            refcat(dummy "\multicolumn{3}{l}{\emph{This is a long Age variable name}}", nolabel)

            to esttab, then the .tex output I get is

            \multicolumn{3}{l}{\emph{This is a long Age variable name}}& & \\

            rather than what I think I want:
            \multicolumn{3}{l}{\emph{This is a long Age variable name}}\\

            The pdf still compiles, but I get an error message:

            Errors:
            Extra alignment tab has been changed to \cr

            Description:
            ...This is a long Age variable name}}&
            & ...
            You have given more \span or & marks than there were
            in the preamble to the \halign or \valign now in progress.
            So I'll assume that you meant to type \cr instead.

            and then in the PDF there is an extra unwanted blank row after the refcat contents.

            I could use the estout substitute option to get rid of these, although this depends on knowing the exact number of extra & and the number of spaces between them. Probably some perl wizardry could do a better job.

            Anyway, I hope these comments are helpful and it’s a great contribution regardless.

          • Ah yes, I forgot about that. You need to add \\ % at the very end of the refcat name, i.e. add the table delimiter and then comment the rest of the line out. Then everything after the end of that will be ignored in LaTeX. So in total:

            refcat(dummy "\multicolumn{3}{l}{\emph{This is a long Age variable name}} \\ %", nolabel)

            Don’t forget that you have to adjust \multicolumn{3} to the total number of columns that you have. Here it spans over 3 columns and is left aligned.

            This is more of a workaround than a proper solution – ultimately the estout developers would need to provide a fix for it.

  24. Dear Jörg, thanks a lot for putting this guide together. However, when following your instructions for regression tables, “[.75ex]” shows up somewhere in my table. I suppose the error is due to some spacing problem, but I really can’t figure out how to solve it. If I include “f” after “replace” in the esttab command in Stata, “[.75ex]” appears right next to the table. When “f” is excluded, “[.75ex]” shows up below the table.
    Have you heard of anyone encountering this problem before or do you know how to solve it?
    Thanks

  25. Hi, moverstud.tex is a table similar to table 2. It does not compile and this is what it gives me

    Latex Error: ./moverstud.tex:1 Extra alignment tab has been changed to \cr.

    what I noticed is that the only difference is that this table does not have the \begin{tabular} preamble (because of the frame option in Stata)

    How could I fix this?

    • If you use my \estauto or \estwide commands the tables must not be within a tabular environment – that’s the entire purpose of the commands. You must use Stata with the fragment (not frame) option and specify the number of columns in \estauto or \estwide accordingly.

  26. Hi Jorg, your post has been very useful for me.
    However, I’ve been trying a lot without success the following:
    I made a survey and I have many questions in which the answers are totally disagree, disagree, agree, totally agree, don’t know, don’t answer. Therefore, I need to make a tab in which I can see the percentage of people who chose each of those options.
    I can make many tabs but that would be cumbersome (besides the fact that I would have many tables in my paper).
    I was wondering if I could make a single table in which I can put this in a compact and orderly way. For example:
    Put in columns the options of the answer (agree scale) and in the rows a different question.

    Another question: I’m doing tabs for dummy variables, e.g. have you tried marihuana last week? yes (=1) or no (=0). Is there a way in which I could drop the answers for no from my esttab? I just want to have a nice table in LaTeX with the percentage of people who answered yes. That way I could group different variables in the same table, showing only the yes answers.

    Thanks!!!

    • Just a hint.
      The command tabm displays the table as I want it. You can see it very briefly here:
      http://www.stata.com/statalist/archive/2012-01/msg00672.html
      The problem is that it displays frequencies and not percentages (I would rather prefer %).
      The second thing, and most important, is that I don’t know how to export exactly that table to latex. It seems it doesn’t fit with the estpost-esttab command.
      I really appreciate if you could help me!

    • I have my last question here and I’m done:
      I would like to make a table just of scalars, in which each row is a different scalar. This is because I’m making the Mann-Whitney test and I’m saving the p-values in a scalar; then I would like to have all of them in a separate table.
      This is what I’m doing:

      ranksum impulsive, by(time)
      scalar pval_imp = 2 * normal(-abs(r(z)))

      ranksum organized, by(time)
      scalar pval_org = 2 * normal(-abs(r(z)))

      ranksum confused, by(time)
      scalar pval_conf = 2 * normal(-abs(r(z)))

      What I want is to generate a LaTeX table with those three scalars (pval_imp, pval_org, pval_conf).

      This would be very useful for me. Appreciating your help!

      • If you want to add scalars have a look at the “estadd” command. For example, estadd scalar pr = r(mean) would add the mean of a previous calculation (use return list or ereturn list in Stata to see available scalars).
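
        As a rough sketch of that idea on Stata’s bundled auto data (the scalar names, the carrier regression and the row labels are placeholders, not from the question): each p-value is attached to one set of stored results with estadd scalar and then printed through stats(), with the coefficient cells switched off.

        ```stata
        * Sketch only: any estimation can serve as the carrier for the scalars
        sysuse auto, clear
        quietly regress price mpg

        quietly ranksum mpg, by(foreign)
        estadd scalar pval_mpg = 2 * normal(-abs(r(z)))

        quietly ranksum weight, by(foreign)
        estadd scalar pval_weight = 2 * normal(-abs(r(z)))

        * cells(none) suppresses the coefficients, leaving one row per scalar
        esttab, cells(none) noobs nonumbers nomtitles ///
            stats(pval_mpg pval_weight, fmt(3) ///
            labels("p-value: mpg" "p-value: weight"))
        ```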

    • Jorg, I figured it out. It is a bit long, but it works (there are two variables named c1 and c2, coded 1-5; see the code below).
      Now, my only problem is that when I append, the columns are not aligned with those of the first table. See the image of the table here to get an idea of my problem: https://www.dropbox.com/s/14b08iks1y011vy/LikertScale2.PNG?dl=0
      I tried with plain, but it doesn’t work. Alignment can’t be used because it is within cells. Hope you can help me with just this one thing! Many thanks.

      estpost su c1 if c1==1
      est store A
      estpost su c1 if c1==2
      est store B
      estpost su c1 if c1==3
      est store C
      estpost su c1 if c1==4
      est store D
      estpost su c1 if c1==5
      est store E
      esttab A B C D E using pruebax1.tex, replace ///
      title (Tabla 13) mtitle("Muy buena" "Buena" "Regular" "Mala" "Muy mala") ///
      refcat(c1 "\emph{Convivencia}", nolabel) ///
      cells(count(fmt(0))) label booktabs nonum collabels(none) gaps noobs
      
      estpost su c2 if c2==1
      est store A
      estpost su c2 if c2==2
      est store B
      estpost su c2 if c2==3
      est store C
      estpost su c2 if c2==4
      est store D
      estpost su c2 if c2==5
      est store E
      esttab A B C D E using pruebax1.tex, append ///
      nomtitles addnote("La tabla muestra frecuencias, no porcentajes") ///
      cells(count(fmt(0))) label booktabs nonum collabels(none) gaps noobs
      • Apologies for giving you a wrong answer before (I’ve cleaned up the comments a little bit to avoid confusion). I don’t think you can overcome the problem you have unless you use the “fragment” option in esttab. If you look at the LaTeX code esttab generates, you see that there are two tabular environments, hence the “wrong” layout. Try the following instead:

        ...
        
        esttab A B C D E using ../tables/pruebax1.tex, replace ///
        f mtitle("Muy buena" "Buena" "Regular" "Mala" "Muy mala") ///
        cells(count(fmt(0))) label booktabs nonum collabels(none) gaps noobs
        
        ...
        
        esttab A B C D E using ../tables/pruebax1.tex, append ///
        f nomtitles ///
        cells(count(fmt(0))) label booktabs nonum collabels(none) gaps noobs stats(N) plain

        Then the layout is correct, but you need to create the tabular environment yourself (that’s what my \estwide and \estauto commands do).

        • Jorg, it now works perfectly! This post and your comments have been very very helpful for me. Thank you very much!

  27. Pingback: Automating workflows | Food for thought

  28. Dear Jörg,
    my question is a bit off-topic as it relates to the nature of your descriptive statistics. As the code only refers to the mean, I was wondering how you included the percentages in the first part of Table 2?
    Great work and many thanks!!

  29. Hi! I had a quick question. If I am running a regression with 20 variables and want to report the coefficient and standard error for only 10 of them, how do I do this using the framework given here? FYI, this post is really helpful!
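
    • The keep() and drop() options that appear in other commands in this thread do this. A minimal sketch on Stata’s bundled auto dataset (the model and variable names are placeholders):

      ```stata
      * Sketch only: estimate with all regressors, report a subset
      sysuse auto, clear
      quietly regress price mpg weight length turn displacement
      est store A

      * Only mpg and weight appear in the table; the model itself is unchanged
      esttab A, keep(mpg weight) b(3) se(3)
      ```

      drop() works the other way round, listing the coefficients to leave out, for example drop(_cons).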

  30. Hi Jorg – Thank you for the helpful post.

    I am getting a weird error – in my table output the decimal points are missing. E.g., 3.69 comes as 3 69. Any suggestions on how to fix this?

  31. Subject: ologit regressions with coefficient and odds ratio, and OLS in one single table
    Dear Jorg,

    First of all: thank you for this post! You have no idea how helpful it has been for me!
    However, I am trying to do a table of regressions similar to your table 4 of the probit model, but I haven’t figured out yet how to do it.

    What I want is the following:
    I am estimating 3 models: 2 ologit and 1 ols. For the two ologit models I want to report the beta coefficient and the odds ratio (each one with its standard error). In my third model (the ols), I just want the coefficient and its standard error. However, I am not sure how to program it to store what I want and then to make my table with 3 big data columns (with the first two of them containing two columns inside, like the multicolumns in your example).
    I also don’t want to report controls coefficients in my table.

    This is what I’m doing:
    ologit depvar1 indvar1 (set of controls), r or
    (here I want to report coefficient and s.e., and odds ratio and s.e.)
    ologit depvar2 indvar1 (set of controls), r or
    (same as the above)
    reg depvar3 indvar1 (set of controls), r
    (here I want to report coefficient and s.e.)

    I don’t know what to store in eststore in each of the estimations, and then how to put it in esttab and differentiate the multicolumns from the single column.

    Something like this, is what I want:
    https://www.dropbox.com/s/41emdaxhqltapyu/ologit_ols.png?dl=0

    I would really appreciate your help.
    Thanks in advance!
    Natalia

    • I figured it out with the ologit. It works perfectly (see the command below)! Now, the only thing that I don’t know is how to insert a fourth column with ols results from another estimation, since in this estimation the cells would just include ("b" "se")

      ologit HH1 B3 ${controles}, r
      estadd expb
      eststo A
      ologit HH2 B3 ${controles}, r
      estadd expb
      eststo B
      ologit HH3 B3 ${controles}, r
      estadd expb
      eststo C
      esttab A B C using Tabla1.tex, replace f ///
      label booktabs b(3) p(3) eqlabels(none) alignment(S S) cells("b(fmt(3)star) expb(star)" "se(fmt(3)par)") keep(B3) ///
      collabels("\multicolumn{1}{c}{$\beta$ / SE}" "\multicolumn{1}{c}{Odds Ratio}") mtitle("Family" "Friends" "Fellows") ///
      star(* 0.10 ** 0.05 *** 0.01) stats(N, fmt(%18.0g) layout("\multicolumn{1}{c}{@}") labels("\midrule Observations"))

  32. Pingback: Tables of Descriptive Statistics using esttab in Stata and Latex | If so, how?

  33. Pingback: Stata to Latex – Formatting Latex – part 2 | Asjad Naqvi

  34. Hello. Thanks for all comments
    I have a problem but it is at the LaTeX level.
    when I want to run the LaTeX file I get this error:
    ! Undefined control sequence.
    \estwide …tracolsep \fill }l*{#2}{#3}} \toprule
    \estinput {#1} \bottomrule…
    l.656 \estwide{table1.tex}{2}{s}
    \label{table1}

    I don’t know maybe I need to include some package.

    Thanks for reviewing my concern.

  35. Hi. This guide is great. Thank you for posting it. I’m using XeLaTeX with the mathspec package so I can use professional Adobe fonts. I did not quite follow the workaround you linked to at the bottom of this guide. I have run this with XeLaTeX and have not noticed any problems (yet, but I am just getting started). Should I be expecting an issue, and what do I need to implement from the workaround link you provided?

    • Good question. If I remember correctly it was an issue with getting the proper “math minus” into the tables. Anyway, I suggest keeping it as simple as possible (I am not using the “font substitution black magic” anymore). At some point I will hopefully have the time to write a new up-to-date guide. But if it works for you, I wouldn’t expect any issues.

      • After going through each of your posts on table generation I have found XeLaTeX seems to work just fine!

  36. Hi Joerg,

    I would like to produce a table in which I report both p-values and standard errors for the coefficients. The esttab command doesn’t seem to allow two secondary statistics as in “b(fmt) se(fmt) p(fmt)”.

    Any suggestion on how to do the trick?
    Thanks,
    Ruth
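
    • One way that may work is to bypass the b()/se()/p() shortcuts and list all three statistics in cells(), as other commands in this thread do for b, expb and se. A minimal sketch on Stata’s bundled auto dataset (the model and formats are placeholders):

      ```stata
      * Sketch only: coefficient, standard error and p-value stacked per variable
      sysuse auto, clear
      quietly regress price mpg weight
      est store A

      * Unquoted elements in cells() are printed below each other
      esttab A, cells(b(star fmt(3)) se(par fmt(3)) p(fmt(3))) ///
          star(* 0.10 ** 0.05 *** 0.01) collabels(none)
      ```

      Here the standard errors are in parentheses and the p-values are printed without them, so a table note should say which is which.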

  37. Pingback: Stata-Latex esttab Regression Table Output Streamlining | Anthony Louis D'Agostino

  38. Hi Joerg,

    Regression tables. I’m trying to figure out how to include a row that says something like Controls…..No……Yes…….No……Yes and the like. I’m not a fan of leaving out variables from tables but for Beamer slides sometimes I need to. I can’t seem to find an easy way to make this happen in a layout similar to the Probit model results you demonstrate. Do you know an easy way to make this happen that looks nice? Also, esttab can handle Stata factor variables (i.e. i.x) but it seems to struggle with interactions (i.e. i.x#c.z). It will include the interactions in the table but the refcat command does not seem to work with them for creating a nice category name. Any workarounds here?

    Thanks!

    • Hi Jonathan,

      – Controls: Did you try the indicate() function? You can also use refcat() to add more information.

      – Interactions: one way that definitely works is just to create the interactions manually and label them (that’s what I normally do). Alternatively, you could try and experiment with the rename() function.
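
      A rough sketch of the indicate() route on Stata’s bundled auto dataset (the models and the choice of control variables are placeholders):

      ```stata
      * Sketch only: second model adds controls that should not be shown
      sysuse auto, clear
      quietly regress price mpg
      est store A
      quietly regress price mpg weight length
      est store B

      * weight and length are dropped and replaced by one "Controls" row
      * that reads No for model A and Yes for model B
      esttab A B, b(3) se(3) indicate("Controls = weight length")
      ```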

      • Yes, indicate() works, but it didn’t seem to work with some variables. Turns out that was more an issue on my end. The interactions with Stata factor variables are the main issue I have been looking for a nice way to fix. I suppose if need be I will revert to your suggestion of creating the interactions manually.

        Your setup is one of the best I have come across. Other people notice it and ask about it. So, thanks for the hard work! Have you experimented with circling different rows, or numbers, in your table to highlight them for Beamer slides? If so, do you just open the Stata-produced table in LaTeX and add the tikz code? Or is there another, more efficient method?

  39. Dear Jörg
    thanks a lot for this amazing tutorial.
    I have used your code to generate a regression output table, unfortunately I have an issue with the range of horizontal lines. I want to display results in 11 columns but the top, middle and bottom rules will only stretch over 9 columns or less. I can get the middle rule to cover all columns by \cmidrule{1-11}, this isn’t an elegant solution, though.

    Is there anything inside your code which limits the range of the rules? I couldn’t figure out…

    Otherwise the table just looks fine and I don’t get any warnings or error messages.

    Thanks!

    • Okay, so far I figured that the problem seems to come from the landscape orientation, which I set. Still I don’t know how to fix it.

      • I don’t know how that could be the case if you use booktabs. The rules (e.g. \toprule, \midrule etc.) always stretch over all specified columns. Are you sure you specified the correct number of columns in the table?

  40. It is an excellent post! But I have a few questions; forgive me if they are too silly, as I am very new to Stata and LaTeX. I am curious about the -global- command you have used, and I was a bit confused when reading the code because I cannot find any code that declares the number format of the table, yet Table 2 still ends up with different formats. One more question: how did you manage to use -su- to show the frequencies of the dummies? Thank you very much.

  41. Hello Jörg, many thanks for your post, it is really helpful. I have one question regarding the generation of summary statistics: I want to show 4 columns corresponding to Year 1, average and SD, and Year 2, average and SD. For the last row (number of observations), I would like the number to be centered under each pair of columns. To do this, I specified the option “……. stats(N, fmt(%18.0g) labels("\midrule Observations") layout("\multicolumn{2}{c}{@}" "\multicolumn{2}{c}{@}"))………..”.

    However, the fragment corresponding to this in LaTeX is:

    “…………………..\midrule Observations&\multicolumn{2}{c}{314}& &\multicolumn{2}{c}{314}& \\
    \multicolumn{2}{c}{}& & & & \\”

    While it should be only:

    “…………..\midrule Observations&\multicolumn{2}{c}{314}& \multicolumn{2}{c}{314} \\”

    Any hint on how to solve this? I would be very grateful if you could help me. Thanks in advance!!!

    • I don’t think I have ever achieved what you are after. As you have found out, if you specify several “\multicolumn” in the layout option, the LaTeX code is screwed up because esttab attempts to fill multicolumns for all columns. Unfortunately I don’t have a solution for your problem – apart from either keeping observations only in the first column of each group, or editing manually. But what happens if you specify only layout("\multicolumn{2}{c}{@}")?

      • Hi Jörg, thanks for answering! I tried with what you suggest, and the output is now:

        “…………\midrule Observations&\multicolumn{2}{c}{314}& &\multicolumn{2}{c}{314}& \\

        A bit better, but not enough….

  42. Pingback: Causal Inference and Big Data (Spring 2017) – Tzu-Ting Yang 楊子霆

  43. Hi Jorg, I am writing my thesis in ShareLaTeX, thus online. Is it possible to do this Stata-LaTeX integration online through ShareLaTeX? Best, sophie

    • You can do the LaTeX part for sure in ShareLatex, but you need to find a way to sync the tables created in Stata automatically to ShareLaTeX. Perhaps with Dropbox? But I can’t really help you there, because I am not using ShareLaTeX.

  44. Dear Jorg,

    (I am new to using LaTeX, so this may have a relatively obvious answer.)

    I have copied and pasted the above code for creating tables directly into LaTeX, but when I attempt to insert a table I get an error message saying “!Undefined control sequence.” and the additional information states that “The control sequence at the end of the top line of your error message was never \def’ed”

    The first line of the error message appears at this line:
    \let\estinput=\input% define a new input command so that we can still flatten the document

    I saw above that in a previous response you mentioned the need for the booktabs package, but even using this I still get the error message.

    Are you able to advise me on what might be causing this error?

    Many thanks.

    • Unfortunately I cannot help with this, apart from saying that there is an error in your LaTeX code. The command \let\estinput=\input has nothing to do with booktabs, it just copies the command \input to the new command \estinput. Good luck!

  45. Hi Jörg,

    I keep coming back to these tutorials whenever I do LaTeX tables in Stata. Now, I’m trying descriptive tables for the first time. There is one aspect of your code I cannot follow:

    How did you include your age variable in the $dem global? From the output it is a factor variable, so it would require something like i.age to get separate shares for each bracket. But using i.age in estpost sum produces an error (no factor or time-series operators allowed). So did you create separate dummy variables for each age bracket? But then the refcat code would not make sense anymore, I suppose.

    In any case: Thanks a lot!

    • The global $dem just captures individual dummy variables that I have created separately, one of which is age18. I am doing this precisely to be able to use refcat later. I don’t know how to use it if I specified the dummy variables directly in Stata, e.g. with i.age. So imagine something like:

      global dem "age18 age25 age35"
      estpost su $dem
      est store A
      esttab A, ...
      ...
      refcat(male "\emph{Demographics}").

      Hope that makes sense.

  46. Hello Jörg,

    I’m trying to build a table following your template. I want to create variable headings like you did in your Table 4 with Age, Housing etc., but I’m struggling with refcat(). I’ve seen how refcat is used here (http://repec.org/bocode/e/estout/advanced.html) and replicated the example very easily, but it doesn’t work when I follow the same method for my own work. Please see my code below. Sorry it’s messy, but I hope you are able to get what I’m trying to do. It works fine until the last three lines. For example, refcat("15-24" Age) alone does not create the heading ‘Age’ in my case. Rather, it deletes the 15-25 group from the table. It probably has to do with the labeling of the variables. But I’m clueless about this and I’ll really appreciate your help here. Thanks.

    svy:probit injury informal if inrange(age,15,64)&inlabforce==1
    margins, dydx(*) atmeans post

    est store A //

    xi:svy:probit injury informal female i.agegroup i.education if inrange(age,15,64)&inlabforce==1
    margins, dydx(*) atmeans post

    est store B

    xi:svy:probit injury informal female i.agegroup i.education i.ind2 i.tenure i.region union if inrange(age,15,64)&inlabforce==1
    margins, dydx(*) atmeans post

    est store C

    esttab A B C, refcat(_Iagegroup_2 "15-24" _Ieducation_2 "Primary or below" _Iind2_2 "Agriculture" _Itenure_2 "Less than 1 year" _Iregion_2 "Eastern")

    foreach v of varlist * {
    label variable `v' `"\hspace{0.1cm} `: variable label `v''"'
    }

    esttab A B C refcat("15-24" Age "Primary or below" Education "Agriculture" Sector)


    • Your final refcat command is wrong. You are writing the label first and then the variable name, but it should be the other way round.

      esttab A B C, refcat("15-24" Age "Primary or below" Education "Agriculture" Sector)

      should be:

      esttab A B C, refcat(Age "15-24" Education "Primary or below" Sector "Agriculture")

  47. Is it possible to use estout with Overleaf? I did not manage to import the table from stata.

  48. Dear Mr. Weber,

    Thank you very much for your post. Could you tell me how you added the note to the second table you displayed? Was it by hand? If one uses the default -addnote- option in esttab, the note is placed between “p-value in parentheses” and “*:p<0.01;**:p<0.05…”, unlike in your table, where the notes are placed at the very bottom.

    • Scratch that, Mr. Weber. I figured it out. You did it by the wrapper estauto you wrote.

      Now I have another question. Is it necessary to specify the number of columns in estauto? I read that the tabular environment does not require this parameter, and the wrapper would be more flexible if the number of columns did not have to be fixed.

      • I think you must always specify the number of columns – where did you read that’s not necessary?
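
        A minimal sketch of why a column count is needed, under the assumption that the wrapper builds an ordinary tabular: the tabular environment takes a mandatory column specification, so the number of columns has to come from somewhere. The column types and numbers below are illustrative only and are not taken from the estauto definition. The `*{n}{spec}` shorthand repeats a column type n times, which is the kind of place a count argument would typically end up.

        ```latex
        \documentclass{article}
        \usepackage{booktabs}
        \begin{document}
        % The column specification is a mandatory argument of tabular:
        % here one left-aligned label column plus three centred data columns.
        \begin{tabular}{l*{3}{c}}
        \toprule
                 & (1)  & (2)  & (3)  \\
        \midrule
        Variable & 0.12 & 0.34 & 0.56 \\
        \bottomrule
        \end{tabular}
        \end{document}
        ```

        Changing the 3 to another number is all that is needed when a table gains or loses a column.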

  49. Dear Jörg, thank you very much for your fabulous article! Thanks to your tutorial I was able to save a lot of time while formatting.

    I have a short question regarding the embedding of table 4 as example for a regression table into my Latex template. I built a regression table in Stata using your code above and also specified the necessary esttab options. The Stata part worked well.

    But: after adding your code to my preamble and embedding table 4 as stated above, I always receive the error “package floatrow error: captions lost” and my pdf cannot be compiled. This makes it difficult to add a caption to my table in LaTeX. Do you have an idea what could be the cause?

    • I take it you are using floatrow to place figures/tables side-by-side? I suspect there is some incompatibility somewhere. You shouldn’t use threeparttable for example with floatrow.

  50. In the estout help file it says that fmt() takes official Stata formats. I tried to make figures, some of which run into the thousands, display with a comma separator using the following line:
    cells("mean(fmt(%4.3fc)) sd(fmt(%4.3fc)) count(fmt(%4.0fc))")

    I ran the esttab command after running summary. The full code looks as follows:

    esttab a using summary_credit.tex, replace ///
    mtitles("\textbf{All borrowers}") ///
    collabels(\multicolumn{1}{c}{{Mean}} \multicolumn{1}{c}{{Std.Dev.}} \multicolumn{1}{l}{{Obs}}) ///
    cells("mean(fmt(%4.3fc)) sd(fmt(%4.3fc)) count(fmt(%4.0fc))") label nonumber f noobs booktabs ///
    postfoot(\bottomrule \bottomrule)

    The .tex output shows three decimal places, but it does not take into account the "c" in format and continues to display large values without the commas. Any idea what I may be doing wrong?

  51. Hi there,

    I have followed the instructions here and generate a table using the following commands in Stata:

    eststo: quietly tabstat gender average_lag_mod age unique_subjects classics science exact_science average_classsize prop_female_teachers average_experience average_blind ifadmitted aei rank1 rank2 ordering, statistics(count mean sd min max) columns(statistics)

    esttab using "$dir/Graphs_Tables/table1.tex", cells("count mean(fmt(3)) sd(fmt(3))") varlabels(gender "Gender Dummy" average_lag_mod "Previous Year Test Scores" age "Age" unique_subjects "No. of Subjects per Student" classics "Classics" science "Science" exact_science "Exact Science" average_classsize "Class Size" prop_female_teachers "Teacher's Gender" average_experience "Teacher's Experience" average_blind "Test Score" ifadmitted "Post-secondary Schooling" aei "Academic University vs Technical Schooling" rank1 "Post-secondary Degree Quality 1" rank2 "Post-secondary Degree Quality 2" ordering "Rank of Attending Institutions") title() varwidth(45) booktabs gaps f nonumber noobs replace

    In LaTeX I am trying to run:

    \begin{table}
    \caption{Summary Statistics}
    \estauto{table1.tex}{3}{c}
    \label{tabel 1}
    \end{table}

    I get the following error:

    /Latex Output.tex:99: Misplaced \noalign.
    \bottomrule ->\noalign
    {\ifnum 0=`}\fi \@aboverulesep =\aboverulesep \global…
    l.99 \estauto{table1.tex}{3}{c}

    I expect to see \noalign only after the \cr of
    an alignment. Proceed, and I’ll ignore this case.

    I was wondering if you know why I get this issue?
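
    One possible cause, offered only as an assumption because neither the definition of \estauto nor the contents of table1.tex are shown in this comment: booktabs rules such as \bottomrule must come directly after a row terminator (\\). LaTeX's own \input is not fully expandable, so reading a table fragment with it inside a tabular can leave a rule in a position TeX rejects with "Misplaced \noalign". A frequently suggested workaround is to read the fragment with the TeX primitive, which LaTeX keeps under the name \@@input. In the sketch below, the alias \estinput and the file fragment.tex are illustrative names, and the fragment is generated by the example itself rather than by esttab.

    ```latex
    % Write a small table fragment so the example is self-contained.
    \begin{filecontents*}{fragment.tex}
    Age    & 35.2 & 10.1 \\
    Income & 2400 & 800  \\
    \end{filecontents*}
    \documentclass{article}
    \usepackage{booktabs}
    \makeatletter
    % Assumption: alias the primitive \input, which is expandable,
    % so the rules before and after the fragment stay in a valid position.
    % Braced file names with the primitive need a reasonably recent TeX distribution.
    \let\estinput=\@@input
    \makeatother
    \begin{document}
    \begin{tabular}{l*{2}{c}}
    \toprule
    \estinput{fragment.tex}
    \bottomrule
    \end{tabular}
    \end{document}
    ```

    If the error persists with this approach, it is also worth checking that the last row of the generated fragment ends with \\, since a missing row terminator before \bottomrule produces the same message.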
