LaTeX and Stata integration (3): Improving the Design

If you followed my previous posts on automated Stata and LaTeX integration, you probably already have a good idea of how estout works and how a table can be printed in LaTeX in an aesthetically pleasing way. This post is about improving the print quality in LaTeX even further. If you are new to LaTeX and Stata integration, please read my introductory post and my follow-up post, which solves some problems.

The Problem

You might have encountered two problems following the instructions of my previous posts:

  1. You generate hundreds of overfull hbox warnings in LaTeX if you decimal-align the results.
  2. If you use different math- and text-fonts, symbols in the table are set in the math-font, which can look ugly.

Both are relatively serious problems. The first occurs because siunitx, the package we use to decimal-align the values, requires precise information about how much space there is in each column before and after the decimal marker. Consider the table on the right: we have a maximum of 2 characters to the left of the decimal marker (a minus sign plus one integer digit) and a maximum of 5 characters to the right (two decimals plus three stars). Hence, we need to specify S[table-format=-1.5] in siunitx to avoid overfull hbox warnings. But I don’t like this, because then we cannot create really compact tables, and the column heading may end up not perfectly centered relative to the digits.
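As a rough illustration, the following sketch puts a column that reserves space for everything next to one that only describes the digits. The variable names and values are made up, and the stars are left out to keep it minimal:

```latex
\documentclass{article}
\usepackage{booktabs}
\usepackage{siunitx}
\begin{document}
% First S column: sign, one integer digit and five positions after
% the decimal marker are reserved (as needed for two decimals + stars).
% Second S column: only one integer digit and two decimals are reserved.
\begin{tabular}{l S[table-format=-1.5] S[table-format=1.2]}
\toprule
        & {Wide} & {Compact} \\
\midrule
Age     & -0.12  & 0.12 \\% hypothetical coefficients
Income  &  0.34  & 0.34 \\
\bottomrule
\end{tabular}
\end{document}
```

Compiled on its own, the first column should come out noticeably wider than the second, although both hold the same kind of number.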

The second problem occurs because siunitx sets symbols that are specified as input-symbols (i.e. symbols that do not interfere with the decimal alignment) in math mode. If you use a complete font such as Latin Modern you won’t notice this, because the text and math fonts look the same. But if you use distinct fonts, the result will look like the example above: the minus symbol and the brackets are set in “Euler”, a very pretty math font, while the rest is set in Linux Libertine, my favourite free LaTeX font. Obviously, this result is not optimal.

The Solution

It is easiest to download the Sample Document and the Sample Table used to generate the example to see how the solution works. I load the following fonts:

\usepackage{libertine}% Linux Libertine, my favourite text font
\usepackage[euler-digits]{eulervm}% A pretty math font

In the next lines we create the well-known \sym command and load and customise siunitx, now simpler than in my previous posts.

% *****************************************************************
% siunitx
% *****************************************************************
\newcommand{\sym}[1]{\rlap{#1}}% Thanks David Carlisle

\usepackage{siunitx}
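\sisetup{
	detect-mode,% print in the mode (text or math) of the surrounding material
	group-digits		= false,% no digit grouping
	input-symbols		= ( ) [ ] - +,% pass these through without affecting the alignment
	table-align-text-post	= false,% do not align text (the stars) that follows a number
	input-signs		= ,% no input signs: - and + are treated as symbols, see above
}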

Now begins the hacky bit: as I mentioned before, I don’t like to reserve space for all characters in siunitx. What I want to do is tell siunitx only how many integer digits and decimals there are; I don’t care about minus signs, brackets or stars. Apart from being simpler, this also makes it possible to create really compact tables and ensures that the column title is centered relative to the numbers. If you reserve space for all characters, the title might look offset, because it is centered relative to all characters, while “the eye” focuses on the number as its alignment point.

The solution to the first problem, avoiding overfull hbox warnings, is to tell LaTeX not to reserve any space for minus signs, brackets and stars. The \llap{...} and \rlap{...} commands do this; we just need to tell LaTeX to wrap all symbols in the table in those commands. The solution to the second problem is to substitute the specific characters that are set in math mode (in my example minus signs and brackets) with their text-mode equivalents. This is done in the following lines:

% Character substitution that prints brackets and the minus symbol in text mode and does not reserve any space. Thanks to David Carlisle
\def\yyy{%
  \bgroup\uccode`\~\expandafter`\string-%
  \uppercase{\egroup\edef~{\noexpand\text{\llap{\textendash}\relax}}}%
  \mathcode\expandafter`\string-"8000 }

\def\xxxl#1{%
\bgroup\uccode`\~\expandafter`\string#1%
\uppercase{\egroup\edef~{\noexpand\text{\noexpand\llap{\string#1}}}}%
\mathcode\expandafter`\string#1"8000 }

\def\xxxr#1{%
\bgroup\uccode`\~\expandafter`\string#1%
\uppercase{\egroup\edef~{\noexpand\text{\noexpand\rlap{\string#1}}}}%
\mathcode\expandafter`\string#1"8000 }

\def\textsymbols{\xxxl[\xxxr]\xxxl(\xxxr)\yyy}

Here we create a new command \textsymbols that addresses both issues discussed above. To make sure that our tables are printed correctly, we have to adapt the \estwide and \estauto commands accordingly. All that has changed is the added \textsymbols command.

\newcommand{\estwide}[3]{
	\vspace{.75ex}{
		\textsymbols% Note the added command here
		\begin{tabular*}
		{\textwidth}{@{\hskip\tabcolsep\extracolsep\fill}l*{#2}{#3}}
		\toprule
		\estinput{#1}
		\bottomrule
		\addlinespace[.75ex]
		\end{tabular*}
		}
	}	

\newcommand{\estauto}[3]{
	\vspace{.75ex}{
		\textsymbols% Note the added command here
		\begin{tabular}{l*{#2}{#3}}
		\toprule
		\estinput{#1}
		\bottomrule
		\addlinespace[.75ex]
		\end{tabular}
		}
	}
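The two mechanisms that \textsymbols relies on can also be looked at in isolation. The sketch below is a simplified illustration, not the code from above: it assumes only the amstext package (for \text) and uses the common \lowercase idiom instead of the \uppercase one.

```latex
\documentclass{article}
\usepackage{amstext}% provides \text
\begin{document}

% 1. Zero-width boxes: \llap and \rlap print their argument but take
%    up no space, so the frame is only as wide as "0.12" and the
%    brackets stick out on both sides.
\fbox{\llap{(}0.12\rlap{)}}

% 2. A "math-active" character: mathcode "8000 makes TeX treat the
%    character like an active character in math mode, so the minus can
%    be redefined to print a text-mode en dash instead.
\begingroup
  \begingroup\lccode`\~=`\-%
  \lowercase{\endgroup\def~}{\text{\textendash}}%
  \mathcode`\-="8000
  $1 - 2$
\endgroup

\end{document}
```

Because everything in the second part happens inside a group, the minus sign behaves normally again afterwards. The same is true for \estwide and \estauto, where \textsymbols is called inside a pair of braces.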

Finally, the command to print the table in LaTeX:

\begin{table}\centering
  \begin{threeparttable}
    \caption{Table with Better Notes and Better Symbols}
    \estauto{table}{3}{S[table-format=1.2,table-column-width=20mm]}
    \Figtext{Some basic text about the table.}
    \Fignote{With `threeparttables' even long notes don't get wider than the table. The result is much more typographically pleasing.}
    \Figsource{We got the data from here.}
    \Starnote
  \end{threeparttable}
\end{table}

Note the specification of the S-column, where we only specify the maximum number of integers and decimals, and we fix the column-width to 20mm: S[table-format=1.2,table-column-width=20mm].

The result is a pretty table without any LaTeX warnings and where the minus and brackets are set in the correct font.

In the next post we go a bit further: if you are a fan of typography you might use old-style figures in your text; however, they are not recommended for tables. Using OpenType fonts with XeLaTeX or LuaLaTeX, we can print numbers in the text as old-style figures but numbers in the table as lining figures.

Issues with Amsmath

If you load the amsmath package, the above solution will not work. You need to add the following lines after you load the package (thanks to David Carlisle for this fix and for helping to solve many other problems!):

\makeatletter
\edef\originalbmathcode{%
    \noexpand\mathchardef\noexpand\@tempa\the\mathcode`\(\relax}
\def\resetMathstrut@{%
  \setbox\z@\hbox{%
    \originalbmathcode
    \def\@tempb##1"##2##3{\the\textfont"##3\char"}%
    \expandafter\@tempb\meaning\@tempa \relax
  }%
  \ht\Mathstrutbox@\ht\z@ \dp\Mathstrutbox@\dp\z@
}
\makeatother


LaTeX and Stata integration (2): Solving some problems

I have noticed a definite increase in the number of questions I receive regarding my Stata and LaTeX integration post (maybe deadlines are approaching?). I guess it’s a good idea to address some of these questions in a new post and also show some changes that improve the LaTeX code.

Wrapping of column titles

Bert suggests an alternative to my \specialcell command to wrap column headings. The \specialcell command requires you to set the line break manually, which some might find a bit tedious. He suggests using estout’s prefix and suffix options to insert a minipage of a fixed width. The idea is that, by telling LaTeX exactly how wide the column is, it can automatically wrap the text accordingly. The estout command would look something like

mlabels(titles prefix("\begin{minipage}{0.5in}") suffix("\end{minipage}"))

I have tried this and it works, but I am not completely satisfied with the result for two reasons: First, I find that an automatic line break might end up in places where I don’t want it, and the column titles might be too close to each other if it is a table with lots of information. Second, my preference is to avoid putting LaTeX design code into estout as much as possible. I just prefer to change the layout in LaTeX.

However, it is a good idea to adjust the column width manually if you are not filling the table to the textwidth. Fortunately, this is extremely easy with siunitx: all you have to do is add table-column-width=20mm to your S-column, which fixes the width of each data column to 20mm. E.g., the full code for the table would look like:

\begin{table}
\centering
\caption{Table with decimal alignment and fixed column width}
\estauto{sometable}{3}{S[table-format=1.2,table-column-width=20mm]}
\fignote{A little note below the text}
\starnote
\end{table}

Notes that have the same width as the table

Another suggestion from Bert (thank you!). Estout has a serious bug that stretches the first column of the table when you use estout’s note function. This is why I created the custom \figtext, \fignote, \figsource and \starnote commands that just add a custom caption below the table. This has one big weakness, as the picture shows: a long note is set as wide as the page allows. This does not look very nice; it would be better if the note were as wide as the table.

Fortunately, this can be easily solved with the threeparttable package. The package uses different commands for notes that set them at the exact width of the table. Hence, we need to create new custom note commands: \Figtext, \Fignote, \Figsource and \Starnote (note the capitalisation!). See the code below to see how the table has been generated. Looks nicer, doesn’t it? Don’t forget to add \usepackage{threeparttable} to your preamble.

...
\usepackage{threeparttable}% Alternative for Notes below table

% Note/Source/Text after Tables
\newcommand{\Figtext}[1]{%
 \begin{tablenotes}[para,flushleft]
 \hspace{6pt}
 \hangindent=1.75em
 #1
 \end{tablenotes}
}

\newcommand{\Fignote}[1]{\Figtext{\emph{Note:~}~#1}}
\newcommand{\Figsource}[1]{\Figtext{\emph{Source:~}~#1}}
\newcommand{\Starnote}{\Figtext{* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.}}% Add significance note with \Starnote

\begin{document}

\begin{table}\centering
\begin{threeparttable}
\caption{Table with better notes and decimal alignment}
\estauto{sometable}{3}{S[table-format=1.2,table-column-width=20mm]}
\Fignote{This is a very long row. As you can see it is longer than the actual width of the table. This does not look very nice.}
\Starnote
\end{threeparttable}
\end{table}

\end{document}

Problem with caption position

Maria has the problem that the caption is “too high” and cuts into the table. When I wrote the note commands I used the scrartcl class and adjusted the margins accordingly. If you use something like article, then all notes will be too high.

That’s easily solved: you can either use the new approach to create notes with the threeparttable package (see above) or, if you prefer the “caption” approach, just adjust the vspace that I insert. For the article class you may comment out the vspace completely, and everything looks fine.

\newcommand{\figtext}[1]{%
 %\vspace{-1.9ex}% Comment here or adjust accordingly
 \captionsetup{options=figtext}
 \caption*{\hspace{6pt}\hangindent=1.5em #1}
 }

Ugly significance stars

I made a mistake in the original post and set the significance stars in math mode. This sometimes causes the significance stars to be too large and not raised. To solve this, just change the \sym command to the one below (previously, #1 was wrapped in $#1$, which sets the stars in math mode).

\newcommand{\sym}[1]{\rlap{#1}}

Print very long tables on several pages

Torben has a problem with very long descriptive tables that do not fit on one page. There are two ways to solve this in LaTeX: the longtable package or the longtabu environment from the tabu package. The tabu package is a bit more modern and flexible, so this might be the way to go.

I have been playing around for a bit, but it is a bit more complex than I anticipated. So, unfortunately, I cannot provide a solution until I encounter a long table myself :(. If anyone has a suggestion, I am happy to hear it!
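For anyone who wants to experiment, a possible starting point is the longtable package, whose column specification accepts siunitx S-columns. The sketch below is a standalone example with made-up variables and values. It has not been tested with estout output, and where the estout-generated rows would go is only marked by a comment.

```latex
\documentclass{article}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{siunitx}
\begin{document}
\begin{longtable}{l S[table-format=3.2] S[table-format=3.2]}
\caption{Descriptive statistics}\\
\toprule
Variable & {Mean} & {SD} \\
\midrule
\endfirsthead% header of the first page ends here
\toprule
Variable & {Mean} & {SD} \\
\midrule
\endhead% header repeated on every following page
\bottomrule
\endfoot% footer printed at the bottom of every page
% Hypothetical rows; the estout-generated rows would go here.
Age    &  41.20 & 12.35 \\
Income & 103.50 & 48.10 \\
\end{longtable}
\end{document}
```

With only two rows the table obviously fits on one page. The repeated header and the footer only matter once the body is long enough for longtable to break it.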

Further enhancements

In another post I will write something about improving the typography of the table further. This involves some character substitution so that brackets ( ) and the minus sign do not cause any overfull hbox warnings in LaTeX.
